Parallelizing symbol table/string table scan

Thomas Schatzl thomas.schatzl at oracle.com
Mon Nov 11 08:31:12 PST 2013


Hi Karen,

On Mon, 2013-11-11 at 10:46 -0500, Karen Kinnear wrote:
> Thank you for asking and for filing RFEs - this is so much better than a webrev
> as the first sighting :-)

:)

> On Nov 11, 2013, at 8:56 AM, Thomas Schatzl wrote:
> 
> > Hi all,
> > 
> >  recently we (the gc team) noticed severe performance issues with
> > symbol table and string table scan during remark.
> > 
> > Basically, in G1 these pauses are the largest pauses on a reasonably
> > tuned system. Also, in particular, symbol table scan alone takes 50% of
> > total remark time. String table scan takes another 13%.
> > 
> > At least symbol table scan is a pretty big issue.
> > 
> > The simple approach to those is to parallelize these tasks of course,
> > however I would like to query you for comments or suggestions :)
> > (I am simply throwing some ideas on the wall, in the hope something
> > sticks...)
> I don't see any reason not to parallelize the scanning.

That's on the agenda anyway - as soon as there is class unloading during
remark, we need to scan the symbol table more quickly than we do right
now.
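
To make the parallelization a bit more concrete, here is a rough sketch of
what I have in mind: workers claim fixed-size chunks of buckets from a
shared counter until the table is exhausted. This is illustration only, not
HotSpot code; parallel_scan, chunk_size and the list-of-lists table layout
are all made up for this mail, and Python threads only show the work
distribution, not a real speedup.

```python
import threading

def parallel_scan(buckets, visit, num_workers=4, chunk_size=32):
    """Visit every entry of a bucketed table using several worker threads.

    Hypothetical sketch: `buckets` is a list of lists, and `visit` must be
    safe to call from several threads at once.
    """
    next_index = 0                      # first bucket nobody has claimed yet
    claim_lock = threading.Lock()

    def claim_chunk():
        # Hand out the next [start, end) range of bucket indices.
        nonlocal next_index
        with claim_lock:
            start = next_index
            end = min(start + chunk_size, len(buckets))
            next_index = end
        return start, end

    def worker():
        while True:
            start, end = claim_chunk()
            if start >= end:            # nothing left to claim
                return
            for i in range(start, end):
                for entry in buckets[i]:
                    visit(entry)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Claiming chunks rather than single buckets keeps contention on the shared
counter low; the right chunk size would have to be measured.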

> > One idea that came up to optimize that further has been to not do string
> > table or symbol table scrubbing after gc at all if no class unloading
> > has been done, assuming that the number of dead entries is zero anyway.
>
> Just to clarify - there are temp Symbols in the symbol table - so the number
> of dead entries with no class unloading will be close to zero, i.e. small enough
> that your suggestion of not doing scrubbing unless there has been class unloading
> makes sense - just don't assume zero.

Thanks for the information, that's exactly what we need.

Do you have any idea (statistics) about the number of these temp symbols
compared to the ones generated by class loading? Is there a way to
distinguish them easily (so that I can gather statistics myself)?

I.e. would it make sense to split the symbol table into two tables, one
for temp symbols and one for symbols generated by class loading? Then we
could always have a look at the temp symbols, but at the others only
during class unloading.

Are these temp symbols separate from the class loading symbols? E.g. is
it possible for a temp symbol to look like a "class loading" symbol, or
the other way around?
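
To illustrate the two-table idea, a minimal sketch follows. It assumes the
two kinds of symbols really can be kept apart (which is exactly the open
question above) and models liveness as a plain reference count per symbol;
both are assumptions of the sketch, and SplitSymbolTable is a made-up name.

```python
class SplitSymbolTable:
    """Hypothetical symbol table split by where a symbol came from."""

    def __init__(self):
        self.temp = {}     # name -> refcount, temp symbols
        self.loaded = {}   # name -> refcount, symbols from class loading

    def scrub(self, class_unloading_happened):
        """Remove dead entries and return how many were removed.

        The temp table is always scrubbed; the class loading table is
        only looked at when class unloading has actually happened.
        """
        removed = self._scrub_table(self.temp)
        if class_unloading_happened:
            removed += self._scrub_table(self.loaded)
        return removed

    @staticmethod
    def _scrub_table(table):
        # An entry counts as dead once nothing references it any more.
        dead = [name for name, refcount in table.items() if refcount == 0]
        for name in dead:
            del table[name]
        return len(dead)
```

If the temp table stays small, the common case (no class unloading) would
only pay for scanning that small table.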

> > This is (imo) true for the string table at least (because they are
> > strong roots if not doing class unloading), but I am not so sure about
> > the symbol table.
> > You probably have more experience about the use of the symbol table, so
> > any ideas what could cause symbol table entries to get stale other than
> > class unloading, and if so, is this a big concern?
> > 
> > Another option would be to do this symbol table scrubbing only after a
> > certain amount of operations on the symbols, not sure if there is an
> > indicator (that does not decrease perf for retrieving too much) for
> > that.
> > 
> > Another idea, again for the symbol table is to scrub it either
> > incrementally (e.g. depending on available time), or concurrently. I.e.
> > some background task periodically waking up and scrubbing (parts of) the
> > symbol table.
>
> The tables are read lock-free, with the assumption that they only have
> entries removed at a safepoint. If you keep that assumption, you should be
> able to do incremental scrubbing.
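
If removal stays restricted to safepoints, incremental scrubbing could look
roughly like the sketch below: each safepoint scrubs a bounded number of
buckets and remembers where it stopped. IncrementalScrubber, the max_buckets
budget and the (name, is_live) entry layout are invented for illustration
and do not correspond to existing HotSpot code.

```python
class IncrementalScrubber:
    """Hypothetical scrubber that cleans a few buckets per safepoint."""

    def __init__(self, buckets):
        self.buckets = buckets   # list of lists of (name, is_live) tuples
        self.cursor = 0          # next bucket to scrub

    def scrub_at_safepoint(self, max_buckets):
        """Scrub at most max_buckets buckets, resuming after the last call.

        Must only be called while readers are stopped (at a safepoint),
        so that lock-free readers never see an entry disappear.
        Returns the number of entries removed.
        """
        removed = 0
        for _ in range(min(max_buckets, len(self.buckets))):
            bucket = self.buckets[self.cursor]
            live = [entry for entry in bucket if entry[1]]
            removed += len(bucket) - len(live)
            self.buckets[self.cursor] = live
            # Wrap around so later safepoints start a new pass.
            self.cursor = (self.cursor + 1) % len(self.buckets)
        return removed
```

The budget could be derived from the time left in the pause; how to do that
is not covered by the sketch.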

Thanks,
  Thomas



More information about the hotspot-runtime-dev mailing list