Parallelizing symbol table/string table scan

Mon Nov 11 13:56:50 UTC 2013

Hi all,

  recently we (the gc team) noticed severe performance issues with
symbol table and string table scan during remark.

In G1, remark pauses are the largest pauses on a reasonably tuned
system. In particular, symbol table scan alone takes 50% of total
remark time, and string table scan takes another 13%.

At least symbol table scan is a pretty big issue.

The simple approach is of course to parallelize these tasks, but I
would like to ask you for comments or suggestions :)
(I am simply throwing some ideas at the wall, in the hope that something
sticks...)
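
To make the "simple approach" concrete, here is a minimal sketch of a
chunk-claiming parallel scan. It is not HotSpot code: Bucket,
parallel_scan and the chunk size are made-up names for illustration, and
the table is modelled as a plain vector of buckets holding liveness
flags. Workers claim ranges of buckets through an atomic counter, so
each bucket is visited by exactly one thread.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a hash table bucket: one flag per entry,
// true meaning the entry is still live.
using Bucket = std::vector<bool>;

// Scans all buckets using num_workers threads and returns the number of
// live entries. Workers claim chunk_size buckets at a time.
std::size_t parallel_scan(const std::vector<Bucket>& table,
                          unsigned num_workers,
                          std::size_t chunk_size) {
  if (chunk_size == 0) chunk_size = 1;  // avoid claiming empty chunks forever
  std::atomic<std::size_t> next_chunk{0};
  std::atomic<std::size_t> live_total{0};

  auto worker = [&]() {
    std::size_t local_live = 0;
    for (;;) {
      // Atomically claim the next unvisited range of buckets.
      const std::size_t start = next_chunk.fetch_add(chunk_size);
      if (start >= table.size()) break;
      const std::size_t end = std::min(start + chunk_size, table.size());
      for (std::size_t i = start; i < end; ++i) {
        for (bool live : table[i]) {
          if (live) ++local_live;
        }
      }
    }
    // Publish the per-thread result once to limit contention.
    live_total.fetch_add(local_live);
  };

  std::vector<std::thread> threads;
  for (unsigned w = 0; w < num_workers; ++w) threads.emplace_back(worker);
  for (auto& t : threads) t.join();
  return live_total.load();
}
```

The chunk size trades claiming overhead against load balance: small
chunks balance better when bucket lengths vary, large chunks touch the
shared counter less often.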

One idea that came up to optimize this further is to skip string table
and symbol table scrubbing after gc entirely if no class unloading has
been done, assuming that the number of dead entries is zero anyway.

This is (imo) true for the string table at least (because its entries
are strong roots when not doing class unloading), but I am not so sure
about the symbol table.
You probably have more experience with the use of the symbol table, so
any ideas on what could cause symbol table entries to go stale other
than class unloading? And if there is something, is it a big concern?

Another option would be to do symbol table scrubbing only after a
certain number of operations on the symbols. I am not sure if there is
an indicator for that (one that does not decrease lookup performance
too much).
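
As a rough illustration of such an indicator, the sketch below counts
operations with a relaxed atomic increment and only asks for a scrub
once a threshold is reached. ScrubTrigger, note_operation and the
threshold are hypothetical names; which operation would actually be
counted is exactly the open question, and the sketch assumes it is one
off the lookup path so that retrieval is not slowed down.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical trigger deciding whether a scrub is worthwhile.
class ScrubTrigger {
 public:
  explicit ScrubTrigger(std::size_t threshold) : threshold_(threshold) {}

  // Called for each counted operation on a symbol. Relaxed ordering is
  // enough because the count is only a heuristic.
  void note_operation() {
    operations_.fetch_add(1, std::memory_order_relaxed);
  }

  // Assumed to be called by a single thread at a pause (e.g. remark).
  // Returns true and resets the counter once enough operations have
  // accumulated since the last scrub.
  bool should_scrub() {
    if (operations_.load(std::memory_order_relaxed) < threshold_) {
      return false;
    }
    operations_.store(0, std::memory_order_relaxed);
    return true;
  }

 private:
  const std::size_t threshold_;
  std::atomic<std::size_t> operations_{0};
};
```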

Another idea, again for the symbol table, is to scrub it either
incrementally (e.g. depending on available time) or concurrently, i.e.
with some background task periodically waking up and scrubbing (parts
of) the symbol table.
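
The incremental variant could look roughly like the sketch below, which
scrubs buckets from a saved cursor until a time budget runs out and
resumes from there on the next call. Again this is only an illustration
under assumptions: SymbolEntry, IncrementalScrubber and scrub_some are
invented names, the table is a plain vector of buckets, and
synchronization with concurrent lookups is left out entirely.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical table entry; live == false marks a dead entry.
struct SymbolEntry {
  int id;
  bool live;
};

class IncrementalScrubber {
 public:
  using Table = std::vector<std::vector<SymbolEntry>>;

  explicit IncrementalScrubber(Table& table) : table_(table) {}

  // Removes dead entries bucket by bucket, starting at the saved cursor.
  // Stops when the budget is used up or every bucket was visited once.
  // At least one bucket is processed per call (if the table is not
  // empty), so repeated calls always make progress.
  // Returns the number of entries removed.
  std::size_t scrub_some(std::chrono::microseconds budget) {
    const auto deadline = std::chrono::steady_clock::now() + budget;
    std::size_t removed = 0;
    std::size_t visited = 0;
    while (visited < table_.size()) {
      auto& bucket = table_[cursor_];
      const std::size_t before = bucket.size();
      bucket.erase(std::remove_if(bucket.begin(), bucket.end(),
                                  [](const SymbolEntry& e) { return !e.live; }),
                   bucket.end());
      removed += before - bucket.size();
      cursor_ = (cursor_ + 1) % table_.size();
      ++visited;
      if (std::chrono::steady_clock::now() >= deadline) break;
    }
    return removed;
  }

  // Index of the bucket the next call will start with.
  std::size_t cursor() const { return cursor_; }

 private:
  Table& table_;
  std::size_t cursor_ = 0;
};
```

A background task could call scrub_some with a small budget each time it
wakes up, while a pause-time variant would pass whatever time is left in
the pause.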

Comments, suggestions?

I also created a few RFEs for these issues, see

Symbol table:
https://bugs.openjdk.java.net/browse/JDK-8027455
https://bugs.openjdk.java.net/browse/JDK-8027543
String table:
https://bugs.openjdk.java.net/browse/JDK-8027454
https://bugs.openjdk.java.net/browse/JDK-8027476

Thomas