RFR(S): 8015237: Parallelize string table scanning during strong root processing
Thomas Schatzl
thomas.schatzl at oracle.com
Mon May 27 08:04:17 UTC 2013
Hi John,
On Fri, 2013-05-24 at 15:36 -0700, John Cuthbertson wrote:
> Hi Everyone,
>
> On 5/24/2013 3:19 PM, John Cuthbertson wrote:
>
> > Hi Everyone,
>>
> > Can I have a couple of reviewers look over these changes - the
> > webrev is: http://cr.openjdk.java.net/~johnc/8015237/webrev.0/
>>
> > Summary:
> > On some workloads we are seeing that the scan of the intern string
> > table (among others) can sometimes take quite a while. This showed
> > up on some FMW workloads with G1 where the scan of the string table
> > dominated the pause time for some pauses. G1 was particularly
> > affected since it doesn't do class unloading (and hence pruning of
> > the string table) except at full GCs. The solution was to change the
> > string table from being considered a single root task and treat
> > similarly to the Java thread stacks: each GC worker claims a given
> > number of buckets and scans the entries in those buckets.
>>
> > Testing
> > Kitchensink; jprt; GC test suite. With all collectors.
Looks good. A few minor nits:
- in StringTable::possibly_parallel_oops_do() there is some commented
debugging code left.
- one suggestion for refactoring: the worker loop for
StringTable::possibly_parallel_oops_do() and ::oops_do() are the same,
except for the difference in the bounds for that loop.
So maybe this could be factored out into a helper routine (e.g.
StringTable::process_entries(OopClosure * f, uint start_idx, uint
end_idx) {
for (uint i = start_idx; i < end_idx; i += 1) {
HashtableEntry<oop, mtSymbol>** p = the_table()->bucket_addr(i);
HashtableEntry<oop, mtSymbol>* entry = the_table()->bucket(i);
while (entry != NULL) {
f->do_oop((oop*)entry->literal_addr());
// Did the closure remove the literal from the table?
if (entry->literal() == NULL) {
assert(!entry->is_shared(), "immutable hashtable entry?");
*p = entry->next();
the_table()->free_entry(entry);
} else {
p = entry->next_addr();
}
entry = (HashtableEntry<oop, mtSymbol>*)HashtableEntry<oop,
mtSymbol>::make_ptr(*p);
}
}
}
and called by both StringTable::possibly_parallel_oops_do()
and ::oops_do() with appropriate parameters?
(maybe find a better name for that method :)
- in SharedHeap.cpp:163 there is a misplaced colon in the call to
Threads::possibly_parallel_oops_do(); this has been the case before as
well, so it's not really an issue.
>>
> > Overhead:
> > Not real performance numbers but I did some measurement of the
> > synchronization overhead of using 1 GC worker thread. They are
> > summarized here:
>>
>>
> >
> >
> > 0-threads (ms)
> >
> > 1-thread-chunked (ms)
> >
> > Min
> > 0.200
> >
> > 0.300
> >
> > Max
> > 6.900
> >
> > 8.800
> >
> > Avg
> > 0.658
> >
> > 0.794
> >
> >
> > These were from 1 hour long runs of Kitchensink with around ~2800
> > GCs in each run.
- this is just a suggestion: if this is a problem (it does not seem so),
and of course it also depends on the size of the table (the number of
synchronizations), why not make the size of the chunks a function of the
number of threads and the size of the string table?
Thomas
More information about the hotspot-gc-dev
mailing list