RFR (S): 8152438: Threads may do significant work out of the non-shared overflow buffer
Thomas Schatzl
thomas.schatzl at oracle.com
Wed May 11 13:05:29 UTC 2016
Hi all,
On Wed, 2016-05-11 at 09:35 +0200, Thomas Schatzl wrote:
> Hi all,
>
> can I have reviews for the following change that fixes significant
> imbalance in work performed during GC?
>
> We observed that threads in certain situations put a lot of references
> into the thread-private overflow queue. Since this overflow queue is
> thread-local, and other threads may not steal work from this queue, it
> happens that sometimes the parallel copying phase is severely
> serialized.
>
> This is particularly problematic on large machines with many threads
> but comparatively little work to do. In one situation on a real world
> application, a 200% overall throughput improvement has been observed by
> this change.
>
> The CR mentions a micro-benchmark that exhibits the same problems.
>
> The proposed solution is that a thread processing entries from its
> overflow queue first tries to move each work item to the shared queue,
> where other threads can steal it. The thread only processes the item
> directly if the shared queue is already full.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8152438
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8152438/webrev/
> Testing:
> jprt, regular perf testing with no apparent pause time regressions,
> some benchmarks
I had some offline discussion with Erik about the change. The main
gripe with the change is that it puts additional policy code about how
the OverflowTaskQueue works somewhere completely unrelated, in this
case G1ParScanThreadState::trim_queue().
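For readers without the webrev at hand, the policy in question can be
sketched roughly as below. This is a simplified, self-contained model and
not the actual HotSpot code: SharedQueue, Worker and drain_overflow() are
made-up names, plain ints stand in for references, and the real queues
are concurrent while this one is single-threaded.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for the bounded, stealable task queue.
struct SharedQueue {
  std::size_t capacity;
  std::deque<int> items;

  // Returns false if the queue is already full.
  bool push(int task) {
    if (items.size() >= capacity) {
      return false;
    }
    items.push_back(task);
    return true;
  }
};

// Hypothetical stand-in for one GC worker thread's state.
struct Worker {
  SharedQueue shared;          // other threads could steal from here
  std::vector<int> overflow;   // thread-private, not stealable
  std::vector<int> processed;  // tasks this worker handled itself

  // Drain the overflow stack: prefer republishing each task to the
  // shared queue; only process it locally if the shared queue is full.
  void drain_overflow() {
    while (!overflow.empty()) {
      int task = overflow.back();
      overflow.pop_back();
      if (!shared.push(task)) {
        processed.push_back(task);  // placeholder for the real work
      }
    }
  }
};
```

With a shared queue of capacity 2 and five overflow entries, two tasks
become visible to other threads and the worker handles the remaining
three itself.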
Looking at other uses, we found that all of them do the same thing, and
it would be much better to have collector- or case-specific
"OverflowTaskQueue"s that encapsulate the behavior. This would also
allow potential sharing or at least comparison of them.
We felt that looking into that is out of scope for this change (apart
from one minor renaming of try_push()), and I filed JDK-8156754 for
this other work.
Another point that came up was why we do not bulk-move the queue
elements from the overflow queue to the task queue. Since we are already
late for FC, we also moved this to an RFE, JDK-8156739.
http://cr.openjdk.java.net/~tschatzl/8152438/webrev.1 (full)
http://cr.openjdk.java.net/~tschatzl/8152438/webrev.0_to_1 (diff)
Thanks,
Thomas