G1's parallel full GC significantly increases wasted space in Old regions

Man Cao manc at google.com
Sat Feb 17 01:54:39 UTC 2018


Hi,

We (the Java platform team at Google) are comparing G1's performance in
JDK9u and JDK10. We expected JDK10's G1 to perform better because of JEP
307 (Parallel Full GC) <http://openjdk.java.net/jeps/307>.
However, we found a performance regression in JDK10 with the DaCapo
benchmarks. We set the heap sizes small (about 2-4 times the minimum heap
size) so that the benchmarks trigger interesting GC activity.
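
Roughly, the setup looks like the following (the heap size, thread count,
benchmark and iteration count below are only illustrative, not the exact
values we used):

    java -Xms512m -Xmx512m -XX:+UseG1GC -XX:ParallelGCThreads=16 \
         -Xlog:gc* \
         -jar dacapo-9.12-bach.jar -n 10 lusearch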

We found that JDK10's full GC leaves significantly more wasted space in Old
regions, which leads to a more fragmented heap and fewer Eden regions. We
also found that the amount of wasted space after a full GC is proportional
to the value of ParallelGCThreads. As a result, several benchmarks trigger
more Young, Mixed and concurrent collections, leading to increased CPU
usage and pause times. One reason these benchmarks are sensitive to full GC
is that the DaCapo harness performs a System.gc() between iterations. So a
more fragmented heap hurts the benchmark from the beginning of every
iteration.
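
In other words, each run follows roughly the pattern below (a minimal
sketch of the measurement loop, not the actual DaCapo harness code;
runIteration() stands in for the benchmark workload):

    // A minimal sketch, not the DaCapo harness itself.
    public class HarnessSketch {
        static void runIteration() { /* benchmark workload goes here */ }

        public static void main(String[] args) {
            int iterations = Integer.parseInt(args[0]);
            for (int i = 0; i < iterations; i++) {
                runIteration();
                // Full GC between iterations: the more fragmented Old
                // generation it leaves behind means fewer Eden regions at
                // the start of the next iteration.
                System.gc();
            }
        }
    }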

We are aware this is probably a known issue as described in JEP 307:
<http://openjdk.java.net/jeps/307>
"Risks and Assumptions: The fact that G1 uses regions will most likely lead
to more wasted space after a parallel full GC than for a single threaded
one."
However, it should still be possible to optimize the full GC to reduce
wasted space. After all, a stop-the-world parallel mark-sweep-compact
algorithm should be able to compact the heap efficiently.
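
As a rough illustration of why the waste scales with the thread count:
assuming each full-GC worker compacts into its own set of regions and may
leave its last region partially filled (our reading of the behavior; the
region size and thread count below are just examples), the worst-case tail
waste is about one region per worker:

    // Back-of-the-envelope bound; numbers are illustrative, not measured.
    public class WastedSpaceBound {
        public static void main(String[] args) {
            long regionSize = 4L * 1024 * 1024;  // e.g. -XX:G1HeapRegionSize=4m
            int gcThreads = 32;                  // e.g. -XX:ParallelGCThreads=32
            long worstCaseWaste = gcThreads * regionSize;
            System.out.println("Worst-case tail waste ~= "
                    + worstCaseWaste / (1024 * 1024) + " MB");  // ~128 MB
        }
    }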

We did not find any RFE or discussion on JBS regarding this. Is there any
ongoing effort to reduce wasted space in parallel full GC?

-Man