RFR [0/7]: Rebuild remembered sets concurrently

Thomas Schatzl thomas.schatzl at oracle.com
Mon Mar 5 14:09:45 UTC 2018


Hi all,

  I would like to ask for reviews for a significant change in how G1
manages its remembered sets: rebuilding them concurrently instead of
maintaining them all the time.

Currently G1 maintains the remembered sets for all regions all the
time. This has several drawbacks:

- memory consumption for remembered sets is very high all the time, not
only because of actually storing these remembered sets, but also
because of additional memory wasted through fragmentation.
- the mechanism to combat this memory fragmentation (somewhat) has been
to periodically "scrub" remembered sets, at high cost during the
"Cleanup" pause.
- maintenance of ultimately unused remembered sets decreases overall
throughput.

With these changes, G1 will maintain remembered sets only for regions
that will actually be considered for evacuation ("candidate collection
set regions"), and only during marking and the following mixed gc phase.

This very significantly decreases not only average memory consumption
(e.g. during the young-only phase mainly remembered sets for young
regions are maintained), but also maximum memory consumption, as there
is little fragmentation while remembered sets are maintained.
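
As a rough illustration of the rule described above, here is a toy
model in Java. It is not HotSpot code: the class, enum and method names
are made up for this sketch, and it assumes that young regions are
always tracked, while old regions are tracked only if they are
collection set candidates and only during marking and the mixed phase.

```java
// Toy model only: every name here is hypothetical, none of it is HotSpot code.
final class RemSetTrackingSketch {
    enum Phase { YOUNG_ONLY, MARKING, MIXED }
    enum Kind { YOUNG, OLD }

    /** A heap region reduced to the two properties this sketch needs. */
    static final class Region {
        final Kind kind;
        final boolean collectionSetCandidate;

        Region(Kind kind, boolean collectionSetCandidate) {
            this.kind = kind;
            this.collectionSetCandidate = collectionSetCandidate;
        }
    }

    /** Decides whether a remembered set is maintained for the region in the given phase. */
    static boolean needsRemSet(Region region, Phase phase) {
        if (region.kind == Kind.YOUNG) {
            // Assumption of this sketch: young regions are evacuated at
            // every gc, so their remembered sets are always maintained.
            return true;
        }
        // Old regions: only candidates, and only while marking or mixed gcs run.
        boolean candidatePhase = phase == Phase.MARKING || phase == Phase.MIXED;
        return candidatePhase && region.collectionSetCandidate;
    }
}
```

In this model an old candidate region costs nothing during the
young-only phase, and an old region that is never a candidate never
gets a remembered set at all.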

There is no need for scrubbing the remembered sets any more either,
making the Cleanup pause very fast. This allows us to actually remove
a previously existing (very small) concurrent phase.

As a beneficial side-effect we noticed that mixed gc performance in
particular may increase significantly (up to 40-50% faster) due to
improved remembered set memory locality.

Particularly "big data" workloads where the remembered sets take a lot
of memory and maintenance cpu resources benefit most, but we have also
noticed significant throughput improvements when running with G1 in
other cases.

The only drawback we are aware of is that the delay from start of
marking to the first mixed gc is longer. G1's mechanism to adaptively
start marking seems to be able to manage this change nicely, but
users who have manually set a very high IHOP value may experience
full gcs.
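
To check whether a given application is in that situation, something
like the following small Java program may help. It is only a sketch:
the class name is made up, and the option names
InitiatingHeapOccupancyPercent and G1UseAdaptiveIHOP are not mentioned
above, they are the HotSpot flags I assume are relevant here. It uses
the com.sun.management.HotSpotDiagnosticMXBean API to print each
option's value and origin; an origin other than DEFAULT suggests the
value was set explicitly, e.g. on the command line.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the change under review.
final class IhopCheck {
    /** Describes a VM option, or notes that this JVM does not expose it. */
    static String describe(String optionName) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return optionName + " is not available on this JVM";
        }
        try {
            VMOption option = bean.getVMOption(optionName);
            return option.getName() + "=" + option.getValue()
                + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // getVMOption throws this if the option does not exist.
            return optionName + " is not available on this JVM";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("InitiatingHeapOccupancyPercent"));
        System.out.println(describe("G1UseAdaptiveIHOP"));
    }
}
```

If the first option shows a high value with a non-default origin, it
may be worth re-testing with the manual setting removed.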

Due to the large benefits we do not intend to provide the original
behavior, although it could be implemented, as the decisions about
which regions to maintain remembered sets for are encapsulated in a
single class.

For more information, look at my fairly recent FOSDEM talk explaining
this change in more detail [1].

For further review purposes this change will be split into seven
distinct changesets that will be organized into a thread below this
post, each tagged with a running number. Obviously they are dependent
on each other.

These are in order:

8197569: Refactor eager reclaim for concurrent remembered set
rebuilding
8197570: Make rules for choosing collection set candidates more
explicit
8197850: Calculate liveness in regions during marking
8180415: Rebuild remembered sets during concurrent mark (main change)
8197573: Remove concurrent cleanup and secondary free list handling
8197928: Only enqueue deferred cards with references into regions that
have a tracked remembered set during GC
8197932: Better split work in rebuild remembered sets phase

Each of those changesets will have its own webrev, and there will be
notes about each of the changes. However, for reference a webrev
containing all the changes can be found here:

http://cr.openjdk.java.net/~tschatzl/rebuild-all/webrev/

Testing: hs-tier1-5, internal perf tests

Enjoy! :)

  Thomas

[1] https://fosdem.org/2018/schedule/event/g1/



