RFR: 8279241: G1 Full GC does not always slide memory to bottom addresses

Fri Jan 21 16:32:52 UTC 2022

On Mon, 17 Jan 2022 11:00:04 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

> Hi all,
> 
>   can I get reviews for this change that completely removes the possibility that, during G1 Full GC, memory is not consistently slid to bottom addresses when multiple threads are used?
> 
> The problem is that multiple threads compete for regions to compact into: a region with live data somewhere "high up" in the heap can end up being the bottom region for a particular thread, so that thread compacts all its data starting from that region.
> The consequence is region-level fragmentation, i.e. after GC there may not be enough contiguous free space for a humongous object, leading to an OOME.
> 
> The change splits the phase that determines the compaction point queue (the set of regions a particular thread compacts into) from the part of that phase that determines the new locations of the objects (i.e. puts a forwarding pointer into the live objects in these regions) and does other work best done in parallel.
> This makes determining the compaction point queue deterministic (by distributing the regions we can compact into in a round-robin fashion) in a way that always slides live data consistently into the bottom heap area.
> 
> This change also makes it easier to improve work distribution of the compaction phase in the future (it directly uses the compaction point queues) by distributing regions according to live data, and to incorporate last-ditch moves of humongous objects.
> 
> The most important thing about this split is probably the changes in the parallel part: every thread must make sure that some work is done on the compaction point queue (i.e. the forwarding), and other work on all regions (clearing metadata, updating the BOT of young regions that are not moved).
> 
> Testing: tier1-5; performance checks on some simple Full GC benchmarks showed no particular difference.
> 
> Thanks,
>   Thomas

Some minor comments/suggestions.

src/hotspot/share/gc/g1/g1FullCollector.cpp line 307:

> 305: 
> 306:   // To avoid OOM when there is memory left.
> 307:   if (!found_new_empty_regions && !has_free_compaction_targets) {

`has_free_compaction_targets` itself should be enough.

src/hotspot/share/gc/g1/g1FullGCPrepareTask.inline.hpp line 68:

> 66: inline G1FullGCCompactionPoint* G1DetermineCompactionQueueClosure::next_compaction_point() {
> 67:   return _collector->compaction_point(next_worker());
> 68: }

I don't see much benefit in splitting this into two methods; this is the sole call site of `next_worker()` after all. Instead of using `%`, manually resetting the index to zero on reaching `workers()` is fairly readable as well.
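
A minimal sketch of the wrap-around without `%`, to illustrate the suggestion. The class and member names (`RoundRobinWorkerPicker`, `_num_workers`, `_cur_worker`) are hypothetical stand-ins, not the actual HotSpot code:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the closure state; not the actual HotSpot class.
class RoundRobinWorkerPicker {
  uint32_t _num_workers;  // number of compaction point queues
  uint32_t _cur_worker;   // worker that receives the next region

public:
  explicit RoundRobinWorkerPicker(uint32_t num_workers)
    : _num_workers(num_workers), _cur_worker(0) {
    assert(num_workers > 0 && "need at least one worker");
  }

  // Returns the worker for the next region, then advances the index,
  // resetting it to zero on reaching the worker count instead of using '%'.
  uint32_t next() {
    uint32_t result = _cur_worker;
    _cur_worker++;
    if (_cur_worker == _num_workers) {
      _cur_worker = 0;
    }
    return result;
  }
};
```

With three workers, successive calls to `next()` would yield 0, 1, 2, 0, and so on.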

src/hotspot/share/gc/g1/g1FullGCPrepareTask.inline.hpp line 77:

> 75:   // Add region to the compaction queue.
> 76:   cp->add(hr);
> 77: }

Instead of relying on all callers passing in the correct `cp`, how about accepting only `hr` and calling `auto cp = next_compaction_point();` inside this method?
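
Roughly what I have in mind, as a self-contained sketch. `Region`, `CompactionPoint`, `DetermineCompactionQueue` and `add_to_compaction_queue` are simplified, hypothetical stand-ins for the real types and method, so treat this as an illustration of the shape only:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for HeapRegion and G1FullGCCompactionPoint.
struct Region {
  int index;  // position in the heap, 0 = bottom
};

struct CompactionPoint {
  std::vector<Region*> regions;  // regions this worker compacts into, in order
  void add(Region* hr) { regions.push_back(hr); }
};

class DetermineCompactionQueue {
  std::vector<CompactionPoint> _points;  // one compaction point per worker
  std::size_t _cur_worker;               // worker that gets the next region

  // Picks the next compaction point in round-robin order.
  CompactionPoint* next_compaction_point() {
    CompactionPoint* cp = &_points[_cur_worker];
    _cur_worker++;
    if (_cur_worker == _points.size()) {
      _cur_worker = 0;
    }
    return cp;
  }

public:
  explicit DetermineCompactionQueue(std::size_t num_workers)
    : _points(num_workers), _cur_worker(0) {
    assert(num_workers > 0 && "need at least one worker");
  }

  // Takes only the region; the target compaction point is chosen here,
  // so callers cannot pass in a mismatched one.
  void add_to_compaction_queue(Region* hr) {
    auto cp = next_compaction_point();
    cp->add(hr);
  }

  const CompactionPoint& point(std::size_t worker) const {
    return _points[worker];
  }
};
```

Feeding regions in bottom-to-top order with two workers would give worker 0 the regions 0, 2, 4 and worker 1 the regions 1, 3, so every queue starts at the bottom of the heap.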

-------------

Marked as reviewed by ayang (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7106