RFR: 8279241: G1 Full GC does not always slide memory to bottom addresses
Thomas Schatzl
tschatzl at openjdk.java.net
Mon Jan 24 09:48:11 UTC 2022
On Fri, 21 Jan 2022 16:15:29 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:
>> Hi all,
>>
>> can I get reviews for this change that fixes the possibility that, during a G1 full GC, memory is not slid to bottom addresses consistently in the presence of multiple threads.
>>
>> The problem is that multiple threads compete for regions to compact into - it can happen that a region with live data somewhere "high up" in the heap becomes the bottom region for a particular thread, so all of that thread's data is compacted starting from that region.
>> The problem with that is region-level fragmentation, i.e. after GC there may not be enough contiguous free space for a humongous object, leading to an OOME.
>>
>> The change splits the phase that determines the compaction point queues (the set of regions a particular thread compacts into) from the part of that phase that determines the new locations of the objects (i.e. putting a forwarding pointer into the live objects in these regions) and other work best done in parallel.
>> This makes determining the compaction point queues deterministic (by distributing the regions to compact into in a round-robin fashion) in a way that always slides live data consistently into the bottom heap area.
>>
>> This change also makes it easier to improve work distribution of the compaction phase (which directly uses the compaction point queues) in the future, e.g. by distributing regions according to live data, and also to incorporate last-ditch moves of humongous objects.
>>
>> The most important thing about this split is probably the changes in the parallel part: every thread must make sure that some work is done on the compaction point queue (i.e. the forwarding), and other work on all regions (clearing metadata, updating the BOT of young regions that are not moved).
>>
>> Testing: tier1-5; performance checks on some simple full gc benchmarks showed no particular difference
>>
>> Thanks,
>> Thomas
>
> src/hotspot/share/gc/g1/g1FullGCPrepareTask.inline.hpp line 68:
>
>> 66: inline G1FullGCCompactionPoint* G1DetermineCompactionQueueClosure::next_compaction_point() {
>> 67: return _collector->compaction_point(next_worker());
>> 68: }
>
> I don't see much benefit in having this separated into two methods; this is the sole call site of `next_worker()`, after all. Instead of `%`, manually resetting it to zero on reaching `workers()` is fairly readable as well.
This is a fairly trivial implementation of `next_worker()` to fix the issue at hand. As the description suggests, more thought needs to be put into it to handle e.g. imbalance of region occupancies or humongous regions.
Unless you insist, I would like to keep the current implementation as is.
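For illustration only (this is not the actual patch, and the worker/region counts and queue types are made up), a small standalone sketch of the round-robin idea being discussed:

```c++
#include <cstdio>
#include <vector>

// Standalone sketch: hand out compactable regions to worker queues in a
// fixed round-robin order, so the assignment is deterministic and does not
// depend on which thread claims a region first.
int main() {
  const unsigned num_workers = 4;
  const unsigned num_regions = 10;
  std::vector<std::vector<unsigned>> queues(num_workers);

  unsigned next_worker = 0;
  for (unsigned region = 0; region < num_regions; region++) {
    queues[next_worker].push_back(region);
    // Wrap around with '%'; an explicit reset to zero on reaching
    // num_workers would behave the same.
    next_worker = (next_worker + 1) % num_workers;
  }

  for (unsigned w = 0; w < num_workers; w++) {
    printf("worker %u compacts into regions:", w);
    for (unsigned r : queues[w]) {
      printf(" %u", r);
    }
    printf("\n");
  }
  return 0;
}
```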
-------------
PR: https://git.openjdk.java.net/jdk/pull/7106