RFR: 8268163: Change the order of fallback full GCs in G1

Tue Jun 8 13:19:18 UTC 2021

On Thu, 3 Jun 2021 18:25:00 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:

> Please review this change to make the order of G1 Full GCs a bit more straight forward.
> 
> **Summary**
> In [JDK-8233822](https://bugs.openjdk.java.net/browse/JDK-8233822) the way Full GCs were scheduled was changed a bit to support a use-case introduced by [JDK-8202286](https://bugs.openjdk.java.net/browse/JDK-8202286) (which allowed the heap to have the old generation on an alternative memory device). This feature has been removed ([JDK-8256181](https://bugs.openjdk.java.net/browse/JDK-8256181)), but the Full GC scheduling has not been reverted. This can lead to situations where we do three Full GCs in a row. Doing more than two seems a bit over the top, so this change more or less reverts back to the old behavior 
> * Young collection requesting to allocate when done will cause at most two Full GCs, the first will not clear any soft references and allow dead wood to be left. The second one, if still not able to satisfy the allocation will clear soft references and compact everything not allowing any dead wood.
> * For concurrent start collections and young collections not requesting any allocation, one Full GC will be scheduled if no regions were freed up by the initial collection.
> 
> A change compared to current behavior is that a concurrent collection started because of the metadata threshold not being met, will no longer be upgraded to a Full GC. This is still equal to how things were handled prior to [JDK-8233822](https://bugs.openjdk.java.net/browse/JDK-8233822).
> 
> **Testing**
> Tier 1-3, plus manual testing looking at output in near OOM situations.

Apart from these pre-existing issues, lgtm.

I also asked @kstefanj to consider renaming "maximum compaction/collection" to "maximal compaction" or at least streamline the naming of that term in a separate RFR too.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1105:

> 1103:   // do_full_collection only fails if blocked by GC locker and that can't
> 1104:   // be the case here since we only call this when already completed one gc.
> 1105:   assert(success, "invariant");

... this when we already completed a gc.

src/hotspot/share/gc/g1/g1VMOperations.cpp line 103:

> 101:     // the GCLocker check in the prologue.)
> 102:     _transient_failure = true;
> 103:   } else if (g1h->should_upgrade_to_full_gc(_gc_cause)) {

I think there is a pre-existing bug here: `should_upgrade_to_full_gc` always returns `false` here because the called `should_do_concurrent_full_gc()` used there (and returning true) is always true for `VM_G1TryInitiateConcMark`.

Afaict this has been introduced in (https://bugs.openjdk.java.net/browse/JDK-8232588) where `VM_G1CollectForAllocation` has been split into `VM_G1TryInitiateConcMark` and `VM_G1CollectForAllocation`, and `should_upgrade_to_full_gc()` not updated.

Tote that JDK-8232588 suggests to *not* upgrade for a `System.gc` call with `-XX:+ExplicitGCInvokesConcurrent`, but strangely adds the code to do so. Later, JDK-8232588, due to some refactoring error, disables the upgrade again.

I'll file a bug to investigate this (mess).

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/4342