RFR: 8366865: G1: Allocation GC Pauses Triggered after JVM has started shutdown [v2]

Mon Sep 15 13:19:48 UTC 2025

On Fri, 12 Sep 2025 10:38:08 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> Please review this patch to skip VM_GC_Collect_Operations  if initiated after the VM shutdown process has begun. We add a _is_shutting_down flag to CollectedHeap, which is set while holding the Heap_lock. This ensures mutual exclusion with VM_GC_Collect_Operations, which also require the Heap_lock.
>> 
>> Skipping VM_GC_Collect_Operation would otherwise cause allocation requests to fail (resulting in OutOfMemoryError) if requesting daemon threads were allowed to continue, we instead block these threads on a monitor. They remain stalled until they are terminated as part of the VM shutdown sequence.
>> 
>> Testing: Tier 1-7
>
> Ivan Walulya has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains nine additional commits since the last revision:
> 
>  - return on timeout
>  - Merge remote-tracking branch 'upstream/master' into shutting_down_gcs
>  - timed wait
>  - Merge remote-tracking branch 'upstream/master' into shutting_down_gcs
>  - space
>  - remove debug logs
>  - remove debug logs
>  - log_cpu_time
>  - init

I think this is mostly good, just some minor nitpicks.

src/hotspot/share/gc/shared/collectedHeap.cpp line 616:

> 614:   // If the VM is shutting down, we may have skipped VM_CollectForAllocation.
> 615:   // To avoid returning nullptr (which could cause premature OOME), we stall
> 616:   // allocation requests here allow the VM shutdown is complete.

This sentence is confusing and does not seem to be proper English. I do not completely understand what is meant, (maybe `s/allow/until`?) so no particular suggestions from me.

src/hotspot/share/gc/shared/collectedHeap.cpp line 624:

> 622:   //     sequence completes in the common case.
> 623:   //   - short enough to avoid excessive stall time if the shutdown itself
> 624:   //     triggers a GC.

I think this should be put into the .hpp file, or at least the comment `// Stall allocation requests until the VM shutdown is complete.` seems a bit misleading if in reality the VM has a timeout.

src/hotspot/share/gc/shared/collectedHeap.cpp line 626:

> 624:   //     triggers a GC.
> 625:   MonitorLocker ml(VMExit_lock);
> 626:   ml.wait(2 * 1000);

Maybe log a warning or such in case this times out - it seems to be only possible at that point if we are continuing for too long, and the VM will OOME soon anyway (the code will return `nullptr`).
However I'm fine if we think that is not necessary.

Also maybe use the constant `MILLIUNITS` instead of the hardcoded `1000`.

src/hotspot/share/runtime/java.cpp line 511:

> 509:   #endif
> 510: 
> 511:   // Run before exit and then stop concurrent GC threads

I think this comment should be improved - it looks like a straggler from the many changes in this file. It describes what `Universe::before_exit()` does, but maybe such documentation (what it does) is better put in the documentation for that method.

-------------

Changes requested by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27190#pullrequestreview-3224510575
PR Review Comment: https://git.openjdk.org/jdk/pull/27190#discussion_r2348932834
PR Review Comment: https://git.openjdk.org/jdk/pull/27190#discussion_r2348941360
PR Review Comment: https://git.openjdk.org/jdk/pull/27190#discussion_r2348962032
PR Review Comment: https://git.openjdk.org/jdk/pull/27190#discussion_r2348952363