RFR: 8335126: Shenandoah: Improve OOM handling
Y. Srinivas Ramakrishna
ysr at openjdk.org
Wed Jun 26 23:28:10 UTC 2024
On Wed, 26 Jun 2024 17:56:28 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:
>> 1. Throw OOM after failed allocation request following a Full GC (rather
>> than retrying as long as Full GC makes good progress because
>> repeatedly retrying the allocation request creates brown-out behavior
>> with no identified benefits on real-world workloads)
>>
>> 2. Count a successful allocation following a blocking
>> handle_allocation_failure() request to be good GC progress.
>> Otherwise, we increment gc_no_progress_count in full GCs that
>> have bad progress but successful allocations, and this causes
>> unwanted failure to even try a full GC in a different thread after
>> an out-of-memory condition might have been resolved in this thread.
>>
>> 3. Count a completed concurrent GC cycle as good progress, regardless
>> of how much memory it might have been able to reclaim. The fact that
>> concurrent GC succeeded without allocation failure and without
>> degeneration is considered good progress. Successful concurrent
>> GCs between Full GCs will reset the gc_no_progress_count to zero.
>>
>> 4. Do not count degenerated cycles as having no-progress. If a
>> degenerated cycle has no progress, it will upgrade to full GC.
>> The upgraded full GC will evaluate its own progress. We don't
>> want to count this "same [upgraded] cycle" twice.
>>
>> These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation.
>
> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 969:
>
>> 967: // So the second thread experiences OOMError even though another GC would have reclaimed the memory it wanted
>> 968: // to allocate.
>> 969: // 2. A GLOBAL GC won't necessarily reclaim all garbage. Following a concurrent Generational GLOBAL GC, we may
>
> Hmmm. I let Generational concepts slip into this comment. Maybe harmless. Maybe inappropriate.
> Reviewers: feel free to ask that I remove this mention...
I think this is OK for now because we expect to integrate generational Shenandoah soon, but I am fine with deferring it to the GenShen integration if other reviewers feel that would be better.
One question I had was about the policy for treating and reclaiming soft references: does it change, or escalate to "clear all", upon reaching a "less than good progress" state prior to an OOM (reflecting the best-effort promise made before an OOM is thrown)?
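For reference while reviewing, here is how I read the four rules in the quoted PR description, written as a small standalone model. This is only a sketch of my understanding, not the code in shenandoahHeap.cpp: the names GCKind, ProgressTracker, on_cycle_complete, on_allocation_after_blocking_gc and should_throw_oom are hypothetical, and the assumption that a degenerated cycle with good progress resets the counter is mine. It deliberately says nothing about soft references, since that is the open question.

```cpp
#include <cstddef>

// Hypothetical model of the described policy; NOT the HotSpot implementation.
enum class GCKind { Concurrent, Degenerated, Full };

struct ProgressTracker {
    // Stands in for gc_no_progress_count in the PR description.
    std::size_t no_progress_count = 0;

    void notify_gc_progress()    { no_progress_count = 0; }
    void notify_gc_no_progress() { ++no_progress_count; }

    // Called when a GC cycle of the given kind completes.
    void on_cycle_complete(GCKind kind, bool good_progress) {
        switch (kind) {
        case GCKind::Concurrent:
            // Rule 3: a completed concurrent cycle always counts as progress,
            // regardless of how much memory it reclaimed.
            notify_gc_progress();
            break;
        case GCKind::Degenerated:
            // Rule 4: a degenerated cycle never counts as no-progress; if it
            // made none, it upgrades to Full GC, which is evaluated below.
            // Assumption: good progress here resets the counter.
            if (good_progress) {
                notify_gc_progress();
            }
            break;
        case GCKind::Full:
            if (good_progress) {
                notify_gc_progress();
            } else {
                notify_gc_no_progress();
            }
            break;
        }
    }

    // Rule 2: a successful allocation after a blocking allocation-failure
    // request counts as good progress, even if the GC itself reported little.
    void on_allocation_after_blocking_gc(bool succeeded) {
        if (succeeded) {
            notify_gc_progress();
        }
    }
};

// Rule 1: once an allocation fails right after a Full GC, throw OOM
// instead of retrying for as long as Full GC keeps making progress.
inline bool should_throw_oom(GCKind last_gc, bool allocation_succeeded) {
    return last_gc == GCKind::Full && !allocation_succeeded;
}
```

In this model a Full GC with bad progress followed by a successful allocation leaves the counter at zero, which is the case rule 2 is meant to fix.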
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19912#discussion_r1655626477