RFR: 8335126: Shenandoah: Improve OOM handling

Kelvin Nilsen kdnilsen at openjdk.org
Mon Jul 8 16:22:34 UTC 2024


On Thu, 27 Jun 2024 19:02:04 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> 1. Throw OOM after a failed allocation request that follows a Full GC,
>>    rather than retrying for as long as Full GC makes good progress.
>>    Repeatedly retrying the allocation request creates brown-out behavior
>>    with no identified benefit on real-world workloads.
>> 
>> 2. Count a successful allocation following a blocking
>>    handle_allocation_failure() request to be good GC progress.
>>    Otherwise, we increment gc_no_progress_count in full GCs that
>>    have bad progress but successful allocations, and this causes
>>    unwanted failure to even try a full GC in a different thread after
>>    an out-of-memory condition might have been resolved in this thread.
>> 
>> 3. Count a completed concurrent GC cycle as good progress, regardless
>>    of how much memory it might have been able to reclaim.  The fact that
>>    concurrent GC succeeded without allocation failure and without
>>    degeneration is considered good progress.  Successful concurrent
>>    GCs between Full GCs will reset the gc_no_progress_count to zero.
>> 
>> 4. Do not count degenerated cycles as having no progress.  If a
>>    degenerated cycle has no progress, it will upgrade to full GC.
>>    The upgraded full GC will evaluate its own progress.  We don't
>>    want to count this "same [upgraded] cycle" twice.
>> 
>> These changes have been tested over a variety of workloads and standard tests, including with the generational mode of Shenandoah.  They appear to provide more robust and consistent handling across a diversity of scenarios than the original implementation.
>
> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 960:
> 
>> 958:       //   b) We experienced at least one Full GC (whether or not it had good progress)
>> 959:       //
>> 960:       // TODO: Rather than require a Full GC before throwing OOMError, it might be more appropriate for handle_alloc_failure()
> 
> Pro-tip: If you find yourself writing a large TODO comment, it should probably be transplanted straight into a new issue.

Thanks.  I will create an issue for this.
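
For readers following the thread, the four counting rules in the quoted description can be modeled roughly as below. This is a minimal standalone sketch, not the HotSpot code: `ProgressTracker`, `CycleKind`, `RetryDecision`, `after_failed_retry()` and the method names are all hypothetical, and the counter only stands in for gc_no_progress_count. Two behaviors are assumptions that the description does not state: a degenerated cycle with good progress resets the counter, and a failed retry after a non-Full cycle simply retries.

```cpp
#include <cstddef>

// Hypothetical standalone model of the four rules; not HotSpot code.
enum class CycleKind { Concurrent, Degenerated, Full };
enum class RetryDecision { RetryAllocation, ThrowOOM };

class ProgressTracker {
  std::size_t _no_progress_count = 0;  // stands in for gc_no_progress_count

public:
  // Record the outcome of a finished GC cycle.
  void on_cycle_end(CycleKind kind, bool good_progress) {
    switch (kind) {
      case CycleKind::Concurrent:
        // Rule 3: a completed concurrent cycle is good progress,
        // regardless of how much memory it reclaimed.
        _no_progress_count = 0;
        break;
      case CycleKind::Degenerated:
        // Rule 4: never count a degenerated cycle as no-progress; if it
        // made none, it upgrades to Full GC, which evaluates itself.
        // Assumption: good progress here still resets the counter.
        if (good_progress) {
          _no_progress_count = 0;
        }
        break;
      case CycleKind::Full:
        // Full GC evaluates its own progress.
        if (good_progress) {
          _no_progress_count = 0;
        } else {
          ++_no_progress_count;
        }
        break;
    }
  }

  // Rule 2: a successful allocation after a blocking allocation-failure
  // request counts as good progress, even if the Full GC itself did not.
  void on_alloc_success_after_blocking_failure() { _no_progress_count = 0; }

  std::size_t no_progress_count() const { return _no_progress_count; }
};

// Rule 1: if the allocation still fails after a Full GC, throw OOM rather
// than retrying. Assumption: after any other cycle kind, retry.
inline RetryDecision after_failed_retry(CycleKind last_cycle) {
  return last_cycle == CycleKind::Full ? RetryDecision::ThrowOOM
                                       : RetryDecision::RetryAllocation;
}
```

In this model a Full GC with bad progress followed by a successful allocation leaves the counter at zero, which is the case item 2 is meant to cover.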

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19912#discussion_r1668933461

