RFR: 8335126: Shenandoah: Improve OOM handling

Kelvin Nilsen kdnilsen at openjdk.org
Wed Jun 26 17:59:18 UTC 2024


On Wed, 26 Jun 2024 17:51:36 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

> 1. Throw OOM after a failed allocation request following a Full GC,
>    rather than retrying as long as Full GC makes good progress.
>    Repeatedly retrying the allocation request creates brown-out behavior
>    with no identified benefits on real-world workloads.
> 
> 2. Count a successful allocation following a blocking
>    handle_allocation_failure() request to be good GC progress.
>    Otherwise, we increment gc_no_progress_count in full GCs that
>    have bad progress but successful allocations, and this causes
>    unwanted failure to even try a full GC in a different thread after
>    an out-of-memory condition might have been resolved in this thread.
> 
> 3. Count a completed concurrent GC cycle as good progress, regardless
>    of how much memory it might have been able to reclaim.  The fact that
>    concurrent GC succeeded without allocation failure and without
>    degeneration is considered good progress.  Successful concurrent
>    GCs between Full GCs will reset the gc_no_progress_count to zero.
> 
> 4. Do not count degenerated cycles as having no progress.  If a
>    degenerated cycle has no progress, it will upgrade to full GC.
>    The upgraded full GC will evaluate its own progress.  We don't
>    want to count this "same [upgraded] cycle" twice.
> 
> These changes have been tested over a variety of workloads and standard tests.  These changes have also been tested with the generational mode of Shenandoah.  It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation.
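
A minimal sketch of the four rules quoted above, written as a standalone model rather than as the actual shenandoahHeap.cpp code.  The names ProgressModel, CycleKind, on_cycle_completed, on_allocation_after_blocking_gc and should_throw_oom are hypothetical; only gc_no_progress_count comes from the description.  The sketch also assumes that "good progress" means resetting the counter to zero, and that a degenerated cycle with good progress resets it as well, neither of which the description states explicitly.

```cpp
#include <cstddef>

// Hypothetical model of the policy described above. Names are illustrative
// and are not taken from the HotSpot sources.
enum class CycleKind { Concurrent, Degenerated, Full };

struct ProgressModel {
    std::size_t gc_no_progress_count = 0;

    // Called when a GC cycle of the given kind finishes.
    void on_cycle_completed(CycleKind kind, bool good_progress) {
        switch (kind) {
        case CycleKind::Concurrent:
            // Rule 3: a completed concurrent cycle is good progress,
            // regardless of how much memory it reclaimed.
            gc_no_progress_count = 0;
            break;
        case CycleKind::Degenerated:
            // Rule 4: never count a degenerated cycle as no-progress; a
            // degenerated cycle without progress upgrades to Full GC, which
            // reports its own result. Resetting on good progress is an
            // assumption of this sketch.
            if (good_progress) gc_no_progress_count = 0;
            break;
        case CycleKind::Full:
            // A Full GC evaluates its own progress.
            if (good_progress) gc_no_progress_count = 0;
            else ++gc_no_progress_count;
            break;
        }
    }

    // Rule 2: a successful allocation after a blocking allocation-failure
    // request counts as good progress (assumed here to reset the counter).
    void on_allocation_after_blocking_gc(bool succeeded) {
        if (succeeded) gc_no_progress_count = 0;
    }

    // Rule 1: throw OOM when the allocation retry fails right after a Full
    // GC, instead of retrying again. Returning false for the other cycle
    // kinds is an assumption of this sketch.
    static bool should_throw_oom(CycleKind last_cycle, bool retry_succeeded) {
        return last_cycle == CycleKind::Full && !retry_succeeded;
    }
};
```

In this model, a Full GC with bad progress followed by a successful allocation leaves the counter at zero, so a different thread that later fails to allocate is not prevented from attempting its own Full GC.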

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 969:

> 967:       //           So the second thread experiences OOMError even though another GC would have reclaimed the memory it wanted
> 968:       //           to allocate.
> 969:       //        2. A GLOBAL GC won't necessarily reclaim all garbage.  Following a concurrent Generational GLOBAL GC, we may

Hmmm.  I let Generational concepts slip into this comment.  Maybe harmless.  Maybe inappropriate.  
Reviewers: feel free to ask that I remove this mention...

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19912#discussion_r1655305604


More information about the hotspot-gc-dev mailing list