RFR: 8316632: Shenandoah: Raise OOME when gc threshold is exceeded
Y. Srinivas Ramakrishna
ysr at openjdk.org
Thu Sep 21 21:51:12 UTC 2023
On Wed, 20 Sep 2023 22:41:52 GMT, William Kemper <wkemper at openjdk.org> wrote:
> Shenandoah will run back-to-back full GCs and _almost_ grind mutator progress to a halt before eventually exhausting memory. This change will have Shenandoah raise a gc threshold exceeded exception if the collector fails to make progress after `ShenandoahNoProgressThreshold` full GC cycles (default is 3).
I have a few questions; thanks for clearing up my confusion! :-)
src/hotspot/share/gc/shenandoah/shenandoahControlThread.hpp line 124:
> 122: ~ShenandoahControlThread();
> 123:
> 124: // Handle allocation failure from normal allocation.
Re "normal allocation" : what is not a normal (i.e. abnormal?) allocation?
src/hotspot/share/gc/shenandoah/shenandoahControlThread.hpp line 125:
> 123:
> 124: // Handle allocation failure from normal allocation.
> 125: // Optionally blocks while collector is handling the failure.
Who is allowed to call the non-blocking version? What is the semantics of block or don't block? Can the documentation be extended to elaborate on its use?
src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 947:
> 945: control_thread()->handle_alloc_failure(req, false);
> 946: notify_gc_progress();
> 947: *gc_overhead_limit_was_exceeded = true;
I am confused. We have noted that we did more than `ShenandoahNoProgressThreshold` collections at this point. So we will fail the allocation request, set `gc_overhead_limit_was_exceeded`, and return `null` to the requester.
What happens at the caller, in response to this set of conditions?
Why then does it make sense to `notify_gc_progress()`, which clears the counter of `_gc_no_progress_count`? Where was the ostensible progress in this case?
src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 292:
> 290: ShenandoahSharedFlag _concurrent_strong_root_in_progress;
> 291:
> 292: size_t _no_gc_progress_count;
Can we call this `_gc_no_progress_count` instead, so it's consistent with the corresponding `get_` method? The difference in naming is unnecessary at best and potentially confusing for maintenance in the long run, at worst.
-------------
PR Review: https://git.openjdk.org/jdk/pull/15852#pullrequestreview-1638713120
PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1333628772
PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1333629455
PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1333633859
PR Review Comment: https://git.openjdk.org/jdk/pull/15852#discussion_r1333608047
More information about the hotspot-gc-dev
mailing list