RFR: 8348594: Shenandoah: Do not penalize for degeneration when not the fault of triggering heuristic [v2]

Paul Hohensee phh at openjdk.org
Tue Feb 4 16:19:20 UTC 2025


On Mon, 27 Jan 2025 02:05:02 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

>> Shenandoah heuristics use a penalty mechanism to cause earlier GC triggers when recent concurrent GC cycles degenerate.  Degeneration is a stop-the-world remediation that allows GC to catch up when mutator allocations fail during concurrent GC.  The fact that we needed to degenerate indicates that we were overly optimistic in delaying the trigger that starts concurrent GC.
>> 
>> We have observed that it is common for degenerated GC cycles to cascade upon each other.  The condition that caused an initial degenerated cycle is often not fully resolved by the end of that degenerated cycle.  For example, the application may be experiencing a phase change and the GC heuristics are not yet attuned to the new behavior.  Furthermore, a degenerated GC may exacerbate the problem condition.  During the stop-the-world pause imposed by the first degenerated GC, work continues to accumulate in the form of new client requests that are buffered in network sockets until the end of that degenerated GC.
>> 
>> As originally implemented, each degeneration would "pile on" additional penalties.  These penalties cause the GC frequency to continue to increase, and the expanding CPU load of GC makes it increasingly difficult for mutator threads to catch up.  The large penalties accumulated while we are trying to resolve the problem linger long after the problem condition has been resolved.
>> 
>> This change does not add further to the degeneration penalties if a new degenerated cycle occurs through no fault of the triggering mechanism.  We only add the degeneration penalty if the reason we are now degenerating can be attributed to a consciously late trigger by the heuristic.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Respond to reviewer feedback
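
For readers following along, the rule described in the PR text above can be sketched roughly as below. This is a standalone illustration, not the HotSpot code: the struct, its member names, and the "at least one declined trigger" test are all assumptions made for the sketch. The PR description only says the penalty applies when the degeneration can be attributed to a consciously late trigger.

```cpp
#include <cstddef>

// Hypothetical stand-in for the heuristic's bookkeeping (not ShenandoahHeuristics).
struct PenaltySketch {
  std::size_t declined_trigger_count = 0;      // "no" answers since the last capture
  std::size_t declinations_at_cycle_start = 0; // assumed capture point: start of a GC cycle
  int penalty = 0;

  // The heuristic was asked whether to start GC and answered "not yet".
  void decline_trigger() { ++declined_trigger_count; }

  // A GC cycle begins: remember how often we declined beforehand, then reset.
  void cycle_started() {
    declinations_at_cycle_start = declined_trigger_count;
    declined_trigger_count = 0;
  }

  // The cycle degenerated. Assumed policy: penalize only if the heuristic
  // chose to wait at least once before this cycle was triggered.
  void record_degenerated(int degenerated_penalty) {
    if (declinations_at_cycle_start > 0) {
      penalty += degenerated_penalty;
    }
  }
};
```

With this shape, a degenerated cycle that immediately follows another one (no declined triggers in between) adds nothing, so cascading degenerations do not pile on penalties.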

src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.cpp line 318:

> 316: 
> 317:   if (ShenandoahHeuristics::should_start_gc()) {
> 318:     _start_gc_is_pending = true;

I assume there's no race here, i.e., only one thread reads/writes _start_gc_is_pending. If there's a race, make sure it's benign. In either case, _start_gc_is_pending is made "sticky" by this code.
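
If it turns out that more than one thread can touch the flag, something along these lines would keep the race benign. This is a minimal standalone sketch: StickyTrigger and clear() are hypothetical names, and std::atomic is used only to keep the example self-contained (HotSpot code would use its own atomic utilities).

```cpp
#include <atomic>

// Hypothetical stand-in for the sticky flag, not the HotSpot class.
class StickyTrigger {
  std::atomic<bool> _start_gc_is_pending{false};

public:
  // Returns true once a trigger has fired, and keeps returning true
  // ("sticky") until clear() is called.
  bool should_start_gc(bool trigger_fired_now) {
    if (_start_gc_is_pending.load(std::memory_order_acquire)) {
      return true;
    }
    if (trigger_fired_now) {
      // Concurrent writers can only store true here, so the race is benign.
      _start_gc_is_pending.store(true, std::memory_order_release);
      return true;
    }
    return false;
  }

  // Called at whatever point the pending request is considered consumed.
  void clear() { _start_gc_is_pending.store(false, std::memory_order_release); }
};
```

If only a single thread ever reads and writes the flag, a plain bool is enough and none of this is needed.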

src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.cpp line 261:

> 259: 
> 260: void ShenandoahHeuristics::record_success_concurrent() {
> 261:   _start_gc_is_pending = false;

The name _start_gc_is_pending implies that it should be set false as soon as a gc cycle starts, not when it finishes. Maybe _gc_pending? Or maybe setting it false at the end of a gc cycle is a bug? :)

src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.hpp line 87:

> 85:   size_t _declined_trigger_count;         // This counts how many times since previous GC finished that this
> 86:                                           //  heuristic has answered false to should_start_gc().
> 87:   size_t _previous_trigger_declinations;  // This represents the value of _declined_trigger_count as captured at the

Maybe the name should be _most_recent_declined_trigger_count, which relates it directly to _declined_trigger_count.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23305#discussion_r1941486248
PR Review Comment: https://git.openjdk.org/jdk/pull/23305#discussion_r1941462312
PR Review Comment: https://git.openjdk.org/jdk/pull/23305#discussion_r1941468695