RFR: 8321149: GenShen: Test for is_good_progress() following degen needs to sum all available memory

Fri Dec 1 23:53:21 UTC 2023

On Fri, 1 Dec 2023 18:12:05 GMT, William Kemper <wkemper at openjdk.org> wrote:

>> This reduces the sensitivity of the trigger that upgrades to Full GC following a completed degenerated GC.  The change was motivated by examination of a test workload that required one more Full GC and one more degenerated GC than expected.  The GC log showed that the extra Full GC was triggered because the mutator free set following degen was approximately 10% below the critical threshold, even though the total available memory within the heap was more than three times the critical threshold.
>> 
>> Following this change, no Full GCs were required and one fewer degenerated cycle was required (the eliminated cycle had occurred immediately after the Full GC, due to the long, unproductive STW pause caused by the Full GC).  p50 latency improved by 15%, and p100 latency improved by more than 100-fold.
>> 
>> More comprehensive testing over a broader set of workloads reveals this change is not "universally better".  Of particular concern is a degradation of specjbb numbers in x86 tests that does not appear in aarch64 tests.  I'm inclined to believe this change represents a net improvement, but it would be best to delay integration until we have a better understanding of the specjbb performance issues and how they might be impacted by this change.
>> 
>> 
>> 
>>    Control: shenandoah-x86-template
>> Experiment: fix-is-good-progress-gh-x86
>> 
>>                           Most impacted benchmarks |                              Most impacted metrics
>> -------------------------------------------------------------------------------------------------------
>>                                Genshen/specjbb2015 |                                         cpu_system
>>                                      Shenandoah/h2 |                              concurrent_evacuation
>>                             Genshen/diluvian_large |                             transfer_old_from_satb
>>                                  Shenandoah/serial |                                      critical_jops
>>                         Genshen/extremem-large-45g |                                      trigger_learn
>> 
>> 
>>                                 Only in experiment |                                    Only in control
>> -------------------------------------------------------------------------------------------------------
>>                        extremem-ff/trigger_failure |            scimark.fft.small/concurrent_evacuation
>>                                   tomcat/cwr_total |           scimark.lu.large/concurrent...
>
> src/hotspot/share/gc/shenandoah/shenandoahMetrics.cpp line 1:
> 
>> 1: /*
> 
> We need to think about what _good progress_ means... In my mind, it means the _mutators_ have enough memory to last them through a concurrent cycle. Maybe we ought to have this code evaluate the allocation rate and gc times as the heuristics do? Alternatively, we could also reduce `ShenandoahCriticalFreeThreshold` to achieve a similar end? Or, for generational mode, we should look at how much memory the degenerated cycle made available in the _young_ generation?

Thanks.  These are good insights.  I will explore further.
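
To make the difference between the two tests concrete, here is a minimal sketch. It is not the code in shenandoahMetrics.cpp: the `HeapSnapshot` struct, its field names, the function names, and the 1% threshold are all hypothetical, chosen only to reproduce the situation described above (mutator free set about 10% below the critical threshold while total available memory is more than three times the threshold).

```cpp
#include <cstddef>

// Hypothetical snapshot of heap state after a degenerated cycle.
// These names are illustrative only; they are not HotSpot data structures.
struct HeapSnapshot {
  std::size_t capacity;         // total heap capacity, in bytes
  std::size_t mutator_free;     // free memory in the mutator free set
  std::size_t other_available;  // available memory outside the mutator free set
};

// Critical threshold expressed as a percentage of heap capacity.
constexpr std::size_t critical_threshold(const HeapSnapshot& h, std::size_t percent) {
  return h.capacity / 100 * percent;
}

// Test that looks only at the mutator free set.
constexpr bool good_progress_mutator_only(const HeapSnapshot& h, std::size_t percent) {
  return h.mutator_free >= critical_threshold(h, percent);
}

// Test that sums all available memory before comparing.
constexpr bool good_progress_summed(const HeapSnapshot& h, std::size_t percent) {
  return h.mutator_free + h.other_available >= critical_threshold(h, percent);
}

constexpr std::size_t MB = 1024 * 1024;

// Assumed numbers: 1000 MB heap, 1% threshold (10 MB),
// 9 MB in the mutator free set, 25 MB available elsewhere.
constexpr HeapSnapshot example{1000 * MB, 9 * MB, 25 * MB};

static_assert(!good_progress_mutator_only(example, 1),
              "mutator-only test reports bad progress and would upgrade to Full GC");
static_assert(good_progress_summed(example, 1),
              "summed test reports good progress and avoids the Full GC");
```

With these assumed numbers, the mutator-only test fails by 1 MB while the summed test passes with 34 MB against a 10 MB threshold. The sketch does not address the question raised in the review of whether memory outside the mutator free set is actually usable by mutators during the next concurrent cycle.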

-------------

PR Review Comment: https://git.openjdk.org/shenandoah/pull/364#discussion_r1412669834