RFR: 8321149: GenShen: Test for is_good_progress() following degen needs to sum all available memory

Kelvin Nilsen kdnilsen at openjdk.org
Fri Dec 1 15:56:13 UTC 2023


This reduces the sensitivity of the trigger that upgrades to Full GC following a completed degenerated GC. The change was motivated by examination of a test workload that required one more Full GC and one more degenerated GC than expected. The GC log showed that the extra Full GC was triggered because the mutator free set following degen was approximately 10% below the critical threshold, even though the total available memory within the heap was more than three times the critical threshold.
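The difference between the two tests can be illustrated with a minimal sketch. This is not the code from the patch: the struct, its field names, and both function names are hypothetical, and the sketch models only the free-memory comparison described above, assuming the critical threshold is expressed as a percentage of heap capacity.

```cpp
#include <cstddef>

// Hypothetical snapshot of heap state after a degenerated GC.
// Field names are illustrative and are not taken from the HotSpot sources.
struct HeapSnapshot {
  std::size_t capacity;               // total heap capacity
  std::size_t mutator_free;           // free memory in the mutator free set
  std::size_t other_available;        // available memory outside the mutator free set
  unsigned    critical_free_percent;  // critical threshold, as a percent of capacity
};

// Critical threshold in the same units as the snapshot fields.
inline std::size_t critical_threshold(const HeapSnapshot& s) {
  return s.capacity / 100 * s.critical_free_percent;
}

// Test as described in the motivation: only the mutator free set is compared
// against the critical threshold.
inline bool good_progress_mutator_only(const HeapSnapshot& s) {
  return s.mutator_free >= critical_threshold(s);
}

// Test as described for the change: all available memory is summed before
// the comparison.
inline bool good_progress_total_available(const HeapSnapshot& s) {
  return s.mutator_free + s.other_available >= critical_threshold(s);
}
```

With illustrative numbers that mirror the case above (capacity 1000, threshold 1%, mutator free 9, other available 22), the mutator-only test reports no progress and would upgrade to Full GC, while the summed test sees 31 available against a threshold of 10 and reports good progress.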

Following this change, no Full GCs were required and one fewer degenerated cycle was required (the eliminated degenerated cycle had occurred immediately after the Full GC, due to the long unproductive STW pause caused by the Full GC).  P50 latency improved by 15%, and p100 latency improved by more than 100-fold.

More comprehensive testing over a broader set of workloads reveals that this change is not "universally better".  Of particular concern is the degradation of specjbb numbers in the x86 tests, which does not appear in the aarch64 tests.  I'm inclined to believe this change represents a net improvement, but it would be best to delay integration until we have a better understanding of the specjbb performance issues and how they might be impacted by this change.



   Control: shenandoah-x86-template
Experiment: fix-is-good-progress-gh-x86

                          Most impacted benchmarks |                              Most impacted metrics
-------------------------------------------------------------------------------------------------------
                               Genshen/specjbb2015 |                                         cpu_system
                                     Shenandoah/h2 |                              concurrent_evacuation
                            Genshen/diluvian_large |                             transfer_old_from_satb
                                 Shenandoah/serial |                                      critical_jops
                        Genshen/extremem-large-45g |                                      trigger_learn


                                Only in experiment |                                    Only in control
-------------------------------------------------------------------------------------------------------
                       extremem-ff/trigger_failure |            scimark.fft.small/concurrent_evacuation
                                  tomcat/cwr_total |           scimark.lu.large/concurrent_strong_roots
                    extremem-ff/degen_update_roots |          scimark.fft.small/concurrent_thread_roots
                       extremem-ff/du_thread_roots |   scimark.fft.small/concurrent_update_thread_roots
                              extremem-ff/du_total |            scimark.fft.large/concurrent_weak_roots

Shenandoah
-------------------------------------------------------------------------------------------------------
+40.97% serial/concurrent_evacuation p=0.00668
  Control:     16.703ms (+/-  6.30ms)         45
  Test:        23.547ms (+/-  6.06ms)          8

+8.77% scimark.sor.large/cpu_system p=0.04575
  Control:      2.293s  (+/-  0.29s )         60
  Test:         2.494s  (+/-  0.25s )         10

+6.02% mpegaudio/cpu_system p=0.03141
  Control:      5.712s  (+/-  0.44s )         60
  Test:         6.056s  (+/-  0.54s )         10

-11.37% h2/dacapo_metered_latency_max p=0.04808
  Control:     31.523ms (+/-  4.40ms)         60
  Test:        28.305ms (+/-  3.24ms)         10

-11.37% h2/dacapo_simple_latency_max p=0.04811
  Control:     31.523ms (+/-  4.40ms)         60
  Test:        28.305ms (+/-  3.24ms)         10

-6.67% hyperalloc_a3072_o4096/trigger_learn p=0.01965
  Control:     11.333   (+/-  0.90  )         60
  Test:        10.625   (+/-  0.67  )         10


Genshen
-------------------------------------------------------------------------------------------------------
+31.13% extremem-large-45g/transfer_old_from_satb p=0.01612
  Control:      1.444ms (+/-267.77us)         25
  Test:         1.893ms (+/-560.32us)          4

-44.61% diluvian_large/pause_degenerated_gc_n p=0.00862
  Control:     23.097s  (+/-  6.86s )         36
  Test:        15.972s  (+/-  5.04s )         11

-44.61% diluvian_large/pause_degenerated_gc_g p=0.00862
  Control:     23.097s  (+/-  6.86s )         36
  Test:        15.972s  (+/-  5.04s )         11

-13.37% crypto.signverify/pause_init_update_refs_g p=0.04082
  Control:      2.807ms (+/-728.62us)         89
  Test:         2.477ms (+/-714.60us)         20

-13.04% specjbb2015/trigger_failure p=0.04891
  Control:    122.083   (+/- 22.98  )         60
  Test:       108.000   (+/- 28.48  )         10

-11.70% specjbb2015/sla_100000_jops p=0.00312
  Control:   6284.875   (+/-517.36  )         60
  Test:      5626.500   (+/-549.73  )         10

-10.18% extremem-phased/trigger_spike p=0.01099
  Control:     32.917   (+/-  3.76  )         60
  Test:        29.875   (+/-  2.64  )         10

-8.36% specjbb2015/sla_75000_jops p=0.04222
  Control:   6005.917   (+/-539.64  )         60
  Test:      5542.625   (+/-564.13  )         10

-6.26% specjbb2015/critical_jops p=0.02604
  Control:   5202.917   (+/-348.10  )         60
  Test:      4896.500   (+/-411.36  )         10

-----------------------------------------


Pipeline: fix-is-good-progress-gh-aarch64
 Elapsed: 8:31:03.815824

kdnilsen at amazon.com : openjdk              : fix-is-goo : 2023-11-30 23:37:57+00:00 : e62dd2 : Change behavior of is_good_progerss
kemperw at amazon.com  : codepipeline-helpers : mainline   : 2023-11-21 10:20:23-08:00 : f42398 : Add more detail from tlab messages

b05cfa4a-66b5-4d94-81b6-35f965db87c0: Smoke: 0:00:00 Integration: 6:38:22.240000

      294: Passed
      294: Total




   Control: shenandoah-aarch64-template
Experiment: fix-is-good-progress-gh-aarch64

                          Most impacted benchmarks |                              Most impacted metrics
-------------------------------------------------------------------------------------------------------
                            Genshen/diluvian_large |                                         cpu_system
                         Shenandoah/diluvian_large |                                  trigger_threshold
                                     Genshen/xalan |                               context_switch_count
                               Genshen/specjbb2015 |                                    fu_thread_roots
                           Genshen/diluvian_medium |                                calculate_addresses


                                Only in experiment |                                    Only in control
-------------------------------------------------------------------------------------------------------
                          avrora/jhiccup_max_pause |                  scimark.lu.large/cmr_thread_roots
                  mpegaudio/concurrent_update_refs |              scimark.sparse.large/cmr_thread_roots
                crypto.rsa/concurrent_thread_roots |                    mpegaudio/concurrent_mark_roots
                    diluvian_large/dcu_unlink_clds | scimark.sparse.small/concurrent_update_thread_roots
         scimark.lu.large/pause_init_update_refs_g |                           diluvian_large/cwr_total

Shenandoah
-------------------------------------------------------------------------------------------------------
+175.00% diluvian_large/trigger_threshold p=0.00004
  Control:      1.000   (+/-  0.00  )         40
  Test:         2.750   (+/-  0.48  )         10

-20.30% jme/cpu_system p=0.01368
  Control:      0.328s  (+/-  0.06s )         40
  Test:         0.273s  (+/-  0.04s )         10


Genshen
-------------------------------------------------------------------------------------------------------
-23.83% diluvian_large/pause_degenerated_gc_n p=0.00866
  Control:     25.834s  (+/-  4.48s )         32
  Test:        20.861s  (+/-  6.63s )         13

-23.83% diluvian_large/pause_degenerated_gc_g p=0.00866
  Control:     25.834s  (+/-  4.48s )         32
  Test:        20.861s  (+/-  6.63s )         13

-8.07% diluvian_large/calculate_addresses p=0.03028
  Control:      2.854s  (+/-284.07ms)         28
  Test:         2.641s  (+/-194.48ms)         10

-6.96% diluvian_medium/cpu_system p=0.03454
  Control:      2.345s  (+/-  0.21s )         50
  Test:         2.192s  (+/-  0.19s )         10

-5.73% specjbb2015/fu_thread_roots p=0.00000
  Control:      2.346ms (+/-203.14us)        238
  Test:         2.219ms (+/-126.57us)         48

-5.29% xalan/context_switch_count p=0.00851
  Control:  24231.100   (+/-2546.76  )         50
  Test:     23013.125   (+/-972.88  )         10

-------------

Commit messages:
 - Change behavior of is_good_progerss

Changes: https://git.openjdk.org/shenandoah/pull/364/files
 Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=364&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8321149
  Stats: 9 lines in 2 files changed: 8 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/shenandoah/pull/364.diff
  Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/364/head:pull/364

PR: https://git.openjdk.org/shenandoah/pull/364

