RFR: 8324995: Shenandoah: Skip to full gc for humongous allocation failures [v3]

Tue Feb 6 01:37:54 UTC 2024

On Wed, 31 Jan 2024 21:50:06 GMT, William Kemper <wkemper at openjdk.org> wrote:

>> Shenandoah degenerated cycles do not compact regions. When a humongous allocation fails, it is likely due to fragmentation which is better addressed by a full gc.
>
> William Kemper has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix typo in comment

I dusted off the workloads that had originally motivated this change.  The context in which the code executes has changed since then: at the time, we were doing 64 degens before a full gc.

Here is the test I ran 5 times:

for i in 48g 42g 36g 32g 31g 30g 29g 28g 27g 26g 25g 24g
do

echo Run TradiShen tip with memory size $i with 4s customer period
>&2 echo Run TradiShen tip with memory size $i with 4s customer period
~/github/jdk.2-1-2024/build/linux-x86_64-server-release/jdk/bin/java \
  -XX:+UnlockExperimentalVMOptions \
  -XX:+UseTransparentHugePages \
  -XX:-ShenandoahPacing \
  -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xms$i -Xmx$i \
  -XX:+UseShenandoahGC \
  -Xlog:"gc*=info,ergo" \
  -Xlog:safepoint=trace -Xlog:safepoint=debug -Xlog:safepoint=info \
  -XX:+UnlockDiagnosticVMOptions \
  -jar ~/github/heapothesys/Extremem/target/extremem-1.0-SNAPSHOT.jar \
  -dInitializationDelay=45s -dDictionarySize=16000000 -dNumCustomers=28000000 \
  -dNumProducts=64000 -dCustomerThreads=2000 -dCustomerPeriod=4s -dCustomerThinkTime=1s \
  -dKeywordSearchCount=4 -dServerThreads=5 -dServerPeriod=5s -dProductNameLength=10 \
  -dBrowsingHistoryQueueCount=5 \
  -dSalesTransactionQueueCount=5 \
  -dProductDescriptionLength=64 -dProductReplacementPeriod=25s -dProductReplacementCount=5 \
  -dCustomerReplacementPeriod=30s -dCustomerReplacementCount=1000 -dBrowsingExpiration=1m \
  -dPhasedUpdates=true \
  -dPhasedUpdateInterval=60s \
  -dSimulationDuration=20m -dResponseTimeMeasurements=100000


echo Run Humongous Failure Handling with one degen and memory size $i with 4s customer period
>&2 echo Run Humongous Failure Handling with one degen and memory size $i with 4s customer period
~/gitfarm/shen.humongous-alloc-failure-handling/build/linux-x86_64-server-release/jdk/bin/java \
  -XX:+UnlockExperimentalVMOptions \
  -XX:+UseTransparentHugePages \
  -XX:-ShenandoahPacing \
  -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xms$i -Xmx$i \
  -XX:+UseShenandoahGC \
  -Xlog:"gc*=info,ergo" \
  -Xlog:safepoint=trace -Xlog:safepoint=debug -Xlog:safepoint=info \
  -XX:+UnlockDiagnosticVMOptions \
  -jar ~/github/heapothesys/Extremem/target/extremem-1.0-SNAPSHOT.jar \
  -dInitializationDelay=45s -dDictionarySize=16000000 -dNumCustomers=28000000 \
  -dNumProducts=64000 -dCustomerThreads=2000 -dCustomerPeriod=4s -dCustomerThinkTime=1s \
  -dKeywordSearchCount=4 -dServerThreads=5 -dServerPeriod=5s -dProductNameLength=10 \
  -dBrowsingHistoryQueueCount=5 \
  -dSalesTransactionQueueCount=5 \
  -dProductDescriptionLength=64 -dProductReplacementPeriod=25s -dProductReplacementCount=5 \
  -dCustomerReplacementPeriod=30s -dCustomerReplacementCount=1000 -dBrowsingExpiration=1m \
  -dPhasedUpdates=true \
  -dPhasedUpdateInterval=60s \
  -dSimulationDuration=20m -dResponseTimeMeasurements=100000

done

My shen.humongous-alloc-failure-handling branch had an experimental delta from what is here.  The idea was to do only one degen, and then upgrade to full on a humongous alloc failure:
```diff
diff --git a/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp b/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
index a4b2adc8f5a..42721059b7e 100644
--- a/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
+++ b/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
@@ -108,7 +108,11 @@ void ShenandoahControlThread::run_service() {
 
       // If a humongous allocation has failed, then the heap is likely in need of compaction, so run
       // a full gc (which compacts regions) instead of a degenerated gc (which does not compact regions).
-      if (ShenandoahDegeneratedGC && heuristics->should_degenerate_cycle() && !humongous_alloc_failure_pending) {
+
+      // Experiment: If we had a humongous_alloc_failure, make sure we try at least one degen before going to full.
+      if (ShenandoahDegeneratedGC &&
+          ((humongous_alloc_failure_pending && heap->shenandoah_policy()->consecutive_degenerated_gc_count() == 0) ||
+           (!humongous_alloc_failure_pending && heuristics->should_degenerate_cycle()))) {
         heuristics->record_allocation_failure_gc();
         policy->record_alloc_failure_to_degenerated(degen_point);
         mode = stw_degenerated;
```
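
To make the difference between the two conditions easier to see, here is a minimal standalone sketch that restates them as pure predicates. It is not HotSpot code: `CycleState` and both function names are hypothetical, and the struct fields merely stand in for `ShenandoahDegeneratedGC`, `humongous_alloc_failure_pending`, `heuristics->should_degenerate_cycle()` and `consecutive_degenerated_gc_count()` from the diff.

```cpp
// Hypothetical model of the degenerate-vs-full decision; names are illustrative only.
struct CycleState {
  bool degenerated_gc_enabled;     // stands in for ShenandoahDegeneratedGC
  bool humongous_failure_pending;  // stands in for humongous_alloc_failure_pending
  bool heuristics_want_degen;      // stands in for heuristics->should_degenerate_cycle()
  unsigned consecutive_degens;     // stands in for consecutive_degenerated_gc_count()
};

// Condition on the removed line of the diff: a pending humongous
// allocation failure never degenerates, so it goes straight to full gc.
constexpr bool degenerate_skip_to_full(const CycleState& s) {
  return s.degenerated_gc_enabled && s.heuristics_want_degen &&
         !s.humongous_failure_pending;
}

// Condition on the added lines of the diff: a pending humongous allocation
// failure degenerates only if no degenerated cycle has run yet; all other
// failures defer to the heuristics as before.
constexpr bool degenerate_one_degen_first(const CycleState& s) {
  return s.degenerated_gc_enabled &&
         ((s.humongous_failure_pending && s.consecutive_degens == 0) ||
          (!s.humongous_failure_pending && s.heuristics_want_degen));
}

// First humongous failure: the experiment still tries one degen, the PR does not.
static_assert(degenerate_one_degen_first(CycleState{true, true, true, 0}),
              "experiment degenerates once");
static_assert(!degenerate_skip_to_full(CycleState{true, true, true, 0}),
              "PR condition skips to full gc");
// Humongous failure after one degen: both variants fall through to full gc.
static_assert(!degenerate_one_degen_first(CycleState{true, true, true, 1}),
              "experiment upgrades to full gc");
```

In this model, the experimental variant takes its one degen on a humongous failure regardless of what the heuristics say, because that branch of the condition in the diff does not consult `should_degenerate_cycle()`.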

The results are summarized in the attached spreadsheet.
[humongous-alloc-failure-handling.xlsx](https://github.com/openjdk/jdk/files/14173614/humongous-alloc-failure-handling.xlsx)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17638#issuecomment-1928614499