RFR: 8324995: Shenandoah: Skip to full gc for humongous allocation failures [v3]
Kelvin Nilsen
kdnilsen at openjdk.org
Tue Feb 6 01:37:54 UTC 2024
On Wed, 31 Jan 2024 21:50:06 GMT, William Kemper <wkemper at openjdk.org> wrote:
>> Shenandoah degenerated cycles do not compact regions. When a humongous allocation fails, it is likely due to fragmentation which is better addressed by a full gc.
>
> William Kemper has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix typo in comment
I dusted off the workloads that originally motivated this change. Some of the context in which the code executes has changed since then: at the time, we were doing 64 degens before a full gc.
Here is the test I ran 5 times:
```sh
for i in 48g 42g 36g 32g 31g 30g 29g 28g 27g 26g 25g 24g
do
  echo Run TradiShen tip with memory size $i with 4s customer period
  >&2 echo Run TradiShen tip with memory size $i with 4s customer period
  ~/github/jdk.2-1-2024/build/linux-x86_64-server-release/jdk/bin/java \
    -XX:+UnlockExperimentalVMOptions \
    -XX:+UseTransparentHugePages \
    -XX:-ShenandoahPacing \
    -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xms$i -Xmx$i \
    -XX:+UseShenandoahGC \
    -Xlog:"gc*=info,ergo" \
    -Xlog:safepoint=trace -Xlog:safepoint=debug -Xlog:safepoint=info \
    -XX:+UnlockDiagnosticVMOptions \
    -jar ~/github/heapothesys/Extremem/target/extremem-1.0-SNAPSHOT.jar \
    -dInitializationDelay=45s -dDictionarySize=16000000 -dNumCustomers=28000000 \
    -dNumProducts=64000 -dCustomerThreads=2000 -dCustomerPeriod=4s -dCustomerThinkTime=1s \
    -dKeywordSearchCount=4 -dServerThreads=5 -dServerPeriod=5s -dProductNameLength=10 \
    -dBrowsingHistoryQueueCount=5 \
    -dSalesTransactionQueueCount=5 \
    -dProductDescriptionLength=64 -dProductReplacementPeriod=25s -dProductReplacementCount=5 \
    -dCustomerReplacementPeriod=30s -dCustomerReplacementCount=1000 -dBrowsingExpiration=1m \
    -dPhasedUpdates=true \
    -dPhasedUpdateInterval=60s \
    -dSimulationDuration=20m -dResponseTimeMeasurements=100000
  echo Run Humongous Failure Handling with one degen and memory size $i with 4s customer period
  >&2 echo Run Humongous Failure Handling with one degen and memory size $i with 4s customer period
  ~/gitfarm/shen.humongous-alloc-failure-handling/build/linux-x86_64-server-release/jdk/bin/java \
    -XX:+UnlockExperimentalVMOptions \
    -XX:+UseTransparentHugePages \
    -XX:-ShenandoahPacing \
    -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xms$i -Xmx$i \
    -XX:+UseShenandoahGC \
    -Xlog:"gc*=info,ergo" \
    -Xlog:safepoint=trace -Xlog:safepoint=debug -Xlog:safepoint=info \
    -XX:+UnlockDiagnosticVMOptions \
    -jar ~/github/heapothesys/Extremem/target/extremem-1.0-SNAPSHOT.jar \
    -dInitializationDelay=45s -dDictionarySize=16000000 -dNumCustomers=28000000 \
    -dNumProducts=64000 -dCustomerThreads=2000 -dCustomerPeriod=4s -dCustomerThinkTime=1s \
    -dKeywordSearchCount=4 -dServerThreads=5 -dServerPeriod=5s -dProductNameLength=10 \
    -dBrowsingHistoryQueueCount=5 \
    -dSalesTransactionQueueCount=5 \
    -dProductDescriptionLength=64 -dProductReplacementPeriod=25s -dProductReplacementCount=5 \
    -dCustomerReplacementPeriod=30s -dCustomerReplacementCount=1000 -dBrowsingExpiration=1m \
    -dPhasedUpdates=true \
    -dPhasedUpdateInterval=60s \
    -dSimulationDuration=20m -dResponseTimeMeasurements=100000
done
```
My shen.humongous-alloc-failure-handling branch had an experimental delta from what is here. The idea was to do only one degen, and then upgrade to full on a humongous alloc failure:
```diff
diff --git a/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp b/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
index a4b2adc8f5a..42721059b7e 100644
--- a/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
+++ b/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
@@ -108,7 +108,11 @@ void ShenandoahControlThread::run_service() {
       // If a humongous allocation has failed, then the heap is likely in need of compaction, so run
       // a full gc (which compacts regions) instead of a degenerated gc (which does not compact regions).
-      if (ShenandoahDegeneratedGC && heuristics->should_degenerate_cycle() && !humongous_alloc_failure_pending) {
+
+      // Experiment: If we had a humongous_alloc_failure, make sure we try at least one degen before going to full.
+      if (ShenandoahDegeneratedGC &&
+          ((humongous_alloc_failure_pending && heap->shenandoah_policy()->consecutive_degenerated_gc_count() == 0) ||
+           (!humongous_alloc_failure_pending && heuristics->should_degenerate_cycle()))) {
         heuristics->record_allocation_failure_gc();
         policy->record_alloc_failure_to_degenerated(degen_point);
         mode = stw_degenerated;
```
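
To make the difference between the two conditions easier to see, here is a minimal standalone sketch of the decision as two pure functions. It is not HotSpot code: `GCMode`, `AllocFailureState`, `choose_mode_pr` and `choose_mode_experiment` are hypothetical names introduced only for illustration, the struct fields merely model the values read in the diff, and the sketch assumes (per the comment in the diff) that the alternative to a degenerated cycle is a full gc.

```cpp
#include <cstddef>

// Hypothetical stand-in for the two outcomes discussed above.
enum class GCMode { stw_degenerated, stw_full };

// Illustrative snapshot of the inputs the condition reads (not a HotSpot type).
struct AllocFailureState {
  bool degenerated_gc_enabled;          // models ShenandoahDegeneratedGC
  bool heuristics_want_degen;           // models heuristics->should_degenerate_cycle()
  bool humongous_alloc_failure;         // models humongous_alloc_failure_pending
  std::size_t consecutive_degen_count;  // models consecutive_degenerated_gc_count()
};

// Condition on the removed ('-') line: a humongous failure never degenerates.
GCMode choose_mode_pr(const AllocFailureState& s) {
  bool degen = s.degenerated_gc_enabled &&
               s.heuristics_want_degen &&
               !s.humongous_alloc_failure;
  return degen ? GCMode::stw_degenerated : GCMode::stw_full;
}

// Condition on the added ('+') lines: a humongous failure gets exactly one
// degenerated attempt (when no degen has just run); other failures defer
// to the heuristics as before.
GCMode choose_mode_experiment(const AllocFailureState& s) {
  bool degen = s.degenerated_gc_enabled &&
               ((s.humongous_alloc_failure && s.consecutive_degen_count == 0) ||
                (!s.humongous_alloc_failure && s.heuristics_want_degen));
  return degen ? GCMode::stw_degenerated : GCMode::stw_full;
}
```

Under these assumptions the two functions disagree only when a humongous allocation failure is pending and no degenerated cycle has just run: the first goes straight to full, the second tries one degen first. Note that in that case the experimental condition does not consult the heuristics at all.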
The results are summarized in the attached spreadsheet.
[humongous-alloc-failure-handling.xlsx](https://github.com/openjdk/jdk/files/14173614/humongous-alloc-failure-handling.xlsx)
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17638#issuecomment-1928614499