RFR: 8345423: Shenandoah: Parallelize concurrent cleanup [v6]
William Kemper
wkemper at openjdk.org
Thu Dec 5 18:25:39 UTC 2024
On Thu, 5 Dec 2024 08:58:52 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:
>> Concurrent cleanup after a Shenandoah collection cycle is currently executed by a single thread (the Shenandoah control thread), because recycling trashed regions takes the heap lock even though it can be done without the heap lock. This PR proposes to parallelize Shenandoah concurrent cleanup after making the recycling of trashed regions lock-free.
>>
>> With the change, the time to execute concurrent cleanup improves by more than 10x, and throughput/allocation rate also improves significantly:
>>
>> TIP:
>>
>> [30.380s][info][gc] GC(1245) Concurrent cleanup (Young) 3491M->739M(4096M) 3.634ms
>> [30.404s][info][gc] GC(1246) Concurrent cleanup (Young) 3258M->377M(4096M) 2.233ms
>> [30.434s][info][gc] GC(1247) Concurrent cleanup (Young) 2887M->333M(4096M) 7.958ms
>> [30.464s][info][gc] GC(1248) Concurrent cleanup (Young) 3134M->472M(4096M) 6.097ms
>> [30.487s][info][gc] GC(1249) Concurrent cleanup (Young) 2922M->212M(4096M) 3.072ms
>> [30.519s][info][gc] GC(1250) Concurrent cleanup (Young) 3404M->549M(4096M) 3.730ms
>> [30.552s][info][gc] GC(1251) Concurrent cleanup (Young) 3542M->712M(4096M) 6.118ms
>> [30.579s][info][gc] GC(1252) Concurrent cleanup (Young) 3257M->373M(4096M) 5.049ms
>> [30.608s][info][gc] GC(1253) Concurrent cleanup (Young) 3390M->418M(4096M) 2.779ms
>>
>> Parallelized:
>>
>> [30.426s][info][gc] GC(1557) Concurrent cleanup (Young) 3208M->43M(4096M) 0.177ms
>> [30.510s][info][gc] GC(1560) Concurrent cleanup (Young) 2938M->161M(4096M) 0.220ms
>> [30.534s][info][gc] GC(1561) Concurrent cleanup (Young) 2960M->57M(4096M) 0.164ms
>> [30.564s][info][gc] GC(1562) Concurrent cleanup (Young) 3189M->106M(4096M) 0.176ms
>> [30.595s][info][gc] GC(1563) Concurrent cleanup (Young) 3389M->367M(4096M) 0.247ms
>> [30.625s][info][gc] GC(1564) Concurrent cleanup (Young) 3662M->628M(4096M) 0.246ms
>> [30.649s][info][gc] GC(1565) Concurrent cleanup (Young) 3190M->150M(4096M) 0.172ms
>> [30.678s][info][gc] GC(1566) Concurrent cleanup (Young) 3225M->69M(4096M) 0.175ms
>> [30.709s][info][gc] GC(1567) Concurrent cleanup (Young) 3250M->107M(4096M) 0.179ms
>> [30.765s][info][gc] GC(1570) Concurrent cleanup (Young) 2932M->211M(4096M) 0.422ms
>>
>>
>> JVM args for the tests: `-Xms4G -Xmx4G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational -XX:-ShenandoahPacing -XX:+UseTLAB -Xlog:gc`
>>
>>
>> ### Additional test
>> - [x] MacOS AArch64 server fastdebug, hotspot_gc_shenandoah
>
> Xiaolong Peng has updated the pull request incrementally with two additional commits since the last revision:
>
> - Ensure atomicity when access region state
> - Bug fix and move is_trash test into try_recycle
Changes requested by wkemper (Committer).
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp line 629:
> 627: _recycling.unset();
> 628: } else {
> 629: while (_recycling.is_set()) {
Why are we adding this? Won't this make the calling worker thread wait on another worker to recycle the region?
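To illustrate the concern, here is a minimal sketch of a claim-based recycle in which a worker that loses the claim moves on instead of spinning. It uses standard C++ atomics rather than HotSpot's shared-flag types, and `Region`, `try_recycle`, and `recycle_all` are hypothetical names for illustration only, not the code in this PR:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap region; not ShenandoahHeapRegion.
struct Region {
  std::atomic<bool> trash{true};       // region holds garbage awaiting recycling
  std::atomic<bool> recycling{false};  // claim flag, playing the role of _recycling
  int recycle_count = 0;               // only touched while the claim is held

  // Returns true if the calling thread recycled the region. A thread that
  // loses the claim returns immediately; it does not wait for the winner.
  bool try_recycle() {
    if (!trash.load(std::memory_order_acquire)) {
      return false;  // nothing to do, region was already recycled
    }
    bool expected = false;
    if (!recycling.compare_exchange_strong(expected, true,
                                           std::memory_order_acq_rel)) {
      return false;  // another worker owns this region; move on
    }
    bool recycled = false;
    // Re-check under the claim: the region may have been recycled between
    // the first test and the successful claim.
    if (trash.load(std::memory_order_acquire)) {
      ++recycle_count;
      trash.store(false, std::memory_order_release);
      recycled = true;
    }
    recycling.store(false, std::memory_order_release);
    return recycled;
  }
};

// What a single worker would do: walk the regions and recycle whatever it
// can claim. Returns the number of regions this worker recycled.
std::size_t recycle_all(std::vector<Region>& regions) {
  std::size_t recycled = 0;
  for (Region& r : regions) {
    if (r.try_recycle()) {
      ++recycled;
    }
  }
  return recycled;
}
```

In this sketch a region whose claim is held by someone else is simply skipped, so no worker blocks on another. Whether skipping is acceptable here depends on whether the caller needs the region to be fully recycled before it proceeds, which is the question above.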
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp line 385:
> 383: void print_on(outputStream* st) const;
> 384:
> 385: void recycle_under_lock();
Should be `try_recycle_under_lock` for consistency.
-------------
PR Review: https://git.openjdk.org/jdk/pull/22538#pullrequestreview-2482551965
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1871890730
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1871891590