RFR: 8345423: Shenandoah: Parallelize concurrent cleanup [v8]
Y. Srinivas Ramakrishna
ysr at openjdk.org
Tue Dec 10 18:13:46 UTC 2024
On Mon, 9 Dec 2024 20:42:21 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:
>> Concurrent cleanup after Shenandoah collection cycle is executed by single thread(Shenandoah control thread), since currently recycling trashed regions requires heap lock even it can be done w/o heap lock. This PR is a proposal to parallelize the execution of Shenandoah concurrent cleanup after making recycling trashed regions lock free.
>>
>> With the change the time execute Concurrent cleanup has been significantly improved by 10+ times, throughput/allocation rate is also improved significantly:
>>
>> TIP:
>>
>> [30.380s][info][gc] GC(1245) Concurrent cleanup (Young) 3491M->739M(4096M) 3.634ms
>> [30.404s][info][gc] GC(1246) Concurrent cleanup (Young) 3258M->377M(4096M) 2.233ms
>> [30.434s][info][gc] GC(1247) Concurrent cleanup (Young) 2887M->333M(4096M) 7.958ms
>> [30.464s][info][gc] GC(1248) Concurrent cleanup (Young) 3134M->472M(4096M) 6.097ms
>> [30.487s][info][gc] GC(1249) Concurrent cleanup (Young) 2922M->212M(4096M) 3.072ms
>> [30.519s][info][gc] GC(1250) Concurrent cleanup (Young) 3404M->549M(4096M) 3.730ms
>> [30.552s][info][gc] GC(1251) Concurrent cleanup (Young) 3542M->712M(4096M) 6.118ms
>> [30.579s][info][gc] GC(1252) Concurrent cleanup (Young) 3257M->373M(4096M) 5.049ms
>> [30.608s][info][gc] GC(1253) Concurrent cleanup (Young) 3390M->418M(4096M) 2.779ms
>>
>> Parallelized:
>>
>> [30.426s][info][gc] GC(1557) Concurrent cleanup (Young) 3208M->43M(4096M) 0.177ms
>> [30.510s][info][gc] GC(1560) Concurrent cleanup (Young) 2938M->161M(4096M) 0.220ms
>> [30.534s][info][gc] GC(1561) Concurrent cleanup (Young) 2960M->57M(4096M) 0.164ms
>> [30.564s][info][gc] GC(1562) Concurrent cleanup (Young) 3189M->106M(4096M) 0.176ms
>> [30.595s][info][gc] GC(1563) Concurrent cleanup (Young) 3389M->367M(4096M) 0.247ms
>> [30.625s][info][gc] GC(1564) Concurrent cleanup (Young) 3662M->628M(4096M) 0.246ms
>> [30.649s][info][gc] GC(1565) Concurrent cleanup (Young) 3190M->150M(4096M) 0.172ms
>> [30.678s][info][gc] GC(1566) Concurrent cleanup (Young) 3225M->69M(4096M) 0.175ms
>> [30.709s][info][gc] GC(1567) Concurrent cleanup (Young) 3250M->107M(4096M) 0.179ms
>> [30.765s][info][gc] GC(1570) Concurrent cleanup (Young) 2932M->211M(4096M) 0.422ms
>>
>>
>> JVM args for the tests: `-Xms4G -Xmx4G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational -XX:-ShenandoahPacing -XX:+UseTLAB -Xlog:gc`
>>
>>
>> For the same test test, but with large heap with 32G memory, the improvement on concurrent cleanup is much smaller, which might be related t...
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
> Use parallel_heap_region_iterate to walk the regions
Good improvement.
Just some minor comments.
src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 1272:
> 1270: void ShenandoahFreeSet::recycle_trash() {
> 1271: // lock is not reentrable, check we don't have it
> 1272: shenandoah_assert_not_heaplocked();
Not your change but may be a good time to fix:
"not reentrable" -> "non-reentrant" (which is the more traditional term)
src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 884:
> 882: // During full gc, multiple GC worker threads may change region affiliations without a lock. No lock is enforced
> 883: // on read and write of _affiliated_region_count. At the end of full gc, a single thread overwrites the count with
> 884: // a coherent value.
Is the comment in its entirety still valid now? The part about "No lock is enforced" seems a bit dubious given the atomic op.
Similarly the comment in `decrement_...` below.
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp line 91:
> 89: SpaceMangler::mangle_region(MemRegion(_bottom, _end));
> 90: }
> 91: _recycling.unset();
Was this necessary, given the c'tor of the struct ShenandoiahFlag is called for the `_recycling` field? To check, I'd assert:
assert(!_recycling.is_set(), "C'tor should have been called by now.");
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp line 574:
> 572:
> 573:
> 574: void ShenandoahHeapRegion::recycle_internal() {
A paranoid assertion would be: ```assert(_recycling.is_set() && is_trash(), "Wrong state");```
But may be this is too paranoid since callers already check.
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp line 192:
> 190: void make_committed_bypass();
> 191:
> 192: // Individual states:
// Primitive state predicates
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp line 199:
> 197:
> 198: bool is_empty_state(RegionState state) const { return state == _empty_committed || state == _empty_uncommitted; }
> 199: bool is_humongous_start_state(RegionState state) const { return state == _humongous_start || state == _pinned_humongous_start; }
These should move below line 201 which states:
// Participation in logical groups:
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp line 201:
> 199: bool is_humongous_start_state(RegionState state) const { return state == _humongous_start || state == _pinned_humongous_start; }
> 200:
> 201: // Participation in logical groups:
// Derived state predicates (boolean combinations of individual states)
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp line 210:
> 208: bool is_cset() const { auto cur_state = state(); return cur_state == _cset || cur_state == _pinned_cset; }
> 209: bool is_pinned() const { auto cur_state = state(); return cur_state == _pinned || cur_state == _pinned_cset || cur_state == _pinned_humongous_start; }
> 210: bool is_regular_pinned() const { return state() == _pinned; }
Should go up into the primitive list at the top.
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp line 268:
> 266: CENSUS_NOISE(uint _youth;) // tracks epochs of retrograde ageing (rejuvenation)
> 267:
> 268: ShenandoahSharedFlag _recycling;
1-line documentation of what it represents.
// Used to indicate that the region is being recycled; see try_recycle*().
-------------
Marked as reviewed by ysr (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/22538#pullrequestreview-2490587106
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877048744
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877055108
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877164961
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877167040
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877076166
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877075569
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877092813
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877113173
PR Review Comment: https://git.openjdk.org/jdk/pull/22538#discussion_r1877134362
More information about the shenandoah-dev
mailing list