RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2]
Xiaolong Peng
xpeng at openjdk.org
Wed Jul 24 19:50:44 UTC 2024
> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Because all of these operations are very lightweight, parallelism seems to be overkill in the regular case, when the number of heap regions is not large: the cost of orchestrating multiple threads can exceed the cost of the work itself, and in most cases a single thread should be more efficient. Also, when multiple threads are needed, we should maximize the utilization of all active workers for best performance.
>
> This PR includes proposed improvements addressing the known issues:
> 1. Change the default value of ShenandoahParallelRegionStride to 0; when it is 0, Shenandoah automatically derives the stride value for best performance.
> 2. If num_regions is <= 4096, do not use worker threads at all, to avoid the overhead of multi-threading.
> 3. When num_regions is more than 4096, use worker threads to parallelize the workload, and derive the stride so that the workload is distributed evenly across all active workers.
> 4. When the number of active workers is 1, don't involve the workers; it is faster to finish the workload in the current thread (this avoids the overhead of multi-thread orchestration).
>
> Here are some time metrics I collected from a test with the TIP version (I added time metrics for parallel_heap_region_iterate):
>
> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride=<stride> -XX:ShenandoahTargetNumRegions=<num_regions> -Xlog:gc*"
>
> |              | 1024 regions | 2048 regions | 4096 regions | 8192 regions | 16384 regions |
> | ------------ | ------------ | ------------ | ------------ | ------------ | ------------- |
> | 1024 stride  | 5785 ns      | 22194 ns     | 20953 ns     | 23008 ns     | 33013 ns      |
> | 2048 stride  | N/A          | 6491 ns      | 22476 ns     | 25842 ns     | 34378 ns      |
> | 4096 stride  | N/A          | N/A          | 14034 ns     | 28425 ns     | 36324 ns      |
> | 8192 stride  | N/A          | N/A          | N/A          | 24359 ns     | 45231 ns      |
> | 16384 stride | N/A          | N/A          | N/A          | N/A          | 53679 ns      |
>
> Basically w...
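
The decision logic in items 1-4 of the quoted description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: `RegionIterationPlan`, `plan_region_iteration` and `kSerialThreshold` are hypothetical names, and the handling of an explicitly configured non-zero stride (honored as-is, with workers used only when there is more than one chunk) is an assumption rather than something the description states.

```cpp
#include <cstddef>

// Hypothetical names throughout; an illustration, not HotSpot source.
struct RegionIterationPlan {
  bool use_workers;    // true: hand the work to the worker threads
  std::size_t stride;  // regions per chunk when workers are used
};

// Threshold from the description: at or below it, stay single-threaded.
constexpr std::size_t kSerialThreshold = 4096;

RegionIterationPlan plan_region_iteration(std::size_t num_regions,
                                          std::size_t active_workers,
                                          std::size_t configured_stride) {
  // Item 4: with a single active worker, run in the current thread.
  if (active_workers <= 1) {
    return {false, num_regions};
  }

  std::size_t stride = configured_stride;
  if (stride == 0) {  // Item 1: 0 means "derive the stride automatically".
    // Item 2: small heaps are not worth the orchestration overhead.
    if (num_regions <= kSerialThreshold) {
      return {false, num_regions};
    }
    // Item 3: ceiling division spreads regions evenly over the workers.
    stride = (num_regions + active_workers - 1) / active_workers;
  }

  // Assumption: an explicit stride is honored as-is, and workers are
  // only worthwhile when the regions split into more than one chunk.
  return {num_regions > stride, stride};
}
```

Under these assumptions, `plan_region_iteration(8192, 8, 0)` selects the workers with a stride of 1024 regions each, while `plan_region_iteration(4096, 8, 0)` stays in the current thread.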
Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
Add trailing space
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/20305/files
- new: https://git.openjdk.org/jdk/pull/20305/files/01367942..dcc1d29f
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=20305&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=20305&range=00-01
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/20305.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/20305/head:pull/20305
PR: https://git.openjdk.org/jdk/pull/20305