RFR: 8256265 G1: Improve parallelism in regions that failed evacuation [v2]
Hamlin Li
mli at openjdk.java.net
Fri Jan 14 08:30:45 UTC 2022
On Thu, 2 Dec 2021 01:51:42 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> Summary
>> -------
>>
>> Currently, G1 assigns one thread per region that failed evacuation. This can effectively serialize the whole process, because often (particularly with region pinning) there is only one region to fix up.
>> Try to improve the parallelism when walking over the regions by:
>>
>> - first, splitting each region into tasks;
>> - then, processing these tasks in parallel and balancing the load among GC threads;
>> - finally, doing the necessary cleanup.
>>
>> NOTE: the load-balancing code is almost the same as in G1ParScanThreadState; if necessary and feasible, consider refactoring this part into a shared code base.
>>
>> Performance Test
>> -------
>>
>> The perf test, based on the latest implementation + JDK-8277736, shows that:
>>
>> - when `ParallelGCThreads`=32 and `G1EvacuationFailureALotCSetPercent` <= 50, the parallelism brings more benefit than regression;
>> - when `ParallelGCThreads`=128, whatever the value of `G1EvacuationFailureALotCSetPercent`, the parallelism brings more benefit than regression.
>>
>> Other related evacuation failure VM options:
>> - `G1EvacuationFailureALotInterval`=1
>> - `G1EvacuationFailureALotCount`=1
>>
>> For detailed perf test results, please check:
>>
>> - https://bugs.openjdk.java.net/secure/attachment/97227/parallel.evac.failure-threads.32.png
>> - https://bugs.openjdk.java.net/secure/attachment/97228/parallel.evac.failure-threads.128.png
>>
>> For situations like `G1EvacuationFailureALotCSetPercent` > 50 with `ParallelGCThreads`=32, we could fall back to the current implementation, or further optimize the thread sizing in this phase if necessary.
>>
>> NOTE: I don't include perf data for `Remove Self Forwards`, because comparing the pause time of this phase alone does not show the improvement well. I think the reason is that the original implementation is not load balanced, whereas the new implementation is. Since `Remove Self Forwards` is part of `Post Evacuate Cleanup 1`, only `Post Evacuate Cleanup 1` shows the improvement of the new implementation well.
>> Refining the pause time data of the `Remove Self Forwards` phase could be a potential follow-up improvement.
>
> Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
>
> - Merge branch 'master' into parallelize-evac-failure
> - Adjust worker cost by a factor; initialize task queues set and terminator threads by active workers
> - Fix wrong merge
> - Merge with master
> - Remove and merge code of G1ParRemoveSelfForwardPtrsTask into RemoveSelfForwardPtrsTask
> - Fix crashes in ~G1GCParPhaseTimesTracker(), G1PreRemoveSelfForwardClosure::do_heap_region, G1CollectedHeap::par_iterate_regions_array()=>~StubRoutines::atomic entry points; Refine comments
> - Fix inconsistent length between task queues and terminator
> - Fix crash when heap verification; Fix compilation error; Refine comments
> - Initial commit
I've created a draft PR at #7047; this one will be closed.
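
For readers skimming the quoted summary, the split-then-balance idea can be illustrated with a minimal Java sketch. It is not the HotSpot code: the class `ChunkedFixupSketch`, the `Chunk` type, and the chunk size are hypothetical, and a single shared queue stands in for the per-thread task queues with stealing that the PR describes. Adding up chunk lengths stands in for the actual fix-up work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative only: names and sizes are assumptions, not G1 internals.
class ChunkedFixupSketch {
    /** A contiguous slice [start, end) of one region. */
    static final class Chunk {
        final int region, start, end;
        Chunk(int region, int start, int end) {
            this.region = region;
            this.start = start;
            this.end = end;
        }
    }

    /** Step 1: split every region into chunks of at most chunkSize units. */
    static List<Chunk> split(int[] regionSizes, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (int r = 0; r < regionSizes.length; r++) {
            for (int s = 0; s < regionSizes[r]; s += chunkSize) {
                chunks.add(new Chunk(r, s, Math.min(s + chunkSize, regionSizes[r])));
            }
        }
        return chunks;
    }

    /** Step 2: workers pull chunks until none remain; returns units handled per worker. */
    static long[] process(List<Chunk> chunks, int workers) {
        ConcurrentLinkedQueue<Chunk> queue = new ConcurrentLinkedQueue<>(chunks);
        long[] perWorker = new long[workers];
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                Chunk c;
                while ((c = queue.poll()) != null) {
                    perWorker[id] += c.end - c.start; // stand-in for the real fix-up work
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join makes each worker's writes visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return perWorker;
    }

    public static void main(String[] args) {
        // A single failed region can still occupy several workers once it is chunked.
        List<Chunk> chunks = split(new int[] {1000}, 256);
        long total = 0;
        for (long units : process(chunks, 4)) {
            total += units;
        }
        System.out.println(chunks.size() + " chunks, " + total + " units processed");
    }
}
```

The point of the sketch is only that with one failed region, a per-region assignment gives one busy thread, while chunking lets every worker take part; how the split of work among workers turns out depends on scheduling.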
-------------
PR: https://git.openjdk.java.net/jdk/pull/6627