RFR: 8256265 G1: Improve parallelism in regions that failed evacuation [v2]

Hamlin Li mli at openjdk.java.net
Fri Jan 14 08:30:45 UTC 2022


On Thu, 2 Dec 2021 01:51:42 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Summary
>> -------
>> 
>> Currently G1 assigns a thread per failed evacuated region. This can in effect serialize the whole process as often (particularly with region pinning) there is only one region to fix up.
>> Try to improve the parallelsim when walking over the regions by 
>> 
>>  - first, split a region into tasks;
>>  - then, process these task in parallel and load balance among GC threads;
>>  - last, necessary cleanup
>> 
>> NOTE: load balance part of code is almost same as G1ParScanThreadState, if necessary and feasible, consider to refactor this part into a shared code base.
>> 
>> Performance Test
>> -------
>> 
>> The perf test based on lastest implementation + JDK-8277736 shows that:
>> 
>>  - when `ParallelGCThreads`=32, when `G1EvacuationFailureALotCSetPercent` <= 50, the parallelism bring more benefit than regression;
>> - when `ParallelGCThreads`=128, whatever `G1EvacuationFailureALotCSetPercent` is, the parallelism bring more benefit than regression;
>> 
>> other related evac failure vm options:
>>  - `G1EvacuationFailureALotInterval`=1
>>  - `G1EvacuationFailureALotCount`=1
>> 
>> For detailed perf test result, please check:
>> 
>>  - https://bugs.openjdk.java.net/secure/attachment/97227/parallel.evac.failure-threads.32.png
>>  - https://bugs.openjdk.java.net/secure/attachment/97228/parallel.evac.failure-threads.128.png
>>  
>> For the situation like G1EvacuationFailureALotCSetPercent > 50 and ParallelGCThreads=32 , we could fall back to current implmentation, or further optimize the thread sizing at this phase if necessary.
>> 
>> NOTE: I don't include perf data for `Remove Self Forwards`, because the comparison of pause time in this phase does not well show the improvement of this implementation, I think the reason is that the original implementation is not load balanced, and the new implementation is. But as `Remove Self Forwards` is part of `Post Evacuate Cleanup 1`, so only `Post Evacuate Cleanup 1` well show the improvement of the new implementation.
>> It could be a potential improvement to refine the Pause time data in `Remove Self Forwards` phase.
>
> Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:
> 
>  - Merge branch 'master' into parallelize-evac-failure
>  - Adjust worker cost by a factor; initialize task queues set and terminator threads by active workers
>  - Fix wrong merge
>  - Merge with master
>  - Remove and merge code of G1ParRemoveSelfForwardPtrsTask into RemoveSelfForwardPtrsTask
>  - Fix crashes in ~G1GCParPhaseTimesTracker(), G1PreRemoveSelfForwardClosure::do_heap_region, G1CollectedHeap::par_iterate_regions_array()=>~StubRoutines::atomic entry points; Refine comments
>  - Fix inconsistent length between task queues and terminator
>  - Fix crash when heap verification; Fix compilation error; Refine comments
>  - Initial commit

I've created a draft PR at #7047, this one will be closed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6627



More information about the hotspot-gc-dev mailing list