RFR: 8256265: G1: Improve parallelism in regions that failed evacuation [v2]

Thomas Schatzl tschatzl at openjdk.org
Tue Sep 13 09:15:48 UTC 2022


On Mon, 12 Sep 2022 11:48:04 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

>> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:
>> 
>>  - Merge branch 'master' of gh:tschatzl/jdk into 8256265-parallel-evac-failure
>>  - disable G1EvacuationFailureALot by default again
>>  - Remove unneeded clear metadata phase
>>  - Remove debug code
>>  - More refactoring
>>  - Initial cleanup
>>  - some refactoring, fix clearing of opt index in cset
>>  - fix test
>>  - some cleanup
>>  - Cleanup, phase names, fixes
>>  - ... and 2 more: https://git.openjdk.org/jdk/compare/37df5f56...2628451d
>
> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 632:
> 
>> 630:     // Mark the failing object in the marking bitmap and later use the bitmap to handle
>> 631:     // evacuation failure recovery.
>> 632:     _g1h->mark_evac_failure_object(_worker_id, old, word_sz);
> 
> The same live-size is calculated again in `process_chunk`, isn't it?

You mean the one (ultimately) recorded in the `G1ConcurrentMark::_live_bytes` array and the one for the statistics in `G1RemoveSelfForwardsInChunksTask::process_chunk`?
You probably suggest to merge them somehow to do this work once too?

I do not think this is worth, particularly because `process_chunk` only collects live bytes per thread without any synchronization (it does collect dead bytes with synchronization).
So you could use that value (the total dead bytes) or vice versa to update `G1ConcurrentMark::_live_bytes` at the end of collection, but
* `G1ConcurrentMark::_live_bytes` (the sum) isn't actually gathered here, but again these are basically per-thread values (using the mark-stats cache), so otoh the update is inexpensive (nothing compared to the `par_mark` in `mark_evac_failure_object`), and otoh it can't be used for updating `HeapRegion::_garbage_bytes` directly. This would need to flush buffers somewhere before doing that.
* the collected `HeapRegion::_garbage_bytes` could be used at the end to update `G1ConcurrentMark::_live_bytes`, but it does not seem to be worth the complexity adding another phase somewhere else. Also it would require another `G1ConcurrentMark` API entry just for that which I do not see worth doing.

I.e. I do not think it is worth optimizing something that trivial and add the required code complexity. I would prefer to decrease code complexity.

-------------

PR: https://git.openjdk.org/jdk/pull/9980


More information about the hotspot-dev mailing list