RFR: 8240556: Abort concurrent mark after effective eager reclamation of humongous objects [v2]
Stefan Johansson
sjohanss at openjdk.java.net
Tue Sep 22 08:48:41 UTC 2020
On Mon, 21 Sep 2020 18:15:08 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
>> Hi all,
>>
>> please review this change that implements concurrent mark abort as proposed by Liang Mao from Alibaba.
>>
>> Basically, if at the end of the concurrent start pause we notice that we went below the IHOP threshold (and it has been
>> a concurrent start pause caused by humongous allocation), instead of scheduling a mark operation, we schedule a
>> "concurrent undo" operation that undoes the changes. Regarding the removal of the
>> test/hotspot/jtreg/gc/stress/jfr/TestStressBigAllocationGCEventsWithG1.java test: it only ever accidentally worked in
>> G1. G1 never sent the AllocationRequiringGC events for GCs caused by going over the IHOP threshold for humongous
>> allocations that the test actually expects. That test previously only worked because G1 could not reclaim the
>> humongous objects fast enough so crossing the IHOP threshold causes a full concurrent mark. Allocations during that
>> concurrent mark do not cause a GC that can reclaim these objects, so ultimately some young GC that sends the
>> AllocationRequiringGC event will be sent. With concurrent undo this is not guaranteed any more, i.e. only in
>> environments where concurrent undo is slow (and we'll improve it soon) this test works. The test is too timing
>> dependent, and checking for the wrong thing in this case, so removing it. Testing: tbd.
>
> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev
> excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since
> the last revision:
> - reviews sjohanns and ayang; merged Command and Status plus cleanup
> - Remove test that only accidentally worked: G1 never sent the AllocationRequiringGC
> event for GCs caused by going over the IHOP threshold for humongous allocations.
>
> That test previously only worked because G1 could not reclaim the humongous objects
> fast enough so crossing the IHOP threshold causes a full concurrent mark. Allocations
> during that concurrent mark do not cause a GC that can reclaim these objects, so
> ultimately some young GC that sends the AllocationRequiringGC event will be sent.
>
> With concurrent undo this is not guaranteed any more, i.e. only in environments where
> concurrent undo is slow (and we'll improve it soon) this test works.
>
> The test is too timing dependent, so removing it.
> - Fixed test
> - Initial import
> I dislike putting this pause-related state into global state. Unless you or somebody else insists on this I would like
> to keep the code.
>
> Filed JDK-8253343 which among other things should clean this up and make it possible to easily manage pause-only state.
I get your argument about not putting this into the collector-state and I'm fine with leaving it like this for now. I
think this update really helped and I have two more comments on how to reduce the "state passing" even more. Let me
know what you think.
src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 707:
> 705: root_regions()->cancel_scan();
> 706: return;
> 707: }
I still think this should be broken out from this function. Either doing it in
`G1ConcurrentMarkThread::concurrent_undo_cycle_do()` or having a separate function that is called right before
`_cm_thread->start_undo_mark()` in `CollectedHeap::start_concurrent_cycle()`.
src/hotspot/share/gc/g1/g1Policy.cpp line 790:
> 788: if (!concurrent_operation_is_full_mark) {
> 789: abort_time_to_mixed_tracking();
> 790: }
More or less same idea here. What do you think about moving this to a separate function, that is either called from
`start_concurrent_cycle()` or `concurrent_undo_cycle_do()`?
-------------
PR: https://git.openjdk.java.net/jdk/pull/177
More information about the hotspot-jfr-dev
mailing list