RFR: 8240556: Abort concurrent mark after effective eager reclamation of humongous objects [v3]
Stefan Johansson
sjohanss at openjdk.java.net
Thu Sep 24 07:41:10 UTC 2020
On Wed, 23 Sep 2020 14:50:04 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
>> Hi all,
>>
>> please review this change that implements concurrent mark abort as proposed by Liang Mao from Alibaba.
>>
>> Basically, if at the end of the concurrent start pause we notice that we went below the IHOP threshold (and it has been
>> a concurrent start pause caused by humongous allocation), instead of scheduling a mark operation, we schedule a
>> "concurrent undo" operation that undoes the changes. Regarding the removal of the
>> test/hotspot/jtreg/gc/stress/jfr/TestStressBigAllocationGCEventsWithG1.java test: it only ever accidentally worked in
>> G1. G1 never sent the AllocationRequiringGC events for GCs caused by going over the IHOP threshold for humongous
>> allocations that the test actually expects. That test previously only worked because G1 could not reclaim the
>> humongous objects fast enough so crossing the IHOP threshold causes a full concurrent mark. Allocations during that
>> concurrent mark do not cause a GC that can reclaim these objects, so ultimately some young GC that sends the
>> AllocationRequiringGC event will be sent. With concurrent undo this is not guaranteed any more, i.e. only in
>> environments where concurrent undo is slow (and we'll improve it soon) this test works. The test is too timing
>> dependent, and checking for the wrong thing in this case, so removing it. Testing: tbd.
>
> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now
> contains five commits:
> - sjohanss comments #2
> - reviews sjohanns and ayang; merged Command and Status plus cleanup
> - Remove test that only accidentally worked: G1 never sent the AllocationRequiringGC
> event for GCs caused by going over the IHOP threshold for humongous allocations.
>
> That test previously only worked because G1 could not reclaim the humongous objects
> fast enough so crossing the IHOP threshold causes a full concurrent mark. Allocations
> during that concurrent mark do not cause a GC that can reclaim these objects, so
> ultimately some young GC that sends the AllocationRequiringGC event will be sent.
>
> With concurrent undo this is not guaranteed any more, i.e. only in environments where
> concurrent undo is slow (and we'll improve it soon) this test works.
>
> The test is too timing dependent, so removing it.
> - Fixed test
> - Initial import
Thanks for doing this additional refactoring Thomas.
I think this is good now, just one more small suggestion.
src/hotspot/share/gc/g1/g1Policy.cpp line 1262:
> 1260: _concurrent_start_to_mixed.record_concurrent_start_end(end);
> 1261: }
> 1262: break;
I would add the undo-case after to avoid having to check for it in the if-statement. Like this:
Suggestion:
case ConcurrentStartMarkGC:
// Do not track time-to-mixed time for periodic collections as they are likely
// to be not representative to regular operation as the mutators are idle at
// that time.
if ((_g1h->gc_cause() != GCCause::_g1_periodic_collection) {
_concurrent_start_to_mixed.record_concurrent_start_end(end);
}
case ConcurrentStartUndoGC:
// Do not track undone concurrent cycles.
break;
If you feel like it you could add an explicit break for the start-case and assert that the start-to-mixed-tracking is
not active in the undo-case. But this would require a `is_active()` getter for the tracker.
-------------
PR: https://git.openjdk.java.net/jdk/pull/177
More information about the hotspot-jfr-dev
mailing list