RFR: 8300915: G1: incomplete SATB because nmethod entry barriers don't get armed [v5]
Thomas Schatzl
tschatzl at openjdk.org
Fri Jan 27 08:47:25 UTC 2023
On Thu, 26 Jan 2023 20:29:25 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
>> The change makes sure all nmethod entry barriers are armed when a G1 concurrent marking cycle starts by doing it unconditionally in `G1ConcurrentMark::pre_concurrent_start()`. The barrier is needed to add oop constants to the SATB.
>>
>> This alone would be a conservative/minimal fix but I didn't like that `CodeCache::is_gc_marking_cycle_active()` can return true after a marking cycle start is undone. I.e. `G1ConcurrentMarkThread::_state == Idle && CodeCache::is_gc_marking_cycle_active()` is possible.
>> Edit: It felt like an inconsistency but maybe its actually ok not to finish the CC marking cycle when the G1 marking is undone. Please comment...
>> The changes in `G1ConcurrentMark::post_concurrent_undo_start()` were made to avoid it.
>>
>>
>> Testing
>>
>> * Rebased this PR onto [JDK-8297487](https://bugs.openjdk.org/browse/JDK-8297487), replaced related assertions with guarantees, and then executed test/langtools:tier1 in a loop for 12h without seeing the issues reported in [JDK-8299956](https://bugs.openjdk.org/browse/JDK-8299956).
>>
>> * CI testing at SAP: This includes most JCK and JTREG tiers 1-4, also in Xcomp mode, on the standard platforms and also on ppc64le.
>>
>> * GHA: build failures because download of apache ant failed. `serviceability/jvmti/vthread/SuspendResumeAll/SuspendResumeAll.java` had a timeout on windows-x64. Should be unrelated. It succeeded in our CI testing.
>
> Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision:
>
> - Remark: only finish codecache cycle if marking was finished
> - Conservative fix: arm nmethods unconditionally at marking start
Some comments/bikeshedding :)
Lgtm otherwise.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3376:
> 3374: // (1) was aborted and a full gc is about to begin or
> 3375: // (2) it was undone because the heap occupancy was below the threshold after
> 3376: // the initiating young gc (see G1ConcurrentMark::post_concurrent_undo_start())
Suggestion:
// the concurrent start young gc (see G1ConcurrentMark::post_concurrent_undo_start())
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3381:
> 3379: // finished when executing the G1 undo operation. If it was (repeatedly) finished
> 3380: // then nmethods with frames in continuation StackChunks could appear as not
> 3381: // on stack because they were not marked as on stack in the undone concurrent marking.
Suggestion:
// Especially in case (2) it is important to note the active codecache cycle did not
// finish (maybe "complete, i.e. did not mark through all live objects,"?) when executing the G1 concurrent undo operation. If it (repeatedly) did not finish/complete
// then nmethods with frames in continuation StackChunks could appear as not
// on stack because they were not marked as on stack in the undone concurrent markings.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3384:
> 3382: //
> 3383: // Also for (2) it is important to arm nmethod entry barriers even if no new
> 3384: // code cache cycle is started. They are needed for a complete SATB. A full gc
Suggestion:
// code cache cycle is started. They are needed for a complete SATB (snapshot?). A full gc
-------------
Marked as reviewed by tschatzl (Reviewer).
PR: https://git.openjdk.org/jdk/pull/12194
More information about the hotspot-gc-dev
mailing list