RFR: 8300915: G1: incomplete SATB because nmethod entry barriers don't get armed [v5]

Thomas Schatzl tschatzl at openjdk.org
Fri Jan 27 08:47:25 UTC 2023


On Thu, 26 Jan 2023 20:29:25 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>> The change makes sure all nmethod entry barriers are armed when a G1 concurrent marking cycle starts by doing it unconditionally in `G1ConcurrentMark::pre_concurrent_start()`. The barrier is needed to add oop constants to the SATB.
>> 
>> This alone would be a conservative/minimal fix but I didn't like that `CodeCache::is_gc_marking_cycle_active()` can return true after a marking cycle start is undone. I.e. `G1ConcurrentMarkThread::_state == Idle && CodeCache::is_gc_marking_cycle_active()` is possible.
>> Edit: It felt like an inconsistency but maybe its actually ok not to finish the CC marking cycle when the G1 marking is undone. Please comment...
>> The changes in `G1ConcurrentMark::post_concurrent_undo_start()` were made to avoid it.
>> 
>> 
>> Testing
>> 
>> * Rebased this PR onto [JDK-8297487](https://bugs.openjdk.org/browse/JDK-8297487), replaced related assertions with guarantees, and then executed test/langtools:tier1 in a loop for 12h without seeing the issues reported in [JDK-8299956](https://bugs.openjdk.org/browse/JDK-8299956).
>> 
>> * CI testing at SAP: This includes most JCK and JTREG tiers 1-4, also in Xcomp mode, on the standard platforms and also on ppc64le.
>> 
>> * GHA: build failures because download of apache ant failed. `serviceability/jvmti/vthread/SuspendResumeAll/SuspendResumeAll.java` had a timeout on windows-x64. Should be unrelated. It succeeded in our CI testing.
>
> Richard Reingruber has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Remark: only finish codecache cycle if marking was finished
>  - Conservative fix: arm nmethods unconditionally at marking start

Some comments/bikeshedding :)
Lgtm otherwise.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3376:

> 3374:   // (1) was aborted and a full gc is about to begin or
> 3375:   // (2) it was undone because the heap occupancy was below the threshold after
> 3376:   //     the initiating young gc (see G1ConcurrentMark::post_concurrent_undo_start())

Suggestion:

  //     the concurrent start young gc (see G1ConcurrentMark::post_concurrent_undo_start())

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3381:

> 3379:   // finished when executing the G1 undo operation. If it was (repeatedly) finished
> 3380:   // then nmethods with frames in continuation StackChunks could appear as not
> 3381:   // on stack because they were not marked as on stack in the undone concurrent marking.

Suggestion:

  // Especially in case (2) it is important to note the active codecache cycle did not
  // finish (maybe "complete, i.e. did not mark through all live objects,"?) when executing the G1 concurrent undo operation. If it (repeatedly) did not finish/complete
  // then nmethods with frames in continuation StackChunks could appear as not
  // on stack because they were not marked as on stack in the undone concurrent markings.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3384:

> 3382:   //
> 3383:   // Also for (2) it is important to arm nmethod entry barriers even if no new
> 3384:   // code cache cycle is started. They are needed for a complete SATB. A full gc

Suggestion:

  // code cache cycle is started. They are needed for a complete SATB (snapshot?). A full gc

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12194


More information about the hotspot-gc-dev mailing list