[jdk18] RFR: 8274007: [REDO] VM Exit does not abort concurrent mark [v2]

Thu Dec 16 10:21:33 UTC 2021

> Hi all,
> 
>   can I get reviews for this change, which redoes JDK-8273605? That change implemented faster abort of concurrent marking when the VM is about to shut down.
> 
> The earlier change (the first commit in this change) had the following issues:
>   * it tried to abort (set the abort flag) marking even when root region scanning was active; this is something G1 does not support. In this change we cancel and wait for completion of that phase instead.
>   * the second problem was in `G1ConcurrentMark::concurrent_cycle_abort`: if a full gc was triggered during VM shutdown, the next bitmap would not be cleared. I.e. the condition to skip the rest of the mark abort handling (in `concurrent_cycle_abort`) in the full gc pause was wrong.
>  The first part `!cm_thread()->in_progress()` is correct, but the `|| _has_aborted` part is bad: with that condition, aborting the marking (as in aborting during shutdown, which is what this change wants to achieve) does not clear the (next) mark bitmap at all. This is actually a long-standing issue; however, since nobody aborted the marking early except full gc (where we are at the moment), this path has never been taken, so afaict no harm done. (Global mark stack overflow causes abort of the *mark task*, and overflow of the *mark state*.) Another issue with using `_has_aborted` here is that `_has_aborted` might have already been cleared when the full gc occurs: i.e. we abort the marking, it finishes its marking cycle (clearing `_has_aborted`), and only then does that full gc happen.
> The correct way is to keep on clearing the next bitmap (for the full gc(s)) if we aborted the marking due to shutdown (`_g1h->concurrent_mark_is_terminating()`).
> This has the disadvantage that not only the first full gc during shutdown clears the next bitmap, but also any subsequent full gcs (concurrent marking is already prohibited from starting during shutdown), and they do that unnecessarily.
> However, any change here (like keeping a flag for whether the next bitmap is dirty) seemed to be an unnecessary complication unsuitable for such a change, and any full gc will take much longer than clearing the bitmap once anyway.
> Another alternative considered was adding an explicit shutdown state to `G1ConcurrentMarkThread`; however, that would mirror functionality provided in `ConcurrentGCThread`, which does not improve the code, or at least at this time I would prefer not to change something there.
>  
> Testing: hs-tier1-5, vmTestbase/vm/mlvm/meth/stress/gc/callSequencesDuringGC/Test.java a few thousand times without issue (both on jdk18 and jdk19)
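
To make the reasoning about the skip condition in `concurrent_cycle_abort` easier to follow, here is a minimal standalone sketch. It is not the HotSpot source: `MarkState`, `skip_abort_handling_old` and `skip_abort_handling_new` are hypothetical names, and the three booleans only model `cm_thread()->in_progress()`, `_has_aborted` and `_g1h->concurrent_mark_is_terminating()` as described in the text above. The exact form of the new condition is an assumption derived from that description.

```cpp
// Simplified model of the state a full gc pause consults (hypothetical struct).
struct MarkState {
  bool in_progress;     // models cm_thread()->in_progress()
  bool has_aborted;     // models _has_aborted
  bool is_terminating;  // models _g1h->concurrent_mark_is_terminating()
};

// Old condition as described: skip the abort handling (and thus the clearing
// of the next bitmap) if no marking is in progress OR marking already aborted.
bool skip_abort_handling_old(const MarkState& s) {
  return !s.in_progress || s.has_aborted;
}

// Assumed new condition: skip only if no marking is in progress AND marking
// was not aborted due to shutdown, so the next bitmap keeps getting cleared
// by every full gc during shutdown.
bool skip_abort_handling_new(const MarkState& s) {
  return !s.in_progress && !s.is_terminating;
}
```

With this model, a shutdown abort that is still in flight (`{true, true, true}`) and one whose marking cycle has already finished and cleared `_has_aborted` (`{false, false, true}`) are both skipped by the old condition but handled by the new one. A full gc with no marking and no shutdown (`{false, false, false}`) is skipped by both.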

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  sjohanss review

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk18/pull/22/files
  - new: https://git.openjdk.java.net/jdk18/pull/22/files/8a10c76c..b4bb2c22

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk18&pr=22&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk18&pr=22&range=00-01

  Stats: 10 lines in 1 file changed: 3 ins; 0 del; 7 mod
  Patch: https://git.openjdk.java.net/jdk18/pull/22.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk18 pull/22/head:pull/22

PR: https://git.openjdk.java.net/jdk18/pull/22