RFR: 8375209: Xcheck:jni should check when GC is about to deadlock in JNI critical section [v2]

Aleksey Shipilev shade at openjdk.org
Wed Jan 14 07:36:47 UTC 2026


On Tue, 13 Jan 2026 19:44:23 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Related to [JDK-8375188](https://bugs.openjdk.org/browse/JDK-8375188), and regardless of what happens with the implementations, I think we really want `-Xcheck:jni` to tell us when we are about to deadlock. This is useful for diagnosing the issue in the field.
>> 
>> We used to have this capability in Serial/Parallel prior to [JDK-8192647](https://bugs.openjdk.org/browse/JDK-8192647), AFAICS: https://github.com/openjdk/jdk/commit/a9c9f7f0cbb2f2395fef08348bf867ffa8875d73#diff-d27fc793db1bf9314b322d494cd1c3269629fe27a605b4441de08d543d020fc3L341-L344
>> 
>> ZGC never had this check, AFAICS. I am not sure if I put the check in the right place. I believe it is, as we want to check that a Java thread is not blocked waiting for the GC driver to respond while being in a JNI critical section itself. The current placement works well with the test.
>> 
>> I opted to add the checking at the paths that are really affected by the issue, because it is really about what implementations are doing in this case. But we can also summarily check this in all `CollectedHeap::collect` overrides -- similar to ZGC case -- so that testing with `-Xcheck:jni` with Epsilon/G1/Shenandoah would also cover every other GC that might run into trouble.
>> 
>> Additional testing:
>>  - [x] New test, 100x repetitions
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Terminology

I agree the spec is lenient to the JVM in this regard: doing extra things in a JNI critical region is a programming error. `-Xcheck:jni` is the facility that helps nail programming errors like this, hence the PR. I suppose it is borderline possible/acceptable to have native code return while still in a JNI critical region, as long as you don't do anything else. But maybe I am short on imagination: it sounds like a JNI transition can actually block for GC _while_ the code in question is in a JNI critical region? IIRC, we never block on the Java->native transition; we block on the native->Java transition when a safepoint is pending.

So something like:


 native_getCritical(); // enters JNI critical region, leaves native code
 // GC is announced somewhere around here, safepoint is armed
 native_doWhatever(); // in and out, discovers safepoint is armed on transition back to Java, blocks
 native_releaseCritical(); // exits JNI critical region, but we never get here


If so, checking that we are not holding a JNI critical region when doing the native->Java transition would indeed cover more ground.
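
To make the idea concrete, here is a toy model of such a transition check, written in plain Java so it runs without any native code. It is only a sketch of the bookkeeping, not HotSpot code: the class `CriticalTransitionModel` and all of its methods are hypothetical names, and the real check would live in the VM's transition path rather than in Java.

```java
// Toy model only: these names are hypothetical and are not HotSpot or JNI APIs.
final class CriticalTransitionModel {
    // Nesting depth of JNI critical regions held by the modeled thread.
    private int criticalDepth = 0;
    // True once a GC has been announced and a safepoint is armed.
    private boolean safepointArmed = false;

    // Models a native call that enters a JNI critical region and returns.
    void getCritical() {
        criticalDepth++;
    }

    // Models a native call that exits a JNI critical region.
    void releaseCritical() {
        if (criticalDepth == 0) {
            throw new IllegalStateException("release without matching get");
        }
        criticalDepth--;
    }

    // Models the GC announcing itself and arming a safepoint.
    void armSafepoint() {
        safepointArmed = true;
    }

    // The check at the native->Java transition: the thread is about to
    // block for the safepoint while it still holds a critical region.
    boolean wouldBlockInCritical() {
        return safepointArmed && criticalDepth > 0;
    }

    public static void main(String[] args) {
        CriticalTransitionModel thread = new CriticalTransitionModel();
        thread.getCritical();   // native_getCritical()
        thread.armSafepoint();  // GC is announced, safepoint is armed
        // native_doWhatever() returns: this is where the check would fire.
        if (thread.wouldBlockInCritical()) {
            System.out.println("WARNING: native->Java transition would block inside a JNI critical region");
        }
        thread.releaseCritical(); // in the real scenario we never get here
    }
}
```

In this model the check is keyed on the transition itself rather than on a particular GC entry point, which is why it would fire regardless of which collector armed the safepoint.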

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29206#issuecomment-3748210048


More information about the hotspot-gc-dev mailing list