RFR: 8373839: Disable JVM TI suspension during JNI critical regions
Patricio Chilano Mateo
pchilanomate at openjdk.org
Fri Dec 19 11:22:33 UTC 2025
On Thu, 18 Dec 2025 23:25:17 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:
>> So this `_is_disable_suspend` flag will prevent the target from processing the async handshake and suspend, but the suspender will still consider the target suspended once `SuspendThreadHandshakeClosure` is done. We would need to check the state of the target and don't consider it "handshake safe" if it's in a JNI critical region.
>
>> Thanks for looking at this @pchilano
>>
>> > So this `_is_disable_suspend` flag will prevent the target from processing the async handshake and suspend, but the suspender will still consider the target suspended once `SuspendThreadHandshakeClosure` is done.
>>
>> This two-step process always causes me confusion. So the "disabling" mechanism is not actually disabling anything from the requesters point of view. I don't understand what it is actually doing then? And I think I would like it called something else.
>>
> It was added to prevent deadlocks while executing some critical sections in methods of the `VirtualThread` class. Don't remember if there was an issue with delaying the suspender instead.
>
>> > We would need to check the state of the target and don't consider it "handshake safe" if it's in a JNI critical region.
>>
>> So rather than "just" disabling suspension whilst in JNI critical your suggestion would delay all handshakes and safepoints which seems a far more risky proposition.
>>
> It would only be for suspend operations, as the PR is trying to address this specific issue with the debugger. I see that something like this was already suggested in the JBS comments for 8369515, and that there is an alternative suggestion to return a copy of the object. In any case my comment was mainly to point out we are not disabling suspension as intended and describe what would need to be done instead (or at least one of the two possible solutions).
> @pchilano is the latest proposal more what you were thinking? It has an obvious flaw if the critical region is badly behaved, but perhaps we just live with that.
>
Thanks David. I was thinking on having a `in_critical()` check in `HandshakeState::try_process` when the op returned by `get_op()` is `SuspendThreadHandshakeClosure` since we already have the `HandshakeSpinYield` of the operation. This avoids spinning in the closure which would block other handshakes or safepoints. And if the check fails we could pick another operation, if available, and not just return `_not_safe` to retry (to avoid stalling other handshakes once the suspend is at the front of the queue). But I guess it depends on how paranoid we are that this JNI API is being used correctly, so your simpler change could work too.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28884#issuecomment-3674677513
More information about the serviceability-dev
mailing list