RFR: 8364343: Virtual Thread transition management needs to be independent of JVM TI [v4]
David Holmes
dholmes at openjdk.org
Fri Nov 21 01:00:58 UTC 2025
On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:
>> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable mount/unmount transitions. However, this only works if a JVMTI agent has attached to the VM, otherwise virtual threads don’t honor the disable request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` and `HotSpotDiagnosticMXBean` which don’t require a JVMTI agent to be present, getting the snapshot of a virtual thread in such cases can lead to crashes.
>>
>> This patch moves the transition-disabling mechanism out of JVMTI and into a new class, `MountUnmountDisabler`. The code has been updated so that transitions can be disabled independently of JVMTI, making JVMTI just one user of the API rather than the owner of the mechanism. Here is a summary of the key changes:
>>
>> - Currently when a virtual thread starts a mount/unmount transition we only need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to go to the slow path. With these changes, JVMTI is now only one user of the API, so we instead check the actual transition disabling counters, i.e the per-vthread counter and the global counter. Since these can be set at any time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or during a safepoint in case of late binding agents), we follow the classic Dekker pattern for the required synchronization. That is, the virtual thread sets the “in transition” bits for the carrier and vthread *before* reading the “transition disabled” counters. The thread requesting to disable transitions increments the “transition disabled” counter *before* reading the “in transition” bits.
>> An alternative that avoids the extra fence in `startTransition` would be to place extra overhead on the thread requesting to disable transitions (e.g. using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance analysis show no difference with current mainline so for now I kept this simpler version.
>>
>> - Ending the transition doesn’t need to check if transitions are disabled (equivalent to not need to poll when transitioning from unsafe to safe safepoint state). But we still require to go through the slow path if there is a JVMTI agent present, since we need to check for event posting and JVMTI state rebinding. As such, the end transition follows the same pattern that we have today of only needing to check `_VTMS_notify_jvmti_events`.
>>
>> - The code was previously structured in t...
>
> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision:
>
> Rename VM methods for endFirstTransition/startFinalTransition
Hi Patricio, this is another significant piece of work. I have taken an initial pass through trying to digest the main parts - can't comment on the C2 code or the Java side. I have made a few minor comments/suggestions.
Thanks
src/hotspot/share/prims/jvm.cpp line 3668:
> 3666: if (!DoJVMTIVirtualThreadTransitions) {
> 3667: assert(!JvmtiExport::can_support_virtual_threads(), "sanity check");
> 3668: return;
Does this not still need checking somewhere?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162:
> 160: // be executed once we go back to Java. If this is an unmount, the handshake that the
> 161: // disabler executed against this carrier thread already provided the needed synchronization.
> 162: // This matches the release fence in xx_enable_for_one()/xx_enable_for_all().
Subtle. Do we have comments where the fences are to ensure people realize the fence is serving this purpose?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277:
> 275:
> 276: // Start of the critical region. Prevent future memory
> 277: // operations to be ordered before we read the transition flag.
Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so perhaps that should internally perform the `load_acquire`?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 278:
> 276: // Start of the critical region. Prevent future memory
> 277: // operations to be ordered before we read the transition flag.
> 278: // This matches the release fence in end_transition().
Suggestion:
// This pairs with the release fence in end_transition().
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307:
> 305: // Block while some mount/unmount transitions are in progress.
> 306: // Debug version fails and prints diagnostic information.
> 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = jtiwh.next(); ) {
This looks very odd, having an assignment in the loop condition check and no actual loop-update expression.
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316:
> 314: // operations to be ordered before we read the transition flags.
> 315: // This matches the release fence in end_transition().
> 316: OrderAccess::acquire();
Surely the use of the iterator already provides the necessary ordering guarantee here as well. ?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327:
> 325: // End of the critical section. Prevent previous memory operations to
> 326: // be ordered after we clear the clear the disable transition flag.
> 327: // This matches the equivalent acquire fence in start_transition().
Suggestion:
// This pairs with the acquire in start_transition().
I just realized you are using "fence" to describe release and acquire memory barrier semantics. Given we have an operation `fence` I find this confusing for the reader - especially when we also have a `release_store_fence` operation which might be confused with "release fence".
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370:
> 368: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked");
> 369: assert(_global_start_transition_disable_count >= 0, "");
> 370: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count + 1);
Suggestion:
AtomicAccess::inc(&_global_start_transition_disable_count);
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376:
> 374: assert(VTMSTransition_lock->owned_by_self() || SafepointSynchronize::is_at_safepoint(), "Must be locked");
> 375: assert(_global_start_transition_disable_count > 0, "");
> 376: AtomicAccess::store(&_global_start_transition_disable_count, _global_start_transition_disable_count - 1);
Suggestion:
AtomicAccess::dec(&_global_start_transition_disable_count);
src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52:
> 50: // parameter is_SR: suspender or resumer
> 51: MountUnmountDisabler(bool exlusive = false);
> 52: MountUnmountDisabler(oop thread_oop);
What does the comment mean here?
-------------
PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3490207826
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547887801
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548145054
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548157390
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548150552
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548160373
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548161340
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548168223
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548169846
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548170787
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548174392
More information about the graal-dev
mailing list