RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3]

Alan Bateman alanb at openjdk.org
Fri Dec 8 12:09:15 UTC 2023


On Fri, 8 Dec 2023 11:54:40 GMT, Serguei Spitsyn <sspitsyn at openjdk.org> wrote:

>> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame.
>> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms.
>> The deadlocking scenario is well described by Patricio in a bug report comment.
>> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections.
>> 
>> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section.
>> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section.
>> 
>> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified.
>> 
>> New test was developed by Patricio:
>>  `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
>> The test is very nice as it reliably in 100% reproduces the deadlock without the fix.
>> The test is never failing with this fix.
>> 
>> Testing:
>>  - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
>>  - tested with mach5 tiers 1-6
>
> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods

I chatted briefly with @sspitsyn about this. A couple of points:
- It shouldn't be necessary to touch mount/unmount as the thread identity is the carrier, not the virtual thread, when executing the "critical code".
- toggle_is_in_critical_section needs to detect reentrancy, it is otherwise too early to refactor the Java code, e.g. call threadState while holding the interrupt lock.
- All the use-sides will need to use try-finally to more reliably revert the critical section flag when rewinding.
- The naming is very problematic, we'll need to replace with methods that are clearly named enter and exit critical section. Ongoing work in this area to support monitors has to introduce some temporary pinning so there will be enter/exitCriticalSection methods, that's a better place for the JVMTI hooks.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1847063362


More information about the build-dev mailing list