RFR: 8366659: ObjectMonitor::wait() liveness problem with a suspension request [v20]

Daniel D. Daugherty dcubed at openjdk.org
Fri Jan 16 22:34:59 UTC 2026


On Fri, 9 Jan 2026 12:05:37 GMT, Anton Artemov <aartemov at openjdk.org> wrote:

>> Hi, please consider the following changes:
>> 
>> If suspension is allowed when a thread is re-entering an object monitor (OM), then the following liveness issue can happen in the `ObjectMonitor::wait()` method.
>> 
>> The waiting thread is made the successor and is unparked. Upon a suspension request, the thread suspends itself whilst clearing the successor. The OM is left unlocked (not grabbed by any thread), while the other threads stay parked until some thread grabs the OM and then exits it. The suspended thread is still on the entry-list and can be selected as the successor again. None of the other threads can be woken up to grab the OM until the suspended thread has been resumed and has successfully released the OM.
>> 
>> This can happen in three places where the successor could be suspended: 
>> 
>> 1:
>> https://github.com/openjdk/jdk/blob/6322aaba63b235cb6c73d23a932210af318404ec/src/hotspot/share/runtime/objectMonitor.cpp#L1897
>> 
>> 2:
>> https://github.com/openjdk/jdk/blob/6322aaba63b235cb6c73d23a932210af318404ec/src/hotspot/share/runtime/objectMonitor.cpp#L1149
>> 
>> 3:
>> https://github.com/openjdk/jdk/blob/6322aaba63b235cb6c73d23a932210af318404ec/src/hotspot/share/runtime/objectMonitor.cpp#L1951
>> 
>> The issues are addressed by not allowing suspension in case 1, and by handling the suspension request at a later stage, after the thread has grabbed the OM in `reenter_internal()` in case 2. In case of a suspension request, the thread exits the OM and enters it again once resumed. 
>> 
>> Case 3 is handled by not transferring a thread to the `entry_list` in `notify_internal()` when the corresponding JVMTI event is allowed. Instead, the thread is unparked and allowed to run. Since it is not on the `entry_list`, it will not be chosen as a successor, and there is no harm in suspending it if needed when posting the event. 
>> 
>> The possible issue of posting a `waited` event while the thread is still suspended is addressed by adding a suspension check just before the event is posted.
>> 
>> Tests are added.
>> 
>> Tested in tiers 1 - 7.
>
> Anton Artemov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 44 commits:
> 
>  - 8366659: Fixed year in the copyright.
>  - Merge remote-tracking branch 'origin/master' into JDK-8366659-OM-wait-suspend-deadlock
>  - 8366659: Removed ClearSuccOnSuspend
>  - 8366659: Fixed liveness problem.
>  - Merge remote-tracking branch 'origin/master' into JDK-8366659-OM-wait-suspend-deadlock
>  - 8366659: Fixed build problem.
>  - 8366659: Fixed build issue.
>  - 8366659: Changed the way how notify_internal works if JVMTI monitor waited event allowed.
>  - Merge remote-tracking branch 'origin/master' into JDK-8366659-OM-wait-suspend-deadlock
>  - 8366659: Fixed semi-broken tests
>  - ... and 34 more: https://git.openjdk.org/jdk/compare/a01283a5...21b83214

I have to disagree with the notion that test case #2 and test case #3 are "artificial".
These tests were written to line up the executing conditions to reproduce the two
failure modes posited by the bug. Both of the new test cases, when executed on
a VM without the product code changes, cause a definite liveness condition. Maybe
we should have said "hang" instead of "deadlock". Maybe even "livelock" is better.
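
To make the failure mode concrete, here is a toy, single-threaded model of the
case 2 scenario from the PR description. It is only a sketch and not HotSpot
code: `ToyThread`, `ToyMonitor`, `exit_and_pick_successor()` and
`run_until_stuck()` are hypothetical names, and the model assumes that the
exiting owner always unparks the head of the entry list and that the suspended
thread is never resumed.

```cpp
#include <deque>
#include <vector>

// Toy model only; all names are hypothetical and nothing here is HotSpot code.
struct ToyThread {
    bool suspend_requested = false;
    bool suspended = false;
};

struct ToyMonitor {
    int owner = -1;               // -1 means the monitor is unlocked
    std::deque<int> entry_list;   // ids of threads waiting to (re-)enter
};

// Exit path: release the monitor and return the id of the thread that is
// unparked as the successor, or -1 if nobody is waiting.
int exit_and_pick_successor(ToyMonitor& m) {
    m.owner = -1;
    return m.entry_list.empty() ? -1 : m.entry_list.front();
}

// Returns how many threads get through the monitor while thread 0 (the
// former waiter with a pending suspension request) is never resumed.
int run_until_stuck(bool defer_suspension) {
    std::vector<ToyThread> threads(3);
    threads[0].suspend_requested = true;
    ToyMonitor m;
    m.entry_list = {0, 1, 2};
    int finished = 0;

    // The previous owner exits and unparks thread 0 as the successor.
    int unparked = exit_and_pick_successor(m);
    while (unparked != -1) {
        ToyThread& t = threads[unparked];
        if (t.suspend_requested && !defer_suspension) {
            // Old behavior: suspend before acquiring. The successor is
            // cleared, the thread stays on the entry list, and nobody
            // else is unparked, so the remaining threads stay parked.
            t.suspended = true;
            break;
        }
        // Acquire the monitor and leave the entry list.
        m.owner = unparked;
        m.entry_list.pop_front();
        if (t.suspend_requested) {
            // Deferred handling: the suspension is honored only after the
            // monitor is owned. The thread exits (which unparks the next
            // successor) and would re-enter once resumed.
            t.suspended = true;
        } else {
            ++finished;
        }
        unparked = exit_and_pick_successor(m);
    }
    return finished;
}
```

In this model `run_until_stuck(false)` returns 0, because the two other
threads never get to run, while `run_until_stuck(true)` returns 2. That is the
hang that the new test cases line up on a VM without the product code changes.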

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27040#issuecomment-3762066851


More information about the hotspot-runtime-dev mailing list