RFR: 8373120: Virtual thread stuck in BLOCKED state
Alan Bateman
alanb at openjdk.org
Fri Jan 16 13:49:46 UTC 2026
On Thu, 15 Jan 2026 17:42:33 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:
> Please review the following patch. This fixes a bug in how we handle state changes for the timed `Object.wait` case in `afterYield`, which can leave a virtual thread stuck in the `BLOCKED` state. It can be triggered by two consecutive calls to timed `Object.wait`, if the first call receives a notification and the second call relies on the timeout task to wake up the thread. I added to JBS the full sequence of events that leads to the vthread getting stuck.
>
> The fix is to check for `notified` and attempt to change the state to `BLOCKED` inside the synchronized block. This guarantees that we don't change the state of a new timed `Object.wait` call that has already started.
>
> The PR includes a new test which reproduces the issue when run several times in mach5. It's a hybrid of my original repro test and another one created by @AlanBateman.
>
> Thanks,
> Patricio
Great sleuthing, this was a really hard one to diagnose.
src/java.base/share/classes/java/lang/VirtualThread.java line 644:
> 642: setState(newState = TIMED_WAIT);
> 643: // May have been notified while in transition. This must be done while
> 644: // holding the monitor to avoid changing the state of a new timed wait call.
"to avoid changing the state of a new timed wait call". It might be clearer to say that we move to the blocked state before the timeout task can execute.
src/java.base/share/classes/java/lang/VirtualThread.java line 652:
> 650: // may have been unblocked already
> 651: if (blockPermit && compareAndSetState(BLOCKED, UNBLOCKED)) {
> 652: lazySubmitRunContinuation();
Moving to lazySubmit is good here; it improves the chance of continuing with the current platform thread as the carrier when notified and unblocked during the transition.
test/jdk/java/lang/Thread/virtual/stress/NotifiedThenTimedOutWait.java line 77:
> 75: }
> 76: });
> 77: var pthread = Thread.ofPlatform().start(() -> {
A future maintainer may wonder why the notify is done in a platform thread in race1, and a virtual thread in race2. We should probably add a comment.
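For readers without the test at hand, the scenario in the quoted description (a notified timed `Object.wait` followed by a timed `Object.wait` that relies on its timeout) can be sketched roughly as below. This is not the test in the PR: the class name `TwoTimedWaits`, the method `runOnce`, and all timeout values are made up for illustration, and it assumes JDK 21 or later for `Thread.ofVirtual()`. The bug depends on the notification racing with the transition in `afterYield`, so a sketch like this would not be expected to reproduce it reliably.

```java
import java.time.Duration;

// Hypothetical sketch, not the NotifiedThenTimedOutWait test from the PR.
class TwoTimedWaits {

    // Runs one round and returns true if the virtual thread terminated
    // within the (arbitrarily chosen) 5 second deadline.
    static boolean runOnce() {
        final Object lock = new Object();
        final boolean[] notified = {false};   // guarded by lock

        Thread vthread = Thread.ofVirtual().start(() -> {
            synchronized (lock) {
                try {
                    // First timed wait: expected to end via notify, not timeout.
                    while (!notified[0]) {
                        lock.wait(10_000);
                    }
                    // Second timed wait: nobody notifies, so the thread
                    // relies on the timeout to wake it up.
                    lock.wait(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        // Notify from a platform thread, as in the quoted test snippet.
        Thread notifier = Thread.ofPlatform().start(() -> {
            synchronized (lock) {
                notified[0] = true;
                lock.notify();
            }
        });

        try {
            notifier.join();
            // A thread stuck after the second wait would miss this deadline.
            return vthread.join(Duration.ofSeconds(5));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            if (!runOnce()) {
                System.out.println("virtual thread did not terminate in round " + i);
                return;
            }
        }
        System.out.println("all rounds completed");
    }
}
```

If the notifier happens to run before the virtual thread reaches the first wait, that wait is skipped and the round only exercises the timeout path, which is one reason a real stress test needs many iterations.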
-------------
PR Review: https://git.openjdk.org/jdk/pull/29255#pullrequestreview-3670903428
PR Review Comment: https://git.openjdk.org/jdk/pull/29255#discussion_r2698546064
PR Review Comment: https://git.openjdk.org/jdk/pull/29255#discussion_r2698550733
PR Review Comment: https://git.openjdk.org/jdk/pull/29255#discussion_r2698564280