RFR: 8332738: Debug agent can deadlock on callbackLock when using StackFrame.PopFrames
Chris Plummer
cjplummer at openjdk.org
Mon Jul 22 20:36:41 UTC 2024
On Mon, 22 Jul 2024 20:32:35 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:
> Fix deadlock with debug agent callbackLock. Details in first comment.
>
> Tested by running all jdi, jdwp, and jdb tests with and without virtual threads about 40 times. The testing was initially done with my hack to force the self suspend (see the first comment in the CR), and then I did final testing without it. I also did final testing with all tier5 svc tests.
There is a deadlock with the callbackLock that currently happens with my work in progress for [JDK-8328866](https://bugs.openjdk.org/browse/JDK-8328866) (ranked monitor support), but it could potentially happen without it. The only thing that saves the current implementation from the deadlock is the fact that RawMonitorExit releases the monitor before doing a self suspend, and there is nothing else done while holding the callbackLock that might allow for a self suspend. However, with [JDK-8328866](https://bugs.openjdk.org/browse/JDK-8328866) there is an opportunity for a self suspend, which is why I started to see callbackLock deadlocks with that work.
The description of the CR contains details on the root cause of this bug.
The main fix for this bug is acquiring the callbackLock in getLocks() to avoid any thread from being suspended while holding the callbackLock. I also fixed some lock order issues that were discovered while working on JDK-8328866. Some of these were necessary to do now, and all will be needed for JDK-8328866, so making the change here will help avoid conflicts.
In stepControl_beginStep() I also added the callbackLock to the list of locks acquired. This is needed to maintain proper lock ordering because threadControl_suspendThread() will be called, which calls getLocks().
In invoker_completeInvokeRequest() I also added the callbackLock to the list of locks acquired. This is needed to maintain proper lock ordering because threadControl_suspendThread() or threadControl_suspendAll() will be called, which both call getLocks(). I also added acquiring the stepLock. This is needed because we already acquire the invokeLock here, and getLocks() acquires the invokeLock after the stepLock, so we must do the same here in order to maintain lock ordering.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20282#issuecomment-2243765072
More information about the serviceability-dev
mailing list