RFR: 8274687: JDWP deadlocks if some Java thread reaches wait in blockOnDebuggerSuspend [v4]
Richard Reingruber
rrich at openjdk.java.net
Fri Oct 15 09:19:48 UTC 2021
On Thu, 14 Oct 2021 18:33:51 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:
>>> Ok, so you need to hold threadLock from the time `blockOnDebuggerSuspend()` is done waiting on it until after `trackAppResume()` is done, and since `trackAppResume()` needs to grab the handlerLock, and you need to grab the handerLock before the threadLock, you need to jump through some lock grabbing/release hoops.
>>
>> That's correct.
>>
>>> However, you are in fact releasing the threadLock for a short time in `blockOnDebuggerSuspend()` after the wait completes. Doesn't this expose the resumee to potential suspending after the wait has completed and before `trackAppResume()` has been called?
>>
>> That's correct too. I wouldn't see an issue with it though. I think this is even a preexisting condition. The current thread can lose the race grabbing threadLock after it was notified to the debugger trying to suspend again (if there wasn't the deadlock of course) and consequently suspendCount can (again) be greater than 0 right after the wait. In that case we simply retry.
>
> Ok. So you just need to reacquire the threadLock before the `findThread()` call and before exiting the while loop, and hold it until after the `trackAppResume()` call. I guess it ok then. But this exiting of the handlerLock here and in `blockOnDebuggerSuspend()` is always going to arouse suspicion for anyone that reads the code. The first question will be if the lock does not need to be held here, why grab it in the first place? We know the answer is this lock ordering issue that turns up when `trackAppResume()` is later called, but this will be far from obvious to the reader. It might be possible to make this a bit clearer with some code restructuring, like maybe combining `blockOnDebuggerSuspend()` and `trackAppResume()` into one function (I'm not sure that this will actually help, but maybe something to consider), but at the very least we need some comments calling out why handlerLock is held in the first place and why it is ok to release it.
I've added the comments. Maybe I should really combine `blockOnDebuggerSuspend()` and `trackAppResume()` into one function...
-------------
PR: https://git.openjdk.java.net/jdk/pull/5849
More information about the serviceability-dev
mailing list