RFR: 8274687: JDWP deadlocks if some Java thread reaches wait in blockOnDebuggerSuspend [v7]
Richard Reingruber
rrich at openjdk.java.net
Thu Oct 21 08:04:04 UTC 2021
On Thu, 21 Oct 2021 03:34:30 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:
>> You already release threadLock temporarily in `trackAppResume()` and you used to release it temporarily in doPendingTasks(). I'm suggesting to instead release it in `threadControl_onEventHandlerExit()` before calling `doPendingTasks()` and then grab it again (after first grabbing the handlerLock) in `doPendingTasks()` before calling `trackAppResume()`. What can be modified in the ThreadNode during that time that you are concerned about? threadLock would still be held when `blockOnDebuggerSuspend()` is called. The only interesting lines that are currently executed while holding the threadLock but no longer would be are:
>>
>> 2193 if (node->handlingAppResume) {
>> 2194 jthread resumer = node->thread;
>> 2195 jthread resumee = getResumee(resumer);
>>
>> I don't think any of those can change, because only the currently executing thread can change them (when the Thread.resume() breakpoint is hit).
>>
>> The suspendCount is only meaningful when in `blockOnDebuggerSuspend()`, and we'll be under the threadLock by that point.
>
> BTW, normally I would not suggest less locking simply because I *thought* it might not be needed. However, you already temporarily release and then regrab the lock. This means there is a period of time where the locking is not needed. What I've suggested is to better define (and expand) that period of time so the locking can be made cleaner. I think the actual expansion of time where the lock would no longer be held is pretty small, with a well defined set of code being executed that to me does not appear to require the lock, and if it did require the lock, then probably releasing the lock in the first place is a bug.
> move the locking and unlocking into doPendingTasks(). At the start of the
> node->handlingAppResume block, grab handlerLock and threadLock, and explain
> that handlerLock is being grabbed because trackAppResume() will (indirectly)
> try to grab it, and grabbing it before threadLock is necessary for lock
> ordering. Then grab threadLock and hold onto it until the end of the block.
So threadLock would be released at the end of the new block that depends on
node->handlingAppResume. After that block follow accesses to `pendingInterrupt`
and `pendingStop` that would be unsynchronized then. I think they do need to be
synchronized through threadLock though.
-------------
PR: https://git.openjdk.java.net/jdk/pull/5849
More information about the serviceability-dev
mailing list