RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

David Holmes david.holmes at oracle.com
Thu Oct 3 13:26:43 UTC 2019



On 3/10/2019 11:11 pm, Daniel D. Daugherty wrote:
> On 10/3/19 6:13 AM, David Holmes wrote:
>> Hi Dan,
>>
>> On 3/10/2019 3:20 am, Daniel D. Daugherty wrote:
>>> Sorry for the delay in reviewing this one... I've been playing 
>>> whack-a-mole
>>> with Robbin's MoCrazy test and my AsyncMonitorDeflation bits...
>>
>> No problem - your contribution made the wait worthwhile :)
>>
>>> On 9/24/19 1:09 AM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231289
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8231289/webrev/
>>>
>>> src/hotspot/share/prims/jvmtiEnv.cpp
>>>      Thanks for removing the PROPER_TRANSITIONS stuff. That was old
>>>      and crufty stuff.
>>>
>>> src/hotspot/share/prims/jvmtiEnvBase.cpp
>>>      No comments.
>>>
>>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>>      L39:   new (ResourceObj::C_HEAP, mtInternal) 
>>> GrowableArray<JvmtiRawMonitor*>(1,true);
>>>          nit - need a space between ',' and 'true'.
>>>
>>>          Update: leave for your follow-up bug.
>>
>> Fixed now so I don't forget later. :)
>>
>>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>>      No comments.
>>>
>>> src/hotspot/share/runtime/objectMonitor.hpp
>>>      Glad I added those 'protected for JvmtiRawMonitor' in one
>>>      of my recent cleanup bugs. Obviously I'll have to merge
>>>      with Async Monitor Deflation. :-)
>>>
>>> src/hotspot/share/runtime/thread.cpp
>>>      No comments.
>>>
>>> src/hotspot/share/runtime/thread.hpp
>>>      No comments.
>>>
>>> src/hotspot/share/services/threadService.cpp
>>>      L397:     waitingToLockMonitor = jt->current_pending_monitor();
>>>      L398:     if (waitingToLockMonitor == NULL) {
>>>      L399:       // we can only be blocked on a raw monitor if not 
>>> blocked on an ObjectMonitor
>>>      L400:       waitingToLockRawMonitor = 
>>> jt->current_pending_raw_monitor();
>>>      L401:     }
>>>
>>>          JVM/TI has this event handler:
>>>
>>>            typedef void (JNICALL *jvmtiEventMonitorContendedEnter)
>>>                (jvmtiEnv *jvmti_env,
>>>                 JNIEnv* jni_env,
>>>                 jthread thread,
>>>                 jobject object);
>>>
>>>          This event handler is called after 
>>> set_current_pending_monitor()
>>>          and if the event handler uses a RawMonitor, then it possible 
>>> for
>>>          for the thread to show up as blocked on both a Java monitor and
>>>          a JVM/TI RawMonitor.
>>
>> Oh that is interesting - good catch! So that means the current code is 
>> broken because the raw monitor will replace the ObjectMonitor as the 
>> pending monitor and then set it back to NULL, thus losing the fact the 
>> thread is actually pending on the ObjectMonitor. And of course while 
>> the pending monitor is the raw monitor that totally messes up the 
>> deadlock detection as the ObjectMonitor is missing from consideration. :(
>>
>> This also probably means that you can have a pending raw monitor at 
>> the same time as you have a "Blocker" as I'm pretty sure there are 
>> various JVM TI event handlers that may execute between the Blocker 
>> being set and the actual park. So that would be an additional breakage 
>> in the existing code.
>>
>> Back to my code and I have two problems. The second, which is easy to 
>> address, is the deadlock printing code. I'll hoist the 
>> waitingToLockRawMonitor chunk to the top so it is executed independent 
>> of the waitingToLockMonitor value (which remains in an if/else 
>> relationship with the waitingToLockBlocker). But now that we might 
>> print two "records" at a time I have to make additional changes to get 
>> meaningful output for the current thread (which is handled as a common 
>> code after the if/else block to finish whichever record was being 
>> printed). Also I can no longer use "continue" as the 3 outcomes are 
>> not mutually exclusive - so this could get a bit messy. :(
>>
>> So definitely a v2 webrev on the way.
>>
>> But before that I need to solve my first problem - and I don't know 
>> how. Now that it is apparent that a thread can be blocked on both a 
>> raw monitor and an ObjectMonitor at the same time, I have no idea how 
>> to actually account for this in the deadlock detection code. That code 
>> has a while loop that expects to at most find either a locked 
>> ObjectMonitor or j.u.c Blocker, and it adds the owner thread to the 
>> cycle detection, then moves on. But now I can have two different owner 
>> threads in the same loop iteration. I don't know how to account for that.
>>
>> Given that it seems to me that the current code is already broken if 
>> we encounter these conditions, then perhaps all I can do is handle the 
>> other cases, where the blocking reasons are mutually exclusive, and 
>> not try to fix things? i.e. leave lines #434 to #440 as they are in 
>> webrev v1 - which implies no change to line #398 except the comment 
>> ... ??
> 
> Ouch. Sorry I didn't mean to throw such a large monkey wrench into the
> mix... I skimmed the JVM/TI spec again looking for anything that might
> help ease the situation, but had no luck.
> 
> Perhaps approach it from a slightly different perspective...
> 
>      If both waitingToLockMonitor and waitingToLockRawMonitor are set
>      on the same thread, then waitingToLockRawMonitor should take
>      precedence since we are not yet truly blocked on the
>      waitingToLockMonitor condition. After we did continue to execute
>      and that got us into code that got waitingToLockRawMonitor set.

I don't see that. The raw monitor usage is incidental to the event 
callback. There could be a real deadlock with the ObjectMonitor, but 
we're just transiently blocked on the raw monitor.

>      Will that help from a deadlock detection point of view? Maybe.
>      If our target thread is holding some other lock before it logically
>      blocked on waitingToLockMonitor, it is still useful to report that
>      it is now blocked on waitingToLockRawMonitor. After all the block
>      on waitingToLockRawMonitor is also contributing to the deadlock.

It isn't the reporting that is the issue it is the actual deadlock 
detection logic. That code as written can't accommodate being blocked on 
two different "locks" with potentially two different owners, at the same 
time. To me it just breaks the whole approach that has been taken to 
detect cycles in the locking.

Thanks,
David
-----

>      Once the developer solves the RawMonitor deadlock cause, there may
>      still be another deadlock related to waitingToLockMonitor, but we've
>      reported one layer of the onion.
> 
> Food for thought...
> 
> Dan
> 
> 
>>
>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp 
>>>
>>>      No comments.
>>>
>>>
>>> Thumbs up! The only non-nit I have is the setting of 
>>> waitingToLockRawMonitor
>>> on L400 and the corresponding comment on L399. Everything else is a nit.
>>>
>>> I don't need to see a new webrev.
>>
>> If only that were true :(
>>
>> Thanks,
>> David
>>
>>> Thanks for tackling this disentangle issue!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> The earlier attempt to rewrite JvmtiRawMonitor as a simple wrapper 
>>>> around PlatformMonitor proved not so simple and ultimately had too 
>>>> many issues due to the need to support Thread.interrupt.
>>>>
>>>> I'd previously stated in the bug report:
>>>>
>>>> "In the worst-case I suppose we could just copy ObjectMonitor to a 
>>>> new class and have JvmtiRawMonitor continue to extend that (with 
>>>> some additional minor adjustments) - or even just inline it all as 
>>>> needed."
>>>>
>>>> but hadn't looked at it in detail. Richard Reingruber did look at it 
>>>> and pointed out that it is actually quite simple - we barely use any 
>>>> actual code from ObjectMonitor, mainly just the state. So thanks 
>>>> Richard! :)
>>>>
>>>> So this change basically copies or moves anything needed by 
>>>> JvmtiRawMonitor from ObjectMonitor, breaking the connection between 
>>>> the two. We also copy and simplify ObjectWaiter, turning it into a 
>>>> QNode internal class. There is then a lot of cleanup that was 
>>>> applied (and a lot more that could still be done):
>>>>
>>>> - Removed the never implemented/used PROPER_TRANSITIONS ifdefs
>>>> - Fixed the disconnect between the types of non-JavaThreads expected 
>>>> by the upper layer code and lower layer code
>>>> - cleaned up and simplified return codes
>>>> - consolidated code that is identical for JavaThreads and 
>>>> non-JavaThreads (e.g. notify/notifyAll).
>>>> - removed used of TRAPS/THREAD where not appropriate and replaced 
>>>> with "Thread * Self" in the style of the rest of the code
>>>> - changed recursions to be int rather than intptr_t (a "fixme" in 
>>>> the ObjectMonitor code)
>>>>
>>>>
>>>> I have not changed the many style flaws with this code:
>>>> - Capitalized names
>>>> - extra spaces before ;
>>>> - ...
>>>>
>>>> but could do so if needed. I wanted to try and keep it more obvious 
>>>> that the fundamental functional code is actually unmodified.
>>>>
>>>> There is one aspect that requires further explanation: the notion of 
>>>> current pending monitor. The "current pending monitor" is stored in 
>>>> the Thread and used by a number of introspection APIs for things 
>>>> like finding monitors, doing deadlock detection, etc. The 
>>>> JvmtiRawMonitor code would also set/clear itself as "current pending 
>>>> monitor". Most uses of the current pending monitor actually, 
>>>> explicitly or implicitly, ignore the case when the monitor is a 
>>>> JvmtiRawMonitor (observed by the fact the mon->object() query 
>>>> returns NULL). The exception to that is deadlock detection where raw 
>>>> monitors are at least partially accounted for. To preserve that I 
>>>> added the notion of "current pending raw monitor" and updated the 
>>>> deadlock detection code to use that.
>>>>
>>>> The test:
>>>>
>>>>
>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp 
>>>>
>>>>
>>>> was updated because I'd noticed previously that it was the only test 
>>>> that used interrupt with raw monitors, but was in fact broken: the 
>>>> test thread is a daemon thread so the main thread could terminate 
>>>> the VM immediately after the interrupt() call, thus you would never 
>>>> know if the interruption actually worked as expected.
>>>>
>>>> Testing:
>>>>  - tiers 1 - 3
>>>>  - vmTestbase/nsk/monitoring/  (for deadlock detection**)
>>>>  - vmTestbase/nsk/jdwp
>>>>  - vmTestbase/nsk/jdb/
>>>>  - vmTestbase/nsk/jdi/
>>>>  - vmTestbase/nsk/jvmti/
>>>>  - serviceability/jvmti/
>>>>  - serviceability/jdwp
>>>>  - JDK: java/lang/management
>>>>
>>>> ** There are no existing deadlock related tests involving 
>>>> JvmtiRawMonitor. It would be interesting/useful to add them to the 
>>>> existing nsk/monitoring tests that cover synchronized and JNI 
>>>> locking. But it's a non-trivial enhancement that I don't really have 
>>>> time to do.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>
> 


More information about the hotspot-runtime-dev mailing list