RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

Thu Oct 3 17:15:56 UTC 2019

On 10/3/19 03:13, David Holmes wrote:
> Hi Dan,
>
> On 3/10/2019 3:20 am, Daniel D. Daugherty wrote:
>> Sorry for the delay in reviewing this one... I've been playing 
>> whack-a-mole
>> with Robbin's MoCrazy test and my AsyncMonitorDeflation bits...
>
> No problem - your contribution made the wait worthwhile :)
>
>> On 9/24/19 1:09 AM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231289
>>> webrev: http://cr.openjdk.java.net/~dholmes/8231289/webrev/
>>
>> src/hotspot/share/prims/jvmtiEnv.cpp
>>      Thanks for removing the PROPER_TRANSITIONS stuff. That was old
>>      and crufty stuff.
>>
>> src/hotspot/share/prims/jvmtiEnvBase.cpp
>>      No comments.
>>
>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>      L39:   new (ResourceObj::C_HEAP, mtInternal) 
>> GrowableArray<JvmtiRawMonitor*>(1,true);
>>          nit - need a space between ',' and 'true'.
>>
>>          Update: leave for your follow-up bug.
>
> Fixed now so I don't forget later. :)
>
>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>      No comments.
>>
>> src/hotspot/share/runtime/objectMonitor.hpp
>>      Glad I added those 'protected for JvmtiRawMonitor' in one
>>      of my recent cleanup bugs. Obviously I'll have to merge
>>      with Async Monitor Deflation. :-)
>>
>> src/hotspot/share/runtime/thread.cpp
>>      No comments.
>>
>> src/hotspot/share/runtime/thread.hpp
>>      No comments.
>>
>> src/hotspot/share/services/threadService.cpp
>>      L397:     waitingToLockMonitor = jt->current_pending_monitor();
>>      L398:     if (waitingToLockMonitor == NULL) {
>>      L399:       // we can only be blocked on a raw monitor if not 
>> blocked on an ObjectMonitor
>>      L400:       waitingToLockRawMonitor = 
>> jt->current_pending_raw_monitor();
>>      L401:     }
>>
>>          JVM/TI has this event handler:
>>
>>            typedef void (JNICALL *jvmtiEventMonitorContendedEnter)
>>                (jvmtiEnv *jvmti_env,
>>                 JNIEnv* jni_env,
>>                 jthread thread,
>>                 jobject object);
>>
>>          This event handler is called after 
>> set_current_pending_monitor()
>>          and if the event handler uses a RawMonitor, then it possible 
>> for
>>          for the thread to show up as blocked on both a Java monitor and
>>          a JVM/TI RawMonitor.
>
> Oh that is interesting - good catch! So that means the current code is 
> broken because the raw monitor will replace the ObjectMonitor as the 
> pending monitor and then set it back to NULL, thus losing the fact the 
> thread is actually pending on the ObjectMonitor. And of course while 
> the pending monitor is the raw monitor that totally messes up the 
> deadlock detection as the ObjectMonitor is missing from consideration. :(

If I remember correctly it is a scenario where this issue also comes to 
the picture:
https://bugs.openjdk.java.net/browse/JDK-8033399

I do not really understand how shared ParkEvent can be used/consumed by 
both ObjectMonitor and RawMonitor on the same thread.
But we observed and investigated this problem several years ago and Dan 
finally filed this enhancement.

Thanks,
Serguei

> This also probably means that you can have a pending raw monitor at 
> the same time as you have a "Blocker" as I'm pretty sure there are 
> various JVM TI event handlers that may execute between the Blocker 
> being set and the actual park. So that would be an additional breakage 
> in the existing code.
>
> Back to my code and I have two problems. The second, which is easy to 
> address, is the deadlock printing code. I'll hoist the 
> waitingToLockRawMonitor chunk to the top so it is executed independent 
> of the waitingToLockMonitor value (which remains in an if/else 
> relationship with the waitingToLockBlocker). But now that we might 
> print two "records" at a time I have to make additional changes to get 
> meaningful output for the current thread (which is handled as a common 
> code after the if/else block to finish whichever record was being 
> printed). Also I can no longer use "continue" as the 3 outcomes are 
> not mutually exclusive - so this could get a bit messy. :(
>
> So definitely a v2 webrev on the way.
>
> But before that I need to solve my first problem - and I don't know 
> how. Now that it is apparent that a thread can be blocked on both a 
> raw monitor and an ObjectMonitor at the same time, I have no idea how 
> to actually account for this in the deadlock detection code. That code 
> has a while loop that expects to at most find either a locked 
> ObjectMonitor or j.u.c Blocker, and it adds the owner thread to the 
> cycle detection, then moves on. But now I can have two different owner 
> threads in the same loop iteration. I don't know how to account for that.
>
> Given that it seems to me that the current code is already broken if 
> we encounter these conditions, then perhaps all I can do is handle the 
> other cases, where the blocking reasons are mutually exclusive, and 
> not try to fix things? i.e. leave lines #434 to #440 as they are in 
> webrev v1 - which implies no change to line #398 except the comment 
> ... ??
>
>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp 
>>
>>      No comments.
>>
>>
>> Thumbs up! The only non-nit I have is the setting of 
>> waitingToLockRawMonitor
>> on L400 and the corresponding comment on L399. Everything else is a nit.
>>
>> I don't need to see a new webrev.
>
> If only that were true :(
>
> Thanks,
> David
>
>> Thanks for tackling this disentangle issue!
>>
>> Dan
>>
>>
>>>
>>> The earlier attempt to rewrite JvmtiRawMonitor as a simple wrapper 
>>> around PlatformMonitor proved not so simple and ultimately had too 
>>> many issues due to the need to support Thread.interrupt.
>>>
>>> I'd previously stated in the bug report:
>>>
>>> "In the worst-case I suppose we could just copy ObjectMonitor to a 
>>> new class and have JvmtiRawMonitor continue to extend that (with 
>>> some additional minor adjustments) - or even just inline it all as 
>>> needed."
>>>
>>> but hadn't looked at it in detail. Richard Reingruber did look at it 
>>> and pointed out that it is actually quite simple - we barely use any 
>>> actual code from ObjectMonitor, mainly just the state. So thanks 
>>> Richard! :)
>>>
>>> So this change basically copies or moves anything needed by 
>>> JvmtiRawMonitor from ObjectMonitor, breaking the connection between 
>>> the two. We also copy and simplify ObjectWaiter, turning it into a 
>>> QNode internal class. There is then a lot of cleanup that was 
>>> applied (and a lot more that could still be done):
>>>
>>> - Removed the never implemented/used PROPER_TRANSITIONS ifdefs
>>> - Fixed the disconnect between the types of non-JavaThreads expected 
>>> by the upper layer code and lower layer code
>>> - cleaned up and simplified return codes
>>> - consolidated code that is identical for JavaThreads and 
>>> non-JavaThreads (e.g. notify/notifyAll).
>>> - removed used of TRAPS/THREAD where not appropriate and replaced 
>>> with "Thread * Self" in the style of the rest of the code
>>> - changed recursions to be int rather than intptr_t (a "fixme" in 
>>> the ObjectMonitor code)
>>>
>>>
>>> I have not changed the many style flaws with this code:
>>> - Capitalized names
>>> - extra spaces before ;
>>> - ...
>>>
>>> but could do so if needed. I wanted to try and keep it more obvious 
>>> that the fundamental functional code is actually unmodified.
>>>
>>> There is one aspect that requires further explanation: the notion of 
>>> current pending monitor. The "current pending monitor" is stored in 
>>> the Thread and used by a number of introspection APIs for things 
>>> like finding monitors, doing deadlock detection, etc. The 
>>> JvmtiRawMonitor code would also set/clear itself as "current pending 
>>> monitor". Most uses of the current pending monitor actually, 
>>> explicitly or implicitly, ignore the case when the monitor is a 
>>> JvmtiRawMonitor (observed by the fact the mon->object() query 
>>> returns NULL). The exception to that is deadlock detection where raw 
>>> monitors are at least partially accounted for. To preserve that I 
>>> added the notion of "current pending raw monitor" and updated the 
>>> deadlock detection code to use that.
>>>
>>> The test:
>>>
>>>
>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp 
>>>
>>>
>>> was updated because I'd noticed previously that it was the only test 
>>> that used interrupt with raw monitors, but was in fact broken: the 
>>> test thread is a daemon thread so the main thread could terminate 
>>> the VM immediately after the interrupt() call, thus you would never 
>>> know if the interruption actually worked as expected.
>>>
>>> Testing:
>>>  - tiers 1 - 3
>>>  - vmTestbase/nsk/monitoring/  (for deadlock detection**)
>>>  - vmTestbase/nsk/jdwp
>>>  - vmTestbase/nsk/jdb/
>>>  - vmTestbase/nsk/jdi/
>>>  - vmTestbase/nsk/jvmti/
>>>  - serviceability/jvmti/
>>>  - serviceability/jdwp
>>>  - JDK: java/lang/management
>>>
>>> ** There are no existing deadlock related tests involving 
>>> JvmtiRawMonitor. It would be interesting/useful to add them to the 
>>> existing nsk/monitoring tests that cover synchronized and JNI 
>>> locking. But it's a non-trivial enhancement that I don't really have 
>>> time to do.
>>>
>>> Thanks,
>>> David
>>> -----
>>