RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

Thu Oct 3 23:35:06 UTC 2019

On 10/3/19 3:33 PM, David Holmes wrote:
> On 4/10/2019 3:15 am, serguei.spitsyn at oracle.com wrote:
>> On 10/3/19 03:13, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 3/10/2019 3:20 am, Daniel D. Daugherty wrote:
>>>> Sorry for the delay in reviewing this one... I've been playing 
>>>> whack-a-mole
>>>> with Robbin's MoCrazy test and my AsyncMonitorDeflation bits...
>>>
>>> No problem - your contribution made the wait worthwhile :)
>>>
>>>> On 9/24/19 1:09 AM, David Holmes wrote:
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231289
>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8231289/webrev/
>>>>
>>>> src/hotspot/share/prims/jvmtiEnv.cpp
>>>>      Thanks for removing the PROPER_TRANSITIONS stuff. That was old
>>>>      and crufty stuff.
>>>>
>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp
>>>>      No comments.
>>>>
>>>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>>>      L39:   new (ResourceObj::C_HEAP, mtInternal) 
>>>> GrowableArray<JvmtiRawMonitor*>(1,true);
>>>>          nit - need a space between ',' and 'true'.
>>>>
>>>>          Update: leave for your follow-up bug.
>>>
>>> Fixed now so I don't forget later. :)
>>>
>>>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>>>      No comments.
>>>>
>>>> src/hotspot/share/runtime/objectMonitor.hpp
>>>>      Glad I added those 'protected for JvmtiRawMonitor' in one
>>>>      of my recent cleanup bugs. Obviously I'll have to merge
>>>>      with Async Monitor Deflation. :-)
>>>>
>>>> src/hotspot/share/runtime/thread.cpp
>>>>      No comments.
>>>>
>>>> src/hotspot/share/runtime/thread.hpp
>>>>      No comments.
>>>>
>>>> src/hotspot/share/services/threadService.cpp
>>>>      L397:     waitingToLockMonitor = jt->current_pending_monitor();
>>>>      L398:     if (waitingToLockMonitor == NULL) {
>>>>      L399:       // we can only be blocked on a raw monitor if not 
>>>> blocked on an ObjectMonitor
>>>>      L400:       waitingToLockRawMonitor = 
>>>> jt->current_pending_raw_monitor();
>>>>      L401:     }
>>>>
>>>>          JVM/TI has this event handler:
>>>>
>>>>            typedef void (JNICALL *jvmtiEventMonitorContendedEnter)
>>>>                (jvmtiEnv *jvmti_env,
>>>>                 JNIEnv* jni_env,
>>>>                 jthread thread,
>>>>                 jobject object);
>>>>
>>>>          This event handler is called after 
>>>> set_current_pending_monitor()
>>>>          and if the event handler uses a RawMonitor, then it 
>>>> possible for
>>>>          for the thread to show up as blocked on both a Java 
>>>> monitor and
>>>>          a JVM/TI RawMonitor.
>>>
>>> Oh that is interesting - good catch! So that means the current code 
>>> is broken because the raw monitor will replace the ObjectMonitor as 
>>> the pending monitor and then set it back to NULL, thus losing the 
>>> fact the thread is actually pending on the ObjectMonitor. And of 
>>> course while the pending monitor is the raw monitor that totally 
>>> messes up the deadlock detection as the ObjectMonitor is missing 
>>> from consideration. :(
>>
>> If I remember correctly it is a scenario where this issue also comes 
>> to the picture:
>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>
>> I do not really understand how shared ParkEvent can be used/consumed 
>> by both ObjectMonitor and RawMonitor on the same thread.
>> But we observed and investigated this problem several years ago and 
>> Dan finally filed this enhancement.
>
> I still don't see how this is possible as you are not actually 
> enqueued on the ObjectMonitor when the call out to the event callback 
> is made. but that is a topic for another email thread. :)

Agreed.
Just wanted to point out it can be related.
Dan filed this RFE and can have more knowledge.

> Meanwhile what to do about broken deadlock detection ... :(

It is a good catch from Dan.

Thanks,
Serguei

>
> Thanks,
> David
>
>> Thanks,
>> Serguei
>>
>>
>>
>>
>>> This also probably means that you can have a pending raw monitor at 
>>> the same time as you have a "Blocker" as I'm pretty sure there are 
>>> various JVM TI event handlers that may execute between the Blocker 
>>> being set and the actual park. So that would be an additional 
>>> breakage in the existing code.
>>>
>>> Back to my code and I have two problems. The second, which is easy 
>>> to address, is the deadlock printing code. I'll hoist the 
>>> waitingToLockRawMonitor chunk to the top so it is executed 
>>> independent of the waitingToLockMonitor value (which remains in an 
>>> if/else relationship with the waitingToLockBlocker). But now that we 
>>> might print two "records" at a time I have to make additional 
>>> changes to get meaningful output for the current thread (which is 
>>> handled as a common code after the if/else block to finish whichever 
>>> record was being printed). Also I can no longer use "continue" as 
>>> the 3 outcomes are not mutually exclusive - so this could get a bit 
>>> messy. :(
>>>
>>> So definitely a v2 webrev on the way.
>>>
>>> But before that I need to solve my first problem - and I don't know 
>>> how. Now that it is apparent that a thread can be blocked on both a 
>>> raw monitor and an ObjectMonitor at the same time, I have no idea 
>>> how to actually account for this in the deadlock detection code. 
>>> That code has a while loop that expects to at most find either a 
>>> locked ObjectMonitor or j.u.c Blocker, and it adds the owner thread 
>>> to the cycle detection, then moves on. But now I can have two 
>>> different owner threads in the same loop iteration. I don't know how 
>>> to account for that.
>>>
>>> Given that it seems to me that the current code is already broken if 
>>> we encounter these conditions, then perhaps all I can do is handle 
>>> the other cases, where the blocking reasons are mutually exclusive, 
>>> and not try to fix things? i.e. leave lines #434 to #440 as they are 
>>> in webrev v1 - which implies no change to line #398 except the 
>>> comment ... ??
>>>
>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp 
>>>>
>>>>      No comments.
>>>>
>>>>
>>>> Thumbs up! The only non-nit I have is the setting of 
>>>> waitingToLockRawMonitor
>>>> on L400 and the corresponding comment on L399. Everything else is a 
>>>> nit.
>>>>
>>>> I don't need to see a new webrev.
>>>
>>> If only that were true :(
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks for tackling this disentangle issue!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> The earlier attempt to rewrite JvmtiRawMonitor as a simple wrapper 
>>>>> around PlatformMonitor proved not so simple and ultimately had too 
>>>>> many issues due to the need to support Thread.interrupt.
>>>>>
>>>>> I'd previously stated in the bug report:
>>>>>
>>>>> "In the worst-case I suppose we could just copy ObjectMonitor to a 
>>>>> new class and have JvmtiRawMonitor continue to extend that (with 
>>>>> some additional minor adjustments) - or even just inline it all as 
>>>>> needed."
>>>>>
>>>>> but hadn't looked at it in detail. Richard Reingruber did look at 
>>>>> it and pointed out that it is actually quite simple - we barely 
>>>>> use any actual code from ObjectMonitor, mainly just the state. So 
>>>>> thanks Richard! :)
>>>>>
>>>>> So this change basically copies or moves anything needed by 
>>>>> JvmtiRawMonitor from ObjectMonitor, breaking the connection 
>>>>> between the two. We also copy and simplify ObjectWaiter, turning 
>>>>> it into a QNode internal class. There is then a lot of cleanup 
>>>>> that was applied (and a lot more that could still be done):
>>>>>
>>>>> - Removed the never implemented/used PROPER_TRANSITIONS ifdefs
>>>>> - Fixed the disconnect between the types of non-JavaThreads 
>>>>> expected by the upper layer code and lower layer code
>>>>> - cleaned up and simplified return codes
>>>>> - consolidated code that is identical for JavaThreads and 
>>>>> non-JavaThreads (e.g. notify/notifyAll).
>>>>> - removed used of TRAPS/THREAD where not appropriate and replaced 
>>>>> with "Thread * Self" in the style of the rest of the code
>>>>> - changed recursions to be int rather than intptr_t (a "fixme" in 
>>>>> the ObjectMonitor code)
>>>>>
>>>>>
>>>>> I have not changed the many style flaws with this code:
>>>>> - Capitalized names
>>>>> - extra spaces before ;
>>>>> - ...
>>>>>
>>>>> but could do so if needed. I wanted to try and keep it more 
>>>>> obvious that the fundamental functional code is actually unmodified.
>>>>>
>>>>> There is one aspect that requires further explanation: the notion 
>>>>> of current pending monitor. The "current pending monitor" is 
>>>>> stored in the Thread and used by a number of introspection APIs 
>>>>> for things like finding monitors, doing deadlock detection, etc. 
>>>>> The JvmtiRawMonitor code would also set/clear itself as "current 
>>>>> pending monitor". Most uses of the current pending monitor 
>>>>> actually, explicitly or implicitly, ignore the case when the 
>>>>> monitor is a JvmtiRawMonitor (observed by the fact the 
>>>>> mon->object() query returns NULL). The exception to that is 
>>>>> deadlock detection where raw monitors are at least partially 
>>>>> accounted for. To preserve that I added the notion of "current 
>>>>> pending raw monitor" and updated the deadlock detection code to 
>>>>> use that.
>>>>>
>>>>> The test:
>>>>>
>>>>>
>>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp 
>>>>>
>>>>>
>>>>> was updated because I'd noticed previously that it was the only 
>>>>> test that used interrupt with raw monitors, but was in fact 
>>>>> broken: the test thread is a daemon thread so the main thread 
>>>>> could terminate the VM immediately after the interrupt() call, 
>>>>> thus you would never know if the interruption actually worked as 
>>>>> expected.
>>>>>
>>>>> Testing:
>>>>>  - tiers 1 - 3
>>>>>  - vmTestbase/nsk/monitoring/  (for deadlock detection**)
>>>>>  - vmTestbase/nsk/jdwp
>>>>>  - vmTestbase/nsk/jdb/
>>>>>  - vmTestbase/nsk/jdi/
>>>>>  - vmTestbase/nsk/jvmti/
>>>>>  - serviceability/jvmti/
>>>>>  - serviceability/jdwp
>>>>>  - JDK: java/lang/management
>>>>>
>>>>> ** There are no existing deadlock related tests involving 
>>>>> JvmtiRawMonitor. It would be interesting/useful to add them to the 
>>>>> existing nsk/monitoring tests that cover synchronized and JNI 
>>>>> locking. But it's a non-trivial enhancement that I don't really 
>>>>> have time to do.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>
>>