RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up
serguei.spitsyn at oracle.com
serguei.spitsyn at oracle.com
Thu Oct 3 23:35:06 UTC 2019
On 10/3/19 3:33 PM, David Holmes wrote:
> On 4/10/2019 3:15 am, serguei.spitsyn at oracle.com wrote:
>> On 10/3/19 03:13, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 3/10/2019 3:20 am, Daniel D. Daugherty wrote:
>>>> Sorry for the delay in reviewing this one... I've been playing
>>>> whack-a-mole
>>>> with Robbin's MoCrazy test and my AsyncMonitorDeflation bits...
>>>
>>> No problem - your contribution made the wait worthwhile :)
>>>
>>>> On 9/24/19 1:09 AM, David Holmes wrote:
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231289
>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8231289/webrev/
>>>>
>>>> src/hotspot/share/prims/jvmtiEnv.cpp
>>>> Thanks for removing the PROPER_TRANSITIONS stuff. That was old
>>>> and crufty stuff.
>>>>
>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp
>>>> No comments.
>>>>
>>>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>>> L39: new (ResourceObj::C_HEAP, mtInternal)
>>>> GrowableArray<JvmtiRawMonitor*>(1,true);
>>>> nit - need a space between ',' and 'true'.
>>>>
>>>> Update: leave for your follow-up bug.
>>>
>>> Fixed now so I don't forget later. :)
>>>
>>>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>>> No comments.
>>>>
>>>> src/hotspot/share/runtime/objectMonitor.hpp
>>>> Glad I added those 'protected for JvmtiRawMonitor' in one
>>>> of my recent cleanup bugs. Obviously I'll have to merge
>>>> with Async Monitor Deflation. :-)
>>>>
>>>> src/hotspot/share/runtime/thread.cpp
>>>> No comments.
>>>>
>>>> src/hotspot/share/runtime/thread.hpp
>>>> No comments.
>>>>
>>>> src/hotspot/share/services/threadService.cpp
>>>> L397: waitingToLockMonitor = jt->current_pending_monitor();
>>>> L398: if (waitingToLockMonitor == NULL) {
>>>> L399: // we can only be blocked on a raw monitor if not
>>>> blocked on an ObjectMonitor
>>>> L400: waitingToLockRawMonitor =
>>>> jt->current_pending_raw_monitor();
>>>> L401: }
>>>>
>>>> JVM/TI has this event handler:
>>>>
>>>> typedef void (JNICALL *jvmtiEventMonitorContendedEnter)
>>>> (jvmtiEnv *jvmti_env,
>>>> JNIEnv* jni_env,
>>>> jthread thread,
>>>> jobject object);
>>>>
>>>> This event handler is called after
>>>> set_current_pending_monitor()
>>>> and if the event handler uses a RawMonitor, then it
>>>> possible for
>>>> for the thread to show up as blocked on both a Java
>>>> monitor and
>>>> a JVM/TI RawMonitor.
>>>
>>> Oh that is interesting - good catch! So that means the current code
>>> is broken because the raw monitor will replace the ObjectMonitor as
>>> the pending monitor and then set it back to NULL, thus losing the
>>> fact the thread is actually pending on the ObjectMonitor. And of
>>> course while the pending monitor is the raw monitor that totally
>>> messes up the deadlock detection as the ObjectMonitor is missing
>>> from consideration. :(
>>
>> If I remember correctly it is a scenario where this issue also comes
>> to the picture:
>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>
>> I do not really understand how shared ParkEvent can be used/consumed
>> by both ObjectMonitor and RawMonitor on the same thread.
>> But we observed and investigated this problem several years ago and
>> Dan finally filed this enhancement.
>
> I still don't see how this is possible as you are not actually
> enqueued on the ObjectMonitor when the call out to the event callback
> is made. but that is a topic for another email thread. :)
Agreed.
Just wanted to point out it can be related.
Dan filed this RFE and can have more knowledge.
> Meanwhile what to do about broken deadlock detection ... :(
It is a good catch from Dan.
Thanks,
Serguei
>
> Thanks,
> David
>
>> Thanks,
>> Serguei
>>
>>
>>
>>
>>> This also probably means that you can have a pending raw monitor at
>>> the same time as you have a "Blocker" as I'm pretty sure there are
>>> various JVM TI event handlers that may execute between the Blocker
>>> being set and the actual park. So that would be an additional
>>> breakage in the existing code.
>>>
>>> Back to my code and I have two problems. The second, which is easy
>>> to address, is the deadlock printing code. I'll hoist the
>>> waitingToLockRawMonitor chunk to the top so it is executed
>>> independent of the waitingToLockMonitor value (which remains in an
>>> if/else relationship with the waitingToLockBlocker). But now that we
>>> might print two "records" at a time I have to make additional
>>> changes to get meaningful output for the current thread (which is
>>> handled as a common code after the if/else block to finish whichever
>>> record was being printed). Also I can no longer use "continue" as
>>> the 3 outcomes are not mutually exclusive - so this could get a bit
>>> messy. :(
>>>
>>> So definitely a v2 webrev on the way.
>>>
>>> But before that I need to solve my first problem - and I don't know
>>> how. Now that it is apparent that a thread can be blocked on both a
>>> raw monitor and an ObjectMonitor at the same time, I have no idea
>>> how to actually account for this in the deadlock detection code.
>>> That code has a while loop that expects to at most find either a
>>> locked ObjectMonitor or j.u.c Blocker, and it adds the owner thread
>>> to the cycle detection, then moves on. But now I can have two
>>> different owner threads in the same loop iteration. I don't know how
>>> to account for that.
>>>
>>> Given that it seems to me that the current code is already broken if
>>> we encounter these conditions, then perhaps all I can do is handle
>>> the other cases, where the blocking reasons are mutually exclusive,
>>> and not try to fix things? i.e. leave lines #434 to #440 as they are
>>> in webrev v1 - which implies no change to line #398 except the
>>> comment ... ??
>>>
>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
>>>>
>>>> No comments.
>>>>
>>>>
>>>> Thumbs up! The only non-nit I have is the setting of
>>>> waitingToLockRawMonitor
>>>> on L400 and the corresponding comment on L399. Everything else is a
>>>> nit.
>>>>
>>>> I don't need to see a new webrev.
>>>
>>> If only that were true :(
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks for tackling this disentangle issue!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> The earlier attempt to rewrite JvmtiRawMonitor as a simple wrapper
>>>>> around PlatformMonitor proved not so simple and ultimately had too
>>>>> many issues due to the need to support Thread.interrupt.
>>>>>
>>>>> I'd previously stated in the bug report:
>>>>>
>>>>> "In the worst-case I suppose we could just copy ObjectMonitor to a
>>>>> new class and have JvmtiRawMonitor continue to extend that (with
>>>>> some additional minor adjustments) - or even just inline it all as
>>>>> needed."
>>>>>
>>>>> but hadn't looked at it in detail. Richard Reingruber did look at
>>>>> it and pointed out that it is actually quite simple - we barely
>>>>> use any actual code from ObjectMonitor, mainly just the state. So
>>>>> thanks Richard! :)
>>>>>
>>>>> So this change basically copies or moves anything needed by
>>>>> JvmtiRawMonitor from ObjectMonitor, breaking the connection
>>>>> between the two. We also copy and simplify ObjectWaiter, turning
>>>>> it into a QNode internal class. There is then a lot of cleanup
>>>>> that was applied (and a lot more that could still be done):
>>>>>
>>>>> - Removed the never implemented/used PROPER_TRANSITIONS ifdefs
>>>>> - Fixed the disconnect between the types of non-JavaThreads
>>>>> expected by the upper layer code and lower layer code
>>>>> - cleaned up and simplified return codes
>>>>> - consolidated code that is identical for JavaThreads and
>>>>> non-JavaThreads (e.g. notify/notifyAll).
>>>>> - removed used of TRAPS/THREAD where not appropriate and replaced
>>>>> with "Thread * Self" in the style of the rest of the code
>>>>> - changed recursions to be int rather than intptr_t (a "fixme" in
>>>>> the ObjectMonitor code)
>>>>>
>>>>>
>>>>> I have not changed the many style flaws with this code:
>>>>> - Capitalized names
>>>>> - extra spaces before ;
>>>>> - ...
>>>>>
>>>>> but could do so if needed. I wanted to try and keep it more
>>>>> obvious that the fundamental functional code is actually unmodified.
>>>>>
>>>>> There is one aspect that requires further explanation: the notion
>>>>> of current pending monitor. The "current pending monitor" is
>>>>> stored in the Thread and used by a number of introspection APIs
>>>>> for things like finding monitors, doing deadlock detection, etc.
>>>>> The JvmtiRawMonitor code would also set/clear itself as "current
>>>>> pending monitor". Most uses of the current pending monitor
>>>>> actually, explicitly or implicitly, ignore the case when the
>>>>> monitor is a JvmtiRawMonitor (observed by the fact the
>>>>> mon->object() query returns NULL). The exception to that is
>>>>> deadlock detection where raw monitors are at least partially
>>>>> accounted for. To preserve that I added the notion of "current
>>>>> pending raw monitor" and updated the deadlock detection code to
>>>>> use that.
>>>>>
>>>>> The test:
>>>>>
>>>>>
>>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
>>>>>
>>>>>
>>>>> was updated because I'd noticed previously that it was the only
>>>>> test that used interrupt with raw monitors, but was in fact
>>>>> broken: the test thread is a daemon thread so the main thread
>>>>> could terminate the VM immediately after the interrupt() call,
>>>>> thus you would never know if the interruption actually worked as
>>>>> expected.
>>>>>
>>>>> Testing:
>>>>> - tiers 1 - 3
>>>>> - vmTestbase/nsk/monitoring/ (for deadlock detection**)
>>>>> - vmTestbase/nsk/jdwp
>>>>> - vmTestbase/nsk/jdb/
>>>>> - vmTestbase/nsk/jdi/
>>>>> - vmTestbase/nsk/jvmti/
>>>>> - serviceability/jvmti/
>>>>> - serviceability/jdwp
>>>>> - JDK: java/lang/management
>>>>>
>>>>> ** There are no existing deadlock related tests involving
>>>>> JvmtiRawMonitor. It would be interesting/useful to add them to the
>>>>> existing nsk/monitoring tests that cover synchronized and JNI
>>>>> locking. But it's a non-trivial enhancement that I don't really
>>>>> have time to do.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>
>>
More information about the serviceability-dev
mailing list