RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Oct 4 13:43:41 UTC 2019
On 10/3/19 9:55 PM, David Holmes wrote:
> <trimming>
>
> On 4/10/2019 10:01 am, Daniel D. Daugherty wrote:
>> On 10/3/19 7:35 PM, serguei.spitsyn at oracle.com wrote:
>>>>> If I remember correctly it is a scenario where this issue also
>>>>> comes to the picture:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>>
>>>>> I do not really understand how shared ParkEvent can be
>>>>> used/consumed by both ObjectMonitor and RawMonitor on the same
>>>>> thread.
>>>>> But we observed and investigated this problem several years ago
>>>>> and Dan finally filed this enhancement.
>>>>
>>>> I still don't see how this is possible as you are not actually
>>>> enqueued on the ObjectMonitor when the call out to the event
>>>> callback is made. but that is a topic for another email thread. :)
>>
>> Correct that you cannot be enqueued on the ObjectMonitor when you
>> make the callback. However, I don't think that was the point of
>> 8033399 when we filed so very long ago...
>>
>> Quoting a comment from David:
>>
>>> David Holmes added a comment - 2014-01-27 18:34
>>> Could there be multiple places in event handling code that could in
>>> theory consume unparks and so require the re-issue of an unpark()
>>> from different locations in the code?
>>>
>>> Seems to me that perhaps raw monitors - given they can be entered
>>> from within the normal monitor code - should have their own _event
>>> object per thread, so that this accidental consumption of unparks
>>> can not occur.
>
> And since then I've decided this isn't actually a problem.
Sounds good to me.
>
>>
>> The scenario that comes to mind:
>>
>> - T1 is contending on an ObjectMonitor and has set waitingToLockMonitor.
>> - T1 calls the jvmtiEventMonitorContendedEnter event handler that
>> contends on a JvmtiRawMonitor and has set waitingToLockRawMonitor.
>> - T1 blocks on the JvmtiRawMonitor and parks.
>> - T2 is exiting the ObjectMonitor and has picked T1 as the successor
>> so it unparks T1.
>
> Nope - T1 is not yet enqueued on the ObjectMonitor so it can't be
> picked as successor.
Also sounds good.
>
>> Only T1 is parked for the JvmtiRawMonitor and
>> not for the ObjectMonitor. T2 hasn't quite finished exiting the
>> ObjectMonitor yet... (not sure if this lag is possible)
>> - T1 has unparked early for the JvmtiRawMonitor and at the same
>> time T3 is exiting the JvmtiRawMonitor.
>> - T1 quickly enters the JvmtiRawMonitor and T3 doesn't have to
>> pick a successor so it doesn't do an unpark.
>> - T1 finishes the work of the jvmtiEventMonitorContendedEnter and
>> returns to the caller which is the code that's about to block on
>> the ObjectMonitor...
>> - T1 blocks on the ObjectMonitor and parks.
>
> For this to actually be a problem requires that the call out to the
> raw monitor code happens inbetween the check for whether T1 needs to
> park and the park call. That also is not the case.
Also sounds good.
>
>> - T2 finishes exiting the ObjectMonitor... Does T1 get unparked?
>>
>> I can't remember when T2 does the unpark() relative to dropping
>> ownership of the ObjectMonitor. If the unpark() is first or if
>> the _owner field setting to NULL lingers... it's possible for T1
>> to block and park... with nothing to unpark T1...
>>
>> Pure, crazy theory here...
>>
>> However, with David's work on this bug (8231289), this theoretical
>> problem goes away... That's the only reason for trying to close
>> this 8033399 sub-thread here...
>
> My work here makes no difference to 8033399 perceived problem. The
> same ParkEvent continues to be used by both JvmtiRawMonitor and
> ObjectMonitor.
I confused myself into thinking that you either created a
new ParkEvent or didn't use the existing ParkEvent anymore.
My mistake.
Dan
>
> Cheers,
> David
>
>> Dan
>>
>>
>>>
>>> Agreed.
>>> Just wanted to point out it can be related.
>>> Dan filed this RFE and can have more knowledge.
>>>
>>>> Meanwhile what to do about broken deadlock detection ... :(
>>>
>>> It is a good catch from Dan.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> This also probably means that you can have a pending raw monitor
>>>>>> at the same time as you have a "Blocker" as I'm pretty sure there
>>>>>> are various JVM TI event handlers that may execute between the
>>>>>> Blocker being set and the actual park. So that would be an
>>>>>> additional breakage in the existing code.
>>>>>>
>>>>>> Back to my code and I have two problems. The second, which is
>>>>>> easy to address, is the deadlock printing code. I'll hoist the
>>>>>> waitingToLockRawMonitor chunk to the top so it is executed
>>>>>> independent of the waitingToLockMonitor value (which remains in
>>>>>> an if/else relationship with the waitingToLockBlocker). But now
>>>>>> that we might print two "records" at a time I have to make
>>>>>> additional changes to get meaningful output for the current
>>>>>> thread (which is handled as a common code after the if/else block
>>>>>> to finish whichever record was being printed). Also I can no
>>>>>> longer use "continue" as the 3 outcomes are not mutually
>>>>>> exclusive - so this could get a bit messy. :(
>>>>>>
>>>>>> So definitely a v2 webrev on the way.
>>>>>>
>>>>>> But before that I need to solve my first problem - and I don't
>>>>>> know how. Now that it is apparent that a thread can be blocked on
>>>>>> both a raw monitor and an ObjectMonitor at the same time, I have
>>>>>> no idea how to actually account for this in the deadlock
>>>>>> detection code. That code has a while loop that expects to at
>>>>>> most find either a locked ObjectMonitor or j.u.c Blocker, and it
>>>>>> adds the owner thread to the cycle detection, then moves on. But
>>>>>> now I can have two different owner threads in the same loop
>>>>>> iteration. I don't know how to account for that.
>>>>>>
>>>>>> Given that it seems to me that the current code is already broken
>>>>>> if we encounter these conditions, then perhaps all I can do is
>>>>>> handle the other cases, where the blocking reasons are mutually
>>>>>> exclusive, and not try to fix things? i.e. leave lines #434 to
>>>>>> #440 as they are in webrev v1 - which implies no change to line
>>>>>> #398 except the comment ... ??
>>>>>>
>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
>>>>>>>
>>>>>>> No comments.
>>>>>>>
>>>>>>>
>>>>>>> Thumbs up! The only non-nit I have is the setting of
>>>>>>> waitingToLockRawMonitor
>>>>>>> on L400 and the corresponding comment on L399. Everything else
>>>>>>> is a nit.
>>>>>>>
>>>>>>> I don't need to see a new webrev.
>>>>>>
>>>>>> If only that were true :(
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks for tackling this disentangle issue!
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> The earlier attempt to rewrite JvmtiRawMonitor as a simple
>>>>>>>> wrapper around PlatformMonitor proved not so simple and
>>>>>>>> ultimately had too many issues due to the need to support
>>>>>>>> Thread.interrupt.
>>>>>>>>
>>>>>>>> I'd previously stated in the bug report:
>>>>>>>>
>>>>>>>> "In the worst-case I suppose we could just copy ObjectMonitor
>>>>>>>> to a new class and have JvmtiRawMonitor continue to extend that
>>>>>>>> (with some additional minor adjustments) - or even just inline
>>>>>>>> it all as needed."
>>>>>>>>
>>>>>>>> but hadn't looked at it in detail. Richard Reingruber did look
>>>>>>>> at it and pointed out that it is actually quite simple - we
>>>>>>>> barely use any actual code from ObjectMonitor, mainly just the
>>>>>>>> state. So thanks Richard! :)
>>>>>>>>
>>>>>>>> So this change basically copies or moves anything needed by
>>>>>>>> JvmtiRawMonitor from ObjectMonitor, breaking the connection
>>>>>>>> between the two. We also copy and simplify ObjectWaiter,
>>>>>>>> turning it into a QNode internal class. There is then a lot of
>>>>>>>> cleanup that was applied (and a lot more that could still be
>>>>>>>> done):
>>>>>>>>
>>>>>>>> - Removed the never implemented/used PROPER_TRANSITIONS ifdefs
>>>>>>>> - Fixed the disconnect between the types of non-JavaThreads
>>>>>>>> expected by the upper layer code and lower layer code
>>>>>>>> - cleaned up and simplified return codes
>>>>>>>> - consolidated code that is identical for JavaThreads and
>>>>>>>> non-JavaThreads (e.g. notify/notifyAll).
>>>>>>>> - removed used of TRAPS/THREAD where not appropriate and
>>>>>>>> replaced with "Thread * Self" in the style of the rest of the code
>>>>>>>> - changed recursions to be int rather than intptr_t (a "fixme"
>>>>>>>> in the ObjectMonitor code)
>>>>>>>>
>>>>>>>>
>>>>>>>> I have not changed the many style flaws with this code:
>>>>>>>> - Capitalized names
>>>>>>>> - extra spaces before ;
>>>>>>>> - ...
>>>>>>>>
>>>>>>>> but could do so if needed. I wanted to try and keep it more
>>>>>>>> obvious that the fundamental functional code is actually
>>>>>>>> unmodified.
>>>>>>>>
>>>>>>>> There is one aspect that requires further explanation: the
>>>>>>>> notion of current pending monitor. The "current pending
>>>>>>>> monitor" is stored in the Thread and used by a number of
>>>>>>>> introspection APIs for things like finding monitors, doing
>>>>>>>> deadlock detection, etc. The JvmtiRawMonitor code would also
>>>>>>>> set/clear itself as "current pending monitor". Most uses of the
>>>>>>>> current pending monitor actually, explicitly or implicitly,
>>>>>>>> ignore the case when the monitor is a JvmtiRawMonitor (observed
>>>>>>>> by the fact the mon->object() query returns NULL). The
>>>>>>>> exception to that is deadlock detection where raw monitors are
>>>>>>>> at least partially accounted for. To preserve that I added the
>>>>>>>> notion of "current pending raw monitor" and updated the
>>>>>>>> deadlock detection code to use that.
>>>>>>>>
>>>>>>>> The test:
>>>>>>>>
>>>>>>>>
>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
>>>>>>>>
>>>>>>>>
>>>>>>>> was updated because I'd noticed previously that it was the only
>>>>>>>> test that used interrupt with raw monitors, but was in fact
>>>>>>>> broken: the test thread is a daemon thread so the main thread
>>>>>>>> could terminate the VM immediately after the interrupt() call,
>>>>>>>> thus you would never know if the interruption actually worked
>>>>>>>> as expected.
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>> - tiers 1 - 3
>>>>>>>> - vmTestbase/nsk/monitoring/ (for deadlock detection**)
>>>>>>>> - vmTestbase/nsk/jdwp
>>>>>>>> - vmTestbase/nsk/jdb/
>>>>>>>> - vmTestbase/nsk/jdi/
>>>>>>>> - vmTestbase/nsk/jvmti/
>>>>>>>> - serviceability/jvmti/
>>>>>>>> - serviceability/jdwp
>>>>>>>> - JDK: java/lang/management
>>>>>>>>
>>>>>>>> ** There are no existing deadlock related tests involving
>>>>>>>> JvmtiRawMonitor. It would be interesting/useful to add them to
>>>>>>>> the existing nsk/monitoring tests that cover synchronized and
>>>>>>>> JNI locking. But it's a non-trivial enhancement that I don't
>>>>>>>> really have time to do.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>
>>>>>
>>>
>>
More information about the serviceability-dev
mailing list