RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Mon Oct 7 07:58:44 UTC 2019


Hi Dan,


On 10/3/19 17:01, Daniel D. Daugherty wrote:
> On 10/3/19 7:35 PM, serguei.spitsyn at oracle.com wrote:
>>
>>
>> On 10/3/19 3:33 PM, David Holmes wrote:
>>> On 4/10/2019 3:15 am, serguei.spitsyn at oracle.com wrote:
>>>> On 10/3/19 03:13, David Holmes wrote:
>>>>> Hi Dan,
>>>>>
>>>>> On 3/10/2019 3:20 am, Daniel D. Daugherty wrote:
>>>>>> Sorry for the delay in reviewing this one... I've been playing 
>>>>>> whack-a-mole
>>>>>> with Robbin's MoCrazy test and my AsyncMonitorDeflation bits...
>>>>>
>>>>> No problem - your contribution made the wait worthwhile :)
>>>>>
>>>>>> On 9/24/19 1:09 AM, David Holmes wrote:
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231289
>>>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8231289/webrev/
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiEnv.cpp
>>>>>>      Thanks for removing the PROPER_TRANSITIONS stuff. That was old
>>>>>>      and crufty stuff.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiEnvBase.cpp
>>>>>>      No comments.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>>>>>      L39:   new (ResourceObj::C_HEAP, mtInternal) 
>>>>>> GrowableArray<JvmtiRawMonitor*>(1,true);
>>>>>>          nit - need a space between ',' and 'true'.
>>>>>>
>>>>>>          Update: leave for your follow-up bug.
>>>>>
>>>>> Fixed now so I don't forget later. :)
>>>>>
>>>>>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>>>>>      No comments.
>>>>>>
>>>>>> src/hotspot/share/runtime/objectMonitor.hpp
>>>>>>      Glad I added those 'protected for JvmtiRawMonitor' in one
>>>>>>      of my recent cleanup bugs. Obviously I'll have to merge
>>>>>>      with Async Monitor Deflation. :-)
>>>>>>
>>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>>      No comments.
>>>>>>
>>>>>> src/hotspot/share/runtime/thread.hpp
>>>>>>      No comments.
>>>>>>
>>>>>> src/hotspot/share/services/threadService.cpp
>>>>>>      L397:     waitingToLockMonitor = jt->current_pending_monitor();
>>>>>>      L398:     if (waitingToLockMonitor == NULL) {
>>>>>>      L399:       // we can only be blocked on a raw monitor if 
>>>>>> not blocked on an ObjectMonitor
>>>>>>      L400:       waitingToLockRawMonitor = 
>>>>>> jt->current_pending_raw_monitor();
>>>>>>      L401:     }
>>>>>>
>>>>>>          JVM/TI has this event handler:
>>>>>>
>>>>>>            typedef void (JNICALL *jvmtiEventMonitorContendedEnter)
>>>>>>                (jvmtiEnv *jvmti_env,
>>>>>>                 JNIEnv* jni_env,
>>>>>>                 jthread thread,
>>>>>>                 jobject object);
>>>>>>
>>>>>>          This event handler is called after 
>>>>>> set_current_pending_monitor()
>>>>>>          and if the event handler uses a RawMonitor, then it 
>>>>>> possible for
>>>>>>          for the thread to show up as blocked on both a Java 
>>>>>> monitor and
>>>>>>          a JVM/TI RawMonitor.
>>>>>
>>>>> Oh that is interesting - good catch! So that means the current 
>>>>> code is broken because the raw monitor will replace the 
>>>>> ObjectMonitor as the pending monitor and then set it back to NULL, 
>>>>> thus losing the fact the thread is actually pending on the 
>>>>> ObjectMonitor. And of course while the pending monitor is the raw 
>>>>> monitor that totally messes up the deadlock detection as the 
>>>>> ObjectMonitor is missing from consideration. :(
>>>>
>>>> If I remember correctly it is a scenario where this issue also 
>>>> comes to the picture:
>>>> https://bugs.openjdk.java.net/browse/JDK-8033399
>>>>
>>>> I do not really understand how shared ParkEvent can be 
>>>> used/consumed by both ObjectMonitor and RawMonitor on the same thread.
>>>> But we observed and investigated this problem several years ago and 
>>>> Dan finally filed this enhancement.
>>>
>>> I still don't see how this is possible as you are not actually 
>>> enqueued on the ObjectMonitor when the call out to the event 
>>> callback is made. but that is a topic for another email thread. :)
>
> Correct that you cannot be enqueued on the ObjectMonitor when you
> make the callback. However, I don't think that was the point of
> 8033399 when we filed so very long ago...
>
> Quoting a comment from David:
>
>> David Holmes added a comment - 2014-01-27 18:34
>> Could there be multiple places in event handling code that could in 
>> theory consume unparks and so require the re-issue of an unpark() 
>> from different locations in the code?
>>
>> Seems to me that perhaps raw monitors - given they can be entered 
>> from within the normal monitor code - should have their own _event 
>> object per thread, so that this accidental consumption of unparks can 
>> not occur. 
>
>
> The scenario that comes to mind:
>
> - T1 is contending on an ObjectMonitor and has set waitingToLockMonitor.
> - T1 calls the jvmtiEventMonitorContendedEnter event handler that
>   contends on a JvmtiRawMonitor and has set waitingToLockRawMonitor.
> - T1 blocks on the JvmtiRawMonitor and parks.
> - T2 is exiting the ObjectMonitor and has picked T1 as the successor
>   so it unparks T1. Only T1 is parked for the JvmtiRawMonitor and
>   not for the ObjectMonitor. T2 hasn't quite finished exiting the
>   ObjectMonitor yet... (not sure if this lag is possible)
> - T1 has unparked early for the JvmtiRawMonitor and at the same
>   time T3 is exiting the JvmtiRawMonitor.
> - T1 quickly enters the JvmtiRawMonitor and T3 doesn't have to
>   pick a successor so it doesn't do an unpark.
> - T1 finishes the work of the jvmtiEventMonitorContendedEnter and
>   returns to the caller which is the code that's about to block on
>   the ObjectMonitor...
> - T1 blocks on the ObjectMonitor and parks.
> - T2 finishes exiting the ObjectMonitor... Does T1 get unparked?
>
> I can't remember when T2 does the unpark() relative to dropping
> ownership of the ObjectMonitor. If the unpark() is first or if
> the _owner field setting to NULL lingers... it's possible for T1
> to block and park... with nothing to unpark T1...
>
> Pure, crazy theory here...

The scenario above is very interesting, thank you a lot!
I've added a comment with this to the 8033399.


> However, with David's work on this bug (8231289), this theoretical
> problem goes away... That's the only reason for trying to close
> this 8033399 sub-thread here...

It is not clear why this theoretical problem goes away with the David's 
work on the bug 8231289.
I don't want to continue this discussion here and would suggest to move 
to the 8231289 comments.


Thanks,
Serguei

>
> Dan


More information about the serviceability-dev mailing list