RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor

Tue Sep 10 07:54:54 UTC 2019

It turns out that polling for interrupts is actually very difficult to 
do correctly. There is an inherent race with timing out and being 
signalled that can result in lost signals. Trying to account for that 
without introducing other problems would lead to a very complex 
synchronization mechanism (which would need to track waiters, 
notifications and a "generation" count). The end result is that we'd 
break the tie to ObjectMonitor code, but would replace it with new 
complex untried code. :(

David

On 10/09/2019 8:49 am, David Holmes wrote:
> Hi Serguei,
> 
> On 10/09/2019 4:26 am, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> On 9/8/19 19:15, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 7/09/2019 6:50 am, Daniel D. Daugherty wrote:
>>>> Hi David,
>>>>
>>>> I've finally gotten back to this email thread...
>>>
>>> Thanks.
>>>
>>>>> FYI testing to date:
>>>>>  - tiers 1 -3 all platforms
>>>>>  - hotspot: serviceability/jvmti
>>>>>                           /jdwp
>>>>>             vmTestbase/nsk/jvmti
>>>>>                           /jdwp
>>>>>  - JDK: com/sun/jdi 
>>>>
>>>> You should also add:
>>>>
>>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdb
>>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdi
>>>> open/test/jdk/java/lang/instrument
>>>
>>> Okay - in progress. Though I can't see any use of RawMonitors in any 
>>> of these tests.
>>>
>>>> I took a quick look through the preliminary webrev and I don't see
>>>> anything that worries me.
>>>
>>> Thanks. I'll prepare a more polished webrev soon.
>>>
>>>> Re: Thread.interrupt() and raw_wait()
>>>>
>>>> It would be good to see if that semantic is being tested via the
>>>> JCK test suite for JVM/TI.
>>>
>>> It isn't. The only thing directly tested for RawMonitorWait is normal 
>>> successful operation and reporting "not owner" when not the owner. No 
>>> check for JVMTI_ERROR_INTERRUPT exists other than as input for the 
>>> GetErrorName function.
>>
>> This is most likely true.
>> My only concern is if RawMonitor's can be used in the JCK test 
>> libraries (low probability).
>> I've asked Leonid Kuskov (JCK) to double check this (added to the 
>> mailing list).
>>
>>>
>>> There's only one test in the whole test base that checks for the 
>>> interrupt and that is 
>>> vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if we 
>>> are not interrupted before the RawMonitorWait we will wait until the 
>>> full timeout elapses - which is 2 minutes by default - then return 
>>> and report the interrupt. Hence the test still passes. (If it was an 
>>> untimed wait that would be different of course).
>>
>> I figured the same last Friday.
>> One more place to care about are NSK tests libraries that are located 
>> here:
>>    test/hotspot/jtreg/vmTestbase/nsk/share
>>
>> There are a couple of places where the RawMonitor is used:
>>    jvmti/hotswap/HotSwap.cpp:    if 
>> (!NSK_JVMTI_VERIFY(jvmti->RawMonitorWait(waitLock, millis)))
>>    jvmti/jvmti_tools.cpp:    jvmtiError error = 
>> env->RawMonitorWait(monitor, millis);
>>
>> The use in HotSwap.cpp is local.
>> The jvmti_tools.cpp defines this:
>>    void rawMonitorWait(jvmtiEnv *env, jrawMonitorID monitor, jlong 
>> millis) {
>>        jvmtiError error = env->RawMonitorWait(monitor, millis);
>>        exitOnError(error);
>>    }
>>
>> which is used in the jvmti/agent_tools.cpp but does not depend on 
>> interrupting of RawMonitor's (as I can see).
>> One more place to mention is:
>>    jvmti/DataDumpRequest/datadumpreq001/datadumpreq001.cpp
>>
>> But I see no problems there as well.
>>
>>
>>
>> The JDWP implementation is using RawMonitor's.
>> Please, see functions debugMonitorWait()/debugMonitorTimedWait() in 
>> src/jdk.jdwp.agent/share/native/libjdwp/util.c.
>>
>> It expects the JVMTI_ERROR_INTERRUPT but never makes a call to the 
>> JVMTI ThreadInterrupt().
>> So, it looks like it does not depend on interrupting of RawMonitor's 
>> in any way.
>>
>>>
>>> The more I try to convince people this change should be okay, the 
>>> more uncomfortable I get with my own arguments. :) I think I'm going 
>>> to implement the polling approach for checking interrupts - say 500ms.
>>
>> The JVMTI spec tells that the JVMTI_ERROR_INTERRUPT can be returned 
>> from the RawMonitorWait:
>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait 
>>
> 
> Yes it does and that is the only thing that implies a connection to 
> Thread.interrupt.
> 
>> which means that RawMonitorWait can be interrupted with the 
>> Thread.Interrupt()
>> or JVMTI InterruptThread():
>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#InterruptThread 
>>
> 
> That's one way to interpret the fact that RawMonitorWait can return 
> JVMTI_ERROR_INTERRUPT, but the actual interaction between 
> Thread.interrupt and RawMonitorWait is not explicitly stated. Arguably 
> you can just check for interruption before and after the wait, to see 
> whether to return JVMTI_ERROR_INTERRUPT, without necessarily being able 
> to break out of the wait itself. That's been the whole premise of this 
> change proposal - that responsiveness to interrupts is more a 
> quality-of-implementation issue.
> 
> But in any case I've decided to try the polling approach so that we 
> won't wait forever if interrupted but not notified.
> 
> Thanks,
> David
> -----
> 
>>
>>
>> Thanks,
>> Serguei
>>
>>>> I also very much like/appreciate the decoupling of JvmtiRawMonitors
>>>> from ObjectMonitors... Thanks for tackling this crazy task.
>>>
>>> Thanks :)
>>>
>>> David
>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>> On 8/15/19 2:22 AM, David Holmes wrote:
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>>>>
>>>>> Preliminary webrev (still has rough edges): 
>>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>>>>
>>>>> Background:
>>>>>
>>>>> We've had this comment for a long time:
>>>>>
>>>>>  // The raw monitor subsystem is entirely distinct from normal
>>>>>  // java-synchronization or jni-synchronization.  raw monitors are not
>>>>>  // associated with objects.  They can be implemented in any manner
>>>>>  // that makes sense.  The original implementors decided to piggy-back
>>>>>  // the raw-monitor implementation on the existing Java 
>>>>> objectMonitor mechanism.
>>>>>  // This flaw needs to fixed.  We should reimplement raw monitors 
>>>>> as sui-generis.
>>>>>  // Specifically, we should not implement raw monitors via java 
>>>>> monitors.
>>>>>  // Time permitting, we should disentangle and deconvolve the two 
>>>>> implementations
>>>>>  // and move the resulting raw monitor implementation over to the 
>>>>> JVMTI directories.
>>>>>  // Ideally, the raw monitor implementation would be built on top of
>>>>>  // park-unpark and nothing else.
>>>>>
>>>>> This is an attempt to do that disentangling so that we can then 
>>>>> consider changes to ObjectMonitor without having to worry about 
>>>>> JvmtiRawMonitors. But rather than building on low-level park/unpark 
>>>>> (which would require the same manual queue management and much of 
>>>>> the same complex code as exists in ObjectMonitor) I decided to try 
>>>>> and do this on top of PlatformMonitor.
>>>>>
>>>>> The reason this is just a RFC rather than RFR is that I overlooked 
>>>>> a non-trivial aspect of JvmtiRawMonitors: like Java monitors (as 
>>>>> implemented by ObjectMonitor) they interact with the 
>>>>> Thread.interrupt mechanism. This is not clearly stated in the JVM 
>>>>> TI specification [1] but only in passing by the possible errors for 
>>>>> RawMonitorWait:
>>>>>
>>>>> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again
>>>>>
>>>>> As I explain in the bug report there is no way to build in proper 
>>>>> interrupt support using PlatformMonitor as there is no way we can 
>>>>> "interrupt" the low-level pthread_cond_wait. But we can approximate 
>>>>> it. What I've done in this preliminary version is just check 
>>>>> interrupt state before and after the actual "wait" but we won't get 
>>>>> woken by the interrupt once we have actually blocked. Alternatively 
>>>>> we could use a periodic polling approach and wakeup every Nms to 
>>>>> check for interruption.
>>>>>
>>>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not 
>>>>> affected by this choice as that code ignores the interrupt until 
>>>>> the real action it was waiting for has occurred. The interrupt is 
>>>>> then reposted later.
>>>>>
>>>>> But more generally there could be users of JvmtiRawMonitors that 
>>>>> expect/require that RawMonitorWait is responsive to 
>>>>> Thread.interrupt in a manner similar to Object.wait. And if any of 
>>>>> them are reading this then I'd like to know - hence this RFC :)
>>>>>
>>>>> FYI testing to date:
>>>>>  - tiers 1 -3 all platforms
>>>>>  - hotspot: serviceability/jvmti
>>>>>                           /jdwp
>>>>>             vmTestbase/nsk/jvmti
>>>>>                           /jdwp
>>>>>  - JDK: com/sun/jdi
>>>>>
>>>>> Comments/opinions appreciated.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> [1] 
>>>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait 
>>>>>
>>>>
>>