RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor

Tue Sep 10 08:52:00 UTC 2019

On 10/09/2019 5:54 pm, David Holmes wrote:
> It turns out that polling for interrupts is actually very difficult to 
> do correctly. There is an inherent race with timing out and being 
> signalled that can result in lost signals. Trying to account for that 
> without introducing other problems would lead to a very complex 
> synchronization mechanism (which would need to track waiters, 
> notifications and a "generation" count). The end result is that we'd 
> break the tie to ObjectMonitor code, but would replace it with new 
> complex untried code. :(

Or ... rather than the initial "unresponsive to interrupts" approach I 
could take it the other way and have every raw monitor wait limited to, 
say 500ms, at which point it will return as a "spurious wakeup" and 
check the interrupt state on the way out ...

David

> David
> 
> On 10/09/2019 8:49 am, David Holmes wrote:
>> Hi Serguei,
>>
>> On 10/09/2019 4:26 am, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>> On 9/8/19 19:15, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> On 7/09/2019 6:50 am, Daniel D. Daugherty wrote:
>>>>> Hi David,
>>>>>
>>>>> I've finally gotten back to this email thread...
>>>>
>>>> Thanks.
>>>>
>>>>>> FYI testing to date:
>>>>>>  - tiers 1 -3 all platforms
>>>>>>  - hotspot: serviceability/jvmti
>>>>>>                           /jdwp
>>>>>>             vmTestbase/nsk/jvmti
>>>>>>                           /jdwp
>>>>>>  - JDK: com/sun/jdi 
>>>>>
>>>>> You should also add:
>>>>>
>>>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdb
>>>>> open/test/hotspot/jtreg/vmTestbase/nsk/jdi
>>>>> open/test/jdk/java/lang/instrument
>>>>
>>>> Okay - in progress. Though I can't see any use of RawMonitors in any 
>>>> of these tests.
>>>>
>>>>> I took a quick look through the preliminary webrev and I don't see
>>>>> anything that worries me.
>>>>
>>>> Thanks. I'll prepare a more polished webrev soon.
>>>>
>>>>> Re: Thread.interrupt() and raw_wait()
>>>>>
>>>>> It would be good to see if that semantic is being tested via the
>>>>> JCK test suite for JVM/TI.
>>>>
>>>> It isn't. The only thing directly tested for RawMonitorWait is 
>>>> normal successful operation and reporting "not owner" when not the 
>>>> owner. No check for JVMTI_ERROR_INTERRUPT exists other than as input 
>>>> for the GetErrorName function.
>>>
>>> This is most likely true.
>>> My only concern is if RawMonitor's can be used in the JCK test 
>>> libraries (low probability).
>>> I've asked Leonid Kuskov (JCK) to double check this (added to the 
>>> mailing list).
>>>
>>>>
>>>> There's only one test in the whole test base that checks for the 
>>>> interrupt and that is 
>>>> vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/. In that test if 
>>>> we are not interrupted before the RawMonitorWait we will wait until 
>>>> the full timeout elapses - which is 2 minutes by default - then 
>>>> return and report the interrupt. Hence the test still passes. (If it 
>>>> was an untimed wait that would be different of course).
>>>
>>> I figured the same last Friday.
>>> One more place to care about are NSK tests libraries that are located 
>>> here:
>>>    test/hotspot/jtreg/vmTestbase/nsk/share
>>>
>>> There are a couple of places where the RawMonitor is used:
>>>    jvmti/hotswap/HotSwap.cpp:    if 
>>> (!NSK_JVMTI_VERIFY(jvmti->RawMonitorWait(waitLock, millis)))
>>>    jvmti/jvmti_tools.cpp:    jvmtiError error = 
>>> env->RawMonitorWait(monitor, millis);
>>>
>>> The use in HotSwap.cpp is local.
>>> The jvmti_tools.cpp defines this:
>>>    void rawMonitorWait(jvmtiEnv *env, jrawMonitorID monitor, jlong 
>>> millis) {
>>>        jvmtiError error = env->RawMonitorWait(monitor, millis);
>>>        exitOnError(error);
>>>    }
>>>
>>> which is used in the jvmti/agent_tools.cpp but does not depend on 
>>> interrupting of RawMonitor's (as I can see).
>>> One more place to mention is:
>>>    jvmti/DataDumpRequest/datadumpreq001/datadumpreq001.cpp
>>>
>>> But I see no problems there as well.
>>>
>>>
>>>
>>> The JDWP implementation is using RawMonitor's.
>>> Please, see functions debugMonitorWait()/debugMonitorTimedWait() in 
>>> src/jdk.jdwp.agent/share/native/libjdwp/util.c.
>>>
>>> It expects the JVMTI_ERROR_INTERRUPT but never makes a call to the 
>>> JVMTI ThreadInterrupt().
>>> So, it looks like it does not depend on interrupting of RawMonitor's 
>>> in any way.
>>>
>>>>
>>>> The more I try to convince people this change should be okay, the 
>>>> more uncomfortable I get with my own arguments. :) I think I'm going 
>>>> to implement the polling approach for checking interrupts - say 500ms.
>>>
>>> The JVMTI spec tells that the JVMTI_ERROR_INTERRUPT can be returned 
>>> from the RawMonitorWait:
>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait 
>>>
>>
>> Yes it does and that is the only thing that implies a connection to 
>> Thread.interrupt.
>>
>>> which means that RawMonitorWait can be interrupted with the 
>>> Thread.Interrupt()
>>> or JVMTI InterruptThread():
>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#InterruptThread 
>>>
>>
>> That's one way to interpret the fact that RawMonitorWait can return 
>> JVMTI_ERROR_INTERRUPT, but the actual interaction between 
>> Thread.interrupt and RawMonitorWait is not explicitly stated. Arguably 
>> you can just check for interruption before and after the wait, to see 
>> whether to return JVMTI_ERROR_INTERRUPT, without necessarily being 
>> able to break out of the wait itself. That's been the whole premise of 
>> this change proposal - that responsiveness to interrupts is more a 
>> quality-of-implementation issue.
>>
>> But in any case I've decided to try the polling approach so that we 
>> won't wait forever if interrupted but not notified.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>> I also very much like/appreciate the decoupling of JvmtiRawMonitors
>>>>> from ObjectMonitors... Thanks for tackling this crazy task.
>>>>
>>>> Thanks :)
>>>>
>>>> David
>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>
>>>>> On 8/15/19 2:22 AM, David Holmes wrote:
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>>>>>
>>>>>> Preliminary webrev (still has rough edges): 
>>>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>>>>>
>>>>>> Background:
>>>>>>
>>>>>> We've had this comment for a long time:
>>>>>>
>>>>>>  // The raw monitor subsystem is entirely distinct from normal
>>>>>>  // java-synchronization or jni-synchronization.  raw monitors are 
>>>>>> not
>>>>>>  // associated with objects.  They can be implemented in any manner
>>>>>>  // that makes sense.  The original implementors decided to 
>>>>>> piggy-back
>>>>>>  // the raw-monitor implementation on the existing Java 
>>>>>> objectMonitor mechanism.
>>>>>>  // This flaw needs to fixed.  We should reimplement raw monitors 
>>>>>> as sui-generis.
>>>>>>  // Specifically, we should not implement raw monitors via java 
>>>>>> monitors.
>>>>>>  // Time permitting, we should disentangle and deconvolve the two 
>>>>>> implementations
>>>>>>  // and move the resulting raw monitor implementation over to the 
>>>>>> JVMTI directories.
>>>>>>  // Ideally, the raw monitor implementation would be built on top of
>>>>>>  // park-unpark and nothing else.
>>>>>>
>>>>>> This is an attempt to do that disentangling so that we can then 
>>>>>> consider changes to ObjectMonitor without having to worry about 
>>>>>> JvmtiRawMonitors. But rather than building on low-level 
>>>>>> park/unpark (which would require the same manual queue management 
>>>>>> and much of the same complex code as exists in ObjectMonitor) I 
>>>>>> decided to try and do this on top of PlatformMonitor.
>>>>>>
>>>>>> The reason this is just a RFC rather than RFR is that I overlooked 
>>>>>> a non-trivial aspect of JvmtiRawMonitors: like Java monitors (as 
>>>>>> implemented by ObjectMonitor) they interact with the 
>>>>>> Thread.interrupt mechanism. This is not clearly stated in the JVM 
>>>>>> TI specification [1] but only in passing by the possible errors 
>>>>>> for RawMonitorWait:
>>>>>>
>>>>>> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again
>>>>>>
>>>>>> As I explain in the bug report there is no way to build in proper 
>>>>>> interrupt support using PlatformMonitor as there is no way we can 
>>>>>> "interrupt" the low-level pthread_cond_wait. But we can 
>>>>>> approximate it. What I've done in this preliminary version is just 
>>>>>> check interrupt state before and after the actual "wait" but we 
>>>>>> won't get woken by the interrupt once we have actually blocked. 
>>>>>> Alternatively we could use a periodic polling approach and wakeup 
>>>>>> every Nms to check for interruption.
>>>>>>
>>>>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is 
>>>>>> not affected by this choice as that code ignores the interrupt 
>>>>>> until the real action it was waiting for has occurred. The 
>>>>>> interrupt is then reposted later.
>>>>>>
>>>>>> But more generally there could be users of JvmtiRawMonitors that 
>>>>>> expect/require that RawMonitorWait is responsive to 
>>>>>> Thread.interrupt in a manner similar to Object.wait. And if any of 
>>>>>> them are reading this then I'd like to know - hence this RFC :)
>>>>>>
>>>>>> FYI testing to date:
>>>>>>  - tiers 1 -3 all platforms
>>>>>>  - hotspot: serviceability/jvmti
>>>>>>                           /jdwp
>>>>>>             vmTestbase/nsk/jvmti
>>>>>>                           /jdwp
>>>>>>  - JDK: com/sun/jdi
>>>>>>
>>>>>> Comments/opinions appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> [1] 
>>>>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait 
>>>>>>
>>>>>
>>>