RFR: 8247729: GetObjectMonitorUsage() might return inconsistent information

Thu Jun 18 09:07:51 UTC 2020

On 2020/06/18 17:36, David Holmes wrote:
> On 18/06/2020 3:47 pm, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> Both ThreadsListHandle and ResourceMarks would use `Thread::current()` for their resource. It is set as default parameter in c'tor.
>> Do you mean we should it explicitly in c'tor?
> 
> Yes pass current_thread so we don't do the additional unnecessary calls to Thread::current().

Ok, I've fixed them. Could you review again?

   http://cr.openjdk.java.net/~ysuenaga/JDK-8247729/webrev.02/

Thanks,

Yasumasa

> David
> 
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/06/18 13:58, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> On 18/06/2020 12:59 pm, Yasumasa Suenaga wrote:
>>>> Hi Serguei,
>>>>
>>>> Thanks for your comment!
>>>> I uploaded new webrev:
>>>>
>>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8247729/webrev.01/
>>>>
>>>> I'm not sure the following change is correct.
>>>> Can we assume owning_thread is not NULL at safepoint?
>>>
>>> We can if "owner != NULL". So that change seem fine to me.
>>>
>>> But given this is now only executed at a safepoint there are additional simplifications that can be made:
>>>
>>> - current thread determination can be simplified:
>>>
>>> 945   Thread* current_thread = Thread::current();
>>>
>>> becomes:
>>>
>>>     Thread* current_thread = VMThread::vm_thread();
>>>     assert(current_thread == Thread::current(), "must be");
>>>
>>> - these comments can be removed
>>>
>>>   994       // Use current thread since function can be called from a
>>>   995       // JavaThread or the VMThread.
>>> 1053       // Use current thread since function can be called from a
>>> 1054       // JavaThread or the VMThread.
>>>
>>> - these TLH constructions should be passing current_thread (existing bug)
>>>
>>> 996       ThreadsListHandle tlh;
>>> 1055       ThreadsListHandle tlh;
>>>
>>> - All ResourceMarks should be passing current_thread (existing bug)
>>>
>>>
>>> Aside: there is a major inconsistency between the spec and implementation for this method. I've traced the history to see how this came about from JVMDI (ref JDK-4546581) but it never resulted in the JVM TI specification clearly stating what the waiters/waiter_count means. I will file a bug to have the spec clarified to match the implementation (even though I think the implementation is what is wrong). :(
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> All tests on submit repo and serviceability/jvmti and vmTestbase/nsk/jvmti have been passed with this change.
>>>>
>>>>
>>>> ```
>>>>         // This monitor is owned so we have to find the owning JavaThread.
>>>>         owning_thread = Threads::owning_thread_from_monitor_owner(tlh.list(), owner);
>>>> -      // Cannot assume (owning_thread != NULL) here because this function
>>>> -      // may not have been called at a safepoint and the owning_thread
>>>> -      // might not be suspended.
>>>> -      if (owning_thread != NULL) {
>>>> -        // The monitor's owner either has to be the current thread, at safepoint
>>>> -        // or it has to be suspended. Any of these conditions will prevent both
>>>> -        // contending and waiting threads from modifying the state of
>>>> -        // the monitor.
>>>> -        if (!at_safepoint && !owning_thread->is_thread_fully_suspended(true, &debug_bits)) {
>>>> -          // Don't worry! This return of JVMTI_ERROR_THREAD_NOT_SUSPENDED
>>>> -          // will not make it back to the JVM/TI agent. The error code will
>>>> -          // get intercepted in JvmtiEnv::GetObjectMonitorUsage() which
>>>> -          // will retry the call via a VM_GetObjectMonitorUsage VM op.
>>>> -          return JVMTI_ERROR_THREAD_NOT_SUSPENDED;
>>>> -        }
>>>> -        HandleMark hm;
>>>> +      assert(owning_thread != NULL, "owning JavaThread must not be NULL");
>>>>           Handle     th(current_thread, owning_thread->threadObj());
>>>>           ret.owner = (jthread)jni_reference(calling_thread, th);
>>>>
>>>> ```
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/06/18 0:42, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> This fix is not enough.
>>>>> The function JvmtiEnvBase::get_object_monitor_usage works in two modes: in VMop and non-VMop.
>>>>> The non-VMop mode has to be removed.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 6/17/20 02:18, Yasumasa Suenaga wrote:
>>>>>> (Change subject for RFR)
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I filed it to JBS and upload a webrev for it.
>>>>>> Could you review it?
>>>>>>
>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8247729
>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8247729/webrev.00/
>>>>>>
>>>>>> This change has passed tests on submit repo.
>>>>>> Also I tested it with serviceability/jvmti and vmTestbase/nsk/jvmti on Linux x64.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/06/17 14:37, serguei.spitsyn at oracle.com wrote:
>>>>>>> Yes. It seems we have a consensus.
>>>>>>> Thank you for taking care about it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 6/16/20 18:34, David Holmes wrote:
>>>>>>>>> Ok, may I file it to JBS and fix it? 
>>>>>>>>
>>>>>>>> Go for it! :)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 17/06/2020 10:23 am, Yasumasa Suenaga wrote:
>>>>>>>>> On 2020/06/17 8:47, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Dan, David and Yasumasa,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/16/20 07:39, Daniel D. Daugherty wrote:
>>>>>>>>>>> On 6/15/20 9:28 PM, David Holmes wrote:
>>>>>>>>>>>> On 16/06/2020 10:57 am, Daniel D. Daugherty wrote:
>>>>>>>>>>>>> On 6/15/20 7:19 PM, David Holmes wrote:
>>>>>>>>>>>>>> On 16/06/2020 8:40 am, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>>> On 6/15/20 6:14 PM, David Holmes wrote:
>>>>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 15/06/2020 11:38 pm, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>>>>> On 6/15/20 3:26 AM, David Holmes wrote:
>>>>>>>>>>>>>>>>>> On 15/06/2020 4:02 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2020/06/15 14:15, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 15/06/2020 2:49 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I wonder why JvmtiEnvBase::get_object_monitor_usage() (implementation of GetObjectMonitorUsage()) does not perform at safepoint.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> GetObjectMonitorUsage will use a safepoint if the target is not suspended:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> jvmtiError
>>>>>>>>>>>>>>>>>>>> JvmtiEnv::GetObjectMonitorUsage(jobject object, jvmtiMonitorUsage* info_ptr) {
>>>>>>>>>>>>>>>>>>>>    JavaThread* calling_thread = JavaThread::current();
>>>>>>>>>>>>>>>>>>>>    jvmtiError err = get_object_monitor_usage(calling_thread, object, info_ptr);
>>>>>>>>>>>>>>>>>>>>    if (err == JVMTI_ERROR_THREAD_NOT_SUSPENDED) {
>>>>>>>>>>>>>>>>>>>>      // Some of the critical threads were not suspended. go to a safepoint and try again
>>>>>>>>>>>>>>>>>>>>      VM_GetObjectMonitorUsage op(this, calling_thread, object, info_ptr);
>>>>>>>>>>>>>>>>>>>>      VMThread::execute(&op);
>>>>>>>>>>>>>>>>>>>>      err = op.result();
>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>>    return err;
>>>>>>>>>>>>>>>>>>>> } /* end GetObject */
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I saw this code, so I guess there are some cases when JVMTI_ERROR_THREAD_NOT_SUSPENDED is not returned from get_object_monitor_usage().
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Monitor owner would be acquired from monitor object at first [1], but it would perform concurrently.
>>>>>>>>>>>>>>>>>>>>> If owner thread is not suspended, the owner might be changed to others in subsequent code.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For example, the owner might release the monitor before [2].
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The expectation is that when we find an owner thread it is either suspended or not. If it is suspended then it cannot release the monitor. If it is not suspended we detect that and redo the whole query at a safepoint.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I think the owner thread might resume unfortunately after suspending check.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes you are right. I was thinking resuming also required a safepoint but it only requires the Threads_lock. So yes the code is wrong.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Which code is wrong?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, a rogue resume can happen when the GetObjectMonitorUsage() caller
>>>>>>>>>>>>>>>>> has started the process of gathering the information while not at a
>>>>>>>>>>>>>>>>> safepoint. Thus the information returned by GetObjectMonitorUsage()
>>>>>>>>>>>>>>>>> might be stale, but that's a bug in the agent code.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The code tries to make sure that it either collects data about a monitor owned by a thread that is suspended, or else it collects that data at a safepoint. But the owning thread can be resumed just after the code determined it was suspended. The monitor can then be released and the information gathered not only stale but potentially completely wrong as it could now be owned by a different thread and will report that thread's entry count.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If the agent is not using SuspendThread(), then as soon as
>>>>>>>>>>>>>>> GetObjectMonitorUsage() returns to the caller the information
>>>>>>>>>>>>>>> can be stale. In fact as soon as the implementation returns
>>>>>>>>>>>>>>> from the safepoint that gathered the info, the target thread
>>>>>>>>>>>>>>> could have moved on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That isn't the issue. That the info is stale is fine. But the expectation is that the information was actually an accurate snapshot of the state of the monitor at some point in time. The current code does not ensure that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please explain. I clearly don't understand why you think the info
>>>>>>>>>>>>> returned isn't "an accurate snapshot of the state of the monitor
>>>>>>>>>>>>> at some point in time".
>>>>>>>>>>>>
>>>>>>>>>>>> Because it may not be a "snapshot" at all. There is no atomicity**. The reported owner thread may not own it any longer when the entry count is read, so straight away you may have the wrong entry count information. The set of threads trying to acquire the monitor, or wait on the monitor can change in unexpected ways. It would be possible for instance to report the same thread as being the owner, being blocked trying to enter the monitor, and being in the wait-set of the monitor - apparently all at the same time!
>>>>>>>>>>>>
>>>>>>>>>>>> ** even if the owner is suspended we don't have complete atomicity because threads can join the set of threads trying to enter the monitor (unless they are all suspended).
>>>>>>>>>>>
>>>>>>>>>>> Consider the case when the monitor's owner is _not_ suspended:
>>>>>>>>>>>
>>>>>>>>>>>   - GetObjectMonitorUsage() uses a safepoint to gather the info about
>>>>>>>>>>>     the object's monitor. Since we're at a safepoint, the info that
>>>>>>>>>>>     we are gathering cannot change until we return from the safepoint.
>>>>>>>>>>>     It is a snapshot and a valid one at that.
>>>>>>>>>>>
>>>>>>>>>>> Consider the case when the monitor's owner is suspended:
>>>>>>>>>>>
>>>>>>>>>>>   - GetObjectMonitorUsage() will gather info about the object's
>>>>>>>>>>>     monitor while _not_ at a safepoint. Assuming that no other
>>>>>>>>>>>     thread is suspended, then entry_count can change because
>>>>>>>>>>>     another thread can block on entry while we are gathering
>>>>>>>>>>>     info. waiter_count and waiters can change if a thread was
>>>>>>>>>>>     in a timed wait that has timed out and now that thread is
>>>>>>>>>>>     blocked on re-entry. I don't think that notify_waiter_count
>>>>>>>>>>>     and notify_waiters can change.
>>>>>>>>>>>
>>>>>>>>>>>     So in this case, the owner info and notify info is stable,
>>>>>>>>>>>     but the entry_count and waiter info is not stable.
>>>>>>>>>>>
>>>>>>>>>>> Consider the case when the monitor is not owned:
>>>>>>>>>>>
>>>>>>>>>>>   - GetObjectMonitorUsage() will start to gather info about the
>>>>>>>>>>>     object's monitor while _not_ at a safepoint. If it finds a
>>>>>>>>>>>     thread on the entry queue that is not suspended, then it will
>>>>>>>>>>>     bail out and redo the info gather at a safepoint. I just
>>>>>>>>>>>     noticed that it doesn't check for suspension for the threads
>>>>>>>>>>>     on the waiters list so a timed Object.wait() call can cause
>>>>>>>>>>>     some confusion here.
>>>>>>>>>>>
>>>>>>>>>>>     So in this case, the owner info is not stable if a thread
>>>>>>>>>>>     comes out of a timed wait and reenters the monitor. This
>>>>>>>>>>>     case is no different than if a "barger" thread comes in
>>>>>>>>>>>     after the NULL owner field is observed and enters the
>>>>>>>>>>>     monitor. We'll return that there is no owner, a list of
>>>>>>>>>>>     suspended pending entry thread and a list of waiting
>>>>>>>>>>>     threads. The reality is that the object's monitor is
>>>>>>>>>>>     owned by the "barger" that completely bypassed the entry
>>>>>>>>>>>     queue by virtue of seeing the NULL owner field at exactly
>>>>>>>>>>>     the right time.
>>>>>>>>>>>
>>>>>>>>>>> So the owner field is only stable when we have an owner. If
>>>>>>>>>>> that owner is not suspended, then the other fields are also
>>>>>>>>>>> stable because we gathered the info at a safepoint. If the
>>>>>>>>>>> owner is suspended, then the owner and notify info is stable,
>>>>>>>>>>> but the entry_count and waiter info is not stable.
>>>>>>>>>>>
>>>>>>>>>>> If we have a NULL owner field, then the info is only stable
>>>>>>>>>>> if you have a non-suspended thread on the entry list. Ouch!
>>>>>>>>>>> That's deterministic, but not without some work.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Okay so only when we gather the info at a safepoint is all
>>>>>>>>>>> of it a valid and stable snapshot. Unfortunately, we only
>>>>>>>>>>> do that at a safepoint when the owner thread is not suspended
>>>>>>>>>>> or if owner == NULL and one of the entry threads is not
>>>>>>>>>>> suspended. If either of those conditions is not true, then
>>>>>>>>>>> the different pieces of info is unstable to varying degrees.
>>>>>>>>>>>
>>>>>>>>>>> As for this claim:
>>>>>>>>>>>
>>>>>>>>>>>> It would be possible for instance to report the same thread
>>>>>>>>>>>> as being the owner, being blocked trying to enter the monitor,
>>>>>>>>>>>> and being in the wait-set of the monitor - apparently all at
>>>>>>>>>>>> the same time! 
>>>>>>>>>>>
>>>>>>>>>>> I can't figure out a way to make that scenario work. If the
>>>>>>>>>>> thread is seen as the owner and is not suspended, then we
>>>>>>>>>>> gather info at a safepoint. If it is suspended, then it can't
>>>>>>>>>>> then be seen as on the entry queue or on the wait queue since
>>>>>>>>>>> it is suspended. If it is seen on the entry queue and is not
>>>>>>>>>>> suspended, then we gather info at a safepoint. If it is
>>>>>>>>>>> suspended on the entry queue, then it can't be seen on the
>>>>>>>>>>> wait queue.
>>>>>>>>>>>
>>>>>>>>>>> So the info instability of this API is bad, but it's not
>>>>>>>>>>> quite that bad. :-) (That is a small mercy.)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Handshaking is not going to make this situation any better
>>>>>>>>>>> for GetObjectMonitorUsage(). If the monitor is owned and we
>>>>>>>>>>> handshake with the owner, the stability or instability of
>>>>>>>>>>> the other fields remains the same as when SuspendThread is
>>>>>>>>>>> used. Handshaking with all threads won't make the data as
>>>>>>>>>>> stable as when at a safepoint because individual threads
>>>>>>>>>>> can resume execution after doing their handshake so there
>>>>>>>>>>> will still be field instability.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Short version: GetObjectMonitorUsage() should only gather
>>>>>>>>>>> data at a safepoint. Yes, I've changed my mind.
>>>>>>>>>>
>>>>>>>>>> I agree with this.
>>>>>>>>>> The advantages are:
>>>>>>>>>>   - the result is stable
>>>>>>>>>>   - the implementation can be simplified
>>>>>>>>>>
>>>>>>>>>> Performance impact is not very clear but should not be that
>>>>>>>>>> big as suspending all the threads has some overhead too.
>>>>>>>>>> I'm not sure if using handshakes can make performance better.
>>>>>>>>>
>>>>>>>>> Ok, may I file it to JBS and fix it?
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>> -----
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The only way to make sure you don't have stale information is
>>>>>>>>>>>>>>> to use SuspendThread(), but it's not required. Perhaps the doc
>>>>>>>>>>>>>>> should have more clear about the possibility of returning stale
>>>>>>>>>>>>>>> info. That's a question for Robert F.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> GetObjectMonitorUsage says nothing about thread's being suspended so I can't see how this could be construed as an agent bug.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In your scenario above, you mention that the target thread was
>>>>>>>>>>>>>>> suspended, GetObjectMonitorUsage() was called while the target
>>>>>>>>>>>>>>> was suspended, and then the target thread was resumed after
>>>>>>>>>>>>>>> GetObjectMonitorUsage() checked for suspension, but before
>>>>>>>>>>>>>>> GetObjectMonitorUsage() was able to gather the info.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All three of those calls: SuspendThread(), GetObjectMonitorUsage()
>>>>>>>>>>>>>>> and ResumeThread() are made by the agent and the agent should not
>>>>>>>>>>>>>>> resume the target thread while also calling GetObjectMonitorUsage().
>>>>>>>>>>>>>>> The calls were allowed to be made out of order so agent bug.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Perhaps. I was thinking more generally about an independent resume, but you're right that doesn't really make a lot of sense. But when the spec says nothing about suspension ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> And it is intentional that suspension is not required. JVM/DI and JVM/PI
>>>>>>>>>>>>> used to require suspension for these kinds of get-the-info APIs. JVM/TI
>>>>>>>>>>>>> intentionally was designed to not require suspension.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As I've said before, we could add a note about the data being potentially
>>>>>>>>>>>>> stale unless SuspendThread is used. I think of it like stat(2). You can
>>>>>>>>>>>>> fetch the file's info, but there's no guarantee that the info is current
>>>>>>>>>>>>> by the time you process what you got back. Is it too much motherhood to
>>>>>>>>>>>>> state that the data might be stale? I could go either way...
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Using a handshake on the owner thread will allow this to be fixed in the future without forcing/using any safepoints.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have to think about that which is why I'm avoiding talking about
>>>>>>>>>>>>>>> handshakes in this thread.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Effectively the handshake can "suspend" the thread whilst the monitor is queried. In effect the operation would create a per-thread safepoint.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I "know" that, but I still need time to think about it and probably
>>>>>>>>>>>>> see the code to see if there are holes...
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Semantically it is no different to the code actually suspending the owner thread, but it can't actually do that because suspends/resume don't nest.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yeah... we used have a suspend count back when we tracked internal and
>>>>>>>>>>>>> external suspends separately. That was a nightmare...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> JavaThread::is_ext_suspend_completed() is used to check thread state, it returns `true` when the thread is sleeping [3], or when it performs in native [4].
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sure but if the thread is actually suspended it can't continue execution in the VM or in Java code.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This appears to be an optimisation for the assumed common case where threads are first suspended and then the monitors are queried.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I agree with this, but I could find out it from JVMTI spec - it just says "Get information about the object's monitor."
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes it was just an implementation optimisation, nothing to do with the spec.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> GetObjectMonitorUsage() might return incorrect information in some case.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It starts with finding owner thread, but the owner might be just before wakeup.
>>>>>>>>>>>>>>>>>>> So I think it is more safe if GetObjectMonitorUsage() is called at safepoint in any case.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Except we're moving away from safepoints to using Handshakes, so this particular operation will require that the apparent owner is Handshake-safe (by entering a handshake with it) before querying the monitor. This would still be preferable I think to always using a safepoint for the entire operation.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [3] http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/runtime/thread.cpp#l671
>>>>>>>>>>>>>>>>>>> [4] http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/runtime/thread.cpp#l684
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> However there is still a potential bug as the thread reported as the owner may not be suspended at the time we first see it, and may release the monitor, but then it may get suspended before we call:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>   owning_thread = Threads::owning_thread_from_monitor_owner(tlh.list(), owner);
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> and so we think it is still the monitor owner and proceed to query the monitor information in a racy way. This can't happen when suspension itself requires a safepoint as the current thread won't go to that safepoint during this code. However, if suspension is implemented via a direct handshake with the target thread then we have a problem.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1] http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/prims/jvmtiEnvBase.cpp#l973
>>>>>>>>>>>>>>>>>>>>> [2] http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/prims/jvmtiEnvBase.cpp#l996
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>