RFR: 8247729: GetObjectMonitorUsage() might return inconsistent information

David Holmes david.holmes at oracle.com
Thu Jun 18 04:58:19 UTC 2020


Hi Yasumasa,

On 18/06/2020 12:59 pm, Yasumasa Suenaga wrote:
> Hi Serguei,
> 
> Thanks for your comment!
> I uploaded new webrev:
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8247729/webrev.01/
> 
> I'm not sure the following change is correct.
> Can we assume owning_thread is not NULL at safepoint?

We can if "owner != NULL". So that change seem fine to me.

But given this is now only executed at a safepoint there are additional 
simplifications that can be made:

- current thread determination can be simplified:

945   Thread* current_thread = Thread::current();

becomes:

    Thread* current_thread = VMThread::vm_thread();
    assert(current_thread == Thread::current(), "must be");

- these comments can be removed

  994       // Use current thread since function can be called from a
  995       // JavaThread or the VMThread.
1053       // Use current thread since function can be called from a
1054       // JavaThread or the VMThread.

- these TLH constructions should be passing current_thread (existing bug)

996       ThreadsListHandle tlh;
1055       ThreadsListHandle tlh;

- All ResourceMarks should be passing current_thread (existing bug)


Aside: there is a major inconsistency between the spec and 
implementation for this method. I've traced the history to see how this 
came about from JVMDI (ref JDK-4546581) but it never resulted in the JVM 
TI specification clearly stating what the waiters/waiter_count means. I 
will file a bug to have the spec clarified to match the implementation 
(even though I think the implementation is what is wrong). :(

Thanks,
David
-----

> All tests on submit repo and serviceability/jvmti and 
> vmTestbase/nsk/jvmti have been passed with this change.
> 
> 
> ```
>         // This monitor is owned so we have to find the owning JavaThread.
>         owning_thread = 
> Threads::owning_thread_from_monitor_owner(tlh.list(), owner);
> -      // Cannot assume (owning_thread != NULL) here because this function
> -      // may not have been called at a safepoint and the owning_thread
> -      // might not be suspended.
> -      if (owning_thread != NULL) {
> -        // The monitor's owner either has to be the current thread, at 
> safepoint
> -        // or it has to be suspended. Any of these conditions will 
> prevent both
> -        // contending and waiting threads from modifying the state of
> -        // the monitor.
> -        if (!at_safepoint && 
> !owning_thread->is_thread_fully_suspended(true, &debug_bits)) {
> -          // Don't worry! This return of JVMTI_ERROR_THREAD_NOT_SUSPENDED
> -          // will not make it back to the JVM/TI agent. The error code 
> will
> -          // get intercepted in JvmtiEnv::GetObjectMonitorUsage() which
> -          // will retry the call via a VM_GetObjectMonitorUsage VM op.
> -          return JVMTI_ERROR_THREAD_NOT_SUSPENDED;
> -        }
> -        HandleMark hm;
> +      assert(owning_thread != NULL, "owning JavaThread must not be NULL");
>           Handle     th(current_thread, owning_thread->threadObj());
>           ret.owner = (jthread)jni_reference(calling_thread, th);
> 
> ```
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2020/06/18 0:42, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> This fix is not enough.
>> The function JvmtiEnvBase::get_object_monitor_usage works in two 
>> modes: in VMop and non-VMop.
>> The non-VMop mode has to be removed.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 6/17/20 02:18, Yasumasa Suenaga wrote:
>>> (Change subject for RFR)
>>>
>>> Hi,
>>>
>>> I filed it to JBS and upload a webrev for it.
>>> Could you review it?
>>>
>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8247729
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8247729/webrev.00/
>>>
>>> This change has passed tests on submit repo.
>>> Also I tested it with serviceability/jvmti and vmTestbase/nsk/jvmti 
>>> on Linux x64.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/06/17 14:37, serguei.spitsyn at oracle.com wrote:
>>>> Yes. It seems we have a consensus.
>>>> Thank you for taking care about it.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 6/16/20 18:34, David Holmes wrote:
>>>>>> Ok, may I file it to JBS and fix it? 
>>>>>
>>>>> Go for it! :)
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>> On 17/06/2020 10:23 am, Yasumasa Suenaga wrote:
>>>>>> On 2020/06/17 8:47, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Dan, David and Yasumasa,
>>>>>>>
>>>>>>>
>>>>>>> On 6/16/20 07:39, Daniel D. Daugherty wrote:
>>>>>>>> On 6/15/20 9:28 PM, David Holmes wrote:
>>>>>>>>> On 16/06/2020 10:57 am, Daniel D. Daugherty wrote:
>>>>>>>>>> On 6/15/20 7:19 PM, David Holmes wrote:
>>>>>>>>>>> On 16/06/2020 8:40 am, Daniel D. Daugherty wrote:
>>>>>>>>>>>> On 6/15/20 6:14 PM, David Holmes wrote:
>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 15/06/2020 11:38 pm, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>> On 6/15/20 3:26 AM, David Holmes wrote:
>>>>>>>>>>>>>>> On 15/06/2020 4:02 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2020/06/15 14:15, David Holmes wrote:
>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 15/06/2020 2:49 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I wonder why JvmtiEnvBase::get_object_monitor_usage() 
>>>>>>>>>>>>>>>>>> (implementation of GetObjectMonitorUsage()) does not 
>>>>>>>>>>>>>>>>>> perform at safepoint.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> GetObjectMonitorUsage will use a safepoint if the 
>>>>>>>>>>>>>>>>> target is not suspended:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> jvmtiError
>>>>>>>>>>>>>>>>> JvmtiEnv::GetObjectMonitorUsage(jobject object, 
>>>>>>>>>>>>>>>>> jvmtiMonitorUsage* info_ptr) {
>>>>>>>>>>>>>>>>>    JavaThread* calling_thread = JavaThread::current();
>>>>>>>>>>>>>>>>>    jvmtiError err = 
>>>>>>>>>>>>>>>>> get_object_monitor_usage(calling_thread, object, 
>>>>>>>>>>>>>>>>> info_ptr);
>>>>>>>>>>>>>>>>>    if (err == JVMTI_ERROR_THREAD_NOT_SUSPENDED) {
>>>>>>>>>>>>>>>>>      // Some of the critical threads were not 
>>>>>>>>>>>>>>>>> suspended. go to a safepoint and try again
>>>>>>>>>>>>>>>>>      VM_GetObjectMonitorUsage op(this, calling_thread, 
>>>>>>>>>>>>>>>>> object, info_ptr);
>>>>>>>>>>>>>>>>>      VMThread::execute(&op);
>>>>>>>>>>>>>>>>>      err = op.result();
>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>    return err;
>>>>>>>>>>>>>>>>> } /* end GetObject */
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I saw this code, so I guess there are some cases when 
>>>>>>>>>>>>>>>> JVMTI_ERROR_THREAD_NOT_SUSPENDED is not returned from 
>>>>>>>>>>>>>>>> get_object_monitor_usage().
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Monitor owner would be acquired from monitor object at 
>>>>>>>>>>>>>>>>>> first [1], but it would perform concurrently.
>>>>>>>>>>>>>>>>>> If owner thread is not suspended, the owner might be 
>>>>>>>>>>>>>>>>>> changed to others in subsequent code.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For example, the owner might release the monitor 
>>>>>>>>>>>>>>>>>> before [2].
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The expectation is that when we find an owner thread it 
>>>>>>>>>>>>>>>>> is either suspended or not. If it is suspended then it 
>>>>>>>>>>>>>>>>> cannot release the monitor. If it is not suspended we 
>>>>>>>>>>>>>>>>> detect that and redo the whole query at a safepoint.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think the owner thread might resume unfortunately 
>>>>>>>>>>>>>>>> after suspending check.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes you are right. I was thinking resuming also required 
>>>>>>>>>>>>>>> a safepoint but it only requires the Threads_lock. So yes 
>>>>>>>>>>>>>>> the code is wrong.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Which code is wrong?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, a rogue resume can happen when the 
>>>>>>>>>>>>>> GetObjectMonitorUsage() caller
>>>>>>>>>>>>>> has started the process of gathering the information while 
>>>>>>>>>>>>>> not at a
>>>>>>>>>>>>>> safepoint. Thus the information returned by 
>>>>>>>>>>>>>> GetObjectMonitorUsage()
>>>>>>>>>>>>>> might be stale, but that's a bug in the agent code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The code tries to make sure that it either collects data 
>>>>>>>>>>>>> about a monitor owned by a thread that is suspended, or 
>>>>>>>>>>>>> else it collects that data at a safepoint. But the owning 
>>>>>>>>>>>>> thread can be resumed just after the code determined it was 
>>>>>>>>>>>>> suspended. The monitor can then be released and the 
>>>>>>>>>>>>> information gathered not only stale but potentially 
>>>>>>>>>>>>> completely wrong as it could now be owned by a different 
>>>>>>>>>>>>> thread and will report that thread's entry count.
>>>>>>>>>>>>
>>>>>>>>>>>> If the agent is not using SuspendThread(), then as soon as
>>>>>>>>>>>> GetObjectMonitorUsage() returns to the caller the information
>>>>>>>>>>>> can be stale. In fact as soon as the implementation returns
>>>>>>>>>>>> from the safepoint that gathered the info, the target thread
>>>>>>>>>>>> could have moved on.
>>>>>>>>>>>
>>>>>>>>>>> That isn't the issue. That the info is stale is fine. But the 
>>>>>>>>>>> expectation is that the information was actually an accurate 
>>>>>>>>>>> snapshot of the state of the monitor at some point in time. 
>>>>>>>>>>> The current code does not ensure that.
>>>>>>>>>>
>>>>>>>>>> Please explain. I clearly don't understand why you think the info
>>>>>>>>>> returned isn't "an accurate snapshot of the state of the monitor
>>>>>>>>>> at some point in time".
>>>>>>>>>
>>>>>>>>> Because it may not be a "snapshot" at all. There is no 
>>>>>>>>> atomicity**. The reported owner thread may not own it any 
>>>>>>>>> longer when the entry count is read, so straight away you may 
>>>>>>>>> have the wrong entry count information. The set of threads 
>>>>>>>>> trying to acquire the monitor, or wait on the monitor can 
>>>>>>>>> change in unexpected ways. It would be possible for instance to 
>>>>>>>>> report the same thread as being the owner, being blocked trying 
>>>>>>>>> to enter the monitor, and being in the wait-set of the monitor 
>>>>>>>>> - apparently all at the same time!
>>>>>>>>>
>>>>>>>>> ** even if the owner is suspended we don't have complete 
>>>>>>>>> atomicity because threads can join the set of threads trying to 
>>>>>>>>> enter the monitor (unless they are all suspended).
>>>>>>>>
>>>>>>>> Consider the case when the monitor's owner is _not_ suspended:
>>>>>>>>
>>>>>>>>   - GetObjectMonitorUsage() uses a safepoint to gather the info 
>>>>>>>> about
>>>>>>>>     the object's monitor. Since we're at a safepoint, the info that
>>>>>>>>     we are gathering cannot change until we return from the 
>>>>>>>> safepoint.
>>>>>>>>     It is a snapshot and a valid one at that.
>>>>>>>>
>>>>>>>> Consider the case when the monitor's owner is suspended:
>>>>>>>>
>>>>>>>>   - GetObjectMonitorUsage() will gather info about the object's
>>>>>>>>     monitor while _not_ at a safepoint. Assuming that no other
>>>>>>>>     thread is suspended, then entry_count can change because
>>>>>>>>     another thread can block on entry while we are gathering
>>>>>>>>     info. waiter_count and waiters can change if a thread was
>>>>>>>>     in a timed wait that has timed out and now that thread is
>>>>>>>>     blocked on re-entry. I don't think that notify_waiter_count
>>>>>>>>     and notify_waiters can change.
>>>>>>>>
>>>>>>>>     So in this case, the owner info and notify info is stable,
>>>>>>>>     but the entry_count and waiter info is not stable.
>>>>>>>>
>>>>>>>> Consider the case when the monitor is not owned:
>>>>>>>>
>>>>>>>>   - GetObjectMonitorUsage() will start to gather info about the
>>>>>>>>     object's monitor while _not_ at a safepoint. If it finds a
>>>>>>>>     thread on the entry queue that is not suspended, then it will
>>>>>>>>     bail out and redo the info gather at a safepoint. I just
>>>>>>>>     noticed that it doesn't check for suspension for the threads
>>>>>>>>     on the waiters list so a timed Object.wait() call can cause
>>>>>>>>     some confusion here.
>>>>>>>>
>>>>>>>>     So in this case, the owner info is not stable if a thread
>>>>>>>>     comes out of a timed wait and reenters the monitor. This
>>>>>>>>     case is no different than if a "barger" thread comes in
>>>>>>>>     after the NULL owner field is observed and enters the
>>>>>>>>     monitor. We'll return that there is no owner, a list of
>>>>>>>>     suspended pending entry thread and a list of waiting
>>>>>>>>     threads. The reality is that the object's monitor is
>>>>>>>>     owned by the "barger" that completely bypassed the entry
>>>>>>>>     queue by virtue of seeing the NULL owner field at exactly
>>>>>>>>     the right time.
>>>>>>>>
>>>>>>>> So the owner field is only stable when we have an owner. If
>>>>>>>> that owner is not suspended, then the other fields are also
>>>>>>>> stable because we gathered the info at a safepoint. If the
>>>>>>>> owner is suspended, then the owner and notify info is stable,
>>>>>>>> but the entry_count and waiter info is not stable.
>>>>>>>>
>>>>>>>> If we have a NULL owner field, then the info is only stable
>>>>>>>> if you have a non-suspended thread on the entry list. Ouch!
>>>>>>>> That's deterministic, but not without some work.
>>>>>>>>
>>>>>>>>
>>>>>>>> Okay so only when we gather the info at a safepoint is all
>>>>>>>> of it a valid and stable snapshot. Unfortunately, we only
>>>>>>>> do that at a safepoint when the owner thread is not suspended
>>>>>>>> or if owner == NULL and one of the entry threads is not
>>>>>>>> suspended. If either of those conditions is not true, then
>>>>>>>> the different pieces of info is unstable to varying degrees.
>>>>>>>>
>>>>>>>> As for this claim:
>>>>>>>>
>>>>>>>>> It would be possible for instance to report the same thread
>>>>>>>>> as being the owner, being blocked trying to enter the monitor,
>>>>>>>>> and being in the wait-set of the monitor - apparently all at
>>>>>>>>> the same time! 
>>>>>>>>
>>>>>>>> I can't figure out a way to make that scenario work. If the
>>>>>>>> thread is seen as the owner and is not suspended, then we
>>>>>>>> gather info at a safepoint. If it is suspended, then it can't
>>>>>>>> then be seen as on the entry queue or on the wait queue since
>>>>>>>> it is suspended. If it is seen on the entry queue and is not
>>>>>>>> suspended, then we gather info at a safepoint. If it is
>>>>>>>> suspended on the entry queue, then it can't be seen on the
>>>>>>>> wait queue.
>>>>>>>>
>>>>>>>> So the info instability of this API is bad, but it's not
>>>>>>>> quite that bad. :-) (That is a small mercy.)
>>>>>>>>
>>>>>>>>
>>>>>>>> Handshaking is not going to make this situation any better
>>>>>>>> for GetObjectMonitorUsage(). If the monitor is owned and we
>>>>>>>> handshake with the owner, the stability or instability of
>>>>>>>> the other fields remains the same as when SuspendThread is
>>>>>>>> used. Handshaking with all threads won't make the data as
>>>>>>>> stable as when at a safepoint because individual threads
>>>>>>>> can resume execution after doing their handshake so there
>>>>>>>> will still be field instability.
>>>>>>>>
>>>>>>>>
>>>>>>>> Short version: GetObjectMonitorUsage() should only gather
>>>>>>>> data at a safepoint. Yes, I've changed my mind.
>>>>>>>
>>>>>>> I agree with this.
>>>>>>> The advantages are:
>>>>>>>   - the result is stable
>>>>>>>   - the implementation can be simplified
>>>>>>>
>>>>>>> Performance impact is not very clear but should not be that
>>>>>>> big as suspending all the threads has some overhead too.
>>>>>>> I'm not sure if using handshakes can make performance better.
>>>>>>
>>>>>> Ok, may I file it to JBS and fix it?
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> The only way to make sure you don't have stale information is
>>>>>>>>>>>> to use SuspendThread(), but it's not required. Perhaps the doc
>>>>>>>>>>>> should have more clear about the possibility of returning stale
>>>>>>>>>>>> info. That's a question for Robert F.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> GetObjectMonitorUsage says nothing about thread's being 
>>>>>>>>>>>>> suspended so I can't see how this could be construed as an 
>>>>>>>>>>>>> agent bug.
>>>>>>>>>>>>
>>>>>>>>>>>> In your scenario above, you mention that the target thread was
>>>>>>>>>>>> suspended, GetObjectMonitorUsage() was called while the target
>>>>>>>>>>>> was suspended, and then the target thread was resumed after
>>>>>>>>>>>> GetObjectMonitorUsage() checked for suspension, but before
>>>>>>>>>>>> GetObjectMonitorUsage() was able to gather the info.
>>>>>>>>>>>>
>>>>>>>>>>>> All three of those calls: SuspendThread(), 
>>>>>>>>>>>> GetObjectMonitorUsage()
>>>>>>>>>>>> and ResumeThread() are made by the agent and the agent 
>>>>>>>>>>>> should not
>>>>>>>>>>>> resume the target thread while also calling 
>>>>>>>>>>>> GetObjectMonitorUsage().
>>>>>>>>>>>> The calls were allowed to be made out of order so agent bug.
>>>>>>>>>>>
>>>>>>>>>>> Perhaps. I was thinking more generally about an independent 
>>>>>>>>>>> resume, but you're right that doesn't really make a lot of 
>>>>>>>>>>> sense. But when the spec says nothing about suspension ...
>>>>>>>>>>
>>>>>>>>>> And it is intentional that suspension is not required. JVM/DI 
>>>>>>>>>> and JVM/PI
>>>>>>>>>> used to require suspension for these kinds of get-the-info 
>>>>>>>>>> APIs. JVM/TI
>>>>>>>>>> intentionally was designed to not require suspension.
>>>>>>>>>>
>>>>>>>>>> As I've said before, we could add a note about the data being 
>>>>>>>>>> potentially
>>>>>>>>>> stale unless SuspendThread is used. I think of it like 
>>>>>>>>>> stat(2). You can
>>>>>>>>>> fetch the file's info, but there's no guarantee that the info 
>>>>>>>>>> is current
>>>>>>>>>> by the time you process what you got back. Is it too much 
>>>>>>>>>> motherhood to
>>>>>>>>>> state that the data might be stale? I could go either way...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> Using a handshake on the owner thread will allow this to be 
>>>>>>>>>>>>> fixed in the future without forcing/using any safepoints.
>>>>>>>>>>>>
>>>>>>>>>>>> I have to think about that which is why I'm avoiding talking 
>>>>>>>>>>>> about
>>>>>>>>>>>> handshakes in this thread.
>>>>>>>>>>>
>>>>>>>>>>> Effectively the handshake can "suspend" the thread whilst the 
>>>>>>>>>>> monitor is queried. In effect the operation would create a 
>>>>>>>>>>> per-thread safepoint.
>>>>>>>>>>
>>>>>>>>>> I "know" that, but I still need time to think about it and 
>>>>>>>>>> probably
>>>>>>>>>> see the code to see if there are holes...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Semantically it is no different to the code actually 
>>>>>>>>>>> suspending the owner thread, but it can't actually do that 
>>>>>>>>>>> because suspends/resume don't nest.
>>>>>>>>>>
>>>>>>>>>> Yeah... we used have a suspend count back when we tracked 
>>>>>>>>>> internal and
>>>>>>>>>> external suspends separately. That was a nightmare...
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> JavaThread::is_ext_suspend_completed() is used to check 
>>>>>>>>>>>>>>>> thread state, it returns `true` when the thread is 
>>>>>>>>>>>>>>>> sleeping [3], or when it performs in native [4].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sure but if the thread is actually suspended it can't 
>>>>>>>>>>>>>>> continue execution in the VM or in Java code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This appears to be an optimisation for the assumed 
>>>>>>>>>>>>>>>>> common case where threads are first suspended and then 
>>>>>>>>>>>>>>>>> the monitors are queried.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I agree with this, but I could find out it from JVMTI 
>>>>>>>>>>>>>>>> spec - it just says "Get information about the object's 
>>>>>>>>>>>>>>>> monitor."
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes it was just an implementation optimisation, nothing 
>>>>>>>>>>>>>>> to do with the spec.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> GetObjectMonitorUsage() might return incorrect 
>>>>>>>>>>>>>>>> information in some case.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It starts with finding owner thread, but the owner might 
>>>>>>>>>>>>>>>> be just before wakeup.
>>>>>>>>>>>>>>>> So I think it is more safe if GetObjectMonitorUsage() is 
>>>>>>>>>>>>>>>> called at safepoint in any case.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Except we're moving away from safepoints to using 
>>>>>>>>>>>>>>> Handshakes, so this particular operation will require 
>>>>>>>>>>>>>>> that the apparent owner is Handshake-safe (by entering a 
>>>>>>>>>>>>>>> handshake with it) before querying the monitor. This 
>>>>>>>>>>>>>>> would still be preferable I think to always using a 
>>>>>>>>>>>>>>> safepoint for the entire operation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [3] 
>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/runtime/thread.cpp#l671 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [4] 
>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/runtime/thread.cpp#l684 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However there is still a potential bug as the thread 
>>>>>>>>>>>>>>>>> reported as the owner may not be suspended at the time 
>>>>>>>>>>>>>>>>> we first see it, and may release the monitor, but then 
>>>>>>>>>>>>>>>>> it may get suspended before we call:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   owning_thread = 
>>>>>>>>>>>>>>>>> Threads::owning_thread_from_monitor_owner(tlh.list(), 
>>>>>>>>>>>>>>>>> owner);
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> and so we think it is still the monitor owner and 
>>>>>>>>>>>>>>>>> proceed to query the monitor information in a racy way. 
>>>>>>>>>>>>>>>>> This can't happen when suspension itself requires a 
>>>>>>>>>>>>>>>>> safepoint as the current thread won't go to that 
>>>>>>>>>>>>>>>>> safepoint during this code. However, if suspension is 
>>>>>>>>>>>>>>>>> implemented via a direct handshake with the target 
>>>>>>>>>>>>>>>>> thread then we have a problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] 
>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/prims/jvmtiEnvBase.cpp#l973 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [2] 
>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/76a17c8143d8/src/hotspot/share/prims/jvmtiEnvBase.cpp#l996 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>


More information about the serviceability-dev mailing list