RFR(XS) 8036823: Stack trace sometimes shows 'locked' instead of 'waiting to lock'

Tue May 20 02:09:23 UTC 2014

Dan,

My understanding of the existing code and hence the modified code is 
that we have already established that we are dealing with an inflated 
monitor and hence the owner really must be the owning thread - not a 
stack address for a BasicLock. Isn't that what mark->has_monitor() 
establishes?

Note that we can not have reached a safepoint inflation is in progress.

Thanks,
David

On 20/05/2014 9:47 AM, Daniel D. Daugherty wrote:
> On 5/19/14 7:58 AM, Zhengyu Gu wrote:
>> This is a simple fix for incorrect lock state.
>>
>> The timing on setting thread's pending monitor can result stack trace
>> dump reporting incorrect lock state. The solution is to check the
>> monitor's owner, if the owner is other than the current thread, then
>> the monitor, is or is in process of being, set the pending monitor of
>> current thread.
>>
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8036823
>> Webrev: http://cr.openjdk.java.net/~zgu/8036823/webrev.00/
>
> src/share/vm/runtime/vframe.cpp
>      First off, I concur that the existing code has a problem.
>      However, I think the proposed solution has at least one
>      different problem.
>
>      Here's the original code:
>
>   183   // Print out all monitors that we have locked or are trying to lock
>   184   GrowableArray<MonitorInfo*>* mons = monitors();
>   185   if (!mons->is_empty()) {
>   186     bool found_first_monitor = false;
>   187     for (int index = (mons->length()-1); index >= 0; index--) {
>   188       MonitorInfo* monitor = mons->at(index);
>   189       if (monitor->eliminated() && is_compiled_frame()) { //
> Eliminated in compiled code
>   190         if (monitor->owner_is_scalar_replaced()) {
>   191           Klass* k =
> java_lang_Class::as_Klass(monitor->owner_klass());
>   192           st->print("\t- eliminated <owner is scalar replaced> (a
> %s)", k->external_name());
>   193         } else {
>   194           oop obj = monitor->owner();
>   195           if (obj != NULL) {
>   196             print_locked_object_class_name(st, obj, "eliminated");
>   197           }
>   198         }
>   199         continue;
>   200       }
>   201       if (monitor->owner() != NULL) {
>   202
>   203         // First, assume we have the monitor locked. If we haven't
> found an
>   204         // owned monitor before and this is the first frame, then
> we need to
>   205         // see if we have completed the lock or we are blocked
> trying to
>   206         // acquire it - we can only be blocked if the monitor is
> inflated
>   207
>   208         const char *lock_state = "locked"; // assume we have the
> monitor locked
>   209         if (!found_first_monitor && frame_count == 0) {
>   210           markOop mark = monitor->owner()->mark();
>   211           if (mark->has_monitor() &&
>   212               mark->monitor() ==
> thread()->current_pending_monitor()) {
>   213             lock_state = "waiting to lock";
>   214           }
>   215         }
>   216
>   217         found_first_monitor = true;
>   218         print_locked_object_class_name(st, monitor->owner(),
> lock_state);
>   219       }
>   220     }
>   221   }
>
>      The algorithm is pretty basic (no pun intended):
>
>      - for each monitor (BasicLock) in the Java frame
>        - if the monitor has been eliminated and this is
>          a compiled frame
>          - do compiler specific handling
>          - continue
>        - if the monitor has a owner
>          - set lock_state = "locked"
>          - if this is the first monitor and the top most frame
>            - set mark = associated Object's mark
>            - if the mark has an ObjectMonitor and
>                 the Java thread is pending on that monitor
>               - set lock_state = "waiting to lock"
>        - set found_first_monitor flag
>        - print info about current monitor including
>          the lock_state
>
>      A BasicLock is a Java monitor from the Java frame's point of
>      view. It might be implemented as a stack lock, a biased lock
>      or an inflated lock (ObjectMonitor).
>
>      The problem that I see with the algorithm is that it is
>      traversing a collection of BasicLocks and then asking
>      lock implementation specific questions in a an effort to
>      determine if the BasicLock is owned by the Java thread
>      whose frames we are traversing.
>
>      We already know that this code:
>
>          mark->monitor() == thread()->current_pending_monitor()
>
>      is specific to inflated monitors (ObjectMonitor) and it
>      has a flaw because the target thread can be stopped at
>      a safepoint before the _current_pending_monitor field
>      is set.
>
>      The proposed code:
>
>          mark->monitor()->owner() != thread()
>
>      compares the owner of the inflated monitor to the Java
>      thread whose frames we are traversing. The problem is
>      that the ObjectMonitor::_owner field is not always a
>      pointer to the JavaThread. Right after a stack lock
>      is inflated, the ObjectMonitor::_owner field is set
>      to the mark->locker() stack address; see:
>
>      src/share/vm/runtime/synchronizer.cpp:
>      line 1283: m->set_owner(mark->locker());
>
>      So the proposed code will report "waiting to lock" for a
>      Java monitor where the monitor is just transitioning
>      from a stack lock to an inflated lock. Of course, the
>      race is even tighter in this case. The target Java thread
>      (T1) has to have just acquired the uninflated monitor
>      and still be in the same frame in which that monitor was
>      acquired at the same time that another thread (T2) lost
>      the race to acquire the monitor. So T2 has to have just
>      inflated the monitor, T1 has still be in the same frame
>      in which it acquired the monitor and someone has to ask
>      for a thread dump. Pretty cool.
>
>      Another thought has occurred to me. T1 owns the monitor,
>      T2 is contending for the monitor and is starting to inflate
>      it, we request a thread dump. Is it possible to see the
>      BasicLock in T2's Java frame, see that the Object does not
>      yet have an inflated monitor (!mark->has_monitor()) and
>      report that T2 has the monitor "locked"?
>
>      This last scenario would again result in more than one
>      thread being reported as having the same monitor locked.
>
>      To sum up:
>
>      I think both the current code and the proposed code have
>      issues and we need to look at this a different way. We
>      probably something like:
>
>          MonitorInfo::is_locked_by_thread(JavaThread jt)
>
>      that encapsulates and isolates the logic needed to
>      determine if JavaThread 'jt' owns the BasicLock. It
>      will need to know how to check:
>
>      - stack lock ownership (see JavaThread::is_lock_owned())
>      - biased lock ownership (don't know how to do this one)
>      - inflated lock ownership (_owner field == jt)
>
>      The stack lock ownership check needs to be done before
>      the inflated lock ownership check. That's because the
>      _owner field in the inflated lock can be the
>      mark->locker() stack address when it is initially
>      inflated.
>
>      I couldn't figure out the biased locking ownership
>      check quickly so I don't know where it fits in the
>      checking ownership pecking order.
>
>
> Sorry this turned out to be not so simple... Welcome to
> the joy of locking...
>
> Dan
>
> P.S.
> Of course, now that we've discovered at least one
> misuse of current_pending_monitor(), we should probably
> audit the other uses of it also...
>
>
>
>>
>> Thanks,
>>
>> -Zhengyu
>