RFR(XS) 8036823: Stack trace sometimes shows 'locked' instead of 'waiting to lock'

Tue May 20 05:04:55 UTC 2014

David,

I'll have to politely disagree...

Here's the code in ObjectMonitor::exit() that handles the case
when a thread finds itself exiting a monitor that was stack
locked when it entered, but is now inflated:

src/share/vm/runtime/objectMonitor.cpp:

    912  void ATTR ObjectMonitor::exit(bool not_suspended, TRAPS) {
    913     Thread * Self = THREAD ;
    914     if (THREAD != _owner) {
    915       if (THREAD->is_lock_owned((address) _owner)) {
    916         // Transmute _owner from a BasicLock pointer to a Thread 
address
.
    917         // We don't need to hold _mutex for this transition.
    918         // Non-null to Non-null is safe as long as all readers can
    919         // tolerate either flavor.
    920         assert (_recursions == 0, "invariant") ;
    921         _owner = THREAD ;
    922         _recursions = 0 ;
    923         OwnerIsThread = 1 ;
    924       } else {
    925         // NOTE: we need to handle unbalanced monitor enter/exit
    926         // in native code by throwing an exception.
    927         // TODO: Throw an IllegalMonitorStateException ?
    928         TEVENT (Exit - Throw IMSX) ;
    929         assert(false, "Non-balanced monitor enter/exit!");
    930         if (false) {
    931 THROW(vmSymbols::java_lang_IllegalMonitorStateException());
    932         }
    933         return;
    934       }
    935     }

so that _owner field when we did the stack dump still contained
the BaseLock pointer instead of the JavaThread pointer...

Dan

On 5/19/14 8:09 PM, David Holmes wrote:
> Dan,
>
> My understanding of the existing code and hence the modified code is 
> that we have already established that we are dealing with an inflated 
> monitor and hence the owner really must be the owning thread - not a 
> stack address for a BasicLock. Isn't that what mark->has_monitor() 
> establishes?
>
> Note that we can not have reached a safepoint inflation is in progress.
>
> Thanks,
> David
>
> On 20/05/2014 9:47 AM, Daniel D. Daugherty wrote:
>> On 5/19/14 7:58 AM, Zhengyu Gu wrote:
>>> This is a simple fix for incorrect lock state.
>>>
>>> The timing on setting thread's pending monitor can result stack trace
>>> dump reporting incorrect lock state. The solution is to check the
>>> monitor's owner, if the owner is other than the current thread, then
>>> the monitor, is or is in process of being, set the pending monitor of
>>> current thread.
>>>
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8036823
>>> Webrev: http://cr.openjdk.java.net/~zgu/8036823/webrev.00/
>>
>> src/share/vm/runtime/vframe.cpp
>>      First off, I concur that the existing code has a problem.
>>      However, I think the proposed solution has at least one
>>      different problem.
>>
>>      Here's the original code:
>>
>>   183   // Print out all monitors that we have locked or are trying 
>> to lock
>>   184   GrowableArray<MonitorInfo*>* mons = monitors();
>>   185   if (!mons->is_empty()) {
>>   186     bool found_first_monitor = false;
>>   187     for (int index = (mons->length()-1); index >= 0; index--) {
>>   188       MonitorInfo* monitor = mons->at(index);
>>   189       if (monitor->eliminated() && is_compiled_frame()) { //
>> Eliminated in compiled code
>>   190         if (monitor->owner_is_scalar_replaced()) {
>>   191           Klass* k =
>> java_lang_Class::as_Klass(monitor->owner_klass());
>>   192           st->print("\t- eliminated <owner is scalar replaced> (a
>> %s)", k->external_name());
>>   193         } else {
>>   194           oop obj = monitor->owner();
>>   195           if (obj != NULL) {
>>   196             print_locked_object_class_name(st, obj, "eliminated");
>>   197           }
>>   198         }
>>   199         continue;
>>   200       }
>>   201       if (monitor->owner() != NULL) {
>>   202
>>   203         // First, assume we have the monitor locked. If we haven't
>> found an
>>   204         // owned monitor before and this is the first frame, then
>> we need to
>>   205         // see if we have completed the lock or we are blocked
>> trying to
>>   206         // acquire it - we can only be blocked if the monitor is
>> inflated
>>   207
>>   208         const char *lock_state = "locked"; // assume we have the
>> monitor locked
>>   209         if (!found_first_monitor && frame_count == 0) {
>>   210           markOop mark = monitor->owner()->mark();
>>   211           if (mark->has_monitor() &&
>>   212               mark->monitor() ==
>> thread()->current_pending_monitor()) {
>>   213             lock_state = "waiting to lock";
>>   214           }
>>   215         }
>>   216
>>   217         found_first_monitor = true;
>>   218         print_locked_object_class_name(st, monitor->owner(),
>> lock_state);
>>   219       }
>>   220     }
>>   221   }
>>
>>      The algorithm is pretty basic (no pun intended):
>>
>>      - for each monitor (BasicLock) in the Java frame
>>        - if the monitor has been eliminated and this is
>>          a compiled frame
>>          - do compiler specific handling
>>          - continue
>>        - if the monitor has a owner
>>          - set lock_state = "locked"
>>          - if this is the first monitor and the top most frame
>>            - set mark = associated Object's mark
>>            - if the mark has an ObjectMonitor and
>>                 the Java thread is pending on that monitor
>>               - set lock_state = "waiting to lock"
>>        - set found_first_monitor flag
>>        - print info about current monitor including
>>          the lock_state
>>
>>      A BasicLock is a Java monitor from the Java frame's point of
>>      view. It might be implemented as a stack lock, a biased lock
>>      or an inflated lock (ObjectMonitor).
>>
>>      The problem that I see with the algorithm is that it is
>>      traversing a collection of BasicLocks and then asking
>>      lock implementation specific questions in a an effort to
>>      determine if the BasicLock is owned by the Java thread
>>      whose frames we are traversing.
>>
>>      We already know that this code:
>>
>>          mark->monitor() == thread()->current_pending_monitor()
>>
>>      is specific to inflated monitors (ObjectMonitor) and it
>>      has a flaw because the target thread can be stopped at
>>      a safepoint before the _current_pending_monitor field
>>      is set.
>>
>>      The proposed code:
>>
>>          mark->monitor()->owner() != thread()
>>
>>      compares the owner of the inflated monitor to the Java
>>      thread whose frames we are traversing. The problem is
>>      that the ObjectMonitor::_owner field is not always a
>>      pointer to the JavaThread. Right after a stack lock
>>      is inflated, the ObjectMonitor::_owner field is set
>>      to the mark->locker() stack address; see:
>>
>>      src/share/vm/runtime/synchronizer.cpp:
>>      line 1283: m->set_owner(mark->locker());
>>
>>      So the proposed code will report "waiting to lock" for a
>>      Java monitor where the monitor is just transitioning
>>      from a stack lock to an inflated lock. Of course, the
>>      race is even tighter in this case. The target Java thread
>>      (T1) has to have just acquired the uninflated monitor
>>      and still be in the same frame in which that monitor was
>>      acquired at the same time that another thread (T2) lost
>>      the race to acquire the monitor. So T2 has to have just
>>      inflated the monitor, T1 has still be in the same frame
>>      in which it acquired the monitor and someone has to ask
>>      for a thread dump. Pretty cool.
>>
>>      Another thought has occurred to me. T1 owns the monitor,
>>      T2 is contending for the monitor and is starting to inflate
>>      it, we request a thread dump. Is it possible to see the
>>      BasicLock in T2's Java frame, see that the Object does not
>>      yet have an inflated monitor (!mark->has_monitor()) and
>>      report that T2 has the monitor "locked"?
>>
>>      This last scenario would again result in more than one
>>      thread being reported as having the same monitor locked.
>>
>>      To sum up:
>>
>>      I think both the current code and the proposed code have
>>      issues and we need to look at this a different way. We
>>      probably something like:
>>
>>          MonitorInfo::is_locked_by_thread(JavaThread jt)
>>
>>      that encapsulates and isolates the logic needed to
>>      determine if JavaThread 'jt' owns the BasicLock. It
>>      will need to know how to check:
>>
>>      - stack lock ownership (see JavaThread::is_lock_owned())
>>      - biased lock ownership (don't know how to do this one)
>>      - inflated lock ownership (_owner field == jt)
>>
>>      The stack lock ownership check needs to be done before
>>      the inflated lock ownership check. That's because the
>>      _owner field in the inflated lock can be the
>>      mark->locker() stack address when it is initially
>>      inflated.
>>
>>      I couldn't figure out the biased locking ownership
>>      check quickly so I don't know where it fits in the
>>      checking ownership pecking order.
>>
>>
>> Sorry this turned out to be not so simple... Welcome to
>> the joy of locking...
>>
>> Dan
>>
>> P.S.
>> Of course, now that we've discovered at least one
>> misuse of current_pending_monitor(), we should probably
>> audit the other uses of it also...
>>
>>
>>
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>