RFR(XS) 8036823: Stack trace sometimes shows 'locked' instead of 'waiting to lock'
David Holmes
david.holmes at oracle.com
Tue May 20 09:24:22 UTC 2014
On 20/05/2014 3:28 PM, David Holmes wrote:
> Hi Dan,
>
> On 20/05/2014 3:04 PM, Daniel D. Daugherty wrote:
>> David,
>>
>> I'll have to politely disagree...
>
> That's why this code is so much fun ;-)
>
> You are of course correct. I misread this part of the inflate() code:
>
> m->set_owner(mark->locker());
>
> to be setting the thread id as the owner, but in fact it is the
> BasicLock address as you note.
>
> As I flagged with Zhengyu in email, and as per the July 2012 discussion
> Kris Mok referred back to, I was wondering why this code did not do the
> "obvious" thing - and indeed why no one suggested checking the owner
> directly when I made the check_pending_monitor change back in 2006. Now
> I know. :)
>
> However I don't think we need to be concerned about BiasedLocking
> because that goes away as soon as we inflate - and we have inflated. So
> _owner is either a Thread ID or a BasicLock address.
>
> But even so isn't the check:
>
> mark->monitor()->owner() != thread()
>
> perfectly valid because a BasicLock address will never match a thread
> pointer?
Ah but the thread being examined may in fact be the owner that has this
monitor stack-locked - which Dan also pointed out in his email. So we do
need to do additional checking for the BasicLock case.
David
> Thanks,
> David
>
>> Here's the code in ObjectMonitor::exit() that handles the case
>> when a thread finds itself exiting a monitor that was stack
>> locked when it entered, but is now inflated:
>>
>> src/share/vm/runtime/objectMonitor.cpp:
>>
>> 912 void ATTR ObjectMonitor::exit(bool not_suspended, TRAPS) {
>> 913 Thread * Self = THREAD ;
>> 914 if (THREAD != _owner) {
>> 915 if (THREAD->is_lock_owned((address) _owner)) {
>> 916 // Transmute _owner from a BasicLock pointer to a Thread
>> address
>> .
>> 917 // We don't need to hold _mutex for this transition.
>> 918 // Non-null to Non-null is safe as long as all readers
>> can
>> 919 // tolerate either flavor.
>> 920 assert (_recursions == 0, "invariant") ;
>> 921 _owner = THREAD ;
>> 922 _recursions = 0 ;
>> 923 OwnerIsThread = 1 ;
>> 924 } else {
>> 925 // NOTE: we need to handle unbalanced monitor enter/exit
>> 926 // in native code by throwing an exception.
>> 927 // TODO: Throw an IllegalMonitorStateException ?
>> 928 TEVENT (Exit - Throw IMSX) ;
>> 929 assert(false, "Non-balanced monitor enter/exit!");
>> 930 if (false) {
>> 931 THROW(vmSymbols::java_lang_IllegalMonitorStateException());
>> 932 }
>> 933 return;
>> 934 }
>> 935 }
>>
>> so that _owner field when we did the stack dump still contained
>> the BaseLock pointer instead of the JavaThread pointer...
>>
>> Dan
>>
>>
>> On 5/19/14 8:09 PM, David Holmes wrote:
>>> Dan,
>>>
>>> My understanding of the existing code and hence the modified code is
>>> that we have already established that we are dealing with an inflated
>>> monitor and hence the owner really must be the owning thread - not a
>>> stack address for a BasicLock. Isn't that what mark->has_monitor()
>>> establishes?
>>>
>>> Note that we can not have reached a safepoint inflation is in progress.
>>>
>>> Thanks,
>>> David
>>>
>>> On 20/05/2014 9:47 AM, Daniel D. Daugherty wrote:
>>>> On 5/19/14 7:58 AM, Zhengyu Gu wrote:
>>>>> This is a simple fix for incorrect lock state.
>>>>>
>>>>> The timing on setting thread's pending monitor can result stack trace
>>>>> dump reporting incorrect lock state. The solution is to check the
>>>>> monitor's owner, if the owner is other than the current thread, then
>>>>> the monitor, is or is in process of being, set the pending monitor of
>>>>> current thread.
>>>>>
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8036823
>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8036823/webrev.00/
>>>>
>>>> src/share/vm/runtime/vframe.cpp
>>>> First off, I concur that the existing code has a problem.
>>>> However, I think the proposed solution has at least one
>>>> different problem.
>>>>
>>>> Here's the original code:
>>>>
>>>> 183 // Print out all monitors that we have locked or are trying
>>>> to lock
>>>> 184 GrowableArray<MonitorInfo*>* mons = monitors();
>>>> 185 if (!mons->is_empty()) {
>>>> 186 bool found_first_monitor = false;
>>>> 187 for (int index = (mons->length()-1); index >= 0; index--) {
>>>> 188 MonitorInfo* monitor = mons->at(index);
>>>> 189 if (monitor->eliminated() && is_compiled_frame()) { //
>>>> Eliminated in compiled code
>>>> 190 if (monitor->owner_is_scalar_replaced()) {
>>>> 191 Klass* k =
>>>> java_lang_Class::as_Klass(monitor->owner_klass());
>>>> 192 st->print("\t- eliminated <owner is scalar replaced> (a
>>>> %s)", k->external_name());
>>>> 193 } else {
>>>> 194 oop obj = monitor->owner();
>>>> 195 if (obj != NULL) {
>>>> 196 print_locked_object_class_name(st, obj,
>>>> "eliminated");
>>>> 197 }
>>>> 198 }
>>>> 199 continue;
>>>> 200 }
>>>> 201 if (monitor->owner() != NULL) {
>>>> 202
>>>> 203 // First, assume we have the monitor locked. If we
>>>> haven't
>>>> found an
>>>> 204 // owned monitor before and this is the first frame, then
>>>> we need to
>>>> 205 // see if we have completed the lock or we are blocked
>>>> trying to
>>>> 206 // acquire it - we can only be blocked if the monitor is
>>>> inflated
>>>> 207
>>>> 208 const char *lock_state = "locked"; // assume we have the
>>>> monitor locked
>>>> 209 if (!found_first_monitor && frame_count == 0) {
>>>> 210 markOop mark = monitor->owner()->mark();
>>>> 211 if (mark->has_monitor() &&
>>>> 212 mark->monitor() ==
>>>> thread()->current_pending_monitor()) {
>>>> 213 lock_state = "waiting to lock";
>>>> 214 }
>>>> 215 }
>>>> 216
>>>> 217 found_first_monitor = true;
>>>> 218 print_locked_object_class_name(st, monitor->owner(),
>>>> lock_state);
>>>> 219 }
>>>> 220 }
>>>> 221 }
>>>>
>>>> The algorithm is pretty basic (no pun intended):
>>>>
>>>> - for each monitor (BasicLock) in the Java frame
>>>> - if the monitor has been eliminated and this is
>>>> a compiled frame
>>>> - do compiler specific handling
>>>> - continue
>>>> - if the monitor has a owner
>>>> - set lock_state = "locked"
>>>> - if this is the first monitor and the top most frame
>>>> - set mark = associated Object's mark
>>>> - if the mark has an ObjectMonitor and
>>>> the Java thread is pending on that monitor
>>>> - set lock_state = "waiting to lock"
>>>> - set found_first_monitor flag
>>>> - print info about current monitor including
>>>> the lock_state
>>>>
>>>> A BasicLock is a Java monitor from the Java frame's point of
>>>> view. It might be implemented as a stack lock, a biased lock
>>>> or an inflated lock (ObjectMonitor).
>>>>
>>>> The problem that I see with the algorithm is that it is
>>>> traversing a collection of BasicLocks and then asking
>>>> lock implementation specific questions in a an effort to
>>>> determine if the BasicLock is owned by the Java thread
>>>> whose frames we are traversing.
>>>>
>>>> We already know that this code:
>>>>
>>>> mark->monitor() == thread()->current_pending_monitor()
>>>>
>>>> is specific to inflated monitors (ObjectMonitor) and it
>>>> has a flaw because the target thread can be stopped at
>>>> a safepoint before the _current_pending_monitor field
>>>> is set.
>>>>
>>>> The proposed code:
>>>>
>>>> mark->monitor()->owner() != thread()
>>>>
>>>> compares the owner of the inflated monitor to the Java
>>>> thread whose frames we are traversing. The problem is
>>>> that the ObjectMonitor::_owner field is not always a
>>>> pointer to the JavaThread. Right after a stack lock
>>>> is inflated, the ObjectMonitor::_owner field is set
>>>> to the mark->locker() stack address; see:
>>>>
>>>> src/share/vm/runtime/synchronizer.cpp:
>>>> line 1283: m->set_owner(mark->locker());
>>>>
>>>> So the proposed code will report "waiting to lock" for a
>>>> Java monitor where the monitor is just transitioning
>>>> from a stack lock to an inflated lock. Of course, the
>>>> race is even tighter in this case. The target Java thread
>>>> (T1) has to have just acquired the uninflated monitor
>>>> and still be in the same frame in which that monitor was
>>>> acquired at the same time that another thread (T2) lost
>>>> the race to acquire the monitor. So T2 has to have just
>>>> inflated the monitor, T1 has still be in the same frame
>>>> in which it acquired the monitor and someone has to ask
>>>> for a thread dump. Pretty cool.
>>>>
>>>> Another thought has occurred to me. T1 owns the monitor,
>>>> T2 is contending for the monitor and is starting to inflate
>>>> it, we request a thread dump. Is it possible to see the
>>>> BasicLock in T2's Java frame, see that the Object does not
>>>> yet have an inflated monitor (!mark->has_monitor()) and
>>>> report that T2 has the monitor "locked"?
>>>>
>>>> This last scenario would again result in more than one
>>>> thread being reported as having the same monitor locked.
>>>>
>>>> To sum up:
>>>>
>>>> I think both the current code and the proposed code have
>>>> issues and we need to look at this a different way. We
>>>> probably something like:
>>>>
>>>> MonitorInfo::is_locked_by_thread(JavaThread jt)
>>>>
>>>> that encapsulates and isolates the logic needed to
>>>> determine if JavaThread 'jt' owns the BasicLock. It
>>>> will need to know how to check:
>>>>
>>>> - stack lock ownership (see JavaThread::is_lock_owned())
>>>> - biased lock ownership (don't know how to do this one)
>>>> - inflated lock ownership (_owner field == jt)
>>>>
>>>> The stack lock ownership check needs to be done before
>>>> the inflated lock ownership check. That's because the
>>>> _owner field in the inflated lock can be the
>>>> mark->locker() stack address when it is initially
>>>> inflated.
>>>>
>>>> I couldn't figure out the biased locking ownership
>>>> check quickly so I don't know where it fits in the
>>>> checking ownership pecking order.
>>>>
>>>>
>>>> Sorry this turned out to be not so simple... Welcome to
>>>> the joy of locking...
>>>>
>>>> Dan
>>>>
>>>> P.S.
>>>> Of course, now that we've discovered at least one
>>>> misuse of current_pending_monitor(), we should probably
>>>> audit the other uses of it also...
>>>>
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Zhengyu
>>>>
>>
More information about the hotspot-dev
mailing list