RFR(XS) 8036823: Stack trace sometimes shows 'locked' instead of 'waiting to lock'
David Holmes
david.holmes at oracle.com
Tue May 20 05:28:12 UTC 2014
Hi Dan,
On 20/05/2014 3:04 PM, Daniel D. Daugherty wrote:
> David,
>
> I'll have to politely disagree...
That's why this code is so much fun ;-)
You are of course correct. I misread this part of the inflate() code:
m->set_owner(mark->locker());
to be setting the thread id as the owner, but in fact it is the
BasicLock address as you note.
As I flagged with Zhengyu in email, and as per the July 2012 discussion
Kris Mok referred back to, I was wondering why this code did not do the
"obvious" thing - and indeed why no one suggested checking the owner
directly when I made the check_pending_monitor change back in 2006. Now
I know. :)
However I don't think we need to be concerned about BiasedLocking
because that goes away as soon as we inflate - and we have inflated. So
_owner is either a Thread ID or a BasicLock address.
But even so isn't the check:
mark->monitor()->owner() != thread()
perfectly valid because a BasicLock address will never match a thread
pointer?
Thanks,
David
> Here's the code in ObjectMonitor::exit() that handles the case
> when a thread finds itself exiting a monitor that was stack
> locked when it entered, but is now inflated:
>
> src/share/vm/runtime/objectMonitor.cpp:
>
> 912 void ATTR ObjectMonitor::exit(bool not_suspended, TRAPS) {
> 913 Thread * Self = THREAD ;
> 914 if (THREAD != _owner) {
> 915 if (THREAD->is_lock_owned((address) _owner)) {
> 916 // Transmute _owner from a BasicLock pointer to a Thread
> address
> .
> 917 // We don't need to hold _mutex for this transition.
> 918 // Non-null to Non-null is safe as long as all readers can
> 919 // tolerate either flavor.
> 920 assert (_recursions == 0, "invariant") ;
> 921 _owner = THREAD ;
> 922 _recursions = 0 ;
> 923 OwnerIsThread = 1 ;
> 924 } else {
> 925 // NOTE: we need to handle unbalanced monitor enter/exit
> 926 // in native code by throwing an exception.
> 927 // TODO: Throw an IllegalMonitorStateException ?
> 928 TEVENT (Exit - Throw IMSX) ;
> 929 assert(false, "Non-balanced monitor enter/exit!");
> 930 if (false) {
> 931 THROW(vmSymbols::java_lang_IllegalMonitorStateException());
> 932 }
> 933 return;
> 934 }
> 935 }
>
> so that _owner field when we did the stack dump still contained
> the BaseLock pointer instead of the JavaThread pointer...
>
> Dan
>
>
> On 5/19/14 8:09 PM, David Holmes wrote:
>> Dan,
>>
>> My understanding of the existing code and hence the modified code is
>> that we have already established that we are dealing with an inflated
>> monitor and hence the owner really must be the owning thread - not a
>> stack address for a BasicLock. Isn't that what mark->has_monitor()
>> establishes?
>>
>> Note that we can not have reached a safepoint inflation is in progress.
>>
>> Thanks,
>> David
>>
>> On 20/05/2014 9:47 AM, Daniel D. Daugherty wrote:
>>> On 5/19/14 7:58 AM, Zhengyu Gu wrote:
>>>> This is a simple fix for incorrect lock state.
>>>>
>>>> The timing on setting thread's pending monitor can result stack trace
>>>> dump reporting incorrect lock state. The solution is to check the
>>>> monitor's owner, if the owner is other than the current thread, then
>>>> the monitor, is or is in process of being, set the pending monitor of
>>>> current thread.
>>>>
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8036823
>>>> Webrev: http://cr.openjdk.java.net/~zgu/8036823/webrev.00/
>>>
>>> src/share/vm/runtime/vframe.cpp
>>> First off, I concur that the existing code has a problem.
>>> However, I think the proposed solution has at least one
>>> different problem.
>>>
>>> Here's the original code:
>>>
>>> 183 // Print out all monitors that we have locked or are trying
>>> to lock
>>> 184 GrowableArray<MonitorInfo*>* mons = monitors();
>>> 185 if (!mons->is_empty()) {
>>> 186 bool found_first_monitor = false;
>>> 187 for (int index = (mons->length()-1); index >= 0; index--) {
>>> 188 MonitorInfo* monitor = mons->at(index);
>>> 189 if (monitor->eliminated() && is_compiled_frame()) { //
>>> Eliminated in compiled code
>>> 190 if (monitor->owner_is_scalar_replaced()) {
>>> 191 Klass* k =
>>> java_lang_Class::as_Klass(monitor->owner_klass());
>>> 192 st->print("\t- eliminated <owner is scalar replaced> (a
>>> %s)", k->external_name());
>>> 193 } else {
>>> 194 oop obj = monitor->owner();
>>> 195 if (obj != NULL) {
>>> 196 print_locked_object_class_name(st, obj, "eliminated");
>>> 197 }
>>> 198 }
>>> 199 continue;
>>> 200 }
>>> 201 if (monitor->owner() != NULL) {
>>> 202
>>> 203 // First, assume we have the monitor locked. If we haven't
>>> found an
>>> 204 // owned monitor before and this is the first frame, then
>>> we need to
>>> 205 // see if we have completed the lock or we are blocked
>>> trying to
>>> 206 // acquire it - we can only be blocked if the monitor is
>>> inflated
>>> 207
>>> 208 const char *lock_state = "locked"; // assume we have the
>>> monitor locked
>>> 209 if (!found_first_monitor && frame_count == 0) {
>>> 210 markOop mark = monitor->owner()->mark();
>>> 211 if (mark->has_monitor() &&
>>> 212 mark->monitor() ==
>>> thread()->current_pending_monitor()) {
>>> 213 lock_state = "waiting to lock";
>>> 214 }
>>> 215 }
>>> 216
>>> 217 found_first_monitor = true;
>>> 218 print_locked_object_class_name(st, monitor->owner(),
>>> lock_state);
>>> 219 }
>>> 220 }
>>> 221 }
>>>
>>> The algorithm is pretty basic (no pun intended):
>>>
>>> - for each monitor (BasicLock) in the Java frame
>>> - if the monitor has been eliminated and this is
>>> a compiled frame
>>> - do compiler specific handling
>>> - continue
>>> - if the monitor has a owner
>>> - set lock_state = "locked"
>>> - if this is the first monitor and the top most frame
>>> - set mark = associated Object's mark
>>> - if the mark has an ObjectMonitor and
>>> the Java thread is pending on that monitor
>>> - set lock_state = "waiting to lock"
>>> - set found_first_monitor flag
>>> - print info about current monitor including
>>> the lock_state
>>>
>>> A BasicLock is a Java monitor from the Java frame's point of
>>> view. It might be implemented as a stack lock, a biased lock
>>> or an inflated lock (ObjectMonitor).
>>>
>>> The problem that I see with the algorithm is that it is
>>> traversing a collection of BasicLocks and then asking
>>> lock implementation specific questions in a an effort to
>>> determine if the BasicLock is owned by the Java thread
>>> whose frames we are traversing.
>>>
>>> We already know that this code:
>>>
>>> mark->monitor() == thread()->current_pending_monitor()
>>>
>>> is specific to inflated monitors (ObjectMonitor) and it
>>> has a flaw because the target thread can be stopped at
>>> a safepoint before the _current_pending_monitor field
>>> is set.
>>>
>>> The proposed code:
>>>
>>> mark->monitor()->owner() != thread()
>>>
>>> compares the owner of the inflated monitor to the Java
>>> thread whose frames we are traversing. The problem is
>>> that the ObjectMonitor::_owner field is not always a
>>> pointer to the JavaThread. Right after a stack lock
>>> is inflated, the ObjectMonitor::_owner field is set
>>> to the mark->locker() stack address; see:
>>>
>>> src/share/vm/runtime/synchronizer.cpp:
>>> line 1283: m->set_owner(mark->locker());
>>>
>>> So the proposed code will report "waiting to lock" for a
>>> Java monitor where the monitor is just transitioning
>>> from a stack lock to an inflated lock. Of course, the
>>> race is even tighter in this case. The target Java thread
>>> (T1) has to have just acquired the uninflated monitor
>>> and still be in the same frame in which that monitor was
>>> acquired at the same time that another thread (T2) lost
>>> the race to acquire the monitor. So T2 has to have just
>>> inflated the monitor, T1 has still be in the same frame
>>> in which it acquired the monitor and someone has to ask
>>> for a thread dump. Pretty cool.
>>>
>>> Another thought has occurred to me. T1 owns the monitor,
>>> T2 is contending for the monitor and is starting to inflate
>>> it, we request a thread dump. Is it possible to see the
>>> BasicLock in T2's Java frame, see that the Object does not
>>> yet have an inflated monitor (!mark->has_monitor()) and
>>> report that T2 has the monitor "locked"?
>>>
>>> This last scenario would again result in more than one
>>> thread being reported as having the same monitor locked.
>>>
>>> To sum up:
>>>
>>> I think both the current code and the proposed code have
>>> issues and we need to look at this a different way. We
>>> probably something like:
>>>
>>> MonitorInfo::is_locked_by_thread(JavaThread jt)
>>>
>>> that encapsulates and isolates the logic needed to
>>> determine if JavaThread 'jt' owns the BasicLock. It
>>> will need to know how to check:
>>>
>>> - stack lock ownership (see JavaThread::is_lock_owned())
>>> - biased lock ownership (don't know how to do this one)
>>> - inflated lock ownership (_owner field == jt)
>>>
>>> The stack lock ownership check needs to be done before
>>> the inflated lock ownership check. That's because the
>>> _owner field in the inflated lock can be the
>>> mark->locker() stack address when it is initially
>>> inflated.
>>>
>>> I couldn't figure out the biased locking ownership
>>> check quickly so I don't know where it fits in the
>>> checking ownership pecking order.
>>>
>>>
>>> Sorry this turned out to be not so simple... Welcome to
>>> the joy of locking...
>>>
>>> Dan
>>>
>>> P.S.
>>> Of course, now that we've discovered at least one
>>> misuse of current_pending_monitor(), we should probably
>>> audit the other uses of it also...
>>>
>>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>> -Zhengyu
>>>
>
More information about the hotspot-dev
mailing list