RFR(XS) 8036823: Stack trace sometimes shows 'locked' instead of 'waiting to lock'
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon May 19 23:47:14 UTC 2014
On 5/19/14 7:58 AM, Zhengyu Gu wrote:
> This is a simple fix for incorrect lock state.
>
> The timing on setting thread's pending monitor can result stack trace
> dump reporting incorrect lock state. The solution is to check the
> monitor's owner, if the owner is other than the current thread, then
> the monitor, is or is in process of being, set the pending monitor of
> current thread.
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8036823
> Webrev: http://cr.openjdk.java.net/~zgu/8036823/webrev.00/
src/share/vm/runtime/vframe.cpp
First off, I concur that the existing code has a problem.
However, I think the proposed solution has at least one
different problem.
Here's the original code:
183 // Print out all monitors that we have locked or are trying to lock
184 GrowableArray<MonitorInfo*>* mons = monitors();
185 if (!mons->is_empty()) {
186 bool found_first_monitor = false;
187 for (int index = (mons->length()-1); index >= 0; index--) {
188 MonitorInfo* monitor = mons->at(index);
189 if (monitor->eliminated() && is_compiled_frame()) { //
Eliminated in compiled code
190 if (monitor->owner_is_scalar_replaced()) {
191 Klass* k =
java_lang_Class::as_Klass(monitor->owner_klass());
192 st->print("\t- eliminated <owner is scalar replaced> (a
%s)", k->external_name());
193 } else {
194 oop obj = monitor->owner();
195 if (obj != NULL) {
196 print_locked_object_class_name(st, obj, "eliminated");
197 }
198 }
199 continue;
200 }
201 if (monitor->owner() != NULL) {
202
203 // First, assume we have the monitor locked. If we haven't
found an
204 // owned monitor before and this is the first frame, then
we need to
205 // see if we have completed the lock or we are blocked
trying to
206 // acquire it - we can only be blocked if the monitor is
inflated
207
208 const char *lock_state = "locked"; // assume we have the
monitor locked
209 if (!found_first_monitor && frame_count == 0) {
210 markOop mark = monitor->owner()->mark();
211 if (mark->has_monitor() &&
212 mark->monitor() ==
thread()->current_pending_monitor()) {
213 lock_state = "waiting to lock";
214 }
215 }
216
217 found_first_monitor = true;
218 print_locked_object_class_name(st, monitor->owner(),
lock_state);
219 }
220 }
221 }
The algorithm is pretty basic (no pun intended):
- for each monitor (BasicLock) in the Java frame
- if the monitor has been eliminated and this is
a compiled frame
- do compiler specific handling
- continue
- if the monitor has a owner
- set lock_state = "locked"
- if this is the first monitor and the top most frame
- set mark = associated Object's mark
- if the mark has an ObjectMonitor and
the Java thread is pending on that monitor
- set lock_state = "waiting to lock"
- set found_first_monitor flag
- print info about current monitor including
the lock_state
A BasicLock is a Java monitor from the Java frame's point of
view. It might be implemented as a stack lock, a biased lock
or an inflated lock (ObjectMonitor).
The problem that I see with the algorithm is that it is
traversing a collection of BasicLocks and then asking
lock implementation specific questions in a an effort to
determine if the BasicLock is owned by the Java thread
whose frames we are traversing.
We already know that this code:
mark->monitor() == thread()->current_pending_monitor()
is specific to inflated monitors (ObjectMonitor) and it
has a flaw because the target thread can be stopped at
a safepoint before the _current_pending_monitor field
is set.
The proposed code:
mark->monitor()->owner() != thread()
compares the owner of the inflated monitor to the Java
thread whose frames we are traversing. The problem is
that the ObjectMonitor::_owner field is not always a
pointer to the JavaThread. Right after a stack lock
is inflated, the ObjectMonitor::_owner field is set
to the mark->locker() stack address; see:
src/share/vm/runtime/synchronizer.cpp:
line 1283: m->set_owner(mark->locker());
So the proposed code will report "waiting to lock" for a
Java monitor where the monitor is just transitioning
from a stack lock to an inflated lock. Of course, the
race is even tighter in this case. The target Java thread
(T1) has to have just acquired the uninflated monitor
and still be in the same frame in which that monitor was
acquired at the same time that another thread (T2) lost
the race to acquire the monitor. So T2 has to have just
inflated the monitor, T1 has still be in the same frame
in which it acquired the monitor and someone has to ask
for a thread dump. Pretty cool.
Another thought has occurred to me. T1 owns the monitor,
T2 is contending for the monitor and is starting to inflate
it, we request a thread dump. Is it possible to see the
BasicLock in T2's Java frame, see that the Object does not
yet have an inflated monitor (!mark->has_monitor()) and
report that T2 has the monitor "locked"?
This last scenario would again result in more than one
thread being reported as having the same monitor locked.
To sum up:
I think both the current code and the proposed code have
issues and we need to look at this a different way. We
probably something like:
MonitorInfo::is_locked_by_thread(JavaThread jt)
that encapsulates and isolates the logic needed to
determine if JavaThread 'jt' owns the BasicLock. It
will need to know how to check:
- stack lock ownership (see JavaThread::is_lock_owned())
- biased lock ownership (don't know how to do this one)
- inflated lock ownership (_owner field == jt)
The stack lock ownership check needs to be done before
the inflated lock ownership check. That's because the
_owner field in the inflated lock can be the
mark->locker() stack address when it is initially
inflated.
I couldn't figure out the biased locking ownership
check quickly so I don't know where it fits in the
checking ownership pecking order.
Sorry this turned out to be not so simple... Welcome to
the joy of locking...
Dan
P.S.
Of course, now that we've discovered at least one
misuse of current_pending_monitor(), we should probably
audit the other uses of it also...
>
> Thanks,
>
> -Zhengyu
More information about the hotspot-dev
mailing list