RFR: 8225351(13): assert failed: Revoke failed, unhandled biased lock state
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Jun 18 15:46:36 UTC 2019
Hi Robbin,
I'm still having trouble with this. Please bear with me below...
On 6/18/19 3:12 AM, Robbin Ehn wrote:
> Hi Dan,
>
> On 2019-06-17 22:34, Daniel D. Daugherty wrote:
>> Hi Robbin,
>>
>> Okay so I read this email and then the bug report. The conclusion sounds
>> like we have an assert() that is a little too aggressive so I moved
>> on to
>> the webrev... I expected to find a refinement to an assert(), but that's
>> not quite what I see...
>>
>> Let me re-read everything again...
>>
>> Hmmm... still reads the same, but let me dive into the code instead...
>>
>>
>> On 6/14/19 4:03 AM, Robbin Ehn wrote:
>>> Hi all, please review.
>
> To answer:
> > I'm having problems with this sanity check. That function
> expects
> > to find a BasicLock (via a MonitorInfo) that refers to the
> > object whose bias we are trying to revoke. That check is
> assuming
> > that a biased lock must be in the thread's top frame and I
> don't
> > think that is a valid assumption.
>
>>>
>>> When looking at a JavaThreads locks we retrieve them per frame via
>>> it's monitors
>>> list. How this list actually populated differs from frame type. If a
>>> JavaThread tries to enter a new monitor a basic lock is
>>> directly/indirectly via e.g. scope info added to the virtual monitor
>>> list.
>
> The lock is taken in this current java frame, the scope desc about the
> lock is thus added to current frame. When fast path fails we call into
> to VM to get the
> lock. This MonitorInfo (generated from the scope desc) points to the
> "biased to
> other thread lock" must thus be in top java frame.
> This is the only case when a biased lock towards another thread will
> be on our
> stack, when we have not yet had time to revoke it.
>
> I don't want another case falling through the assert, if anything I'm
> saying
> would be incorrect.
>
> Unfortunate the monitor info is not guaranteed to be in order
> otherwise this
> assert would also check this is the last lock in the top frame.
> (at least interpreter can have them out-of-order, this is compiled
> frames, but I didn't feel sure about it)
I attached the hs_err_pid file from Mikael's original Mach5 sighting
to the bug report. Let me get oriented here by looking at the crashing
stack:
V [jvm.dll+0x50bd22] report_vm_error+0x102 (debug.cpp:264)
V [jvm.dll+0x2d553e] BiasedLocking::revoke_own_locks_in_handshake+0xfe
(biasedlocking.cpp:640)
Deoptimization::revoke_using_handshake()
optimized out
V [jvm.dll+0x51a6f5] Deoptimization::deopt_thread+0x1b5
(deoptimization.cpp:1518)
V [jvm.dll+0x51a960] Deoptimization::deoptimize+0x50
(deoptimization.cpp:1500)
V [jvm.dll+0xc7bca2] JavaThread::deoptimize_marked_methods+0xe2
(thread.cpp:2840)
V [jvm.dll+0x680f89] HandshakeThreadsOperation::do_handshake+0x189
(handshake.cpp:248)
V [jvm.dll+0x68144f] VM_HandshakeAllThreads::doit+0x36f
(handshake.cpp:193)
V [jvm.dll+0xcdf18e] VM_Operation::evaluate+0xbe (vmoperations.cpp:71)
V [jvm.dll+0xce42a8] VMThread::evaluate_operation+0xb8 (vmthread.cpp:406)
V [jvm.dll+0xce507f] VMThread::loop+0x84f (vmthread.cpp:616)
V [jvm.dll+0xce5708] VMThread::run+0xd8 (vmthread.cpp:305)
V [jvm.dll+0xc79232] Thread::call_run+0x192 (thread.cpp:405)
V [jvm.dll+0xad6f0e] thread_native_entry+0x10e (os_windows.cpp:471)
Okay, so deopt_thread() is called with the 'frame fr' that we are trying
to deopt and it calls revoke_using_handshake() with the same 'frame fr'.
revoke_using_handshake() calls get_monitors_from_stack() passing the
same 'frame fr' and populates 'objects_to_revoke'. We go thru the list
of objects that need to be revoked and call revoke_own_locks_in_handshake()
on each one. It is in one of the revoke_own_locks_in_handshake() calls
that we fail the assertion because the object is biased towards another
thread.
Okay, so now I'm oriented in the code and see where the assertion fails
and where you changed to code so that we don't call that code if we see
that the assertion is going to fail. I'm good with that part of the fix
and I didn't make that clear before. Sorry about that.
For the else-branch where you added the call to the lock_in_top_frame():
L1488: if (mark->has_bias_pattern()) {
L1489: if (mark->biased_locker() == thread) {
<snip>
L1492: } else {
L1493: #ifdef ASSERT
L1494: lock_in_top_frame(obj, thread);
L1495: #endif
L1496: }
You are still in the "has_bias_pattern" if-statement so we know the
object is biased. You are in the else-branch of comparing the biased
locker to the target thread so you know that the object is biased to
some other thread.
Okay so now I'm oriented in the new checking code. The target thread
has a BasicLock on its stack that refers to the object, but the object
is biased to another thread so the target thread is contending on the
lock which means that the frame _must be_ the top frame because the
target thread is blocked.
I (finally) see why the new check is okay. Maybe I'm the only
reviewer that had trouble with the new check and, if so, I'm
sorry about that.
Please consider the following:
L1492: } else {
// Biased to another thread means 'thread' is
// contending for the lock in the top frame.
DEBUG_ONLY(verify_BasicLock_in_top_frame(obj, thread);
}
Thumbs up! Your call on whether to make more changes to this fix.
Dan
>
> If this lock is biased
>>> towards another
>>> JavaThread we try to revoke that bias with a safepoint. In this case
>>> a deopt
>>> handshake is already in queue. The handshake is thus executed before
>>> the revoke
>>> safepoint.
>>> The handshake goes over the monitors in compiled frames, find this
>>> lock and we
>>> hit the assert. The assert make sure we actual can revoke the lock.
>>> A basic lock
>>> on stack should always, if biased, be biased to current thread, with
>>> the
>>> exception:
>>> We may have a stack lock biased against another thread until
>>> ObjectSynchronizer::fast_enter returns.
>>>
>>> To handle this exception we can safely ignore biased lock towards
>>> other threads
>>> in the deopt handshake. Since such locks will always be revoked
>>> before we
>>> deopt/unpack stack.
>>>
>>> Code:
>>> http://cr.openjdk.java.net/~rehn/8225351/v1/webrev/index.html
>>
>> First a little analysis:
>>
>> Our failing assert is here:
>>
>> src/hotspot/share/runtime/biasedLocking.cpp:
>>
>> BiasedLocking::Condition
>> BiasedLocking::revoke_own_locks_in_handshake(Handle obj, TRAPS) {
>>
>> assert(mark->biased_locker() == THREAD &&
>> prototype_header->bias_epoch() == mark->bias_epoch(),
>> "Revoke failed, unhandled biased lock state");
>>
>> Since the function is called revoke_own_locks_in_handshake() it makes
>> sense that we're checking that the locks are biased to THREAD
>> (presumably
>> the current thread since that's the TRAPS protocol). Just checking...
>> Uhhhh... Looks like we can get to this code path from
>> Deoptimization::deoptimize_frame_internal() when the current thread is
>> the target thread *AND* when we are deoptimizing at a safepoint and the
>> current thread is not the target thread. To my brain that makes use of
>> the 'TRAPS' parameter "wrong", i.e., it should a "Thread* thread"
>> parameter, but that's probably a carry over from older biased locking
>> code.
>>
>> However, let's leave the topic of the "TRAPS" parameter alone for a
>> separate investigation. It's entirely possible that my memory about the
>> TRAPS protocol is rusty (or it has evolved since I last committed it to
>> memory).
>
> Old code, changed to "JavaThread *thread".
>
>>
>>
>> src/hotspot/share/runtime/deoptimization.cpp
>> old L1464: oop obj = (objects_to_revoke->at(i))();
>> old L1465:
>> BiasedLocking::revoke_own_locks_in_handshake(objects_to_revoke->at(i),
>> thread);
>>
>> new L1486: Handle obj = objects_to_revoke->at(i);
>> new L1487: markOop mark = obj->mark();
>> new L1488: if (mark->has_bias_pattern()) {
>> new L1489: if (mark->biased_locker() == thread) {
>> new L1490: BiasedLocking::revoke_own_locks_in_handshake(obj,
>> thread);
>>
>> So BiasedLocking::revoke_own_locks_in_handshake() has:
>>
>> > markOop mark = obj->mark();
>> > if (!mark->has_bias_pattern()) {
>> > return NOT_BIASED;
>> > }
>>
>> which shows that the code previously recognized that it
>> could be
>> called with an 'obj' parameter that wasn't biased. Perhaps it
>> would be better to change revoke_own_locks_in_handshake() to
>> something like:
>
> We may not call mark->biased_locker() on a non-biased lock.
> So this call is just to let us call biased_locker().
> There is not a biased_locker_if_biased() method.
>
>>
>> if (!mark->has_bias_pattern() || mark->biased_locker() !=
>> THREAD) {
>>
>> Of course, that means we would be returning "NOT_BIASED" for
>> the case where the target object _is_ biased, but just not
>> biased to our target thread.
>
> Yes, I did not like that. And since the method is called 'own_locks'
> it didn't
> make sense to feed locks owned by other threads.
>
>>
>> Another thought occurs to me: If this is happening during a
>> handshake
>> and not at a safepoint, then what prevents the object from
>> passing
>> new L1488 and L1489 in deoptimization.cpp, then we call
>> revoke_own_locks_in_handshake() and pass
>> "(!mark->has_bias_pattern())"
>> only discover that the object has been rebiased to another
>> thread at
>> the assert? In other words, what happens if this revocation
>> attempt
>> loses a race with another revocation & rebias attempt?
>
> In this case current thread going to revoke that bias in the safepoint
> happening
> after the handshake. Any other thread will need todo a safepoint to
> revoke it
> also. Since we are in a handshake we know there is not going to be a
> safepoint
> until we finishes the handshake.
> It's not possible to rebias to another thread without a safepoint when
> the lock is biased to another thread.
>
>>
>> L1493: #ifdef ASSERT
>> L1494: lock_in_top_frame(obj, thread);
>> L1495: #endif
>> I think this should be:
>>
>> DEBUG_ONLY(lock_in_top_frame(obj, thread);)
>
> Fixed.
>
>>
>> L1456: void lock_in_top_frame(Handle ho, JavaThread* thread) {
>> L1463: if (mon_info->eliminated()) {
>> L1464: continue;
>> L1465: }
>> L1473: assert(false, "Lock not found in top frame");
>> I'm having problems with this sanity check. That function
>> expects
>> to find a BasicLock (via a MonitorInfo) that refers to the
>> object whose bias we are trying to revoke. That check is
>> assuming
>> that a biased lock must be in the thread's top frame and I
>> don't
>> think that is a valid assumption.
>>
>> revoke_using_handshake() calls get_monitors_from_stack() and
>> that
>> function collects objects that have monitors from all
>> vframes into
>> 'objects_to_revoke'.
>>
>> Maybe I'm missing something obvious here?
>
> I hope above explains more.
>
> Thanks, Robbin
>
>>
>> Dan
>>
>>> Issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8225351
>>>
>>> Passes t1-7
>>> The assert code tested with local code changes to HandshakeAlot
>>> handshake.
>>> We then see this state where last lock can be biased towards another
>>> thread and the thread is trying to execute revoke safepoint.
>>>
>>> Thanks, Robbin
>>
More information about the hotspot-compiler-dev
mailing list