RFR: 8225351(13): assert failed: Revoke failed, unhandled biased lock state
David Holmes
david.holmes at oracle.com
Thu Jun 20 04:03:28 UTC 2019
Hi Dan,
> Maybe I'm the only reviewer that had trouble with the new check and,
> if so, I'm sorry about that.
To this point you are the only reviewer and this potential reviewer was
waiting to see how this discussion resolved :) Now I have something to
base my review on - tomorrow (sorry Robbin, still lagging way behind
after my travels.)
Cheers,
David
On 18/06/2019 1:46 am, Daniel D. Daugherty wrote:
> Hi Robbin,
>
> I'm still having trouble with this. Please bear with me below...
>
>
> On 6/18/19 3:12 AM, Robbin Ehn wrote:
>> Hi Dan,
>>
>> On 2019-06-17 22:34, Daniel D. Daugherty wrote:
>>> Hi Robbin,
>>>
>>> Okay so I read this email and then the bug report. The conclusion sounds
>>> like we have an assert() that is a little too aggressive so I moved
>>> on to
>>> the webrev... I expected to find a refinement to an assert(), but that's
>>> not quite what I see...
>>>
>>> Let me re-read everything again...
>>>
>>> Hmmm... still reads the same, but let me dive into the code instead...
>>>
>>>
>>> On 6/14/19 4:03 AM, Robbin Ehn wrote:
>>>> Hi all, please review.
>>
>> To answer:
>> > I'm having problems with this sanity check. That function
>> expects
>> > to find a BasicLock (via a MonitorInfo) that refers to the
>> > object whose bias we are trying to revoke. That check is
>> assuming
>> > that a biased lock must be in the thread's top frame and I
>> don't
>> > think that is a valid assumption.
>>
>>>>
>>>> When looking at a JavaThreads locks we retrieve them per frame via
>>>> it's monitors
>>>> list. How this list actually populated differs from frame type. If a
>>>> JavaThread tries to enter a new monitor a basic lock is
>>>> directly/indirectly via e.g. scope info added to the virtual monitor
>>>> list.
>>
>> The lock is taken in this current java frame, the scope desc about the
>> lock is thus added to current frame. When fast path fails we call into
>> to VM to get the
>> lock. This MonitorInfo (generated from the scope desc) points to the
>> "biased to
>> other thread lock" must thus be in top java frame.
>> This is the only case when a biased lock towards another thread will
>> be on our
>> stack, when we have not yet had time to revoke it.
>>
>> I don't want another case falling through the assert, if anything I'm
>> saying
>> would be incorrect.
>>
>> Unfortunate the monitor info is not guaranteed to be in order
>> otherwise this
>> assert would also check this is the last lock in the top frame.
>> (at least interpreter can have them out-of-order, this is compiled
>> frames, but I didn't feel sure about it)
>
> I attached the hs_err_pid file from Mikael's original Mach5 sighting
> to the bug report. Let me get oriented here by looking at the crashing
> stack:
>
> V [jvm.dll+0x50bd22] report_vm_error+0x102 (debug.cpp:264)
> V [jvm.dll+0x2d553e] BiasedLocking::revoke_own_locks_in_handshake+0xfe
> (biasedlocking.cpp:640)
> Deoptimization::revoke_using_handshake()
> optimized out
> V [jvm.dll+0x51a6f5] Deoptimization::deopt_thread+0x1b5
> (deoptimization.cpp:1518)
> V [jvm.dll+0x51a960] Deoptimization::deoptimize+0x50
> (deoptimization.cpp:1500)
> V [jvm.dll+0xc7bca2] JavaThread::deoptimize_marked_methods+0xe2
> (thread.cpp:2840)
> V [jvm.dll+0x680f89] HandshakeThreadsOperation::do_handshake+0x189
> (handshake.cpp:248)
> V [jvm.dll+0x68144f] VM_HandshakeAllThreads::doit+0x36f
> (handshake.cpp:193)
> V [jvm.dll+0xcdf18e] VM_Operation::evaluate+0xbe (vmoperations.cpp:71)
> V [jvm.dll+0xce42a8] VMThread::evaluate_operation+0xb8 (vmthread.cpp:406)
> V [jvm.dll+0xce507f] VMThread::loop+0x84f (vmthread.cpp:616)
> V [jvm.dll+0xce5708] VMThread::run+0xd8 (vmthread.cpp:305)
> V [jvm.dll+0xc79232] Thread::call_run+0x192 (thread.cpp:405)
> V [jvm.dll+0xad6f0e] thread_native_entry+0x10e (os_windows.cpp:471)
>
> Okay, so deopt_thread() is called with the 'frame fr' that we are trying
> to deopt and it calls revoke_using_handshake() with the same 'frame fr'.
> revoke_using_handshake() calls get_monitors_from_stack() passing the
> same 'frame fr' and populates 'objects_to_revoke'. We go thru the list
> of objects that need to be revoked and call revoke_own_locks_in_handshake()
> on each one. It is in one of the revoke_own_locks_in_handshake() calls
> that we fail the assertion because the object is biased towards another
> thread.
>
> Okay, so now I'm oriented in the code and see where the assertion fails
> and where you changed to code so that we don't call that code if we see
> that the assertion is going to fail. I'm good with that part of the fix
> and I didn't make that clear before. Sorry about that.
>
>
> For the else-branch where you added the call to the lock_in_top_frame():
>
> L1488: if (mark->has_bias_pattern()) {
> L1489: if (mark->biased_locker() == thread) {
> <snip>
> L1492: } else {
> L1493: #ifdef ASSERT
> L1494: lock_in_top_frame(obj, thread);
> L1495: #endif
> L1496: }
>
> You are still in the "has_bias_pattern" if-statement so we know the
> object is biased. You are in the else-branch of comparing the biased
> locker to the target thread so you know that the object is biased to
> some other thread.
>
> Okay so now I'm oriented in the new checking code. The target thread
> has a BasicLock on its stack that refers to the object, but the object
> is biased to another thread so the target thread is contending on the
> lock which means that the frame _must be_ the top frame because the
> target thread is blocked.
>
> I (finally) see why the new check is okay. Maybe I'm the only
> reviewer that had trouble with the new check and, if so, I'm
> sorry about that.
>
> Please consider the following:
>
> L1492: } else {
> // Biased to another thread means 'thread' is
> // contending for the lock in the top frame.
> DEBUG_ONLY(verify_BasicLock_in_top_frame(obj, thread);
> }
>
> Thumbs up! Your call on whether to make more changes to this fix.
>
> Dan
>
>>
>> If this lock is biased
>>>> towards another
>>>> JavaThread we try to revoke that bias with a safepoint. In this case
>>>> a deopt
>>>> handshake is already in queue. The handshake is thus executed before
>>>> the revoke
>>>> safepoint.
>>>> The handshake goes over the monitors in compiled frames, find this
>>>> lock and we
>>>> hit the assert. The assert make sure we actual can revoke the lock.
>>>> A basic lock
>>>> on stack should always, if biased, be biased to current thread, with
>>>> the
>>>> exception:
>>>> We may have a stack lock biased against another thread until
>>>> ObjectSynchronizer::fast_enter returns.
>>>>
>>>> To handle this exception we can safely ignore biased lock towards
>>>> other threads
>>>> in the deopt handshake. Since such locks will always be revoked
>>>> before we
>>>> deopt/unpack stack.
>>>>
>>>> Code:
>>>> http://cr.openjdk.java.net/~rehn/8225351/v1/webrev/index.html
>>>
>>> First a little analysis:
>>>
>>> Our failing assert is here:
>>>
>>> src/hotspot/share/runtime/biasedLocking.cpp:
>>>
>>> BiasedLocking::Condition
>>> BiasedLocking::revoke_own_locks_in_handshake(Handle obj, TRAPS) {
>>>
>>> assert(mark->biased_locker() == THREAD &&
>>> prototype_header->bias_epoch() == mark->bias_epoch(),
>>> "Revoke failed, unhandled biased lock state");
>>>
>>> Since the function is called revoke_own_locks_in_handshake() it makes
>>> sense that we're checking that the locks are biased to THREAD
>>> (presumably
>>> the current thread since that's the TRAPS protocol). Just checking...
>>> Uhhhh... Looks like we can get to this code path from
>>> Deoptimization::deoptimize_frame_internal() when the current thread is
>>> the target thread *AND* when we are deoptimizing at a safepoint and the
>>> current thread is not the target thread. To my brain that makes use of
>>> the 'TRAPS' parameter "wrong", i.e., it should a "Thread* thread"
>>> parameter, but that's probably a carry over from older biased locking
>>> code.
>>>
>>> However, let's leave the topic of the "TRAPS" parameter alone for a
>>> separate investigation. It's entirely possible that my memory about the
>>> TRAPS protocol is rusty (or it has evolved since I last committed it to
>>> memory).
>>
>> Old code, changed to "JavaThread *thread".
>>
>>>
>>>
>>> src/hotspot/share/runtime/deoptimization.cpp
>>> old L1464: oop obj = (objects_to_revoke->at(i))();
>>> old L1465:
>>> BiasedLocking::revoke_own_locks_in_handshake(objects_to_revoke->at(i), thread);
>>>
>>>
>>> new L1486: Handle obj = objects_to_revoke->at(i);
>>> new L1487: markOop mark = obj->mark();
>>> new L1488: if (mark->has_bias_pattern()) {
>>> new L1489: if (mark->biased_locker() == thread) {
>>> new L1490: BiasedLocking::revoke_own_locks_in_handshake(obj,
>>> thread);
>>>
>>> So BiasedLocking::revoke_own_locks_in_handshake() has:
>>>
>>> > markOop mark = obj->mark();
>>> > if (!mark->has_bias_pattern()) {
>>> > return NOT_BIASED;
>>> > }
>>>
>>> which shows that the code previously recognized that it
>>> could be
>>> called with an 'obj' parameter that wasn't biased. Perhaps it
>>> would be better to change revoke_own_locks_in_handshake() to
>>> something like:
>>
>> We may not call mark->biased_locker() on a non-biased lock.
>> So this call is just to let us call biased_locker().
>> There is not a biased_locker_if_biased() method.
>>
>>>
>>> if (!mark->has_bias_pattern() || mark->biased_locker() !=
>>> THREAD) {
>>>
>>> Of course, that means we would be returning "NOT_BIASED" for
>>> the case where the target object _is_ biased, but just not
>>> biased to our target thread.
>>
>> Yes, I did not like that. And since the method is called 'own_locks'
>> it didn't
>> make sense to feed locks owned by other threads.
>>
>>>
>>> Another thought occurs to me: If this is happening during a
>>> handshake
>>> and not at a safepoint, then what prevents the object from
>>> passing
>>> new L1488 and L1489 in deoptimization.cpp, then we call
>>> revoke_own_locks_in_handshake() and pass
>>> "(!mark->has_bias_pattern())"
>>> only discover that the object has been rebiased to another
>>> thread at
>>> the assert? In other words, what happens if this revocation
>>> attempt
>>> loses a race with another revocation & rebias attempt?
>>
>> In this case current thread going to revoke that bias in the safepoint
>> happening
>> after the handshake. Any other thread will need todo a safepoint to
>> revoke it
>> also. Since we are in a handshake we know there is not going to be a
>> safepoint
>> until we finishes the handshake.
>> It's not possible to rebias to another thread without a safepoint when
>> the lock is biased to another thread.
>>
>>>
>>> L1493: #ifdef ASSERT
>>> L1494: lock_in_top_frame(obj, thread);
>>> L1495: #endif
>>> I think this should be:
>>>
>>> DEBUG_ONLY(lock_in_top_frame(obj, thread);)
>>
>> Fixed.
>>
>>>
>>> L1456: void lock_in_top_frame(Handle ho, JavaThread* thread) {
>>> L1463: if (mon_info->eliminated()) {
>>> L1464: continue;
>>> L1465: }
>>> L1473: assert(false, "Lock not found in top frame");
>>> I'm having problems with this sanity check. That function
>>> expects
>>> to find a BasicLock (via a MonitorInfo) that refers to the
>>> object whose bias we are trying to revoke. That check is
>>> assuming
>>> that a biased lock must be in the thread's top frame and I
>>> don't
>>> think that is a valid assumption.
>>>
>>> revoke_using_handshake() calls get_monitors_from_stack() and
>>> that
>>> function collects objects that have monitors from all
>>> vframes into
>>> 'objects_to_revoke'.
>>>
>>> Maybe I'm missing something obvious here?
>>
>> I hope above explains more.
>>
>> Thanks, Robbin
>>
>>>
>>> Dan
>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8225351
>>>>
>>>> Passes t1-7
>>>> The assert code tested with local code changes to HandshakeAlot
>>>> handshake.
>>>> We then see this state where last lock can be biased towards another
>>>> thread and the thread is trying to execute revoke safepoint.
>>>>
>>>> Thanks, Robbin
>>>
More information about the hotspot-compiler-dev
mailing list