RFR: 8225351(13): assert failed: Revoke failed, unhandled biased lock state

David Holmes david.holmes at oracle.com
Thu Jun 20 04:03:28 UTC 2019


Hi Dan,

 > Maybe I'm the only reviewer that had trouble with the new check and,
 > if so, I'm sorry about that.

To this point you are the only reviewer and this potential reviewer was 
waiting to see how this discussion resolved :) Now I have something to 
base my review on - tomorrow (sorry Robbin, still lagging way behind 
after my travels.)

Cheers,
David

On 18/06/2019 1:46 am, Daniel D. Daugherty wrote:
> Hi Robbin,
> 
> I'm still having trouble with this. Please bear with me below...
> 
> 
> On 6/18/19 3:12 AM, Robbin Ehn wrote:
>> Hi Dan,
>>
>> On 2019-06-17 22:34, Daniel D. Daugherty wrote:
>>> Hi Robbin,
>>>
>>> Okay so I read this email and then the bug report. The conclusion sounds
>>> like we have an assert() that is a little too aggressive so I moved 
>>> on to
>>> the webrev... I expected to find a refinement to an assert(), but that's
>>> not quite what I see...
>>>
>>> Let me re-read everything again...
>>>
>>> Hmmm... still reads the same, but let me dive into the code instead...
>>>
>>>
>>> On 6/14/19 4:03 AM, Robbin Ehn wrote:
>>>> Hi all, please review.
>>
>> To answer:
>> >          I'm having problems with this sanity check. That function 
>> expects
>> >          to find a BasicLock (via a MonitorInfo) that refers to the
>> >          object whose bias we are trying to revoke. That check is 
>> assuming
>> >          that a biased lock must be in the thread's top frame and I 
>> don't
>> >          think that is a valid assumption.
>>
>>>>
>>>> When looking at a JavaThreads locks we retrieve them per frame via 
>>>> it's monitors
>>>> list. How this list actually populated differs from frame type. If a 
>>>> JavaThread tries to enter a new monitor a basic lock is 
>>>> directly/indirectly via e.g. scope info added to the virtual monitor 
>>>> list. 
>>
>> The lock is taken in this current java frame, the scope desc about the 
>> lock is thus added to current frame. When fast path fails we call into 
>> to VM to get the
>> lock. This MonitorInfo (generated from the scope desc) points to the 
>> "biased to
>> other thread lock" must thus be in top java frame.
>> This is the only case when a biased lock towards another thread will 
>> be on our
>> stack, when we have not yet had time to revoke it.
>>
>> I don't want another case falling through the assert, if anything I'm 
>> saying
>> would be incorrect.
>>
>> Unfortunate the monitor info is not guaranteed to be in order 
>> otherwise this
>> assert would also check this is the last lock in the top frame.
>> (at least interpreter can have them out-of-order, this is compiled 
>> frames, but I didn't feel sure about it)
> 
> I attached the hs_err_pid file from Mikael's original Mach5 sighting
> to the bug report. Let me get oriented here by looking at the crashing
> stack:
> 
> V  [jvm.dll+0x50bd22]  report_vm_error+0x102  (debug.cpp:264)
> V  [jvm.dll+0x2d553e] BiasedLocking::revoke_own_locks_in_handshake+0xfe 
> (biasedlocking.cpp:640)
>                         Deoptimization::revoke_using_handshake() 
> optimized out
> V  [jvm.dll+0x51a6f5]  Deoptimization::deopt_thread+0x1b5 
> (deoptimization.cpp:1518)
> V  [jvm.dll+0x51a960]  Deoptimization::deoptimize+0x50 
> (deoptimization.cpp:1500)
> V  [jvm.dll+0xc7bca2]  JavaThread::deoptimize_marked_methods+0xe2 
> (thread.cpp:2840)
> V  [jvm.dll+0x680f89] HandshakeThreadsOperation::do_handshake+0x189 
> (handshake.cpp:248)
> V  [jvm.dll+0x68144f]  VM_HandshakeAllThreads::doit+0x36f 
> (handshake.cpp:193)
> V  [jvm.dll+0xcdf18e]  VM_Operation::evaluate+0xbe (vmoperations.cpp:71)
> V  [jvm.dll+0xce42a8]  VMThread::evaluate_operation+0xb8 (vmthread.cpp:406)
> V  [jvm.dll+0xce507f]  VMThread::loop+0x84f  (vmthread.cpp:616)
> V  [jvm.dll+0xce5708]  VMThread::run+0xd8  (vmthread.cpp:305)
> V  [jvm.dll+0xc79232]  Thread::call_run+0x192  (thread.cpp:405)
> V  [jvm.dll+0xad6f0e]  thread_native_entry+0x10e (os_windows.cpp:471)
> 
> Okay, so deopt_thread() is called with the 'frame fr' that we are trying
> to deopt and it calls revoke_using_handshake() with the same 'frame fr'.
> revoke_using_handshake() calls  get_monitors_from_stack() passing the
> same 'frame fr' and populates 'objects_to_revoke'. We go thru the list
> of objects that need to be revoked and call revoke_own_locks_in_handshake()
> on each one. It is in one of the revoke_own_locks_in_handshake() calls
> that we fail the assertion because the object is biased towards another
> thread.
> 
> Okay, so now I'm oriented in the code and see where the assertion fails
> and where you changed to code so that we don't call that code if we see
> that the assertion is going to fail. I'm good with that part of the fix
> and I didn't make that clear before. Sorry about that.
> 
> 
> For the else-branch where you added the call to the lock_in_top_frame():
> 
> L1488:     if (mark->has_bias_pattern()) {
> L1489:       if (mark->biased_locker() == thread) {
> <snip>
> L1492:       } else {
> L1493:         #ifdef ASSERT
> L1494:         lock_in_top_frame(obj, thread);
> L1495:         #endif
> L1496:       }
> 
> You are still in the "has_bias_pattern" if-statement so we know the
> object is biased. You are in the else-branch of comparing the biased
> locker to the target thread so you know that the object is biased to
> some other thread.
> 
> Okay so now I'm oriented in the new checking code. The target thread
> has a BasicLock on its stack that refers to the object, but the object
> is biased to another thread so the target thread is contending on the
> lock which means that the frame _must be_ the top frame because the
> target thread is blocked.
> 
> I (finally) see why the new check is okay. Maybe I'm the only
> reviewer that had trouble with the new check and, if so, I'm
> sorry about that.
> 
> Please consider the following:
> 
> L1492:       } else {
>                 // Biased to another thread means 'thread' is
>                 // contending for the lock in the top frame.
>                 DEBUG_ONLY(verify_BasicLock_in_top_frame(obj, thread);
>               }
> 
> Thumbs up! Your call on whether to make more changes to this fix.
> 
> Dan
> 
>>
>> If this lock is biased
>>>> towards another
>>>> JavaThread we try to revoke that bias with a safepoint. In this case 
>>>> a deopt
>>>> handshake is already in queue. The handshake is thus executed before 
>>>> the revoke
>>>> safepoint.
>>>> The handshake goes over the monitors in compiled frames, find this 
>>>> lock and we
>>>> hit the assert. The assert make sure we actual can revoke the lock. 
>>>> A basic lock
>>>> on stack should always, if biased, be biased to current thread, with 
>>>> the
>>>> exception:
>>>> We may have a stack lock biased against another thread until
>>>> ObjectSynchronizer::fast_enter returns.
>>>>
>>>> To handle this exception we can safely ignore biased lock towards 
>>>> other threads
>>>> in the deopt handshake. Since such locks will always be revoked 
>>>> before we
>>>> deopt/unpack stack.
>>>>
>>>> Code:
>>>> http://cr.openjdk.java.net/~rehn/8225351/v1/webrev/index.html
>>>
>>> First a little analysis:
>>>
>>> Our failing assert is here:
>>>
>>> src/hotspot/share/runtime/biasedLocking.cpp:
>>>
>>> BiasedLocking::Condition 
>>> BiasedLocking::revoke_own_locks_in_handshake(Handle obj, TRAPS) {
>>>
>>>    assert(mark->biased_locker() == THREAD &&
>>>           prototype_header->bias_epoch() == mark->bias_epoch(), 
>>> "Revoke failed, unhandled biased lock state");
>>>
>>> Since the function is called revoke_own_locks_in_handshake() it makes
>>> sense that we're checking that the locks are biased to THREAD 
>>> (presumably
>>> the current thread since that's the TRAPS protocol). Just checking...
>>> Uhhhh... Looks like we can get to this code path from
>>> Deoptimization::deoptimize_frame_internal() when the current thread is
>>> the target thread *AND* when we are deoptimizing at a safepoint and the
>>> current thread is not the target thread. To my brain that makes use of
>>> the 'TRAPS' parameter "wrong", i.e., it should a "Thread* thread"
>>> parameter, but that's probably a carry over from older biased locking
>>> code.
>>>
>>> However, let's leave the topic of the "TRAPS" parameter alone for a
>>> separate investigation. It's entirely possible that my memory about the
>>> TRAPS protocol is rusty (or it has evolved since I last committed it to
>>> memory).
>>
>> Old code, changed to "JavaThread *thread".
>>
>>>
>>>
>>> src/hotspot/share/runtime/deoptimization.cpp
>>>      old L1464:     oop obj = (objects_to_revoke->at(i))();
>>>      old L1465: 
>>> BiasedLocking::revoke_own_locks_in_handshake(objects_to_revoke->at(i), thread); 
>>>
>>>
>>>      new L1486:     Handle obj = objects_to_revoke->at(i);
>>>      new L1487:     markOop mark = obj->mark();
>>>      new L1488:     if (mark->has_bias_pattern()) {
>>>      new L1489:       if (mark->biased_locker() == thread) {
>>>      new L1490: BiasedLocking::revoke_own_locks_in_handshake(obj, 
>>> thread);
>>>
>>>          So BiasedLocking::revoke_own_locks_in_handshake() has:
>>>
>>>          >  markOop mark = obj->mark();
>>>          >  if (!mark->has_bias_pattern()) {
>>>          >    return NOT_BIASED;
>>>          >  }
>>>
>>>          which shows that the code previously recognized that it 
>>> could be
>>>          called with an 'obj' parameter that wasn't biased. Perhaps it
>>>          would be better to change revoke_own_locks_in_handshake() to
>>>          something like:
>>
>> We may not call mark->biased_locker() on a non-biased lock.
>> So this call is just to let us call biased_locker().
>> There is not a biased_locker_if_biased() method.
>>
>>>
>>>          if (!mark->has_bias_pattern() || mark->biased_locker() != 
>>> THREAD) {
>>>
>>>          Of course, that means we would be returning "NOT_BIASED" for
>>>          the case where the target object _is_ biased, but just not
>>>          biased to our target thread.
>>
>> Yes, I did not like that. And since the method is called 'own_locks' 
>> it didn't
>> make sense to feed locks owned by other threads.
>>
>>>
>>>          Another thought occurs to me: If this is happening during a 
>>> handshake
>>>          and not at a safepoint, then what prevents the object from 
>>> passing
>>>          new L1488 and L1489 in deoptimization.cpp, then we call
>>>          revoke_own_locks_in_handshake() and pass 
>>> "(!mark->has_bias_pattern())"
>>>          only discover that the object has been rebiased to another 
>>> thread at
>>>          the assert? In other words, what happens if this revocation 
>>> attempt
>>>          loses a race with another revocation & rebias attempt?
>>
>> In this case current thread going to revoke that bias in the safepoint 
>> happening
>> after the handshake. Any other thread will need todo a safepoint to 
>> revoke it
>> also. Since we are in a handshake we know there is not going to be a 
>> safepoint
>> until we finishes the handshake.
>> It's not possible to rebias to another thread without a safepoint when 
>> the lock is biased to another thread.
>>
>>>
>>>      L1493:         #ifdef ASSERT
>>>      L1494:         lock_in_top_frame(obj, thread);
>>>      L1495:         #endif
>>>          I think this should be:
>>>
>>>                      DEBUG_ONLY(lock_in_top_frame(obj, thread);)
>>
>> Fixed.
>>
>>>
>>>      L1456: void lock_in_top_frame(Handle ho, JavaThread* thread) {
>>>      L1463:     if (mon_info->eliminated()) {
>>>      L1464:       continue;
>>>      L1465:     }
>>>      L1473:   assert(false, "Lock not found in top frame");
>>>          I'm having problems with this sanity check. That function 
>>> expects
>>>          to find a BasicLock (via a MonitorInfo) that refers to the
>>>          object whose bias we are trying to revoke. That check is 
>>> assuming
>>>          that a biased lock must be in the thread's top frame and I 
>>> don't
>>>          think that is a valid assumption.
>>>
>>>          revoke_using_handshake() calls get_monitors_from_stack() and 
>>> that
>>>          function collects objects that have monitors from all 
>>> vframes into
>>>          'objects_to_revoke'.
>>>
>>>          Maybe I'm missing something obvious here?
>>
>> I hope above explains more.
>>
>> Thanks, Robbin
>>
>>>
>>> Dan
>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8225351
>>>>
>>>> Passes t1-7
>>>> The assert code tested with local code changes to HandshakeAlot 
>>>> handshake.
>>>> We then see this state where last lock can be biased towards another 
>>>> thread and the thread is trying to execute revoke safepoint.
>>>>
>>>> Thanks, Robbin
>>>


More information about the hotspot-compiler-dev mailing list