RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Fri Jun 21 11:21:32 UTC 2019

Hi Patricio,

The JFR related stuff seems fine.

Thanks
Markus

-----Original Message-----
From: Patricio Chilano 
Sent: den 19 juni 2019 07:09
To: Coleen Phillimore <coleen.phillimore at oracle.com>; Markus Grönlund <markus.gronlund at oracle.com>; Robbin Ehn <robbin.ehn at oracle.com>; daniel.daugherty <daniel.daugherty at oracle.com>; David Holmes <david.holmes at oracle.com>
Cc: hotspot-runtime-dev at openjdk.java.net runtime <hotspot-runtime-dev at openjdk.java.net>
Subject: Re: RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Hi all,

Here is v03, it contains the fixes suggested by Markus and Coleen:

Full: http://cr.openjdk.java.net/~pchilanomate/8191890/v03/webrev/
<http://cr.openjdk.java.net/%7Epchilanomate/8191890/v03/webrev/>
Inc: http://cr.openjdk.java.net/~pchilanomate/8191890/v03/inc/webrev/
<http://cr.openjdk.java.net/%7Epchilanomate/8191890/v03/inc/webrev/>

The only extra change that I made was moving the clean of the cache and the declaration of ResourceMark outside of walk_stack_and_revoke(), since that was preventing the optimization of using the cache when revoking with revoke(GrowableArray<Handle>* objs, JavaThread *).

Coleen: I didn't simplified methods revoke(GrowableArray<Handle>*
objs...) and revoke_at_safepoint(GrowableArray<Handle>* objs) after all. 
I tried to do it by replacing them with a single method that would mostly call walk_stack_and_revoke() for revocations, but that causes problems similar to 8225351, since the passed array might contain an object not biased towards the expected JavaThread.

Tested with mach5 tiers1-6.

Thanks!

Patricio

On 6/17/19 6:55 PM, Patricio Chilano wrote:
> Hi Coleen,
>
> On 6/17/19 4:55 PM, coleen.phillimore at oracle.com wrote:
>>
>> On 6/17/19 2:14 PM, Patricio Chilano wrote:
>>> Hi Coleen,
>>>
>>> On 6/14/19 7:08 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Sorry for being late to the party.
>>>>
>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/src/hot
>>>> spot/share/runtime/biasedLocking.cpp.frames.html
>>>>
>>>>
>>>> 586 if (_biased_locker == mark->biased_locker()) {
>>>> 587 if (mark->bias_epoch() == prototype->bias_epoch()) {
>>>>
>>>> Can you add a comment what this means?   The object's biased locker 
>>>> matches what we thought it was, and the epoch being the same means?
>>>> The epoch being equal means that this biaser actually might have 
>>>> this lock?   A comment would be good here.
>>> Yes, if the epoch is still valid it means the biaser could be 
>>> currently synchronized on this object. If that's the case then we 
>>> must walk its stack and change those monitor records into thin 
>>> locks. Added comment.
>>>
>>>
>>>> 785 mark = res_mark; // Refresh mark with the latest value.
>>>>
>>>>
>>>> I don't see what this does either. I had to download your patch. 
>>>> 'mark' isn't used outside the loop and it is reloaded at the top of 
>>>> the loop.
>>> If the CAS fails, the mark needs to be updated with the new value so 
>>> that when we get the current biaser (HR_SINGLE_REVOKE case) we 
>>> actually get the updated biaser and not the old one. If we don't do 
>>> that we could be handshaking the wrong thread, or worst we could hit 
>>> an assert in walk_stack_and_revoke() for the "blt == THREAD", since 
>>> the old thread could be ourselves.
>>
>> I found where it's used below now.  Maybe once this function is 
>> refactored a bit, it'll be easier to see for next time.  Looks good!
>>>
>>>
>>>> 796 obj->cas_set_mark(prototype_header->set_age(mark->age()), 
>>>> mark);
>>>>
>>>>
>>>> As an later enhancement, there should be some inline function in 
>>>> markOop that returns the prototype header preserving the age of the 
>>>> object, but I'll leave it to you to name.
>>> Ok, sounds good. That particular line was preexistent but I did 
>>> added in some places 
>>> "markOopDesc::prototype()->set_age(mark->age())", which is doing the 
>>> same thing.
>>>
>>>
>>>> 555 RevokeOneBias(Handle* obj, JavaThread* requesting_thread,
>>>> JavaThread* biased_locker) ... 565 oop o = (*_obj)();
>>>>
>>>>
>>>> This was pre-existing your change, but passing Handle* is not 
>>>> generally done, and is suspicious when it is because it must be 
>>>> allocated with the thread calling the function.  Can you change 
>>>> this to Handle (not pointer)?  I can't think why this would be done 
>>>> this way.
>>> Fixed.
>>>
>>>
>>
>> Great.  I was afraid there was some subtlety I didn't see. Please 
>> retest with this though just in case.  Sometimes bits of wierdness 
>> have a strange reason that isn't documented.
> Yes, I'll retest just in case.
>
>
>>>> 870 // All objects in objs should be locked by biaser
>>>> 871 void BiasedLocking::revoke(GrowableArray<Handle>* objs, 
>>>> JavaThread *biaser) {
>>>>
>>>>
>>>> I don't see why BiasedLocking::revoke() and
>>>> BiasedLocking::revoke_at_safepoint() are so different.
>>>>
>>>> The name "revoke" should be something more descriptive of the 
>>>> situation though, like revoke_for_current_thread() or something 
>>>> like that (revoke_at_safepoint's objects are from the stack too for 
>>>> the current thread...)  I keep thinking "revoke" should be a leaf 
>>>> function in biasedLocking.
>>> Yes, nice observation. Method
>>> revoke_at_safepoint(GrowableArray<Handle>* objs) could be removed 
>>> and we could just use BiasedLocking::revoke(GrowableArray<Handle>*
>>> objs ...) instead, since it's called from deoptimization.cpp where 
>>> all the objects in the array belong to the same JavaThread. The 
>>> difference is that we don't do update_heuristics() for the 
>>> non-safepoint case since it might trigger a bulk operation. For the 
>>> safepoint case it doesn't matter because we are already at one, we 
>>> don't have the overhead of requesting it. But I could combine them 
>>> into one method and do the update_heuristics() only if we are at a 
>>> safepoint, what do you think?
>>
>> You could file a follow-up RFE for this if you want, since the 
>> current version has gone through all the testing.
> I need to test again anyways and I think this could be a nice 
> simplification. I'll test it and try to include it in v3.
>
>> The change looks great!
> Thanks Coleen! : )
>
>
> Patricio
>> Coleen
>>>
>>>> The change looks really good to me and I look forward to further 
>>>> cleanups so maybe it'll make sense someday!
>>> Thanks for looking at this Coleen!
>>>
>>>
>>> Patricio
>>>> Thanks!
>>>> Coleen
>>>>
>>>> On 6/7/19 12:56 AM, Patricio Chilano wrote:
>>>>> Hi all,
>>>>>
>>>>> Here is v02 addressing comments made by Dan and David.
>>>>>
>>>>> Full webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/
>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
>>>>> Inc webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/
>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Patricio
>>>>>
>>>>> On 6/6/19 7:37 PM, David Holmes wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>
>>>>>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>>>>>> Hi Patricio,
>>>>>>>>
>>>>>>>> First thanks for taking this on!
>>>>>>>>
>>>>>>>> I have some higher-level general discussion around this before 
>>>>>>>> deep diving into the actual code review (not that I have much 
>>>>>>>> there either :)).
>>>>>>>>
>>>>>>>> First to clarify how biased-locking works. I'm unclear when an 
>>>>>>>> object can be rebiased after having its bias revoked? This 
>>>>>>>> particularly relates to some of your assertions (as Markus has
>>>>>>>> queried) after the CAS to update the mark so that the bias is 
>>>>>>>> revoked, and you then re-read the mark and assert the bias has 
>>>>>>>> been revoked - what stops another thread from rebiasing the 
>>>>>>>> object in between those two statements? Is it that rebiasing 
>>>>>>>> cannot happen, or that it could only happen if there were an 
>>>>>>>> intervening safepoint which in turn cannot happen?
>>>>>>> Once the bias of the object is revoked it will stay like that 
>>>>>>> forever, it cannot happen that it goes back to having the 0x5 
>>>>>>> pattern.
>>>>>>> Also, once the bias pattern in the prototype header for a class 
>>>>>>> is revoked during a bulk revocation operation, if there is an 
>>>>>>> object of that class that still has the bias pattern, a 
>>>>>>> JavaThread that wants to synchronize on that object will always 
>>>>>>> revoke the bias first. This is why I don't check if the CAS 
>>>>>>> succeeded if the prototype of the class does not has the bias 
>>>>>>> pattern, I just assert that the object is not biased anymore.
>>>>>>>
>>>>>>> Below I describe the cases where an object can be rebiased.
>>>>>>>
>>>>>>> Once a JavaThread biases an object for the first time, there are 
>>>>>>> two cases that allows for that object to be rebiased:
>>>>>>> 1) If the epoch in the markword becomes invalid. For this to 
>>>>>>> happen a bulk rebias operation is needed. This is why I do check 
>>>>>>> if the CAS succeeded or not for these cases, since some other 
>>>>>>> JavaThread could have rebiased it.
>>>>>>> 2) During a full GC, objects that are biased ( some JavaThread 
>>>>>>> is set in the biaser bits) could have their markword be reset to 
>>>>>>> 0x5. This means they will become anonymously biased again and so 
>>>>>>> will look as if they were not biased yet. As to how this logic
>>>>>>> works: At the beginning of the full GC,
>>>>>>> BiasedLocking::preserve_marks() saves all the markwords for 
>>>>>>> those objects that are currently locked and have a bias pattern.
>>>>>>> After that, markOopDesc::must_be_preserved_with_bias() will be 
>>>>>>> called to decide if the markword of an object should be 
>>>>>>> preserved or not. If the markword contains the bias pattern it 
>>>>>>> is never preserved. At the end BiasedLocking::restore_marks() is 
>>>>>>> called to restore the marks for those objects that we saved 
>>>>>>> before. So this means that even if an object has a valid biaser, 
>>>>>>> with valid epoch, if the object is not currently locked it could 
>>>>>>> be reset during the GC. I'm not sure though if whenever
>>>>>>> markOopDesc::must_be_preserved_with_bias() returns false the 
>>>>>>> garbage collector always does the reset or it just means it 
>>>>>>> could reset it if it wants to. In any case I've seen that reset 
>>>>>>> happening when doing handshakes. In fact, this is one of the 
>>>>>>> reasons why the handshake could return that the bias was not 
>>>>>>> revoked, since I don't check for the anonymously biased case in 
>>>>>>> RevokeOneBias.
>>>>>>
>>>>>> Thanks for that very detailed set of descriptions. I won't 
>>>>>> pretend to fully grok all the details as I'm not completely clear 
>>>>>> on the role of the "epoch" or being anonymously biased, but I'm 
>>>>>> convinced you have a full understanding of such things. :) In 
>>>>>> revoke_and_rebias it was always a struggle for me to figure out 
>>>>>> exactly when the "rebias" part could come into play.
>>>>>>
>>>>>>>
>>>>>>>> The main concern with a change like this (as with all the 
>>>>>>>> handshake changes) is what new races this may allow and whether 
>>>>>>>> they have all been accounted for. IIUC the handshake will still 
>>>>>>>> be conducted by the VMThread so that still ensures 
>>>>>>>> serialization wrt. safepoints (which makes it simpler to reason 
>>>>>>>> about things). I've looked at some of the races you anticipated 
>>>>>>>> (like the "resurrected" thread) and they seem to be handled 
>>>>>>>> correctly. I'm unable to construct other races that might be 
>>>>>>>> problematic (but that isn't saying a lot :) ).
>>>>>>> I agree that since we are now doing the revocation outside 
>>>>>>> safepoints there is potential for additional races. But also one 
>>>>>>> thing to note is that RevokeOneBias, which contains the logic of 
>>>>>>> the handshake and is now replacing what we used to do at a 
>>>>>>> safepoint, is not really different from the initial code in
>>>>>>> revoke_and_rebias() which is done outside safepoints. The 
>>>>>>> handshake logic is like executing that initial part but with the 
>>>>>>> right JavaThread so that if the object has a valid biaser, then 
>>>>>>> that biaser is either ourselves or we are the VMThread while the 
>>>>>>> biaser is blocked, so that we can execute revoke_own_lock(). In 
>>>>>>> fact I was thinking at some point to combine them in some method 
>>>>>>> (maybe try_fast_revoke()). The attempt_rebias flag and the
>>>>>>> update_heuristics() in revoke_and_rebias() complicated things so 
>>>>>>> I kept them separate.
>>>>>>> I have also tried to think on all possible racy scenarios and 
>>>>>>> couldn't find additional problems beside the "resurrected 
>>>>>>> thread" one (although it's also not a guarantee of anything).
>>>>>>> But that's why I was thinking to check this in 14, so that if 
>>>>>>> there are any problems we have plenty of testing time to detect 
>>>>>>> them.
>>>>>>
>>>>>> Yes that is a good idea. No need to rush this into 13.
>>>>>>
>>>>>>>
>>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>>
>>>>>>>> Is it the case that this event is now never generated from a 
>>>>>>>> safepoint? Or have you just deleted the safepoint_id from the 
>>>>>>>> event because it might not be at a safepoint? If the latter 
>>>>>>>> can't we keep it and use 0 to indicate "not at a safepoint"? I 
>>>>>>>> think the JFR folk need to comment on this part of the change 
>>>>>>>> anyway.
>>>>>>> This event will be created and commited only from 
>>>>>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual 
>>>>>>> handshake that revoked the bias could be executed at a safepoint 
>>>>>>> only if ThreadLocalHandshakes is false. But I understand that 
>>>>>>> this is true for all platforms so the handshake should always be 
>>>>>>> executed outside safepoints.
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>>> It would be great if JFR folks review this part.
>>>>>>
>>>>>> Try to grab Markus :)
>>>>>>
>>>>>>>
>>>>>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>>>>>
>>>>>>>> I second Dan's comment about combining cleanup and code motion 
>>>>>>>> in a big change like this - it does make it much harder to spot 
>>>>>>>> the real difference.
>>>>>>> Ok, already two objections on this so I'll revert moving the 
>>>>>>> heuristics part. I think I also moved
>>>>>>> clean_up_cached_monitor_info() and I will double check any other 
>>>>>>> movements.
>>>>>>>
>>>>>>>
>>>>>>>> I note Dan picked up on the lack of p2i and other stuff related 
>>>>>>>> to the logging statements, and that you indicated they were 
>>>>>>>> fixed. I note that all that stuff is pre-existing so I'm 
>>>>>>>> unclear now whether you have fixed all the logging in the file 
>>>>>>>> or only the statements in the code you have changed or added?
>>>>>>>> Again such cleanup may be best done separately.
>>>>>>> I haven't fixed the existing log statements, only the ones Dan 
>>>>>>> mentioned which are in single_revoke_with_handshake(), 
>>>>>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can fix 
>>>>>>> the other ones in a cleanup later along with code movement and 
>>>>>>> the removal of the attemp_rebias flag which we are not using.
>>>>>>
>>>>>> Okay. To be clear I don't expect you to fix all the existing uses 
>>>>>> I just wanted to clarify which ones you had fixed.
>>>>>>
>>>>>>>
>>>>>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread*
>>>>>>>> biased_locker) {
>>>>>>>> 641 assert(!SafepointSynchronize::is_at_safepoint() || 
>>>>>>>> !ThreadLocalHandshakes,
>>>>>>>> 642          "if ThreadLocalHandshakes is enabled this should 
>>>>>>>> always be executed outside safepoints");
>>>>>>>> 643   assert(Thread::current() == biased_locker || 
>>>>>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>>>>>> 644
>>>>>>>>
>>>>>>>> This is called "revoke_own_lock" but it can also be executed by 
>>>>>>>> the VMThread - so its not its own lock. Also we don't revoke 
>>>>>>>> anything related to a "lock" - we revoke a bias from the 
>>>>>>>> markword of an oop. I think a better name is needed.
>>>>>>> Yes, I didn't really like it either. How about
>>>>>>> walk_stack_and_revoke() ?
>>>>>>
>>>>>> That sounds good to me. Roll on v2 :)
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>
>>>>>>>> src/hotspot/share/runtime/handshake.cpp
>>>>>>>>
>>>>>>>>  125       log_trace(handshake)("JavaThread " INTPTR_FORMAT " 
>>>>>>>> is not alive", (intptr_t)_target);
>>>>>>>>
>>>>>>>> Use p2i(_target) rather than cast to intptr_t.
>>>>>>> Fixed.
>>>>>>>
>>>>>>>
>>>>>>>> That's all from me.
>>>>>>> Thanks for looking into this David! If you are okay with the 
>>>>>>> "walk_stack_and_revoke()" name then I can send v2.
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>
>>>>>>> Patricio
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Could you review this patch that uses thread local handshakes 
>>>>>>>>> instead of safepoints to revoke the biases of locked objects?
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>>>>>> Bug:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Today whenever a JavaThread needs to revoke the bias of an 
>>>>>>>>> object that has been biased by another JavaThread (and where 
>>>>>>>>> the epoch is still valid and the prototype header of the class 
>>>>>>>>> still has the bias pattern) it needs to request a safepoint 
>>>>>>>>> operation. The VMThread inside the safepoint walks the stack 
>>>>>>>>> of the biaser looking for lock records associated with the 
>>>>>>>>> biased object, and converts them to thin locks if any are found.
>>>>>>>>> This patch uses thread local handshakes instead, since we 
>>>>>>>>> actually only need to be able to safely walk the stack of the 
>>>>>>>>> JavaThread that biased the object and not other JavaThreads.
>>>>>>>>>
>>>>>>>>> Some notes about the patch:
>>>>>>>>>
>>>>>>>>> - Thanks to Robbin for initial work on this patch and for 
>>>>>>>>> advice and feedback!
>>>>>>>>> - We still execute bulk rebias and bulk revoke operations 
>>>>>>>>> inside safepoints, since in those cases all the JavaThread's 
>>>>>>>>> stacks need to be walked to potentially update lock records.
>>>>>>>>> - The method revoke_bias() was renamed to 
>>>>>>>>> single_revoke_at_safepoint(). This method is still kept 
>>>>>>>>> because there are places where we check whether we are already 
>>>>>>>>> at safepoint when trying to revoke. In those cases, if we are 
>>>>>>>>> already at a safepoint we simply end up calling this method.
>>>>>>>>> - Handshakes are executed as VMOperations so the VMThread is 
>>>>>>>>> still involved in the revocation. This means we cannot have 
>>>>>>>>> different revocations being executed in parallel (same as with 
>>>>>>>>> safepoints). Ideally we would like to execute thread local 
>>>>>>>>> handshakes without needing for the VMThread to participate.
>>>>>>>>> However, now other JavaThreads that do not participate in the 
>>>>>>>>> revocation are allow to continue making progress.
>>>>>>>>>
>>>>>>>>> Run several benchmarks and mostly performance seems 
>>>>>>>>> unaffected. Measured the average time it takes for revoking 
>>>>>>>>> bias with a handshake and with a safepoint and numbers are 
>>>>>>>>> pretty similar varying between benchmarks. Some numbers are 
>>>>>>>>> shown below:
>>>>>>>>>
>>>>>>>>> specjbb2015
>>>>>>>>>                       Handshakes      Safepoints Linux                        
>>>>>>>>> 4ms            4.6ms Windows                 11ms             
>>>>>>>>> 19ms
>>>>>>>>>
>>>>>>>>> startup benchmarks
>>>>>>>>>                      Handshakes      Safepoints Linux                    
>>>>>>>>> 159us             248us Windows               150us             
>>>>>>>>> 111us
>>>>>>>>>
>>>>>>>>> Overall the variation is not enough to show significant 
>>>>>>>>> difference in performance, considering also that revocations 
>>>>>>>>> of a valid biaser are usually a fraction of the overall 
>>>>>>>>> running time of a benchmark (specially jbb2015). In any case 
>>>>>>>>> using handshakes allows other JavaThreads to make progress 
>>>>>>>>> during that time, minimizing STW operations.
>>>>>>>>>
>>>>>>>>> In terms of testing, the patch passed several runs of tiers1-6 
>>>>>>>>> in mach5 on Windows, Linux, MacOS and Solaris.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Patricio
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>