RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Mon Jun 17 22:55:17 UTC 2019

Hi Coleen,

On 6/17/19 4:55 PM, coleen.phillimore at oracle.com wrote:
>
> On 6/17/19 2:14 PM, Patricio Chilano wrote:
>> Hi Coleen,
>>
>> On 6/14/19 7:08 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Sorry for being late to the party.
>>>
>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/src/hotspot/share/runtime/biasedLocking.cpp.frames.html 
>>>
>>>
>>> 586 if (_biased_locker == mark->biased_locker()) {
>>> 587 if (mark->bias_epoch() == prototype->bias_epoch()) {
>>>
>>> Can you add a comment what this means?   The object's biased locker 
>>> matches what we thought it was, and the epoch being the same means?
>>> The epoch being equal means that this biaser actually might have 
>>> this lock?   A comment would be good here.
>> Yes, if the epoch is still valid it means the biaser could be 
>> currently synchronized on this object. If that's the case then we 
>> must walk its stack and change those monitor records into thin locks. 
>> Added comment.
>>
>>
>>> 785 mark = res_mark; // Refresh mark with the latest value.
>>>
>>>
>>> I don't see what this does either. I had to download your patch. 
>>> 'mark' isn't used outside the loop and it is reloaded at the top of 
>>> the loop.
>> If the CAS fails, the mark needs to be updated with the new value so 
>> that when we get the current biaser (HR_SINGLE_REVOKE case) we 
>> actually get the updated biaser and not the old one. If we don't do 
>> that we could be handshaking the wrong thread, or worst we could hit 
>> an assert in walk_stack_and_revoke() for the "blt == THREAD", since 
>> the old thread could be ourselves.
>
> I found where it's used below now.  Maybe once this function is 
> refactored a bit, it'll be easier to see for next time.  Looks good!
>>
>>
>>> 796 obj->cas_set_mark(prototype_header->set_age(mark->age()), mark);
>>>
>>>
>>> As an later enhancement, there should be some inline function in 
>>> markOop that returns the prototype header preserving the age of the 
>>> object, but I'll leave it to you to name.
>> Ok, sounds good. That particular line was preexistent but I did added 
>> in some places "markOopDesc::prototype()->set_age(mark->age())", 
>> which is doing the same thing.
>>
>>
>>> 555 RevokeOneBias(Handle* obj, JavaThread* requesting_thread, 
>>> JavaThread* biased_locker) ... 565 oop o = (*_obj)();
>>>
>>>
>>> This was pre-existing your change, but passing Handle* is not 
>>> generally done, and is suspicious when it is because it must be 
>>> allocated with the thread calling the function.  Can you change this 
>>> to Handle (not pointer)?  I can't think why this would be done this 
>>> way.
>> Fixed.
>>
>>
>
> Great.  I was afraid there was some subtlety I didn't see.  Please 
> retest with this though just in case.  Sometimes bits of wierdness 
> have a strange reason that isn't documented.
Yes, I'll retest just in case.

>>> 870 // All objects in objs should be locked by biaser
>>> 871 void BiasedLocking::revoke(GrowableArray<Handle>* objs, 
>>> JavaThread *biaser) {
>>>
>>>
>>> I don't see why BiasedLocking::revoke() and 
>>> BiasedLocking::revoke_at_safepoint() are so different.
>>>
>>> The name "revoke" should be something more descriptive of the 
>>> situation though, like revoke_for_current_thread() or something like 
>>> that (revoke_at_safepoint's objects are from the stack too for the 
>>> current thread...)  I keep thinking "revoke" should be a leaf 
>>> function in biasedLocking.
>> Yes, nice observation. Method 
>> revoke_at_safepoint(GrowableArray<Handle>* objs) could be removed and 
>> we could just use BiasedLocking::revoke(GrowableArray<Handle>* objs 
>> ...) instead, since it's called from deoptimization.cpp where all the 
>> objects in the array belong to the same JavaThread. The difference is 
>> that we don't do update_heuristics() for the non-safepoint case since 
>> it might trigger a bulk operation. For the safepoint case it doesn't 
>> matter because we are already at one, we don't have the overhead of 
>> requesting it. But I could combine them into one method and do the 
>> update_heuristics() only if we are at a safepoint, what do you think?
>
> You could file a follow-up RFE for this if you want, since the current 
> version has gone through all the testing.
I need to test again anyways and I think this could be a nice 
simplification. I'll test it and try to include it in v3.

> The change looks great!
Thanks Coleen! : )

Patricio
> Coleen
>>
>>> The change looks really good to me and I look forward to further 
>>> cleanups so maybe it'll make sense someday!
>> Thanks for looking at this Coleen!
>>
>>
>> Patricio
>>> Thanks!
>>> Coleen
>>>
>>> On 6/7/19 12:56 AM, Patricio Chilano wrote:
>>>> Hi all,
>>>>
>>>> Here is v02 addressing comments made by Dan and David.
>>>>
>>>> Full webrev:
>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/ 
>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
>>>> Inc webrev:
>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/ 
>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
>>>>
>>>> Thanks!
>>>>
>>>> Patricio
>>>>
>>>> On 6/6/19 7:37 PM, David Holmes wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>>>>> Hi David,
>>>>>>
>>>>>>
>>>>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>>>>> Hi Patricio,
>>>>>>>
>>>>>>> First thanks for taking this on!
>>>>>>>
>>>>>>> I have some higher-level general discussion around this before 
>>>>>>> deep diving into the actual code review (not that I have much 
>>>>>>> there either :)).
>>>>>>>
>>>>>>> First to clarify how biased-locking works. I'm unclear when an 
>>>>>>> object can be rebiased after having its bias revoked? This 
>>>>>>> particularly relates to some of your assertions (as Markus has 
>>>>>>> queried) after the CAS to update the mark so that the bias is 
>>>>>>> revoked, and you then re-read the mark and assert the bias has 
>>>>>>> been revoked - what stops another thread from rebiasing the 
>>>>>>> object in between those two statements? Is it that rebiasing 
>>>>>>> cannot happen, or that it could only happen if there were an 
>>>>>>> intervening safepoint which in turn cannot happen?
>>>>>> Once the bias of the object is revoked it will stay like that 
>>>>>> forever, it cannot happen that it goes back to having the 0x5 
>>>>>> pattern.
>>>>>> Also, once the bias pattern in the prototype header for a class 
>>>>>> is revoked during a bulk revocation operation, if there is an 
>>>>>> object of that class that still has the bias pattern, a 
>>>>>> JavaThread that wants to synchronize on that object will always 
>>>>>> revoke the bias first. This is why I don't check if the CAS 
>>>>>> succeeded if the prototype of the class does not has the bias 
>>>>>> pattern, I just assert that the object is not biased anymore.
>>>>>>
>>>>>> Below I describe the cases where an object can be rebiased.
>>>>>>
>>>>>> Once a JavaThread biases an object for the first time, there are 
>>>>>> two cases that allows for that object to be rebiased:
>>>>>> 1) If the epoch in the markword becomes invalid. For this to 
>>>>>> happen a bulk rebias operation is needed. This is why I do check 
>>>>>> if the CAS succeeded or not for these cases, since some other 
>>>>>> JavaThread could have rebiased it.
>>>>>> 2) During a full GC, objects that are biased ( some JavaThread is 
>>>>>> set in the biaser bits) could have their markword be reset to 
>>>>>> 0x5. This means they will become anonymously biased again and so 
>>>>>> will look as if they were not biased yet. As to how this logic 
>>>>>> works: At the beginning of the full GC, 
>>>>>> BiasedLocking::preserve_marks() saves all the markwords for those 
>>>>>> objects that are currently locked and have a bias pattern. After 
>>>>>> that, markOopDesc::must_be_preserved_with_bias() will be called 
>>>>>> to decide if the markword of an object should be preserved or 
>>>>>> not. If the markword contains the bias pattern it is never 
>>>>>> preserved. At the end BiasedLocking::restore_marks() is called to 
>>>>>> restore the marks for those objects that we saved before. So this 
>>>>>> means that even if an object has a valid biaser, with valid 
>>>>>> epoch, if the object is not currently locked it could be reset 
>>>>>> during the GC. I'm not sure though if whenever 
>>>>>> markOopDesc::must_be_preserved_with_bias() returns false the 
>>>>>> garbage collector always does the reset or it just means it could 
>>>>>> reset it if it wants to. In any case I've seen that reset 
>>>>>> happening when doing handshakes. In fact, this is one of the 
>>>>>> reasons why the handshake could return that the bias was not 
>>>>>> revoked, since I don't check for the anonymously biased case in 
>>>>>> RevokeOneBias.
>>>>>
>>>>> Thanks for that very detailed set of descriptions. I won't pretend 
>>>>> to fully grok all the details as I'm not completely clear on the 
>>>>> role of the "epoch" or being anonymously biased, but I'm convinced 
>>>>> you have a full understanding of such things. :) In 
>>>>> revoke_and_rebias it was always a struggle for me to figure out 
>>>>> exactly when the "rebias" part could come into play.
>>>>>
>>>>>>
>>>>>>> The main concern with a change like this (as with all the 
>>>>>>> handshake changes) is what new races this may allow and whether 
>>>>>>> they have all been accounted for. IIUC the handshake will still 
>>>>>>> be conducted by the VMThread so that still ensures serialization 
>>>>>>> wrt. safepoints (which makes it simpler to reason about things). 
>>>>>>> I've looked at some of the races you anticipated (like the 
>>>>>>> "resurrected" thread) and they seem to be handled correctly. I'm 
>>>>>>> unable to construct other races that might be problematic (but 
>>>>>>> that isn't saying a lot :) ).
>>>>>> I agree that since we are now doing the revocation outside 
>>>>>> safepoints there is potential for additional races. But also one 
>>>>>> thing to note is that RevokeOneBias, which contains the logic of 
>>>>>> the handshake and is now replacing what we used to do at a 
>>>>>> safepoint, is not really different from the initial code in 
>>>>>> revoke_and_rebias() which is done outside safepoints. The 
>>>>>> handshake logic is like executing that initial part but with the 
>>>>>> right JavaThread so that if the object has a valid biaser, then 
>>>>>> that biaser is either ourselves or we are the VMThread while the 
>>>>>> biaser is blocked, so that we can execute revoke_own_lock(). In 
>>>>>> fact I was thinking at some point to combine them in some method 
>>>>>> (maybe try_fast_revoke()). The attempt_rebias flag and the 
>>>>>> update_heuristics() in revoke_and_rebias() complicated things so 
>>>>>> I kept them separate.
>>>>>> I have also tried to think on all possible racy scenarios and 
>>>>>> couldn't find additional problems beside the "resurrected thread" 
>>>>>> one (although it's also not a guarantee of anything). But that's 
>>>>>> why I was thinking to check this in 14, so that if there are any 
>>>>>> problems we have plenty of testing time to detect them.
>>>>>
>>>>> Yes that is a good idea. No need to rush this into 13.
>>>>>
>>>>>>
>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>
>>>>>>> Is it the case that this event is now never generated from a 
>>>>>>> safepoint? Or have you just deleted the safepoint_id from the 
>>>>>>> event because it might not be at a safepoint? If the latter 
>>>>>>> can't we keep it and use 0 to indicate "not at a safepoint"? I 
>>>>>>> think the JFR folk need to comment on this part of the change 
>>>>>>> anyway.
>>>>>> This event will be created and commited only from 
>>>>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual 
>>>>>> handshake that revoked the bias could be executed at a safepoint 
>>>>>> only if ThreadLocalHandshakes is false. But I understand that 
>>>>>> this is true for all platforms so the handshake should always be 
>>>>>> executed outside safepoints.
>>>>>
>>>>> Ok.
>>>>>
>>>>>> It would be great if JFR folks review this part.
>>>>>
>>>>> Try to grab Markus :)
>>>>>
>>>>>>
>>>>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>>>>
>>>>>>> I second Dan's comment about combining cleanup and code motion 
>>>>>>> in a big change like this - it does make it much harder to spot 
>>>>>>> the real difference.
>>>>>> Ok, already two objections on this so I'll revert moving the 
>>>>>> heuristics part. I think I also moved 
>>>>>> clean_up_cached_monitor_info() and I will double check any other 
>>>>>> movements.
>>>>>>
>>>>>>
>>>>>>> I note Dan picked up on the lack of p2i and other stuff related 
>>>>>>> to the logging statements, and that you indicated they were 
>>>>>>> fixed. I note that all that stuff is pre-existing so I'm unclear 
>>>>>>> now whether you have fixed all the logging in the file or only 
>>>>>>> the statements in the code you have changed or added? Again such 
>>>>>>> cleanup may be best done separately.
>>>>>> I haven't fixed the existing log statements, only the ones Dan 
>>>>>> mentioned which are in single_revoke_with_handshake(), 
>>>>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can fix 
>>>>>> the other ones in a cleanup later along with code movement and 
>>>>>> the removal of the attemp_rebias flag which we are not using.
>>>>>
>>>>> Okay. To be clear I don't expect you to fix all the existing uses 
>>>>> I just wanted to clarify which ones you had fixed.
>>>>>
>>>>>>
>>>>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread* 
>>>>>>> biased_locker) {
>>>>>>> 641   assert(!SafepointSynchronize::is_at_safepoint() || 
>>>>>>> !ThreadLocalHandshakes,
>>>>>>> 642          "if ThreadLocalHandshakes is enabled this should 
>>>>>>> always be executed outside safepoints");
>>>>>>> 643   assert(Thread::current() == biased_locker || 
>>>>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>>>>> 644
>>>>>>>
>>>>>>> This is called "revoke_own_lock" but it can also be executed by 
>>>>>>> the VMThread - so its not its own lock. Also we don't revoke 
>>>>>>> anything related to a "lock" - we revoke a bias from the 
>>>>>>> markword of an oop. I think a better name is needed.
>>>>>> Yes, I didn't really like it either. How about 
>>>>>> walk_stack_and_revoke() ?
>>>>>
>>>>> That sounds good to me. Roll on v2 :)
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>>
>>>>>>> src/hotspot/share/runtime/handshake.cpp
>>>>>>>
>>>>>>>  125       log_trace(handshake)("JavaThread " INTPTR_FORMAT " is 
>>>>>>> not alive", (intptr_t)_target);
>>>>>>>
>>>>>>> Use p2i(_target) rather than cast to intptr_t.
>>>>>> Fixed.
>>>>>>
>>>>>>
>>>>>>> That's all from me.
>>>>>> Thanks for looking into this David! If you are okay with the 
>>>>>> "walk_stack_and_revoke()" name then I can send v2.
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> Patricio
>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>
>>>>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Could you review this patch that uses thread local handshakes 
>>>>>>>> instead of safepoints to revoke the biases of locked objects?
>>>>>>>>
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>>>>> Bug:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>>>>
>>>>>>>>
>>>>>>>> Today whenever a JavaThread needs to revoke the bias of an 
>>>>>>>> object that has been biased by another JavaThread (and where 
>>>>>>>> the epoch is still valid and the prototype header of the class 
>>>>>>>> still has the bias pattern) it needs to request a safepoint 
>>>>>>>> operation. The VMThread inside the safepoint walks the stack of 
>>>>>>>> the biaser looking for lock records associated with the biased 
>>>>>>>> object, and converts them to thin locks if any are found.
>>>>>>>> This patch uses thread local handshakes instead, since we 
>>>>>>>> actually only need to be able to safely walk the stack of the 
>>>>>>>> JavaThread that biased the object and not other JavaThreads.
>>>>>>>>
>>>>>>>> Some notes about the patch:
>>>>>>>>
>>>>>>>> - Thanks to Robbin for initial work on this patch and for 
>>>>>>>> advice and feedback!
>>>>>>>> - We still execute bulk rebias and bulk revoke operations 
>>>>>>>> inside safepoints, since in those cases all the JavaThread's 
>>>>>>>> stacks need to be walked to potentially update lock records.
>>>>>>>> - The method revoke_bias() was renamed to 
>>>>>>>> single_revoke_at_safepoint(). This method is still kept because 
>>>>>>>> there are places where we check whether we are already at 
>>>>>>>> safepoint when trying to revoke. In those cases, if we are 
>>>>>>>> already at a safepoint we simply end up calling this method.
>>>>>>>> - Handshakes are executed as VMOperations so the VMThread is 
>>>>>>>> still involved in the revocation. This means we cannot have 
>>>>>>>> different revocations being executed in parallel (same as with 
>>>>>>>> safepoints). Ideally we would like to execute thread local 
>>>>>>>> handshakes without needing for the VMThread to participate. 
>>>>>>>> However, now other JavaThreads that do not participate in the 
>>>>>>>> revocation are allow to continue making progress.
>>>>>>>>
>>>>>>>> Run several benchmarks and mostly performance seems unaffected. 
>>>>>>>> Measured the average time it takes for revoking bias with a 
>>>>>>>> handshake and with a safepoint and numbers are pretty similar 
>>>>>>>> varying between benchmarks. Some numbers are shown below:
>>>>>>>>
>>>>>>>> specjbb2015
>>>>>>>>                       Handshakes      Safepoints
>>>>>>>> Linux                        4ms            4.6ms
>>>>>>>> Windows                 11ms             19ms
>>>>>>>>
>>>>>>>> startup benchmarks
>>>>>>>>                      Handshakes      Safepoints
>>>>>>>> Linux                    159us             248us
>>>>>>>> Windows               150us             111us
>>>>>>>>
>>>>>>>> Overall the variation is not enough to show significant 
>>>>>>>> difference in performance, considering also that revocations of 
>>>>>>>> a valid biaser are usually a fraction of the overall running 
>>>>>>>> time of a benchmark (specially jbb2015). In any case using 
>>>>>>>> handshakes allows other JavaThreads to make progress during 
>>>>>>>> that time, minimizing STW operations.
>>>>>>>>
>>>>>>>> In terms of testing, the patch passed several runs of tiers1-6 
>>>>>>>> in mach5 on Windows, Linux, MacOS and Solaris.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Patricio
>>>>>>>>
>>>>>>
>>>>
>>>
>>
>