RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Tue Jun 11 01:54:37 UTC 2019

Hi David,

On 6/10/19 12:52 AM, David Holmes wrote:
> Hi Patricio,
>
> Updates seem fine. No further comments from me.
Great. Thanks for reviewing this!

Patricio
> Thanks,
> David
>
> On 7/06/2019 2:56 pm, Patricio Chilano wrote:
>> Hi all,
>>
>> Here is v02 addressing comments made by Dan and David.
>>
>> Full webrev:
>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/ 
>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
>> Inc webrev:
>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/ 
>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
>>
>> Thanks!
>>
>> Patricio
>>
>> On 6/6/19 7:37 PM, David Holmes wrote:
>>> Hi Patricio,
>>>
>>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>>> Hi David,
>>>>
>>>>
>>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> First thanks for taking this on!
>>>>>
>>>>> I have some higher-level general discussion around this before 
>>>>> deep diving into the actual code review (not that I have much 
>>>>> there either :)).
>>>>>
>>>>> First to clarify how biased-locking works. I'm unclear when an 
>>>>> object can be rebiased after having its bias revoked? This 
>>>>> particularly relates to some of your assertions (as Markus has 
>>>>> queried) after the CAS to update the mark so that the bias is 
>>>>> revoked, and you then re-read the mark and assert the bias has 
>>>>> been revoked - what stops another thread from rebiasing the object 
>>>>> in between those two statements? Is it that rebiasing cannot 
>>>>> happen, or that it could only happen if there were an intervening 
>>>>> safepoint which in turn cannot happen?
>>>> Once the bias of the object is revoked it will stay like that 
>>>> forever, it cannot happen that it goes back to having the 0x5 pattern.
>>>> Also, once the bias pattern in the prototype header for a class is 
>>>> revoked during a bulk revocation operation, if there is an object 
>>>> of that class that still has the bias pattern, a JavaThread that 
>>>> wants to synchronize on that object will always revoke the bias 
>>>> first. This is why I don't check if the CAS succeeded if the 
>>>> prototype of the class does not has the bias pattern, I just assert 
>>>> that the object is not biased anymore.
>>>>
>>>> Below I describe the cases where an object can be rebiased.
>>>>
>>>> Once a JavaThread biases an object for the first time, there are 
>>>> two cases that allows for that object to be rebiased:
>>>> 1) If the epoch in the markword becomes invalid. For this to happen 
>>>> a bulk rebias operation is needed. This is why I do check if the 
>>>> CAS succeeded or not for these cases, since some other JavaThread 
>>>> could have rebiased it.
>>>> 2) During a full GC, objects that are biased ( some JavaThread is 
>>>> set in the biaser bits) could have their markword be reset to 0x5. 
>>>> This means they will become anonymously biased again and so will 
>>>> look as if they were not biased yet. As to how this logic works: At 
>>>> the beginning of the full GC, BiasedLocking::preserve_marks() saves 
>>>> all the markwords for those objects that are currently locked and 
>>>> have a bias pattern. After that, 
>>>> markOopDesc::must_be_preserved_with_bias() will be called to decide 
>>>> if the markword of an object should be preserved or not. If the 
>>>> markword contains the bias pattern it is never preserved. At the 
>>>> end BiasedLocking::restore_marks() is called to restore the marks 
>>>> for those objects that we saved before. So this means that even if 
>>>> an object has a valid biaser, with valid epoch, if the object is 
>>>> not currently locked it could be reset during the GC. I'm not sure 
>>>> though if whenever markOopDesc::must_be_preserved_with_bias() 
>>>> returns false the garbage collector always does the reset or it 
>>>> just means it could reset it if it wants to. In any case I've seen 
>>>> that reset happening when doing handshakes. In fact, this is one of 
>>>> the reasons why the handshake could return that the bias was not 
>>>> revoked, since I don't check for the anonymously biased case in 
>>>> RevokeOneBias.
>>>
>>> Thanks for that very detailed set of descriptions. I won't pretend 
>>> to fully grok all the details as I'm not completely clear on the 
>>> role of the "epoch" or being anonymously biased, but I'm convinced 
>>> you have a full understanding of such things. :) In 
>>> revoke_and_rebias it was always a struggle for me to figure out 
>>> exactly when the "rebias" part could come into play.
>>>
>>>>
>>>>> The main concern with a change like this (as with all the 
>>>>> handshake changes) is what new races this may allow and whether 
>>>>> they have all been accounted for. IIUC the handshake will still be 
>>>>> conducted by the VMThread so that still ensures serialization wrt. 
>>>>> safepoints (which makes it simpler to reason about things). I've 
>>>>> looked at some of the races you anticipated (like the 
>>>>> "resurrected" thread) and they seem to be handled correctly. I'm 
>>>>> unable to construct other races that might be problematic (but 
>>>>> that isn't saying a lot :) ).
>>>> I agree that since we are now doing the revocation outside 
>>>> safepoints there is potential for additional races. But also one 
>>>> thing to note is that RevokeOneBias, which contains the logic of 
>>>> the handshake and is now replacing what we used to do at a 
>>>> safepoint, is not really different from the initial code in 
>>>> revoke_and_rebias() which is done outside safepoints. The handshake 
>>>> logic is like executing that initial part but with the right 
>>>> JavaThread so that if the object has a valid biaser, then that 
>>>> biaser is either ourselves or we are the VMThread while the biaser 
>>>> is blocked, so that we can execute revoke_own_lock(). In fact I was 
>>>> thinking at some point to combine them in some method (maybe 
>>>> try_fast_revoke()). The attempt_rebias flag and the 
>>>> update_heuristics() in revoke_and_rebias() complicated things so I 
>>>> kept them separate.
>>>> I have also tried to think on all possible racy scenarios and 
>>>> couldn't find additional problems beside the "resurrected thread" 
>>>> one (although it's also not a guarantee of anything). But that's 
>>>> why I was thinking to check this in 14, so that if there are any 
>>>> problems we have plenty of testing time to detect them.
>>>
>>> Yes that is a good idea. No need to rush this into 13.
>>>
>>>>
>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>
>>>>> Is it the case that this event is now never generated from a 
>>>>> safepoint? Or have you just deleted the safepoint_id from the 
>>>>> event because it might not be at a safepoint? If the latter can't 
>>>>> we keep it and use 0 to indicate "not at a safepoint"? I think the 
>>>>> JFR folk need to comment on this part of the change anyway.
>>>> This event will be created and commited only from 
>>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual 
>>>> handshake that revoked the bias could be executed at a safepoint 
>>>> only if ThreadLocalHandshakes is false. But I understand that this 
>>>> is true for all platforms so the handshake should always be 
>>>> executed outside safepoints.
>>>
>>> Ok.
>>>
>>>> It would be great if JFR folks review this part.
>>>
>>> Try to grab Markus :)
>>>
>>>>
>>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>>
>>>>> I second Dan's comment about combining cleanup and code motion in 
>>>>> a big change like this - it does make it much harder to spot the 
>>>>> real difference.
>>>> Ok, already two objections on this so I'll revert moving the 
>>>> heuristics part. I think I also moved 
>>>> clean_up_cached_monitor_info() and I will double check any other 
>>>> movements.
>>>>
>>>>
>>>>> I note Dan picked up on the lack of p2i and other stuff related to 
>>>>> the logging statements, and that you indicated they were fixed. I 
>>>>> note that all that stuff is pre-existing so I'm unclear now 
>>>>> whether you have fixed all the logging in the file or only the 
>>>>> statements in the code you have changed or added? Again such 
>>>>> cleanup may be best done separately.
>>>> I haven't fixed the existing log statements, only the ones Dan 
>>>> mentioned which are in single_revoke_with_handshake(), 
>>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can fix 
>>>> the other ones in a cleanup later along with code movement and the 
>>>> removal of the attemp_rebias flag which we are not using.
>>>
>>> Okay. To be clear I don't expect you to fix all the existing uses I 
>>> just wanted to clarify which ones you had fixed.
>>>
>>>>
>>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread* 
>>>>> biased_locker) {
>>>>> 641   assert(!SafepointSynchronize::is_at_safepoint() || 
>>>>> !ThreadLocalHandshakes,
>>>>> 642          "if ThreadLocalHandshakes is enabled this should 
>>>>> always be executed outside safepoints");
>>>>> 643   assert(Thread::current() == biased_locker || 
>>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>>> 644
>>>>>
>>>>> This is called "revoke_own_lock" but it can also be executed by 
>>>>> the VMThread - so its not its own lock. Also we don't revoke 
>>>>> anything related to a "lock" - we revoke a bias from the markword 
>>>>> of an oop. I think a better name is needed.
>>>> Yes, I didn't really like it either. How about 
>>>> walk_stack_and_revoke() ?
>>>
>>> That sounds good to me. Roll on v2 :)
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>
>>>>> src/hotspot/share/runtime/handshake.cpp
>>>>>
>>>>>  125       log_trace(handshake)("JavaThread " INTPTR_FORMAT " is 
>>>>> not alive", (intptr_t)_target);
>>>>>
>>>>> Use p2i(_target) rather than cast to intptr_t.
>>>> Fixed.
>>>>
>>>>
>>>>> That's all from me.
>>>> Thanks for looking into this David! If you are okay with the 
>>>> "walk_stack_and_revoke()" name then I can send v2.
>>>>
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> Patricio
>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>
>>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Could you review this patch that uses thread local handshakes 
>>>>>> instead of safepoints to revoke the biases of locked objects?
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>>> Bug:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>>
>>>>>>
>>>>>> Today whenever a JavaThread needs to revoke the bias of an object 
>>>>>> that has been biased by another JavaThread (and where the epoch 
>>>>>> is still valid and the prototype header of the class still has 
>>>>>> the bias pattern) it needs to request a safepoint operation. The 
>>>>>> VMThread inside the safepoint walks the stack of the biaser 
>>>>>> looking for lock records associated with the biased object, and 
>>>>>> converts them to thin locks if any are found.
>>>>>> This patch uses thread local handshakes instead, since we 
>>>>>> actually only need to be able to safely walk the stack of the 
>>>>>> JavaThread that biased the object and not other JavaThreads.
>>>>>>
>>>>>> Some notes about the patch:
>>>>>>
>>>>>> - Thanks to Robbin for initial work on this patch and for advice 
>>>>>> and feedback!
>>>>>> - We still execute bulk rebias and bulk revoke operations inside 
>>>>>> safepoints, since in those cases all the JavaThread's stacks need 
>>>>>> to be walked to potentially update lock records.
>>>>>> - The method revoke_bias() was renamed to 
>>>>>> single_revoke_at_safepoint(). This method is still kept because 
>>>>>> there are places where we check whether we are already at 
>>>>>> safepoint when trying to revoke. In those cases, if we are 
>>>>>> already at a safepoint we simply end up calling this method.
>>>>>> - Handshakes are executed as VMOperations so the VMThread is 
>>>>>> still involved in the revocation. This means we cannot have 
>>>>>> different revocations being executed in parallel (same as with 
>>>>>> safepoints). Ideally we would like to execute thread local 
>>>>>> handshakes without needing for the VMThread to participate. 
>>>>>> However, now other JavaThreads that do not participate in the 
>>>>>> revocation are allow to continue making progress.
>>>>>>
>>>>>> Run several benchmarks and mostly performance seems unaffected. 
>>>>>> Measured the average time it takes for revoking bias with a 
>>>>>> handshake and with a safepoint and numbers are pretty similar 
>>>>>> varying between benchmarks. Some numbers are shown below:
>>>>>>
>>>>>> specjbb2015
>>>>>>                       Handshakes      Safepoints
>>>>>> Linux                        4ms            4.6ms
>>>>>> Windows                 11ms             19ms
>>>>>>
>>>>>> startup benchmarks
>>>>>>                      Handshakes      Safepoints
>>>>>> Linux                    159us             248us
>>>>>> Windows               150us             111us
>>>>>>
>>>>>> Overall the variation is not enough to show significant 
>>>>>> difference in performance, considering also that revocations of a 
>>>>>> valid biaser are usually a fraction of the overall running time 
>>>>>> of a benchmark (specially jbb2015). In any case using handshakes 
>>>>>> allows other JavaThreads to make progress during that time, 
>>>>>> minimizing STW operations.
>>>>>>
>>>>>> In terms of testing, the patch passed several runs of tiers1-6 in 
>>>>>> mach5 on Windows, Linux, MacOS and Solaris.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Patricio
>>>>>>
>>>>
>>