RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Tue Jun 11 18:13:14 UTC 2019

Patricio,

Sorry for the delay in getting back to this re-review.

On 6/7/19 12:56 AM, Patricio Chilano wrote:
> Hi all,
>
> Here is v02 addressing comments made by Dan and David.
>
> Full webrev:
> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/ 
> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
> Inc webrev:
> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/ 
> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>

src/hotspot/share/runtime/biasedLocking.hpp
     No comments.

src/hotspot/share/runtime/biasedLocking.cpp
     No comments.

src/hotspot/share/runtime/handshake.cpp
     No comments.

test/jdk/jdk/jfr/event/runtime/TestBiasedLockRevocationEvents.java
     No comments.

Thumbs up!

Dan

>
> Thanks!
>
> Patricio
>
> On 6/6/19 7:37 PM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>> Hi David,
>>>
>>>
>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>> Hi Patricio,
>>>>
>>>> First thanks for taking this on!
>>>>
>>>> I have some higher-level general discussion around this before deep 
>>>> diving into the actual code review (not that I have much there 
>>>> either :)).
>>>>
>>>> First to clarify how biased-locking works. I'm unclear when an 
>>>> object can be rebiased after having its bias revoked? This 
>>>> particularly relates to some of your assertions (as Markus has 
>>>> queried) after the CAS to update the mark so that the bias is 
>>>> revoked, and you then re-read the mark and assert the bias has been 
>>>> revoked - what stops another thread from rebiasing the object in 
>>>> between those two statements? Is it that rebiasing cannot happen, 
>>>> or that it could only happen if there were an intervening safepoint 
>>>> which in turn cannot happen?
>>> Once the bias of the object is revoked it will stay like that 
>>> forever, it cannot happen that it goes back to having the 0x5 pattern.
>>> Also, once the bias pattern in the prototype header for a class is 
>>> revoked during a bulk revocation operation, if there is an object of 
>>> that class that still has the bias pattern, a JavaThread that wants 
>>> to synchronize on that object will always revoke the bias first. 
>>> This is why I don't check if the CAS succeeded if the prototype of 
>>> the class does not has the bias pattern, I just assert that the 
>>> object is not biased anymore.
>>>
>>> Below I describe the cases where an object can be rebiased.
>>>
>>> Once a JavaThread biases an object for the first time, there are two 
>>> cases that allows for that object to be rebiased:
>>> 1) If the epoch in the markword becomes invalid. For this to happen 
>>> a bulk rebias operation is needed. This is why I do check if the CAS 
>>> succeeded or not for these cases, since some other JavaThread could 
>>> have rebiased it.
>>> 2) During a full GC, objects that are biased ( some JavaThread is 
>>> set in the biaser bits) could have their markword be reset to 0x5. 
>>> This means they will become anonymously biased again and so will 
>>> look as if they were not biased yet. As to how this logic works: At 
>>> the beginning of the full GC, BiasedLocking::preserve_marks() saves 
>>> all the markwords for those objects that are currently locked and 
>>> have a bias pattern. After that, 
>>> markOopDesc::must_be_preserved_with_bias() will be called to decide 
>>> if the markword of an object should be preserved or not. If the 
>>> markword contains the bias pattern it is never preserved. At the end 
>>> BiasedLocking::restore_marks() is called to restore the marks for 
>>> those objects that we saved before. So this means that even if an 
>>> object has a valid biaser, with valid epoch, if the object is not 
>>> currently locked it could be reset during the GC. I'm not sure 
>>> though if whenever markOopDesc::must_be_preserved_with_bias() 
>>> returns false the garbage collector always does the reset or it just 
>>> means it could reset it if it wants to. In any case I've seen that 
>>> reset happening when doing handshakes. In fact, this is one of the 
>>> reasons why the handshake could return that the bias was not 
>>> revoked, since I don't check for the anonymously biased case in 
>>> RevokeOneBias.
>>
>> Thanks for that very detailed set of descriptions. I won't pretend to 
>> fully grok all the details as I'm not completely clear on the role of 
>> the "epoch" or being anonymously biased, but I'm convinced you have a 
>> full understanding of such things. :) In revoke_and_rebias it was 
>> always a struggle for me to figure out exactly when the "rebias" part 
>> could come into play.
>>
>>>
>>>> The main concern with a change like this (as with all the handshake 
>>>> changes) is what new races this may allow and whether they have all 
>>>> been accounted for. IIUC the handshake will still be conducted by 
>>>> the VMThread so that still ensures serialization wrt. safepoints 
>>>> (which makes it simpler to reason about things). I've looked at 
>>>> some of the races you anticipated (like the "resurrected" thread) 
>>>> and they seem to be handled correctly. I'm unable to construct 
>>>> other races that might be problematic (but that isn't saying a lot 
>>>> :) ).
>>> I agree that since we are now doing the revocation outside 
>>> safepoints there is potential for additional races. But also one 
>>> thing to note is that RevokeOneBias, which contains the logic of the 
>>> handshake and is now replacing what we used to do at a safepoint, is 
>>> not really different from the initial code in revoke_and_rebias() 
>>> which is done outside safepoints. The handshake logic is like 
>>> executing that initial part but with the right JavaThread so that if 
>>> the object has a valid biaser, then that biaser is either ourselves 
>>> or we are the VMThread while the biaser is blocked, so that we can 
>>> execute revoke_own_lock(). In fact I was thinking at some point to 
>>> combine them in some method (maybe try_fast_revoke()). The 
>>> attempt_rebias flag and the update_heuristics() in 
>>> revoke_and_rebias() complicated things so I kept them separate.
>>> I have also tried to think on all possible racy scenarios and 
>>> couldn't find additional problems beside the "resurrected thread" 
>>> one (although it's also not a guarantee of anything). But that's why 
>>> I was thinking to check this in 14, so that if there are any 
>>> problems we have plenty of testing time to detect them.
>>
>> Yes that is a good idea. No need to rush this into 13.
>>
>>>
>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>
>>>> Is it the case that this event is now never generated from a 
>>>> safepoint? Or have you just deleted the safepoint_id from the event 
>>>> because it might not be at a safepoint? If the latter can't we keep 
>>>> it and use 0 to indicate "not at a safepoint"? I think the JFR folk 
>>>> need to comment on this part of the change anyway.
>>> This event will be created and commited only from 
>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual 
>>> handshake that revoked the bias could be executed at a safepoint 
>>> only if ThreadLocalHandshakes is false. But I understand that this 
>>> is true for all platforms so the handshake should always be executed 
>>> outside safepoints.
>>
>> Ok.
>>
>>> It would be great if JFR folks review this part.
>>
>> Try to grab Markus :)
>>
>>>
>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>
>>>> I second Dan's comment about combining cleanup and code motion in a 
>>>> big change like this - it does make it much harder to spot the real 
>>>> difference.
>>> Ok, already two objections on this so I'll revert moving the 
>>> heuristics part. I think I also moved clean_up_cached_monitor_info() 
>>> and I will double check any other movements.
>>>
>>>
>>>> I note Dan picked up on the lack of p2i and other stuff related to 
>>>> the logging statements, and that you indicated they were fixed. I 
>>>> note that all that stuff is pre-existing so I'm unclear now whether 
>>>> you have fixed all the logging in the file or only the statements 
>>>> in the code you have changed or added? Again such cleanup may be 
>>>> best done separately.
>>> I haven't fixed the existing log statements, only the ones Dan 
>>> mentioned which are in single_revoke_with_handshake(), 
>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can fix the 
>>> other ones in a cleanup later along with code movement and the 
>>> removal of the attemp_rebias flag which we are not using.
>>
>> Okay. To be clear I don't expect you to fix all the existing uses I 
>> just wanted to clarify which ones you had fixed.
>>
>>>
>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread* 
>>>> biased_locker) {
>>>> 641   assert(!SafepointSynchronize::is_at_safepoint() || 
>>>> !ThreadLocalHandshakes,
>>>> 642          "if ThreadLocalHandshakes is enabled this should 
>>>> always be executed outside safepoints");
>>>> 643   assert(Thread::current() == biased_locker || 
>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>> 644
>>>>
>>>> This is called "revoke_own_lock" but it can also be executed by the 
>>>> VMThread - so its not its own lock. Also we don't revoke anything 
>>>> related to a "lock" - we revoke a bias from the markword of an oop. 
>>>> I think a better name is needed.
>>> Yes, I didn't really like it either. How about 
>>> walk_stack_and_revoke() ?
>>
>> That sounds good to me. Roll on v2 :)
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>>> src/hotspot/share/runtime/handshake.cpp
>>>>
>>>>  125       log_trace(handshake)("JavaThread " INTPTR_FORMAT " is 
>>>> not alive", (intptr_t)_target);
>>>>
>>>> Use p2i(_target) rather than cast to intptr_t.
>>> Fixed.
>>>
>>>
>>>> That's all from me.
>>> Thanks for looking into this David! If you are okay with the 
>>> "walk_stack_and_revoke()" name then I can send v2.
>>>
>>>
>>> Thanks!
>>>
>>>
>>> Patricio
>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>
>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>> Hi all,
>>>>>
>>>>> Could you review this patch that uses thread local handshakes 
>>>>> instead of safepoints to revoke the biases of locked objects?
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>
>>>>>
>>>>> Today whenever a JavaThread needs to revoke the bias of an object 
>>>>> that has been biased by another JavaThread (and where the epoch is 
>>>>> still valid and the prototype header of the class still has the 
>>>>> bias pattern) it needs to request a safepoint operation. The 
>>>>> VMThread inside the safepoint walks the stack of the biaser 
>>>>> looking for lock records associated with the biased object, and 
>>>>> converts them to thin locks if any are found.
>>>>> This patch uses thread local handshakes instead, since we actually 
>>>>> only need to be able to safely walk the stack of the JavaThread 
>>>>> that biased the object and not other JavaThreads.
>>>>>
>>>>> Some notes about the patch:
>>>>>
>>>>> - Thanks to Robbin for initial work on this patch and for advice 
>>>>> and feedback!
>>>>> - We still execute bulk rebias and bulk revoke operations inside 
>>>>> safepoints, since in those cases all the JavaThread's stacks need 
>>>>> to be walked to potentially update lock records.
>>>>> - The method revoke_bias() was renamed to 
>>>>> single_revoke_at_safepoint(). This method is still kept because 
>>>>> there are places where we check whether we are already at 
>>>>> safepoint when trying to revoke. In those cases, if we are already 
>>>>> at a safepoint we simply end up calling this method.
>>>>> - Handshakes are executed as VMOperations so the VMThread is still 
>>>>> involved in the revocation. This means we cannot have different 
>>>>> revocations being executed in parallel (same as with safepoints). 
>>>>> Ideally we would like to execute thread local handshakes without 
>>>>> needing for the VMThread to participate. However, now other 
>>>>> JavaThreads that do not participate in the revocation are allow to 
>>>>> continue making progress.
>>>>>
>>>>> Run several benchmarks and mostly performance seems unaffected. 
>>>>> Measured the average time it takes for revoking bias with a 
>>>>> handshake and with a safepoint and numbers are pretty similar 
>>>>> varying between benchmarks. Some numbers are shown below:
>>>>>
>>>>> specjbb2015
>>>>>                       Handshakes      Safepoints
>>>>> Linux                        4ms            4.6ms
>>>>> Windows                 11ms             19ms
>>>>>
>>>>> startup benchmarks
>>>>>                      Handshakes      Safepoints
>>>>> Linux                    159us             248us
>>>>> Windows               150us             111us
>>>>>
>>>>> Overall the variation is not enough to show significant difference 
>>>>> in performance, considering also that revocations of a valid 
>>>>> biaser are usually a fraction of the overall running time of a 
>>>>> benchmark (specially jbb2015). In any case using handshakes allows 
>>>>> other JavaThreads to make progress during that time, minimizing 
>>>>> STW operations.
>>>>>
>>>>> In terms of testing, the patch passed several runs of tiers1-6 in 
>>>>> mach5 on Windows, Linux, MacOS and Solaris.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Patricio
>>>>>
>>>
>