RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Mon Jun 10 08:01:37 UTC 2019

Hi Patricio, thanks for fixing this!

On 6/7/19 6:56 AM, Patricio Chilano wrote:
> Full webrev:
> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/ 
> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>

Two things, to at least consider.

biasedLocking.cpp
766   while (true) {
It's really hard to see where this while loop ends.
My preference would be two methods, one for the 'anonymously-biased' part
and one for the heuristic part.

Secondly, not directly your code, but clean_up_cached_monitor_info(), it's a bit 
hard to keep track that we do call clean at correct places.
Can we do something about it?

I can live with the while loop and the clean cache is not directly related, so 
if you prefer not fixing above it's fine.

Looks good, thanks!

/Robbin

> Inc webrev:
> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/ 
> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
> 
> Thanks!
> 
> Patricio
> 
> On 6/6/19 7:37 PM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>> Hi David,
>>>
>>>
>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>> Hi Patricio,
>>>>
>>>> First thanks for taking this on!
>>>>
>>>> I have some higher-level general discussion around this before deep diving 
>>>> into the actual code review (not that I have much there either :)).
>>>>
>>>> First to clarify how biased-locking works. I'm unclear when an object can be 
>>>> rebiased after having its bias revoked? This particularly relates to some of 
>>>> your assertions (as Markus has queried) after the CAS to update the mark so 
>>>> that the bias is revoked, and you then re-read the mark and assert the bias 
>>>> has been revoked - what stops another thread from rebiasing the object in 
>>>> between those two statements? Is it that rebiasing cannot happen, or that it 
>>>> could only happen if there were an intervening safepoint which in turn 
>>>> cannot happen?
>>> Once the bias of the object is revoked it will stay like that forever, it 
>>> cannot happen that it goes back to having the 0x5 pattern.
>>> Also, once the bias pattern in the prototype header for a class is revoked 
>>> during a bulk revocation operation, if there is an object of that class that 
>>> still has the bias pattern, a JavaThread that wants to synchronize on that 
>>> object will always revoke the bias first. This is why I don't check if the 
>>> CAS succeeded if the prototype of the class does not has the bias pattern, I 
>>> just assert that the object is not biased anymore.
>>>
>>> Below I describe the cases where an object can be rebiased.
>>>
>>> Once a JavaThread biases an object for the first time, there are two cases 
>>> that allows for that object to be rebiased:
>>> 1) If the epoch in the markword becomes invalid. For this to happen a bulk 
>>> rebias operation is needed. This is why I do check if the CAS succeeded or 
>>> not for these cases, since some other JavaThread could have rebiased it.
>>> 2) During a full GC, objects that are biased ( some JavaThread is set in the 
>>> biaser bits) could have their markword be reset to 0x5. This means they will 
>>> become anonymously biased again and so will look as if they were not biased 
>>> yet. As to how this logic works: At the beginning of the full GC, 
>>> BiasedLocking::preserve_marks() saves all the markwords for those objects 
>>> that are currently locked and have a bias pattern. After that, 
>>> markOopDesc::must_be_preserved_with_bias() will be called to decide if the 
>>> markword of an object should be preserved or not. If the markword contains 
>>> the bias pattern it is never preserved. At the end 
>>> BiasedLocking::restore_marks() is called to restore the marks for those 
>>> objects that we saved before. So this means that even if an object has a 
>>> valid biaser, with valid epoch, if the object is not currently locked it 
>>> could be reset during the GC. I'm not sure though if whenever 
>>> markOopDesc::must_be_preserved_with_bias() returns false the garbage 
>>> collector always does the reset or it just means it could reset it if it 
>>> wants to. In any case I've seen that reset happening when doing handshakes. 
>>> In fact, this is one of the reasons why the handshake could return that the 
>>> bias was not revoked, since I don't check for the anonymously biased case in 
>>> RevokeOneBias.
>>
>> Thanks for that very detailed set of descriptions. I won't pretend to fully 
>> grok all the details as I'm not completely clear on the role of the "epoch" or 
>> being anonymously biased, but I'm convinced you have a full understanding of 
>> such things. :) In revoke_and_rebias it was always a struggle for me to figure 
>> out exactly when the "rebias" part could come into play.
>>
>>>
>>>> The main concern with a change like this (as with all the handshake changes) 
>>>> is what new races this may allow and whether they have all been accounted 
>>>> for. IIUC the handshake will still be conducted by the VMThread so that 
>>>> still ensures serialization wrt. safepoints (which makes it simpler to 
>>>> reason about things). I've looked at some of the races you anticipated (like 
>>>> the "resurrected" thread) and they seem to be handled correctly. I'm unable 
>>>> to construct other races that might be problematic (but that isn't saying a 
>>>> lot :) ).
>>> I agree that since we are now doing the revocation outside safepoints there 
>>> is potential for additional races. But also one thing to note is that 
>>> RevokeOneBias, which contains the logic of the handshake and is now replacing 
>>> what we used to do at a safepoint, is not really different from the initial 
>>> code in revoke_and_rebias() which is done outside safepoints. The handshake 
>>> logic is like executing that initial part but with the right JavaThread so 
>>> that if the object has a valid biaser, then that biaser is either ourselves 
>>> or we are the VMThread while the biaser is blocked, so that we can execute 
>>> revoke_own_lock(). In fact I was thinking at some point to combine them in 
>>> some method (maybe try_fast_revoke()). The attempt_rebias flag and the 
>>> update_heuristics() in revoke_and_rebias() complicated things so I kept them 
>>> separate.
>>> I have also tried to think on all possible racy scenarios and couldn't find 
>>> additional problems beside the "resurrected thread" one (although it's also 
>>> not a guarantee of anything). But that's why I was thinking to check this in 
>>> 14, so that if there are any problems we have plenty of testing time to 
>>> detect them.
>>
>> Yes that is a good idea. No need to rush this into 13.
>>
>>>
>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>
>>>> Is it the case that this event is now never generated from a safepoint? Or 
>>>> have you just deleted the safepoint_id from the event because it might not 
>>>> be at a safepoint? If the latter can't we keep it and use 0 to indicate "not 
>>>> at a safepoint"? I think the JFR folk need to comment on this part of the 
>>>> change anyway.
>>> This event will be created and commited only from 
>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual handshake that 
>>> revoked the bias could be executed at a safepoint only if 
>>> ThreadLocalHandshakes is false. But I understand that this is true for all 
>>> platforms so the handshake should always be executed outside safepoints.
>>
>> Ok.
>>
>>> It would be great if JFR folks review this part.
>>
>> Try to grab Markus :)
>>
>>>
>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>
>>>> I second Dan's comment about combining cleanup and code motion in a big 
>>>> change like this - it does make it much harder to spot the real difference.
>>> Ok, already two objections on this so I'll revert moving the heuristics part. 
>>> I think I also moved clean_up_cached_monitor_info() and I will double check 
>>> any other movements.
>>>
>>>
>>>> I note Dan picked up on the lack of p2i and other stuff related to the 
>>>> logging statements, and that you indicated they were fixed. I note that all 
>>>> that stuff is pre-existing so I'm unclear now whether you have fixed all the 
>>>> logging in the file or only the statements in the code you have changed or 
>>>> added? Again such cleanup may be best done separately.
>>> I haven't fixed the existing log statements, only the ones Dan mentioned 
>>> which are in single_revoke_with_handshake(), revoke_own_lock(), and in 
>>> VM_HandshakeOneThread(). Ok, I can fix the other ones in a cleanup later 
>>> along with code movement and the removal of the attemp_rebias flag which we 
>>> are not using.
>>
>> Okay. To be clear I don't expect you to fix all the existing uses I just 
>> wanted to clarify which ones you had fixed.
>>
>>>
>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread* biased_locker) {
>>>> 641   assert(!SafepointSynchronize::is_at_safepoint() || 
>>>> !ThreadLocalHandshakes,
>>>> 642          "if ThreadLocalHandshakes is enabled this should always be 
>>>> executed outside safepoints");
>>>> 643   assert(Thread::current() == biased_locker || 
>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>> 644
>>>>
>>>> This is called "revoke_own_lock" but it can also be executed by the VMThread 
>>>> - so its not its own lock. Also we don't revoke anything related to a "lock" 
>>>> - we revoke a bias from the markword of an oop. I think a better name is 
>>>> needed.
>>> Yes, I didn't really like it either. How about walk_stack_and_revoke() ?
>>
>> That sounds good to me. Roll on v2 :)
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>>> src/hotspot/share/runtime/handshake.cpp
>>>>
>>>>  125       log_trace(handshake)("JavaThread " INTPTR_FORMAT " is not alive", 
>>>> (intptr_t)_target);
>>>>
>>>> Use p2i(_target) rather than cast to intptr_t.
>>> Fixed.
>>>
>>>
>>>> That's all from me.
>>> Thanks for looking into this David! If you are okay with the 
>>> "walk_stack_and_revoke()" name then I can send v2.
>>>
>>>
>>> Thanks!
>>>
>>>
>>> Patricio
>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>
>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>> Hi all,
>>>>>
>>>>> Could you review this patch that uses thread local handshakes instead of 
>>>>> safepoints to revoke the biases of locked objects?
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>
>>>>>
>>>>> Today whenever a JavaThread needs to revoke the bias of an object that has 
>>>>> been biased by another JavaThread (and where the epoch is still valid and 
>>>>> the prototype header of the class still has the bias pattern) it needs to 
>>>>> request a safepoint operation. The VMThread inside the safepoint walks the 
>>>>> stack of the biaser looking for lock records associated with the biased 
>>>>> object, and converts them to thin locks if any are found.
>>>>> This patch uses thread local handshakes instead, since we actually only 
>>>>> need to be able to safely walk the stack of the JavaThread that biased the 
>>>>> object and not other JavaThreads.
>>>>>
>>>>> Some notes about the patch:
>>>>>
>>>>> - Thanks to Robbin for initial work on this patch and for advice and feedback!
>>>>> - We still execute bulk rebias and bulk revoke operations inside 
>>>>> safepoints, since in those cases all the JavaThread's stacks need to be 
>>>>> walked to potentially update lock records.
>>>>> - The method revoke_bias() was renamed to single_revoke_at_safepoint(). 
>>>>> This method is still kept because there are places where we check whether 
>>>>> we are already at safepoint when trying to revoke. In those cases, if we 
>>>>> are already at a safepoint we simply end up calling this method.
>>>>> - Handshakes are executed as VMOperations so the VMThread is still involved 
>>>>> in the revocation. This means we cannot have different revocations being 
>>>>> executed in parallel (same as with safepoints). Ideally we would like to 
>>>>> execute thread local handshakes without needing for the VMThread to 
>>>>> participate. However, now other JavaThreads that do not participate in the 
>>>>> revocation are allow to continue making progress.
>>>>>
>>>>> Run several benchmarks and mostly performance seems unaffected. Measured 
>>>>> the average time it takes for revoking bias with a handshake and with a 
>>>>> safepoint and numbers are pretty similar varying between benchmarks. Some 
>>>>> numbers are shown below:
>>>>>
>>>>> specjbb2015
>>>>>                       Handshakes      Safepoints
>>>>> Linux                        4ms            4.6ms
>>>>> Windows                 11ms             19ms
>>>>>
>>>>> startup benchmarks
>>>>>                      Handshakes      Safepoints
>>>>> Linux                    159us             248us
>>>>> Windows               150us             111us
>>>>>
>>>>> Overall the variation is not enough to show significant difference in 
>>>>> performance, considering also that revocations of a valid biaser are 
>>>>> usually a fraction of the overall running time of a benchmark (specially 
>>>>> jbb2015). In any case using handshakes allows other JavaThreads to make 
>>>>> progress during that time, minimizing STW operations.
>>>>>
>>>>> In terms of testing, the patch passed several runs of tiers1-6 in mach5 on 
>>>>> Windows, Linux, MacOS and Solaris.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Patricio
>>>>>
>>>
>