RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation

Thu Jun 6 23:37:10 UTC 2019

Hi Patricio,

On 7/06/2019 9:19 am, Patricio Chilano wrote:
> Hi David,
> 
> 
> On 6/6/19 3:37 AM, David Holmes wrote:
>> Hi Patricio,
>>
>> First thanks for taking this on!
>>
>> I have some higher-level general discussion around this before deep 
>> diving into the actual code review (not that I have much there either 
>> :)).
>>
>> First to clarify how biased-locking works. I'm unclear when an object 
>> can be rebiased after having its bias revoked? This particularly 
>> relates to some of your assertions (as Markus has queried) after the 
>> CAS to update the mark so that the bias is revoked, and you then 
>> re-read the mark and assert the bias has been revoked - what stops 
>> another thread from rebiasing the object in between those two 
>> statements? Is it that rebiasing cannot happen, or that it could only 
>> happen if there were an intervening safepoint which in turn cannot 
>> happen?
> Once the bias of the object is revoked it will stay like that forever, 
> it cannot happen that it goes back to having the 0x5 pattern.
> Also, once the bias pattern in the prototype header for a class is 
> revoked during a bulk revocation operation, if there is an object of 
> that class that still has the bias pattern, a JavaThread that wants to 
> synchronize on that object will always revoke the bias first. This is 
> why I don't check if the CAS succeeded if the prototype of the class 
> does not has the bias pattern, I just assert that the object is not 
> biased anymore.
> 
> Below I describe the cases where an object can be rebiased.
> 
> Once a JavaThread biases an object for the first time, there are two 
> cases that allows for that object to be rebiased:
> 1) If the epoch in the markword becomes invalid. For this to happen a 
> bulk rebias operation is needed. This is why I do check if the CAS 
> succeeded or not for these cases, since some other JavaThread could have 
> rebiased it.
> 2) During a full GC, objects that are biased ( some JavaThread is set in 
> the biaser bits) could have their markword be reset to 0x5. This means 
> they will become anonymously biased again and so will look as if they 
> were not biased yet. As to how this logic works: At the beginning of the 
> full GC, BiasedLocking::preserve_marks() saves all the markwords for 
> those objects that are currently locked and have a bias pattern. After 
> that, markOopDesc::must_be_preserved_with_bias() will be called to 
> decide if the markword of an object should be preserved or not. If the 
> markword contains the bias pattern it is never preserved. At the end 
> BiasedLocking::restore_marks() is called to restore the marks for those 
> objects that we saved before. So this means that even if an object has a 
> valid biaser, with valid epoch, if the object is not currently locked it 
> could be reset during the GC. I'm not sure though if whenever 
> markOopDesc::must_be_preserved_with_bias() returns false the garbage 
> collector always does the reset or it just means it could reset it if it 
> wants to. In any case I've seen that reset happening when doing 
> handshakes. In fact, this is one of the reasons why the handshake could 
> return that the bias was not revoked, since I don't check for the 
> anonymously biased case in RevokeOneBias.

Thanks for that very detailed set of descriptions. I won't pretend to 
fully grok all the details as I'm not completely clear on the role of 
the "epoch" or being anonymously biased, but I'm convinced you have a 
full understanding of such things. :) In revoke_and_rebias it was always 
a struggle for me to figure out exactly when the "rebias" part could 
come into play.

> 
>> The main concern with a change like this (as with all the handshake 
>> changes) is what new races this may allow and whether they have all 
>> been accounted for. IIUC the handshake will still be conducted by the 
>> VMThread so that still ensures serialization wrt. safepoints (which 
>> makes it simpler to reason about things). I've looked at some of the 
>> races you anticipated (like the "resurrected" thread) and they seem to 
>> be handled correctly. I'm unable to construct other races that might 
>> be problematic (but that isn't saying a lot :) ).
> I agree that since we are now doing the revocation outside safepoints 
> there is potential for additional races. But also one thing to note is 
> that RevokeOneBias, which contains the logic of the handshake and is now 
> replacing what we used to do at a safepoint, is not really different 
> from the initial code in revoke_and_rebias() which is done outside 
> safepoints. The handshake logic is like executing that initial part but 
> with the right JavaThread so that if the object has a valid biaser, then 
> that biaser is either ourselves or we are the VMThread while the biaser 
> is blocked, so that we can execute revoke_own_lock(). In fact I was 
> thinking at some point to combine them in some method (maybe 
> try_fast_revoke()). The attempt_rebias flag and the update_heuristics() 
> in revoke_and_rebias() complicated things so I kept them separate.
> I have also tried to think on all possible racy scenarios and couldn't 
> find additional problems beside the "resurrected thread" one (although 
> it's also not a guarantee of anything). But that's why I was thinking to 
> check this in 14, so that if there are any problems we have plenty of 
> testing time to detect them.

Yes that is a good idea. No need to rush this into 13.

> 
>> src/hotspot/share/jfr/metadata/metadata.xml
>>
>> Is it the case that this event is now never generated from a 
>> safepoint? Or have you just deleted the safepoint_id from the event 
>> because it might not be at a safepoint? If the latter can't we keep it 
>> and use 0 to indicate "not at a safepoint"? I think the JFR folk need 
>> to comment on this part of the change anyway.
> This event will be created and commited only from 
> BiasedLocking::single_revoke_with_handshake(). Now, the actual handshake 
> that revoked the bias could be executed at a safepoint only if 
> ThreadLocalHandshakes is false. But I understand that this is true for 
> all platforms so the handshake should always be executed outside 
> safepoints.

Ok.

> It would be great if JFR folks review this part.

Try to grab Markus :)

> 
>> src/hotspot/share/runtime/biasedLocking.cpp
>>
>> I second Dan's comment about combining cleanup and code motion in a 
>> big change like this - it does make it much harder to spot the real 
>> difference.
> Ok, already two objections on this so I'll revert moving the heuristics 
> part. I think I also moved clean_up_cached_monitor_info() and I will 
> double check any other movements.
> 
> 
>> I note Dan picked up on the lack of p2i and other stuff related to the 
>> logging statements, and that you indicated they were fixed. I note 
>> that all that stuff is pre-existing so I'm unclear now whether you 
>> have fixed all the logging in the file or only the statements in the 
>> code you have changed or added? Again such cleanup may be best done 
>> separately.
> I haven't fixed the existing log statements, only the ones Dan mentioned 
> which are in single_revoke_with_handshake(), revoke_own_lock(), and in 
> VM_HandshakeOneThread(). Ok, I can fix the other ones in a cleanup later 
> along with code movement and the removal of the attemp_rebias flag which 
> we are not using.

Okay. To be clear I don't expect you to fix all the existing uses I just 
wanted to clarify which ones you had fixed.

> 
>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread* 
>> biased_locker) {
>> 641   assert(!SafepointSynchronize::is_at_safepoint() || 
>> !ThreadLocalHandshakes,
>> 642          "if ThreadLocalHandshakes is enabled this should always 
>> be executed outside safepoints");
>> 643   assert(Thread::current() == biased_locker || 
>> Thread::current()->is_VM_thread(), "wrong thread");
>> 644
>>
>> This is called "revoke_own_lock" but it can also be executed by the 
>> VMThread - so its not its own lock. Also we don't revoke anything 
>> related to a "lock" - we revoke a bias from the markword of an oop. I 
>> think a better name is needed.
> Yes, I didn't really like it either. How about walk_stack_and_revoke() ?

That sounds good to me. Roll on v2 :)

Thanks,
David
-----

> 
>> src/hotspot/share/runtime/handshake.cpp
>>
>>  125       log_trace(handshake)("JavaThread " INTPTR_FORMAT " is not 
>> alive", (intptr_t)_target);
>>
>> Use p2i(_target) rather than cast to intptr_t.
> Fixed.
> 
> 
>> That's all from me.
> Thanks for looking into this David! If you are okay with the 
> "walk_stack_and_revoke()" name then I can send v2.
> 
> 
> Thanks!
> 
> 
> Patricio
> 
>> Thanks,
>> David
>> -----
>>
>>
>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>> Hi all,
>>>
>>> Could you review this patch that uses thread local handshakes instead 
>>> of safepoints to revoke the biases of locked objects?
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>
>>>
>>> Today whenever a JavaThread needs to revoke the bias of an object 
>>> that has been biased by another JavaThread (and where the epoch is 
>>> still valid and the prototype header of the class still has the bias 
>>> pattern) it needs to request a safepoint operation. The VMThread 
>>> inside the safepoint walks the stack of the biaser looking for lock 
>>> records associated with the biased object, and converts them to thin 
>>> locks if any are found.
>>> This patch uses thread local handshakes instead, since we actually 
>>> only need to be able to safely walk the stack of the JavaThread that 
>>> biased the object and not other JavaThreads.
>>>
>>> Some notes about the patch:
>>>
>>> - Thanks to Robbin for initial work on this patch and for advice and 
>>> feedback!
>>> - We still execute bulk rebias and bulk revoke operations inside 
>>> safepoints, since in those cases all the JavaThread's stacks need to 
>>> be walked to potentially update lock records.
>>> - The method revoke_bias() was renamed to 
>>> single_revoke_at_safepoint(). This method is still kept because there 
>>> are places where we check whether we are already at safepoint when 
>>> trying to revoke. In those cases, if we are already at a safepoint we 
>>> simply end up calling this method.
>>> - Handshakes are executed as VMOperations so the VMThread is still 
>>> involved in the revocation. This means we cannot have different 
>>> revocations being executed in parallel (same as with safepoints). 
>>> Ideally we would like to execute thread local handshakes without 
>>> needing for the VMThread to participate. However, now other 
>>> JavaThreads that do not participate in the revocation are allow to 
>>> continue making progress.
>>>
>>> Run several benchmarks and mostly performance seems unaffected. 
>>> Measured the average time it takes for revoking bias with a handshake 
>>> and with a safepoint and numbers are pretty similar varying between 
>>> benchmarks. Some numbers are shown below:
>>>
>>> specjbb2015
>>>                       Handshakes      Safepoints
>>> Linux                        4ms            4.6ms
>>> Windows                 11ms             19ms
>>>
>>> startup benchmarks
>>>                      Handshakes      Safepoints
>>> Linux                    159us             248us
>>> Windows               150us             111us
>>>
>>> Overall the variation is not enough to show significant difference in 
>>> performance, considering also that revocations of a valid biaser are 
>>> usually a fraction of the overall running time of a benchmark 
>>> (specially jbb2015). In any case using handshakes allows other 
>>> JavaThreads to make progress during that time, minimizing STW 
>>> operations.
>>>
>>> In terms of testing, the patch passed several runs of tiers1-6 in 
>>> mach5 on Windows, Linux, MacOS and Solaris.
>>>
>>> Thanks!
>>>
>>> Patricio
>>>
>