RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation
Markus Gronlund
markus.gronlund at oracle.com
Fri Jun 21 11:21:32 UTC 2019
Hi Patricio,
The JFR related stuff seems fine.
Thanks
Markus
-----Original Message-----
From: Patricio Chilano
Sent: den 19 juni 2019 07:09
To: Coleen Phillimore <coleen.phillimore at oracle.com>; Markus Grönlund <markus.gronlund at oracle.com>; Robbin Ehn <robbin.ehn at oracle.com>; daniel.daugherty <daniel.daugherty at oracle.com>; David Holmes <david.holmes at oracle.com>
Cc: hotspot-runtime-dev at openjdk.java.net runtime <hotspot-runtime-dev at openjdk.java.net>
Subject: Re: RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation
Hi all,
Here is v03, it contains the fixes suggested by Markus and Coleen:
Full: http://cr.openjdk.java.net/~pchilanomate/8191890/v03/webrev/
<http://cr.openjdk.java.net/%7Epchilanomate/8191890/v03/webrev/>
Inc: http://cr.openjdk.java.net/~pchilanomate/8191890/v03/inc/webrev/
<http://cr.openjdk.java.net/%7Epchilanomate/8191890/v03/inc/webrev/>
The only extra change that I made was moving the clean of the cache and the declaration of ResourceMark outside of walk_stack_and_revoke(), since that was preventing the optimization of using the cache when revoking with revoke(GrowableArray<Handle>* objs, JavaThread *).
Coleen: I didn't simplified methods revoke(GrowableArray<Handle>*
objs...) and revoke_at_safepoint(GrowableArray<Handle>* objs) after all.
I tried to do it by replacing them with a single method that would mostly call walk_stack_and_revoke() for revocations, but that causes problems similar to 8225351, since the passed array might contain an object not biased towards the expected JavaThread.
Tested with mach5 tiers1-6.
Thanks!
Patricio
On 6/17/19 6:55 PM, Patricio Chilano wrote:
> Hi Coleen,
>
> On 6/17/19 4:55 PM, coleen.phillimore at oracle.com wrote:
>>
>> On 6/17/19 2:14 PM, Patricio Chilano wrote:
>>> Hi Coleen,
>>>
>>> On 6/14/19 7:08 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Sorry for being late to the party.
>>>>
>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/src/hot
>>>> spot/share/runtime/biasedLocking.cpp.frames.html
>>>>
>>>>
>>>> 586 if (_biased_locker == mark->biased_locker()) {
>>>> 587 if (mark->bias_epoch() == prototype->bias_epoch()) {
>>>>
>>>> Can you add a comment what this means? The object's biased locker
>>>> matches what we thought it was, and the epoch being the same means?
>>>> The epoch being equal means that this biaser actually might have
>>>> this lock? A comment would be good here.
>>> Yes, if the epoch is still valid it means the biaser could be
>>> currently synchronized on this object. If that's the case then we
>>> must walk its stack and change those monitor records into thin
>>> locks. Added comment.
>>>
>>>
>>>> 785 mark = res_mark; // Refresh mark with the latest value.
>>>>
>>>>
>>>> I don't see what this does either. I had to download your patch.
>>>> 'mark' isn't used outside the loop and it is reloaded at the top of
>>>> the loop.
>>> If the CAS fails, the mark needs to be updated with the new value so
>>> that when we get the current biaser (HR_SINGLE_REVOKE case) we
>>> actually get the updated biaser and not the old one. If we don't do
>>> that we could be handshaking the wrong thread, or worst we could hit
>>> an assert in walk_stack_and_revoke() for the "blt == THREAD", since
>>> the old thread could be ourselves.
>>
>> I found where it's used below now. Maybe once this function is
>> refactored a bit, it'll be easier to see for next time. Looks good!
>>>
>>>
>>>> 796 obj->cas_set_mark(prototype_header->set_age(mark->age()),
>>>> mark);
>>>>
>>>>
>>>> As an later enhancement, there should be some inline function in
>>>> markOop that returns the prototype header preserving the age of the
>>>> object, but I'll leave it to you to name.
>>> Ok, sounds good. That particular line was preexistent but I did
>>> added in some places
>>> "markOopDesc::prototype()->set_age(mark->age())", which is doing the
>>> same thing.
>>>
>>>
>>>> 555 RevokeOneBias(Handle* obj, JavaThread* requesting_thread,
>>>> JavaThread* biased_locker) ... 565 oop o = (*_obj)();
>>>>
>>>>
>>>> This was pre-existing your change, but passing Handle* is not
>>>> generally done, and is suspicious when it is because it must be
>>>> allocated with the thread calling the function. Can you change
>>>> this to Handle (not pointer)? I can't think why this would be done
>>>> this way.
>>> Fixed.
>>>
>>>
>>
>> Great. I was afraid there was some subtlety I didn't see. Please
>> retest with this though just in case. Sometimes bits of wierdness
>> have a strange reason that isn't documented.
> Yes, I'll retest just in case.
>
>
>>>> 870 // All objects in objs should be locked by biaser
>>>> 871 void BiasedLocking::revoke(GrowableArray<Handle>* objs,
>>>> JavaThread *biaser) {
>>>>
>>>>
>>>> I don't see why BiasedLocking::revoke() and
>>>> BiasedLocking::revoke_at_safepoint() are so different.
>>>>
>>>> The name "revoke" should be something more descriptive of the
>>>> situation though, like revoke_for_current_thread() or something
>>>> like that (revoke_at_safepoint's objects are from the stack too for
>>>> the current thread...) I keep thinking "revoke" should be a leaf
>>>> function in biasedLocking.
>>> Yes, nice observation. Method
>>> revoke_at_safepoint(GrowableArray<Handle>* objs) could be removed
>>> and we could just use BiasedLocking::revoke(GrowableArray<Handle>*
>>> objs ...) instead, since it's called from deoptimization.cpp where
>>> all the objects in the array belong to the same JavaThread. The
>>> difference is that we don't do update_heuristics() for the
>>> non-safepoint case since it might trigger a bulk operation. For the
>>> safepoint case it doesn't matter because we are already at one, we
>>> don't have the overhead of requesting it. But I could combine them
>>> into one method and do the update_heuristics() only if we are at a
>>> safepoint, what do you think?
>>
>> You could file a follow-up RFE for this if you want, since the
>> current version has gone through all the testing.
> I need to test again anyways and I think this could be a nice
> simplification. I'll test it and try to include it in v3.
>
>> The change looks great!
> Thanks Coleen! : )
>
>
> Patricio
>> Coleen
>>>
>>>> The change looks really good to me and I look forward to further
>>>> cleanups so maybe it'll make sense someday!
>>> Thanks for looking at this Coleen!
>>>
>>>
>>> Patricio
>>>> Thanks!
>>>> Coleen
>>>>
>>>> On 6/7/19 12:56 AM, Patricio Chilano wrote:
>>>>> Hi all,
>>>>>
>>>>> Here is v02 addressing comments made by Dan and David.
>>>>>
>>>>> Full webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/
>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
>>>>> Inc webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/
>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Patricio
>>>>>
>>>>> On 6/6/19 7:37 PM, David Holmes wrote:
>>>>>> Hi Patricio,
>>>>>>
>>>>>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>>
>>>>>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>>>>>> Hi Patricio,
>>>>>>>>
>>>>>>>> First thanks for taking this on!
>>>>>>>>
>>>>>>>> I have some higher-level general discussion around this before
>>>>>>>> deep diving into the actual code review (not that I have much
>>>>>>>> there either :)).
>>>>>>>>
>>>>>>>> First to clarify how biased-locking works. I'm unclear when an
>>>>>>>> object can be rebiased after having its bias revoked? This
>>>>>>>> particularly relates to some of your assertions (as Markus has
>>>>>>>> queried) after the CAS to update the mark so that the bias is
>>>>>>>> revoked, and you then re-read the mark and assert the bias has
>>>>>>>> been revoked - what stops another thread from rebiasing the
>>>>>>>> object in between those two statements? Is it that rebiasing
>>>>>>>> cannot happen, or that it could only happen if there were an
>>>>>>>> intervening safepoint which in turn cannot happen?
>>>>>>> Once the bias of the object is revoked it will stay like that
>>>>>>> forever, it cannot happen that it goes back to having the 0x5
>>>>>>> pattern.
>>>>>>> Also, once the bias pattern in the prototype header for a class
>>>>>>> is revoked during a bulk revocation operation, if there is an
>>>>>>> object of that class that still has the bias pattern, a
>>>>>>> JavaThread that wants to synchronize on that object will always
>>>>>>> revoke the bias first. This is why I don't check if the CAS
>>>>>>> succeeded if the prototype of the class does not has the bias
>>>>>>> pattern, I just assert that the object is not biased anymore.
>>>>>>>
>>>>>>> Below I describe the cases where an object can be rebiased.
>>>>>>>
>>>>>>> Once a JavaThread biases an object for the first time, there are
>>>>>>> two cases that allows for that object to be rebiased:
>>>>>>> 1) If the epoch in the markword becomes invalid. For this to
>>>>>>> happen a bulk rebias operation is needed. This is why I do check
>>>>>>> if the CAS succeeded or not for these cases, since some other
>>>>>>> JavaThread could have rebiased it.
>>>>>>> 2) During a full GC, objects that are biased ( some JavaThread
>>>>>>> is set in the biaser bits) could have their markword be reset to
>>>>>>> 0x5. This means they will become anonymously biased again and so
>>>>>>> will look as if they were not biased yet. As to how this logic
>>>>>>> works: At the beginning of the full GC,
>>>>>>> BiasedLocking::preserve_marks() saves all the markwords for
>>>>>>> those objects that are currently locked and have a bias pattern.
>>>>>>> After that, markOopDesc::must_be_preserved_with_bias() will be
>>>>>>> called to decide if the markword of an object should be
>>>>>>> preserved or not. If the markword contains the bias pattern it
>>>>>>> is never preserved. At the end BiasedLocking::restore_marks() is
>>>>>>> called to restore the marks for those objects that we saved
>>>>>>> before. So this means that even if an object has a valid biaser,
>>>>>>> with valid epoch, if the object is not currently locked it could
>>>>>>> be reset during the GC. I'm not sure though if whenever
>>>>>>> markOopDesc::must_be_preserved_with_bias() returns false the
>>>>>>> garbage collector always does the reset or it just means it
>>>>>>> could reset it if it wants to. In any case I've seen that reset
>>>>>>> happening when doing handshakes. In fact, this is one of the
>>>>>>> reasons why the handshake could return that the bias was not
>>>>>>> revoked, since I don't check for the anonymously biased case in
>>>>>>> RevokeOneBias.
>>>>>>
>>>>>> Thanks for that very detailed set of descriptions. I won't
>>>>>> pretend to fully grok all the details as I'm not completely clear
>>>>>> on the role of the "epoch" or being anonymously biased, but I'm
>>>>>> convinced you have a full understanding of such things. :) In
>>>>>> revoke_and_rebias it was always a struggle for me to figure out
>>>>>> exactly when the "rebias" part could come into play.
>>>>>>
>>>>>>>
>>>>>>>> The main concern with a change like this (as with all the
>>>>>>>> handshake changes) is what new races this may allow and whether
>>>>>>>> they have all been accounted for. IIUC the handshake will still
>>>>>>>> be conducted by the VMThread so that still ensures
>>>>>>>> serialization wrt. safepoints (which makes it simpler to reason
>>>>>>>> about things). I've looked at some of the races you anticipated
>>>>>>>> (like the "resurrected" thread) and they seem to be handled
>>>>>>>> correctly. I'm unable to construct other races that might be
>>>>>>>> problematic (but that isn't saying a lot :) ).
>>>>>>> I agree that since we are now doing the revocation outside
>>>>>>> safepoints there is potential for additional races. But also one
>>>>>>> thing to note is that RevokeOneBias, which contains the logic of
>>>>>>> the handshake and is now replacing what we used to do at a
>>>>>>> safepoint, is not really different from the initial code in
>>>>>>> revoke_and_rebias() which is done outside safepoints. The
>>>>>>> handshake logic is like executing that initial part but with the
>>>>>>> right JavaThread so that if the object has a valid biaser, then
>>>>>>> that biaser is either ourselves or we are the VMThread while the
>>>>>>> biaser is blocked, so that we can execute revoke_own_lock(). In
>>>>>>> fact I was thinking at some point to combine them in some method
>>>>>>> (maybe try_fast_revoke()). The attempt_rebias flag and the
>>>>>>> update_heuristics() in revoke_and_rebias() complicated things so
>>>>>>> I kept them separate.
>>>>>>> I have also tried to think on all possible racy scenarios and
>>>>>>> couldn't find additional problems beside the "resurrected
>>>>>>> thread" one (although it's also not a guarantee of anything).
>>>>>>> But that's why I was thinking to check this in 14, so that if
>>>>>>> there are any problems we have plenty of testing time to detect
>>>>>>> them.
>>>>>>
>>>>>> Yes that is a good idea. No need to rush this into 13.
>>>>>>
>>>>>>>
>>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>>
>>>>>>>> Is it the case that this event is now never generated from a
>>>>>>>> safepoint? Or have you just deleted the safepoint_id from the
>>>>>>>> event because it might not be at a safepoint? If the latter
>>>>>>>> can't we keep it and use 0 to indicate "not at a safepoint"? I
>>>>>>>> think the JFR folk need to comment on this part of the change
>>>>>>>> anyway.
>>>>>>> This event will be created and commited only from
>>>>>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual
>>>>>>> handshake that revoked the bias could be executed at a safepoint
>>>>>>> only if ThreadLocalHandshakes is false. But I understand that
>>>>>>> this is true for all platforms so the handshake should always be
>>>>>>> executed outside safepoints.
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>>> It would be great if JFR folks review this part.
>>>>>>
>>>>>> Try to grab Markus :)
>>>>>>
>>>>>>>
>>>>>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>>>>>
>>>>>>>> I second Dan's comment about combining cleanup and code motion
>>>>>>>> in a big change like this - it does make it much harder to spot
>>>>>>>> the real difference.
>>>>>>> Ok, already two objections on this so I'll revert moving the
>>>>>>> heuristics part. I think I also moved
>>>>>>> clean_up_cached_monitor_info() and I will double check any other
>>>>>>> movements.
>>>>>>>
>>>>>>>
>>>>>>>> I note Dan picked up on the lack of p2i and other stuff related
>>>>>>>> to the logging statements, and that you indicated they were
>>>>>>>> fixed. I note that all that stuff is pre-existing so I'm
>>>>>>>> unclear now whether you have fixed all the logging in the file
>>>>>>>> or only the statements in the code you have changed or added?
>>>>>>>> Again such cleanup may be best done separately.
>>>>>>> I haven't fixed the existing log statements, only the ones Dan
>>>>>>> mentioned which are in single_revoke_with_handshake(),
>>>>>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can fix
>>>>>>> the other ones in a cleanup later along with code movement and
>>>>>>> the removal of the attemp_rebias flag which we are not using.
>>>>>>
>>>>>> Okay. To be clear I don't expect you to fix all the existing uses
>>>>>> I just wanted to clarify which ones you had fixed.
>>>>>>
>>>>>>>
>>>>>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread*
>>>>>>>> biased_locker) {
>>>>>>>> 641 assert(!SafepointSynchronize::is_at_safepoint() ||
>>>>>>>> !ThreadLocalHandshakes,
>>>>>>>> 642 "if ThreadLocalHandshakes is enabled this should
>>>>>>>> always be executed outside safepoints");
>>>>>>>> 643 assert(Thread::current() == biased_locker ||
>>>>>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>>>>>> 644
>>>>>>>>
>>>>>>>> This is called "revoke_own_lock" but it can also be executed by
>>>>>>>> the VMThread - so its not its own lock. Also we don't revoke
>>>>>>>> anything related to a "lock" - we revoke a bias from the
>>>>>>>> markword of an oop. I think a better name is needed.
>>>>>>> Yes, I didn't really like it either. How about
>>>>>>> walk_stack_and_revoke() ?
>>>>>>
>>>>>> That sounds good to me. Roll on v2 :)
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>
>>>>>>>> src/hotspot/share/runtime/handshake.cpp
>>>>>>>>
>>>>>>>> 125 log_trace(handshake)("JavaThread " INTPTR_FORMAT "
>>>>>>>> is not alive", (intptr_t)_target);
>>>>>>>>
>>>>>>>> Use p2i(_target) rather than cast to intptr_t.
>>>>>>> Fixed.
>>>>>>>
>>>>>>>
>>>>>>>> That's all from me.
>>>>>>> Thanks for looking into this David! If you are okay with the
>>>>>>> "walk_stack_and_revoke()" name then I can send v2.
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>
>>>>>>> Patricio
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Could you review this patch that uses thread local handshakes
>>>>>>>>> instead of safepoints to revoke the biases of locked objects?
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>>>>>> Bug:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Today whenever a JavaThread needs to revoke the bias of an
>>>>>>>>> object that has been biased by another JavaThread (and where
>>>>>>>>> the epoch is still valid and the prototype header of the class
>>>>>>>>> still has the bias pattern) it needs to request a safepoint
>>>>>>>>> operation. The VMThread inside the safepoint walks the stack
>>>>>>>>> of the biaser looking for lock records associated with the
>>>>>>>>> biased object, and converts them to thin locks if any are found.
>>>>>>>>> This patch uses thread local handshakes instead, since we
>>>>>>>>> actually only need to be able to safely walk the stack of the
>>>>>>>>> JavaThread that biased the object and not other JavaThreads.
>>>>>>>>>
>>>>>>>>> Some notes about the patch:
>>>>>>>>>
>>>>>>>>> - Thanks to Robbin for initial work on this patch and for
>>>>>>>>> advice and feedback!
>>>>>>>>> - We still execute bulk rebias and bulk revoke operations
>>>>>>>>> inside safepoints, since in those cases all the JavaThread's
>>>>>>>>> stacks need to be walked to potentially update lock records.
>>>>>>>>> - The method revoke_bias() was renamed to
>>>>>>>>> single_revoke_at_safepoint(). This method is still kept
>>>>>>>>> because there are places where we check whether we are already
>>>>>>>>> at safepoint when trying to revoke. In those cases, if we are
>>>>>>>>> already at a safepoint we simply end up calling this method.
>>>>>>>>> - Handshakes are executed as VMOperations so the VMThread is
>>>>>>>>> still involved in the revocation. This means we cannot have
>>>>>>>>> different revocations being executed in parallel (same as with
>>>>>>>>> safepoints). Ideally we would like to execute thread local
>>>>>>>>> handshakes without needing for the VMThread to participate.
>>>>>>>>> However, now other JavaThreads that do not participate in the
>>>>>>>>> revocation are allow to continue making progress.
>>>>>>>>>
>>>>>>>>> Run several benchmarks and mostly performance seems
>>>>>>>>> unaffected. Measured the average time it takes for revoking
>>>>>>>>> bias with a handshake and with a safepoint and numbers are
>>>>>>>>> pretty similar varying between benchmarks. Some numbers are
>>>>>>>>> shown below:
>>>>>>>>>
>>>>>>>>> specjbb2015
>>>>>>>>> Handshakes Safepoints Linux
>>>>>>>>> 4ms 4.6ms Windows 11ms
>>>>>>>>> 19ms
>>>>>>>>>
>>>>>>>>> startup benchmarks
>>>>>>>>> Handshakes Safepoints Linux
>>>>>>>>> 159us 248us Windows 150us
>>>>>>>>> 111us
>>>>>>>>>
>>>>>>>>> Overall the variation is not enough to show significant
>>>>>>>>> difference in performance, considering also that revocations
>>>>>>>>> of a valid biaser are usually a fraction of the overall
>>>>>>>>> running time of a benchmark (specially jbb2015). In any case
>>>>>>>>> using handshakes allows other JavaThreads to make progress
>>>>>>>>> during that time, minimizing STW operations.
>>>>>>>>>
>>>>>>>>> In terms of testing, the patch passed several runs of tiers1-6
>>>>>>>>> in mach5 on Windows, Linux, MacOS and Solaris.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Patricio
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list