RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation
Robbin Ehn
robbin.ehn at oracle.com
Thu Jun 20 09:58:06 UTC 2019
> Good. It seems like better to do once 8225351 is fixed and as its own change.
8225351 is going into 13 and being forward ported to master, jdk/jdk.
So it's good if it apply cleanly :)
>
> The change looks good to me!
+1
/Robbin
> Coleen
>
>>
>> Tested with mach5 tiers1-6.
>>
>> Thanks!
>>
>> Patricio
>>
>> On 6/17/19 6:55 PM, Patricio Chilano wrote:
>>> Hi Coleen,
>>>
>>> On 6/17/19 4:55 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> On 6/17/19 2:14 PM, Patricio Chilano wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> On 6/14/19 7:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Sorry for being late to the party.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/src/hotspot/share/runtime/biasedLocking.cpp.frames.html
>>>>>>
>>>>>>
>>>>>> 586 if (_biased_locker == mark->biased_locker()) {
>>>>>> 587 if (mark->bias_epoch() == prototype->bias_epoch()) {
>>>>>>
>>>>>> Can you add a comment what this means? The object's biased locker
>>>>>> matches what we thought it was, and the epoch being the same means?
>>>>>> The epoch being equal means that this biaser actually might have this
>>>>>> lock? A comment would be good here.
>>>>> Yes, if the epoch is still valid it means the biaser could be currently
>>>>> synchronized on this object. If that's the case then we must walk its stack
>>>>> and change those monitor records into thin locks. Added comment.
>>>>>
>>>>>
>>>>>> 785 mark = res_mark; // Refresh mark with the latest value.
>>>>>>
>>>>>>
>>>>>> I don't see what this does either. I had to download your patch. 'mark'
>>>>>> isn't used outside the loop and it is reloaded at the top of the loop.
>>>>> If the CAS fails, the mark needs to be updated with the new value so that
>>>>> when we get the current biaser (HR_SINGLE_REVOKE case) we actually get the
>>>>> updated biaser and not the old one. If we don't do that we could be
>>>>> handshaking the wrong thread, or worst we could hit an assert in
>>>>> walk_stack_and_revoke() for the "blt == THREAD", since the old thread could
>>>>> be ourselves.
>>>>
>>>> I found where it's used below now. Maybe once this function is refactored a
>>>> bit, it'll be easier to see for next time. Looks good!
>>>>>
>>>>>
>>>>>> 796 obj->cas_set_mark(prototype_header->set_age(mark->age()), mark);
>>>>>>
>>>>>>
>>>>>> As an later enhancement, there should be some inline function in markOop
>>>>>> that returns the prototype header preserving the age of the object, but
>>>>>> I'll leave it to you to name.
>>>>> Ok, sounds good. That particular line was preexistent but I did added in
>>>>> some places "markOopDesc::prototype()->set_age(mark->age())", which is
>>>>> doing the same thing.
>>>>>
>>>>>
>>>>>> 555 RevokeOneBias(Handle* obj, JavaThread* requesting_thread, JavaThread*
>>>>>> biased_locker) ... 565 oop o = (*_obj)();
>>>>>>
>>>>>>
>>>>>> This was pre-existing your change, but passing Handle* is not generally
>>>>>> done, and is suspicious when it is because it must be allocated with the
>>>>>> thread calling the function. Can you change this to Handle (not
>>>>>> pointer)? I can't think why this would be done this way.
>>>>> Fixed.
>>>>>
>>>>>
>>>>
>>>> Great. I was afraid there was some subtlety I didn't see. Please retest
>>>> with this though just in case. Sometimes bits of wierdness have a strange
>>>> reason that isn't documented.
>>> Yes, I'll retest just in case.
>>>
>>>
>>>>>> 870 // All objects in objs should be locked by biaser
>>>>>> 871 void BiasedLocking::revoke(GrowableArray<Handle>* objs, JavaThread
>>>>>> *biaser) {
>>>>>>
>>>>>>
>>>>>> I don't see why BiasedLocking::revoke() and
>>>>>> BiasedLocking::revoke_at_safepoint() are so different.
>>>>>>
>>>>>> The name "revoke" should be something more descriptive of the situation
>>>>>> though, like revoke_for_current_thread() or something like that
>>>>>> (revoke_at_safepoint's objects are from the stack too for the current
>>>>>> thread...) I keep thinking "revoke" should be a leaf function in
>>>>>> biasedLocking.
>>>>> Yes, nice observation. Method revoke_at_safepoint(GrowableArray<Handle>*
>>>>> objs) could be removed and we could just use
>>>>> BiasedLocking::revoke(GrowableArray<Handle>* objs ...) instead, since it's
>>>>> called from deoptimization.cpp where all the objects in the array belong to
>>>>> the same JavaThread. The difference is that we don't do update_heuristics()
>>>>> for the non-safepoint case since it might trigger a bulk operation. For the
>>>>> safepoint case it doesn't matter because we are already at one, we don't
>>>>> have the overhead of requesting it. But I could combine them into one
>>>>> method and do the update_heuristics() only if we are at a safepoint, what
>>>>> do you think?
>>>>
>>>> You could file a follow-up RFE for this if you want, since the current
>>>> version has gone through all the testing.
>>> I need to test again anyways and I think this could be a nice simplification.
>>> I'll test it and try to include it in v3.
>>>
>>>> The change looks great!
>>> Thanks Coleen! : )
>>>
>>>
>>> Patricio
>>>> Coleen
>>>>>
>>>>>> The change looks really good to me and I look forward to further cleanups
>>>>>> so maybe it'll make sense someday!
>>>>> Thanks for looking at this Coleen!
>>>>>
>>>>>
>>>>> Patricio
>>>>>> Thanks!
>>>>>> Coleen
>>>>>>
>>>>>> On 6/7/19 12:56 AM, Patricio Chilano wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Here is v02 addressing comments made by Dan and David.
>>>>>>>
>>>>>>> Full webrev:
>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/
>>>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
>>>>>>> Inc webrev:
>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/
>>>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Patricio
>>>>>>>
>>>>>>> On 6/6/19 7:37 PM, David Holmes wrote:
>>>>>>>> Hi Patricio,
>>>>>>>>
>>>>>>>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>>>>>>>> Hi Patricio,
>>>>>>>>>>
>>>>>>>>>> First thanks for taking this on!
>>>>>>>>>>
>>>>>>>>>> I have some higher-level general discussion around this before deep
>>>>>>>>>> diving into the actual code review (not that I have much there either
>>>>>>>>>> :)).
>>>>>>>>>>
>>>>>>>>>> First to clarify how biased-locking works. I'm unclear when an object
>>>>>>>>>> can be rebiased after having its bias revoked? This particularly
>>>>>>>>>> relates to some of your assertions (as Markus has queried) after the
>>>>>>>>>> CAS to update the mark so that the bias is revoked, and you then
>>>>>>>>>> re-read the mark and assert the bias has been revoked - what stops
>>>>>>>>>> another thread from rebiasing the object in between those two
>>>>>>>>>> statements? Is it that rebiasing cannot happen, or that it could only
>>>>>>>>>> happen if there were an intervening safepoint which in turn cannot
>>>>>>>>>> happen?
>>>>>>>>> Once the bias of the object is revoked it will stay like that forever,
>>>>>>>>> it cannot happen that it goes back to having the 0x5 pattern.
>>>>>>>>> Also, once the bias pattern in the prototype header for a class is
>>>>>>>>> revoked during a bulk revocation operation, if there is an object of
>>>>>>>>> that class that still has the bias pattern, a JavaThread that wants to
>>>>>>>>> synchronize on that object will always revoke the bias first. This is
>>>>>>>>> why I don't check if the CAS succeeded if the prototype of the class
>>>>>>>>> does not has the bias pattern, I just assert that the object is not
>>>>>>>>> biased anymore.
>>>>>>>>>
>>>>>>>>> Below I describe the cases where an object can be rebiased.
>>>>>>>>>
>>>>>>>>> Once a JavaThread biases an object for the first time, there are two
>>>>>>>>> cases that allows for that object to be rebiased:
>>>>>>>>> 1) If the epoch in the markword becomes invalid. For this to happen a
>>>>>>>>> bulk rebias operation is needed. This is why I do check if the CAS
>>>>>>>>> succeeded or not for these cases, since some other JavaThread could
>>>>>>>>> have rebiased it.
>>>>>>>>> 2) During a full GC, objects that are biased ( some JavaThread is set
>>>>>>>>> in the biaser bits) could have their markword be reset to 0x5. This
>>>>>>>>> means they will become anonymously biased again and so will look as if
>>>>>>>>> they were not biased yet. As to how this logic works: At the beginning
>>>>>>>>> of the full GC, BiasedLocking::preserve_marks() saves all the markwords
>>>>>>>>> for those objects that are currently locked and have a bias pattern.
>>>>>>>>> After that, markOopDesc::must_be_preserved_with_bias() will be called
>>>>>>>>> to decide if the markword of an object should be preserved or not. If
>>>>>>>>> the markword contains the bias pattern it is never preserved. At the
>>>>>>>>> end BiasedLocking::restore_marks() is called to restore the marks for
>>>>>>>>> those objects that we saved before. So this means that even if an
>>>>>>>>> object has a valid biaser, with valid epoch, if the object is not
>>>>>>>>> currently locked it could be reset during the GC. I'm not sure though
>>>>>>>>> if whenever markOopDesc::must_be_preserved_with_bias() returns false
>>>>>>>>> the garbage collector always does the reset or it just means it could
>>>>>>>>> reset it if it wants to. In any case I've seen that reset happening
>>>>>>>>> when doing handshakes. In fact, this is one of the reasons why the
>>>>>>>>> handshake could return that the bias was not revoked, since I don't
>>>>>>>>> check for the anonymously biased case in RevokeOneBias.
>>>>>>>>
>>>>>>>> Thanks for that very detailed set of descriptions. I won't pretend to
>>>>>>>> fully grok all the details as I'm not completely clear on the role of
>>>>>>>> the "epoch" or being anonymously biased, but I'm convinced you have a
>>>>>>>> full understanding of such things. :) In revoke_and_rebias it was always
>>>>>>>> a struggle for me to figure out exactly when the "rebias" part could
>>>>>>>> come into play.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> The main concern with a change like this (as with all the handshake
>>>>>>>>>> changes) is what new races this may allow and whether they have all
>>>>>>>>>> been accounted for. IIUC the handshake will still be conducted by the
>>>>>>>>>> VMThread so that still ensures serialization wrt. safepoints (which
>>>>>>>>>> makes it simpler to reason about things). I've looked at some of the
>>>>>>>>>> races you anticipated (like the "resurrected" thread) and they seem to
>>>>>>>>>> be handled correctly. I'm unable to construct other races that might
>>>>>>>>>> be problematic (but that isn't saying a lot :) ).
>>>>>>>>> I agree that since we are now doing the revocation outside safepoints
>>>>>>>>> there is potential for additional races. But also one thing to note is
>>>>>>>>> that RevokeOneBias, which contains the logic of the handshake and is
>>>>>>>>> now replacing what we used to do at a safepoint, is not really
>>>>>>>>> different from the initial code in revoke_and_rebias() which is done
>>>>>>>>> outside safepoints. The handshake logic is like executing that initial
>>>>>>>>> part but with the right JavaThread so that if the object has a valid
>>>>>>>>> biaser, then that biaser is either ourselves or we are the VMThread
>>>>>>>>> while the biaser is blocked, so that we can execute revoke_own_lock().
>>>>>>>>> In fact I was thinking at some point to combine them in some method
>>>>>>>>> (maybe try_fast_revoke()). The attempt_rebias flag and the
>>>>>>>>> update_heuristics() in revoke_and_rebias() complicated things so I kept
>>>>>>>>> them separate.
>>>>>>>>> I have also tried to think on all possible racy scenarios and couldn't
>>>>>>>>> find additional problems beside the "resurrected thread" one (although
>>>>>>>>> it's also not a guarantee of anything). But that's why I was thinking
>>>>>>>>> to check this in 14, so that if there are any problems we have plenty
>>>>>>>>> of testing time to detect them.
>>>>>>>>
>>>>>>>> Yes that is a good idea. No need to rush this into 13.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>>>>
>>>>>>>>>> Is it the case that this event is now never generated from a
>>>>>>>>>> safepoint? Or have you just deleted the safepoint_id from the event
>>>>>>>>>> because it might not be at a safepoint? If the latter can't we keep it
>>>>>>>>>> and use 0 to indicate "not at a safepoint"? I think the JFR folk need
>>>>>>>>>> to comment on this part of the change anyway.
>>>>>>>>> This event will be created and commited only from
>>>>>>>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual
>>>>>>>>> handshake that revoked the bias could be executed at a safepoint only
>>>>>>>>> if ThreadLocalHandshakes is false. But I understand that this is true
>>>>>>>>> for all platforms so the handshake should always be executed outside
>>>>>>>>> safepoints.
>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>
>>>>>>>>> It would be great if JFR folks review this part.
>>>>>>>>
>>>>>>>> Try to grab Markus :)
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>>>>>>>
>>>>>>>>>> I second Dan's comment about combining cleanup and code motion in a
>>>>>>>>>> big change like this - it does make it much harder to spot the real
>>>>>>>>>> difference.
>>>>>>>>> Ok, already two objections on this so I'll revert moving the heuristics
>>>>>>>>> part. I think I also moved clean_up_cached_monitor_info() and I will
>>>>>>>>> double check any other movements.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I note Dan picked up on the lack of p2i and other stuff related to the
>>>>>>>>>> logging statements, and that you indicated they were fixed. I note
>>>>>>>>>> that all that stuff is pre-existing so I'm unclear now whether you
>>>>>>>>>> have fixed all the logging in the file or only the statements in the
>>>>>>>>>> code you have changed or added? Again such cleanup may be best done
>>>>>>>>>> separately.
>>>>>>>>> I haven't fixed the existing log statements, only the ones Dan
>>>>>>>>> mentioned which are in single_revoke_with_handshake(),
>>>>>>>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can fix the
>>>>>>>>> other ones in a cleanup later along with code movement and the removal
>>>>>>>>> of the attemp_rebias flag which we are not using.
>>>>>>>>
>>>>>>>> Okay. To be clear I don't expect you to fix all the existing uses I just
>>>>>>>> wanted to clarify which ones you had fixed.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread*
>>>>>>>>>> biased_locker) {
>>>>>>>>>> 641 assert(!SafepointSynchronize::is_at_safepoint() ||
>>>>>>>>>> !ThreadLocalHandshakes,
>>>>>>>>>> 642 "if ThreadLocalHandshakes is enabled this should always
>>>>>>>>>> be executed outside safepoints");
>>>>>>>>>> 643 assert(Thread::current() == biased_locker ||
>>>>>>>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>>>>>>>> 644
>>>>>>>>>>
>>>>>>>>>> This is called "revoke_own_lock" but it can also be executed by the
>>>>>>>>>> VMThread - so its not its own lock. Also we don't revoke anything
>>>>>>>>>> related to a "lock" - we revoke a bias from the markword of an oop. I
>>>>>>>>>> think a better name is needed.
>>>>>>>>> Yes, I didn't really like it either. How about walk_stack_and_revoke() ?
>>>>>>>>
>>>>>>>> That sounds good to me. Roll on v2 :)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> src/hotspot/share/runtime/handshake.cpp
>>>>>>>>>>
>>>>>>>>>> 125 log_trace(handshake)("JavaThread " INTPTR_FORMAT " is not
>>>>>>>>>> alive", (intptr_t)_target);
>>>>>>>>>>
>>>>>>>>>> Use p2i(_target) rather than cast to intptr_t.
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> That's all from me.
>>>>>>>>> Thanks for looking into this David! If you are okay with the
>>>>>>>>> "walk_stack_and_revoke()" name then I can send v2.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Patricio
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Could you review this patch that uses thread local handshakes instead
>>>>>>>>>>> of safepoints to revoke the biases of locked objects?
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>>>>>>>> Bug:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Today whenever a JavaThread needs to revoke the bias of an object
>>>>>>>>>>> that has been biased by another JavaThread (and where the epoch is
>>>>>>>>>>> still valid and the prototype header of the class still has the bias
>>>>>>>>>>> pattern) it needs to request a safepoint operation. The VMThread
>>>>>>>>>>> inside the safepoint walks the stack of the biaser looking for lock
>>>>>>>>>>> records associated with the biased object, and converts them to thin
>>>>>>>>>>> locks if any are found.
>>>>>>>>>>> This patch uses thread local handshakes instead, since we actually
>>>>>>>>>>> only need to be able to safely walk the stack of the JavaThread that
>>>>>>>>>>> biased the object and not other JavaThreads.
>>>>>>>>>>>
>>>>>>>>>>> Some notes about the patch:
>>>>>>>>>>>
>>>>>>>>>>> - Thanks to Robbin for initial work on this patch and for advice and
>>>>>>>>>>> feedback!
>>>>>>>>>>> - We still execute bulk rebias and bulk revoke operations inside
>>>>>>>>>>> safepoints, since in those cases all the JavaThread's stacks need to
>>>>>>>>>>> be walked to potentially update lock records.
>>>>>>>>>>> - The method revoke_bias() was renamed to
>>>>>>>>>>> single_revoke_at_safepoint(). This method is still kept because there
>>>>>>>>>>> are places where we check whether we are already at safepoint when
>>>>>>>>>>> trying to revoke. In those cases, if we are already at a safepoint we
>>>>>>>>>>> simply end up calling this method.
>>>>>>>>>>> - Handshakes are executed as VMOperations so the VMThread is still
>>>>>>>>>>> involved in the revocation. This means we cannot have different
>>>>>>>>>>> revocations being executed in parallel (same as with safepoints).
>>>>>>>>>>> Ideally we would like to execute thread local handshakes without
>>>>>>>>>>> needing for the VMThread to participate. However, now other
>>>>>>>>>>> JavaThreads that do not participate in the revocation are allow to
>>>>>>>>>>> continue making progress.
>>>>>>>>>>>
>>>>>>>>>>> Run several benchmarks and mostly performance seems unaffected.
>>>>>>>>>>> Measured the average time it takes for revoking bias with a handshake
>>>>>>>>>>> and with a safepoint and numbers are pretty similar varying between
>>>>>>>>>>> benchmarks. Some numbers are shown below:
>>>>>>>>>>>
>>>>>>>>>>> specjbb2015
>>>>>>>>>>> Handshakes Safepoints
>>>>>>>>>>> Linux 4ms 4.6ms
>>>>>>>>>>> Windows 11ms 19ms
>>>>>>>>>>>
>>>>>>>>>>> startup benchmarks
>>>>>>>>>>> Handshakes Safepoints
>>>>>>>>>>> Linux 159us 248us
>>>>>>>>>>> Windows 150us 111us
>>>>>>>>>>>
>>>>>>>>>>> Overall the variation is not enough to show significant difference in
>>>>>>>>>>> performance, considering also that revocations of a valid biaser are
>>>>>>>>>>> usually a fraction of the overall running time of a benchmark
>>>>>>>>>>> (specially jbb2015). In any case using handshakes allows other
>>>>>>>>>>> JavaThreads to make progress during that time, minimizing STW
>>>>>>>>>>> operations.
>>>>>>>>>>>
>>>>>>>>>>> In terms of testing, the patch passed several runs of tiers1-6 in
>>>>>>>>>>> mach5 on Windows, Linux, MacOS and Solaris.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Patricio
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list