RFR 8191890: Biased locking still uses the inferior stop the world safepoint for revocation
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Wed Jun 19 21:35:29 UTC 2019
On 6/19/19 1:08 AM, Patricio Chilano wrote:
> Hi all,
>
> Here is v03, it contains the fixes suggested by Markus and Coleen:
>
> Full: http://cr.openjdk.java.net/~pchilanomate/8191890/v03/webrev/
> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v03/webrev/>
> Inc: http://cr.openjdk.java.net/~pchilanomate/8191890/v03/inc/webrev/
> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v03/inc/webrev/>
>
> The only extra change that I made was moving the clean of the cache
> and the declaration of ResourceMark outside of
> walk_stack_and_revoke(), since that was preventing the optimization of
> using the cache when revoking with revoke(GrowableArray<Handle>* objs,
> JavaThread *).
>
> Coleen: I didn't simplified methods revoke(GrowableArray<Handle>*
> objs...) and revoke_at_safepoint(GrowableArray<Handle>* objs) after
> all. I tried to do it by replacing them with a single method that
> would mostly call walk_stack_and_revoke() for revocations, but that
> causes problems similar to 8225351, since the passed array might
> contain an object not biased towards the expected JavaThread.
Good. It seems like better to do once 8225351 is fixed and as its own
change.
The change looks good to me!
Coleen
>
> Tested with mach5 tiers1-6.
>
> Thanks!
>
> Patricio
>
> On 6/17/19 6:55 PM, Patricio Chilano wrote:
>> Hi Coleen,
>>
>> On 6/17/19 4:55 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> On 6/17/19 2:14 PM, Patricio Chilano wrote:
>>>> Hi Coleen,
>>>>
>>>> On 6/14/19 7:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Sorry for being late to the party.
>>>>>
>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/src/hotspot/share/runtime/biasedLocking.cpp.frames.html
>>>>>
>>>>>
>>>>> 586 if (_biased_locker == mark->biased_locker()) {
>>>>> 587 if (mark->bias_epoch() == prototype->bias_epoch()) {
>>>>>
>>>>> Can you add a comment what this means? The object's biased
>>>>> locker matches what we thought it was, and the epoch being the
>>>>> same means?
>>>>> The epoch being equal means that this biaser actually might have
>>>>> this lock? A comment would be good here.
>>>> Yes, if the epoch is still valid it means the biaser could be
>>>> currently synchronized on this object. If that's the case then we
>>>> must walk its stack and change those monitor records into thin
>>>> locks. Added comment.
>>>>
>>>>
>>>>> 785 mark = res_mark; // Refresh mark with the latest value.
>>>>>
>>>>>
>>>>> I don't see what this does either. I had to download your patch.
>>>>> 'mark' isn't used outside the loop and it is reloaded at the top
>>>>> of the loop.
>>>> If the CAS fails, the mark needs to be updated with the new value
>>>> so that when we get the current biaser (HR_SINGLE_REVOKE case) we
>>>> actually get the updated biaser and not the old one. If we don't do
>>>> that we could be handshaking the wrong thread, or worst we could
>>>> hit an assert in walk_stack_and_revoke() for the "blt == THREAD",
>>>> since the old thread could be ourselves.
>>>
>>> I found where it's used below now. Maybe once this function is
>>> refactored a bit, it'll be easier to see for next time. Looks good!
>>>>
>>>>
>>>>> 796 obj->cas_set_mark(prototype_header->set_age(mark->age()), mark);
>>>>>
>>>>>
>>>>> As an later enhancement, there should be some inline function in
>>>>> markOop that returns the prototype header preserving the age of
>>>>> the object, but I'll leave it to you to name.
>>>> Ok, sounds good. That particular line was preexistent but I did
>>>> added in some places
>>>> "markOopDesc::prototype()->set_age(mark->age())", which is doing
>>>> the same thing.
>>>>
>>>>
>>>>> 555 RevokeOneBias(Handle* obj, JavaThread* requesting_thread,
>>>>> JavaThread* biased_locker) ... 565 oop o = (*_obj)();
>>>>>
>>>>>
>>>>> This was pre-existing your change, but passing Handle* is not
>>>>> generally done, and is suspicious when it is because it must be
>>>>> allocated with the thread calling the function. Can you change
>>>>> this to Handle (not pointer)? I can't think why this would be
>>>>> done this way.
>>>> Fixed.
>>>>
>>>>
>>>
>>> Great. I was afraid there was some subtlety I didn't see. Please
>>> retest with this though just in case. Sometimes bits of wierdness
>>> have a strange reason that isn't documented.
>> Yes, I'll retest just in case.
>>
>>
>>>>> 870 // All objects in objs should be locked by biaser
>>>>> 871 void BiasedLocking::revoke(GrowableArray<Handle>* objs,
>>>>> JavaThread *biaser) {
>>>>>
>>>>>
>>>>> I don't see why BiasedLocking::revoke() and
>>>>> BiasedLocking::revoke_at_safepoint() are so different.
>>>>>
>>>>> The name "revoke" should be something more descriptive of the
>>>>> situation though, like revoke_for_current_thread() or something
>>>>> like that (revoke_at_safepoint's objects are from the stack too
>>>>> for the current thread...) I keep thinking "revoke" should be a
>>>>> leaf function in biasedLocking.
>>>> Yes, nice observation. Method
>>>> revoke_at_safepoint(GrowableArray<Handle>* objs) could be removed
>>>> and we could just use BiasedLocking::revoke(GrowableArray<Handle>*
>>>> objs ...) instead, since it's called from deoptimization.cpp where
>>>> all the objects in the array belong to the same JavaThread. The
>>>> difference is that we don't do update_heuristics() for the
>>>> non-safepoint case since it might trigger a bulk operation. For the
>>>> safepoint case it doesn't matter because we are already at one, we
>>>> don't have the overhead of requesting it. But I could combine them
>>>> into one method and do the update_heuristics() only if we are at a
>>>> safepoint, what do you think?
>>>
>>> You could file a follow-up RFE for this if you want, since the
>>> current version has gone through all the testing.
>> I need to test again anyways and I think this could be a nice
>> simplification. I'll test it and try to include it in v3.
>>
>>> The change looks great!
>> Thanks Coleen! : )
>>
>>
>> Patricio
>>> Coleen
>>>>
>>>>> The change looks really good to me and I look forward to further
>>>>> cleanups so maybe it'll make sense someday!
>>>> Thanks for looking at this Coleen!
>>>>
>>>>
>>>> Patricio
>>>>> Thanks!
>>>>> Coleen
>>>>>
>>>>> On 6/7/19 12:56 AM, Patricio Chilano wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Here is v02 addressing comments made by Dan and David.
>>>>>>
>>>>>> Full webrev:
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/webrev/
>>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/webrev/>
>>>>>> Inc webrev:
>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v02/inc/webrev/
>>>>>> <http://cr.openjdk.java.net/%7Epchilanomate/8191890/v02/inc/>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Patricio
>>>>>>
>>>>>> On 6/6/19 7:37 PM, David Holmes wrote:
>>>>>>> Hi Patricio,
>>>>>>>
>>>>>>> On 7/06/2019 9:19 am, Patricio Chilano wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/6/19 3:37 AM, David Holmes wrote:
>>>>>>>>> Hi Patricio,
>>>>>>>>>
>>>>>>>>> First thanks for taking this on!
>>>>>>>>>
>>>>>>>>> I have some higher-level general discussion around this before
>>>>>>>>> deep diving into the actual code review (not that I have much
>>>>>>>>> there either :)).
>>>>>>>>>
>>>>>>>>> First to clarify how biased-locking works. I'm unclear when an
>>>>>>>>> object can be rebiased after having its bias revoked? This
>>>>>>>>> particularly relates to some of your assertions (as Markus has
>>>>>>>>> queried) after the CAS to update the mark so that the bias is
>>>>>>>>> revoked, and you then re-read the mark and assert the bias has
>>>>>>>>> been revoked - what stops another thread from rebiasing the
>>>>>>>>> object in between those two statements? Is it that rebiasing
>>>>>>>>> cannot happen, or that it could only happen if there were an
>>>>>>>>> intervening safepoint which in turn cannot happen?
>>>>>>>> Once the bias of the object is revoked it will stay like that
>>>>>>>> forever, it cannot happen that it goes back to having the 0x5
>>>>>>>> pattern.
>>>>>>>> Also, once the bias pattern in the prototype header for a class
>>>>>>>> is revoked during a bulk revocation operation, if there is an
>>>>>>>> object of that class that still has the bias pattern, a
>>>>>>>> JavaThread that wants to synchronize on that object will always
>>>>>>>> revoke the bias first. This is why I don't check if the CAS
>>>>>>>> succeeded if the prototype of the class does not has the bias
>>>>>>>> pattern, I just assert that the object is not biased anymore.
>>>>>>>>
>>>>>>>> Below I describe the cases where an object can be rebiased.
>>>>>>>>
>>>>>>>> Once a JavaThread biases an object for the first time, there
>>>>>>>> are two cases that allows for that object to be rebiased:
>>>>>>>> 1) If the epoch in the markword becomes invalid. For this to
>>>>>>>> happen a bulk rebias operation is needed. This is why I do
>>>>>>>> check if the CAS succeeded or not for these cases, since some
>>>>>>>> other JavaThread could have rebiased it.
>>>>>>>> 2) During a full GC, objects that are biased ( some JavaThread
>>>>>>>> is set in the biaser bits) could have their markword be reset
>>>>>>>> to 0x5. This means they will become anonymously biased again
>>>>>>>> and so will look as if they were not biased yet. As to how this
>>>>>>>> logic works: At the beginning of the full GC,
>>>>>>>> BiasedLocking::preserve_marks() saves all the markwords for
>>>>>>>> those objects that are currently locked and have a bias
>>>>>>>> pattern. After that, markOopDesc::must_be_preserved_with_bias()
>>>>>>>> will be called to decide if the markword of an object should be
>>>>>>>> preserved or not. If the markword contains the bias pattern it
>>>>>>>> is never preserved. At the end BiasedLocking::restore_marks()
>>>>>>>> is called to restore the marks for those objects that we saved
>>>>>>>> before. So this means that even if an object has a valid
>>>>>>>> biaser, with valid epoch, if the object is not currently locked
>>>>>>>> it could be reset during the GC. I'm not sure though if
>>>>>>>> whenever markOopDesc::must_be_preserved_with_bias() returns
>>>>>>>> false the garbage collector always does the reset or it just
>>>>>>>> means it could reset it if it wants to. In any case I've seen
>>>>>>>> that reset happening when doing handshakes. In fact, this is
>>>>>>>> one of the reasons why the handshake could return that the bias
>>>>>>>> was not revoked, since I don't check for the anonymously biased
>>>>>>>> case in RevokeOneBias.
>>>>>>>
>>>>>>> Thanks for that very detailed set of descriptions. I won't
>>>>>>> pretend to fully grok all the details as I'm not completely
>>>>>>> clear on the role of the "epoch" or being anonymously biased,
>>>>>>> but I'm convinced you have a full understanding of such things.
>>>>>>> :) In revoke_and_rebias it was always a struggle for me to
>>>>>>> figure out exactly when the "rebias" part could come into play.
>>>>>>>
>>>>>>>>
>>>>>>>>> The main concern with a change like this (as with all the
>>>>>>>>> handshake changes) is what new races this may allow and
>>>>>>>>> whether they have all been accounted for. IIUC the handshake
>>>>>>>>> will still be conducted by the VMThread so that still ensures
>>>>>>>>> serialization wrt. safepoints (which makes it simpler to
>>>>>>>>> reason about things). I've looked at some of the races you
>>>>>>>>> anticipated (like the "resurrected" thread) and they seem to
>>>>>>>>> be handled correctly. I'm unable to construct other races that
>>>>>>>>> might be problematic (but that isn't saying a lot :) ).
>>>>>>>> I agree that since we are now doing the revocation outside
>>>>>>>> safepoints there is potential for additional races. But also
>>>>>>>> one thing to note is that RevokeOneBias, which contains the
>>>>>>>> logic of the handshake and is now replacing what we used to do
>>>>>>>> at a safepoint, is not really different from the initial code
>>>>>>>> in revoke_and_rebias() which is done outside safepoints. The
>>>>>>>> handshake logic is like executing that initial part but with
>>>>>>>> the right JavaThread so that if the object has a valid biaser,
>>>>>>>> then that biaser is either ourselves or we are the VMThread
>>>>>>>> while the biaser is blocked, so that we can execute
>>>>>>>> revoke_own_lock(). In fact I was thinking at some point to
>>>>>>>> combine them in some method (maybe try_fast_revoke()). The
>>>>>>>> attempt_rebias flag and the update_heuristics() in
>>>>>>>> revoke_and_rebias() complicated things so I kept them separate.
>>>>>>>> I have also tried to think on all possible racy scenarios and
>>>>>>>> couldn't find additional problems beside the "resurrected
>>>>>>>> thread" one (although it's also not a guarantee of anything).
>>>>>>>> But that's why I was thinking to check this in 14, so that if
>>>>>>>> there are any problems we have plenty of testing time to detect
>>>>>>>> them.
>>>>>>>
>>>>>>> Yes that is a good idea. No need to rush this into 13.
>>>>>>>
>>>>>>>>
>>>>>>>>> src/hotspot/share/jfr/metadata/metadata.xml
>>>>>>>>>
>>>>>>>>> Is it the case that this event is now never generated from a
>>>>>>>>> safepoint? Or have you just deleted the safepoint_id from the
>>>>>>>>> event because it might not be at a safepoint? If the latter
>>>>>>>>> can't we keep it and use 0 to indicate "not at a safepoint"? I
>>>>>>>>> think the JFR folk need to comment on this part of the change
>>>>>>>>> anyway.
>>>>>>>> This event will be created and commited only from
>>>>>>>> BiasedLocking::single_revoke_with_handshake(). Now, the actual
>>>>>>>> handshake that revoked the bias could be executed at a
>>>>>>>> safepoint only if ThreadLocalHandshakes is false. But I
>>>>>>>> understand that this is true for all platforms so the handshake
>>>>>>>> should always be executed outside safepoints.
>>>>>>>
>>>>>>> Ok.
>>>>>>>
>>>>>>>> It would be great if JFR folks review this part.
>>>>>>>
>>>>>>> Try to grab Markus :)
>>>>>>>
>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>>>>>>
>>>>>>>>> I second Dan's comment about combining cleanup and code motion
>>>>>>>>> in a big change like this - it does make it much harder to
>>>>>>>>> spot the real difference.
>>>>>>>> Ok, already two objections on this so I'll revert moving the
>>>>>>>> heuristics part. I think I also moved
>>>>>>>> clean_up_cached_monitor_info() and I will double check any
>>>>>>>> other movements.
>>>>>>>>
>>>>>>>>
>>>>>>>>> I note Dan picked up on the lack of p2i and other stuff
>>>>>>>>> related to the logging statements, and that you indicated they
>>>>>>>>> were fixed. I note that all that stuff is pre-existing so I'm
>>>>>>>>> unclear now whether you have fixed all the logging in the file
>>>>>>>>> or only the statements in the code you have changed or added?
>>>>>>>>> Again such cleanup may be best done separately.
>>>>>>>> I haven't fixed the existing log statements, only the ones Dan
>>>>>>>> mentioned which are in single_revoke_with_handshake(),
>>>>>>>> revoke_own_lock(), and in VM_HandshakeOneThread(). Ok, I can
>>>>>>>> fix the other ones in a cleanup later along with code movement
>>>>>>>> and the removal of the attemp_rebias flag which we are not using.
>>>>>>>
>>>>>>> Okay. To be clear I don't expect you to fix all the existing
>>>>>>> uses I just wanted to clarify which ones you had fixed.
>>>>>>>
>>>>>>>>
>>>>>>>>> 640 void BiasedLocking::revoke_own_lock(oop obj, JavaThread*
>>>>>>>>> biased_locker) {
>>>>>>>>> 641 assert(!SafepointSynchronize::is_at_safepoint() ||
>>>>>>>>> !ThreadLocalHandshakes,
>>>>>>>>> 642 "if ThreadLocalHandshakes is enabled this should
>>>>>>>>> always be executed outside safepoints");
>>>>>>>>> 643 assert(Thread::current() == biased_locker ||
>>>>>>>>> Thread::current()->is_VM_thread(), "wrong thread");
>>>>>>>>> 644
>>>>>>>>>
>>>>>>>>> This is called "revoke_own_lock" but it can also be executed
>>>>>>>>> by the VMThread - so its not its own lock. Also we don't
>>>>>>>>> revoke anything related to a "lock" - we revoke a bias from
>>>>>>>>> the markword of an oop. I think a better name is needed.
>>>>>>>> Yes, I didn't really like it either. How about
>>>>>>>> walk_stack_and_revoke() ?
>>>>>>>
>>>>>>> That sounds good to me. Roll on v2 :)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/handshake.cpp
>>>>>>>>>
>>>>>>>>> 125 log_trace(handshake)("JavaThread " INTPTR_FORMAT "
>>>>>>>>> is not alive", (intptr_t)_target);
>>>>>>>>>
>>>>>>>>> Use p2i(_target) rather than cast to intptr_t.
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>
>>>>>>>>> That's all from me.
>>>>>>>> Thanks for looking into this David! If you are okay with the
>>>>>>>> "walk_stack_and_revoke()" name then I can send v2.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>>
>>>>>>>> Patricio
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 30/05/2019 2:29 am, Patricio Chilano wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Could you review this patch that uses thread local handshakes
>>>>>>>>>> instead of safepoints to revoke the biases of locked objects?
>>>>>>>>>>
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8191890/v01/webrev/
>>>>>>>>>> Bug:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8191890
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Today whenever a JavaThread needs to revoke the bias of an
>>>>>>>>>> object that has been biased by another JavaThread (and where
>>>>>>>>>> the epoch is still valid and the prototype header of the
>>>>>>>>>> class still has the bias pattern) it needs to request a
>>>>>>>>>> safepoint operation. The VMThread inside the safepoint walks
>>>>>>>>>> the stack of the biaser looking for lock records associated
>>>>>>>>>> with the biased object, and converts them to thin locks if
>>>>>>>>>> any are found.
>>>>>>>>>> This patch uses thread local handshakes instead, since we
>>>>>>>>>> actually only need to be able to safely walk the stack of the
>>>>>>>>>> JavaThread that biased the object and not other JavaThreads.
>>>>>>>>>>
>>>>>>>>>> Some notes about the patch:
>>>>>>>>>>
>>>>>>>>>> - Thanks to Robbin for initial work on this patch and for
>>>>>>>>>> advice and feedback!
>>>>>>>>>> - We still execute bulk rebias and bulk revoke operations
>>>>>>>>>> inside safepoints, since in those cases all the JavaThread's
>>>>>>>>>> stacks need to be walked to potentially update lock records.
>>>>>>>>>> - The method revoke_bias() was renamed to
>>>>>>>>>> single_revoke_at_safepoint(). This method is still kept
>>>>>>>>>> because there are places where we check whether we are
>>>>>>>>>> already at safepoint when trying to revoke. In those cases,
>>>>>>>>>> if we are already at a safepoint we simply end up calling
>>>>>>>>>> this method.
>>>>>>>>>> - Handshakes are executed as VMOperations so the VMThread is
>>>>>>>>>> still involved in the revocation. This means we cannot have
>>>>>>>>>> different revocations being executed in parallel (same as
>>>>>>>>>> with safepoints). Ideally we would like to execute thread
>>>>>>>>>> local handshakes without needing for the VMThread to
>>>>>>>>>> participate. However, now other JavaThreads that do not
>>>>>>>>>> participate in the revocation are allow to continue making
>>>>>>>>>> progress.
>>>>>>>>>>
>>>>>>>>>> Run several benchmarks and mostly performance seems
>>>>>>>>>> unaffected. Measured the average time it takes for revoking
>>>>>>>>>> bias with a handshake and with a safepoint and numbers are
>>>>>>>>>> pretty similar varying between benchmarks. Some numbers are
>>>>>>>>>> shown below:
>>>>>>>>>>
>>>>>>>>>> specjbb2015
>>>>>>>>>> Handshakes Safepoints
>>>>>>>>>> Linux 4ms 4.6ms
>>>>>>>>>> Windows 11ms 19ms
>>>>>>>>>>
>>>>>>>>>> startup benchmarks
>>>>>>>>>> Handshakes Safepoints
>>>>>>>>>> Linux 159us 248us
>>>>>>>>>> Windows 150us 111us
>>>>>>>>>>
>>>>>>>>>> Overall the variation is not enough to show significant
>>>>>>>>>> difference in performance, considering also that revocations
>>>>>>>>>> of a valid biaser are usually a fraction of the overall
>>>>>>>>>> running time of a benchmark (specially jbb2015). In any case
>>>>>>>>>> using handshakes allows other JavaThreads to make progress
>>>>>>>>>> during that time, minimizing STW operations.
>>>>>>>>>>
>>>>>>>>>> In terms of testing, the patch passed several runs of
>>>>>>>>>> tiers1-6 in mach5 on Windows, Linux, MacOS and Solaris.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Patricio
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list