RFR(XXS): 8149803: Adjust lock rankings for some Event-based tracing locks
Karen Kinnear
karen.kinnear at oracle.com
Thu Feb 25 15:35:56 UTC 2016
Markus,
It would be helpful if you could find a place to record a comment about the JfrStream_lock - that the rank needs to
be between safe point and JfrBuffer_lock. That way, should we do cleanup in this area we will know the constraints.
thanks,
Karen
> On Feb 24, 2016, at 2:30 PM, Karen Kinnear <karen.kinnear at oracle.com> wrote:
>
> Markus,
>
> See below.
>> On Feb 23, 2016, at 11:58 PM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Markus,
>>
>> On 23/02/2016 8:00 PM, Markus Gronlund wrote:
>>> Thanks for taking a look David,
>>>
>>> I have verified the new lock rankings by running Kitchensink and by code inspection.
>>>
>>> The original intent was to try to make the tracing locks transparent much like the way ttyLocker is today ("event" ranking). I did this in an effort to provide "seamless" tracing usages from higher ranking locks (leafs in particular). As you point out, the mechanics for accomplishing this becomes a bit of a force-fit in light of the existing lock_types enum. I still needed lock ordering since ttyLocker is being used under some of these locks; in addition, there is a relative order between two of these locks (that was the reason for special+1 - and yes it lands on the same ordinal as suspend_resume (but I didn't want to use that designation since it is totally separate)).
>>>
>>> I also elaborated on expanding the lock_types enum:
>>>
>>> enum lock_types {
>>> event,
>>> special,
>>> suspend_resume,
>>> traceleaf,
>>> trace,
>>> leaf,
>>> safepoint = leaf + 10,
>>> barrier = safepoint + 1,
>>> nonleaf = barrier + 1,
>>> max_nonleaf = nonleaf + 900,
>>> native = max_nonleaf + 1
>>> };
>>>
>>> This seems also to work but I am not sure about any disruptions this might cause. Would mean much more investigation to do any changes here.
>>
>> I don't think this should cause any disruptions unless we have hard-coded the enum values somewhere - which we do not seem to do. It also avoids the potential problem of having a lock with rank special+1 being considered to have rank suspend_resume in the rank checking code:
>>
>> if (this->rank() != Mutex::native &&
>> this->rank() != Mutex::suspend_resume &&
>> locks != NULL && locks->rank() <= this->rank() &&
>> ...
>>
>>> On the topic of the "special" rank comment mentioning guarantees not to block, that would be true for JfrBuffer_lock, but not for JfrStream_lock (could be blocked on the former). So this is not correct - thanks for pointing this out.
>>>
>>> I am having a rethink about this:
>>>
>>> Reducing the lock rankings underneath the ordinary leaf ranking would be for facility yes. However, doing so might cause an unwanted effect:
>>>
>>> It would become easier to generated trace events while holding these other locks. A better pattern for overall performance would be to instead save the necessary information, release the lock as soon as possible, and generate an event post-lock release (if possible). I have not yet seen a real case where this is not possible to do - I might need to revisit this if that becomes a real problem in the future.
>>>
>>> Even though it might not be the primary vehicle for enforcement, we could capitalize on the lock ordering scheme to avoid generating events underneath other locks unnecessarily.
>>>
>>> So I am retracting the lock ranking "special" suggestion.
>>>
>>> What I actually only want/need is this:
>>>
>>> - def(JfrStream_lock , Mutex, nonleaf, true,
>>> Monitor::_safepoint_check_never);
>>> + def(JfrStream_lock , Mutex, leaf+1, true,
>>> Monitor::_safepoint_check_never);
>>>
>>> The lock ordering assertions will today handle nonleaf for this lock, but only because of the assertion bailing on SafepointSynchronize::is_at_safepoint(). I would like to reduce the ranking from nonleaf to leaf+1 to gracefully handle SafepointSynchronize::is_synchronizing() and other states as well.
>>
>> I didn't quite follow all that. :) If this simpler suggestion meets your requirements and adheres to the intent of the lock-ranking protocol then that is good.
> I think what you are saying is that actually you do not adhere to the lock-ranking protocol. Rather that, when not at a safe point you adhere to the
> lock ranking protocol, but when at a safe point, the safe point check (is_a_safepoint) skips the lock ranking check.
> I think the rest of your comment is that with the new tracepoints in the safe point code, which need this lock while holding the Safepoint_lock (i.e. while getting
> to a safepoint or leaving a safe point but not during a safe point), so you need to ensure that both the JfrBuffer_lock and JfrStream_lock can be acquired
> while not actually at a safe point, and while doing a lock ordering check, while holding the Safepoint_lock. And you are maintaining the relative
> relationship of JfrBuffer_lock with JfrStream_lock.
>
> So you are just changing JfrStream_lock from non leaf to leaf+1.
>
> If that is what you are saying, I am good with the change. Ship it once sufficiently tested.
>
> thanks,
> Karen
>
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks for your feedback
>>> Markus
>>>
>>>
>>> -----Original Message-----
>>> From: David Holmes
>>> Sent: den 22 februari 2016 02:10
>>> To: Markus Gronlund; serviceability-dev at openjdk.java.net
>>> Subject: Re: RFR(XXS): 8149803: Adjust lock rankings for some Event-based tracing locks
>>>
>>> Hi Markus,
>>>
>>> On 20/02/2016 1:55 AM, Markus Gronlund wrote:
>>>> Greetings,
>>>>
>>>> Please review this small change lowering the lock rankings of some locks.
>>>
>>> Have we actually verified the new ranking constraints (ie that special guarantees not to block) by code inspection?
>>>
>>>> This is done in order to reduce the risk for potential deadlocks and
>>>> to increase the surface area for event generation.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8149803
>>>>
>>>> Patch of this tiny change is attached.
>>>
>>> - def(JfrStream_lock , Mutex, nonleaf, true,
>>> Monitor::_safepoint_check_never);
>>> + def(JfrStream_lock , Mutex, special+1, true,
>>> Monitor::_safepoint_check_never);
>>>
>>>
>>> Not clear what "special+1" is supposed to signify here - doesn't that make it the same rank as suspend_resume?
>>>
>>> enum lock_types {
>>> event,
>>> special,
>>> suspend_resume,
>>> leaf = suspend_resume + 2,
>>> safepoint = leaf + 10,
>>> barrier = safepoint + 1,
>>> nonleaf = barrier + 1,
>>> max_nonleaf = nonleaf + 900,
>>> native = max_nonleaf + 1
>>> };
>>>
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>> Thanks in advance
>>>>
>>>> Markus
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20160225/49465e0b/attachment-0001.html>
More information about the serviceability-dev
mailing list