RFR: 8203817: Monitor::try_lock() should not call check_prelock_state()
David Holmes
david.holmes at oracle.com
Fri May 25 07:51:15 UTC 2018
Hi Erik,
On 25/05/2018 5:17 PM, Erik Osterlund wrote:
> Hi David,
>
> The change Per is proposing would make try_lock perform the same checks
> that locking without safepoint checks does.
Yet locking without a safepoint check also ensures
_safepoint_check_required != Monitor::_safepoint_check_always. But
try_lock does not and can be applied to both kinds of Monitors/Mutexes.
Further it also has a stated, but not checked, precondition that the
thread is not _thread_in_vm - though that may only be an issue if we
were to block.
I'm looking at the checks in check_prelock_state wondering which are
only truly needed if we risk blocking? I think most of them, with the
exception of is_crash_protected. Are we guaranteed never to run this JFR
code on the WatcherThread?
Though I'm also concerned about lock-ranking and deadlocks if we
try_lock multiple locks. Notwithstanding the rank/deadlock code is far
from ideal already.
I don't mind removing checks/guards that really don't apply in the
try_lock case, but I'm wary of removing all guards without being more
sure of that.
Thanks,
David
-----
> Perhaps locking without
> safepont checks could perform more sanity checks, but that is a separate
> issue. I think try_lock should perform the same checks that locking
> without safepoint checks does. The alternatives are then to
>
> 1) Remove checking the prelock state like Per suggests for try_lock
> (then they do the same checks), or
> 2) Overhaul the safepoint checking refactoring out the bits that check
> the safepointing sanity chrcka from other deadlock checks (and correct
> those to check for rank <= special, and not == special), remove the
> safepoint checking part from try_lock and adding the deadlock checking
> parts to lock without safepoints.
>
> Doing #2 seems like a different RFE. In fact I believe that would be
> https://bugs.openjdk.java.net/browse/JDK-8184732
> <https://bugs.openjdk.java.net/browse/JDK-8184732?jql=text%20~%20%22deadlock%20check%22> that
> I filed a while back.
>
> In summary, there is a whole bunch of problems in the deadlock detection
> system, and #2 makes it hard to not get dragged down in the rabbit hole.
> #1 is sufficient to make try_lock check as much (or little) as locking
> without safepoint checking. And I think that is enough for the scope of
> this change.
>
> Looks good to me.
>
> Thanks,
> /Erik
>
> On 25 May 2018, at 08:53, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>> Hi Per,
>>
>> Exactly what condition(s) does JFR violate? This is throwing away all
>> the checks that guard against incorrect monitor use. It's not just
>> about whether you'd block trying to acquire the Monitor, it's also
>> about whether it is safe to acquire it from that code/thread in the
>> first place. (Though I think some of the checks in there should also
>> be considering the value of _safepoint_check_required.)
>>
>> Thanks,
>> David
>>
>>
>> On 25/05/2018 4:39 PM, Per Liden wrote:
>>> In debug builds, Monitor::try_lock() calls check_prelock_state() to
>>> check the thread state, etc. The intention is to verify that the call
>>> is made from the correct context, a context where we're allowed to
>>> block and potentially safepoint. Unlike Monitor::lock(),
>>> Monitor::try_lock() will never block, hence the call to
>>> check_prelock_state() is overly strict and we should remove it.
>>> Removing it would match the behavior of all other non-blocking
>>> functions, like Monitor::lock_without_safepoint_check(), which
>>> doesn't call check_prelock_state() either (for a good reason).
>>> The specific problem I've run into with this is related to JFR.
>>> Monitor::try_lock() is by JFR to allow non-blocking event generation,
>>> so that you can generate JFR events from "any" context without risk
>>> blocking/safepointing (the logic is doing something like, if
>>> try_lock() fails then put the event on a different queue and let the
>>> next owner of the lock handle it). The overly strict checks done by
>>> check_prelock_state() in try_lock() breaks this logic, which in turn
>>> means that you can't generate JFR event from "any" context as was
>>> intended.
>>> The patch to fix this is a one-liner, just remove the call to
>>> check_prelock_state().
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203817
>>> Webrev: http://cr.openjdk.java.net/~pliden/8203817/webrev.0
>>> /Per
More information about the hotspot-dev
mailing list