RFR (S): 8182703: Correct G1 barrier queue lock orderings
Erik Österlund
erik.osterlund at oracle.com
Fri Jul 7 15:17:39 UTC 2017
Hi Kim,
Added hotspot-dev as requested.
To answer your worries we must first understand what invariant these
checks for 'special' locks seek to achieve.
The invariant is, AFAIK, that locks with a rank of 'special' and below
(now including 'access' as well as 'event' that was already there
before) must *not* check for safepoints when grabbed from JavaThreads.
Safepoint checks translate to performing ThreadBlockInVM which must
*not* be performed in 'special' or below ranked locks.
The reason this is necessary, is that those locks must be usable from
safepoint-unsafe places, e.g. leaf calls, where we can not yield to the
safepoint synchronizer. I believe that must have been what the name
'special' originated from, correct me if I'm wrong. Since locking in
safepoint-unsafe places potentially blocks the safepoint synchronizer, a
deadlock would arise if a thread, 'Thread A', acquires a special lock
with Java/VM thread state. But that special lock is held by another
thread, 'Thread B', that has blocked on a non-special lock, because it
yielded to the safepoint synchronizer on the VM thread, that is
concurrently synchronizing a safepoint. However, the safepoint
synchronizer is blocked waiting for the thread acquiring the special
lock to yield. But it never will. In the end, 'Thread A' waits for
'Thread B' to release the special lock, and 'Thread B' waits for the
VMThread that is safepoint synchronizing, and the VMThread is waiting
for 'Thread A' - a deadlock.
With the provided invariant however, it is impossible to acquire
non-safepoint non-special locks while already holding 'special' and
below locks. Therefore, by checking for the condition of the invariant,
we will be certain such a deadlock can not happen, as that would violate
the usual lock rank ordering that we all know about - one can not
acquire a lock that has a rank higher than locks already acquired.
First of all, before we delve deeper into this rabbit hole. Note that if
our new 'access' locks are used properly, e.g. do not perform safepoint
checks or call code that needs to have a safepoint-safe state under the
lock, we are good. The G1 barrier code conforms to those restrictions
and the code has since forever been called in leaf calls while in Java
thread state, without oop maps, making it a not safepoint-safe state. So
that is all good. The remaining question to answer is, had we done crazy
things while holding those access locks, would the deadlock detection
system have detected that?
To answer that question we must examine whether that invariant holds or not.
There are three paths for locking a Mutex/Monitor, try_lock(), lock()
and lock_without_safepoint_check(). Among these, only lock() violates
the invariant for 'special' and below locks taken from a JavaThread. It
is the only locking operation that performs a safepoint check.
try_lock() instead returns false if the lock could not be acquired, and
lock_without_safepoint_check does not check for safepoints.
MutexLocker and MutexLockerEx are abstractions around Mutex that boil
down to calls to either lock() or lock_without_safepoint_check(). So if
lock() catches the broken invariant, all locking in hotspot will somehow
catch it. Let's examine what happens in lock().
share/vm/runtime/mutex.cpp:933:21: assert(rank() > Mutex::special,
"Potential deadlock with special or lesser rank mutex");
This is the check in Monitor::lock() *with* safepoint check. It
definitely catches illegal uses of lock() on 'special', 'event' and
'access' ranked locks on JavaThreads. Since we always catch misuse of
special and below ranked locks here, the deadlock detection system works
correctly even for 'event' and the new 'access' ranked locks. The other
checks are mostly redundant and will eventually boil down to this check.
Examples:
share/vm/runtime/mutexLocker.hpp:165:29: assert(mutex->rank() !=
Mutex::special
share/vm/runtime/mutexLocker.hpp:173:29: assert(mutex->rank() !=
Mutex::special,
These two are constructors for MutexLocker which will call lock().
Therefore, this check is redundant. The 'event' and 'access' ranks will
both miss this redundant check, but subsequently run into the check in
lock(), which is the one that matters and still catches the broken
invariant.
share/vm/runtime/mutexLocker.hpp:208:30: assert(mutex->rank() >
Mutex::special || no_safepoint_check,
This check in MutexLockerEx checks that either the lock should be over
special or it must not check for safepoints. This works as intended with
'event' and 'access' locks. They are forced to perform a lock without
safepoint check, and hence not enter the lock() path of Mutex. However,
if they did, it would still be redundantly sanity checked there too.
share/vm/runtime/mutex.cpp:1384:23: || rank() == Mutex::special,
"wrong thread state for using locks");
share/vm/runtime/mutex.cpp:1389:30: debug_only(if (rank() !=
Mutex::special) \
These two check checks are found in Monitor::check_prelock_state().
Which is called from lock() and try_lock(). As for lock, we already
concluded that using lock() *at all* on a special lock from a JavaThread
will be found out and an assert will trigger. So these checks for
special locks seem to be a bit redundant. As for the try_lock() case,
they even seem wrong. Checking for valid safepoint state for a locking
operation that can't block seems wrong. But that is invariant of whether
the lock is special or not. It just should not check for safepoint-safe
states on try_lock().
share/vm/runtime/thread.cpp:903:27: cur->rank() ==
Mutex::special) {
This check is in Thread::check_for_valid_safepoint_state() where it
walks all the monitors and makes sure that we don't have any of
{Threads_lock, Compile_lock, VMOperationRequest_lock,
VMOperationQueue_lock} and allow_vm_block() or it contains any special
lock. This is performed when e.g. allocating etc. This check should
arguably check for special *and below* ranked locks that should not be
acquired while in safepoint-safe. However, those access locks return
true on allow_vm_block(), and therefore will correctly be detected as
dangerous had we done crazy things under these locks.
All in all, I believe that the deadlock detecion system has some
redundant, and some confusing checks that involve the lock rank
Mutex::special. But I do believe that it works and would detect
deadlocks, but could do with some reworking to make it more explicit.
And that is invariant of the new access rank and applies equally to the
event rank.
However, since these access locks play well with the current deadlock
detection as they do not do anything illegal, and even if use of these
locks did indeed do illegal things, it would still be detected by the
deadlock detection system, it is reasonable to say that refactoring the
deadlock detection system is a separate RFE?
Specifically, clarifying the deadlock detection system by removing
redundant checks, not checking for safepoint-safe state in try_lock as
well as explicitly listing special and below locks as illegal when
verifying Thread::check_for_valid_safepoint_state(), regardles of
whether allow_vm_block() is true or not. Sounds like a separate RFE to me!
Thanks,
/Erik
On 2017-07-06 22:15, Kim Barrett wrote:
>> On Jul 6, 2017, at 4:11 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>
>>> On Jul 4, 2017, at 10:00 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>> The lock ranking changes look good.
>> I'm going to retract that.
>>
>> How does these new lock rankings interact with various assertions that
>> rank() == or != Mutex::special? I'm not sure those places handle
>> these new ranks properly. (I'm not sure those places handle
>> Mutex::event rank properly either.)
> And maybe this change needs to be discussed on hotspot-dev rather than hotspot-gc-dev.
>
More information about the hotspot-gc-dev
mailing list