RFR: 8166197: assert(RelaxAssert || w != Thread::current()->_MutexEvent) failed: invariant
David Holmes
david.holmes at oracle.com
Thu Oct 13 00:54:31 UTC 2016
Hi Carsten,
Thanks for looking at this.
On 12/10/2016 11:21 PM, Carsten Varming wrote:
> Dear David,
>
> In line 590 "Pass onDeck to w". I don't understand this part of the
> comment. Should it say something like "Pass ownership of _OnDeck and
> _EntryList to w".
First note that that part of the comment already existed in the old code:
589 _OnDeck = w; // pass OnDeck to w.
This is making 'w' the OnDeck thread - nothing to do with _EntryList.
Prior to executing this line the current thread holds the "OnDeck lock"
- which is just a logical lock obtained by CASing _OnDeck from 0 to 1.
The current thread selects 'w' (as the head of _EntryList) to be the new
OnDeck thread, and passes that role to it through the assignment - and
of course by doing so drops the "OnDeck lock". I changed to read:
// Pass OnDeck role to w, ensuring that _EntryList has been set first.
> In line 532 there is a read of _OnDeck and in line 553 there is a read
> of _EntryList. Why is it safe not to do a load_acquire of _OnDeck in
> line 532?
<sigh> I knew I should have just stuck in a storestore() then we could
all be blissfully ignorant :) Okay here's that code fragment with
comments elided
514 void Monitor::IUnlock(bool RelaxAssert) {
529 OrderAccess::release_store(&_LockWord.Bytes[_LSBINDEX], 0); //
drop outer lock
530
531 OrderAccess::storeload();
532 ParkEvent * const w = _OnDeck;
533 assert(RelaxAssert || w != Thread::current()->_MutexEvent,
"invariant");
534 if (w != NULL) {
548 if ((UNS(w) & _LBIT) == 0) w->unpark();
549 return;
550 }
A release-store to X is used to ensure that shared data written prior to
the store to X actually occurs prior to that store. A load-acquire of X
ensures that if the load sees the value written by the store-release
then it also sees the updates to the shared data. In the current case
the release-store to _OnDeck writes the non-NULL value 'w', and if the
code above sees a non-NULL value (ie it sees 'w') then at most it
unparks 'w' and returns. It never accesses the shared data ie
_EntryList. So no load-acquire is needed as we are not trying to
"synchronize" with the changes to the shared state made by the thread
that did the release-store. I've added a comment as per my response to
Dan's email
Now lets look at all the other loads of _OnDeck:
- there are a bunch of asserts such as:
444 assert(_OnDeck != Self->_MutexEvent, "invariant");
588 assert(UNS(_OnDeck) == _LBIT, "invariant");
836 assert(_OnDeck != ESelf, "invariant");
1197
assert((UNS(_owner)|UNS(_LockWord.FullWord)|UNS(_EntryList)|UNS(_WaitSet)|UNS(_OnDeck))
== 0, "");
and a load that is only used for an assert:
1161 uintptr_t ondeck = UNS(_OnDeck);
These are all logical checks on the current state and are not attempting
to synchronize with any other changes to shared state, so no need for
load-acquire. I hope there is no controversy on that point.
- there are some CAS operations which already embody full bi-directional
fences which subsume the load-acquire (whether directly needed to
synchronize with the store-release or not)
466 if ((NativeMonitorFlags & 32) && CASPTR (&_OnDeck, NULL,
UNS(ESelf)) == 0) {
573 if (CASPTR (&_OnDeck, NULL, _LBIT) != UNS(NULL)) {
- then we have:
843 if (_OnDeck == ESelf && TrySpin(Self)) break;
I missed this one - it needs the load-acquire for the same reason as the
code in ILock - which should have been obvious given:
841 // The following fragment is extracted from Monitor::ILock()
:)
Webrev updated in place - also see my response to Dan.
Thanks,
David
> Carsten
>
>
> On Wed, Oct 12, 2016 at 1:08 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8166197
> <https://bugs.openjdk.java.net/browse/JDK-8166197>
> webrev: http://cr.openjdk.java.net/~dholmes/8166197/webrev/
> <http://cr.openjdk.java.net/~dholmes/8166197/webrev/>
>
> In IUnlock we have the following succession code to wakeup the
> "onDeck" thread:
>
> ParkEvent * List = _EntryList;
> if (List != NULL) {
> // Transfer the head of the EntryList to the OnDeck position.
> // Once OnDeck, a thread stays OnDeck until it acquires the lock.
> // For a given lock there is at most OnDeck thread at any one
> instant.
> WakeOne:
> assert(List == _EntryList, "invariant");
> ParkEvent * const w = List;
> assert(RelaxAssert || w != Thread::current()->_MutexEvent,
> "invariant");
> _EntryList = w->ListNext;
> // as a diagnostic measure consider setting w->_ListNext = BAD
> assert(UNS(_OnDeck) == _LBIT, "invariant");
> _OnDeck = w; // pass OnDeck to w.
>
> It is critical that the update to _EntryList happens before we set
> _OnDeck, as as soon as _OnDeck is set the selected thread (which
> need not yet have parked) can acquire the mutex, complete its
> critical section and proceed to unlock the mutex, and so execute
> IUnlock in parallel with the original thread. If the write to
> _EntryList has not yet happened that second thread finds itself
> still at the head of _EntryList and so the assert fires. If the
> write to _EntryList happens after the load "List = _EntryList", then
> the first assert can also fire.
>
> Preferred fix today is to use OrderAccess::release_store(&_OnDeck,
> w) with a matching load_acquire(&_OnDeck) in the ILock code:
>
> while (_OnDeck != ESelf) {
> ParkCommon(ESelf, 0);
> }
>
> and corresponding "raw" lock code. Also fixed a couple of typos.
>
> Thanks,
> David
>
>
More information about the hotspot-runtime-dev
mailing list