RFR: 8166197: assert(RelaxAssert || w != Thread::current()->_MutexEvent) failed: invariant
David Holmes
david.holmes at oracle.com
Thu Oct 13 01:18:05 UTC 2016
Hi Dan,
Thanks for looking at this.
On 13/10/2016 1:03 AM, Daniel D. Daugherty wrote:
> On 10/11/16 11:08 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8166197
>> webrev: http://cr.openjdk.java.net/~dholmes/8166197/webrev/
>
> Very nice catch! We should check the ObjectMonitor succession code for
> similar issues (my task).
Yes. As I said in email I did a quick check through but the succession
logic is sufficiently different that nothing was obviously wrong in a
similar way.
>
> src/share/vm/runtime/mutex.cpp
> L466: if ((NativeMonitorFlags & 32) && CASPTR (&_OnDeck, NULL,
> UNS(ESelf)) == 0) {
> Thanks for fixing this bug also!
>
> L477: while (OrderAccess::load_ptr_acquire(&_OnDeck) != ESelf) {
> So you've changed this load of _OnDeck to use load-acquire
> which matches the new store-release on L595:
>
> OrderAccess::release_store_ptr(&_OnDeck, w);
Right.
> What about the other loads of _OnDeck or stores to _OnDeck?
> There should at least be a new comment explaining why we don't
> need an OrderAccess operation for those. Update: I see you
> changed one other load of _OnDeck on L1061. Now I'm really
> wanting comments for the other _OnDeck loads and stores. :-)
>
> Update: I see Carsten V. asked about this in a slightly different
> way.
See my reply to Carsten re the load's. I did miss one as we have three
"locking" paths that need to synchronize with the IUnlock code.
As for documenting ... for line 532 I can add something simple like:
532 ParkEvent * const w = _OnDeck; // raw load as we will just
return if non-NULL
For the other stores to _OnDeck ... CAS should be obvious. The setting
to NULL should also be quite clear as only the _OnDeck thread sets to
NULL to relinquish being _OnDeck once it has acquired the mutex, which
happens via CAS which has full barriers. None of the plain stores are in
the context of:
some_var = y; // write some shared-state
_OnDeck = NULL; // signal some_var has been updated
> L590: // Pass onDeck to w, ensuring that _EntryList has been set
> first.
> Typo: 'onDeck' -> 'OnDeck'
>
> I suspect you don't want to fix all this CamelCase usage to meet
> HotSpot style. I did that for most of the ObjectMonitor code and
> it was painful. We could clean it up early in JDK10.
I fixed the typo and also changed ONDECK to OnDeck so that we generally
refer to OnDeck in commentary unless specifically referring to the
_OnDeck field.
> Update: I see Carsten has a comment about this comment also. I
> don't think I quite agree that we're "passing" _EntryList to w,
> but I can be convinced otherwise...
Right, nothing to do with _EntryList just making w the OnDeck thread.
> Again, very nice catch! I'd like to see another webrev with the other
> _OnDeck loads and stores either updated for OrderAccess ops or some
> comment explaining why it's not needed.
webrev updated in place with one comment and one new use of
load-acquire. Plus some cosmetic changes.
Thanks again,
David
> Dan
>
>
>>
>> In IUnlock we have the following succession code to wakeup the
>> "onDeck" thread:
>>
>> ParkEvent * List = _EntryList;
>> if (List != NULL) {
>> // Transfer the head of the EntryList to the OnDeck position.
>> // Once OnDeck, a thread stays OnDeck until it acquires the lock.
>> // For a given lock there is at most OnDeck thread at any one
>> instant.
>> WakeOne:
>> assert(List == _EntryList, "invariant");
>> ParkEvent * const w = List;
>> assert(RelaxAssert || w != Thread::current()->_MutexEvent,
>> "invariant");
>> _EntryList = w->ListNext;
>> // as a diagnostic measure consider setting w->_ListNext = BAD
>> assert(UNS(_OnDeck) == _LBIT, "invariant");
>> _OnDeck = w; // pass OnDeck to w.
>>
>> It is critical that the update to _EntryList happens before we set
>> _OnDeck, as as soon as _OnDeck is set the selected thread (which need
>> not yet have parked) can acquire the mutex, complete its critical
>> section and proceed to unlock the mutex, and so execute IUnlock in
>> parallel with the original thread. If the write to _EntryList has not
>> yet happened that second thread finds itself still at the head of
>> _EntryList and so the assert fires. If the write to _EntryList happens
>> after the load "List = _EntryList", then the first assert can also fire.
>>
>> Preferred fix today is to use OrderAccess::release_store(&_OnDeck, w)
>> with a matching load_acquire(&_OnDeck) in the ILock code:
>>
>> while (_OnDeck != ESelf) {
>> ParkCommon(ESelf, 0);
>> }
>>
>> and corresponding "raw" lock code. Also fixed a couple of typos.
>>
>> Thanks,
>> David
>
More information about the hotspot-runtime-dev
mailing list