RFR: 8166197: assert(RelaxAssert || w != Thread::current()->_MutexEvent) failed: invariant

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Oct 13 16:24:51 UTC 2016


 > webrev updated in place with one comment and one new use of load-acquire.
 > Plus some cosmetic changes.
 >
 > webrev: http://cr.openjdk.java.net/~dholmes/8166197/webrev/

src/share/vm/runtime/mutex.cpp
     No comments.

Thumbs up!

Dan



On 10/12/16 7:18 PM, David Holmes wrote:
> Hi Dan,
>
> Thanks for looking at this.
>
> On 13/10/2016 1:03 AM, Daniel D. Daugherty wrote:
>> On 10/11/16 11:08 PM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8166197
>>> webrev: http://cr.openjdk.java.net/~dholmes/8166197/webrev/
>>
>> Very nice catch! We should check the ObjectMonitor succession code for
>> similar issues (my task).
>
> Yes. As I said in email I did a quick check through but the succession 
> logic is sufficiently different that nothing was obviously wrong in a 
> similar way.
>
>>
>> src/share/vm/runtime/mutex.cpp
>>     L466:   if ((NativeMonitorFlags & 32) && CASPTR (&_OnDeck, NULL,
>> UNS(ESelf)) == 0) {
>>         Thanks for fixing this bug also!
>>
>>     L477:   while (OrderAccess::load_ptr_acquire(&_OnDeck) != ESelf) {
>>         So you've changed this load of _OnDeck to use load-acquire
>>         which matches the new store-release on L595:
>>
>>         OrderAccess::release_store_ptr(&_OnDeck, w);
>
> Right.
>
>>         What about the other loads of _OnDeck or stores to _OnDeck?
>>         There should at least be a new comment explaining why we don't
>>         need an OrderAccess operation for those. Update: I see you
>>         changed one other load of _OnDeck on L1061. Now I'm really
>>         wanting comments for the other _OnDeck loads and stores. :-)
>>
>>         Update: I see Carsten V. asked about this in a slightly 
>> different
>>         way.
>
> See my reply to Carsten re the load's. I did miss one as we have three 
> "locking" paths that need to synchronize with the IUnlock code.
>
> As for documenting ... for line 532 I can add something simple like:
>
>  532   ParkEvent * const w = _OnDeck; // raw load as we will just 
> return if non-NULL
>
> For the other stores to _OnDeck ... CAS should be obvious. The setting 
> to NULL should also be quite clear as only the _OnDeck thread sets to 
> NULL to relinquish being _OnDeck once it has acquired the mutex, which 
> happens via CAS which has full barriers. None of the plain stores are 
> in the context of:
>
>  some_var = y; // write some shared-state
>  _OnDeck = NULL; // signal some_var has been updated
>
>>     L590:     // Pass onDeck to w, ensuring that _EntryList has been set
>> first.
>>         Typo: 'onDeck' -> 'OnDeck'
>>
>>         I suspect you don't want to fix all this CamelCase usage to meet
>>         HotSpot style. I did that for most of the ObjectMonitor code and
>>         it was painful. We could clean it up early in JDK10.
>
> I fixed the typo and also changed ONDECK to OnDeck so that we 
> generally refer to OnDeck in commentary unless specifically referring 
> to the _OnDeck field.
>
>>         Update: I see Carsten has a comment about this comment also. I
>>         don't think I quite agree that we're "passing" _EntryList to w,
>>         but I can be convinced otherwise...
>
> Right, nothing to do with _EntryList just making w the OnDeck thread.
>
>> Again, very nice catch! I'd like to see another webrev with the other
>> _OnDeck loads and stores either updated for OrderAccess ops or some
>> comment explaining why it's not needed.
>
> webrev updated in place with one comment and one new use of 
> load-acquire. Plus some cosmetic changes.
>
> Thanks again,
> David
>
>> Dan
>>
>>
>>>
>>> In IUnlock we have the following succession code to wakeup the
>>> "onDeck" thread:
>>>
>>>  ParkEvent * List = _EntryList;
>>>   if (List != NULL) {
>>>     // Transfer the head of the EntryList to the OnDeck position.
>>>     // Once OnDeck, a thread stays OnDeck until it acquires the lock.
>>>     // For a given lock there is at most OnDeck thread at any one
>>> instant.
>>>    WakeOne:
>>>     assert(List == _EntryList, "invariant");
>>>     ParkEvent * const w = List;
>>>     assert(RelaxAssert || w != Thread::current()->_MutexEvent,
>>> "invariant");
>>>     _EntryList = w->ListNext;
>>>     // as a diagnostic measure consider setting w->_ListNext = BAD
>>>     assert(UNS(_OnDeck) == _LBIT, "invariant");
>>>     _OnDeck = w;  // pass OnDeck to w.
>>>
>>> It is critical that the update to _EntryList happens before we set
>>> _OnDeck, as as soon as _OnDeck is set the selected thread (which need
>>> not yet have parked) can acquire the mutex, complete its critical
>>> section and proceed to unlock the mutex, and so execute IUnlock in
>>> parallel with the original thread. If the write to _EntryList has not
>>> yet happened that second thread finds itself still at the head of
>>> _EntryList and so the assert fires. If the write to _EntryList happens
>>> after the load "List = _EntryList", then the first assert can also 
>>> fire.
>>>
>>> Preferred fix today is to use OrderAccess::release_store(&_OnDeck, w)
>>> with a matching load_acquire(&_OnDeck) in the ILock code:
>>>
>>>   while (_OnDeck != ESelf) {
>>>     ParkCommon(ESelf, 0);
>>>   }
>>>
>>> and corresponding "raw" lock code. Also fixed a couple of typos.
>>>
>>> Thanks,
>>> David
>>



More information about the hotspot-runtime-dev mailing list