(S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout
David Holmes
david.holmes at oracle.com
Wed Aug 12 22:34:03 UTC 2015
On 11/08/2015 9:54 PM, Bertrand Delsart wrote:
> On 11/08/2015 12:51, David Holmes wrote:
>> On 11/08/2015 7:37 PM, Bertrand Delsart wrote:
>>> Looks OK but may not be sufficient.
>>>
>>> To play it even safer, I'd rather read _cur_index only once, for both
>>> the WorkAroundNPTLTimedWaitHang and the !WorkAroundNPTLTimedWaitHang
>>> case, e.g.:
>>
>> I can do that but it isn't necessary for correctness - _cur_index is
>> only modified whilst holding the mutex so it must also only be read
>> whilst holding the mutex, which is now fully covered.
>
> OK. Approved as is if you prefer. Your pick.
Thanks Bertrand.
David
> Bertrand.
>
>>
>> Thanks,
>> David
>> -----
>>
>>> 5779 // thread might be parked
>>>
>>> [ save the _cur_index here, before testing it ]
>>> int index = _cur_index;
>>>
>>> 5780 if (_cur_index != -1) {
>>> => if (index != -1)
>>>
>>> 5781 // thread is definitely parked
>>> 5782 if (WorkAroundNPTLTimedWaitHang) {
>>> 5783 status = pthread_cond_signal(&_cond[_cur_index]);
>>> => use index instead of re-reading _cur_index
>>>
>>> 5784 assert(status == 0, "invariant");
>>> 5785 status = pthread_mutex_unlock(_mutex);
>>> 5786 assert(status == 0, "invariant");
>>> 5787 } else {
>>> 5788 // must capture correct index before unlocking
>>>
>>> [ 5789 int index = _cur_index; ] // now loaded earlier
>>>
>>> 5790 status = pthread_mutex_unlock(_mutex);
>>> 5791 assert(status == 0, "invariant");
>>> 5792 status = pthread_cond_signal(&_cond[index]);
>>> 5793 assert(status == 0, "invariant");
>>>
>>>
>>> Bertrand.
>>>
>>> On 11/08/2015 08:40, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/
>>>>
>>>> The code introduced in 6900441 contained a bug in the code path for
>>>> when
>>>> WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the
>>>> change
>>>> in 8130728 which made the default setting of
>>>> WorkAroundNPTLTimedWaitHang
>>>> zero.
>>>>
>>>> In PlatformParker on Linux _cur_index tracks which pthread_cond object
>>>> is currently in use by a waiting thread (one for relative-timed waits,
>>>> the other for absolute-timed waits) and is set to -1 when the thread is
>>>> not waiting. In the path now used by default we release the
>>>> pthread_mutex_t and then pthread_cond_signal the condition variable at
>>>> _cond[_cur_index]. But as soon as we release the mutex the waiting
>>>> thread can resume execution (it may have timed-out and so not need the
>>>> signal) and set _cur_index to -1. The signalling thread then signals
>>>> _cond[-1] which does not contain a pthread_cond_t object. This can
>>>> result in the pthread_cond_signal hanging, and potentially other
>>>> consequences.
>>>>
>>>> The fix is simple: save the correct index before unlocking the mutex.
>>>>
>>>> The test:
>>>> java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java
>>>> has been marked as failing intermittently (8133231) due to this and I
>>>> will revert that as part of this fix, once that change reaches the
>>>> hs-rt
>>>> forest.
>>>>
>>>> Thanks,
>>>> David
>>>
>>>
>
>
More information about the hotspot-runtime-dev
mailing list