(S) RFR: 8029453: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java failed by timeout

David Holmes david.holmes at oracle.com
Tue Aug 11 10:51:25 UTC 2015


On 11/08/2015 7:37 PM, Bertrand Delsart wrote:
> Looks OK but may not be sufficient.
>
> To play it even safer, I'd rather read _cur_index only once, for both
> the WorkAroundNPTLTimedWaitHang and the !WorkAroundNPTLTimedWaitHang
> case, e.g.:

I can do that but it isn't necessary for correctness - _cur_index is 
only modified whilst holding the mutex so it must also only be read 
whilst holding the mutex, which is now fully covered.

Thanks,
David
-----

> 5779     // thread might be parked
>
>           [ save the _cur_index here, before testing it ]
>           int index = _cur_index;
>
> 5780     if (_cur_index != -1) {
>           =>  if (index != -1)
>
> 5781       // thread is definitely parked
> 5782       if (WorkAroundNPTLTimedWaitHang) {
> 5783         status = pthread_cond_signal(&_cond[_cur_index]);
>           => use index instead of re-reading _cur_index
>
> 5784         assert(status == 0, "invariant");
> 5785         status = pthread_mutex_unlock(_mutex);
> 5786         assert(status == 0, "invariant");
> 5787       } else {
> 5788         // must capture correct index before unlocking
>
> [ 5789         int index = _cur_index; ] // now loaded earlier
>
> 5790         status = pthread_mutex_unlock(_mutex);
> 5791         assert(status == 0, "invariant");
> 5792         status = pthread_cond_signal(&_cond[index]);
> 5793         assert(status == 0, "invariant");
>
>
> Bertrand.
>
> On 11/08/2015 08:40, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8029453
>>
>> Webrev: http://cr.openjdk.java.net/~dholmes/8029453/webrev/
>>
>> The code introduced in 6900441 contained a bug in the code path for when
>> WorkAroundNPTLTimedWaitHang was zero, and this was exposed by the change
>> in 8130728 which made the default setting of WorkAroundNPTLTimedWaitHang
>> zero.
>>
>> In PlatformParker on Linux _cur_index tracks which pthread_cond object
>> is currently in use by a waiting thread (one for relative-timed waits,
>> the other for absolute-timed waits) and is set to -1 when the thread is
>> not waiting. In the path now used by default we release the
>> pthread_mutex_t and then pthread_cond_signal the condition variable at
>> _cond[_cur_index]. But as soon as we release the mutex the waiting
>> thread can resume execution (it may have timed-out and so not need the
>> signal) and set _cur_index to -1. The signalling thread then signals
>> _cond[-1] which does not contain a pthread_cond_t object. This can
>> result in the pthread_cond_signal hanging, and potentially other
>> consequences.
>>
>> The fix is simple: save the correct index before unlocking the mutex.
>>
>> The test: java/util/concurrent/locks/ReentrantLock/TimeoutLockLoops.java
>> has been marked as failing intermittently (8133231) due to this and I
>> will revert that as part of this fix, once that change reaches the hs-rt
>> forest.
>>
>> Thanks,
>> David
>
>


More information about the hotspot-runtime-dev mailing list