RFR [8038333] java/lang/ref/EarlyTimeout.java failed
Mandy Chung
mandy.chung at oracle.com
Thu Mar 27 00:30:34 UTC 2014
On 3/26/2014 3:57 PM, David Holmes wrote:
> Ivan,
>
> I think the problem is that the EarlyTimeout threads can complete
> their remove(TIMEOUT) before the main thread has started them all,
> cleared the reference and called System.gc().
>
> Depending on exactly what is being tested, the EarlyTimeout threads
> may need to wait on another latch that is signalled by the main thread
> after the gc() call returns. Still no guarantee that the gc will do
> its work before the timeout elapses.
>
This is similar to what I have been thinking. I believe the
EarlyTimeout threads don't need the startedSignal.countDown; instead it
can have a signal latch with 1 count. The EarlyTimeout threads awaits
on the signal latch. The main thread will call signal.countDown after
System.gc() and we can add a check weakReference.isEnqueued to make sure
before we awake the threads to proceed the queue.remove call.
Ivan - for the error message, perhaps you can simply do this:
if (nonNullRefCount != 1) {
throw new RuntimeException(nonNullRefCount + " references were removed from queue");
}
Mandy
> David
>
> On 27/03/2014 6:18 AM, Ivan Gerasimov wrote:
>> Thank you Mandy!
>>
>>> Are you able to reproduce the test failure?
>>
>> Yes, I could easily reproduce the failure when I reduced the timeout to
>> 10 ms.
>> With the timeout reduced, the test fails every third time on my machine.
>>
>>> I think the test verifies that only one thread gets the reference is a
>>> good test.
>>>
>> But if none of the threads could get the reference, it should not cause
>> the failure of the test.
>> It only means that during this particular run we could not have tested
>> what we needed.
>> We could retry, but I'm not sure it's worth complicating the test.
>> It's easier to ignore the failure, taking into account that it happens
>> rarely.
>>
>>> I think the race is due to the threads get to call queue.remove as
>>> soon as both threads decrement the count of the latch that can be well
>>> before the reference is enqueued.
>>>
>> The problem is that we have no way to block the main thread until there
>> is a reference in the queue.
>> I improved the situation a bit, having moved the await() after the call
>> of gc().
>>
>>> It'd be good to add additional information in the test to help
>>> diagnosing test failure.
>>>
>> I added reporting to stderr about being unable to remove a reference
>> from the queue.
>> I believe we shouldn't treat it as an error, and should simply ignore
>> it.
>>
>> Would you please have a look at the updated webrev:
>> http://cr.openjdk.java.net/~igerasim/8038333/1/webrev/
>>
>> Sincerely yours,
>> Ivan
>>
>>
More information about the core-libs-dev
mailing list