RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'
srikalyan
srikalyan.chandrashekar at oracle.com
Mon Dec 9 20:15:19 UTC 2013
Hi David/Martin a gentle reminder for review.
--
Thanks
kalyan
Ph: (408)-585-8040
On 12/2/13, 11:21 AM, srikalyan wrote:
> Hi David, Thanks for the review, the new webrev is hosted at
> http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/
> . Please see inline text.
>
> On 11/20/13, 6:35 PM, David Holmes wrote:
>> On 21/11/2013 10:28 AM, Martin Buchholz wrote:
>>> I again tried and failed to reproduce a failure. Even if I go whole
>>> hog
>>> and multiply TIMEOUT by 100 and divide ITERS by 100, the test
>>> continues to
>>> PASS. Is it just me?!
>>
>> I think you are going the wrong way Martin - you want the timeout to
>> be smaller than the time it takes to execute ITERS.
>>
>>> I don't think there's any reason to make result long. It's not even
>>> used
>>> except to inhibit hotspot optimizations.
>>>
>>> + private volatile long result = 17;//Can get int overflow,so
>>> using long
>>
>> Further the subsequent use of += is incorrect as it is not an atomic
>> operation. Even if we don't care about the value, it looks bad.
> Made the necessary changes for atomic update.
>>
>> I'm not sure result must be updated if we get a
>> BrokenBarrierException either. Probably harmless, but necessary?
> I retained it in the fix for completeness in updating the numbers,
> please let me know if you still think otherwise.
>>
>>> need to fix spelling and spacing below.
>>>
>>> + barrier.await();//If a BrokeBarrierException happens
>>> here(due to
>>
>> There are a number of style issues with spacing around comments.
> Fixed the spelling error and styling issues.
>>
>> And I don't think this change is sufficient to claim co-author status
>> with Doug either ;-)
> Removed the claim :)
>>
>> The additional tracing may be useful but seems stylistically
>> different from the rest of the code.
> Retained the tracking to understand if it is again the timing issue
> which is the cause in an event of a failure, however i can remove it
> if you think it is not necessary (OR) include an alternate solution as
> you may want to suggest.
>>
>> Overall I'm suspicious that the changed timeout actually fixes
>> anything - normally we need to add longer timeouts not shorter ones.
>> Does this fail on a range of machines or only specific ones? Have we
>> verified that the clocks/timers are behaving properly on those systems?
> Here the time out is not about waiting for threads to complete
> something but to "be interrupted" before being considered done, so we
> decreased the timeout. However we now chose to increase the number of
> iterations to 5000000 from 1000000(thanks to tristan for the
> suggestion) instead of decreasing the timeout as done earlier because
> the increasing iterations ensures the threads are busy for long time
> curtailing the need to touch the timeout.
>
>>
>> Thanks,
>> David
>
> --
> Thanks
> kalyan
> Ph: (408)-585-8040
>
>>
>>>
>>>
>>>
>>> On Wed, Nov 20, 2013 at 11:52 AM, srikalyan <
>>> srikalyan.chandrashekar at oracle.com> wrote:
>>>
>>>> Hi Martin , apologies for the delay , was trying to get help for
>>>> hosting
>>>> my webrev. . Please see inline text.
>>>>
>>>>
>>>> On 11/19/13, 10:35 PM, Martin Buchholz wrote:
>>>>
>>>> Hi Kalyan,
>>>>
>>>> None of us can review your changes yet because you haven't given
>>>> us a
>>>> URL of your webrev.
>>>>
>>>> It is located here
>>>> http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/
>>>>
>>>>
>>>>
>>>> I've tried to make the jsr166 copy of CancelledLockLoops fail by
>>>> adjusting ITERS and TIMEOUT radically up and down, but the test
>>>> just keeps
>>>> on passing for me. Hints appreciated.
>>>>
>>>> Bump up the timeout to 500ms and you will see a failure (i can see it
>>>> consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8
>>>> latest any promoted build).
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar <
>>>> srikalyan.chandrashekar at oracle.com> wrote:
>>>>
>>>>> Suggested Fix:
>>>>>> a) Decrease the timeout from 100 to 50ms which will ensure that
>>>>>> the test
>>>>>> will pass even on faster machines
>>>>>>
>>>>>
>>>> This doesn't look like a permanent fix - it just makes the
>>>> failing case
>>>> rarer.
>>>>
>>>> Thats true , the other way is to make the main thread wait on TIMEOUT
>>>> after firing the interrupts instead of other way round, but that
>>>> would be
>>>> over-optimization which probably is not desirable as well. The 50
>>>> ms was
>>>> arrived at empirically after running several 100 times on multiple
>>>> configurations and did not cause failures.
>>>>
>>>> --
>>>> Thanks
>>>> kalyan
>>>> Ph: (408)-585-8040
>>>>
>>>>
>>>>
More information about the core-libs-dev
mailing list