RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

David Holmes david.holmes at oracle.com
Fri Dec 20 04:10:59 UTC 2013


Sorry Kalyan but I don't see the need for all the incidental changes if 
the primary change is to just increase the iterations. I also don't see 
why you need to do anything for BrokenBarrierException as it is not 
expected to happen and the test should just fail if it does.

David

On 10/12/2013 6:15 AM, srikalyan wrote:
> Hi David/Martin a gentle reminder for review.
>
> --
> Thanks
> kalyan
> Ph: (408)-585-8040
>
>
> On 12/2/13, 11:21 AM, srikalyan wrote:
>> Hi David, Thanks for the review, the new webrev is hosted at
>> http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/
>> . Please see inline text.
>>
>> On 11/20/13, 6:35 PM, David Holmes wrote:
>>> On 21/11/2013 10:28 AM, Martin Buchholz wrote:
>>>> I again tried and failed to reproduce a failure.  Even if I go whole
>>>> hog
>>>> and multiply TIMEOUT by 100 and divide ITERS by 100, the test
>>>> continues to
>>>> PASS.  Is it just me?!
>>>
>>> I think you are going the wrong way Martin - you want the timeout to
>>> be smaller than the time it takes to execute ITERS.
>>>
>>>> I don't think there's any reason to make result long.  It's not even
>>>> used
>>>> except to inhibit hotspot optimizations.
>>>>
>>>> +        private volatile long result = 17;//Can get int overflow,so
>>>> using long
>>>
>>> Further the subsequent use of += is incorrect as it is not an atomic
>>> operation. Even if we don't care about the value, it looks bad.
>> Made the necessary changes for atomic update.
>>>
>>> I'm not sure result must be updated if we get a
>>> BrokenBarrierException either. Probably harmless, but necessary?
>> I retained it in the fix for completeness in updating the numbers,
>> please let me know if you still think otherwise.
>>>
>>>> need to fix spelling and spacing below.
>>>>
>>>> +                barrier.await();//If a BrokeBarrierException happens
>>>> here(due to
>>>
>>> There are a number of style issues with spacing around comments.
>> Fixed the spelling error and styling issues.
>>>
>>> And I don't think this change is sufficient to claim co-author status
>>> with Doug either ;-)
>> Removed the claim :)
>>>
>>> The additional tracing may be useful but seems stylistically
>>> different from the rest of the code.
>> Retained the tracking to understand if it is again the timing issue
>> which is the cause in an event of a failure, however i can remove it
>> if you think it is not necessary (OR) include an alternate solution as
>> you may want to suggest.
>>>
>>> Overall I'm suspicious that the changed timeout actually fixes
>>> anything - normally we need to add longer timeouts not shorter ones.
>>> Does this fail on a range of machines or only specific ones? Have we
>>> verified that the clocks/timers are behaving properly on those systems?
>> Here the time out is not about waiting for threads to complete
>> something but to "be interrupted" before being considered done, so we
>> decreased the timeout. However we now chose to increase the number of
>> iterations to 5000000 from 1000000(thanks to tristan for the
>> suggestion) instead of decreasing the timeout as done earlier because
>> the increasing iterations ensures the threads are busy for long time
>> curtailing the need to touch the timeout.
>>
>>>
>>> Thanks,
>>> David
>>
>> --
>> Thanks
>> kalyan
>> Ph: (408)-585-8040
>>
>>>
>>>>
>>>>
>>>>
>>>> On Wed, Nov 20, 2013 at 11:52 AM, srikalyan <
>>>> srikalyan.chandrashekar at oracle.com> wrote:
>>>>
>>>>>   Hi Martin , apologies for the delay , was trying to get help for
>>>>> hosting
>>>>> my webrev.  .  Please see inline text.
>>>>>
>>>>>
>>>>> On 11/19/13, 10:35 PM, Martin Buchholz wrote:
>>>>>
>>>>> Hi Kalyan,
>>>>>
>>>>>   None of us can review your changes yet because you haven't given
>>>>> us a
>>>>> URL of your webrev.
>>>>>
>>>>> It is located here
>>>>> http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/
>>>>>
>>>>>
>>>>>
>>>>>   I've tried to make the jsr166 copy of CancelledLockLoops fail by
>>>>> adjusting ITERS and TIMEOUT radically up and down, but the test
>>>>> just keeps
>>>>> on passing for me.  Hints appreciated.
>>>>>
>>>>> Bump up the timeout to 500ms and you will see a failure (i can see it
>>>>> consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8
>>>>> latest any promoted build).
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar <
>>>>> srikalyan.chandrashekar at oracle.com> wrote:
>>>>>
>>>>>>     Suggested Fix:
>>>>>>> a) Decrease the timeout from 100 to 50ms which will ensure that
>>>>>>> the test
>>>>>>> will pass even on faster machines
>>>>>>>
>>>>>>
>>>>>   This doesn't look like a permanent fix - it just makes the
>>>>> failing case
>>>>> rarer.
>>>>>
>>>>> Thats true , the other way is to make the main thread wait on TIMEOUT
>>>>> after firing the interrupts instead of other way round, but that
>>>>> would be
>>>>> over-optimization which probably is not desirable as well.  The 50
>>>>> ms was
>>>>> arrived at empirically after running several 100 times on multiple
>>>>> configurations and did not cause failures.
>>>>>
>>>>> --
>>>>> Thanks
>>>>> kalyan
>>>>> Ph: (408)-585-8040
>>>>>
>>>>>
>>>>>



More information about the core-libs-dev mailing list