RFR JDK-8152589 : java/lang/management/ThreadMXBean/Locks.java fails intermittently, blocked on wrong object

Harsha Wardhana B harsha.wardhana.b at oracle.com
Wed Aug 31 08:20:16 UTC 2016


Hello,

Please review new webrev incorporating nits form Daniel.

http://cr.openjdk.java.net/~hb/8152589/webrev.03/

-Harsha

On Wednesday 31 August 2016 01:47 AM, Daniel D. Daugherty wrote:
> On 8/29/16 8:51 PM, David Holmes wrote:
>>
>>
>> On 30/08/2016 9:08 AM, Daniel D. Daugherty wrote:
>>> On 8/23/16 6:17 AM, Harsha Wardhana B wrote:
>>>> Hi David,
>>>>
>>>> You approach to waiter object is a neat little abstraction and I will
>>>> make it a point to use it in future fixes, if required.
>>>>
>>>> revised webrev : http://cr.openjdk.java.net/~hb/8152589/webrev.02/
>>>
>>> test/java/lang/management/ThreadMXBean/Locks.java
>>>     General comment:
>>>         'synchronized(foo)' and 'synchronized (foo)' are both used.
>>>         Please pick a style and use it consistently.
>>>
>>>     L24:  /*
>>>         Space added before comment block begin. Please remove it.
>>>
>>>     L76:     /*
>>>     L77:     Handy debug function to check if error condition is 
>>> because
>>> of test code or not.
>>>     L78:     */
>>>         Should look like this:
>>>
>>>         L76:     /*
>>>         L77:      * Handy debug function to check if error condition is
>>> because of test code or not.
>>>         L78:      */
>>>
>>>         That is standard block comment format.
>>>
>>>     L114:     /*
>>>     L115:     Do slow check if thread is blocked on a lock. It is
>>> possible that last thread
>>>     L116:     to come out of Phaser might still be in Phaser call stack
>>> (Unsfe.park) and
>>>     L117:     hence might eventually acquire expected lock.
>>>     L118:     */
>>>         Same comment about standard block comment format.
>>>
>>>         Also typo: "Unsfe.park" -> "Unsafe.park".
>>>
>>>     L129:             while(p.test(result)){
>>>         Space before first '(' and before '{'.
>>>
>>>     L171:                 // stop here  for LockBThread to hold OBJB
>>>         Perhaps this for the comment:
>>>                           // block here while LockBThread holds OBJB
>>>
>>>     L208:     /*
>>>     L209:     Must be invoked from within a Synchronized context
>>>     L210:     */
>>>         Same comment about standard block comment format.
>>>
>>>         Also not sure why "Synchronized" is capitalized...
>>>
>>>     L213:         boolean isNotified = false;
>>>         This should probably be volatile.
>>>
>>>         Update: WaitingThread calls OBJC.doWait() and
>>>         CheckerThread calls OBJC.doNotify(). Yup, it should
>>>         be volatile.
>>
>> Nope - it is only accessed under synchronization lock, which must be 
>> held by the caller else the wait()/notify() will throw 
>> IllegalMonitorStateException.
>
> Yup. That occurred to me late last night... Sorry for the noise.
>
> Dan
>
>
>>
>> David
>> -----
>>
>>> Dan
>>>
>>>
>>>>
>>>> On Tuesday 23 August 2016 11:47 AM, David Holmes wrote:
>>>>> Hi Harsha,
>>>>>
>>>>> On 22/08/2016 6:48 PM, Harsha Wardhana B wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Please review the below webrev incorporating David's comments.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~hb/8152589/webrev.01/
>>>>>
>>>>> Using a static isNotified field isn't exactly what I had in mind, I
>>>>> was thinking of something more encapsulated - something I thought
>>>>> already existed in other tests, but now I can't find it. Eg:
>>>>>
>>>>> /* synchronization helper for two-thread wait/notify interaction
>>>>> */
>>>>> static class WaitObject {
>>>>>   boolean isNotified = false;
>>>>>   public void await() throws InterruptedException {
>>>>>     while (!isNotified)
>>>>>       wait();
>>>>>     isNotified = false;
>>>>>   }
>>>>>   public void doNotify() {
>>>>>     isNotified = true;
>>>>>     notify();
>>>>>   }
>>>>> }
>>>>>
>>>>> then OBJC would be a WaitObject and you would do:
>>>>>
>>>>> synchronized(OBJC) {
>>>>>    log("WaitingThread about to wait on objC");
>>>>>    try {
>>>>>        // Signal checker thread, about to wait on objC.
>>>>>        waiting = false;
>>>>>        p.arriveAndAwaitAdvance(); // Phase 1 (waiting)
>>>>>        waiting = true;
>>>>>        OBJC.doWait();
>>>>>     } catch (InterruptedException e) {
>>>>> ...
>>>>>
>>>>> etc.
>>>>>
>>>>> Minor nits:
>>>>> - why did you move the @library ?
>>>> It was suggested by NetBeans Jtreg plugin to fix tag order.
>>>>> - @Override, if used, should be applied consistently
>>>> Have applied the annotation consistently.
>>>>> - if you want to capitalize objA, objB, objC then ensure you also
>>>>> update it in the comments and log statements (it really isn't
>>>>> necessary to capitalize them, that is suggested for compile-time
>>>>> constants and these are not - they are just static final variables).
>>>> Done.
>>>>
>>>> Thanks
>>>> Harsha
>>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Harsha
>>>>>>
>>>>>>
>>>>>> On Wednesday 17 August 2016 11:45 AM, Harsha Wardhana B wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> I will incorporate changes suggested by you. Let's wait for few 
>>>>>>> more
>>>>>>> review comments and then I will send consolidated webrev.
>>>>>>>
>>>>>>> Regards
>>>>>>> Harsha
>>>>>>>
>>>>>>> On Wednesday 17 August 2016 09:02 AM, David Holmes wrote:
>>>>>>>> On 16/08/2016 11:33 PM, Harsha Wardhana B wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> Agreed that we could fix WaitingThread the way you have said, 
>>>>>>>>> but in
>>>>>>>>> recent past, there aren't any issues reported w.r.t 
>>>>>>>>> WaitingThread.
>>>>>>>>
>>>>>>>> Nor are there likely to be - that's what makes spurious wakeup 
>>>>>>>> bugs
>>>>>>>> so difficult to detect!
>>>>>>>>
>>>>>>>>> This test has been fixed several times (3-4) for intermittent
>>>>>>>>> failures
>>>>>>>>> and hence I would not like to meddle with code that is not
>>>>>>>>> causing any
>>>>>>>>> problems even though there is scope for refactoring.
>>>>>>>>
>>>>>>>> It isn't refactoring it is fixing and we have fixed several 
>>>>>>>> tests in
>>>>>>>> a similar way.
>>>>>>>>
>>>>>>>>> The issue reported was with LockThreadB and hence I have provided
>>>>>>>>> possible fix for the same.
>>>>>>>>
>>>>>>>> That doesn't preclude fixing other issues with the test at the 
>>>>>>>> same
>>>>>>>> time.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Harsha
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tuesday 16 August 2016 01:32 PM, David Holmes wrote:
>>>>>>>>>> Hi Harsha,
>>>>>>>>>>
>>>>>>>>>> On 16/08/2016 4:08 PM, Harsha Wardhana B wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Please review and provide comments for fix for issue,
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8152589
>>>>>>>>>>>
>>>>>>>>>>> having webrev located at
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~hb/8152589/webrev.00/
>>>>>>>>>>
>>>>>>>>>> These changes look quite good (though I have to admit I 
>>>>>>>>>> struggle to
>>>>>>>>>> read the lambda and stream code :) ).
>>>>>>>>>>
>>>>>>>>>> Note that like many of these kinds of tests there is an issue 
>>>>>>>>>> with
>>>>>>>>>> WaitingThread because it does not wait in a loop and so is
>>>>>>>>>> susceptible
>>>>>>>>>> to spurious wakeups. The way to fix that is to add a "notified"
>>>>>>>>>> variable and then do:
>>>>>>>>>>
>>>>>>>>>> while (!notified)
>>>>>>>>>>   wait();
>>>>>>>>>>
>>>>>>>>>> and set notified before the notify().
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Fix details:
>>>>>>>>>>>
>>>>>>>>>>> 1. From nightly failures we see that LockThreadB was blocked on
>>>>>>>>>>> wrong
>>>>>>>>>>> object. We now do a repeated check with timeout if any given
>>>>>>>>>>> thread is
>>>>>>>>>>> blocked on expected object. It is possible that LockThreadB 
>>>>>>>>>>> might
>>>>>>>>>>> still
>>>>>>>>>>> be in Phaser call stack (Unsafe.park) when 
>>>>>>>>>>> 'checkBlockedObject' is
>>>>>>>>>>> invoked.
>>>>>>>>>>>
>>>>>>>>>>> 2. The logs from lock free logger was never printed. It is not
>>>>>>>>>>> being
>>>>>>>>>>> printed.
>>>>>>>>>>>
>>>>>>>>>>> 3. Any time we see failure, thread stack is being logged. This
>>>>>>>>>>> helps us
>>>>>>>>>>> ascertain if failure is in test case or in the component.
>>>>>>>>>>>
>>>>>>>>>>> 4. Even though we had lock free logger, several
>>>>>>>>>>> ex.printStackTrace() was
>>>>>>>>>>> used which could be responsible for failure. It is removed.
>>>>>>>>>>>
>>>>>>>>>>> 5. There were several places where tests continue to ran even
>>>>>>>>>>> after
>>>>>>>>>>> failure (testFailed flag). That is fixed.
>>>>>>>>>>>
>>>>>>>>>>> 6. Couple of other minor refactoring.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> Harsha
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>



More information about the serviceability-dev mailing list