RFR 8214148: [TESTBUG] serviceability/tmtools/jstack/WaitNotifyThreadTest.java is not doing what is expected

Sat Dec 8 20:06:59 UTC 2018

Hi David,

On 12/8/18 5:54 AM, David Holmes wrote:
> Hi Patricio,
>
> This test is failing now in tier 4.
Ok, I found the problem. Tier4 is running the test with flag -Xcomp, and 
that's making jstack always return the frame java.lang.Object.wait 
without the monitor address (- waiting on <no object reference 
available>). Since now we do check for the address of the monitor, the 
test fails. Sorry I only run this test up to tier3 and as a standalone 
test that's why it never failed for me before.

We can stop checking for the address altogether(1) or only for this 
particular case(2):

(1) http://cr.openjdk.java.net/~pchilanomate/8215050.01/webrev/
(2) http://cr.openjdk.java.net/~pchilanomate/8215050.02/webrev/

I tested both changes and they work with and without that flag. I filed 
the bug here https://bugs.openjdk.java.net/browse/JDK-8215050.

Thanks,
Patricio
> David
>
> On 8/12/2018 12:59 am, Patricio Chilano wrote:
>> Thanks Coleen!
>>
>> Patricio
>>
>> On 12/6/18 8:00 PM, coleen.phillimore at oracle.com wrote:
>>> On 12/6/18 7:55 PM, Patricio Chilano wrote:
>>>> Hi Coleen, David,
>>>>
>>>> On 12/6/18 7:19 PM, David Holmes wrote:
>>>>> On 7/12/2018 8:09 am, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Hi, Sorry this took a while to get to.
>>>>>>
>>>>>> In analyzeThreadStackWaiting, is it possible to loop through the 
>>>>>> stack and not find either OBJECT_WAIT or RUN_METHOD now?
>>>>>
>>>>> It looks like it was already possible in the sense that the loop 
>>>>> doesn't account for it. But that means you'd have to have a 
>>>>> completely wrong stack for the thread.
>>>>>
>>>>>> Could you add a count and a test that you don't drop through the 
>>>>>> loop, without testing both conditions?
>>>>>
>>>>> A simple boolean foundWait/foundRun would suffice I think.
>>>>
>>>> I think the loop should at least execute the "if" for the 
>>>> RUN_METHOD in analyzeThreadStackWaiting(), since before executing 
>>>> analyzeThreadStackWaiting() we are waiting for the waitThread to 
>>>> reach the WAITING state.  If the OBJECT_WAIT case does not execute 
>>>> the test will already fail since the monitorAddress will be null in 
>>>> the RUN_METHOD case.
>>>>
>>>> If you still think it is worth testing that I can add a check that 
>>>> monitorAddress should not be null before returning from 
>>>> analyzeThreadStackWaiting().
>>>
>>> Thanks for the explanation!  It looks good to me then.
>>> Coleen
>>>
>>>>
>>>> Thanks,
>>>> Patricio
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 12/3/18 11:48 AM, Patricio Chilano wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> On 12/3/18 2:14 AM, David Holmes wrote:
>>>>>>>> Hi Patricio,
>>>>>>>>
>>>>>>>> On 1/12/2018 2:31 am, Patricio Chilano wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> On 11/29/18 8:05 PM, David Holmes wrote:
>>>>>>>>>> Hi Patricio,
>>>>>>>>>>
>>>>>>>>>> This seems very complicated and I'm not quite seeing how it 
>>>>>>>>>> all goes together. The check for waiting to re-lock now seems 
>>>>>>>>>> to dominant the test and obscure the original checks.
>>>>>>>>>>
>>>>>>>>>> I'm not sure this is worthwhile in the context of this test. 
>>>>>>>>>> It might be much simpler to just get rid of the existing 
>>>>>>>>>> "waiting to re-lock" check which will not be seen and then if 
>>>>>>>>>> we really want to check that case add a much simpler form 
>>>>>>>>>> that just checks for that.
>>>>>>>>> Ok, I actually had similar thoughts while I was adding the 
>>>>>>>>> extra code. Here is the new webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8214148.02/webrev/
>>>>>>>>>
>>>>>>>>> I removed the check for the "waiting to re-lock" case. 
>>>>>>>>
>>>>>>>> Good - the more I look at this test the more I see the "waiting 
>>>>>>>> to re-lock" case is just not relevant to it.
>>>>>>>>
>>>>>>>> The addition of the code to wait until the target thread is in 
>>>>>>>> the right state seems good.
>>>>>>>>
>>>>>>>> I'm unclear on some of the changes to the checking code in 
>>>>>>>> analyzeThreadStackWaiting. It seems you removed the code that 
>>>>>>>> watched for finding the wrong monitor and replaced that with a 
>>>>>>>> call to assertMonitorInfo. But the latter is passed the address 
>>>>>>>> you got from the MonitorInfo in the first place so the check of 
>>>>>>>> the address is never going to fail. ?? 
>>>>>>> I just replaced that "if-else" with assertMonitorInfo() because 
>>>>>>> the same check "monInfo.getType().equals("waiting on") && 
>>>>>>> compareMonitorClass(monInfo)" will be done in 
>>>>>>> assertMonitorInfo() and the error case is also doing the same, 
>>>>>>> so the code gets simplified. Yes, the extra address check will 
>>>>>>> never fail.
>>>>>>>
>>>>>>>> That said you can't be waiting on two monitors so I don't see 
>>>>>>>> how we can ever have the wrong one ??
>>>>>>> I'm not sure why this test is checking for the monitor address 
>>>>>>> in assertMonitorInfo(). The only case where I see it could fail 
>>>>>>> is for the RUN_METHOD case if the address is null, but that has 
>>>>>>> a separate check before assertMonitorInfo(). Maybe at some point 
>>>>>>> the test had more monitors, because there was also that "for" 
>>>>>>> loop checking for the "waiting to re-lock in wait()" case. I can 
>>>>>>> remove the test 
>>>>>>> "monInfo.getMonitorAddress().equals(monitorAddress)" in 
>>>>>>> assertMonitorInfo() but I don't think it hurts to keep it.
>>>>>>>
>>>>>>>> A few minor style nits:
>>>>>>>>
>>>>>>>> Pre-existing:
>>>>>>>>
>>>>>>>> 54             //Notify the waiting thread, so it stops waiting 
>>>>>>>> and sleeps
>>>>>>>>
>>>>>>>> Please add a space after //
>>>>>>>>
>>>>>>>>  106         // Start athread that just waits
>>>>>>>>
>>>>>>>> s/athread/a thread/
>>>>>>>>
>>>>>>>>  145                     throw new RuntimeException(OBJECT_WAIT
>>>>>>>>  146                             + " method has to contain one 
>>>>>>>> lock record but it contains " + mi.getLocks().size());
>>>>>>>>
>>>>>>>> Indentation is wrong - the '+' should align with the O in 
>>>>>>>> OBJECT. Break into three lines (at second + if needed)
>>>>>>>>
>>>>>>>> New:
>>>>>>>>
>>>>>>>> 154                 if(mi.getLocks().size() == 1){
>>>>>>>>
>>>>>>>> Space after "if", and space before {
>>>>>>>>
>>>>>>>>  157                 else{
>>>>>>>>
>>>>>>>> Space before {
>>>>>>>>
>>>>>>>> 158                     throw new RuntimeException(RUN_METHOD + 
>>>>>>>> " method has to contain one lock record but it contains "
>>>>>>>>  159 + mi.getLocks().size());
>>>>>>>>
>>>>>>>> Incorrect indentation - '+' should align with R in RUN
>>>>>>> Done!
>>>>>>>>> I'm not sure if it's okay to keep the change to 
>>>>>>>>> serviceability/tmtools/jstack/utils/DefaultFormat.java then. 
>>>>>>>>> It doesn't really affect this test, but it is needed for 
>>>>>>>>> jstack to detect the locks that appear in the stack report 
>>>>>>>>> with the message "waiting to re-lock in wait()".
>>>>>>>>
>>>>>>>> I'd probably revert that change at this stage.
>>>>>>> Done!
>>>>>>>
>>>>>>> Here is the new webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8214148.03/webrev/
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Patricio
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Patricio
>>>>>>>>>> To me the simplest way to see the "re-lock in wait" case is 
>>>>>>>>>> to just:
>>>>>>>>>>
>>>>>>>>>> synchronized(obj) {
>>>>>>>>>>    obj.notifyAll();
>>>>>>>>>>    <= take stack dump here =>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 30/11/2018 5:52 am, Daniel D. Daugherty wrote:
>>>>>>>>>>> Adding serviceability-dev at ... since the 'tmtools' including 
>>>>>>>>>>> 'jstack'
>>>>>>>>>>> are owned by the Serviceability team.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/28/18 4:21 PM, Patricio Chilano wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you review this fix for test 
>>>>>>>>>>>> serviceability/tmtools/jstack/WaitNotifyThreadTest.java?
>>>>>>>>>>>>
>>>>>>>>>>>> On one hand the test was not properly checking what it was 
>>>>>>>>>>>> intended to check, since as mentioned in JBS the logic for 
>>>>>>>>>>>> checking the method name was wrong. Also since there was 
>>>>>>>>>>>> only one monitor in the test, the "for" loop with the 
>>>>>>>>>>>> message "waiting to re-lock in wait()" was never actually 
>>>>>>>>>>>> reached.
>>>>>>>>>>>>
>>>>>>>>>>>> Additionally, with change 8150689 the message "waiting to 
>>>>>>>>>>>> re-lock in wait()" is now shown in the frame where the 
>>>>>>>>>>>> relocking is actually taking place, so the logic for 
>>>>>>>>>>>> checking that should change.
>>>>>>>>>>>>
>>>>>>>>>>>> I fixed the first issues and added logic to check for the 
>>>>>>>>>>>> "waiting to re-lock in wait()" case. I used the 
>>>>>>>>>>>> Thread.State attribute to check desire states are reached 
>>>>>>>>>>>> before getting the thread dump reports through jstack. I 
>>>>>>>>>>>> run the test in mach5 several times for all platforms and 
>>>>>>>>>>>> they all passed.
>>>>>>>>>>>>
>>>>>>>>>>>> Webrev URL: 
>>>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8214148.01/webrev
>>>>>>>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8214148 
>>>>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8150689>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Patricio
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>