RFR 8214148: [TESTBUG] serviceability/tmtools/jstack/WaitNotifyThreadTest.java is not doing what is expected
Patricio Chilano
patricio.chilano.mateo at oracle.com
Sat Dec 8 22:30:30 UTC 2018
Hi David,
On 12/8/18 4:59 PM, David Holmes wrote:
> Hi Patricio,
>
> On 9/12/2018 6:06 am, Patricio Chilano wrote:
>> Hi David,
>>
>> On 12/8/18 5:54 AM, David Holmes wrote:
>>> Hi Patricio,
>>>
>>> This test is failing now in tier 4.
>> Ok, I found the problem. Tier4 is running the test with flag -Xcomp,
>> and that's making jstack always return the frame
>> java.lang.Object.wait without the monitor address (- waiting on <no
>> object reference available>). Since now we do check for the address
>> of the monitor, the test fails. Sorry I only run this test up to
>> tier3 and as a standalone test that's why it never failed for me before.
>
> Okay so that explains why the original test did not use
> assertMonitorInfo for the OBJECT_WAIT case. So we basically need to
> revert to the same logic as the original.
The assertMonitorInfo() for the OBJECT_WAIT case is not actually
failing, is the one for the RUN_METHOD case. For the RUN_METHOD frame,
jstack does show the address of the monitor (- locked
<0x00000000someaddress>) so when it gets compared with the one from
OBJECT_WAIT is failing. I don't know why for one case jstack finds the
address but not for the other when using the -Xcomp flag.
The reason why the original wasn't failing in the RUN_METHOD case is
because the logic for checking the method name was wrong so the path was
never actually executed.
>> We can stop checking for the address altogether(1) or only for this
>> particular case(2):
>>
>> (1) http://cr.openjdk.java.net/~pchilanomate/8215050.01/webrev/
>> (2) http://cr.openjdk.java.net/~pchilanomate/8215050.02/webrev/
>
> I think (2) gives us closer to what the original logic did, but I'm
> still vague on the difference between the two cases in the original code.
Yes, in the original code the idea is that the monitor address you read
in the OBJECT_WAIT case has to match the one read in the RUN_METHOD
case. So (2) is still trying to do the same. (1) just removes checking
the address.
>> I tested both changes and they work with and without that flag. I
>> filed the bug here https://bugs.openjdk.java.net/browse/JDK-8215050.
>
> Please start a new RFR thread under that bug id.
Ok, I'll send the RFR for the new bug.
Thanks,
Patricio
> Thanks,
> David
>
>> Thanks,
>> Patricio
>>> David
>>>
>>> On 8/12/2018 12:59 am, Patricio Chilano wrote:
>>>> Thanks Coleen!
>>>>
>>>> Patricio
>>>>
>>>> On 12/6/18 8:00 PM, coleen.phillimore at oracle.com wrote:
>>>>> On 12/6/18 7:55 PM, Patricio Chilano wrote:
>>>>>> Hi Coleen, David,
>>>>>>
>>>>>> On 12/6/18 7:19 PM, David Holmes wrote:
>>>>>>> On 7/12/2018 8:09 am, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi, Sorry this took a while to get to.
>>>>>>>>
>>>>>>>> In analyzeThreadStackWaiting, is it possible to loop through
>>>>>>>> the stack and not find either OBJECT_WAIT or RUN_METHOD now?
>>>>>>>
>>>>>>> It looks like it was already possible in the sense that the loop
>>>>>>> doesn't account for it. But that means you'd have to have a
>>>>>>> completely wrong stack for the thread.
>>>>>>>
>>>>>>>> Could you add a count and a test that you don't drop through
>>>>>>>> the loop, without testing both conditions?
>>>>>>>
>>>>>>> A simple boolean foundWait/foundRun would suffice I think.
>>>>>>
>>>>>> I think the loop should at least execute the "if" for the
>>>>>> RUN_METHOD in analyzeThreadStackWaiting(), since before executing
>>>>>> analyzeThreadStackWaiting() we are waiting for the waitThread to
>>>>>> reach the WAITING state. If the OBJECT_WAIT case does not execute
>>>>>> the test will already fail since the monitorAddress will be null
>>>>>> in the RUN_METHOD case.
>>>>>>
>>>>>> If you still think it is worth testing that I can add a check
>>>>>> that monitorAddress should not be null before returning from
>>>>>> analyzeThreadStackWaiting().
>>>>>
>>>>> Thanks for the explanation! It looks good to me then.
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Patricio
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>> On 12/3/18 11:48 AM, Patricio Chilano wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> On 12/3/18 2:14 AM, David Holmes wrote:
>>>>>>>>>> Hi Patricio,
>>>>>>>>>>
>>>>>>>>>> On 1/12/2018 2:31 am, Patricio Chilano wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> On 11/29/18 8:05 PM, David Holmes wrote:
>>>>>>>>>>>> Hi Patricio,
>>>>>>>>>>>>
>>>>>>>>>>>> This seems very complicated and I'm not quite seeing how it
>>>>>>>>>>>> all goes together. The check for waiting to re-lock now
>>>>>>>>>>>> seems to dominant the test and obscure the original checks.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure this is worthwhile in the context of this
>>>>>>>>>>>> test. It might be much simpler to just get rid of the
>>>>>>>>>>>> existing "waiting to re-lock" check which will not be seen
>>>>>>>>>>>> and then if we really want to check that case add a much
>>>>>>>>>>>> simpler form that just checks for that.
>>>>>>>>>>> Ok, I actually had similar thoughts while I was adding the
>>>>>>>>>>> extra code. Here is the new webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8214148.02/webrev/
>>>>>>>>>>>
>>>>>>>>>>> I removed the check for the "waiting to re-lock" case.
>>>>>>>>>>
>>>>>>>>>> Good - the more I look at this test the more I see the
>>>>>>>>>> "waiting to re-lock" case is just not relevant to it.
>>>>>>>>>>
>>>>>>>>>> The addition of the code to wait until the target thread is
>>>>>>>>>> in the right state seems good.
>>>>>>>>>>
>>>>>>>>>> I'm unclear on some of the changes to the checking code in
>>>>>>>>>> analyzeThreadStackWaiting. It seems you removed the code that
>>>>>>>>>> watched for finding the wrong monitor and replaced that with
>>>>>>>>>> a call to assertMonitorInfo. But the latter is passed the
>>>>>>>>>> address you got from the MonitorInfo in the first place so
>>>>>>>>>> the check of the address is never going to fail. ??
>>>>>>>>> I just replaced that "if-else" with assertMonitorInfo()
>>>>>>>>> because the same check "monInfo.getType().equals("waiting on")
>>>>>>>>> && compareMonitorClass(monInfo)" will be done in
>>>>>>>>> assertMonitorInfo() and the error case is also doing the same,
>>>>>>>>> so the code gets simplified. Yes, the extra address check will
>>>>>>>>> never fail.
>>>>>>>>>
>>>>>>>>>> That said you can't be waiting on two monitors so I don't see
>>>>>>>>>> how we can ever have the wrong one ??
>>>>>>>>> I'm not sure why this test is checking for the monitor address
>>>>>>>>> in assertMonitorInfo(). The only case where I see it could
>>>>>>>>> fail is for the RUN_METHOD case if the address is null, but
>>>>>>>>> that has a separate check before assertMonitorInfo(). Maybe at
>>>>>>>>> some point the test had more monitors, because there was also
>>>>>>>>> that "for" loop checking for the "waiting to re-lock in
>>>>>>>>> wait()" case. I can remove the test
>>>>>>>>> "monInfo.getMonitorAddress().equals(monitorAddress)" in
>>>>>>>>> assertMonitorInfo() but I don't think it hurts to keep it.
>>>>>>>>>
>>>>>>>>>> A few minor style nits:
>>>>>>>>>>
>>>>>>>>>> Pre-existing:
>>>>>>>>>>
>>>>>>>>>> 54 //Notify the waiting thread, so it stops
>>>>>>>>>> waiting and sleeps
>>>>>>>>>>
>>>>>>>>>> Please add a space after //
>>>>>>>>>>
>>>>>>>>>> 106 // Start athread that just waits
>>>>>>>>>>
>>>>>>>>>> s/athread/a thread/
>>>>>>>>>>
>>>>>>>>>> 145 throw new RuntimeException(OBJECT_WAIT
>>>>>>>>>> 146 + " method has to contain
>>>>>>>>>> one lock record but it contains " + mi.getLocks().size());
>>>>>>>>>>
>>>>>>>>>> Indentation is wrong - the '+' should align with the O in
>>>>>>>>>> OBJECT. Break into three lines (at second + if needed)
>>>>>>>>>>
>>>>>>>>>> New:
>>>>>>>>>>
>>>>>>>>>> 154 if(mi.getLocks().size() == 1){
>>>>>>>>>>
>>>>>>>>>> Space after "if", and space before {
>>>>>>>>>>
>>>>>>>>>> 157 else{
>>>>>>>>>>
>>>>>>>>>> Space before {
>>>>>>>>>>
>>>>>>>>>> 158 throw new RuntimeException(RUN_METHOD
>>>>>>>>>> + " method has to contain one lock record but it contains "
>>>>>>>>>> 159 + mi.getLocks().size());
>>>>>>>>>>
>>>>>>>>>> Incorrect indentation - '+' should align with R in RUN
>>>>>>>>> Done!
>>>>>>>>>>> I'm not sure if it's okay to keep the change to
>>>>>>>>>>> serviceability/tmtools/jstack/utils/DefaultFormat.java then.
>>>>>>>>>>> It doesn't really affect this test, but it is needed for
>>>>>>>>>>> jstack to detect the locks that appear in the stack report
>>>>>>>>>>> with the message "waiting to re-lock in wait()".
>>>>>>>>>>
>>>>>>>>>> I'd probably revert that change at this stage.
>>>>>>>>> Done!
>>>>>>>>>
>>>>>>>>> Here is the new webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8214148.03/webrev/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Patricio
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Patricio
>>>>>>>>>>>> To me the simplest way to see the "re-lock in wait" case is
>>>>>>>>>>>> to just:
>>>>>>>>>>>>
>>>>>>>>>>>> synchronized(obj) {
>>>>>>>>>>>> obj.notifyAll();
>>>>>>>>>>>> <= take stack dump here =>
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 30/11/2018 5:52 am, Daniel D. Daugherty wrote:
>>>>>>>>>>>>> Adding serviceability-dev at ... since the 'tmtools'
>>>>>>>>>>>>> including 'jstack'
>>>>>>>>>>>>> are owned by the Serviceability team.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/28/18 4:21 PM, Patricio Chilano wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you review this fix for test
>>>>>>>>>>>>>> serviceability/tmtools/jstack/WaitNotifyThreadTest.java?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On one hand the test was not properly checking what it
>>>>>>>>>>>>>> was intended to check, since as mentioned in JBS the
>>>>>>>>>>>>>> logic for checking the method name was wrong. Also since
>>>>>>>>>>>>>> there was only one monitor in the test, the "for" loop
>>>>>>>>>>>>>> with the message "waiting to re-lock in wait()" was never
>>>>>>>>>>>>>> actually reached.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Additionally, with change 8150689 the message "waiting to
>>>>>>>>>>>>>> re-lock in wait()" is now shown in the frame where the
>>>>>>>>>>>>>> relocking is actually taking place, so the logic for
>>>>>>>>>>>>>> checking that should change.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I fixed the first issues and added logic to check for the
>>>>>>>>>>>>>> "waiting to re-lock in wait()" case. I used the
>>>>>>>>>>>>>> Thread.State attribute to check desire states are reached
>>>>>>>>>>>>>> before getting the thread dump reports through jstack. I
>>>>>>>>>>>>>> run the test in mach5 several times for all platforms and
>>>>>>>>>>>>>> they all passed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev URL:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8214148.01/webrev
>>>>>>>>>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8214148
>>>>>>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8150689>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Patricio
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>
More information about the hotspot-runtime-dev
mailing list