RFR: JDK-4903717: nsk/jdi/ThreadReference/isSuspended/issuspended002 failing

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Sat Feb 23 04:21:09 UTC 2019


Hi Gary,

Thank you for cosmetic changes.

One thing to notice there is no guard against spurious wakeup for this:
    108 Threadissuspended002a.waitnotifyObj.wait();

But I doubt it causes these timeouts as they are very well reproducible.

This time I looked more to the debugger side.
The code is poorly structured, it is hard to understand what it is doing.

 >   - resumed main thread after breakpoint enabled and thread2
      double resume

Could you, please, explain more how does this help and why timeouts are 
intermittent?

Thanks,
Serguei


On 2/22/19 11:52 AM, Gary Adams wrote:
> After 1000 successful testruns, it seems like a reasonable solution is
> to resume the main thread to resolve any locks holding back the
> thread2 reaching the desired breakpoint.
>
>   Updated webrev: http://cr.openjdk.java.net/~gadams/4903717/webrev.01/
>
> Changes in this iteration:
>   - cosmetic changes
>      - spaces around operators
>      - removed extra spaces around parens
>      - removed extra blank lines
>      - fixed typo in ERRROR
>      - fixed curl braces indenting
>   - resumed main thread after breakpoint enabled and thread2
>      double resume
>
>
> On 2/22/19, 6:19 AM, gary.adams at oracle.com wrote:
>> I'll make a pass over the code for cosmetic fixes after the test
>> is working correctly.
>>
>> I think the purpose of the lockingObject is to ensure the debugger
>> has time to enable the breakpoint and perform the double resume
>> before debuggee thread2 is released to run to the expected breakpoint.
>>
>> I'm assuming the timeout and break point location are correct, because
>> the test passes most of the time. I'll dump the values to make sure.
>>
>> I'll also try to capture a thread dump at the time of the timeout,
>> to see if the debuggee is hung somewhere unexpected.
>>
>> The test leaves the main thread suspended while thread2 is
>> is resumed. I'll see if resuming the main thread unblocks thread2.
>>
>> ...
>>
>> On 2/21/19 10:18 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Gary,
>>>
>>> I do not see any cause of this timeout yet.
>>> However, the synchronization on the lockingObject looks completely 
>>> wasteful.
>>> Moreover, it adds unreasonable complexity and makes the test harder 
>>> to understand.
>>>
>>> So, I'd suggested to do a small cleanup that includes:
>>>   - removing the lockingObject field and sync around it
>>>   - fix wrong indents in the fragment:
>>>   161         synchronized (waitnotifyObj)                    {
>>>   162             log("entered into block:  synchronized (waitnotifyObj)");
>>>   163             waitnotifyObj.notify();                     }
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 2/21/19 10:57, gary.adams at oracle.com wrote:
>>>> The commented out "docontinue" was done to match the style
>>>> used in the other tests. I'll remove it for this test.
>>>>
>>>> In any event, the test still is failing 8 / 500 testruns on
>>>> macosx and windows debug builds.
>>>>
>>>> The change is not sufficient.
>>>>
>>>> Here's a recent failure. You can see in the log that the debugger did
>>>> wait until the new "docontinue" arrives outside the lockingObject
>>>> synchronized block.
>>>>
>>>> It appears with the main thread suspended and the thread2 double 
>>>> resumed,
>>>> it still does not continue executing up to the breakpoint in thread2.
>>>> After the eventQueue.remove(timeout), when the vm is resumed
>>>> you can see the debuggee messages expected leading up to the 
>>>> breakpoint.
>>>>
>>>> ...
>>>>
>>>> --> debugger:       suspending VM with vm.suspend();
>>>> *debugee.stderr> **> mainThread: mainThread is out of: synchronized 
>>>> (lockingObject) {*
>>>> --> debugger:          thread2.suspendCount() == 2
>>>> --> debugger:      :   thread2.isSuspended()
>>>> --> debugger:          mainThread.suspendCount() == 1
>>>> --> debugger:      :   mainThread.isSuspended()
>>>> --> debugger:       double resuming thread2 with thread2.resume();
>>>> --> debugger:          thread2.suspendCount() == 1
>>>> --> debugger:      :   thread2.isSuspended()
>>>> --> debugger:      enabling breakpRequest1
>>>> --> debugger:          thread2.suspendCount() == 0
>>>> --> debugger:      :  !thread2.isAtBreakpoint() before call to breakpoint()
>>>> --> debugger:        waiting for BreakpointEvent
>>>> *--> debugger: new: eventSet = eventQueue.remove();*
>>>> # ERROR: ##> debugger: ERROR:  timeout for waiting for a BreakpintEvent
>>>> The following stacktrace is for failure analysis.
>>>> nsk.share.TestFailure: ##> debugger: ERROR:  timeout for waiting for a BreakpintEvent
>>>> 	at nsk.share.Log.logExceptionForFailureAnalysis(Log.java:428)
>>>> 	at nsk.share.Log.complain(Log.java:399)
>>>> 	at nsk.jdi.ThreadReference.isSuspended.issuspended002.log3(issuspended002.java:104)
>>>> 	at nsk.jdi.ThreadReference.isSuspended.issuspended002.breakpoint(issuspended002.java:555)
>>>> 	at nsk.jdi.ThreadReference.isSuspended.issuspended002.runThis(issuspended002.java:389)
>>>> 	at nsk.jdi.ThreadReference.isSuspended.issuspended002.run(issuspended002.java:90)
>>>> 	at nsk.jdi.ThreadReference.isSuspended.issuspended002.main(issuspended002.java:85)
>>>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> 	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
>>>> 	at PropertyResolvingWrapper.main(PropertyResolvingWrapper.java:104)
>>>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> 	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
>>>> 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>>>> 	at java.base/java.lang.Thread.run(Thread.java:835)
>>>> *--> debugger: resuming both second and main thread debugee.stderr> 
>>>> **> mainThread: waiting for an instruction from the debugger ... 
>>>> --> debugger: the end of testing debugee.stderr> **> thread2: 
>>>> entered into block: synchronized (lockingObject) debugee.stderr> 
>>>> **> thread2: exited from block: synchronized (lockingObject) 
>>>> debugee.stderr> **> thread2: call to the method 'runt1' 
>>>> debugee.stderr> **> thread2: method 'runt1' enter*
>>>> --> debugger:      : returned string is 'checkend'
>>>> debugee.stderr> **> mainThread: waiting for an instruction from the debugger ...
>>>>
>>>>
>>>> On 2/20/19 5:15 PM, Alex Menkov wrote:
>>>>> +1
>>>>>
>>>>> But could you please remove old "pipe.println("docontinue")" 
>>>>> statement instead of commenting it out (no new webrev is required)
>>>>>
>>>>> --alex
>>>>>
>>>>> On 02/20/2019 09:58, Chris Plummer wrote:
>>>>>> Looks good.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 2/20/19 7:57 AM, Gary Adams wrote:
>>>>>>> The issuspended002 has been on the ProblemList since the tests 
>>>>>>> were moved to
>>>>>>> the open repos. The proposed changeset applies the same fix that 
>>>>>>> was used in issuspended001.
>>>>>>>
>>>>>>> The current test fails when the debuggee replies with the 
>>>>>>> "docontinue" response,
>>>>>>> while it is still holding a lock in a syncrhonized block. When 
>>>>>>> the debugger then
>>>>>>> suspends the vm the debuggee test thread fails to proceed to the 
>>>>>>> expected breakpoint.
>>>>>>> Testing in progress.
>>>>>>>
>>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-4903717
>>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/4903717/webrev.00/
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20190222/1e1dac80/attachment-0001.html>


More information about the serviceability-dev mailing list