RFR: JDK-8208473: [TESTBUG] nsk/jdb/exclude/exclude001/exclude001.java is timing out on solaris-sparc again

Alex Menkov alexey.menkov at oracle.com
Fri Sep 28 18:04:48 UTC 2018


Hi Gary,

receiveReply(startPos, false, 0)
calls
waitForPrompt(startPos, compoundPromptOnly, count);

and waitForPrompt has:
         if (count <= 0) {
             throw new TestBug("Wrong number of prompts count in Jdb.waitForPrompt(): " + count);
         }

So we will get a "Wrong number of prompts count" failure?
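
To make the concern concrete, here is a minimal sketch of the guard that would be needed, assuming receiveReply() is changed to skip the prompt wait when the count is 0. PromptCountSketch is a hypothetical stand-in, not the real nsk.share.jdb.Jdb class: it indexes a list of lines instead of the jdb output buffer and throws IllegalStateException in place of TestBug.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for nsk.share.jdb.Jdb; only the prompt-count logic is modeled.
class PromptCountSketch {
    private final List<String> buffer = new ArrayList<>();
    int promptWaits = 0; // counts how often waitForPrompt() was entered successfully

    PromptCountSketch(List<String> lines) {
        buffer.addAll(lines);
    }

    // Mirrors the check quoted above: a non-positive count is treated as a test bug.
    void waitForPrompt(int startPos, boolean compoundPromptOnly, int count) {
        if (count <= 0) {
            throw new IllegalStateException(
                "Wrong number of prompts count in waitForPrompt(): " + count);
        }
        promptWaits++; // the real code would poll the debugger output here
    }

    // Assumption: a count of 0 means "return the current buffer without waiting".
    String[] receiveReply(int startPos, boolean compoundPromptOnly, int count) {
        if (count != 0) {
            waitForPrompt(startPos, compoundPromptOnly, count);
        }
        return buffer.subList(startPos, buffer.size()).toArray(new String[0]);
    }

    public static void main(String[] args) {
        PromptCountSketch jdb = new PromptCountSketch(Arrays.asList(">", "MyThread-0[1]"));
        for (String line : jdb.receiveReply(0, false, 0)) {
            System.out.println(line);
        }
    }
}
```

Without the count != 0 guard, the call with 0 falls straight through to the exception in waitForPrompt().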

--alex

On 09/28/2018 04:47, Gary Adams wrote:
> Revised webrev:
> 
>    Webrev: http://cr.openjdk.java.net/~gadams/8208473/webrev.01/
> 
> The final fix includes
>      - updated the timeout for the test (should handle sparc debug slowness)
>      - wait for explicit prompts from cont command (avoids confusion from "int[2]")
>      - fixed a typo in an exclude pattern ("jdk.*")
>      - on wait for message timeout, don't wait for prompt when dumping current
> 
> Should have another reviewer in addition to Chris.
> 
> On 9/27/18, 3:12 PM, Chris Plummer wrote:
>> The extra check after timing out doesn't seem like it should help. 
>> You've already called findMessage() 2100 times at 200ms intervals. Why 
>> would one more call after that help? I think it might be the 
>> receiveReply() call that is fixing it. It does a waitForPrompt(), so 
>> this probably gives us another 420000 ms for the prompt to come in. 
>> This call to receiveReply() is actually a bug itself since we are 
>> doing it just to print the current buffer, not the buffer after 
>> waiting for a prompt to come in.
>>
>> In any case, looks like this prompt is taking more than 420200 
>> milliseconds to come in, but does eventually come in, and extra 
>> waiting in receiveReply() is what is causing you to eventually see the 
>> prompt. I think bumping up the timeout to 600 and the waittime to 10 
>> is the proper fix here.
>>
>> And to address the receiveReply() issue, I'd suggest calling it using 
>> receiveReply(startPos, false, 0), where 0 is the prompt count, and 
>> have receiveReply() not wait for a prompt when the count is 0.
>>
>> Chris
>>
>> On 9/27/18 11:44 AM, Gary Adams wrote:
>>> Speaking of not being bullet proof, during testing of the fix to
>>> wait for a specific prompt an intermittent failure was observed.
>>> ...
>>>
>>> Sending command: trace methods 0x2a9
>>> reply[0]: MyThread-0[1]
>>> Sending command: cont
>>> WARNING: message not recieved: MyThread-0[1]
>>> Remaining debugger output follows:
>>> reply[0]:>
>>> reply[1]: Method exited: return value =<void value>, "thread=MyThread-0", nsk.jdb.exclude.exclude001.MyThread.run(), line=93 bci=14
>>> reply[2]: 93        }
>>> reply[3]:
>>> reply[4]: MyThread-0[1]
>>> # ERROR: Caught unexpected exception while executing the test: nsk.share.Failure: Expected message not received during 420200 milliseconds:
>>> ...
>>>
>>> The wait for message times out looking for "MyThread-0[1]".
>>> A WARNING is printed and the "remaining debugger output"
>>> shows that "MyThread-0[1]" is in the buffer.
>>>
>>> I'm still investigating why the message match is not found.
>>>
>>> Adding a final check before failing the wait for message
>>> seems to work around the problem.
>>>
>>> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
>>> --- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
>>> +++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
>>> @@ -515,10 +515,11 @@
>>>          long delta = 200; // time in milliseconds to wait at every iteration.
>>>          long total = 0;    // total time has waited.
>>>          long max = getLauncher().getJdbArgumentHandler().getWaitTime() * 60 * 1000; // maximum time to wait.
>>> +        int found = 0;
>>>
>>>          Object dummy = new Object();
>>>          while ((total += delta) <= max) {
>>> -            int found = 0;
>>> +            found = 0;
>>>
>>>              // search for message
>>>              {
>>> @@ -553,6 +554,12 @@
>>>          log.display("WARNING: message not recieved: " + message);
>>>          log.display("Remaining debugger output follows:");
>>>          receiveReply(startPos);
>>> +
>>> +        // One last chance
>>> +        found = findMessage(startPos, message);
>>> +        if (found > 0) {
>>> +            return found;
>>> +        }
>>>          throw new Failure("Expected message not received during " + total + " milliseconds:"
>>>                              + "\n\t" + message);
>>>      }
>>>
>>>
>>> On 9/20/18, 5:47 PM, Chris Plummer wrote:
>>>> Looks good. Still not bullet proof, but I'm not sure it's possible 
>>>> to write tests like this in a way that will work no matter what 
>>>> output is produced by the method enter/exit events.
>>>>
>>>> Chris
>>>>
>>>> On 9/20/18 10:59 AM, Gary Adams wrote:
>>>>> The test failure has been traced to "int[2]" being
>>>>> misrecognized as a compound prompt.  This caused a cont
>>>>> command to be sent prematurely.
>>>>>
>>>>> The proposed fix waits for the correct prompt before
>>>>> advancing to the next command.
>>>>>
>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8208473/webrev/
>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8208473
>>>>>
>>>>> Testing is in progress.
>>>>
>>>>
>>>>
>>>
>>
>>
> 
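
The figures in the thread (2100 polls at 200 ms, and the 420200 ms in the failure message) follow from the shape of the wait loop in the quoted diff. A small sketch of that arithmetic, assuming a wait time of 7 minutes, which is inferred from 420000 ms and not stated in the thread; WaitLoopSketch and simulate() are hypothetical names.

```java
// Hypothetical helper reproducing only the loop arithmetic from Jdb.waitForMessage();
// no debugger is involved and nothing actually sleeps.
class WaitLoopSketch {
    // Returns {iterations, total}, where total is the value reported in the Failure message.
    static long[] simulate(long waitTimeMinutes, long deltaMs) {
        long max = waitTimeMinutes * 60 * 1000; // maximum time to wait, in milliseconds
        long total = 0;
        long iterations = 0;
        // Same loop condition as the diff: total is incremented before the comparison,
        // so it overshoots max by one delta when the loop exits.
        while ((total += deltaMs) <= max) {
            iterations++;
        }
        return new long[] { iterations, total };
    }

    public static void main(String[] args) {
        long[] current = simulate(7, 200);   // assumed current wait time
        long[] proposed = simulate(10, 200); // wait time of 10 as suggested in the thread
        System.out.println(current[0] + " polls, reported " + current[1] + " ms");
        System.out.println(proposed[0] + " polls, reported " + proposed[1] + " ms");
    }
}
```

Because total is incremented before the comparison, the reported time is always one delta above the configured maximum.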

