RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false

Wed Jun 6 17:26:27 UTC 2018

Hi Gary,

I thought the issue was having 1 command like "cont" or "step" producing 
two prompts, and the first prompt ending up in the middle of the output 
for the breakpoint. So instead of:

main[1] cont
main[1]
"Breakpoint hit:" <location>
main[1]

We sometimes get:

main[1] cont
"Breakpoint hit:" main[1] <location>
main[1]

I don't think a delay on the receiving end like you have added can 
effect this. You need to make sure the main[1] produced for the "cont" 
command is in the output stream before handling the breakpoint event. 
Dan's solution was a bit different in that it actually delayed reaching 
the breakpoint. I don't think your solution does the same.

Chris

On 6/6/18 10:07 AM, Gary Adams wrote:
> It may be sufficient when the command prompt is delivered
> to introduce a brief pause to allow an asynchronous event to
> be processed before another command can be sent. e.g.
> throttling the stream of commands being sent
>
> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java 
> b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> @@ -348,7 +348,17 @@
>
>          int startPos = stdoutBuffer.length();
>          sendCommand(command);
> -        return receiveReply(startPos, compoundPromptOnly, count);
> +        String[] reply = receiveReply(startPos, compoundPromptOnly, 
> count);
> +
> +        if (asyncCommand(command)) {
> +            // delay to allow debuggee time to process async command
> +            try {
> +                Thread.sleep(100);
> +            } catch (InterruptedException e) {
> +                // ignored
> +            }
> +        }
> +        return reply;
>      }
>
>
> On 6/6/18, 11:51 AM, Daniel D. Daugherty wrote:
>> The two prompts are for two different things so I don't think
>> that removing one of the prompts is the right solution either.
>>
>> In the case of an asynchronous command like 'cont', you get
>> the '<prompt=after-cont-command>' right away because the command
>> processor is ready for another command. You will also get the
>> '<prompt-after-breakpoint-event-location1>' prompt when the
>> breakpoint is hit because that identifies the thread that posted
>> the breakpoint event. If it takes a long time to hit the breakpoint,
>> then it will be a long time before the breakpoint prompt is seen.
>> Likewise, if the breakpoint is hit quickly, then the breakpoint
>> prompt will be seen more quickly and will sometimes interfere with
>> the command processor prompt. That's normal for an async command.
>>
>> One possible way to fix this kind of issue is to structure the
>> test flow to slightly delay the target thread so it takes longer
>> to get to the breakpoint. '<prompt=after-cont-command>' will have
>> a better chance to be printed before (and not colliding with) the
>> '<prompt-after-breakpoint-event-location1>' prompt.
>>
>> So if the test program has:
>>
>> <some line of code>;  // breakpoint here
>>
>> change it to something like:
>>
>>     try { Thread.sleep(100); } catch (InterruptedException ie) {};
>> <some line of code>;  // breakpoint here
>>
>> Of course, I'm assuming here that the test driver code is looking
>> for the breakpoint to be hit (and not for 
>> '<prompt=after-cont-command>').
>> Jerry and I did find a few jdb based tests looking for the wrong 
>> prompt...
>>
>> Dan
>>
>>
>> On 6/2/18 7:02 AM, gary.adams at oracle.com wrote:
>>> On 6/1/18 6:54 PM, Chris Plummer wrote:
>>>> Hi Gary,
>>>>
>>>> I'm a bit unclear on one point. According to your description, 
>>>> should we be seeing two prompts for commands like "cont", and 
>>>> "step"? When you say we are expecting:
>>>>
>>>>   "Breakpoint hit:" <location>
>>>> <prompt>
>>>>
>>>> Isn't it really:
>>>>
>>>> <prompt>
>>>>   "Breakpoint hit:" <location>
>>>> <prompt>
>>> Let me stretch out the log a little bit
>>> The <prompt> is main[1] or Thread-1[1], etc.
>>> and not distinguished for command or event completion.
>>>
>>> <prompt>
>>>    stop at <location1>
>>> <prompt-after-stop-command>
>>>    stop at <location2>
>>> <prompt-after-stop-command>
>>>    cont
>>> <prompt=after-cont-command>
>>>    cont
>>>    "Breakpoint hit:" <location1>
>>> <prompt-after-cont-command>
>>>   "Breakpoint hit:"
>>> <prompt-after-breakpoint-event-location1>
>>> <location2>
>>> <prompt-after-breakpoint-event-location2>
>>>
>>> The source of the command or event prompts
>>> was verified with a stacktrace in the printPrompt()
>>> method.
>>>>
>>>> I figure this must be the case, because your fix removes the 
>>>> printing of a prompt, and we still need one prompt, so there must 
>>>> have been two to start with. I think in the above the last <prompt> 
>>>> is being produced sequentially right after the "Breakpoint hit:" 
>>>> output, so is never an issue. The first <prompt> comes from the 
>>>> printPrompt() call after issuing the "cont" command, and it's 
>>>> possible to hit the breakpoint and start producing more output 
>>>> before printPrompt() has completed, thus causing the problem you 
>>>> are seeing.
>>> printPrompt() is a very quick operation and is not being interrupted.
>>>
>>> But the output from the event handler is produced incrementally.
>>>    - event header, e.g. "Breakpont hit:", "Step completed:", etc.
>>>    - from vmInterrupted()
>>>      - printCurrentLocation()
>>>      - monitor command output (if any)
>>>      - printPrompt()
>>>>
>>>> There seems to be some purpose to printing this first prompt and 
>>>> I'm not so sure eliminating it is the right thing to do. I think it 
>>>> might be there to indicate that the command has been initiated, 
>>>> because there could be some delay before the breakpoint is hit. So 
>>>> the issue then is how to make sure this prompt appears in the input 
>>>> stream before any of the "Step completed:" or "Breakpoint hit:" 
>>>> output. Possibly event processing should be blocked until 
>>>> executeCommand() completes.
>>> The primary purposes of prompts is for "user interactive" nudge to 
>>> type another command.
>>> But the failures are from tests that are using the presence of a 
>>> prompt as an ACK of the
>>> last command. This "prompt-protocol", does not include any sequence 
>>> numbers ,
>>> does not pair prompt to command or event, and does not block during
>>> any multipart message formation.
>>>
>>> To prevent the error we are observing we would have to buffer all of 
>>> the event output
>>>
>>>    "Breakpoint hit:" <location> <monitored-commands>
>>> <prompt for breakpoint event>
>>>
>>> Another way to look at the current failure is to say the event 
>>> handler output
>>> has been interrupted by a command sent prematurely. This would never
>>> be seen at user speeds typing in commands.
>>>
>>> I should also mention any other output from the user application can 
>>> also
>>> interfere with the command and event output streams. I have seen
>>> stray prompts in the logs from System.out.println(). That was the
>>> reason earlier I was looking at changing the tests to look for
>>> specific prompts before continuing to the next command.
>>>
>>> \gra
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 5/26/18 3:50 AM, gary.adams at oracle.com wrote:
>>>>> This is a review request for a previously closed test bug.
>>>>> The test was recently moved to the open repos, and the
>>>>> proposed fix is in the open code.
>>>>>
>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
>>>>>
>>>>>
>>>>> -------- Forwarded Message --------
>>>>> Subject:     RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: 
>>>>> Cannot find boolVar with expected value: false
>>>>> Date:     Fri, 25 May 2018 11:35:10 -0400
>>>>> From:     Gary Adams <gary.adams at oracle.com>
>>>>> Reply-To:     gary.adams at oracle.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The jdb tests use stdin to send commands to a jdb process
>>>>> and parses the stdout to determine if a command was
>>>>> successful and when the process is prompting for new commands
>>>>> to be sent.
>>>>>
>>>>> Some commands are synchronous, so when the command is completed
>>>>> a new prompt is sent back immediately.
>>>>>
>>>>> Some commands are asynchronous, so there could be a delay
>>>>> until a breakpoint is reached. The event handler then sends a prompt
>>>>> when the application thread is stopped and new jdb commands can be 
>>>>> sent.
>>>>>
>>>>> The problem causing the intermittent failures was a corruption in the
>>>>> output stream when prompts were being sent at the wrong times.
>>>>>
>>>>> Instead of receiving
>>>>>    "Breakpoint hit:" <location>
>>>>> <prompt>
>>>>>
>>>>> the log contained
>>>>>    "Breakpoint hit:" <prompt> <location>
>>>>>
>>>>> Once out of sync, jdb commands were being sent prematurely
>>>>> and the wrong values were being compared against expected behavior.
>>>>> The simple fix proposed here recognizes that commands like "cont",
>>>>> "step" and "next" are asynchronous commands and should not send back
>>>>> a prompt immediately. Instead. the event handler will deliver the 
>>>>> next prompt
>>>>> when the next "Breakpoint hit:" or "Step completed:" state change 
>>>>> occurs.
>>>>>
>>>>> The bulk of the testing was done on windows-x64-debug builds where 
>>>>> the
>>>>> intermittent failures were observed in ~5 in 1000 testruns. The 
>>>>> fix has
>>>>> also been tested on linux-x64-debug, solaris-sparcv9-debug,
>>>>> and macosx-x64-debug, even though the failures have never been 
>>>>> reported
>>>>> against those platforms.
>>>>>
>>>>> Failures have been observed in many of the nsk/jdb tests with 
>>>>> similar corrupted
>>>>> output streams, but never directly associated with this issue before.
>>>>>
>>>>>   redefine001, caught_exception002, locals002, eval001, next001,
>>>>>   stop_at003, step002, print002, trace001, step_up001, read001,
>>>>>   clear004, kill001, set001
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>