RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false
Chris Plummer
chris.plummer at oracle.com
Wed Jun 6 17:26:27 UTC 2018
Hi Gary,
I thought the issue was having 1 command like "cont" or "step" producing
two prompts, and the first prompt ending up in the middle of the output
for the breakpoint. So instead of:
main[1] cont
main[1]
"Breakpoint hit:" <location>
main[1]
We sometimes get:
main[1] cont
"Breakpoint hit:" main[1] <location>
main[1]
I don't think a delay on the receiving end like you have added can
effect this. You need to make sure the main[1] produced for the "cont"
command is in the output stream before handling the breakpoint event.
Dan's solution was a bit different in that it actually delayed reaching
the breakpoint. I don't think your solution does the same.
Chris
On 6/6/18 10:07 AM, Gary Adams wrote:
> It may be sufficient when the command prompt is delivered
> to introduce a brief pause to allow an asynchronous event to
> be processed before another command can be sent. e.g.
> throttling the stream of commands being sent
>
> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
> @@ -348,7 +348,17 @@
>
> int startPos = stdoutBuffer.length();
> sendCommand(command);
> - return receiveReply(startPos, compoundPromptOnly, count);
> + String[] reply = receiveReply(startPos, compoundPromptOnly,
> count);
> +
> + if (asyncCommand(command)) {
> + // delay to allow debuggee time to process async command
> + try {
> + Thread.sleep(100);
> + } catch (InterruptedException e) {
> + // ignored
> + }
> + }
> + return reply;
> }
>
>
> On 6/6/18, 11:51 AM, Daniel D. Daugherty wrote:
>> The two prompts are for two different things so I don't think
>> that removing one of the prompts is the right solution either.
>>
>> In the case of an asynchronous command like 'cont', you get
>> the '<prompt=after-cont-command>' right away because the command
>> processor is ready for another command. You will also get the
>> '<prompt-after-breakpoint-event-location1>' prompt when the
>> breakpoint is hit because that identifies the thread that posted
>> the breakpoint event. If it takes a long time to hit the breakpoint,
>> then it will be a long time before the breakpoint prompt is seen.
>> Likewise, if the breakpoint is hit quickly, then the breakpoint
>> prompt will be seen more quickly and will sometimes interfere with
>> the command processor prompt. That's normal for an async command.
>>
>> One possible way to fix this kind of issue is to structure the
>> test flow to slightly delay the target thread so it takes longer
>> to get to the breakpoint. '<prompt=after-cont-command>' will have
>> a better chance to be printed before (and not colliding with) the
>> '<prompt-after-breakpoint-event-location1>' prompt.
>>
>> So if the test program has:
>>
>> <some line of code>; // breakpoint here
>>
>> change it to something like:
>>
>> try { Thread.sleep(100); } catch (InterruptedException ie) {};
>> <some line of code>; // breakpoint here
>>
>> Of course, I'm assuming here that the test driver code is looking
>> for the breakpoint to be hit (and not for
>> '<prompt=after-cont-command>').
>> Jerry and I did find a few jdb based tests looking for the wrong
>> prompt...
>>
>> Dan
>>
>>
>> On 6/2/18 7:02 AM, gary.adams at oracle.com wrote:
>>> On 6/1/18 6:54 PM, Chris Plummer wrote:
>>>> Hi Gary,
>>>>
>>>> I'm a bit unclear on one point. According to your description,
>>>> should we be seeing two prompts for commands like "cont", and
>>>> "step"? When you say we are expecting:
>>>>
>>>> "Breakpoint hit:" <location>
>>>> <prompt>
>>>>
>>>> Isn't it really:
>>>>
>>>> <prompt>
>>>> "Breakpoint hit:" <location>
>>>> <prompt>
>>> Let me stretch out the log a little bit
>>> The <prompt> is main[1] or Thread-1[1], etc.
>>> and not distinguished for command or event completion.
>>>
>>> <prompt>
>>> stop at <location1>
>>> <prompt-after-stop-command>
>>> stop at <location2>
>>> <prompt-after-stop-command>
>>> cont
>>> <prompt=after-cont-command>
>>> cont
>>> "Breakpoint hit:" <location1>
>>> <prompt-after-cont-command>
>>> "Breakpoint hit:"
>>> <prompt-after-breakpoint-event-location1>
>>> <location2>
>>> <prompt-after-breakpoint-event-location2>
>>>
>>> The source of the command or event prompts
>>> was verified with a stacktrace in the printPrompt()
>>> method.
>>>>
>>>> I figure this must be the case, because your fix removes the
>>>> printing of a prompt, and we still need one prompt, so there must
>>>> have been two to start with. I think in the above the last <prompt>
>>>> is being produced sequentially right after the "Breakpoint hit:"
>>>> output, so is never an issue. The first <prompt> comes from the
>>>> printPrompt() call after issuing the "cont" command, and it's
>>>> possible to hit the breakpoint and start producing more output
>>>> before printPrompt() has completed, thus causing the problem you
>>>> are seeing.
>>> printPrompt() is a very quick operation and is not being interrupted.
>>>
>>> But the output from the event handler is produced incrementally.
>>> - event header, e.g. "Breakpont hit:", "Step completed:", etc.
>>> - from vmInterrupted()
>>> - printCurrentLocation()
>>> - monitor command output (if any)
>>> - printPrompt()
>>>>
>>>> There seems to be some purpose to printing this first prompt and
>>>> I'm not so sure eliminating it is the right thing to do. I think it
>>>> might be there to indicate that the command has been initiated,
>>>> because there could be some delay before the breakpoint is hit. So
>>>> the issue then is how to make sure this prompt appears in the input
>>>> stream before any of the "Step completed:" or "Breakpoint hit:"
>>>> output. Possibly event processing should be blocked until
>>>> executeCommand() completes.
>>> The primary purposes of prompts is for "user interactive" nudge to
>>> type another command.
>>> But the failures are from tests that are using the presence of a
>>> prompt as an ACK of the
>>> last command. This "prompt-protocol", does not include any sequence
>>> numbers ,
>>> does not pair prompt to command or event, and does not block during
>>> any multipart message formation.
>>>
>>> To prevent the error we are observing we would have to buffer all of
>>> the event output
>>>
>>> "Breakpoint hit:" <location> <monitored-commands>
>>> <prompt for breakpoint event>
>>>
>>> Another way to look at the current failure is to say the event
>>> handler output
>>> has been interrupted by a command sent prematurely. This would never
>>> be seen at user speeds typing in commands.
>>>
>>> I should also mention any other output from the user application can
>>> also
>>> interfere with the command and event output streams. I have seen
>>> stray prompts in the logs from System.out.println(). That was the
>>> reason earlier I was looking at changing the tests to look for
>>> specific prompts before continuing to the next command.
>>>
>>> \gra
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 5/26/18 3:50 AM, gary.adams at oracle.com wrote:
>>>>> This is a review request for a previously closed test bug.
>>>>> The test was recently moved to the open repos, and the
>>>>> proposed fix is in the open code.
>>>>>
>>>>> Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
>>>>>
>>>>>
>>>>> -------- Forwarded Message --------
>>>>> Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR:
>>>>> Cannot find boolVar with expected value: false
>>>>> Date: Fri, 25 May 2018 11:35:10 -0400
>>>>> From: Gary Adams <gary.adams at oracle.com>
>>>>> Reply-To: gary.adams at oracle.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The jdb tests use stdin to send commands to a jdb process
>>>>> and parses the stdout to determine if a command was
>>>>> successful and when the process is prompting for new commands
>>>>> to be sent.
>>>>>
>>>>> Some commands are synchronous, so when the command is completed
>>>>> a new prompt is sent back immediately.
>>>>>
>>>>> Some commands are asynchronous, so there could be a delay
>>>>> until a breakpoint is reached. The event handler then sends a prompt
>>>>> when the application thread is stopped and new jdb commands can be
>>>>> sent.
>>>>>
>>>>> The problem causing the intermittent failures was a corruption in the
>>>>> output stream when prompts were being sent at the wrong times.
>>>>>
>>>>> Instead of receiving
>>>>> "Breakpoint hit:" <location>
>>>>> <prompt>
>>>>>
>>>>> the log contained
>>>>> "Breakpoint hit:" <prompt> <location>
>>>>>
>>>>> Once out of sync, jdb commands were being sent prematurely
>>>>> and the wrong values were being compared against expected behavior.
>>>>> The simple fix proposed here recognizes that commands like "cont",
>>>>> "step" and "next" are asynchronous commands and should not send back
>>>>> a prompt immediately. Instead. the event handler will deliver the
>>>>> next prompt
>>>>> when the next "Breakpoint hit:" or "Step completed:" state change
>>>>> occurs.
>>>>>
>>>>> The bulk of the testing was done on windows-x64-debug builds where
>>>>> the
>>>>> intermittent failures were observed in ~5 in 1000 testruns. The
>>>>> fix has
>>>>> also been tested on linux-x64-debug, solaris-sparcv9-debug,
>>>>> and macosx-x64-debug, even though the failures have never been
>>>>> reported
>>>>> against those platforms.
>>>>>
>>>>> Failures have been observed in many of the nsk/jdb tests with
>>>>> similar corrupted
>>>>> output streams, but never directly associated with this issue before.
>>>>>
>>>>> redefine001, caught_exception002, locals002, eval001, next001,
>>>>> stop_at003, step002, print002, trace001, step_up001, read001,
>>>>> clear004, kill001, set001
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
More information about the serviceability-dev
mailing list