RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false
Gary Adams
gary.adams at oracle.com
Wed Jun 6 17:07:21 UTC 2018
It may be sufficient when the command prompt is delivered
to introduce a brief pause to allow an asynchronous event to
be processed before another command can be sent. e.g.
throttling the stream of commands being sent
diff --git a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
--- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
+++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdb/Jdb.java
@@ -348,7 +348,17 @@
int startPos = stdoutBuffer.length();
sendCommand(command);
- return receiveReply(startPos, compoundPromptOnly, count);
+ String[] reply = receiveReply(startPos, compoundPromptOnly, count);
+
+ if (asyncCommand(command)) {
+ // delay to allow debuggee time to process async command
+ try {
+ Thread.sleep(100);
+ } catch (InterruptedException e) {
+ // ignored
+ }
+ }
+ return reply;
}
On 6/6/18, 11:51 AM, Daniel D. Daugherty wrote:
> The two prompts are for two different things so I don't think
> that removing one of the prompts is the right solution either.
>
> In the case of an asynchronous command like 'cont', you get
> the '<prompt=after-cont-command>' right away because the command
> processor is ready for another command. You will also get the
> '<prompt-after-breakpoint-event-location1>' prompt when the
> breakpoint is hit because that identifies the thread that posted
> the breakpoint event. If it takes a long time to hit the breakpoint,
> then it will be a long time before the breakpoint prompt is seen.
> Likewise, if the breakpoint is hit quickly, then the breakpoint
> prompt will be seen more quickly and will sometimes interfere with
> the command processor prompt. That's normal for an async command.
>
> One possible way to fix this kind of issue is to structure the
> test flow to slightly delay the target thread so it takes longer
> to get to the breakpoint. '<prompt=after-cont-command>' will have
> a better chance to be printed before (and not colliding with) the
> '<prompt-after-breakpoint-event-location1>' prompt.
>
> So if the test program has:
>
> <some line of code>; // breakpoint here
>
> change it to something like:
>
> try { Thread.sleep(100); } catch (InterruptedException ie) {};
> <some line of code>; // breakpoint here
>
> Of course, I'm assuming here that the test driver code is looking
> for the breakpoint to be hit (and not for '<prompt=after-cont-command>').
> Jerry and I did find a few jdb based tests looking for the wrong
> prompt...
>
> Dan
>
>
> On 6/2/18 7:02 AM, gary.adams at oracle.com wrote:
>> On 6/1/18 6:54 PM, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> I'm a bit unclear on one point. According to your description,
>>> should we be seeing two prompts for commands like "cont", and
>>> "step"? When you say we are expecting:
>>>
>>> "Breakpoint hit:" <location>
>>> <prompt>
>>>
>>> Isn't it really:
>>>
>>> <prompt>
>>> "Breakpoint hit:" <location>
>>> <prompt>
>> Let me stretch out the log a little bit
>> The <prompt> is main[1] or Thread-1[1], etc.
>> and not distinguished for command or event completion.
>>
>> <prompt>
>> stop at <location1>
>> <prompt-after-stop-command>
>> stop at <location2>
>> <prompt-after-stop-command>
>> cont
>> <prompt=after-cont-command>
>> cont
>> "Breakpoint hit:" <location1>
>> <prompt-after-cont-command>
>> "Breakpoint hit:"
>> <prompt-after-breakpoint-event-location1>
>> <location2>
>> <prompt-after-breakpoint-event-location2>
>>
>> The source of the command or event prompts
>> was verified with a stacktrace in the printPrompt()
>> method.
>>>
>>> I figure this must be the case, because your fix removes the
>>> printing of a prompt, and we still need one prompt, so there must
>>> have been two to start with. I think in the above the last <prompt>
>>> is being produced sequentially right after the "Breakpoint hit:"
>>> output, so is never an issue. The first <prompt> comes from the
>>> printPrompt() call after issuing the "cont" command, and it's
>>> possible to hit the breakpoint and start producing more output
>>> before printPrompt() has completed, thus causing the problem you are
>>> seeing.
>> printPrompt() is a very quick operation and is not being interrupted.
>>
>> But the output from the event handler is produced incrementally.
>> - event header, e.g. "Breakpont hit:", "Step completed:", etc.
>> - from vmInterrupted()
>> - printCurrentLocation()
>> - monitor command output (if any)
>> - printPrompt()
>>>
>>> There seems to be some purpose to printing this first prompt and I'm
>>> not so sure eliminating it is the right thing to do. I think it
>>> might be there to indicate that the command has been initiated,
>>> because there could be some delay before the breakpoint is hit. So
>>> the issue then is how to make sure this prompt appears in the input
>>> stream before any of the "Step completed:" or "Breakpoint hit:"
>>> output. Possibly event processing should be blocked until
>>> executeCommand() completes.
>> The primary purposes of prompts is for "user interactive" nudge to
>> type another command.
>> But the failures are from tests that are using the presence of a
>> prompt as an ACK of the
>> last command. This "prompt-protocol", does not include any sequence
>> numbers ,
>> does not pair prompt to command or event, and does not block during
>> any multipart message formation.
>>
>> To prevent the error we are observing we would have to buffer all of
>> the event output
>>
>> "Breakpoint hit:" <location> <monitored-commands>
>> <prompt for breakpoint event>
>>
>> Another way to look at the current failure is to say the event
>> handler output
>> has been interrupted by a command sent prematurely. This would never
>> be seen at user speeds typing in commands.
>>
>> I should also mention any other output from the user application can
>> also
>> interfere with the command and event output streams. I have seen
>> stray prompts in the logs from System.out.println(). That was the
>> reason earlier I was looking at changing the tests to look for
>> specific prompts before continuing to the next command.
>>
>> \gra
>>> thanks,
>>>
>>> Chris
>>>
>>> On 5/26/18 3:50 AM, gary.adams at oracle.com wrote:
>>>> This is a review request for a previously closed test bug.
>>>> The test was recently moved to the open repos, and the
>>>> proposed fix is in the open code.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
>>>>
>>>>
>>>> -------- Forwarded Message --------
>>>> Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR:
>>>> Cannot find boolVar with expected value: false
>>>> Date: Fri, 25 May 2018 11:35:10 -0400
>>>> From: Gary Adams <gary.adams at oracle.com>
>>>> Reply-To: gary.adams at oracle.com
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> The jdb tests use stdin to send commands to a jdb process
>>>> and parses the stdout to determine if a command was
>>>> successful and when the process is prompting for new commands
>>>> to be sent.
>>>>
>>>> Some commands are synchronous, so when the command is completed
>>>> a new prompt is sent back immediately.
>>>>
>>>> Some commands are asynchronous, so there could be a delay
>>>> until a breakpoint is reached. The event handler then sends a prompt
>>>> when the application thread is stopped and new jdb commands can be
>>>> sent.
>>>>
>>>> The problem causing the intermittent failures was a corruption in the
>>>> output stream when prompts were being sent at the wrong times.
>>>>
>>>> Instead of receiving
>>>> "Breakpoint hit:" <location>
>>>> <prompt>
>>>>
>>>> the log contained
>>>> "Breakpoint hit:" <prompt> <location>
>>>>
>>>> Once out of sync, jdb commands were being sent prematurely
>>>> and the wrong values were being compared against expected behavior.
>>>> The simple fix proposed here recognizes that commands like "cont",
>>>> "step" and "next" are asynchronous commands and should not send back
>>>> a prompt immediately. Instead. the event handler will deliver the
>>>> next prompt
>>>> when the next "Breakpoint hit:" or "Step completed:" state change
>>>> occurs.
>>>>
>>>> The bulk of the testing was done on windows-x64-debug builds where the
>>>> intermittent failures were observed in ~5 in 1000 testruns. The fix
>>>> has
>>>> also been tested on linux-x64-debug, solaris-sparcv9-debug,
>>>> and macosx-x64-debug, even though the failures have never been
>>>> reported
>>>> against those platforms.
>>>>
>>>> Failures have been observed in many of the nsk/jdb tests with
>>>> similar corrupted
>>>> output streams, but never directly associated with this issue before.
>>>>
>>>> redefine001, caught_exception002, locals002, eval001, next001,
>>>> stop_at003, step002, print002, trace001, step_up001, read001,
>>>> clear004, kill001, set001
>>>>
>>>>
>>>
>>>
>>
>
More information about the serviceability-dev
mailing list