RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false

Gary Adams gary.adams at oracle.com
Mon Aug 6 16:54:21 UTC 2018


On 7/30/18, 12:46 PM, Chris Plummer wrote:
> On 7/30/18 12:47 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>> Just one quick simple comment below.
>>
>> On 7/29/18 22:05, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> I updated my changes to do the wait() for step, stepi, cont, and 
>>> next. Some issues turned up when testing. A number of tests were 
>>> timing out occasionally, I believe because sometimes the event was 
>>> handled (and the notify() call already completed) before we ever got 
>>> to the handler.wait() call.
>>
>> I agree as I had this exact concern when was looking at the webrev.0.
>> Need more time to understand all the details.
> Hopefully it's a moot point since my 2nd version doesn't rely on any 
> synchronization.
>
> http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/
>
> I think a properly working synchronized version would require a large 
> synchronized block in TTY::executeCommand() to block out any event 
> handling until the prompt has been printed. I think this is safe even 
> when you have a monitor command in place. Although that results in 
> TTY::executeCommand() being called from the event handler, and it may 
> temporarily block if the command thread is already in 
> TTY::executeCommand(), eventually the command thread will exit the 
> synchronized block allowing the event thread to make progress. Having 
> said that, I just assume stick with the simpler webrev above if no one 
> see issues with it.

I like the approach used here - to print the prompt immediately
and skip the the prompt at the bottom of the command processing.

The original error though was an interspersed prompt disrupting
the event handler output, which came from another event
being processed. e.g.
   "Breakpoint hit:"
   prompt from previous event
   printCurrentLocation()

Let me try a few test runs with your patch
along with a diagnostic in printPrompt()
so we can identify the printPrompt caller.

   new Exception().printStackTrace()

...
>
> thanks,
>
> Chris
>>
>> Thanks,
>> Serguei
>>
>>
>>> Although I never confirmed this was the cause, it certainly seems 
>>> feasible.
>>>
>>> There were also issues with the prompt printed by 
>>> TTY:executeCommand(). I've come to realize that it should really 
>>> always be printed before any event handling. Currently that seems to 
>>> be the case, probably due to the Thread.yield() I mention below in 
>>> event handling code like stepEvent(), although I don't think it's 
>>> necessarily guaranteed. However, with my changes prompt was always 
>>> printed after the event was done being handled, and this proved to 
>>> be a problem for some tests.
>>>
>>> When executing the "cont" command, it seems this prompt is expected 
>>> to be "> ", which is the prompt printed if the VM is not currently 
>>> suspended. Some tests count on this, mostly because they call 
>>> jdb.receiveReplyFor(), and pass in the number of prompts that are 
>>> expected. jdb.receiveReplyFor() does not count "> ". My change ended 
>>> up making the printPrompt() in TTY:executeCommand() print as 
>>> "main[1]" after a "cont" command, because the added synchronizing 
>>> was making it not print until the thread was suspended. This caused 
>>> some tests to fail because the expected reply was not yet received, 
>>> even after jdb.receiveReplyFor() had returned (it was basically 
>>> returning one prompt too soon).
>>>
>>> I started thinking that the synchronized approach probably needs a 
>>> large synchronized block in TTY:executeCommand(). But then I figured 
>>> out a much easier solution.  Thinking more about the "prompt after 
>>> cont" issue, it occurred to me that maybe we just need to print out 
>>> the prompt before executing the command, so it's not possible for 
>>> events to come in before the prompt is printed (or worse yet, in the 
>>> middle of event handling output). Doing this ends up looking much 
>>> like your original suggestion to set printPrompt = false for "cont", 
>>> "step", "stepi", and "next", except I also added a call 
>>> printPrompt() before doing this (Dan an I had rejected your proposal 
>>> to simply remove the prompt, since it did server a purpose). For 
>>> example:
>>>
>>>                         } else if (cmd.equals("cont")) {
>>>                             MessageOutput.printPrompt();
>>>                             showPrompt = false;
>>>                             evaluator.commandCont();
>>>
>>> But this does cause the issue I brought up above that the prompt 
>>> from "cont" is expected (by a number of tests) to be "> ", but since 
>>> the above code is executed while the vm is suspended, it ends up 
>>> being main[1]. I fixed this by adding a "simple" argument to 
>>> printPrompt() so we can force it to be "> " when needed. This will 
>>> maintain compatibility with anyone making assumptions about what the 
>>> prompt should look like.
>>>
>>> Here's the webrev, which seems to be working. I've run about 1000 
>>> iterations on all platforms. So far it seems to be working. There 
>>> have been some failures, but for the most part I've also seen those 
>>> with clean repos, and the ones that I haven't I don't believe are 
>>> related to my changes.
>>>
>>> http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/
>>>
>>> One other comment about your initial fix to set showPrompt = false:
>>>
>>> + showPrompt = false; // asynchronous command
>>>
>>> I'm not sure if the "asynchronous command" comment was intentional 
>>> or just the result of the copy-and-paste. I just want to clarify 
>>> what is meant by "asynchronous command" in the other places where 
>>> this comment appears. There are a few commands that are executed 
>>> async on a different thread. If you look in Commands.java, you'll 
>>> see how these commands use the AsyncExecution class to handle 
>>> execution of the command. I'm not sure why this is done, because all 
>>> these commands look like ones that are inherently synchronous, yet 
>>> they are always run on a different thread, and as a result may not 
>>> be complete when execution returns to TTY::executeCommand(). These 
>>> commands also all set showPrompt = false, but the prompt is printed 
>>> by AsyncExecution when complete. So, the point is the commands which 
>>> are inherently async, like "step" and "cont", are not async w.r.t. 
>>> these "asynchronous command" comments or the AsyncExecution class.
>>>
>>> Chris
>>>
>>>
>>> On 7/27/18 4:27 PM, Chris Plummer wrote:
>>>> Hi Gary,
>>>>
>>>> Sorry about the delay in the review. Been trying to fully 
>>>> understand all the various code flows, which is tricky due to 
>>>> having the separate event thread whose output needs to be 
>>>> co-ordinated with command thread. Also made more complicated by the 
>>>> multiple entrypoints into executeCommand(), and there being various 
>>>> places where output produced and we care about the order. I think I 
>>>> have it all mapped out now and a pretty good understanding of how 
>>>> it all works.
>>>>
>>>> My first impression is that it seems like a large number of changes 
>>>> and a lot of passing around of a StringBuffer object to fix a 
>>>> problem which I think could be resolved with some strategic 
>>>> wait/notify code. For example, it might be as simple as doing 
>>>> something like this:
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/
>>>>
>>>> This only covers the step command. You'd have to figure out which 
>>>> other commands need to ensure an event has been processed before 
>>>> printing the prompt, and set waitForEventCompletion = true for 
>>>> them. Is there a reason not to go with something like this?
>>>>
>>>> As for you current changes, just a couple of things I've noticed, 
>>>> but I have not given it a thorough review (for example, making sure 
>>>> that the StringBuffer is always eventually printed, and done so in 
>>>> the right order).
>>>>
>>>> In EventHandler.run(), it's unclear who is handling the printing of 
>>>> sb when handleEvent() or handleDisconnectedException() are called.
>>>>
>>>> In the following EventHandler code:
>>>>
>>>>  123     @Override
>>>>  124     public void stepEvent(StepEvent se, StringBuilder sb)  {
>>>>  125         Thread.yield();  // fetch output
>>>>  126         sb.append(MessageOutput.format("Step completed:"));
>>>>  127     }
>>>>
>>>> Do you understand why the Thread.yield() is there? What does the 
>>>> comment mean by "fetch output"? Just seems like it has something to 
>>>> do with the ordering of output, but it seems bad that the code ever 
>>>> relied on a yield to make sure the output was ordered properly.
>>>>
>>>> Also in the above code snippet, you need to be careful when 
>>>> replacing MessageOutput.lnprint() with MessageOutput.format(). 
>>>> lnprint() adds a newline to the start of the line. That appears to 
>>>> be missing in your code above.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/20/18 12:11 PM, Gary Adams wrote:
>>>>> Here's another attempt to clear up the overlapping output from
>>>>> the command processing and event handler in the jdb tests.
>>>>>
>>>>> The fundamental problem is observed when "prompts"
>>>>> are produced interleaved with command and event output.
>>>>>
>>>>> This attempts to fix the issue by buffering the output and
>>>>> printing it fully assembled.
>>>>>
>>>>>  Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.01/
>>>>>
>>>>> On 5/26/18, 6:50 AM, gary.adams at oracle.com wrote:
>>>>>> This is a review request for a previously closed test bug.
>>>>>> The test was recently moved to the open repos, and the
>>>>>> proposed fix is in the open code.
>>>>>>
>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
>>>>>>
>>>>>>
>>>>>> -------- Forwarded Message --------
>>>>>> Subject: 	RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: 
>>>>>> Cannot find boolVar with expected value: false
>>>>>> Date: 	Fri, 25 May 2018 11:35:10 -0400
>>>>>> From: 	Gary Adams <gary.adams at oracle.com>
>>>>>> Reply-To: 	gary.adams at oracle.com
>>>>>>
>>>>>> 	
>>>>>>
>>>>>>
>>>>>>
>>>>>> The jdb tests use stdin to send commands to a jdb process
>>>>>> and parses the stdout to determine if a command was
>>>>>> successful and when the process is prompting for new commands
>>>>>> to be sent.
>>>>>>
>>>>>> Some commands are synchronous, so when the command is completed
>>>>>> a new prompt is sent back immediately.
>>>>>>
>>>>>> Some commands are asynchronous, so there could be a delay
>>>>>> until a breakpoint is reached. The event handler then sends a prompt
>>>>>> when the application thread is stopped and new jdb commands can be sent.
>>>>>>
>>>>>> The problem causing the intermittent failures was a corruption in the
>>>>>> output stream when prompts were being sent at the wrong times.
>>>>>>
>>>>>> Instead of receiving
>>>>>>    "Breakpoint hit:"<location>
>>>>>>     <prompt>
>>>>>>
>>>>>> the log contained
>>>>>>    "Breakpoint hit:"<prompt>  <location>
>>>>>>
>>>>>> Once out of sync, jdb commands were being sent prematurely
>>>>>> and the wrong values were being compared against expected behavior.
>>>>>> The simple fix proposed here recognizes that commands like "cont",
>>>>>> "step" and "next" are asynchronous commands and should not send back
>>>>>> a prompt immediately. Instead. the event handler will deliver the next prompt
>>>>>> when the next "Breakpoint hit:" or "Step completed:" state change occurs.
>>>>>>
>>>>>> The bulk of the testing was done on windows-x64-debug builds where the
>>>>>> intermittent failures were observed in ~5 in 1000 testruns. The fix has
>>>>>> also been tested on linux-x64-debug, solaris-sparcv9-debug,
>>>>>> and macosx-x64-debug, even though the failures have never been reported
>>>>>> against those platforms.
>>>>>>
>>>>>> Failures have been observed in many of the nsk/jdb tests with similar corrupted
>>>>>> output streams, but never directly associated with this issue before.
>>>>>>
>>>>>>   redefine001, caught_exception002, locals002, eval001, next001,
>>>>>>   stop_at003, step002, print002, trace001, step_up001, read001,
>>>>>>   clear004, kill001, set001
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180806/a9cb0250/attachment-0001.html>


More information about the serviceability-dev mailing list