jmx-dev [PATCH] JDK-8005472: com/sun/jmx/remote/NotificationMarshalVersions/TestSerializationMismatch.sh failed on windows

Jaroslav Bachorik jaroslav.bachorik at oracle.com
Wed Feb 6 00:01:40 PST 2013


On 02/05/2013 11:47 PM, Stuart Marks wrote:
> Hi, thanks for the updates.
> 
> Capturing the Client output in a file seems preferable to putting a
> shell pipeline in backquotes. But, the old code piped both stdout and
> stderr into grep using 2>&1, whereas the new code redirects only stdout,
> which is then searched by grep. Was that intentional? I'm not sure
> whether the error message in question occurs in stdout or stderr.

Argh. I missed the redirect :/ Thanks!
I will update the webrev shortly.
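[Editor's note: the difference Stuart points out can be sketched like this; `client_cmd` is a hypothetical stand-in for the real Client invocation, and the file names are illustrative.]

```shell
# Stand-in for the real Client java command; it writes to both streams.
client_cmd() {
    echo "normal output"
    echo "Exception in thread main" >&2
}

# stdout only: the stderr message bypasses the capture file.
client_cmd > stdout_only.out 2>/dev/null

# Both streams, matching the old "2>&1 | grep" pipeline behaviour:
client_cmd > both_streams.out 2>&1

grep -q "Exception" stdout_only.out  || echo "missed in stdout-only capture"
grep -q "Exception" both_streams.out && echo "caught with 2>&1"
```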

> 
> Otherwise, it looks like you've taken care of the common failure modes.
> It just goes to show how deceptively simple shell programming can be,
> when in fact catching all the potential errors can be quite tedious.
> 
> The following discussion is mainly for your information, to consider the
> next time you decide to write a shell test. :-) I'm not suggesting any
> action for this changeset.
> 
> You had mentioned earlier (see thread below) that you needed to compile
> incompatible versions of the same class. It's indeed difficult to coerce
> jtreg to do that, so this is the rationale for writing a shell test.
> 
> It's possible and somewhat hacky to use jtreg for this, though; see
> 
> http://hg.openjdk.java.net/lambda/lambda/jdk/file/tip/test/java/io/Serializable/defaultSVID/
> 
> 
> Basically it uses the @compile tag to compile the first version of the
> file, then @run with an arg that tells the test to move the generated
> classfile somewhere, another @compile tag to compile the second version
> of the file, and finally an @run tag to run the test for real.
> 
> I don't suggest that you copy this technique. :-)
> 
> Jon Gibbons suggested invoking the compiler API directly from java
> instead of writing a shell script. Doing this seems fairly simple, and I
> think it would be advantageous to keep things entirely in Java. I may
> attempt to rewrite the defaultSVID test using the compiler API.

This seems rather feasible. I had just assumed that writing the shell
script would have been easier. But apparently it wasn't ...

Anyway, thanks for going through this extensive review.

-JB-

> 
> s'marks
> 
> 
> On 2/5/13 6:40 AM, Jaroslav Bachorik wrote:
>> Updates in http://cr.openjdk.java.net/~jbachorik/8005472/webrev.05
>>
>> Comments inline
>>
>> On 01/10/2013 10:44 PM, Stuart Marks wrote:
>>> On 1/10/13 7:20 AM, Jaroslav Bachorik wrote:
>>>> Update: http://cr.openjdk.java.net/~jbachorik/8005472/webrev.04
>>>
>>> Thanks for the update.
>>>
>>> Note, argv[0] is used before argv.length is checked, so if no args are
>>> passed this gives index out of bounds instead of the usage message.
>>
>> It's gone. Shouldn't have been left there anyway.
>>
>>>
>>> I see you take pains to write and flush the URL to stdout before writing
>>> the signaling file. Good. The obvious alternative (which I started
>>> writing but then erased) is just to put the URL into the signaling file.
>>> But this has a race between creation of the file and the writing of its
>>> contents. So, what you have works. (This kind of rendezvous problem
>>> occurs a lot; it seems like there ought to be a simpler way.)
>>>
>>> I suspect the -e option caused hangs because if something failed, it
>>> would leave the server running, spoiling the next test run. The usual
>>> way to deal with this is to use the shell 'trap' statement, to kill
>>> subprocesses and remove temp files before exiting the shell. Probably a
>>> good practice in general, but perhaps too much shell hackery for this
>>> change. (Up to you if you want to tackle it.)
>>
>> I would rather not ...
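[Editor's note: the 'trap' cleanup Stuart suggests could look roughly like the sketch below; 'sleep 60' stands in for the server JVM and the temp file for whatever the real test creates.]

```shell
# Illustrative cleanup via trap; nothing here is from the actual test.
TMPFILE=$(mktemp)
sleep 60 &            # stand-in for the background server JVM
SERVER_PID=$!

cleanup() {
    kill "$SERVER_PID" 2>/dev/null
    rm -f "$TMPFILE"
}
# Run cleanup on normal exit and on common termination signals, so a
# failure cannot leave the server running and spoil the next test run.
trap cleanup EXIT HUP INT TERM
```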
>>
>>>
>>> Regarding how the test is detecting success/failure, the concern is that
>>> if the client fails for some reason other than the failure being checked
>>> for, the test will still report passing. Since the error message is
>>> coming out of the client JVM, in principle it ought to be possible to
>>> redirect it somehow in order to do the assertion checking in Java. With
>>> the current shell scheme, not only are other failures reported as the
>>> test passing, these other failures are erased in the grep pipeline, so
>>> they're not even visible in the test log.
>>
>> I've changed the logic slightly to check for the java process exit
>> status as well as for the presence of the trigger text in the process
>> output. This should catch all the regular failures.
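[Editor's note: the combined check described above might look roughly like this; the echo stands in for the real java client command, and client.out and the trigger text are placeholders.]

```shell
# Stand-in for the real client invocation, capturing stdout to a file.
echo "WARNING: notification version mismatch" > client.out
CLIENT_STATUS=$?      # in the real test, the java process exit status

if [ $CLIENT_STATUS -ne 0 ]; then
    echo "TEST FAILED: client exited with status $CLIENT_STATUS"
    exit 1
fi
if ! grep -q "version mismatch" client.out; then
    echo "TEST FAILED: expected trigger text not found in client output"
    exit 1
fi
echo "TEST PASSED"
```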
>>
>> Unfortunately, the way the notification handling is implemented now, it
>> is not really possible to check for the error from within the
>> application - the error state is captured by the *NotificationForwarder
>> class and only reported to the logger.
>>
>> -JB-
>>
>>>
>>> This last issue is rather far afield from this webrev, and fixing it
>>> will probably require some rearchitecting of the test. So maybe it
>>> should be considered independently. I just happened to notice this going
>>> on, and I noticed the similarity to what's going on in the RMI tests.
>>>
>>> s'marks
>>>
>>>
>>>
>>>> On 01/10/2013 09:52 AM, Stuart Marks wrote:
>>>>> On 1/7/13 3:23 AM, Jaroslav Bachorik wrote:
>>>>>> On 01/04/2013 11:37 PM, Kelly O'Hair wrote:
>>>>>>> I suspect it is not hanging because it does not exist, but that some
>>>>>>> other windows process has its hands on it.
>>>>>>> This is the stdout file from the server being started up right?
>>>>>>> Could the server from a previous test run be still running?
>>>>>>
>>>>>> Exactly. Amy confirmed this and provided a patch which resolves the
>>>>>> hanging problem.
>>>>>>
>>>>>> The update patch is at
>>>>>> http://cr.openjdk.java.net/~jbachorik/8005472/webrev.01
>>>>>
>>>>> Hi Jaroslav,
>>>>>
>>>>> The change to remove the parentheses from around the server program
>>>>> looks right. It avoids forking an extra process (at least in some
>>>>> shells) and lets $! refer to the actual JVM, not an intermediate shell
>>>>> process. The rm -f from Kelly's suggestion is good too.
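[Editor's note: the subshell issue can be sketched as follows; 'sleep 60' stands in for the server JVM started by the test.]

```shell
# Parenthesized form: in some shells $! is the PID of an intermediate
# subshell, so killing it does not necessarily stop the JVM inside it.
( sleep 60 ) &
SUB_PID=$!

# Without the parentheses, $! is the PID of the command itself:
sleep 60 &
SERVER_PID=$!

kill "$SUB_PID" "$SERVER_PID" 2>/dev/null
```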
>>>>>
>>>>> But there are other things wrong with the script. I don't think they
>>>>> could cause hanging, but they could cause the script to fail in
>>>>> unforeseen ways, or even to report success incorrectly.
>>>>>
>>>>> One problem is introduced by the change, where the Server's stderr is
>>>>> also redirected into $URL_PATH along with stdout. This means that
>>>>> if the
>>>>> Server program reports any errors, they'll get mixed into the URL_PATH
>>>>> file instead of appearing in the test log. The URL_PATH file's
>>>>> contents are never reported, so these error messages will be
>>>>> invisible.
>>>>
>>>> Fixed, only the stdout is redirected to $URL_PATH.
>>>>
>>>>>
>>>>> The exit status of some of the critical commands (such as the
>>>>> compilations) isn't checked, so if javac fails for some reason, the
>>>>> test
>>>>> might not report failure. Instead, some weird error might or might not
>>>>> be reported later (though one will still see the javac errors in the
>>>>> log).
>>>>
>>>> Fixed, introduced the check. The "set -e" was hanging the script, so I
>>>> had to check the exit status manually.
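[Editor's note: a manual status check in place of "set -e" might look like this; 'true' is a stand-in for the real javac invocation, and the file name is illustrative.]

```shell
# Explicit per-command status check instead of 'set -e'.
true    # stand-in for e.g.: javac -d "${TESTCLASSES}" Server.java
if [ $? -ne 0 ]; then
    echo "TEST FAILED: compilation failed"
    exit 1
fi
echo "compilation OK"
```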
>>>>
>>>>>
>>>>> I don't think the sleep at line 80 is necessary, since the client runs
>>>>> synchronously and should have exited by this point.
>>>>
>>>> And it's gone.
>>>>
>>>>>
>>>>> The wait loop checking for the existence of the URL_PATH file doesn't
>>>>> actually guarantee that the server is running or has initialized yet.
>>>>> The file is actually created by the shell before the Server JVM starts
>>>>> up. Thus, runClient might try to read from it before the server has
>>>>> written anything to it. Or, as mentioned above, the server might have
>>>>> written some error messages into the URL_PATH file instead of the
>>>>> expected contents. Thus, the contents of the JMXURL variable can quite
>>>>> possibly be incorrect.
>>>>
>>>> Stderr is not redirected to the file. A separate file is used to
>>>> signal the availability of the server, and that file is created from
>>>> the Java code after the server has been started. Also, the err and out
>>>> streams are flushed to make sure the JMX URL makes it into the file.
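[Editor's note: the rendezvous described here can be sketched in shell as below; the background subshell stands in for the server JVM, and the file names and URL are illustrative.]

```shell
URL_PATH=server.out
SIGNAL_FILE=server.started

# Stand-in for the server JVM: it writes the JMX URL to stdout (captured
# in $URL_PATH) and creates the signal file only once it is up.
( echo "service:jmx:rmi:///jndi/rmi://localhost:9999/srv" > "$URL_PATH"
  touch "$SIGNAL_FILE" ) &

# Wait for the signal file, not the output file, so the URL is complete
# by the time it is read.
while [ ! -f "$SIGNAL_FILE" ]; do
    sleep 1
done
JMXURL=$(cat "$URL_PATH")
echo "JMXURL=$JMXURL"
```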
>>>>
>>>>>
>>>>> If this occurs, what will happen when the client runs? It may emit
>>>>> some
>>>>> error message, and this will be filtered out by the grep pipeline.
>>>>> Thus,
>>>>> HAS_ERRORS might end up empty, and the test will report passing, even
>>>>> though everything has failed!
>>>>
>>>> Shouldn't happen with only the controlled stdout redirected to the
>>>> file.
>>>>
>>>>>
>>>>> For this changeset I'd recommend at a minimum removing the redirection
>>>>> of stderr to URL_PATH. If the server fails we'll at least see
>>>>> errors in
>>>>> the test log.
>>>>>
>>>>> For checking the notification message, is there a way to modify the
>>>>> client to report an exit status or throw an exception? Throwing an
>>>>> exception from main() will exit the JVM with a nonzero status, so this
>>>>> can be checked more easily from the script. I think this is less
>>>>> error-prone than grepping the output for a specific error message. The
>>>>> test should fail if there is *any* error; it should not succeed if an
>>>>> expected error is absent.
>>>>
>>>> This is unfortunately not possible. The notification processing needs
>>>> to be robust enough to prevent the JVM from exiting in cases like
>>>> this. Therefore it only reports the problem, dumps the notification
>>>> and carries on. The only place one can see that something went wrong
>>>> is the err stream.
>>>>
>>>>>
>>>>> You might consider having jtreg build the client and server classes.
>>>>> This might simplify some of the setup. Also, jtreg is meticulous about
>>>>> aborting the test if any compilations fail, so it takes care of
>>>>> that for
>>>>> you.
>>>>
>>>> I need same-named classes with incompatible code compiled to two
>>>> different locations - client and server. I was not able to figure out
>>>> how to use jtreg to accomplish that.
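[Editor's note: one way to lay this out in plain shell; the class name, source trees and output directories below are illustrative, not from the actual test.]

```shell
# The same class name in two source trees, each compiled into its own
# class directory so client and server see incompatible versions.
mkdir -p src/client src/server classes/client classes/server

cat > src/client/Event.java <<'EOF'
public class Event implements java.io.Serializable { private int value; }
EOF
cat > src/server/Event.java <<'EOF'
public class Event implements java.io.Serializable { private long value; }
EOF

# Each compilation sees only its own version of the class:
#   javac -d classes/client src/client/Event.java
#   javac -d classes/server src/server/Event.java
```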
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> It would be nice if there were a better way to have the client
>>>>> rendezvous with the server. I hate to suggest it, but sleeping
>>>>> unconditionally after starting the server is probably necessary.
>>>>> Anything more robust probably requires rearchitecting the test,
>>>>> though.
>>>>>
>>>>> Sorry to dump all this on you. But one of the shell-based RMI tests
>>>>> suffers from *exactly* the same pathologies. (I have yet to fix it.)
>>>>> Unfortunately, I believe that there are a lot of other shell-based
>>>>> tests
>>>>> in the test suite that have similar problems. The lesson here is that
>>>>> writing reliable shell tests is a lot harder than it seems.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> s'marks
>>>>
>>


