RFR for JDK-7190106 RMI benchmark fails intermittently because of use of fixed port
Stuart Marks
stuart.marks at oracle.com
Wed Nov 6 01:50:17 UTC 2013
On 11/1/13 9:18 AM, Tristan Yan wrote:
> Hi Everyone
> http://cr.openjdk.java.net/~pzhang/Tristan/7190106/webrev/
>
> Description:
> 1. Convert the shell script test to a Java program test.
> 2. Use a random server port (reusing Darryl Mocek's work) to replace the fixed
> server port.
> 3. Use the Java Process class to start the client process.
> 4. Also convert the other shell script test, runSerialBench.sh, to a Java program test.
Hi Tristan,
Several comments on this webrev.
** The original arrangement within the test/java/rmi/reliability/benchmark
directory had the main benchmark files (scripts) at the top, some benchmark
framework files in the "bench" subdirectory, and the actual RMI and
serialization benchmarks in bench/rmi and bench/serial subdirectories.
The webrev moves all the RMI benchmarks to the top benchmark directory but
leaves the serial benchmarks in bench/serial. The RMI benchmarks are now all
cluttering the top directory, but the main serial benchmark test is now buried
in the bench/serial directory. The nice organization that was there before is
now spoiled. Is this rearrangement necessary in order to convert the scripts to
Java? I would prefer the original arrangement be left in place.
** The RMI benchmark Main.java file has a @run tag of the form,
@run main/othervm/policy=policy.all/timeout=1800 -server Main -c config
There is a subtle but serious problem here: the -server option is passed to the
>>JVM<< and not as an argument to the main() method. The main() method gets
neither a -server nor a -client argument, so its default "run mode" as defined
by the benchmark itself is SAMEVM. This runs the client and server in the same
JVM, which is different from the shell script, which ran separate client and
server JVMs.
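For reference: in a jtreg @run tag, options before the class name go to the
JVM, and anything after the class name is passed to main()'s args[]. To
illustrate (not taken from the webrev):

    @run main/othervm -server Main -c config      <- -server tunes the JVM
    @run main/othervm Main -server -c config      <- "-server" reaches main()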
** The logic to process the -server argument still expects to take a port, even
though the port is assigned automatically. So the obvious fix to the above,
@run main/othervm/policy=policy.all/timeout=1800 Main -server -c config
doesn't work, since a port is missing. The logic to increment the argument index
to collect the port argument should be removed. Also, the -server line should be
restored to the usage message, but without the port argument.
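Concretely, the -server case in Main's argument loop should stop consuming a
following value. A sketch, assuming an if/else argument loop (the actual
variable names in Main.java may differ):

    } else if (args[i].equals("-server")) {
        runmode = SERVER;
        // was: something like "port = Integer.parseInt(args[++i]);"
        // delete that; the port is now assigned automatically
    }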
** After this is done, the client's command line is constructed improperly. The
command line ends up looking something like this:
java client -cp <classpath> Main client localhost:58583 -c config
The word "client" appears twice, but what's really required is "-client" to
appear as an argument after Main.
** The client is run using ProcessBuilder, which by default sends stdout and
stderr to pipes to be read by the parent. But the parent never reads them, thus
any messages from the client are never seen. The client is the one that emits
the benchmark report, so its output needs to be seen. It might be sufficient to
have the client inherit the parent's stdout and stderr. This might intermix the
client's and server's output, but it's better than nothing.
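For instance, if the client is launched through a ProcessBuilder named pb (a
hypothetical name), inheriting the parent's streams is a one-liner:

    pb.inheritIO();       // client's stdout/stderr go straight to the parent's
    Process client = pb.start();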
** The config file is checked with the following code:
    try {
        confFile = args[i];
        confstr = new FileInputStream(System.getProperty("test.src")
                + System.getProperty("file.separator") + confFile);
    } catch (IOException e) {
        die("Error: unable to open \"" + args[i] + "\"");
    }
This is potentially misleading, as the message doesn't print the actual filename
that the code attempted to open.
This matters here because the test.src property doesn't exist in the client JVM:
System.getProperty("test.src") returns null there, so the path actually tried
begins with "null" while the message shows only the bare config name.
Note that the original shell script passed full pathnames for the config file to
both the client and the server. The new @run tag merely says "-c config" which
redefines the config filename to be relative to the test.src directory. You
could pass -Dtest.src=... to the client, but it seems like something better
could be done.
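One possibility, just as a sketch: compute the full pathname once, fall back
to "." when test.src is unset (as in the client JVM), and print the pathname
that was actually tried:

    confFile = args[i];
    String path = System.getProperty("test.src", ".") + File.separator + confFile;
    try {
        confstr = new FileInputStream(path);
    } catch (IOException e) {
        die("Error: unable to open \"" + path + "\"");
    }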
** The client needs to have its security policy set up. This is missing from the
construction of the client's command line.
** ProcessBuilder takes a List<String> for its command; there is no need to turn
the list into an array.
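Putting the last several points together, the client launch might look roughly
like this, where javaCmd, classpath, policyFile, serverHost, serverPort, and
confFile are placeholder names rather than the webrev's:

    List<String> cmd = new ArrayList<>();
    cmd.add(javaCmd);                                 // path to the java launcher
    cmd.add("-cp");
    cmd.add(classpath);
    cmd.add("-Djava.security.policy=" + policyFile);  // client needs a policy too
    cmd.add("Main");
    cmd.add("-client");                               // note the leading dash
    cmd.add(serverHost + ":" + serverPort);           // server endpoint, as in the original command line
    cmd.add("-c");
    cmd.add(confFile);                                // or a full pathname, as the old script passed
    Process client = new ProcessBuilder(cmd).inheritIO().start();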
** In the benchmark main methods, code of the form,
    while (true) {
        runBenchmarks();
        if (exitRequested) {
            System.exit(0);
        }
    }
was replaced with
    while (!exitRequested) {
        runBenchmarks();
    }
This is a subtle logic change, in that the former code always executed the loop
at least once. It seems unlikely, but it's possible that a timer could set
exitRequested before loop entry, resulting in the benchmark running zero times.
I guess, if you really want to clean this up (we do need to avoid System.exit in
jtreg tests), use a do-while loop instead.
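That is, something like:

    do {
        runBenchmarks();
    } while (!exitRequested);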
** Don't forget to remove the 7190106/runRmiBench.sh entry from ProblemList.txt.
** Remove the references to RMISecurityManager and just use SecurityManager.
This is just general cleanup. (I deprecated RMISecurityManager last week.) :-)
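For example, assuming the tests install the manager via
System.setSecurityManager, the change is just:

    System.setSecurityManager(new SecurityManager());   // was: new RMISecurityManager()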
It would be good if you could fix up these issues and post another webrev.
Thanks.
s'marks
> Thank you
> Tristan
>
> On 01/11/2013 23:58, Stuart Marks wrote:
>> On 10/31/13 10:22 PM, Tristan Yan wrote:
>>> I am working on bug https://bugs.openjdk.java.net/browse/JDK-7190106. Based
>>> on my research, it looks like the fixed-port issue was already addressed
>>> by Stuart Marks in other RMI tests that are Java-based. I would like to
>>> reuse his solution; however, it does not work for shell-based tests.
>>
>> (Darryl Mocek did the unique port work for the RMI tests.)
>>
>> Was the patch attached to your message? If so, it didn't get through. Most
>> OpenJDK mailing lists strip off attachments before forwarding the message to
>> the recipients.
>>
>>> 2. My recommendation would be to convert this shell script test into a
>>> Java-based test and reuse the dynamic port allocation solution by Stuart
>>> Marks to address the issue.
>>>
>>> 3. Also, this test was written in server/client mode as a shell script. In
>>> the past there have been sync issues between server and client which caused
>>> the test to fail. Converting the shell script into a Java-based test would
>>> avoid the "sleep 10" mechanism used to let the server and client start up,
>>> and would also give us better control over synchronizing server and client.
>>
>> (Background for interested readers.) In general, yes, it's quite difficult to
>> make reliable shell tests, especially for multi-process tests like this one.
>> There is the unique port issue, and there is also the question of how long
>> the client should wait until the server is ready. Error handling is also a
>> problem: for example, if one of the JVMs gets an unexpected exception, it's
>> easy for shell tests to mishandle this case. They might hang or erroneously
>> report success.
>>
>> --
>>
>> If this is a rewrite, it's probably fairly large, so you need to upload it
>> somewhere (e.g., cr.openjdk.java.net) and then post a link to it.
>>
>> Thanks.
>>
>> s'marks
>