RFR: JDK-8210106: sun/tools/jps/TestJps.java timed out

Fri Dec 7 12:58:53 UTC 2018

On 12/6/18, 7:52 PM, David Holmes wrote:
> Hi Gary,
>
> On 6/12/2018 11:27 pm, Gary Adams wrote:
>> On a local linux-x64-debug build this test consistently runs in less than
>> 40 seconds.
>> On the mach5 testing machines there was a large fluctuation in the time to
>> complete this test.
>>
>> Since the test runs jps with different combinations of arguments, the total
>> test time is dependent on the number of processes running Java programs.
>> Since the mach5 test infrastructure itself runs Java programs, I have seen
>> a 3X increase in the amount of output the test produces compared to local
>> test runs.
>>
>> I've run the test several hundred times through mach5 on the slower
>> platforms and then examined the test logs, using a 3X setting of the
>> default 120 second jtreg timeout. The slowest elapsed time reported in
>> the test logs was 280 seconds.
>>
>> To make the test complete more reliably, I'd like to increase the
>> timeout to 360 seconds.
>>
>>    Issue: https://bugs.openjdk.java.net/browse/JDK-8210106
>>
>> Proposed fix:
>>
>> diff --git a/test/jdk/sun/tools/jps/TestJps.java b/test/jdk/sun/tools/jps/TestJps.java
>> --- a/test/jdk/sun/tools/jps/TestJps.java
>> +++ b/test/jdk/sun/tools/jps/TestJps.java
>> @@ -27,7 +27,7 @@
>>    * @modules jdk.jartool/sun.tools.jar
>>    * @build LingeredAppForJps
>>    * @build LingeredApp
>> - * @run main/othervm TestJps
>> + * @run main/othervm/timeout=360 TestJps
>>    */
>
> Doesn't that then get scaled by the timeout factor resulting in a much 
> longer timeout than the 360 seconds you intended?
>
> For other timeout adjustments the needed time has been divided by the 
> timeout factor to get the actual intended timeout.
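
To make the scaling concrete, here is a rough sketch of the arithmetic,
assuming the harness simply multiplies the declared timeout by the timeout
factor. The class and method names are made up for illustration, and the
factor of 4 is a hypothetical value, not necessarily the current mach5 setting.

```java
// Sketch only: models how a declared test timeout is scaled by a timeout factor.
// TimeoutMath and its methods are illustrative names, not part of jtreg.
public class TimeoutMath {
    // Default jtreg timeout mentioned in this thread.
    static final int DEFAULT_TIMEOUT_SECONDS = 120;

    // Effective limit = declared timeout * timeout factor (assumed model).
    static long effectiveSeconds(int declaredSeconds, double timeoutFactor) {
        return Math.round(declaredSeconds * timeoutFactor);
    }

    // Declared value to use so that the scaled limit covers the intended time.
    static int declaredFor(int intendedSeconds, double timeoutFactor) {
        return (int) Math.ceil(intendedSeconds / timeoutFactor);
    }

    public static void main(String[] args) {
        double factor = 4.0; // hypothetical timeout factor

        // timeout=360 as proposed, after scaling
        System.out.println(effectiveSeconds(360, factor)); // prints 1440

        // the unmodified default, after scaling
        System.out.println(effectiveSeconds(DEFAULT_TIMEOUT_SECONDS, factor)); // prints 480

        // declared value that would yield an effective 360 seconds
        System.out.println(declaredFor(360, factor)); // prints 90
    }
}
```

Under that assumed factor, the scaled default already exceeds the 280 second
worst case from the logs, while timeout=360 would allow a hung test to run for
24 minutes before being terminated.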

This bug was filed fairly recently in Aug 2018.
At that time the timeout and timeout factor were not sufficient
to avoid the test failing.

The mach5 timeout factors were adjusted recently, so this test may
no longer be an issue.

If that is true, then we could simply close this bug as "cannot reproduce".
An argument could be made that the change in timeout factor may be
responsible for fixing a lot more of the intermittent bugs and that they
should be closed in a similar manner.

Historically, we could say this particular bug should have had timeouts
reassessed when the infrastructure switched from jprt to mach5 testing
where there were more visible Java processes running.

Using a higher explicit timeout will not make the test run any longer
than it needs to. It only means that a hung test takes longer to be
terminated.

What is your preference for this particular issue:
    - increase the explicit timeout
    - close as "cannot reproduce", attributing the fix to the timeout factor
       adjustments

What would you recommend going forward for other similar issues:
    - determine a new explicit timeout
    - close if no timeout failures have been observed since the timeout factor
       was raised

>
> Cheers,
> David
>