RFR: JDK-8210106: sun/tools/jps/TestJps.java timed out

David Holmes david.holmes at oracle.com
Mon Dec 10 07:55:16 UTC 2018


Hi Dan,

> Gary saw a longest time value of 280 seconds in his testing with
> fastdebug bits. The default timeout value is 120 seconds with a
> default timeout factor of 4 in Mach5 gives a total timeout of
> 480 seconds (8 minutes). With Gary's new default timeout of
> 360 seconds, we'll have a total timeout value of 1440 seconds
> (24 minutes). 

So based on that it seems Gary's observations are not relevant to what 
has been seen in mach5 because at 280 seconds we should never have been 
timing out!

I guess these changes won't hurt.

Thanks,
David

On 8/12/2018 1:12 am, Daniel D. Daugherty wrote:
> On 12/7/18 7:58 AM, Gary Adams wrote:
>> On 12/6/18, 7:52 PM, David Holmes wrote:
>>> Hi Gary,
>>>
>>> On 6/12/2018 11:27 pm, Gary Adams wrote:
>>>> On a local linux-x64-debug build this test consistently runs in less 
>>>> than 40 seconds.
>>>> On the mach5 testing machines there was a large fluctuation in the 
>>>> time to complete this test.
>>>>
>>>> Since the test runs jps with different combinations of arguments, 
>>>> the total
>>>> test time is dependent on the number of processes running java 
>>>> programs.
>>>> Since the mach5 test infrastructure runs java programs I have seen a 3X
>>>> in the amount of output the test produces compared to local test
>>>> runs.
>>>>
>>>> I've run the test several hundred times through mach5 on the slower 
>>>> platforms
>>>> and then examined the test logs using a 3X setting from the default 
>>>> 120 second
>>>> jtreg timeout. The slowest reported elapse time from the test logs 
>>>> showed
>>>> 280 seconds to complete.
>>>>
>>>> To improve the reliability of the test to complete, I'd like to 
>>>> increase the
>>>> timeout to 360 seconds.
>>>>
>>>>    Issue: https://bugs.openjdk.java.net/browse/JDK-8210106
>>>>
>>>> Proposed fix:
>>>>
>>>> diff --git a/test/jdk/sun/tools/jps/TestJps.java 
>>>> b/test/jdk/sun/tools/jps/TestJps.java
>>>> --- a/test/jdk/sun/tools/jps/TestJps.java
>>>> +++ b/test/jdk/sun/tools/jps/TestJps.java
>>>> @@ -27,7 +27,7 @@
>>>>    * @modules jdk.jartool/sun.tools.jar
>>>>    * @build LingeredAppForJps
>>>>    * @build LingeredApp
>>>> - * @run main/othervm TestJps
>>>> + * @run main/othervm/timeout=360 TestJps
>>>>    */
>>>
>>> Doesn't that then get scaled by the timeout factor resulting in a 
>>> much longer timeout than the 360 seconds you intended?
>>>
>>> For other timeout adjustments the needed time has been divided by the 
>>> timeout factor to get the actual intended timeout.
>>
>> This bug was filed fairly recently in Aug 2018.
>> At that time the timeout and timeout factor were not sufficient
>> to avoid the test failing.
>>
>> The mach5 timeout factors were adjusted recently, so this test may
>> no longer be an issue.
>>
>> If that is true, then we could simply close this bug as "cannot 
>> reproduce".
>> An argument could be made that the change in timeout factor may be
>> responsible for fixing a lot more of the intermittent bugs and that they
>> should be closed in a similar manner.
>>
>> Historically, we could say this particular bug should have had timeouts
>> reassessed when the infrastructure switched from jprt to mach5 testing
>> where there were more visible Java processes running.
>>
>> Using a higher explicit timeout will not make the test run any longer
>> than it needs. It will simply allow the test to not be terminated sooner
>> in a hung test scenario.
>>
>> What is your preference for this particular issue:
>>    - increase the explicit timeout
>>    - close as cannot reproduce attributed to the timeout factor 
>> adjustments
>>
>> What would you recommend going forward for other similar issues:
>>    - determine a new explicit timeout
>>    - close if no timeout failures have been observed since the timeout 
>> factor
>>       was raised
> 
> General guidance that I've been giving folks is that you bump the
> default timeout when you see timeouts with 'release' bits with
> a timeout factor of '1'. The default timeout factor in Mach5 is 4
> so we run both 'release' and 'fastdebug' bits with that timeout
> factor. That means our total timeout value for 'release' bits is
> probably a little long and our total timeout value for 'fastdebug'
> bits is probably just right for a concurrency factor of 12.
> 
> On my personal servers I use the following timeout factors:
> 
>          release)
>              TIMEOUT_FACTOR="4"
>              ;;
>          fastdebug)
>              TIMEOUT_FACTOR="6"
>              ;;
>          slowdebug)
>              TIMEOUT_FACTOR="12"
>              ;;
> 
> However, I run with a concurrency factor of 16. I rarely see
> timeout failures.
> 
> All that is well and good, but a 'jps' test is a different
> beast since it is affected by factors outside of most tests.
> Since the number of java processes running on the system can
> affect the test, I'm okay with using a default timeout of '360'
> for this test as a guard against increases in load or...
> 
> Gary saw a longest time value of 280 seconds in his testing with
> fastdebug bits. The default timeout value is 120 seconds with a
> default timeout factor of 4 in Mach5 gives a total timeout of
> 480 seconds (8 minutes). With Gary's new default timeout of
> 360 seconds, we'll have a total timeout value of 1440 seconds
> (24 minutes).
> 
> Before the recent changes to Mach5, the default timeout factor
> was 10 so we had a total timeout value of 1200 seconds (20
> minutes). Gary's change only bumps the total timeout value
> by 4 minutes from what we had back in August... when this bug
> was filed... unfortunately, the log for that sighting is gone
> so all we know is that the test timed out after 20 minutes
> 
>  > Timeout refired 1200 times
> 
> We know that JTREG timeout handling kicks in at timeout expiration,
> but without the log we don't know how much longer than 20 minutes
> the test ran before JTREG killed it. 24 minutes might do it, but...
> 
> Dan
> 
> 
>>
>>
>>>
>>> Cheers,
>>> David
>>>
>>
>>
> 


More information about the serviceability-dev mailing list