Passing time factor to tests run under jtreg
David Holmes
david.holmes at oracle.com
Fri Nov 18 01:58:20 UTC 2011
Gary,
On 18/11/2011 6:28 AM, Gary Adams wrote:
> Here's my first concrete slow-machine timeout test ...
> jdk/test/java/util/concurrent/forkjoin/Integrate.java
>
> I had been looking at tests that had a declared "timeout=xxx",
> but today I just started running the java/util/concurrent
> tests at a variety of clock speeds using ejdk1.7.0 and
> found a test that passes when running at 600 MHz but
> times out at 300 MHz. The test passes at 300 MHz if I
> include "-timeout:2" on the jtreg command line.
I think I have been misunderstanding the point you've been trying to
make here.
I'm not sure there is a simple relationship here with the use of
internal delays/timeouts in a test. Delays (wait long enough until XXX
should have happened) would seem to need to be scaled under the same
considerations as used for -timeout. Internal timeouts (give up after
XXX time units because something seems to have gone wrong), on the
other hand, are typically much coarser/larger and so already
accommodate a range of -timeout values implicitly.
The scaling factor needs to come from the environment launching the
test, but the tests need to be modified to use it.
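To make that concrete, a modified test might do something like the
following. This is only a sketch - the property name is illustrative,
standing in for whatever mechanism the harness would actually use to
publish its -timeout factor:

    // Illustrative only: assumes the harness exports its -timeout
    // factor via a system property (name hypothetical).
    static final double TIMEOUT_FACTOR = Double.parseDouble(
        System.getProperty("test.timeout.factor", "1.0"));

    static long scaled(long millis) {
        return (long) (millis * TIMEOUT_FACTOR);
    }

Delays in the test body would then call scaled() rather than using
hard-coded values.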
> At 600 MHz the test runs for 84 seconds (under the default
> 120-second timeout). At 300 MHz the test runs for 168
> seconds.
>
> Since this test does not do an internal wait or delay operation,
> passing in a timeout factor would not help in this case.
>
> In general it seems that tests that declare a timeout of less than
> 120 seconds are indicating that early termination of the test is
> acceptable.
I agree with Alan that it doesn't make sense to specify timeouts less
than the default.
> Tests declaring a timeout longer than 120 seconds recognize that
> additional processing time may be required.
Most likely the test failed somewhere at some point and bumping the
timeout fixed it. Wash, rinse, repeat.
Cheers,
David
> I'll try a longer overnight run at 300 MHz to see if I can catch
> some other tests that are close to the 120-second threshold.
>
> ...
>
>
> On 11/15/11 08:33 PM, David Holmes wrote:
>> Hi Gary,
>>
>> On 16/11/2011 6:14 AM, Gary Adams wrote:
>>> I've been scanning a number of the slow machine test
>>> bugs that are reported and wanted to check to see if
>>> anyone has looked into time dependencies in the regression
>>> tests previously. From what I've been able to learn so far,
>>> individual tests can use the "timeout" parameter to indicate to
>>> the test harness an expected time to run.
>>>
>>> The test harness has command line arguments where it can filter
>>> out tests that take too long (timelimit) or can apply a multiplier
>>> to the timeout when conditions are known to slow down the process
>>> (timeoutFactor), e.g. 8X for a slow machine or when running with
>>> -Xcomp.
>>>
>>> I see that there are some wrappers that can be applied around
>>> running a particular test to allow processing before main(). Could
>>> this mechanism be exploited so the harness command line options
>>> could be made known to the time-dependent tests as command line
>>> arguments or as system properties?
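>>>
>>> (Something of this shape - entirely hypothetical, just to
>>> illustrate the idea; the property name is made up:)
>>>
>>>     import java.util.Arrays;
>>>
>>>     public class TimeoutFactorWrapper {
>>>         // Hypothetical wrapper run in place of the test's main():
>>>         // args[0] = timeout factor from the harness,
>>>         // args[1] = real test class, remaining args passed through.
>>>         public static void main(String[] args) throws Exception {
>>>             System.setProperty("test.timeout.factor", args[0]);
>>>             String[] rest = Arrays.copyOfRange(args, 2, args.length);
>>>             Class.forName(args[1])
>>>                  .getMethod("main", String[].class)
>>>                  .invoke(null, (Object) rest);
>>>         }
>>>     }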
>>>
>>> My thought is that the current timeout granularity is too large
>>> and only applies to the full test execution. If a test knew that a
>>> timeoutFactor was to be applied, it could internally adjust its
>>> time-dependent delays appropriately; e.g. not every sleep(),
>>> await(), or join() with a timeout would need the timeoutFactor
>>> applied.
>>
>> I don't quite get what you mean about the timeouts applied to
>> sleeps, awaits etc. Depending on the test, some of these are delays
>> (e.g. sleep is usually used this way) because it may not be
>> feasible (or even possible) to coordinate the threads directly;
>> while others (await, wait etc.) are actual timeouts - if they
>> expire it is an error because something appears to have gone wrong
>> somewhere (of course this conclusion can be wrong if the timeout
>> was too short for a given situation).
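>>
>> For example (sketch only; scaled() stands in for however a scale
>> factor would be applied, with java.util.concurrent imports assumed):
>>
>>     Thread.sleep(scaled(500));          // a delay: wants scaling
>>
>>     CountDownLatch done = new CountDownLatch(1);
>>     // ... start a worker that calls done.countDown() when finished ...
>>     if (!done.await(60, TimeUnit.SECONDS))     // an internal timeout:
>>         throw new AssertionError("timed out"); // generous, so expiry
>>                                                // means something broke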
>>
>> As many of these tests have evolved along with the testing
>> infrastructure, it isn't always very clear who has responsibility
>> for programming defensive timeouts. And many tests are designed to
>> be run stand-alone or under a test harness, where exceptions due to
>> timeouts are preferable to hangs.
>>
>> Further, while we can add a scale factor for known retarding
>> factors - like -Xcomp - there's no general way to assess the target
>> machine's capability (# cores, speed) and load, as it may impact a
>> given test.
>> And historically there has been a lack of resources to investigate and
>> solve these issues.
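>>
>> (The standard APIs only give a rough picture, e.g.:)
>>
>>     int cores = Runtime.getRuntime().availableProcessors();
>>     double load = java.lang.management.ManagementFactory
>>         .getOperatingSystemMXBean().getSystemLoadAverage();
>>     // load is negative where the platform provides no load average
>>
>> and neither clock speed nor the load imposed by the harness itself
>> is reliably visible to the test.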
>>
>> Cheers,
>> David
>>
>>> Before any test could be updated, the information would need to be
>>> available from the test context.
>>>
>>> Any feedback/pointers appreciated!
>>>
>>>
>>> See:
>>>   timeoutFactorArg in
>>>     jtreg/src/share/classes/com/sun/javatest/regtest/Main.java
>>>   runOtherJVM() in
>>>     jtreg/src/share/classes/com/sun/javatest/regtest/MainAction.java
>>>   maxTimeoutValue in
>>>     jtreg/src/share/classes/com/sun/javatest/regtest/RegressionParameters.java