Unhappy about test/ProblemList.txt

Tue Mar 30 16:26:38 UTC 2010

Kelly O'Hair wrote:
>
> On Mar 29, 2010, at 7:04 PM, Martin Buchholz wrote:
>
>> On Mon, Mar 29, 2010 at 17:14, Kelly O'Hair <kelly.ohair at sun.com> wrote:
>>>
>>> Jonathan is working on some changes to jtreg that may allow many of the
>>> tests on the ProblemList to be removed, I am hopeful anyway.
>>> Once we have that new jtreg available, trimming this list down is next.
>>>
>>> I admit to being a little quick to add some tests to the list, it was a
>>> somewhat frustrating experience to isolate these things when looking at
>>> all 12 platforms/vms.
>>>
>>> I think in this case it was timing out on Solaris sparc and appeared to
>>> be a stress test.
>>> This test is already marked othervm, so I can only assume that it is very
>>> close to the default timeout,
>>
>> If increasing the timeout will make failures go away, then that is
>> perfectly fine.
>> You can reasonably increase the timeout up to 1 hour.
>
> I'm not sure having this one test take an hour is fair to the other
> 1,000s of tests.
> If it can't run in under 5min, I'd question the test; maybe it's
> sleeping or blocking too much if that is the case.
> Part of the goal in all this was to provide wide, solid test coverage
> in a reasonable amount of time, but if we allow many tests to run for
> an hour, that's a bit of a problem.
>
> Regardless, I don't think this test takes an hour. I'll put a 10min
> timeout on it and see how that goes. But I suspect it will run in well
> under 5min.
>
> -kto
>
>>
>> Martin
>

I think it is worth noting that just because a test is on the
ProblemList does not mean it does not deserve to be run. It just means
it needs more TLC than other tests, for one of any number of reasons.
Since we have (regrettably) historically had trouble running all the
tests and getting all of them to pass, the approach is to separate the
set of tests into two groups: those that are well behaved (they run in
reasonable time with reasonable resources, and reliably pass on a good
JDK), and the rest. The well-behaved tests are good candidates for
automated running on systems like JPRT or Hudson, and failures should be
indicative of a fault in what is being tested. The other tests should
still be run, but by their nature they are harder to run, and may
require manual checking of the results to see if failures are real or
spurious. The real problem, in all of this, is the dwindling resources
to expend on such TLC :-(

-- Jon