jtreg testing integrated
Mark Wielaard
mark at klomp.org
Thu May 22 15:04:47 PDT 2008
Hi Kelly,
On Thu, 2008-05-22 at 11:59 -0700, Kelly O'Hair wrote:
> Mark Wielaard wrote:
> > I am not sure. It does take about 3 hours to run all the included tests
> > (and I assume that when we add more tests or integrate things like mauve
> > it will rise). But I do hope people, not just integrators, will run them
> > regularly. Especially when they are working on/integrating larger
> > patches. And we can always fall back on autobuilders so we have a full
> > report at least soon after something bad happens so there is some chance
> > to revert a change relatively quickly.
>
> 3 hours for runs with:
> -client and -server?
> one OS?
> 32bit and 64bit?
>
> And you are only talking about the tests in the jdk/test area I assume.
No, all the jtreg-based tests (-a -ignore:quiet) currently included in
langtools, jdk and hotspot, on an x86_64, dual-core, 3.2GHz machine
running Fedora 8.
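For reference, such a run over the three repositories might look roughly like the sketch below. This is only an assumption about a typical local setup: the jtreg launcher being on PATH and the -testjdk image path are hypothetical, and the echo merely prints each command rather than launching it.

```shell
#!/bin/sh
# Sketch: run the jtreg tests bundled with each repository.
# TESTJDK is a hypothetical path to a freshly built JDK image.
TESTJDK="${TESTJDK:-/path/to/build/j2sdk-image}"
for repo in langtools jdk hotspot; do
    # echo shows the command; remove it to launch the run for real
    echo jtreg -a -ignore:quiet -testjdk:"$TESTJDK" "$repo/test"
done
```

Here -a selects the automatic (non-interactive) tests and -ignore:quiet silently skips tests marked @ignore.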
> The issue I have seen with the testing is that if it isn't done on a good
> spread of options, platforms, and variations, something gets missed, and
> there is little consistency between developers as to what the officially
> required test matrix is. Once we have a 100% pass list and a required
> matrix, I have a system to help enforce these tests being run, but so
> far, nobody has given me a way to run just that list.
> Once I have it, I think we can keep all or most repositories golden.
Yes, that would be ideal.
> The Hotspot team runs many many variations, using options like -Xcomp
> and -Xbatch and loads more. But they are trying to stress the VM.
I assume there are many more hotspot tests than the 4 currently
included. Hopefully they can be liberated so more people can run them.
> I'm more of a 'test before integrate' person, with streamlining and
> automating the testing process, making it part of the developer push process,
> adapting the tests as major regressions sneak by (you can never catch all
> regressions, no matter what you do). Blocking pushes on any failure.
> So I'm trying to throw hardware at the problem until we can possibly do
> the "exhaustive testing" that Martin mentions, as part of a developer
> pushing a change in, before anyone else sees it, all automated.
With a more distributed version control system a lot more can be
separated, I guess. Your idea of a core set of tests that should always
pass 100% is good. Then autobuilders could take over, and everybody who
cares about a particular architecture/setup/configuration could add
their own autobuilder to the mix and make sure the full-blown testsuite
keeps passing completely.
For GCC there is a nice system where, when someone commits something
affecting a platform they don't have access to, a build machine runs the
tests and sends email to that person: "After your latest commit the GNAT
Ada compiler cross-compiled from mips64 to ppc AIX failed the following
tests. GO FIX IT!".
> But for automation, I want a test guarantee:
> "these tests should pass 100% of the time on all platforms in all situations"
> and then we can think about enforcing it.
> None of this wishy washy "this test fails sometimes due to the phase of
> the moon" crap. ... oops, can I say crap in a public email??... ;^}
Yes, that is the biggest danger: "flaky tests". With Mauve we actually
have that problem, and we are constantly fighting it. It is a huge cost
to all involved :{
Cheers,
Mark
More information about the distro-pkg-dev
mailing list