jtreg testing integrated

Thu May 22 12:13:36 PDT 2008

It's been my experience that when people say
    "Too many times ... broken by changes in hotspot ..."

it's not that frequent an event at all. It's just that any single
event like it becomes a huge effort to isolate and resolve, so two of
them quickly become "too many". And I do understand that very well.
It's like a gcc or C compiler bug: nasty, nasty problems that mostly
just make people angry or upset that it ever happened and that so
much time was lost tracking it down.

Nobody likes this to happen, but asking every team to run all your
tests on all platforms before they integrate isn't a very good solution
either, not when many of the tests are not easily runnable by that team.
It's like asking the plumber to run all the electrical tests to make sure
he hasn't caused a short somewhere: easier said than done, and it will
probably cost you much more in the long run.

I will try to look into creating an automated run of some of the basic
jdk java tests, but I need help in creating a set of 100% pass tests
that runs in a reasonable amount of time, say 15-20 minutes on a fast
machine. I could probably get that into their automated system easily
enough, but asking them to run everything just won't fly.
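
As a rough illustration only (not anything the existing automated system
is known to do), one way to pick such a set is to keep just the tests with
a clean pass history and fill the time budget with the shortest ones
first. The record fields, the test names, and the 20 minute budget below
are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestRecord:
    name: str       # hypothetical test identifier
    runs: int       # number of recorded runs
    passes: int     # how many of those runs passed
    seconds: float  # typical wall-clock time on a fast machine


def pick_sanity_set(records, budget_seconds=20 * 60):
    """Pick always-passing tests, shortest first, until the budget is full.

    Returns (list of chosen test names, total seconds used).
    """
    # Keep only tests that have run at least once and never failed.
    stable = [r for r in records if r.runs > 0 and r.passes == r.runs]
    chosen, used = [], 0.0
    for r in sorted(stable, key=lambda r: r.seconds):
        if used + r.seconds > budget_seconds:
            break  # every remaining test is at least this long
        chosen.append(r.name)
        used += r.seconds
    return chosen, used


if __name__ == "__main__":
    history = [
        TestRecord("lang/BasicA", runs=10, passes=10, seconds=600),
        TestRecord("util/FlakyB", runs=10, passes=9, seconds=5),
        TestRecord("io/BasicC", runs=10, passes=10, seconds=300),
        TestRecord("net/BasicD", runs=10, passes=10, seconds=400),
    ]
    names, total = pick_sanity_set(history)
    print(names, total)
```

Shortest-first is just the simplest greedy choice; a real list would more
likely be chosen by hand for coverage and then checked against the budget.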

-kto

Martin Buchholz wrote:
> [+quality-discuss, jdk7-gk]
> 
> On Thu, May 22, 2008 at 7:27 AM, Mark Wielaard <mark at klomp.org> wrote:
>> Hi Martin,
>>
>> On Tue, 2008-05-20 at 06:00 -0700, Martin Buchholz wrote:
>>> On Tue, May 20, 2008 at 2:32 AM, Mark Wielaard <mark at klomp.org> wrote:
>>>>> I like a policy of "Read my lips; no new test failures" but OpenJDK
>>>>> is not quite there; we get test failure creep when changes in
>>>>> one component break another component's tests.
>>>> Yes, that would be ideal. At least for openjdk6/icedtea we seem to be
>>>> pretty close actually. It will be more challenging for openjdk7. I
>>>> haven't quite figured out all the dynamics around "workspace
>>>> integration". But I assume we can get the master tree to zero fail and
>>>> then demand that any integration cycle doesn't introduce regressions.
>>> There are too many tests to require team integrators to run
>>> them all on each integration cycle.
>> I am not sure. It does take about 3 hours to run all the included tests
>> (and I assume that when we add more tests or integrate things like mauve
>> it will rise).
> 
> Not all the regression tests are open source yet, and not all the
> test suites available are open source (and some are likely to be
> permanently encumbered).  And we should be adding more
> static analysis tools to the testing process.
> 
> It sure would be nice to run all tests with -server and -client,
> and with different GCs, and on 32 and 64-bit platforms,
> with Java assertions enabled and disabled,
> with C++ assertions enabled and disabled.
> 
> Soon a "full" testing cycle looks like it might take a week.
> 
>> But I do hope people, not just integrators, will run them
>> regularly. Especially when they are working on/integrating larger
>> patches. And we can always fall back on autobuilders so we have a full
>> report at least soon after something bad happens so there is some chance
>> to revert a change relatively quickly.
> 
> Much of the world works on this model -
> commit to trunk, wait for trouble, revert.
> It's certainly much cheaper, and gets feedback quicker,
> but creates fear among developers ("Notoriously careless
> developer X just did a commit.  I think I'll wait for a week
> before pulling")
> 
>>>   For a few years I've advocated
>>> adding another level to the tree of workspaces.  My model is to
>>> rename the current MASTER workspace to PURGATORY, and
>>> add a "golden MASTER".
>>> The idea is that once a week or so all tests are run exhaustively,
>>> and when it is confirmed that there are no new test failures,
>>> the tested code from PURGATORY is promoted to MASTER.
>> This is fascinating. Intuitively I would call for less levels instead of
>> more because that makes issues show up earlier. It is one of the things
>> I haven't really wrapped my head around. The proliferation of separate
>> branches/workspaces. One main master tree where all work goes into by
>> default and only have separate (ad hoc) branches/workspaces for larger
>> work items that might be destabilizing seems an easier model to work
>> with.
> 
> It's certainly more work for the integrators.  But for the developers
> my model is simple and comfortable.  Your integrator will give you
> a workspace to commit changes to.
> Commit there whenever you feel like.  Go on to the next coding task.
> Your changes will take a while to percolate into MASTER,
> but what do you care?
> When you sync, you pull in changes from MASTER, which are
> *guaranteed* to not break any of your tests.  If you want specific
> changes quickly, pull from PURGATORY or a less-tested team
> workspace.
> 
> If you have a project where you need to share your work
> with other developers immediately,
> no problem - just create a project-specific shared workspace
> that all project team members can commit to directly.
> Decide on a level of testing the team is comfortable with -
> including none at all.
> 
> Developers in my model are more productive partly because
> they don't have to be afraid of breaking other developers.
> They can do enough testing for 95% confidence
> (which for many changes might mean no testing at all)
> then commit.  The system will push back buggy changes
> automatically.
> 
> Too many times I've suffered because tests in library land
> have been broken by changes in hotspot.  Nevertheless,
> the JDK MASTER is remarkably stable for a project with so
> many developers, largely because of the gradual integration
> process, with changes going into MASTER only after being
> tested by integrators.  JDK developers don't go around chatting
> about "build weather" - is the build broken today?  AGAIN?
> 
> This development model doesn't work as well for most
> open source projects, because they have fewer, smarter, and more
> dedicated developers, so there is less need.
> Also, it's hard to find good integrators.  Most people (like myself)
> end up doing it as a part-time job.  But just like source code
> control systems have gotten sexy, perhaps someday
> "code integration and testing systems" will become sexy,
> and everyone will want to write one.
> 
> Martin