From martinrb at google.com  Thu May 22 08:12:07 2008
From: martinrb at google.com (Martin Buchholz)
Date: Thu, 22 May 2008 08:12:07 -0700
Subject: jtreg testing integrated
In-Reply-To: <1211466426.4054.42.camel@dijkstra.wildebeest.org>
References: <1211188871.5783.26.camel@dijkstra.wildebeest.org>
	<17c6771e0805190756l3abb06d0g74158054589471fb@mail.gmail.com>
	<1ccfd1c10805190830j37ef4f8bg12de006c9e051298@mail.gmail.com>
	<1211275953.3284.33.camel@dijkstra.wildebeest.org>
	<1ccfd1c10805200600m74b6f735g9159f10a27f8dc26@mail.gmail.com>
	<1211466426.4054.42.camel@dijkstra.wildebeest.org>
Message-ID: <1ccfd1c10805220812p2a81641ct49c4f6a8cd339ca7@mail.gmail.com>

[+quality-discuss, jdk7-gk]

On Thu, May 22, 2008 at 7:27 AM, Mark Wielaard wrote:
> Hi Martin,
>
> On Tue, 2008-05-20 at 06:00 -0700, Martin Buchholz wrote:
>> On Tue, May 20, 2008 at 2:32 AM, Mark Wielaard wrote:
>> >> I like a policy of "Read my lips; no new test failures" but OpenJDK
>> >> is not quite there; we get test failure creep when changes in
>> >> one component break another component's tests.
>> >
>> > Yes, that would be idea. At least for openjdk6/icedtea we seem to be
>> > pretty close actually. It will be more challenging for openjdk7. I
>> > haven't quite figured out all the dynamics around "workspace
>> > integration". But I assume we can get the master tree to zero fail and
>> > then demand that any integration cycle doesn't introduce regressions.
>>
>> There are too many tests to require team integrators to run
>> them all on each integration cycle.
>
> I am not sure. It does take about 3 hours to run all the included tests
> (and I assume that when we add more tests or integrate things like mauve
> it will rise).

Not all the regression tests are open source yet, and not all the
test suites available are open source (and some are likely to be
permanently encumbered).  And we should be adding more static
analysis tools to the testing process.
It sure would be nice to run all tests with -server and -client,
and with different GCs, and on 32- and 64-bit platforms,
with Java assertions enabled and disabled,
with C++ assertions enabled and disabled.
Soon a "full" testing cycle looks like it might take a week.

> But I do hope people, not just integrators, will run them
> regularly. Especially when they are working on/integrating larger
> patches. And we can always fall back on autobuilders so we have a full
> report at least soon after something bad happens so there is some chance
> to revert a change relatively quickly.

Much of the world works on this model - commit to trunk,
wait for trouble, revert.  It's certainly much cheaper,
and gets feedback quicker, but creates fear among developers
("Notoriously careless developer X just did a commit.
I think I'll wait for a week before pulling").

>> For a few years I've advocated
>> adding another level to the tree of workspaces.  My model is to
>> rename the current MASTER workspace to PURGATORY, and
>> add a "golden MASTER".
>> The idea is that once a week or so all tests are run exhaustively,
>> and when it is confirmed that there are no new test failures,
>> the tested code from PURGATORY is promoted to MASTER.
>
> This is fascinating. Intuitively I would call for less levels instead of
> more because that makes issues show up earlier. It is one of the things
> I haven't really wrapped my head around. The proliferation of separate
> branches/workspaces. One main master tree where all work goes into by
> default and only have separate (ad hoc) branches/workspaces for larger
> work items that might be destabilizing seems an easier model to work
> with.

It's certainly more work for the integrators.
But for the developers my model is simple and comfortable.
Your integrator will give you a workspace to commit changes to.
Commit there whenever you feel like.  Go on to the next coding task.
Your changes will take a while to percolate into MASTER,
but what do you care?
When you sync, you pull in changes from MASTER,
which are *guaranteed* to not break any of your tests.
If you want specific changes quickly, pull from PURGATORY
or a less-tested team workspace.

If you have a project where you need to share your work
with other developers immediately, no problem - just create
a project-specific shared workspace that all project team members
can commit to directly.  Decide on a level of testing the team
is comfortable with - including none at all.

Developers in my model are more productive partly because
they don't have to be afraid of breaking other developers.
They can do enough testing for 95% confidence
(which for many changes might mean no testing at all)
then commit.  The system will push back buggy changes
automatically.

Too many times I've suffered because tests in library land
have been broken by changes in hotspot.

Nevertheless, the JDK MASTER is remarkably stable
for a project with so many developers, largely because
of the gradual integration process, with changes going into
MASTER only after being tested by integrators.
JDK developers don't go around chatting about
"build weather" - is the build broken today?  AGAIN?

This development model doesn't work as well for
most open source projects, because they have fewer,
smarter, and more dedicated developers, so there is less need.
Also, it's hard to find good integrators.
Most people (like myself) end up doing it as a part-time job.

But just like source code control systems have gotten sexy,
perhaps someday "code integration and testing systems"
will become sexy, and everyone will want to write one.

Martin