Proposal to revise forest graph and integration practices for JDK 9

Sat Nov 23 08:34:26 PST 2013

Hello,

The current arrangement of integration forests for a JDK platform
release, like JDK 8, imposes high overheads on development. I'm
proposing we use an alternate forest arrangement for JDK 9 that will
dramatically reduce the propagation time of fixes across the set of
forests. More details below.

JDK release projects for new Java SE platforms, like JDK 8 for Java SE 
8, have long used a graph of forests structured roughly as follows:

* A master forest for the release

* A thicket of integration forests for particular teams or technology 
areas. Today in JDK 8, the "TL" forest hosts changes in tools related to 
javac as well as core libraries. Forests for various client libraries 
(2D, awt, swing) host changes in those areas. A HotSpot forest, fed in 
turn by several HotSpot team forests, hosts VM-related changes. 
Generally, each integration forest only accepts changes to a subset of 
repositories. For example, the HotSpot-related forests typically only 
accept changes to the hotspot repository. The TL forest accepts changes 
to library-related repos (jdk, jaxp, jaxws, etc.) and langtools, but not 
to hotspot.

After some amount of testing and other verification steps, changes in 
one integration forest are integrated into master, typically on a weekly 
or bi-weekly basis. From master, a fix then propagates down to the other
integration forests according to the policies of each forest. A fix
could be in master for several days or more before being propagated to a 
particular integration forest.

While this structure has provided a great deal of cross-team isolation, 
it has come at the cost of high propagation delays of fixes to all 
forests. This propagation delay, combined with each forest accepting
pushes to only a subset of repos, also severely complicates making
coordinated changes that span repositories, as often occurred in Project
Lambda and as chronically occurs for technologies like serviceability.
To give a
representative example, consider a hypothetical change which requires 
updates to both the HotSpot runtime area as well as core libraries. If 
the change is first pushed to HotSpot, the propagation proceeds like:

     fix pushed to HotSpot runtime forest -> HotSpot main ->
     JDK 8 master -> TL -> core libs engineer's forest

At this point, the libraries half of the change can be pushed:

     core libs fix pushed to TL -> JDK 8 master -> HotSpot main ->
     HotSpot runtime -> HotSpot runtime engineer's forest

This cycle to separately push both halves of what is conceptually a 
single fix and wait for them to propagate can take about four weeks. 
(Worse, a third push is often needed to complete the fix, since the
first push is often of the form "accept old way or new way" and the
third push updates this to "accept new way only." When such a clean-up
push is needed, it takes another two weeks to fully propagate.)
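
As a rough illustration of the round trip above, here is a minimal
Python sketch that treats the JDK 8 forests as a tree rooted at master
and computes the route a fix takes between two forests. The forest names
come from the example above; the PARENT table and the helper functions
are hypothetical and only model the graph, not any real tooling or
timing.

```python
# Hypothetical model of part of the JDK 8 forest graph.
# Each forest maps to the forest it integrates into (None for the root).
PARENT = {
    "HotSpot runtime": "HotSpot main",
    "HotSpot main": "master",
    "TL": "master",
    "master": None,
}


def path_to_root(forest):
    """Return the list of forests from `forest` up to the root, inclusive."""
    path = [forest]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def propagation_path(src, dst):
    """Route of a fix from src to dst: up to the nearest shared forest,
    then down to dst."""
    up = path_to_root(src)
    down = path_to_root(dst)
    for i, forest in enumerate(up):
        if forest in down:
            j = down.index(forest)
            # Climb from src to the shared forest, then descend to dst.
            return up[: i + 1] + down[:j][::-1]
    raise ValueError("forests share no common ancestor")


first_half = propagation_path("HotSpot runtime", "TL")
second_half = propagation_path("TL", "HotSpot runtime")
print(" -> ".join(first_half))
print(" -> ".join(second_half))
```

Each arrow in the printed routes stands for one scheduled integration or
sync, which is where the weeks of delay accumulate.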

Since more projects requiring cross-repository coordination are expected 
in the future, I'm proposing a number of changes to the forest structure 
and management policies in JDK 9 to reduce the propagation delays.

* A master forest that is a time-delayed version of dev; dev is
described below.

* The dev forest conceptually replaces TL and hosts all 
libraries-related changes. When the sources in dev are in a known-good 
state, that state can be integrated to master. This integration cycle 
would happen at least weekly. In a change from current practice, HotSpot 
changes would be integrated into dev and *not* into master. All other 
team forests would also integrate into dev rather than master.

* Coordinated HotSpot + other component fixes would *both* get first
pushed through the HotSpot forest. From the HotSpot forest the full fix
would be integrated into dev. If additional testing is appropriate for
the non-HotSpot fix, that testing should occur before integration into
dev.

* Regular promoted builds based on master would continue.

By having team forests integrate directly into dev as well as having 
many libraries developers pushing directly to dev, the dev forest serves 
as an active collaboration area with greatly reduced propagation times 
across the whole system. With this model there is less cross-team 
isolation; teams and individuals are responsible for promptly fixing any 
breakage which is introduced. If problems are not quickly addressed, a 
problematic changeset may be anti-delta'ed.
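
To make the comparison concrete, here is a small Python sketch that
counts forest-to-forest hops for a coordinated HotSpot + libraries fix
under each arrangement. The edge lists are a simplified, hypothetical
model; in particular, it assumes a HotSpot runtime team forest still
feeds HotSpot main in JDK 9, which the proposal does not spell out. Hop
counts are only a proxy for delay, since each hop has its own schedule.

```python
from collections import deque

# Hypothetical, simplified edge lists: (child forest, parent forest).
JDK8_EDGES = [
    ("HotSpot runtime", "HotSpot main"),
    ("HotSpot main", "master"),
    ("TL", "master"),
]
JDK9_EDGES = [
    ("HotSpot runtime", "HotSpot main"),  # assumption: team forest retained
    ("HotSpot main", "dev"),
    ("dev", "master"),
]


def hops(edges, src, dst):
    """Breadth-first search for the number of hops between two forests."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph.get(node, set()) - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    raise ValueError(f"no route from {src} to {dst}")


# JDK 8: two sequential pushes, each crossing the whole graph.
jdk8_total = (hops(JDK8_EDGES, "HotSpot runtime", "TL")
              + hops(JDK8_EDGES, "TL", "HotSpot runtime"))
# JDK 9 proposal: both halves travel together from HotSpot to dev.
jdk9_total = hops(JDK9_EDGES, "HotSpot runtime", "dev")
print("JDK 8 hops:", jdk8_total)
print("JDK 9 hops:", jdk9_total)
```

Under these assumptions the coordinated fix crosses six hops in the
JDK 8 arrangement and two in the proposed one, before counting the final
pull into an engineer's own forest.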

(Conceptually, in this model a separate master forest is not strictly 
needed since the known-good states could be indicated using a mechanism 
like Hg tags. However, while adjusting to the new model and to allow for 
fixes directly to master in exceptional circumstances, I'm proposing a 
physically separate master forest be retained. The distinct URL of 
master will also clearly indicate known-good states of the source code.)

Please send comments on the above by November 29.

Thanks,

-Joe