New candidate JEP: 357: Migrate from Mercurial to Git

Joe Darcy joe.darcy at oracle.com
Sat Jul 27 21:49:38 UTC 2019


Hi Maurizio,

On 7/26/2019 2:31 AM, Maurizio Cimadamore wrote:
>
> On 15/07/2019 20:33, Joe Darcy wrote:
>> Hello,
>>
>> The intention is to separate the decision of whether or not to 
>> migrate to git from the decision of how to host the git repos, 
>> assuming the migration occurs.
>>
>> So there is an implicit dependency between delivering 357 and a 
>> yet-to-be-created JEP to discuss git hosting options.
>
> Picking up this slightly old thread.
>
> Joe, does that mean (I hope) that we won't witness _two_ separate 
> transitions (e.g. hg to self-hosted git, then self-hosted git to 
> hosted git)? Or is that still on the cards somehow?


No; there would be one hg -> git transition to the to-be-determined git 
hosting arrangement, self-hosting or 3rd party hosting.


>
> Every time we change repository (as has happened twice already with 
> the consolidation effort first, and the creation of the jdk/jdk repo 
> later), there is a lot of cost for all the infrastructure that 
> surrounds the JDK. Being the maintainer of several automated build, 
> test, and merge systems for the various Amber, Valhalla and Panama 
> projects makes me feel this pain acutely :-) (and I know I'm not alone 
> in this)


From the first OCW last year, slides 8 through 29 of

http://cr.openjdk.java.net/~darcy/Presentations/ocw-2018-08-01-skara.pdf

discuss the backstory of how the JDK has used its SCM in JDK 8 through 
12. The way the SCM was used in JDK 8 was the same as in JDK 7 after 
the JDK was open sourced and moved to hg from Teamware, an early 
distributed SCM which in turn was built on top of SCCS. 
Additionally, the way the SCM was used in JDK 7, before or after hg, was 
isomorphic to how it was used in JDK 6 and JDK 5.0 and probably earlier 
too, but I wasn't around for all that much of the pre-JDK 5.0 work :-)

The SCM model from JDK 5.0 to JDK 8, a twelve-year period from 2002 to 
2014, was very stable, I would argue to the point of being stagnant. 
During those years, there was a graph of integration forests and one 
master forest, where a forest was a collection of repos in the base SCM. I did 
benefit from this structure at times; it let me relatively easily update 
the jaxp, jax-ws, and corba portions of OpenJDK 6 to match the 
corresponding components of 6u10 independently of the other parts of the 
JDK (hotspot, langtools, core-libs, etc.). However, there 
were significant drawbacks to the graph of forests model, including lack 
of atomicity for what was semantically a single change if it happened to 
cross repo boundaries and long propagation delays of fixes across the 
graph of forests. In contrast, for the last few releases the "make JDK N 
into JDK (N+1)" changes have been pushed as a single changeset to master 
that spans hotspot, core libs, and javac components. Making the same sort 
of changes under the JDK 8 model would take at least three pushes and 
would take a month or more for all the fixes to propagate to all the 
repos in the graph, a long time if a six-month release schedule was 
desired!

The changes since JDK 9 (graph of forests => tree of forests, repo 
consolidation, combining lines of development, etc.) have been working 
toward a simpler and I would argue generally better model: a single 
master forest for a feature release where all developers work and where 
sufficient testing catches problems quickly. We aren't quite there yet; 
the client team still uses a separate repo, and improving testing is an 
unending task, but other than some implementation issues like long clone 
times, I think the current development model is much more sound than it 
was a few years ago. These SCM changes have been made possible by 
significant supporting work to improve testing, the build, 
infrastructure, and other areas.

The changes above were done in phases for several reasons, including to 
minimize disruption at a given point in time, and also to allow the 
necessary supporting work to be put in place, such as more reliable 
testing.


> So, while I'm not opposed to a move from hg to git (I find the two to 
> be close enough in terms of user experience, perhaps with a slight 
> question mark over git's extensibility - as I had fun a couple of 
> times writing python plugins for mercurial), I'll just plead for 
> attempting to minimize disruption (and hence number of moves) as much 
> as possible.

The git tool also has a plugin mechanism; Skara has a number of commands 
using it.
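
As a minimal sketch of that mechanism (not one of the Skara commands): 
git runs any executable named git-<name> that it finds on the PATH as 
"git <name>". The command name "git-greet" and the helper functions 
below are hypothetical, chosen only for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical `git-greet` command (an illustration, not part of Skara).

Save as an executable file named `git-greet` in a directory on PATH,
then invoke it as `git greet [name]`.
"""
import subprocess
import sys


def current_branch():
    """Return the name git reports for HEAD, or None if git cannot report one."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:  # git itself is not installed
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def build_message(name, branch):
    """Format the greeting; kept free of I/O so it is easy to test."""
    where = f"you are on branch {branch}" if branch else "no branch detected"
    return f"Hello, {name}: {where}"


def main(argv):
    # argv[0] is the script path; an optional argv[1] is the name to greet.
    name = argv[1] if len(argv) > 1 else "world"
    print(build_message(name, current_branch()))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

With the file marked executable and on the PATH, "git greet Maurizio" 
would dispatch to it with no changes to git itself.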

I agree it is prudent to avoid unnecessary churn in the JDK's SCM usage; 
I don't know of any additional plans for bulk moves of files within the 
repo and a move to git would be the last item on my SCM changes list :-)

If you have interest and some spare cycles, the git mirrors of the projects

     https://github.com/openjdk/panama
     https://github.com/openjdk/amber
     https://github.com/openjdk/valhalla

(among others) are available for infra experiments and prototypes.

Cheers,

-Joe



