New candidate JEP: 357: Migrate from Mercurial to Git

Aleksey Shipilev shade at redhat.com
Tue Jul 16 14:23:43 UTC 2019


On 7/16/19 2:19 PM, Severin Gehwolf wrote:
>> Unfortunately, this JEP only discusses the advantages, and does not discuss the disadvantages of
>> performing this move.
> 
> Should this JEP go into the details of advantages/disadvantages of HG
> vs. Git itself? I wouldn't think so, but your mileage may vary.

No, but it should discuss both positive and negative impacts on the OpenJDK project. I don't mind
switching to Git in principle, *BUT* for a project of OpenJDK's scale, we should do it based on an
accurate cost/benefit assessment and a complete list of mitigations to minimize the cost. I
believe this move is a heavily disruptive change, and should be treated with the utmost caution. The
JEP text does not give the impression that this is the case.


>>  *) Developers who are already quite constrained to deliver things at a 6-month pace would have to
>> re-adjust their workflows, some would need re-training for Git, many would have to accept the
>> temporary productivity losses, and/or modify their delivery schedules;
> 
> Right. As David Lloyd said, this seems to be a one-time investment.
> Consider 3 years down the line after a conversion, would you still have
> a productivity loss?

Look at it this way: I routinely deal with multiple JDK release trains.

They are:
 * 8u: Mercurial forest
 * 9u+: Mercurial forest with major Jigsaw-inspired reshuffling
 * 10u+: Mercurial monorepo with yet another major reshuffling
 * 14+: Git monorepo (?)

The cost of moving the changes between these differently shaped repos is significant. Here, 8u is
way over "3 years down the line", and we are still paying the cost. I cannot honestly say
introducing another repo shape would simplify this story in the foreseeable future. This is the cost
maintainers would pay for quite a long time, and it should be acknowledged and agreed upon.

>>  *) Downstream builders would need to refit their pipelines after the move -- and there are lots of
>> them;
> 
> Could you perhaps explain in more detail what bearing source control
> has on the actual building of projects? I understand that source code
> has to get fetched somewhere, but once local, the build pipeline would
> work in a similar fashion beyond that point, right?
> 
> FWIW, for Fedora fresh source tarball clones (package updates) would be
> significantly faster via git shallow clones (we don't need the history
> at all, just the source code).

I am much more concerned about builders who have to retarget their build pipelines without
accidentally building from the old/archived read-only Mercurial mirror. Recent debacles about tags
even within one repo show that this can happen. Perhaps there is a way to reject connections
to hgserver from the clients (thus breaking automatic scripts), but still allow connections to
hg-web (thus retaining the web links)?

But all of this is speculation about how the transition would be handled, because it is not really
covered in the current JEP text.
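
Whatever the server side ends up doing, a pipeline could also protect itself. A minimal sketch of
such a client-side guard, under stated assumptions: the set of "archived" hosts is hypothetical
(nothing in the JEP says which hosts would be archived), the function name is made up for
illustration, and git.example.org is a placeholder, not a real OpenJDK location.

```python
from urllib.parse import urlparse

# Assumption: hosts that a builder has decided to treat as archived/read-only.
# hg.openjdk.java.net is listed only as an illustration; the JEP does not say
# what happens to the Mercurial servers after the move.
ARCHIVED_HOSTS = {"hg.openjdk.java.net"}


def check_source_url(url: str) -> None:
    """Raise ValueError if a pipeline is configured to fetch from an archived host."""
    host = urlparse(url).hostname  # already lower-cased by urlparse
    if host is None:
        raise ValueError(f"cannot determine host of source URL: {url!r}")
    if host in ARCHIVED_HOSTS:
        raise ValueError(f"refusing to build from archived mirror: {host}")


if __name__ == "__main__":
    # Placeholder URL: passes the check because the host is not in ARCHIVED_HOSTS.
    check_source_url("https://git.example.org/jdk/jdk.git")
    print("source URL accepted")
```

Running this as the first step of a fetch script would make a stale configuration fail loudly
instead of silently building old sources.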

>> Additionally, not addressed:
>>  *) Existing hgupdater links in JBS would have to be updated, or they would break;
> 
> Only if current mercurial repositories cease to exist entirely, I'd
> think, no?

Yes. And we know they have disappeared before. Try to follow the hgupdater comment that points to
jdk/hs-comp, for example. I don't see an agreement anywhere that Mercurial repos would stay put for
links to work after the conversion. It should be explicitly written down and agreed upon.


>>  *) There are improvements to Mercurial that can make the conversion advantages less appealing. For
>> example, clonebundles that I pointed out multiple times over the year (and Mark promised to deliver,
>> at OpenJDK Committers Workshop in February 2019) is still not enabled:
>> https://bugs.openjdk.java.net/browse/JDK-8211383. Instead, we have "Alternatives: Keep using
>> Mercurial" (sic!).
> 
> That's a valid alternative measure. It would not address the *local*
> storage requirements though, would it? The JEP says:
> 
> """
> For example, the .git directory of
> the jdk/jdk repository is approximately 300 MB with Git and the .hg directory
> is around 1.2 GB with Mercurial, depending on the Mercurial version being used.
> """

Yes. Most of that 1.2 GB is self-inflicted damage due to major moves in 9 and 10. Still, I wonder
how bad that actually is, considering that the alternative is a non-transparent conversion. I have
20 separate JDK trees on my work machine, so I pay about 15..20 GB extra disk space for the benefit
of not breaking my workflow -- and would gladly pay twice as much.

Also, we check out workspaces to actually build them. When you do, you need an additional 6 GB per
build, and that is the major disk space hog during development.
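
As a back-of-the-envelope sketch of those numbers: the metadata sizes are the approximate figures
quoted from the JEP above, and the 6 GB figure is per build. The assumption that every one of the
20 trees gets built once is only for illustration.

```python
# Approximate figures from the JEP quote and the text above.
GIT_METADATA_GB = 0.3   # ".git directory ... approximately 300 MB"
HG_METADATA_GB = 1.2    # ".hg directory is around 1.2 GB"
BUILD_OUTPUT_GB = 6.0   # additional disk space per build
TREES = 20              # separate JDK trees on one machine


def extra_metadata_gb(trees: int) -> float:
    """Extra disk used by Mercurial metadata compared to Git, summed over all trees."""
    return trees * (HG_METADATA_GB - GIT_METADATA_GB)


def build_output_gb(trees: int) -> float:
    """Disk used by build output, assuming each tree is built exactly once."""
    return trees * BUILD_OUTPUT_GB


if __name__ == "__main__":
    print(f"extra Mercurial metadata: {extra_metadata_gb(TREES):.0f} GB")
    print(f"build output:             {build_output_gb(TREES):.0f} GB")
```

The metadata difference lands inside the 15..20 GB range, while build output is several times
larger under the same assumption.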

> Fresh clone times: Git has less metadata to transfer (above).
> JDK-8211383 could help for Mercurial; how does it compare?

Last I checked, the jdk-jdk zstd bundle is about 260 MB. This is roughly how much data would be
transferred with clonebundles.

> Is there a "shallow" clone option for Mercurial? The JEP seems to imply it's not
> present in HG[1].

Facebook solved it with remotefilelog:
  https://bitbucket.org/facebook/hg-experimental/src/default/remotefilelog/


-- 
Thanks,
-Aleksey


