Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources
Aleksey Shipilev
shade at redhat.com
Mon Jul 30 11:53:09 UTC 2018
On 07/30/2018 01:13 PM, Weijun Wang wrote:
> Joe said on Jul 28:
>
>> In Mercurial, when a file is moved, its history is restarted, meaning a full copy of the file is stored. Therefore, lots of file moves will tend to make a Mercurial repo get disproportionally larger. In the JDK, many files were moved in JDK 9 for modularity and large numbers of files were moved again in JDK 10 for the repo consolidation.
>>
>> The Mercurial representation of JDK 8 GA takes about 412 MB, JDK 9 GA ~808 MB, and JDK 10 GA ~1553 MB.
>
> So this is related to Mercurial's design, in which a rename equals a remove and a create.
>
> Maybe we can fix Mercurial to make this a real "move", and I doubt if there is a space-time tradeoff here.
What I meant to say is that there is a space-time tradeoff between the on-the-wire format (bundles)
and the on-the-disk format (.hg folder), and you can choose either, depending on the context.
Publishing blobs in the on-the-wire format gives better compatibility, while tarballs in the
on-the-disk format are ultimately faster to "clone".
Two mega-moves (Jigsaw in 9, and monorepo in 10) inflated the on-the-disk size quite badly, as Joe
indicated above, but the on-the-wire format size seems to remain okay. So, if we enabled CDN-backed,
bundle-assisted clones, that would probably ease the clone pains, at least for our Europe-side folks,
at the expense of some client CPU churn spent converting on-the-wire to on-the-disk format during
the clone.
Some optimization of the on-the-disk size is possible if you re-clone the repo with
"--config=format.generaldelta=1 --config=format.aggressivemergedeltas=1", which optimizes the
internal .hg metadata. That takes a lot of time, so it only makes sense if you have time to spare.
My build scripts do that automatically before packaging the .hg snapshots.
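For illustration only, here is a minimal Python sketch of how such a re-clone command line could be
assembled. It is not my actual build script. The two --config options are copied verbatim from the
paragraph above; the use of "clone --pull" and the directory names "jdk-old" and "jdk-new" are
assumptions made for the example. The sketch only builds and prints the command, it does not run hg.

```python
import shlex

# Options copied verbatim from the message; nothing else about hg is assumed.
FORMAT_OPTIONS = (
    "--config=format.generaldelta=1",
    "--config=format.aggressivemergedeltas=1",
)


def reclone_command(source, dest):
    """Build (but do not run) the argv for a re-clone with the format options.

    "clone --pull" is an assumption here: it forces changesets to be pulled
    and re-stored instead of copying the existing .hg files as they are.
    """
    return ["hg", "clone", "--pull", *FORMAT_OPTIONS, source, dest]


if __name__ == "__main__":
    # "jdk-old" and "jdk-new" are placeholder directory names.
    print(shlex.join(reclone_command("jdk-old", "jdk-new")))
```

Printing the command instead of executing it keeps the sketch safe to try; the printed line can be
pasted into a shell once the paths are adjusted.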
Also, it seems that doing the "clone --pull" twice with generaldelta enabled compacts the metadata
even more: the jdk/jdk .hg size fell from 1.5 GB to 1.2 GB uncompressed, and from 750M to 590M
xz9-compressed. I have just fixed my build scripts and am currently testing them.
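To put the numbers above in relative terms, a small Python sketch of the arithmetic follows. The
sizes are the ones quoted in this message; the helper name reduction_percent is made up for the
example.

```python
def reduction_percent(before, after):
    """Return the size reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Sizes as quoted in the message: GB uncompressed, "M" xz9-compressed.
uncompressed = reduction_percent(1.5, 1.2)
compressed = reduction_percent(750, 590)

if __name__ == "__main__":
    print(f"uncompressed: {uncompressed:.1f}% smaller")
    print(f"xz9-compressed: {compressed:.1f}% smaller")
```

So the second pass saves roughly a fifth of the size in both the uncompressed and the compressed
form.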
-Aleksey