Consolidated repo vs. old forest size differences
joe darcy
joe.darcy at oracle.com
Thu Sep 28 22:48:54 UTC 2017
Hi Volker,
When there is a file move in Hg, it starts a fresh snapshot of the file.
With the reorganized source structure, which has a single top-level src
subdirectory instead of jdk/src, langtools/src, etc., all the source
files were moved.
This accounts for much of the greater size post consolidation. We're
looking into some ways to mitigate the impact of the larger size.
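One rough way to check this is sketched below; it is not a definitive
measurement. Mercurial keeps each tracked file's history in filelog files
(.i and .d) under .hg/store/data, laid out by path. Assuming the
consolidated repo's store holds the pre-move paths under data/jdk,
data/langtools, etc. and the post-move paths under data/src, comparing the
byte totals of those subtrees gives an idea of how much the fresh snapshots
add. The helper name store_bytes and the repo path jdk10-hs in the usage
comment are illustrative only.

```shell
# Sketch: print the total bytes of filelog files (.i and .d) stored under
# one top-level directory of a Mercurial repository's store.
# Caveats (assumptions of this sketch): very long paths are kept under
# .hg/store/dh instead and are not counted here, and store path encoding
# is ignored, so this only suits simple lowercase names such as "src".
store_bytes() {
    repo=$1
    top=$2
    dir="$repo/.hg/store/data/$top"
    if [ -d "$dir" ]; then
        # Concatenate every filelog file and count the bytes.
        find "$dir" -type f \( -name '*.i' -o -name '*.d' \) -exec cat {} + \
            | wc -c | tr -d ' '
    else
        echo 0
    fi
}

# Example usage (paths are hypothetical):
#   store_bytes jdk10-hs src    # filelogs created by the move
#   store_bytes jdk10-hs jdk    # filelogs for the old jdk/... paths
```

If the total for src is a large fraction of the overall store, that would
be consistent with the moved files being stored again from scratch.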
Thanks,
-Joe
On 9/28/2017 3:15 PM, Volker Simonis wrote:
> Hi,
>
> not sure if this has been discussed before but at least I couldn't
> find any references in the previous mail threads on the repo
> consolidation.
>
> I've just realized that the size of the repository history (i.e.
> everything under .hg) has doubled in the new consolidated repo (800mb
> vs. 1600mb) and I don't exactly understand why:
>
> $ du -shc jdk10-hs-old/*/.hg jdk10-hs-old/.hg
> 16M jdk10-hs-old/corba/.hg
> 141M jdk10-hs-old/hotspot/.hg
> 49M jdk10-hs-old/jaxp/.hg
> 57M jdk10-hs-old/jaxws/.hg
> 453M jdk10-hs-old/jdk/.hg
> 76M jdk10-hs-old/langtools/.hg
> 33M jdk10-hs-old/nashorn/.hg
> 8,1M jdk10-hs-old/.hg
> 829M total
>
> $ du -sh jdk10-hs/.hg
> 1,6G jdk10-hs/.hg
>
> I wonder why this is the case?
>
> Is this because the consolidated repo has more and bigger merge changes?
>
> The consolidated repo has a total of 47297 changes with about 13878
> merge changes:
>
> $ hg -R jdk10-hs log --template "{rev}\n" -r tip
> 47297
> $ hg -R jdk10-hs log --template "{rev}\n" -k Merge | wc
> 13878 13878 79600
>
> The old forest had a total of 43102 changes with about 10408 merge changes:
>
> $ bash common/bin/hgforest.sh log --template "{rev}\n" | wc
> 43102 86285 1295798
> $ bash common/bin/hgforest.sh log -k "Merge" --template "{rev}\n" | wc
> 10408 20897 312491
>
> So the new consolidated repo has about 3000-4000 more changes of which
> all are merge changesets. Does anybody know a nice command to sum up
> the size of all merge changesets?
>
> Any other insights or comments? It would be especially interesting to
> know how this will evolve in the future.
>
> Regards,
> Volker
>
> PS: this also partially explains why downloading the new repo takes
> considerably longer compared to the old forest (the fact that the
> get_sources.sh script downloaded the forest in parallel being the
> second reason).