Consolidated repo vs. old forest size differences
    joe darcy
    joe.darcy at oracle.com
    Thu Sep 28 22:48:54 UTC 2017

Hi Volker,
When a file is moved in Hg, Hg starts a fresh snapshot of the file.
With the reorganized source structure, which has a single top-level src
subdirectory instead of jdk/src, langtools/src, etc., all the source
files were moved.
This accounts for much of the greater size post-consolidation. We're
looking into ways to mitigate the impact of the larger size.
Thanks,
-Joe
On 9/28/2017 3:15 PM, Volker Simonis wrote:
> Hi,
>
> not sure if this has been discussed before but at least I couldn't
> find any references in the previous mail threads on the repo
> consolidation.
>
> I've just realized that the size of the repository history (i.e.
> everything under .hg) has doubled in the new consolidated repo (about
> 800 MB vs. 1600 MB) and I don't exactly understand why:
>
> $ du -shc jdk10-hs-old/*/.hg jdk10-hs-old/.hg
> 16M    jdk10-hs-old/corba/.hg
> 141M    jdk10-hs-old/hotspot/.hg
> 49M    jdk10-hs-old/jaxp/.hg
> 57M    jdk10-hs-old/jaxws/.hg
> 453M    jdk10-hs-old/jdk/.hg
> 76M    jdk10-hs-old/langtools/.hg
> 33M    jdk10-hs-old/nashorn/.hg
> 8,1M    jdk10-hs-old/.hg
> 829M    total
>
> $ du -sh jdk10-hs/.hg
> 1,6G    jdk10-hs/.hg
>
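
As a rough cross-check of the "doubled" figure, the sizes in the du
listings above can be summed and compared. This is only a sketch: it
assumes the M and G suffixes are du's 1024-based units and that "1,6G"
means 1.6 GiB. The variable names are made up for illustration.

```shell
# Sizes in MiB, copied from the du listing above; "8,1M" is written as 8.1.
old_sizes="16 141 49 57 453 76 33 8.1"

# Assumption: "1,6G" means 1.6 GiB, i.e. 1.6 * 1024 MiB.
new_size_mib=$(LC_ALL=C awk 'BEGIN { printf "%.1f", 1.6 * 1024 }')

# Sum the per-repository sizes of the old forest.
old_sum_mib=$(echo "$old_sizes" | tr ' ' '\n' | LC_ALL=C awk '{ s += $1 } END { printf "%.1f", s }')

# Ratio of the consolidated .hg size to the summed forest size.
ratio=$(LC_ALL=C awk -v n="$new_size_mib" -v o="$old_sum_mib" 'BEGIN { printf "%.2f", n / o }')

echo "old forest: ${old_sum_mib} MiB, consolidated: ${new_size_mib} MiB, ratio: ${ratio}"
```

The per-directory figures are individually rounded by du, so their sum
differs slightly from the 829M total it prints. Either way the ratio
comes out close to 2.
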
> I wonder why this is the case?
>
> Is this because the consolidated repo has more and bigger merge changes?
>
> The consolidated repo has a total of 47297 changes with about 13878
> merge changes:
>
> $ hg -R jdk10-hs log --template "{rev}\n" -r tip
> 47297
> $ hg -R jdk10-hs log --template "{rev}\n" -k Merge | wc
>    13878   13878   79600
>
> The old forest had a total of 43102 changes with about 10408 merge changes:
>
> $ bash common/bin/hgforest.sh log --template "{rev}\n" | wc
>    43102   86285 1295798
> $ bash common/bin/hgforest.sh log -k "Merge" --template "{rev}\n" | wc
>    10408   20897  312491
>
> So the new consolidated repo has about 3000-4000 more changes, nearly
> all of which are merge changesets. Does anybody know a nice command to
> sum up the size of all merge changesets?
>
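
The -k option used above matches the keyword case-insensitively in the
changeset description, user name and file names, so it only
approximates the number of merges. A minimal sketch of a more direct
count, assuming the merge() revset predicate is available in the
installed Mercurial; count_merges is a hypothetical helper name, not
something from the thread, and it counts merge changesets rather than
measuring their size:

```shell
# Hypothetical helper: count merge changesets in one repository.
# merge() is a Mercurial revset predicate that selects merge changesets,
# so the count does not depend on the commit message containing "Merge".
count_merges() {
    hg -R "$1" log -r 'merge()' --template '{rev}\n' | wc -l | tr -d ' '
}

# Example, using the directory name from the commands above:
#   count_merges jdk10-hs
```

For the old forest the same function could be run once per repository
and the results added up.
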
> Any other insights or comments? It would be especially interesting to
> know how this will evolve in the future.
>
> Regards,
> Volker
>
> PS: this also partially explains why downloading the new repo takes
> considerably longer compared to the old forest (the fact that the
> get_sources.sh script downloaded the forest in parallel being the
> second reason).

More information about the jdk10-dev mailing list