hg clone is unbelievably slow
Aleksey Shipilev
ashipile at redhat.com
Tue Feb 6 11:05:05 UTC 2018
On 02/06/2018 11:50 AM, Andrew Haley wrote:
> Half an hour or more here. AFAIK the problem is due to the
> inefficiency of Mercurial itself and the hg protocol.
The compounding factors are:
- hg.openjdk.java.net is too far from Europe, and the bandwidth-delay product kills TCP performance
with regular-sized buffers (which have to be tuned on both the client and the server side). This was
partially alleviated by forests, where you had several hg clones (hotspot, jdk, ...) running
concurrently;
- the Jigsaw and monorepo file moves inflated the repository size dramatically. See
https://builds.shipilev.net/workspaces/: 8u is 240 MB compressed, 9 is 420 MB compressed, 10 is 760
MB compressed!
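To put rough numbers on the bandwidth-delay point, here is a small sketch. The round-trip time, window size and link speed are assumed values chosen for illustration, not measurements of hg.openjdk.java.net, and the function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical figures for illustration only; none of these are measured.
RTT_MS=150            # assumed round-trip time, in milliseconds
WINDOW_BYTES=65536    # assumed TCP window (64 KiB, i.e. no window scaling)
LINK_MBIT=100         # assumed link speed, in Mbit/s

# Upper bound on single-connection throughput: one window per round trip.
max_throughput() {    # args: window_bytes rtt_ms; prints bytes per second
    echo $(( $1 * 1000 / $2 ))
}

# Bandwidth-delay product: bytes that must be in flight to keep the link full.
bdp_bytes() {         # args: link_mbit rtt_ms; prints bytes
    echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}

echo "window-limited throughput: $(max_throughput "$WINDOW_BYTES" "$RTT_MS") bytes/s"
echo "buffer needed to fill the link: $(bdp_bytes "$LINK_MBIT" "$RTT_MS") bytes"
```

Under these assumptions a single connection is capped at roughly 437 kB/s, while filling the link would need about 1.9 MB in flight. Each concurrent clone gets its own window, which is why several clones running at once make better use of the same link.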
> Aleksey Shipilev has done an experiment whereby trees are regularly
> cloned and compressed tarballs created; these can be downloaded in a
> couple of minutes. But really we don't want to depend on the largesse
> of one developer: if we could download the OpenJDK trees directly by
> means of wget (or something similar) we would reduce the load on the
> servers and reduce the time taken to download as well.
I second that. Also, we can use Mercurial and compression tricks to make the compressed archive
easier to download. Happy to share the script that makes the densest .tar.xz without going
full-crazy (maybe some other simple tricks are missing?):
function repack-jdk8 {
    URL=$1
    NAME=$2
    # Clone the top-level repository on the first run only
    if [ ! -d "$NAME" ]; then
        hg clone "$URL" "$NAME"
    fi
    cd "$NAME"
    hg pull
    hg update
    # Fetch the forest sub-repositories with denser delta storage
    HGFOREST_GLOBALOPTS=" --config=format.generaldelta=1 --config=format.aggressivemergedeltas=1" \
        sh common/bin/hgforest.sh clone
    sh common/bin/hgforest.sh pull
    # Empty the working copies, so the archive carries only the repository data
    sh common/bin/hgforest.sh update null
    hg update null
    cd ..
    # Cluster similar files together: sort by the last two directories and the
    # file name, then keep only the full path for tar
    find "$NAME/" -type f | awk -F '/' '{ print $(NF-2) " " $(NF-1) " " $(NF) " " $0; }' | sort |
        awk '{ print $4; }' > list.txt
    tar -T list.txt -c -f - | xz -6 > "$NAME.tar.xz"
    rsync "$NAME.tar.xz" builds@builds.shipilev.net:~/wwwroot/workspaces/
}
function repack-jdk10 {
    URL=$1
    NAME=$2
    # Clone on the first run only, with denser delta storage
    if [ ! -d "$NAME" ]; then
        hg --config=format.generaldelta=1 --config=format.aggressivemergedeltas=1 clone "$URL" "$NAME"
    fi
    cd "$NAME"
    hg pull
    # Empty the working copy, so the archive carries only the repository data
    hg update null
    cd ..
    # Cluster similar files together: sort by the last two directories and the
    # file name, then keep only the full path for tar
    find "$NAME/" -type f | awk -F '/' '{ print $(NF-2) " " $(NF-1) " " $(NF) " " $0; }' | sort |
        awk '{ print $4; }' > list.txt
    tar -T list.txt -c -f - | xz -6 > "$NAME.tar.xz"
    rsync "$NAME.tar.xz" builds@builds.shipilev.net:~/wwwroot/workspaces/
}
repack-jdk8 http://hg.openjdk.java.net/jdk8u/jdk8u/ jdk8u-jdk8u
repack-jdk10 http://hg.openjdk.java.net/jdk/jdk jdk-jdk
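The "cluster similar files together" step in the script can be tried in isolation. The sketch below wraps the same awk/sort/awk pipeline in a helper (cluster_paths is a name made up for this example) and feeds it a few invented paths. Real Mercurial store file names are encoded differently, and LC_ALL=C is added here only to make the sort order predictable.

```shell
#!/bin/sh
# Prefix each path with a sort key made of its last two directories and its
# file name, sort on that key, then strip the key again.
cluster_paths() {
    awk -F '/' '{ print $(NF-2) " " $(NF-1) " " $NF " " $0 }' |
        LC_ALL=C sort |
        awk '{ print $4 }'
}

# Invented, simplified paths: the same files before and after a directory move.
printf '%s\n' \
    'repo/.hg/store/data/src/share/classes/java/lang/String.java.i' \
    'repo/.hg/store/data/src/share/classes/java/util/List.java.i' \
    'repo/.hg/store/data/src/java.base/share/classes/java/lang/String.java.i' \
    'repo/.hg/store/data/src/java.base/share/classes/java/util/List.java.i' |
    cluster_paths
```

Both String.java.i entries come out next to each other, followed by both List.java.i entries, so files that are likely to share content end up adjacent in the tar stream that xz compresses. As in the script, the final awk keeps only the fourth whitespace-separated field, so paths containing spaces would be cut short.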
-Aleksey