Cloning the Linux kernel repository takes time. We don't need the full commit history for our work, but we do need multiple branches. Here are some timings for the various clone options.
First, the full clone
time git clone git@gitlab.com:AmpereComputing/linux/linux.git
Cloning into 'linux'...
...
Updating files: 100% (81108/81108), done.
real 7m56.985s
user 22m41.394s
sys 6m26.205s
Now the shallow clone. For us, this is actually the wrong branch, but there should be significant overlap.
time git clone --depth=1 git@gitlab.com:AmpereComputing/linux/linux.git
Cloning into 'linux'...
...
Updating files: 100% (81108/81108), done.
real 0m55.075s
user 0m27.710s
sys 0m7.393s
Now a blobless clone
time git clone --filter=blob:none git@gitlab.com:AmpereComputing/linux/linux.git
Cloning into 'linux'...
...
Updating files: 100% (81108/81108), done.
real 3m43.477s
user 5m37.896s
sys 2m35.509s
Now a treeless clone
time git clone --filter=tree:0 git@gitlab.com:AmpereComputing/linux/linux.git
Cloning into 'linux'...
...
Updating files: 100% (81108/81108), done.
real 1m53.469s
user 1m5.809s
sys 1m1.048s
Combining treeless and shallow?
time git clone --depth=1 --filter=tree:0 git@gitlab.com:AmpereComputing/linux/linux.git
Cloning into 'linux'...
...
Updating files: 100% (81108/81108), done.
real 1m11.402s
user 0m31.235s
sys 0m6.813s
What about a shallow clone since a certain tag:
time git clone --shallow-exclude=v6.10 git@gitlab.com:AmpereComputing/linux/linux.git
Cloning into 'linux'...
...
Updating files: 100% (81108/81108), done.
real 0m56.481s
user 0m27.536s
sys 0m7.306s
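To compare the runs above on one scale, the real (wall-clock) times can be converted to seconds. This is a minimal sketch, assuming a POSIX shell with awk available; to_seconds is a hypothetical helper name used only for illustration, and the durations are copied from the output above.

```shell
# Convert a `time`-style duration such as 7m56.985s into seconds.
# to_seconds is a hypothetical helper, not part of git or time.
to_seconds() {
    printf '%s\n' "$1" |
        awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Real times copied from the clone runs above, as label=duration pairs.
for entry in full=7m56.985s shallow=0m55.075s blobless=3m43.477s \
             treeless=1m53.469s treeless+shallow=1m11.402s \
             since-v6.10=0m56.481s; do
    printf '%-18s %10s s\n' "${entry%%=*}" "$(to_seconds "${entry#*=}")"
done
```

Only the wall-clock time is compared here; the user and sys figures count CPU time summed across threads, which is why they can exceed the real time.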
We have a long build process that makes extensive use of cherry-picking from branches. It starts with a git clone of the Linux kernel, and I want to see how the different cloning options affect the total build time.
Because a shallow clone truncates the commit history, and --depth implies --single-branch, it does not contain the commits we need for the cherry-picking, so I will restrict my testing to the full, blobless, and treeless variants.
Full clone:
real 61m17.366s
user 507m48.236s
sys 171m28.766s
Treeless:
real 69m8.975s
user 492m34.034s
sys 157m14.231s
Treeless was actually slower than the full clone, by nearly eight minutes. Here is blobless:
real 52m9.143s
user 486m23.953s
sys 163m25.927s
Nine minutes faster than the full clone. This makes sense: once the base tree is synchronized, the only additional blobs it needs to fetch are the ones specific to each topic branch, which limits the extra communication to one stream of blobs per topic branch. It does not need to download the older blobs, which I assume account for the additional cost of the full clone, yet it already has the tree information it needs for the cherry-pick metadata operations.
For our purposes, we are going to go with the blob-less option.
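As a rough illustration of the blobless workflow, here is a minimal sketch of a clone followed by cherry-picking one topic branch. It is not our actual build script: the function name clone_and_pick, its arguments, and the work branch name integration are all hypothetical, and it assumes the topic branch exists on the remote as origin/<topic>.

```shell
# Blobless clone, then replay one topic branch onto a work branch.
# clone_and_pick, its arguments, and the "integration" branch name are
# hypothetical; the real build process is more involved.
clone_and_pick() {
    url=$1    # repository URL
    dir=$2    # target directory
    base=$3   # tag or ref that the work branch starts from
    topic=$4  # name of a topic branch on the remote

    # Fetch all commits and trees; blobs are downloaded on demand.
    git clone --filter=blob:none "$url" "$dir" || return 1

    # Create the work branch at the base ref.
    git -C "$dir" checkout -b integration "$base" || return 1

    # Cherry-pick the commits the topic branch adds on top of the base.
    git -C "$dir" cherry-pick "$base..origin/$topic"
}
```

A call would look like clone_and_pick <url> linux v6.10 <topic-branch>, repeating the cherry-pick step for each further topic branch. Any blob not already present locally is fetched from the remote when checkout or cherry-pick first needs it.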