Clone git repo bringing as few commits as possible for merge-base


Scenario

I have a really large repository that takes ~10 minutes to clone.

In a CI job I need to:

  • fetch 2 branches.
  • compute the merge-base (lowest common ancestor).
  • diff the merge-base commit against one of the branches, then do some further processing based on that diff.
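For concreteness, the last two steps boil down to something like the sketch below. The function name `changed_since_fork` is hypothetical, and `--name-only` is just one possible diff format; it assumes both refs already exist locally with enough history.

```shell
# Hypothetical helper: list files changed on the topic ref since it
# forked from the base ref. Assumes both refs exist in the local repo.
changed_since_fork() {
    base_ref=$1
    topic_ref=$2
    # git merge-base prints the best common ancestor of the two refs and
    # exits non-zero when no common ancestor is reachable locally.
    base=$(git merge-base "$base_ref" "$topic_ref") || return 1
    # Diff the merge-base commit against the tip of the topic ref.
    git diff --name-only "$base" "$topic_ref"
}
```

Called as `changed_since_fork origin/branch1 origin/branch2`, it prints one changed path per line, or returns non-zero if the local history is too shallow to contain a common ancestor.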

Currently I do a plain git clone which, as expected, takes up most of the job's time.

Is it possible to speed up the process by using partial clones?

Assumptions:

  • one of the branches can be safely assumed to be at most 100 commits away from the merge-base commit.
  • the other branch could be further away (thousands of commits) from the merge-base.

What I tried

I experimented with a manual shallow fetch combined with sparse checkout, which seems to yield good results when fetching the branches:

git init
git remote add origin "repo_url"
git sparse-checkout set "directory_i_am_interested_in"
git fetch origin branch1 --depth 1
git fetch origin branch2 --depth 1

This takes ~1 minute in total. The trouble is that merge-base fails: my clone contains only two commits (the tip of each branch), which is not enough history to find a common ancestor.

I figured I could then run git fetch origin branchx --deepen n (choosing n based on the assumptions above) to fetch the next n commits from each branch, repeating until merge-base returns a result.

However, this ends up taking even more time than the simple git clone.
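A minimal sketch of that deepening loop, for reference. The function name, the default step of 50, and the round limit are hypothetical; it assumes the remote is named origin and that both branches were already fetched shallowly, so that origin/<branch> exists.

```shell
# Hypothetical helper: deepen two shallow branches step by step until
# git merge-base can find a common ancestor, then print it.
# Assumes a remote named "origin" with both branches already fetched.
find_merge_base_by_deepening() {
    branch_a=$1
    branch_b=$2
    step=${3:-50}         # commits to add per round (assumed default)
    max_rounds=${4:-20}   # give up after this many rounds (assumed default)
    round=0
    while [ "$round" -lt "$max_rounds" ]; do
        # Succeeds only once a common ancestor is present locally.
        if base=$(git merge-base "origin/$branch_a" "origin/$branch_b" 2>/dev/null); then
            echo "$base"
            return 0
        fi
        # Extend the shallow boundary of both branches by $step commits.
        git fetch --quiet --deepen="$step" origin "$branch_a" "$branch_b" || return 1
        round=$((round + 1))
    done
    return 1
}
```

For example, `find_merge_base_by_deepening branch1 branch2 100` deepens both branches by 100 commits per round. Each round is a separate negotiation with the server, which may be part of why many small steps end up slow. Note also that in histories containing merges, a shallow graph may not always yield the same merge-base as a full clone.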

Is there something I'm missing here that could be further optimized? Or is there another way to achieve the same thing more efficiently?
