I have two Git repositories, R1 and R2, which contain commits from two periods of a product's development: 1995-1997 and 1999-2013.
(I created them by converting existing RCS and CVS repositories into Git.)
R1:
A---B---C---D
R2:
K---L---M---N
How can I combine the two repositories into a single one that contains an accurate view of the project's linear history?
A---B---C---D---K---L---M---N
Note that between R1 and R2, files have been added, deleted, and renamed.
I tried creating an empty repository and then merging their contents onto it.
git remote add R1 /vol/R1.git
git fetch R1
git remote add R2 /vol/R2.git
git fetch R2
git merge --strategy=recursive --strategy-option=theirs R1
git merge --strategy=recursive --strategy-option=theirs R2
However, this leaves behind files that were present in revision D but not in revision K.
I could craft a synthetic commit to remove the extra files between the merges,
but this seems inelegant to me.
Furthermore, with this approach the end result contains merges that
didn't actually occur.
Using git filter-branch
Using the trick straight from the git-filter-branch man page:
First, create a new repository with the two original ones as remotes, just as you did before. I am assuming that both use the branch name "master".
Next, point "master" (the current branch) to the tip of R2's "master".
Now we can graft the history of R1's "master" to the beginning.
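The three steps above can be sketched end to end as follows. This is a minimal sketch rather than a definitive recipe. Two throwaway repositories in a temporary directory stand in for /vol/R1.git and /vol/R2.git, and the `make_repo` helper, the file names, and the `combined` directory are made up for illustration. Both histories are assumed to use the branch name master. The `--parent-filter` expression is adapted from the example in the git-filter-branch man page, with `R1/master` as the graft point.

```shell
#!/bin/sh
set -e

# Throwaway identity so the demo commits work on any machine;
# the last variable silences filter-branch's startup warning on newer Git.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: create repository $1 with one commit per remaining argument.
make_repo() {
    name=$1; shift
    git init -q "$name"
    git -C "$name" symbolic-ref HEAD refs/heads/master
    for f in "$@"; do
        echo "$f" > "$name/$f"
        git -C "$name" add "$f"
        git -C "$name" commit -q -m "add $f"
    done
}
make_repo R1 a.txt b.txt    # stands in for /vol/R1.git
make_repo R2 k.txt l.txt    # stands in for /vol/R2.git

# Step 1: a new repository with both originals as remotes.
git init -q combined
cd combined
git symbolic-ref HEAD refs/heads/master
git remote add R1 "$work/R1"
git fetch -q R1
git remote add R2 "$work/R2"
git fetch -q R2

# Step 2: point master (the current branch) at the tip of R2's master.
git reset -q --hard R2/master

# Step 3: rewrite history so that the root commit of R2 (the only commit
# with an empty parent list) gets the tip of R1/master as its parent.
git filter-branch --parent-filter 'sed "s_^\$_-p R1/master_"' HEAD

git log --oneline
```

Because only parent pointers are rewritten, the tree at the new tip is identical to the tree at the tip of R2/master, so files that existed only in R1 do not reappear.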
In other words, we are inserting a fake parent link between D and K, so the new history looks like:
A---B---C---D---K---L---M---N
The only change to K through N is that K's parent pointer changes, and thus all of the SHA-1 identifiers change. The commit message, author, timestamp, etc., stay the same.
Merging more than two repositories together with filter-branch
If you have more than two repositories to combine, say R1 (oldest) through R5 (newest), just repeat the git reset and git filter-branch commands in chronological order.
Using grafts
As an alternative to using the --parent-filter option to filter-branch, you may instead use the grafts mechanism.
Consider the original situation of appending R2/master as a child of (that is, newer than) R1/master. As before, start by pointing the current branch (master) to the tip of R2/master.
Now, instead of running the filter-branch command with --parent-filter, create a "graft" (fake parent) in .git/info/grafts that links the "root" (oldest) commit of R2/master (K) to the tip (newest) commit in R1/master (D). (If there are multiple roots of R2/master, this links only one of them.)
At this point, you can look at your history (say, through gitk) to see if it looks right. If so, you can make the changes permanent by running git filter-branch.
Finally, you can clean everything up by removing the graft file.
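The graft-based procedure might look like the following sketch. As with any illustration, treat it as an assumption-laden demo rather than a definitive implementation: two throwaway repositories stand in for /vol/R1.git and /vol/R2.git, and the `make_repo` helper, file names, and `combined` directory are hypothetical. The root of R2/master is found with `git rev-list --max-parents=0`, taking a single root if there are several. Newer Git releases print a hint that the grafts file is deprecated in favor of `git replace --graft`, so this shows the older mechanism described here.

```shell
#!/bin/sh
set -e

# Throwaway identity so the demo commits work on any machine;
# the last variable silences filter-branch's startup warning on newer Git.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: create repository $1 with one commit per remaining argument.
make_repo() {
    name=$1; shift
    git init -q "$name"
    git -C "$name" symbolic-ref HEAD refs/heads/master
    for f in "$@"; do
        echo "$f" > "$name/$f"
        git -C "$name" add "$f"
        git -C "$name" commit -q -m "add $f"
    done
}
make_repo R1 a.txt b.txt    # stands in for /vol/R1.git
make_repo R2 k.txt l.txt    # stands in for /vol/R2.git

# New repository with both originals as remotes, master at the tip of R2/master.
git init -q combined
cd combined
git symbolic-ref HEAD refs/heads/master
git remote add R1 "$work/R1"
git fetch -q R1
git remote add R2 "$work/R2"
git fetch -q R2
git reset -q --hard R2/master

# Graft line format: "<commit> <parent>". Link the root of R2/master (K)
# to the tip of R1/master (D).
root=$(git rev-list --max-parents=0 R2/master | tail -n 1)
tip=$(git rev-parse R1/master)
mkdir -p .git/info
echo "$root $tip" >> .git/info/grafts

# Inspect the grafted (not yet permanent) history.
git log --oneline

# Make the graft permanent by rewriting K..N, then remove the graft file.
git filter-branch R1/master..HEAD
rm .git/info/grafts
```

Until `git filter-branch` runs, nothing has been rewritten, so deleting .git/info/grafts at the inspection step restores the original view of the history.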
Using grafts is likely more work than using --parent-filter, but it does have the advantage of being able to graft together more than two histories with a single filter-branch. (You could do the same with --parent-filter, but the script would become very ugly very fast.) It also has the advantage of allowing you to see your changes before they become permanent; if it looks bad, just delete the graft file to abort.
Merging more than two repositories together with grafts
To use the graft method with R1 (oldest) through R5 (newest), just add multiple lines to the graft file, one per join, each linking the root commit of one repository to the tip of the previous one. (The order of the lines does not matter.)
What about git rebase?
Several others have suggested using git rebase R1/master instead of the git filter-branch command above. This will take the diff between the empty commit and K and then try to apply it to D, resulting in:
A---B---C---D---K'---L'---M'---N'
This will most likely cause a merge conflict, and may even result in spurious files being created in K' if a file was deleted between D and K. The only case in which this will work is if the trees of D and K are identical.
(Another slight difference is that git rebase alters the committer information for K' through N', whereas git filter-branch does not.)