git subtree while keeping history


I have multiple git repositories that I want to combine under one single git repository while keeping all of their commit histories, but do that gradually so I can still pull changes before completely abandoning those old repositories. I am able to use git subtree add to add those but the history on individual files is lost.

I want my combined repository to have the following structure:

.
└── src
   └── stores
       ├── storeA
       │   ├── README.md
       │   ├── package.json
       │   ├── src
       │   │   ├── index.ts
       │   │   ├── models
       │   │   │   └── index.ts
       │   │   └── stores
       │   │       ├── FileA.ts
       │   │       └── FileB.ts
       │   └── yarn.lock
       └── storeB
           ├── README.md
           ├── package.json
           ├── src
           │   ├── index.ts
           │   ├── models
           │   │   └── index.ts
           │   └── stores
           │       ├── FileA.ts
           │       └── FileB.ts
           └── yarn.lock

I start by going into the combined git repository and running

git subtree add --message="old repo - storeA" --prefix src/stores/storeA [email protected]/storeA main

This brings the old repository in where I want it, along with all of its commits:

// git log --graph --oneline
*   8e276e5 (HEAD -> main) old repo - storeA
|\
| * 2b59054 fix 2
| * 70ede5a feat 2
| * 57041d5 fix 1
| * de39054 feat 1
| * 85d213b Added README.md
* 49942fb init

But when I try to get the history of a specific file I only see one commit and the rest is lost:

// git log --graph --oneline src/stores/storeA/src/stores/FileA.ts
* 8e276e5 (HEAD -> main) old repo - storeA

although I know that there are more commits for this file in the old git repository. I suspect it has something to do with the paths because git blame shows the different commits and git show 57041d5 (one of the commits from the old repository) shows

diff --git a/src/stores/FileA.ts b/src/stores/FileA.ts
--- a/src/stores/FileA.ts
+++ b/src/stores/FileA.ts

but these are the old paths.
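The old paths also explain the short log: the imported commits only ever touched src/stores/FileA.ts, never the prefixed path, so a path-limited log on the new location finds just the merge commit. Here is a minimal sketch on a throwaway repository. It assumes the merge has the old repository's tip as its second parent (as the graph above suggests), and it mimics that shape with core git commands (merge -s ours plus read-tree --prefix) instead of calling git subtree itself. The repository names, file contents and demo identity are made up for illustration.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the old storeA repository: two commits touching FileA.ts.
git init -q storeA
cd storeA
git symbolic-ref HEAD refs/heads/main
mkdir -p src/stores
echo "export const a = 1" > src/stores/FileA.ts
git add .
git commit -q -m "feat 1"
echo "export const b = 2" >> src/stores/FileA.ts
git commit -q -am "fix 1"
cd ..

# Stand-in for the combined repository.
git init -q combined
cd combined
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "init"
git fetch -q ../storeA main

# Build a merge commit whose second parent is the old tip and whose tree
# holds the old files under the prefix (assumed to match a non-squashed subtree add).
git merge -q -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
git read-tree --prefix=src/stores/storeA/ -u FETCH_HEAD
git commit -q -m "old repo - storeA"

# Count commits seen under the new path, then under the old path on the second parent.
new_path=$(git rev-list --count HEAD -- src/stores/storeA/src/stores/FileA.ts)
old_path=$(git rev-list --count HEAD^2 -- src/stores/FileA.ts)
echo "new path: $new_path commit(s); old path via HEAD^2: $old_path commit(s)"
```

On the real repository the equivalent query would be something like git log --oneline 8e276e5^2 -- src/stores/FileA.ts, which walks the imported side of the merge using the path those commits actually recorded.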

I have tried using git-filter-repo --path-rename src/stores/:src/stores/storeA/src/stores/FileA.ts and it seems to work, although it rewrites the history and only works for one (the first) repository. If I run it again on the other git repositories, which have the same structure, it will mess up all the paths.

Is there any alternative or have I misused the git-subtree command in any way?

Answered by eftshift0

I have never had to integrate stuff like this but, at least in my head without ever having done it, I think I would not do anything fancy. If the objective is to have everything in a single place without losing any history, I would just "merge" stuff into a single project.

Let's suppose that you have project A and project B, which live in 2 separate GitHub repos. I intend to put them together into a single project where what is currently the root of each project ends up in its own directory: projectA and projectB in the final project. So:

git init final-project
cd final-project
git remote add remoteA url-to-projectA
git remote add remoteB url-to-projectB
git fetch --all
# now I can see both projects and their branches
git checkout -b masterProjectA remoteA/master
# let's move stuff into projectA directory
mkdir projectA
git mv dirA dirB dirC fileA fileB fileC projectA/
git commit -m "move projectA stuff into projectA dir"
git checkout -b masterProjectB remoteB/master
# same trick
mkdir projectB
git mv dirD dirE dirF fileD fileE fileF projectB/
git commit -m "move projectB stuff into projectB dir"

Now we have 2 local branches in which everything from project A and project B has been moved into its own non-colliding directory. Now, let's merge them:

git checkout -b master masterProjectA
git merge --allow-unrelated-histories -m "Merging project B into this big project" masterProjectB

And now you have your 2 projects (well, their main branches) merged into a single project while keeping both their histories. If the projects continue to be developed separately, you will still be able to merge stuff into this big project. I'm not saying it is bulletproof, by the way; there are a lot of corners and rough edges (new files? Files that change a lot?).
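To try the whole flow without touching real repositories, here is a minimal sketch on throwaway ones. The directory names projectA and projectB follow the steps above; the repository names, file names, contents, the branch name main and the demo identity are assumptions made for illustration. It checks two things: that git log --follow can trace a file back across the move commit, and that a later change in one of the old repositories still merges into the moved file.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git config (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Create a small stand-alone repository with two commits touching one file.
make_repo() {  # $1 = repository directory, $2 = file name
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/main
  printf 'line 1\nline 2\nline 3\nline 4\n' > "$1/$2"
  git -C "$1" add "$2"
  git -C "$1" commit -q -m "add $2"
  echo 'line 5' >> "$1/$2"
  git -C "$1" commit -q -am "extend $2"
}
make_repo repoA fileA.txt
make_repo repoB fileB.txt

git init -q final-project
cd final-project
git remote add remoteA "$work/repoA"
git remote add remoteB "$work/repoB"
git fetch -q --all

# Move each project into its own directory on a dedicated branch.
git checkout -q -b mainProjectA remoteA/main
mkdir projectA
git mv fileA.txt projectA/
git commit -q -m "move projectA stuff into projectA dir"

git checkout -q -b mainProjectB remoteB/main
mkdir projectB
git mv fileB.txt projectB/
git commit -q -m "move projectB stuff into projectB dir"

# Merge the two unrelated histories into one branch.
git checkout -q -b main mainProjectA
git merge -q --allow-unrelated-histories -m "Merging project B into this big project" mainProjectB

# Count the commits that --follow reports for each moved file.
followed_a=$(git log --follow --format=%h -- projectA/fileA.txt | wc -l | tr -d ' ')
followed_b=$(git log --follow --format=%h -- projectB/fileB.txt | wc -l | tr -d ' ')
echo "projectA/fileA.txt: $followed_a commits; projectB/fileB.txt: $followed_b commits"

# Later on, the old repository gains a commit; merge it into the combined project.
echo 'line 6' >> "$work/repoA/fileA.txt"
git -C "$work/repoA" commit -q -am "later change in repoA"
git fetch -q remoteA
git merge -q -m "Merge newer projectA changes" remoteA/main
later_ok=$(grep -c 'line 6' projectA/fileA.txt)
echo "later change present in projectA/fileA.txt: $later_ok"
```

The later merge relies on git's rename detection to map the edit of fileA.txt onto projectA/fileA.txt, which works here because the file stays mostly unchanged. Note that --follow tracks a single path heuristically, so results on real repositories (for example ones that share file names) may be less tidy than in this sketch.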