Let's call my-dirty-repository an existing Git repository containing many unrelated scripts. It is a catch-all repository that needs to be properly cleaned up.
As a Minimal, Complete, and Verifiable example, let's say this repository only contains:
- script1.sh
- script2.sh
Various commits have updated these files independently, across several branches.
The aim is to create two fully independent Git repositories, each containing ONLY the history of the files it keeps (references).
Let's call them my-clean-repository1 and my-clean-repository2, the first one having only the history of script1, and the second having only the history of script2.
I tried three approaches, without success:
- Simple clone + git rm to remove unwanted references
- Sparse Checkout, which is not suited to this at all
- Shallow Clone
I'm pretty sure there is a way to do this properly.
Edit: I created a dedicated tool, cloneToCleanGitRepositories, to address this need.
It is a more complete version of the older script described below.
@mkasberg thank you for your advice about interactive rebase, which is very useful when the history is simple.
I tried it, and it solved my issue for some of the scripts for which I wanted a clean, dedicated, independent Git repository.
In the end, it was not enough for most of them, so I tried another solution based on Git's filtering system.
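To illustrate that kind of filtering, here is a minimal sketch. It assumes git filter-branch is the filtering mechanism in question (the post does not say which one was used), and the function name keep_only_file is hypothetical. It rewrites the repository in the current directory so that only one path survives, so it should only be run in a throwaway clone.

```shell
#!/bin/sh
# Sketch only: rewrite all refs of the repository in the current directory
# so that a single path (first argument) is the only file left in history.
# Assumption: the path contains no single quotes.
keep_only_file() {
    keep_path="$1"
    # For each commit: empty the index, then restore only the kept path
    # from the original commit. --prune-empty drops commits that no longer
    # change anything once the other files are gone.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --prune-empty \
        --index-filter "git rm --cached -q -r --ignore-unmatch -- . && git reset -q \$GIT_COMMIT -- '$keep_path'" \
        -- --all
}
```

For example, running keep_only_file script1.sh inside a fresh clone of my-dirty-repository should leave only the commits that touched script1.sh.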
Finally, I wrote this little script:
Usage :
In the destination directory, it creates a new, clean Git repository for EACH file in the root directory of the specified source Git repository. In each one, the history is clean and relates only to the kept script.
In addition, it removes the remote, to avoid accidentally pushing the changes back to the source repository.
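The behaviour described above could be sketched roughly as follows. This is not the original script, only an approximation under stated assumptions: it uses git filter-branch, the function name split_root_files and the choice to name each new repository after its file are hypothetical, and file names are assumed to contain no single quotes. Only the branch checked out by the clone remains as a local branch once the remote is removed; other branches would need local branches created before filtering.

```shell
#!/bin/sh
# Sketch only: create one filtered clone per file found in the root
# directory of the source repository (arguments: source repo, destination).
split_root_files() {
    source_repo="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    # List root-level entries at HEAD and keep only regular files (blobs).
    git -C "$source_repo" ls-tree HEAD |
        awk -F'\t' '$1 ~ / blob / { print $2 }' |
        while IFS= read -r name; do
            target="$dest_dir/$name"
            (
                git clone -q "$source_repo" "$target" &&
                cd "$target" &&
                # Keep only the history of this one file.
                FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --prune-empty \
                    --index-filter "git rm --cached -q -r --ignore-unmatch -- . && git reset -q \$GIT_COMMIT -- '$name'" \
                    -- --all &&
                # Disconnect from the source so nothing is pushed back to it.
                git remote remove origin
            ) </dev/null
        done
}
```

A call such as split_root_files /path/to/my-dirty-repository /path/to/clean would then produce /path/to/clean/script1.sh and /path/to/clean/script2.sh, each a standalone repository.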
This way, it is easy to 'migrate' from a big, dirty catch-all Git repository to several clean ones :-)