I plan to use LibGit2/LibGit2Sharp, and hence Git, in an unorthodox manner, and I am asking anyone familiar with the API to confirm that what I propose will work in theory. :)
Scenario
Only the master branch will exist in a repository. A large number of directories containing large binary and non-binary files will be tracked and committed. Most of the binary files will change between commits. The repository should contain no more than 10 commits due to disk space limitations (disk fills up quite often now).
What the API does not provide is a function that will truncate commit history starting at a specified CommitId back to the initial commit of the master branch and delete any Git objects that would be dangling as a result.
I have tested the ReferenceCollection.RewriteHistory method, and I can use it to remove the parents from a commit. This gives me a new commit history that starts at CommitId and runs up to HEAD. But that still leaves all of the old commits and any references or blobs that are unique to those commits. My plan right now is to simply clean up these dangling Git objects myself. Does anyone see any problems with this approach or have a better one?
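For reference, the parent-stripping rewrite described above can be sketched roughly as follows. This is only an illustration, assuming a LibGit2Sharp version in which RewriteHistory takes a RewriteHistoryOptions argument; the repository path and commit id are passed on the command line and are placeholders, not values from a real setup.

```csharp
using System.Linq;
using LibGit2Sharp;

class TruncateHistory
{
    // Usage (hypothetical): TruncateHistory <repoPath> <commitSha>
    static void Main(string[] args)
    {
        using (var repo = new Repository(args[0]))
        {
            // The commit that should become the new root of master.
            Commit newRoot = repo.Lookup<Commit>(args[1]);

            var options = new RewriteHistoryOptions
            {
                // Strip the parents of the chosen commit; keep every other commit's parents.
                CommitParentsRewriter = c =>
                    c.Id == newRoot.Id ? Enumerable.Empty<Commit>() : c.Parents
            };

            // Rewrite the commits reachable from HEAD. Rewritten references are
            // backed up (under refs/original by default).
            repo.Refs.RewriteHistory(options, repo.Head.Commits.ToList());
        }
    }
}
```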
While rewriting the history of the repository, LibGit2Sharp takes care not to discard the rewritten references. The namespace under which they are stored is, by default, refs/original. This can be changed through the RewriteHistoryOptions parameter.
In order to remove old commits, trees and blobs, one would first have to remove those references. This can be achieved by enumerating the references under that namespace and removing each one.
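A minimal sketch of that cleanup, assuming the default refs/original namespace; the repository path comes from the command line and is a placeholder:

```csharp
using System.Linq;
using LibGit2Sharp;

class RemoveBackupRefs
{
    // Usage (hypothetical): RemoveBackupRefs <repoPath>
    static void Main(string[] args)
    {
        using (var repo = new Repository(args[0]))
        {
            // Materialize the list first so that references are not removed
            // while they are still being enumerated.
            foreach (var reference in repo.Refs.FromGlob("refs/original/*").ToList())
            {
                repo.Refs.Remove(reference);
            }
        }
    }
}
```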
The next step would be to purge the now-dangling git objects. However, this cannot be done through LibGit2Sharp (yet). One option would be to shell out to git and let it garbage-collect the unreachable objects.
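One way to shell out from C# is sketched below. It assumes git is available on the PATH, and the particular commands and flags (reflog expire --expire=now --all followed by gc --prune=now --aggressive) are my own choice of an immediate, aggressive cleanup rather than something LibGit2Sharp prescribes. The repository path is a placeholder taken from the command line.

```csharp
using System;
using System.Diagnostics;

class PurgeDanglingObjects
{
    // Runs "git <arguments>" in the given directory and waits for it to finish.
    static void RunGit(string workingDirectory, string arguments)
    {
        var startInfo = new ProcessStartInfo("git", arguments)
        {
            WorkingDirectory = workingDirectory,
            UseShellExecute = false
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "git " + arguments + " exited with code " + process.ExitCode);
            }
        }
    }

    // Usage (hypothetical): PurgeDanglingObjects <repoPath>
    static void Main(string[] args)
    {
        string repoPath = args[0];

        // Drop reflog entries that may still point at the old commits...
        RunGit(repoPath, "reflog expire --expire=now --all");
        // ...then repack and prune every object that is no longer reachable.
        RunGit(repoPath, "gc --prune=now --aggressive");
    }
}
```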
This will reduce the size of your repository in a very effective, destructive and non-recoverable way.
Your approach looks valid.
Update
If the limit is the disk size, another option would be to use a tool like git-annex or git-bin to store large binary files outside of the git repository. See this SO question to get some different views on the subject and potential drawbacks (deployment, lock-in, ...).
Beware, this may be a bumpy road to go:
Files in the .git\objects folder are usually read-only. File.Delete can't remove them in this state. You'd have to unset the read-only attribute first with a call to File.SetAttributes(path, FileAttributes.Normal);, for instance.
Identifying the dangling Trees and Blobs may turn into quite a complex task.
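The read-only point can be handled with a small helper along these lines. This is a sketch using only System.IO; the class and method names (LooseObjectCleanup, ForceDelete) are made up for illustration, and it deals with a single file, not with pack files or with deciding which objects are safe to delete.

```csharp
using System.IO;

static class LooseObjectCleanup
{
    // Deletes a file even when it carries the read-only attribute.
    public static void ForceDelete(string path)
    {
        // Clear read-only (and any other attributes) so File.Delete does not throw.
        File.SetAttributes(path, FileAttributes.Normal);
        File.Delete(path);
    }
}
```

Calling LooseObjectCleanup.ForceDelete(path) on a read-only file removes it, whereas a plain File.Delete on the same file would throw an UnauthorizedAccessException.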