Issues Cleaning Up a Git Repository


I have a private remote Git repository that I have used for a project for years. During some testing and bug fixing, I inadvertently uploaded some very large files (MySQL dumps, each over 100 MB) and two directories of test files (8 GB total). That happened a few commits ago, not in the current commit, and I can no longer clone or update the repository. Once I realized the problem, I added entries to .gitignore (*.sql, MemorabiliaJSON/documents, designs/).
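For context, .gitignore only affects files that are not yet tracked, so paths that were already committed stay in the index until they are removed from it. A minimal sketch of untracking them, assuming it is run from the top of the working tree (the function name `untrack_ignored` and the commit message are illustrative, not part of my actual setup):

```shell
# Sketch: stop tracking paths that are now listed in .gitignore.
# Assumes the current directory is the top of the working tree.
untrack_ignored() {
    # --cached removes the paths from the index only; the files stay on disk.
    # --ignore-unmatch avoids an error if one of the paths is not tracked.
    # '*.sql' is quoted so that git, not the shell, expands the pattern.
    git rm -r --cached --ignore-unmatch -- '*.sql' MemorabiliaJSON/documents designs/
    git commit -m "Stop tracking SQL dumps and test directories"
}
```

This only affects the new commit. The large blobs remain in earlier commits, which is why a history rewrite is still needed.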

I looked into how to remove these two directories and the dump files. The BFG seemed promising, but I can't even get past step 1:

$ git clone --mirror <path to repo>memorabilia-JSON.git
Cloning into bare repository 'memorabilia-JSON.git'...
remote: Enumerating objects: 15031, done.
remote: Counting objects: 100% (15031/15031), done.
error: pack-objects died of signal 9
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed

git-filter-branch is another option, but it seems more error-prone.
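Before choosing either tool, it may help to confirm which blobs in the history are actually large. The following is a sketch using only stock git plumbing, run inside a copy of the repository that is readable locally. The function name `list_large_blobs` and the default threshold are assumptions for illustration:

```shell
# Sketch: list blobs anywhere in history that are at least MIN_BYTES in size,
# largest first. Output format: "<size in bytes> <path>".
list_large_blobs() {
    min_bytes="${1:-104857600}"   # assumed default: 100 MiB
    # rev-list prints "<object id> <path>"; cat-file adds type and size,
    # and %(rest) carries the path through unchanged.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$min_bytes" '
            $1 == "blob" && $2 >= min {
                size = $2
                sub(/^[^ ]+ [^ ]+ /, "")   # strip type and size, keep the path
                print size, $0
            }' |
        sort -rn
}
```

A blob is listed even if the file was deleted in a later commit, because the object is still reachable from the older commits.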

Any idea on how I can remove these large directories and the *.sql files? Is there something else going on that is causing this problem?

Thanks!

Mark

UPDATE: Taking Kay's advice, I made a mirror clone of the remote repository on the machine that hosts it. There, I used BFG and git-statistics to trim the mirrored repository from 8 GB to 145 MB. I then compared the git log of the cleaned mirror with that of the repository on my local machine (not the remote machine), and found that the last 4 commits in my local repo are not in the cleaned mirror. So I am at a crossroads.
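For reference, the cleanup on the mirror looked roughly like the sketch below. It assumes the BFG jar has already been downloaded and that `java` is on the PATH. The function name `clean_mirror` and its arguments are illustrative. Note that `--delete-folders` matches folder names rather than full paths, so MemorabiliaJSON/documents is left out here: passing `documents` would remove every folder with that name.

```shell
# Sketch: strip SQL dumps and the designs folder from a bare mirror clone.
# $1 = path to the BFG jar (assumed to exist), $2 = path to the mirror.
clean_mirror() {
    bfg_jar="$1"
    mirror="$2"
    # Rewrite history; the glob is quoted so BFG, not the shell, expands it.
    java -jar "$bfg_jar" --delete-files '*.sql' --delete-folders designs "$mirror" || return 1
    (
        cd "$mirror" || exit 1
        # BFG rewrites commits but leaves the old objects in place;
        # expire the reflog and garbage-collect to actually free the space.
        git reflog expire --expire=now --all
        git gc --prune=now --aggressive
    )
}
```

As far as I understand, BFG protects the contents of the latest commit by default, so files still present in HEAD have to be removed in a normal commit first.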

Option 1: Push the cleaned mirror to the remote repo, so the remote is fixed. But I think that if I then push my local repo to the updated remote, all the big files will be included again (since they are still in my local repo's history), and I am back where I started.

Option 2: Start over. Mirror my local repo, clean it using the same process I used for the mirrored remote repo, and push the cleaned mirror back to my local repo, so that the local repo is clean and still has the last 4 commits. Then push my local repo to the remote to replace it with the cleaned local repo, so that the remote is clean and also has the 4 missing commits.

Does Option 2 sound like the best course of action? I am not a git expert, so I am a bit uncertain about how this all works, and worried I may lose all 138 commits in my local repo!
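To guard against losing commits whichever option I take, I could first save the whole local repository to a single file. This is a sketch using `git bundle`; the function name `backup_repo` and the example paths are hypothetical, and the bundle path should be absolute because the commands run from inside the repository:

```shell
# Sketch: write every ref and its history to one bundle file, then verify it.
# $1 = path to the repository, $2 = absolute path for the bundle file.
backup_repo() {
    repo="$1"
    bundle="$2"
    # --all includes every branch and tag, not just the current branch.
    git -C "$repo" bundle create "$bundle" --all &&
        # verify checks that the bundle is valid and complete.
        git -C "$repo" bundle verify "$bundle"
}
```

The bundle can be cloned like any other repository (`git clone /path/to/backup.bundle restored`), and `git rev-list --count --all` in the clone should then report the same number of commits as the original.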

Thanks,

Mark
