filter-branch works only after running it twice

To remove build results from our repository, I ran git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch */obj/*' -- --all. It appeared to work.

Oddly, running git log --oneline --name-only --all -- */obj/* displayed a list of commits that still contain the obj directory.

So, I ran filter-branch a second time. Git told me:

Rewrite d57c56e00f713854d8b5889a259e10bd9be6a83c (316/316)
WARNING: Ref 'refs/heads/master' is unchanged
WARNING: Ref 'refs/remotes/origin/master' is unchanged
WARNING: Ref 'refs/remotes/origin/carousel' is unchanged
WARNING: Ref 'refs/remotes/origin/master' is unchanged

Now when I run the log command again, it prints nothing: the obj directory is no longer in the history. That's good, of course, but the question remains: why did I need to run filter-branch twice?

Best answer

Presumably the commits that showed up after the first filter-branch were still reachable through refs/original/refs/heads/master and the like. In git log, --all means every reference under refs/, and filter-branch saves (backs up, as it were) the original references under refs/original/ (or whatever namespace you choose with --original). You can use those to put things back if the filter-branch operation made a mess.
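One way to check this is to list the backup refs and compare a log over all refs with a log over local branches only. The following is a minimal sketch in a throwaway repository, not the asker's setup: the paths proj/main.c and proj/obj/a.o, the commit message, and the demo identity are made up for illustration, and it assumes a git installation that still ships filter-branch.

```shell
#!/bin/sh
# Sketch: after ONE filter-branch run, "git log --all" still reaches the
# old commits through the backup refs under refs/original/.
# Assumptions: throwaway repo, hypothetical paths and commit message.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning on newer git
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# One commit containing both a source file and a build artifact
mkdir -p proj/obj
echo src > proj/main.c
echo bin > proj/obj/a.o
git add .
git commit -q -m "add source and build output"

# Same filter as in the question, run once
git filter-branch -f --index-filter \
    'git rm --cached --ignore-unmatch "*/obj/*"' -- --all >/dev/null 2>&1

# Backup refs that filter-branch wrote
backup_refs=$(git for-each-ref --format='%(refname)' refs/original/)
# Commits touching obj/, seen through every ref (refs/original/ included)
all_hits=$(git log --oneline --all -- '*/obj/*' | wc -l)
# The same query restricted to local branches
branch_hits=$(git log --oneline --branches -- '*/obj/*' | wc -l)

echo "backup refs: $backup_refs"
echo "hits via --all: $all_hits, via --branches: $branch_hits"
```

In this sketch the --all query is expected to still find the original commit, while the --branches query finds nothing, which matches the behaviour described in the question.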

With -f (force), filter-branch will run even if there's already a backup branch name-space: it removes the old backup to make room for a new one. Without -f it instead does this:

die "Cannot create a new backup.
A previous backup already exists in $orig_namespace
Force overwriting the backup with -f"

By removing the old backup, the second filter-branch operation deleted all the refs/original/ entries, which is something you are normally supposed to do manually once you're happy with the result of the filter. Since re-applying the filter changed nothing (hence the "is unchanged" warnings), no new backup refs were written, so the second filter-branch effectively took care of the cleanup for you.
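The manual cleanup can be sketched as follows, again in a throwaway repository with made-up paths (proj/main.c, proj/obj/a.o) and a demo identity. It deletes every ref under refs/original/ with git update-ref -d instead of running the filter a second time.

```shell
#!/bin/sh
# Sketch: drop filter-branch's backup refs by hand once the result looks right.
# Assumptions: throwaway repo, hypothetical paths and commit message.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning on newer git
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p proj/obj
echo src > proj/main.c
echo bin > proj/obj/a.o
git add .
git commit -q -m "add source and build output"

# Single filter-branch run; this leaves backups under refs/original/
git filter-branch -f --index-filter \
    'git rm --cached --ignore-unmatch "*/obj/*"' -- --all >/dev/null 2>&1

# Delete every ref in the backup namespace
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done

# Nothing should be left under refs/original/, and --all no longer sees obj/
remaining=$(git for-each-ref refs/original/ | wc -l)
all_hits=$(git log --oneline --all -- '*/obj/*' | wc -l)
echo "remaining backup refs: $remaining, hits via --all: $all_hits"
```

Deleting the refs only makes the old commits unreachable from refs; the objects themselves stay in the repository until the reflogs expire and git gc prunes them.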