git notes after BFG?


I migrated from SVN to git, and each git commit had a note referencing its SVN revision number. After importing the repo I used BFG Repo-Cleaner to strip binary files and other junk from the git history. Unfortunately, I no longer see the notes when I run git log. I suppose BFG does not update the commit references used by the notes. BFG leaves a *.txt report that maps each old object ID to its new object ID, in the following format:

0001b24011381e8885683cd1119ba4cb077fa64b c81149b1b52b9e1e1767d6141f292891d715edb5
00024eecdc31f2f6e67018f7d6f00e7c1ad03f1f 326ee3b508e3dd2934ec1f50069195f86ea1a1c7
00028e04dcc2d59bd835b447bd3a207ae481696c 3d18e9b9d3336e59d62093200b81603ffefcc747

Can you suggest some script to quickly fix notes given the above mapping?

PS: I am almost sure the problem is caused by references that were not updated, because when I run git notes, the second column lists object IDs that the BFG report object-id-map.old-new.txt considers old.


There are 2 best solutions below

1
On BEST ANSWER

I wrote the following bash script to transfer my notes from the old objects to the new ones. It is slow because it processes one map entry at a time, and I am not sure whether it is safe to run several git notes commands in parallel.

# Each line of the map file is "<old-id> <new-id>".
while read -r old new; do
    # Copy the note to the rewritten commit; remove the old note only if the copy worked.
    git notes copy "$old" "$new" &&
        git notes remove --ignore-missing "$old"
done < object-id-map.old-new.txt
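A possibly faster variant, offered as a sketch rather than something tested on a large repository: the loop above starts two git processes per map line, while git notes copy and git notes remove both accept --stdin and can handle the whole file in one process each. The copy subcommand reads lines of the form "<from-object> <to-object>", which happens to be the layout of the BFG map file. The function name transfer_notes is made up for illustration, and I am assuming that old IDs without a note are passed over silently in --stdin mode.

```shell
# Move notes from old commit IDs to new ones using a BFG-style map file.
# transfer_notes is a hypothetical helper name, not a git command.
transfer_notes() {
    map=$1
    # Copy every note in one git process; each input line is "<old-id> <new-id>".
    git notes copy --stdin < "$map" || return 1
    # Remove the notes still attached to the old IDs (first column of the map).
    cut -d ' ' -f 1 "$map" | git notes remove --ignore-missing --stdin
}
```

Usage would be transfer_notes object-id-map.old-new.txt, run from inside the repository. Note that copy without -f refuses to overwrite a note that already exists on the new commit.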
5
On

[Edit: Git version 2.15, released in Nov 2017, added --state-branch to filter-branch. This option saves the map in a file in a branch that the first filter-branch operation creates. Subsequent filter-branch operations will use the existing map. So, since Git 2.15, add --state-branch <name> to your filter-branch operation, then use the map in the newly created branch.]

There's nothing built in to do this; you will have to write the script or program yourself. On the bright side, BFG left you the map file. That is much nicer than git filter-branch, which by default throws the map away, and with it the information you need to update your notes.

The underlying implementation of notes is that refs/notes/commits (or whatever you set for core.notesRef) points to an ordinary commit, which you can, at least in theory, git checkout (probably into a temporary work tree you set up specially for this purpose). The tree of that commit contains files named after the annotated commits, though the names may be split into subdirectories. For instance, if:

0001b24011381e8885683cd1119ba4cb077fa64b c81149b1b52b9e1e1767d6141f292891d715edb5

is a mapping entry, with 0001b24011381e8885683cd1119ba4cb077fa64b being an old commit, and if 0001b24011381e8885683cd1119ba4cb077fa64b has a notes entry, there will be a file named after 0001b24011381e8885683cd1119ba4cb077fa64b. The name may be the full ID, or it may be split into directories, as in 00/01b2... or 00/01/b2....
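To see this layout in your own repository, something like the following sketch should work. It lists the notes tree with git ls-tree and strips the slashes from each path to recover the annotated commit ID. The function name list_note_targets is hypothetical; git notes list prints the same pairs with the columns in the opposite order.

```shell
# Print "<annotated-commit-id> <note-blob-id>" for every entry in a notes tree.
# list_note_targets is a made-up name; the notes ref defaults to refs/notes/commits.
list_note_targets() {
    # ls-tree -r prints "<mode> blob <blob-id>TAB<path>"; removing the
    # slashes from <path> undoes the directory fan-out.
    git ls-tree -r "${1:-refs/notes/commits}" |
    awk -F '\t' '{ split($1, meta, " "); id = $2; gsub("/", "", id); print id, meta[3] }'
}
```

Any ID in the first column that also appears in the first column of object-id-map.old-new.txt is a note still attached to a pre-BFG commit.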

The nesting depth of all these added subdirectories is managed dynamically by the notes code, with the general idea being "add as many trees as needed to make finding whether there is a note fast, but not so many trees as to take up a lot of space in the repository when there are very few notes starting with 0001b2...". This fan-out is not crucial to your purposes, although you may wish to maintain it for the same speed reasons.
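As an illustration of the naming only, here is a small sketch that turns a commit ID into a fan-out path of a chosen depth. The helper name fanout_path and its depth argument are my own inventions; git picks the depth itself when it writes notes.

```shell
# Split a hex object ID into a notes-tree path with <depth> two-character
# directory levels. fanout_path is a hypothetical helper, not part of git.
fanout_path() (
    hash=$1
    depth=${2:-1}
    path=
    while [ "$depth" -gt 0 ]; do
        rest=${hash#??}               # everything after the first two characters
        path=$path${hash%"$rest"}/    # the first two characters become a directory
        hash=$rest
        depth=$((depth - 1))
    done
    printf '%s\n' "$path$hash"
)
```

With depth 0 the ID is returned unchanged, depth 1 gives the c8/1149... form, and depth 2 gives c8/11/49....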

Your job will be to find each file in this tree under its old name, and move (or copy) it to a new name that matches the new commit ID. Since the new name in this case would be c81149b1b52b9e1e1767d6141f292891d715edb5, you would rename the file as c8/1149b1b52b9e1e1767d6141f292891d715edb5, or c8/11/49b1b52b9e1e1767d6141f292891d715edb5, etc. Once you have renamed all the files (via the index: use git mv or git rm --cached and git add as needed), you can turn them into a regular commit object with git write-tree followed by git commit-tree. Make the parent of the new commit be the existing refs/notes/commits commit, and use git update-ref to update refs/notes/commits to point to the new commit, and your notes should reappear, post-filtering.
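A minimal sketch of that procedure follows, under several assumptions of mine. Instead of checking out the notes tree and using git mv, it rebuilds the tree in a scratch index with git ls-tree and git update-index --index-info, which avoids a work tree entirely. Renamed entries are written with a one-level fan-out (c8/1149...); entries without a mapping keep their existing path. The function name rewrite_notes and the commit message are hypothetical, and the map file is assumed to hold one "<old-id> <new-id>" pair per line.

```shell
# Rename notes-tree entries from old commit IDs to new ones and commit the result.
# Usage: rewrite_notes object-id-map.old-new.txt [notes-ref]
rewrite_notes() (
    map=$1
    ref=${2:-refs/notes/commits}
    parent=$(git rev-parse --verify "$ref^{commit}") || exit 1

    # Build the new tree in a scratch index so the real index is untouched.
    scratch=$(mktemp -d) || exit 1
    export GIT_INDEX_FILE="$scratch/index"

    # ls-tree prints "<mode> blob <id>TAB<path>"; the path minus its slashes
    # is the annotated commit ID. Rename the entry when the map knows that ID.
    git ls-tree -r "$parent" |
    awk -F '\t' -v map="$map" '
        BEGIN { while ((getline line < map) > 0) { split(line, f, " "); remap[f[1]] = f[2] } }
        {
            path = $2; id = path; gsub("/", "", id)
            if (id in remap) { id = remap[id]; path = substr(id, 1, 2) "/" substr(id, 3) }
            print $1 "\t" path
        }' |
    git update-index --index-info &&
    tree=$(git write-tree) &&
    commit=$(git commit-tree -p "$parent" -m "Remap notes after history rewrite" "$tree") &&
    # Move the notes ref only if it still points at the commit we started from.
    git update-ref "$ref" "$commit" "$parent"
    status=$?
    rm -rf "$scratch"
    exit "$status"
)
```

Run it as rewrite_notes object-id-map.old-new.txt from inside the repository. Because the old notes commit stays as the parent, the previous state remains reachable through the history of the notes ref if something goes wrong.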

(Once you have such a thing working, it would be nice to join it up with git filter-branch and/or BFG itself.)