Given a single branch in a git repository, how do I grab all versions of a file? The final desired output is one new output file per version (although printing all of the versions to STDOUT is fine as well, as long as it's easy to determine the extent of each version).
Here's what I'm currently using:
branch=master
filename=...whatever...
myids=( $(git rev-list --objects $branch -- $filename | grep $filename | perl -e 'for (<>) { print substr($_, 0, 40) . "\n"; }') )
for (( i=0,j=1; i < ${#myids[@]}; i++,j++ ))
do
echo $i $j
name="output_$j.txt"
git cat-file -p ${myids[$i]} > $name
done
Explanation of what I'm doing:
- use rev-list to find relevant hashes
- strip the hashes of commits and trees, leaving just the hashes of the files (these lines also include the filename)
- strip the filenames
- run the hashes through cat-file and generate output files
The two main problems I have with this are that 1) it's not robust, because the grep could have false positives, and I'm not sure if it would follow file renames, and 2) it feels super hacky.
Is there a better way to do this, perhaps making better use of git's API?
Not the best solution, but it's one possibility
If this is just a one-off thing that you're trying to do, you could try the following script. It uses Git porcelain from version 1.9.4, so it's definitely not a robust, reliable solution, since it depends on which version of Git you're using.
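A minimal sketch of that kind of script, assuming git show is the command used to print each version. The function name dump_versions, its three arguments, and the output_N.txt naming are illustrative choices here, not part of Git:

```shell
# Sketch only: write every committed version of one file on one branch
# to <outdir>/output_1.txt, output_2.txt, ... (oldest version first).
# dump_versions is a hypothetical helper name; filepath is relative to the repo root.
dump_versions() {
    branch=$1; filepath=$2; outdir=$3
    n=0
    mkdir -p "$outdir" || return 1
    # git log limited to a path lists only the commits that changed that path.
    # Commit hashes contain no whitespace, so word splitting is safe here.
    for commit in $(git log --reverse --format=%H "$branch" -- "$filepath"); do
        # Skip commits in which the file does not exist (for example a deletion).
        if git cat-file -e "$commit:$filepath" 2>/dev/null; then
            n=$((n + 1))
            git show "$commit:$filepath" > "$outdir/output_$n.txt"
        fi
    done
    # Print how many versions were written.
    echo "$n"
}
```

It could be called from inside the repository as dump_versions master path/to/file versions, where the last argument is the directory that receives the output files.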
It simply uses git log to find all commits that modified the file, then uses the <revision>:<filepath> syntax to output the version of the file from that revision.
git log can sometimes simplify your graph history though, so you might even want to pass the --full-history flag to git log, though I'm not exactly sure if it would be necessary for this particular use case.
How well would this follow renamed files though? You'd probably need to make the script a little smarter about keeping track of that, in order to use the right file path.
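One possible way to keep track of renames, sketched under a few assumptions: git log --follow is used to trace the file across renames, --name-only reports the path the file had in each commit, and that path is then used in the <revision>:<filepath> lookup. The helper name dump_versions_following and the '@@ ' marker are made up for this example. Versions are numbered newest first, because --follow is known to interact poorly with --reverse.

```shell
# Sketch only: like a plain "git log -- path" loop, but following renames.
# dump_versions_following is a hypothetical helper name.
# Limitation: paths that Git prints in quoted form (unusual characters)
# will not match and are silently skipped.
dump_versions_following() {
    branch=$1; filepath=$2; outdir=$3
    n=0; commit=
    mkdir -p "$outdir" || return 1
    while IFS= read -r line; do
        case $line in
            '') ;;                          # blank separator lines
            '@@ '*) commit=${line#@@ } ;;   # marker line: remember the commit hash
            *)                              # any other line: the file's path in that commit
                if git cat-file -e "$commit:$line" 2>/dev/null; then
                    n=$((n + 1))
                    git show "$commit:$line" > "$outdir/output_$n.txt"
                fi ;;
        esac
    done <<EOF
$(git log --follow --name-only --format='@@ %H' "$branch" -- "$filepath")
EOF
    # Print how many versions were written (output_1.txt is the newest).
    echo "$n"
}
```

The filepath argument is the file's current name on the branch; earlier names are discovered by Git's rename detection, which is heuristic and can miss a rename when the content changed heavily in the same commit.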
Again, however, I'd like to emphasize that this is not the best solution. A better solution would make use of Git plumbing commands instead, since they won't be so dependent on the Git version and will be more backward and forward compatible.
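As a rough sketch of what a plumbing-based variant might look like, assuming git rev-list to enumerate the commits that touched the path, git rev-parse to resolve <revision>:<filepath> to a blob ID, and git cat-file to print the blob. The helper name dump_versions_plumbing is hypothetical, and this version does not follow renames:

```shell
# Sketch only: the same idea using plumbing commands.
# dump_versions_plumbing is a hypothetical helper name; filepath is relative to the repo root.
dump_versions_plumbing() {
    branch=$1; filepath=$2; outdir=$3
    n=0
    mkdir -p "$outdir" || return 1
    # rev-list with a path argument lists only commits that changed that path.
    for commit in $(git rev-list --reverse "$branch" -- "$filepath"); do
        # Resolve <revision>:<filepath> to a blob ID; skip commits where the
        # path does not exist (for example the commit that deleted the file).
        blob=$(git rev-parse --verify --quiet "$commit:$filepath") || continue
        n=$((n + 1))
        git cat-file blob "$blob" > "$outdir/output_$n.txt"
    done
    # Print how many versions were written (output_1.txt is the oldest).
    echo "$n"
}
```

Because the blob is looked up by exact path in each commit, there is no grep over filenames and therefore no risk of a partial filename match.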