How to track the trace of a file through commits in the forward direction in a git repository?


Suppose we have a file F in a certain commit C. Given a future commit C', what is the best way to deduce the list of files in C' that are derived from F?

A file can be modified, renamed, moved, copied, deleted, split and merged several times in any order on its journey from one commit to another.

I want to determine the final list of files - which can be empty - that are "derived" from the original file through such operations, ideally using plumbing commands of git.

Yes, there is no way to say for certain whether a file was split, merged, moved or copied. But the heuristics that git-log or git-blame use for the same purpose would be good enough. (Thanks @SpaceKatt.)

TL;DR I want a function that produces the following Output given the Input:

Input: {F, C, C'}
  C is an earlier commit.
  C' is a later commit (and C is reachable from C')
  F is a file in C

Output: {F1', F2', F3', ...}
  F1', F2', F3', ... are the files in C' that are derived from F

Note: If it were the opposite problem (i.e., finding the history of a file), a solution could probably be derived using git-log or git-blame, although I am not aware of a perfect solution (one not involving porcelain) for that either.

torek On

The short answer is that Git can only go backwards, not forwards (as matt noted): each commit records its parents, never its children.

The workaround for this is simple: go backwards. Let's say you have a branch name BR that identifies some final commit:

... <-F <-G <-H ... <-Z   <--BR

and you wish to "go forward from G". Start by listing out every commit backwards, from Z on:

git rev-list BR

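When you reach commit G, stop (git rev-list G..BR does the stopping for you, and adding --reverse prints the list oldest-first). Take the list of hash IDs you just generated, and now you can work through them one (child, parent) pair at a time, oldest pair first: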

(H, G)
(I, H)
(...)
(Z, Y)
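
As a rough sketch of the pairing step, assuming a mostly linear history like the diagram above: pair_up and list_pairs are hypothetical helper names (nothing Git provides), and --first-parent is my own assumption to keep merges from fanning the list out.

```shell
#!/bin/sh
# pair_up START: read commit IDs on stdin, oldest first, and print one
# "(child, parent)" pair per line, beginning with START as the first parent.
pair_up() {
  prev=$1
  while read -r cur; do
    printf '(%s, %s)\n' "$cur" "$prev"
    prev=$cur
  done
}

# list_pairs OLD NEW: walk from OLD (exclusive) up to NEW (inclusive).
# OLD..NEW stops the backwards walk at OLD; --reverse flips it oldest-first.
list_pairs() {
  git rev-list --reverse --first-parent "$1..$2" | pair_up "$1"
}

# Example, using the names from the diagram:
#   list_pairs G BR
```

With real merges in the range, consecutive first-parent commits are still valid things to diff against each other, but anything that happened only on a side branch shows up lumped into the merge.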

Note that git blame --reverse already "knows how to do this", as it were, but it still needs both endpoints, i.e., you have to locate Z, assuming you want to go forward to Z. You'll also have to pick the file names yourself, which is kind of the problem. (You could automate that by running it once for every file in Z; this will be slow.) There is no good answer to this, at least not today.
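
One partial way around picking the names by hand is to let Git's rename/copy detection carry a set of names forward through the pairs. This is only a sketch under assumptions: advance_names and trace_forward are hypothetical helpers, paths are assumed to contain no tabs, newlines or characters that Git would quote, and the result is whatever the similarity heuristics report, so splits and merges of content are not really detected.

```shell
#!/bin/sh
# advance_names TRACKED: read `git diff-tree --name-status` lines on stdin and
# print the paths that TRACKED (newline-separated) turns into after that diff.
advance_names() {
  TRACKED=$1 awk -F '\t' '
    BEGIN {
      n = split(ENVIRON["TRACKED"], names, "\n")
      for (i = 1; i <= n; i++) if (names[i] != "") tracked[names[i]] = 1
    }
    # Deleted or renamed away: the old path no longer exists afterwards.
    ($1 == "D" || $1 ~ /^R/) && ($2 in tracked) { gone[$2] = 1 }
    # Renamed or copied: the new path now counts as derived as well.
    $1 ~ /^[RC]/ && ($2 in tracked) { born[$3] = 1 }
    END {
      for (p in gone) delete tracked[p]
      for (p in born) tracked[p] = 1
      for (p in tracked) print p
    }
  ' | sort
}

# trace_forward OLD NEW PATH: diff each consecutive pair of commits with
# rename (-M) and copy (-C, --find-copies-harder) detection, updating the set.
trace_forward() {
  prev=$1 tracked=$3
  for cur in $(git rev-list --reverse --first-parent "$1..$2"); do
    tracked=$(git diff-tree -r -M -C --find-copies-harder --name-status \
                "$prev" "$cur" | advance_names "$tracked")
    prev=$cur
  done
  if [ -n "$tracked" ]; then printf '%s\n' "$tracked"; fi
}

# Example (placeholder names): trace_forward G BR path/to/F
```

A file that is deleted and later re-created under the same name drops out of the set here, and --find-copies-harder makes every step expensive on a large tree, so this is slow too, just in a different way.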