Isn't git merge --squash really git rebase --squash?


Trying to understand why the command

git merge --squash mybranch

isn't called

git rebase --squash mybranch

It seems to me that the operation is much more like a rebase than a merge. My understanding is that it searches back through the commit graph until it finds a common base commit for the current branch and mybranch. It then reapplies (rebases) all the commits from that base up to the head of mybranch onto the head of the current branch, but does so in the index and working tree, so that the result can be committed as a single commit. When it is done, there is no merge commit showing the two branches that were merged, as there would be after a normal merge. Is my understanding correct?

On BEST ANSWER

Well, merging and rebasing are fundamentally different operations. Merge—by which I mean a regular git merge that creates a new merge commit—does indeed search back through the commit graph for the most recent common commit:

...--o--*--o--o--A   <-- mainbr
         \
          B--C--D--E   <-- sidebr

Here, the most recent common commit is *. The merge process will then compare (as in git diff) commit * with commit A to find out "what we did", and diff * against commit E to find out "what they did". It then makes a new single merge commit, with two parents:

...--o--*--o--o--A---M   <-- mainbr
         \          /
          B--C--D--E   <-- sidebr

which joins the two histories, and combines "what we did" and "what they did", so that diffing * vs M gives "one of each change".

Note that you get no choice of merge base here: Git figures it out, and that's that.
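For anyone who wants to poke at this, here is a minimal sketch in a throwaway repository. The file names, commit messages, and the demo identity are made up for illustration; only the branch names follow the diagram. It checks that git merge-base reports commit * and that the new commit M has two parents.

```shell
set -e
# Throwaway identity and repository (hypothetical names, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainbr

# Commit "*": the future merge base.
echo base > base.txt; git add base.txt; git commit -q -m 'star'
star=$(git rev-parse HEAD)
git branch sidebr

# One commit on each side so the histories diverge.
echo main > main.txt; git add main.txt; git commit -q -m 'A'
git checkout -q sidebr
echo side > side.txt; git add side.txt; git commit -q -m 'E'

# Git picks the merge base itself; we only ask what it chose.
git checkout -q mainbr
merge_base=$(git merge-base mainbr sidebr)
git merge -q -m 'M' sidebr

# %P prints the parent hashes of the commit, separated by spaces.
parent_count=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')
echo "merge base is star: $([ "$merge_base" = "$star" ] && echo yes || echo no)"
echo "M has $parent_count parents"
```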

Rebase, on the other hand, can be told both which commits to copy, and where to copy them, separately. It's true that, by default, it locates commit * again,¹ but then it copies the original commits, one by one, using git cherry-pick or the equivalent; and finally, it moves the branch label to point at the last copied commit.

...--o--*--o--o--A   <-- mainbr
         \        \
          \        B'-C'-D'-E'   <-- sidebr
           \
            B--C--D--E

The original chain of commits (B-C-D-E, in this case) is still in the repository, and still findable: they can be found by hash ID, and in the reflog for sidebr, and if any other branch or tag name makes them reachable, they remain reachable by that name.
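A small sketch of that, again in a scratch repository with made-up file names and a demo identity (all assumptions for illustration). It rebases sidebr onto mainbr, then confirms that the old tip still exists as an object and is what the sidebr reflog records as the previous value.

```shell
set -e
# Throwaway identity and repository (hypothetical names, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainbr

echo base > base.txt; git add base.txt; git commit -q -m 'star'
git branch sidebr
echo main > main.txt; git add main.txt; git commit -q -m 'A'

git checkout -q sidebr
echo b > b.txt; git add b.txt; git commit -q -m 'B'
echo c > c.txt; git add c.txt; git commit -q -m 'C'

# Remember the original tip, then copy B and C on top of mainbr.
old_tip=$(git rev-parse sidebr)
git rebase -q mainbr
new_tip=$(git rev-parse sidebr)

# The branch label moved, but the original commits were not destroyed.
previous=$(git rev-parse 'sidebr@{1}')   # previous value from the reflog
git cat-file -e "$old_tip" && echo "original tip still exists"
echo "old tip: $old_tip"
echo "new tip: $new_tip"
```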

What git merge --squash does is modify the merge process just slightly: instead of making merge commit M, Git goes through the merge machinery as usual, diffing the merge base (which you don't get to choose) against the current commit and the other commit, and combining the changes in the index and work-tree. It then, for no obvious reason,² stops and makes you run git commit to commit the result. When you do, it's an ordinary, non-merge commit, so that the whole graph fragment looks like this:

...--o--*--o--o--A--F   <-- mainbr
         \
          B--C--D--E   <-- sidebr

Now, the content of commit F (the snapshot tree resulting from the merge) is the same as the content of commit M when we do a real merge, and, here's the real kicker, it's also the same as the content of commit E' when we do a rebase, at least as long as nothing along the way needs conflicts resolved by hand.
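The tree comparison is easy to try out. The sketch below is only an illustration under simple assumptions: the two sides touch different, made-up files so nothing conflicts, and the scratch branch names try-merge, try-squash, and try-rebase are hypothetical. It performs all three operations from the same starting point and compares the resulting tree hashes and parent counts.

```shell
set -e
# Throwaway identity and repository (hypothetical names, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainbr

echo base > base.txt; git add base.txt; git commit -q -m 'star'
git branch sidebr
echo main > main.txt; git add main.txt; git commit -q -m 'A'
git checkout -q sidebr
echo b > b.txt; git add b.txt; git commit -q -m 'B'
echo c > c.txt; git add c.txt; git commit -q -m 'C'

# 1. A real merge: commit M with two parents.
git checkout -q -b try-merge mainbr
git merge -q -m 'M' sidebr

# 2. A squash merge: Git stops, and our git commit makes one-parent commit F.
git checkout -q -b try-squash mainbr
git merge -q --squash sidebr
git commit -q -m 'F'

# 3. A rebase of a copy of sidebr onto mainbr.
git checkout -q -b try-rebase sidebr
git rebase -q mainbr

tree_M=$(git rev-parse 'try-merge^{tree}')
tree_F=$(git rev-parse 'try-squash^{tree}')
tree_E=$(git rev-parse 'try-rebase^{tree}')
parents_M=$(git show -s --format=%P try-merge | wc -w | tr -d ' ')
parents_F=$(git show -s --format=%P try-squash | wc -w | tr -d ' ')

echo "trees: M=$tree_M F=$tree_F E'=$tree_E"
echo "parents: M=$parents_M F=$parents_F"
```

The snapshots agree while the graph shapes differ, which is exactly the distinction the rest of this answer is about.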

Moreover, suppose there were only one commit (B) on the side branch sidebr. Now all three of merge, merge --squash, and rebase will give you a picture that ends with something we might just draw like this:

...--o--*--o--o--A--B'   <-- ???
         \          ?
          B?????????   <-- ???

and the content of the new commit B', i.e., the final snapshot, is the same in all three cases. What differs is where the new commit lands and what it points back to. With git merge, the new commit is on branch mainbr and points back to both A and B, with sidebr still pointing to B. With git merge --squash, the new commit is on mainbr and points back only to A. With git rebase, the new commit is on sidebr, with nothing obvious pointing to B at all, and we should draw this as:

...--o--*--o--o--A   <-- mainbr
         \        \
          B        B'  <-- sidebr

since mainbr will continue to point to commit A.

In the end, this looks a bit more like a merge than it does like a rebase. (However, I would be happier if it weren't called a "squash merge" at all.)


¹ The method by which Git finds * is somewhat different: it's actually not a single commit, but rather just the last of a (usually) very large set of commits, namely, all those reachable from the <upstream> argument to git rebase. (Confusingly, a merge base can also be a set of commits, but it is a much more restricted set. Best not to dive into the graph theory yet. :-) )

² If we wanted it to stop, we could use --no-commit just as we do for regular, non-squash git merge. So why does it stop automatically?
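To see how the two ways of stopping differ, here is a hedged sketch in a scratch repository (file names and identity are invented for the demo). After git merge --no-commit, Git has recorded MERGE_HEAD, so a following git commit would make a two-parent merge; after git merge --squash there is no MERGE_HEAD, so git commit makes an ordinary commit.

```shell
set -e
# Throwaway identity and repository (hypothetical names, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainbr

echo base > base.txt; git add base.txt; git commit -q -m 'star'
git branch sidebr
echo main > main.txt; git add main.txt; git commit -q -m 'A'
git checkout -q sidebr
echo side > side.txt; git add side.txt; git commit -q -m 'E'
git checkout -q mainbr

# Regular merge, stopped before committing: MERGE_HEAD remembers the other parent.
git merge -q --no-commit sidebr
if [ -f .git/MERGE_HEAD ]; then no_commit_merge_head=yes; else no_commit_merge_head=no; fi
git merge --abort

# Squash merge: same index and work-tree result, but no MERGE_HEAD is recorded.
git merge -q --squash sidebr
if [ -f .git/MERGE_HEAD ]; then squash_merge_head=yes; else squash_merge_head=no; fi
git commit -q -m 'F'
squash_parents=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')

echo "MERGE_HEAD after --no-commit: $no_commit_merge_head"
echo "MERGE_HEAD after --squash:    $squash_merge_head"
echo "parents of F: $squash_parents"
```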

On

Conceptually: Kind of

Historically: No

It seems that git merge --squash was introduced in commit 7d0c68871a (git-merge --squash, 2006-06-23).[1] If you check out this commit and look at the documentation for git-rebase(1) you will notice that it says nothing about interactive rebase, including operations like “squash”; git-rebase(1) was apparently only used for replaying commits on top of some other branch or commit, not also to reorder, manipulate, and squash changes.

Kind of

My expectation when using git-rebase(1) is that some branch will be updated, possibly in some non-fast-forward way. But git merge --squash conceptually creates a new commit with a “squashed” commit message and a tree which is equal to the tip of the branch that you want to include. So you don’t change any branch in some non-fast-forward way; you leave the other branch alone and create a new commit which is, conceptually speaking, cherry-picked onto the branch being merged into.
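A quick way to check that expectation, sketched in a scratch repository with invented file names and identity (assumptions for the demo only): after the squash merge and commit, sidebr has not moved, and mainbr has advanced by exactly one commit whose only parent is its old tip, i.e. a plain fast-forward of mainbr.

```shell
set -e
# Throwaway identity and repository (hypothetical names, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainbr

echo base > base.txt; git add base.txt; git commit -q -m 'star'
git branch sidebr
echo main > main.txt; git add main.txt; git commit -q -m 'A'
git checkout -q sidebr
echo b > b.txt; git add b.txt; git commit -q -m 'B'
echo c > c.txt; git add c.txt; git commit -q -m 'C'
git checkout -q mainbr

side_before=$(git rev-parse sidebr)
main_before=$(git rev-parse mainbr)

git merge -q --squash sidebr
git commit -q -m 'squashed sidebr'

side_after=$(git rev-parse sidebr)
new_parent=$(git rev-parse 'mainbr^')   # first (and only) parent of the new commit

echo "sidebr unchanged: $([ "$side_before" = "$side_after" ] && echo yes || echo no)"
echo "new commit sits directly on old mainbr tip: $([ "$new_parent" = "$main_before" ] && echo yes || echo no)"
```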

Notes

  1. The commit message motivates the option nicely:

    git-merge --squash

    Some people tend to do many little commits on a topic branch, recording all the trials and errors, and when the topic is reasonably cooked well, would want to record the net effect of the series as one commit on top of the mainline, removing the cruft from the history. The topic is then abandoned or forked off again from that point at the mainline.

    The barebone porcelainish that comes with core git tools does not officially support such operation, but you can fake it by using "git pull --no-merge" when such a topic branch is not a strict superset of the mainline, like this:

        git checkout mainline
        git pull --no-commit . that-topic-branch
        : fix conflicts if any
        rm -f .git/MERGE_HEAD
        git commit -a -m 'consolidated commit log message'
        git branch -f that-topic-branch ;# now fully merged
    

    This however does not work when the topic branch is a fast forward of the mainline, because normal "git pull" will never create a merge commit in such a case, and there is nothing special --no-commit could do to begin with.

    This patch introduces a new option, --squash, to support such a workflow officially in both fast-forward case and true merge case. The user-level operation would be the same in both cases:

        git checkout mainline
        git pull --squash . that-topic-branch
        : fix conflicts if any -- naturally, there would be
        : no conflict if fast forward.
        git commit -a -m  'consolidated commit log message'
        git branch -f that-topic-branch ;# now fully merged
    

    When the current branch is already up-to-date with respect to the other branch, there truly is nothing to do, so the new option does not have any effect.

    This was brought up in #git IRC channel recently.

    Signed-off-by: Junio C Hamano
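The fast-forward case from that commit message can be tried on a current Git roughly as follows. This is a sketch under assumptions: it uses git merge --squash directly instead of the git pull --squash . spelling in the message (my substitution), and the file name, trial commits, and demo identity are invented. The branch names are the ones from the message.

```shell
set -e
# Throwaway identity and repository (hypothetical names, for illustration only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline

echo base > file.txt; git add file.txt; git commit -q -m 'base'

# Many little commits on the topic; mainline does not move, so the topic
# is a fast forward of the mainline.
git checkout -q -b that-topic-branch
for n in 1 2 3; do
  echo "attempt $n" >> file.txt
  git commit -q -a -m "trial $n"
done
topic_tree=$(git rev-parse 'that-topic-branch^{tree}')

# Record the net effect as one commit on top of the mainline.
git checkout -q mainline
git merge -q --squash that-topic-branch
git commit -q -a -m 'consolidated commit log message'

# Fork the topic off again from this point, as in the commit message.
git branch -f that-topic-branch

mainline_commits=$(git rev-list --count mainline)
mainline_tree=$(git rev-parse 'mainline^{tree}')
echo "commits on mainline: $mainline_commits"
```

Even though a plain merge would simply have fast-forwarded here, mainline ends up with a single new commit carrying the topic's final snapshot.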