How can I ensure dangling commits won't be garbage collected by GitLab?


I use a GitLab-hosted repository, but I hope my question is more general than that. Suppose I have an issue #N, and I work on a feature branch feature-N and make 10 commits. In each of these commits I mention issue #N in the commit message, so that when I view the issue in the future I can easily follow a link to any of the commits. I then merge the feature-N branch into main, and I use the squash-commits option in order to keep the git history a bit cleaner.

I am aware that if I choose NOT to delete the remote/feature-N branch, then there will always be a branch reference to those 10 commits, and I will still be able to access them. By "access", I mean that I can go to the Issue page, see the hyperlinks to the 10 commit IDs, and those links will be valid links.

However, suppose I do want to delete the remote/feature-N branch to keep things clean. Based on what I have read, those 10 commits are now unreferenced, which means they are subject to garbage collection whenever GitLab decides to perform garbage collection on the remote repo. Does this mean that after some amount of time, the hyperlinks in the Issue page will be broken, because those commits no longer exist? Or is GitLab doing something behind the scenes to maintain references to those commits, to prevent garbage collection from deleting them?

So far, I have carried out the scenario above and deleted the remote branch. The links on the Issue page to the dangling commits still work, and my guess is that this is because garbage collection has yet to take place. Since I can't force garbage collection on the remote, all I can do now is wait and see whether those links break one day. In lieu of waiting, I figured I'd ask the question here to get some clarity, so I can decide whether the best practice is to delete the remote branch following a merge or to keep it.


There are 2 best solutions below

jthill On

and I use the squash-commits option in order to keep the git history a bit cleaner

If you don't want to see the merged history, just the results, use -m --first-parent on your log displays (-m is worth adding as a habit, so you don't forget it when you ask to see patches). If GitLab's web UI doesn't offer that as an option (I see an open issue asking for it), that's on them. Git will do it for you just fine: it's built to help you see what's important for your present purpose without losing detail that might be important for other uses.
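As a rough illustration of that suggestion, the sketch below builds a throwaway repository, merges a feature branch with a real merge commit instead of a squash, and compares the full log with the first-parent view. The repository layout, file name, and commit messages are made up for the example; only the git options come from the answer.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the example does not touch any real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main

echo base > file.txt
git add file.txt
git commit -q -m "initial commit"

# Two commits on a feature branch (hypothetical names and messages).
git checkout -q -b feature-N
echo one >> file.txt
git commit -q -am "feature step 1 (#N)"
echo two >> file.txt
git commit -q -am "feature step 2 (#N)"

# Merge with a real merge commit rather than a squash.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'feature-N' (#N)" feature-N

echo "Full history:"
git log --oneline main

echo "First-parent view (one entry per merge, like a squashed history):"
git log --oneline --first-parent main

# With -p, -m asks git to show a diff for merge commits as well;
# combined with --first-parent, that diff is taken against the first parent.
git log --first-parent -m -p --format="commit %h %s" main

full_count=$(git rev-list --count main)
first_parent_count=$(git rev-list --count --first-parent main)
echo "commits in full history: $full_count"
echo "commits in first-parent view: $first_parent_count"
```

Here the full history contains four commits while the first-parent view shows two (the initial commit and the merge), yet the individual feature commits remain reachable from main.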

TTT On

tl;dr: the squashed commits should stay around forever, on the server, even if you delete the source branch.

GitLab automatically keeps hidden references to any commit referenced by the history of a Merge Request. Fortunately these are only maintained on the server, and are not referenced by regular branches or tags, and therefore do not increase the size of clones. My understanding is the hidden references should persist forever, so you need not worry about losing them with the default configuration.

Note, I know this is also true for both Azure DevOps and GitHub, but with a quick search I struggled to find this explicitly documented for GitLab. I do see some issues (e.g. here, and here) where users are trying to figure out how to garbage collect some of those hidden references. It appears the answer to those types of issues is officially described here, where specific refs can be purged manually to enable garbage collection. The following statement:

These refs are not automatically downloaded and hidden refs are not advertised, but we can remove these refs using a project export.

implies that those hidden references will stick around forever. If that weren't true, links in historical Merge Requests would eventually break.
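The mechanism described in this answer can be reproduced locally. The sketch below is a simulation under stated assumptions, not GitLab itself: a local bare repository stands in for the server, and the ref refs/merge-requests/1/head is created by hand to imitate the kind of internal ref GitLab keeps for a merge request (the ID 1 is hypothetical). It shows that any ref, even one outside refs/heads, protects a commit from garbage collection, and that the commit becomes prunable once that ref is removed.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# A bare repository plays the role of the server (assumption: local stand-in).
git init -q --bare "$work/server.git"

git init -q "$work/clone"
cd "$work/clone"
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/main

echo base > file.txt
git add file.txt
git commit -q -m "initial commit"
git remote add origin "$work/server.git"
git push -q origin main

# Feature branch with one commit, pushed to the "server".
git checkout -q -b feature-N
echo change >> file.txt
git commit -q -am "feature work (#N)"
git push -q origin feature-N
feature_sha=$(git rev-parse HEAD)

# Imitate the server-side ref a merge request would create (hypothetical ID 1).
git -C "$work/server.git" update-ref refs/merge-requests/1/head "$feature_sha"

# Delete the branch, then collect garbage as aggressively as possible.
git push -q origin --delete feature-N
git -C "$work/server.git" reflog expire --expire=now --all
git -C "$work/server.git" gc -q --prune=now

if git -C "$work/server.git" cat-file -e "$feature_sha^{commit}" 2>/dev/null; then
    with_ref=kept
else
    with_ref=pruned
fi
echo "with the extra ref in place: $with_ref"

# Remove the extra ref: the commit is now truly unreferenced.
git -C "$work/server.git" update-ref -d refs/merge-requests/1/head
git -C "$work/server.git" gc -q --prune=now

if git -C "$work/server.git" cat-file -e "$feature_sha^{commit}" 2>/dev/null; then
    without_ref=kept
else
    without_ref=pruned
fi
echo "after removing the extra ref: $without_ref"
```

On a real GitLab project, the merge request refs can be inspected with git ls-remote origin 'refs/merge-requests/*', assuming the remote is named origin and the project has merge requests.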