
My preferred merge strategy for Pull Requests

Igor Santos

In every decent team project I've worked on before (that had PRs in place), the merge strategy was to squash all commits into a single one when merging into the master branch.

In the beginning, it felt odd, as "all my hard work and organized commits" would become a single, standardized blob of sorts. Then I got used to the flow and, well, whatever, right? It works®.

But then the time came when I was the one deciding the path a team would start to walk. And the question came: why squash merges? Why rebases?

As people say, first you resist, then you accept, and then you question what was happening.

[Figure: squash diagram]

So, to help with understanding why I chose Squash Merge to be the default for our repository:

Squash merges are usually preferred not only for the "flat history" (which is usually easier to navigate with any tooling), but also because the history gets much cleaner and "straight to the point", without all the partial commits that are common during development.
If the squash is done correctly, you'll still keep every commit's message in the squashed commit, so you can still get some details about the development of that feature, without clogging your master history.
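
A minimal sketch of that idea in Python, assuming a hypothetical helper `build_squash_message` and made-up commit messages: it folds every commit on the branch into one message, with each commit subject as a bullet. The exact layout is an assumption for illustration; Git and hosting services each have their own default format.

```python
def build_squash_message(title, commit_messages):
    """Fold a branch's commit messages into a single squash commit message.

    `title` becomes the subject line; each original commit becomes a bullet,
    with its body (if any) indented underneath. Hypothetical layout.
    """
    lines = [title, ""]
    for message in commit_messages:
        # First line is the subject, everything after it is the body.
        subject, _, body = message.strip().partition("\n")
        lines.append(f"* {subject.strip()}")
        for body_line in body.strip().splitlines():
            lines.append(f"  {body_line.strip()}" if body_line.strip() else "")
    return "\n".join(lines)


# Made-up commits from a feature branch.
branch_commits = [
    "Add failing test for empty cart",
    "Fix total for empty cart\n\nThe sum crashed on an empty list.",
    "wip: rename helper",
]
print(build_squash_message("Fix cart total on empty carts", branch_commits))
```

The master history then shows one commit per feature, while the details of how it was built stay readable inside that commit's message.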

One should also remember that history is not only used to see code progress on a timeline/tree, but also for blaming changes. Even though some tools and extra options let you "ignore" partial commits when browsing history, blame probably won't ignore them, or will at least get too convoluted.
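
To illustrate the blame point, here is a toy Python model (not how Git computes blame internally): each commit lists the line numbers it changed, and the last commit to touch a line "owns" it. The commit subjects and line numbers are made up, and the model assumes lines are only modified in place.

```python
def blame(history):
    """Map each line number to the subject of the last commit that changed it.

    `history` is an ordered list of (subject, changed_line_numbers) pairs.
    Toy model: lines are only modified in place, never inserted or deleted.
    """
    last_touch = {}
    for subject, changed_lines in history:
        for line_no in changed_lines:
            last_touch[line_no] = subject  # later commits overwrite earlier ones
    return last_touch


base = ("Initial import", [1, 2, 3, 4])
# Same feature, merged with every partial commit kept...
unsquashed = [base, ("wip", [2, 3]), ("fix typo", [3]), ("address review", [2])]
# ...or squashed into one commit on master.
squashed = [base, ("Add discount rule", [2, 3])]

print(blame(unsquashed))
print(blame(squashed))
```

In the unsquashed history, lines 2 and 3 are attributed to "address review" and "fix typo", which say little about why the code exists. In the squashed history both lines point at the single feature commit.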

I did some quick research into discussions on the subject, and I can link you to two references that reflect my opinion and that I used to strengthen my arguments above:

Answer about squash preference, on a CLI tool for Git Workflow:

[...] And over time, that's all we really care about in the history. Who made this change and why was it made. Squash merging allows us to do that while still giving all of our developers the individual freedom to develop in the way that suits them best.

GitHub manual about PR merge strategies:

You can use squash and merge to create a more streamlined Git history in your repository. Work-in-progress commits are helpful when working on a feature branch, but they aren’t necessarily important to retain in the Git history. If you squash these commits into one commit while merging to the default branch, you can retain the original changes with a clear Git history.


(this was also posted to my Medium)

Best wishes of no-conflict merges!

Posted on Aug 3 '18 by:


Igor Santos

@igorsantos07

Remote developer with ~10 years of experience. Mostly worked with PHP and with a passion for REST APIs and front-end interfaces, UX and DX. I also have a foot on React and Python/Django :)

Discussion

 

Going along with what Alain was saying, this is something that confuses me too about why people like the squash workflow. I get uncomfortable at the idea of throwing away all that context! For example, my most recent PR at work had 541 additions and 26 removals across 23 commits. If I had squashed and ended up breaking something, git bisect would be useless for finding the commit that broke things, and if you're looking through the diff, it's much harder to figure out which logical idea in the commit message was responsible for which changed line.

In the interest of building an understanding about why others work the way they do, what are some "pro-squash" benefits?
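
To make the bisect concern concrete, here is a small Python model of the binary search that git bisect performs. The function `first_bad`, the commit names and the culprit are hypothetical; real bisect walks the commit graph and runs your own test at each step.

```python
def first_bad(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    Assumes every commit before the culprit is good, every commit from the
    culprit onward is bad, and the last commit is already known to be bad.
    Returns the culprit and the number of commits that had to be tested.
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(commits[mid]):
            hi = mid        # culprit is at mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return commits[lo], steps


branch = [f"commit-{n:02d}" for n in range(1, 24)]  # 23 small commits
culprit = "commit-17"                               # hypothetical culprit


def breaks_build(commit):
    # Stand-in for running the test suite at that commit.
    return commit >= culprit


print(first_bad(branch, breaks_build))
print(first_bad(["squashed-pr"], lambda commit: True))
```

With 23 commits the search lands on one small commit after testing 5 of them. With a single squashed commit there is nothing to search: the answer is the whole PR's diff.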

 

Your PR is too big.
It's not really checkable or approvable: no one can hit the approve button and consciously share responsibility for merging this to master. So go back to the drawing board and split it up! Respect your git history and your coworkers' time )

I do it all the time - an idea or a feature first lands in a prototype branch (even a TDDed one!) and then I split it into smaller branches that are easy for my colleagues to comprehend and check.

PS safe (automated) refactoring, like migrating to a new code style, can be as big as needed and doesn't really require checking. But as you mentioned 23 commits and the “context” attached to them, I take it that was not the case )

PPS I have to admit, squashing works well with short-lived feature branches, CI/CD and instant releases. If you, for any reason, have a different lifecycle (I bet it's the organisation's processes/structure), squashing might seem like an unnecessary or strange idea.

 

Thanks for the input Andrew - I definitely had the feeling that PR was getting too big! The way I went about it was very TDD - authoring a test for the bug I was fixing, then getting it to pass, and then repeating that test + pass cycle until I had covered all of the corner cases. How would you recommend splitting that up? I wouldn't really feel comfortable merging a fix into master that I didn't feel was complete, so I'm curious what strategies you would have in mind!

I’m not sure if you will like this answer =) IMO if the tests came out so big, the component was also quite big, so maybe it's possible to extract/isolate it and cover a small coherent part. The tests would likely become smaller, and integration tests could be added in a separate branch.

PS yes it means refactoring before fixing a bug, and actually this is what I do too )

 

I guess you aren't the guy that has to reverse cherry pick, in order, resolving conflicts as you go, to get your feature onto another branch.

We squash, as it makes cherry picking easier. You still have to resolve conflicts of course, but only from a single commit, not up to 23 conflict resolutions from 23 commits that all need to be cherry-picked in reverse order.

 

I've never had to do that, so that explains why I would devalue squashing. Out of curiosity, why squash + cherry pick rather than just merge? Also, are you familiar with git-rerere? It repeats conflict resolutions for rebases; I wonder if it does the same for cherry picking!

We have two development branches at most times, a branch with only critical fixes (next), and a development branch with it all (nextnext).

All work branches from develop, but only a few of those go to develop and, in turn, onto the next release branch. We obviously can't merge the development branch, as it's got stuff we don't want, and branches are deleted when they merge to the development branch.

The only other way would be two pull requests. One to develop and one to next, but that is not without its problems either.

Interesting - thanks for the insight!

 

Rob mentioned rerere, but that doesn't help when the conflicting code is a moving target between commits :)

 

Even in your service-branch and development-branch scenario, I think you can still do a normal merge to one of them, and then do a 3-way rebase on the other. So you don't have to do cherry-pick.

 

This has probably been asked a thousand times already, but don't you lose a substantial amount of debugging information?

As you mentioned, the commit messages are indeed retained, but they are also bundled. For a larger PR, that means that you have to deduce (i.e. guess) which commit message might be relevant for the specific lines of code that you are debugging. That (to me) seems to negate most of the rather significant benefits of doing small commits, which is a serious downside since developers tend to spend far more time and effort on debugging than they do on writing new code.

Or am I missing something?

 

GitHub does allow restoring deleted branches after squashing the PR, so when a PR is found to be faulty the branch can then be bisected. Slightly less efficient maybe, but I've only needed to do it rarely.

 

Ah, good to know that that exists. Since you mention not needing to resort to that very often, is that because the applications you work with have a fairly low bug count, or that the PRs are long-lived enough, or that the code flows (and as such, developer intent) are always sufficiently clear? I just ask because I shiver at the thought of all the extra time that I would have spent fixing bugs without the context of a detailed commit history.

That's a good question. I think part of it is low regression bug count. Also the code is usually clear enough that tracking down the code is pretty easy.

That would indeed make a big difference. The code that I'm used to working with tends to have complex flows (not in a spaghetti way, it just needs to do a lot), with a fairly large number of business-related decisions (i.e. 'what' and 'how' are insufficient, you need 'why' as well).

 

Note that GitHub saves the branches, so you can still get back to the unsquashed one, it just takes more work.

I'm curious what you do when multiple people push to the same branch/PR.

 

My last team did the same thing. I'm not sure how it came about even though I was in the conversation.

But no one has questioned it in the last two years.

 

But this works only if you are the only one working on the feature/bug branch that you squash-merge in the end. This doesn't work when more people are involved, because their credit is lost...

 

That's a fact. But it's easy to remedy (and also easy to forget, obviously): the PR author gets to push the merge button :)

 

I thought it was more of a merge squash vs rebase topic from the title, but nevertheless good article.

 

Sorry, it's about my preferred strategy :)
For the actual discussions, you can check the links I referenced!

 

I get hopeless and don't know what to do when my teammates don't see the benefit of squash merging. I'm the one who worships the squash merge. It made my life so much easier and more efficient.