_juliettech

Posted on Jan 28, 2020 • Edited on Jan 29, 2020

Git merge, git rebase, and crawling out of the git hole

#git #merge #rebase #bash

To master git is like being a time wizard ✨. You can move across points in time, flying like a bird from one commit to the next, always having a stepping point to go back to when you fuck up. Because let’s be honest, at one point or another, we will.

And that's how we get into the git hole. For those of you lucky enough to have never experienced it, the git hole is that dark place in a developer’s life when your timeline becomes a blur and you’re not sure what happened or where you lost your perfectly-working code.

So what's this big fuss between git merge and git rebase?

The most important thing to understand is that when we’re talking about merging and rebasing, it’s all about what your history looks like once your branch ends up where it ends up.

What do I mean by this?

What they have in common

git merge and git rebase both join your code with another branch, usually your remote’s master branch.

Where they differ

git merge

When you’re doing a merge, your git tree is coming full circle and you can see this graphically if you download a visual tool like Sourcetree. This means that:

The branch you’re currently working on, with all your commits and lines of code, will be joined back into master, leaving a trail of where it first left and when you came back into the code.
Once the conflicts are fixed, if any, it will generate an additional merge commit to keep track of where you were when your code joined master’s.
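To see that merge commit for yourself, here is a minimal sketch that builds a throwaway repository in a temp folder, lets master and a feature branch drift apart, and merges them. The file contents, commit messages and the demo identity are made up for illustration, and the script assumes git is installed.

```shell
#!/bin/sh
set -e
# Throwaway repo in a temp directory (all names below are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"

# One commit on master
echo "bands = ['Incubus']" > index.rb
git add index.rb
git commit -q -m "initial band list"

# One commit on the feature branch
git checkout -q -b bands4tonight
echo "puts bands.sample" >> index.rb
git commit -q -am "program returns a sample band"

# Meanwhile master moves on, so the two histories diverge
git checkout -q master
echo "# music app" > README.md
git add README.md
git commit -q -m "add readme"

# Join the feature branch back into master
git merge -q --no-edit bands4tonight

# The graph shows the branch leaving and coming back
git --no-pager log --graph --oneline

merge_commits=$(git rev-list --merges --count HEAD)
total_commits=$(git rev-list --count HEAD)
echo "merge commits: $merge_commits, total commits: $total_commits"
```

The three commits I wrote are joined by a fourth one that git created for the merge, which is the trail described above.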

git rebase

When you rebase, you’re basically playing with time. The result is as if you had been coding on top of master's latest code the whole time, even though you were really coding off of your own branch. If you look at your tree after a git rebase, it doesn’t have any branches connected back into master because git rebase pretends you were "always" coding in master.

This means that rebasing treats your master branch as the Queen of All Truth 👸🏻: your commits are replayed one by one on top of master's latest commit, and any conflicts get fixed along the way. It’s as if you’re changing the base of your code to be the code found in your master branch, and it does not leave a merge commit because it acts as if your work had been “there the whole time”.
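Here is the same kind of throwaway experiment for rebase, as a rough sketch rather than a recipe. Again the file contents, commit messages and demo identity are invented, and it assumes git is installed. It also fast-forwards master at the end to show that no merge commit is needed.

```shell
#!/bin/sh
set -e
# Throwaway repo in a temp directory (all names below are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "bands = ['Incubus']" > index.rb
git add index.rb
git commit -q -m "initial band list"

# Work on the feature branch
git checkout -q -b bands4tonight
echo "puts bands.sample" >> index.rb
git commit -q -am "program returns a sample band"
before=$(git rev-parse HEAD)

# Meanwhile master moves on
git checkout -q master
echo "# music app" > README.md
git add README.md
git commit -q -m "add readme"

# Replay the feature commit on top of the new master
git checkout -q bands4tonight
git rebase -q master
after=$(git rev-parse HEAD)

# master can now simply be moved forward, no merge commit required
git checkout -q master
git merge -q --ff-only bands4tonight

git --no-pager log --graph --oneline

merge_commits=$(git rev-list --merges --count HEAD)
total_commits=$(git rev-list --count HEAD)
echo "merge commits: $merge_commits, total commits: $total_commits"
```

The feature commit gets a new hash (`$before` and `$after` differ) because it is really a fresh copy sitting on a new base, and the final history is a straight line.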

Going deeper

Now, to my understanding, git’s superpower is keeping track of where we are at any given point thanks to us writing commits that mark specific moments to go back to if needed.

So, in my brain, I’ve never really understood why people would want to rebase and play git kung fu rather than just merge and be on the safe side of history. Playing with time like this had to be either contradictory or brilliant.

Indeed, what I hadn’t understood was the reason why people use rebase instead of merging: rebasing gives your codebase a slimmer, cleaner overall history without any branches coming off of your linear tree.

With both git rebase and git merge, you can squash all your commits into one so that your codebase has only one commit per feature, but a regular (non-fast-forward) git merge still produces that final merge commit, which fattens your history without adding much value.

And it makes sense. If you have 30+ developers working on the same code, and each is committing every step of their process, you will quickly have a fat and dirty codebase filled with random commits like 'fixed typo'. If you rebase and squash all your commits together, however, the result is a linear tree detailing every feature built and their code.
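Squashing can be done in a few ways. The sketch below uses git reset --soft followed by a fresh commit, which is just one option I picked because it is easy to script (an interactive rebase is another). The repository, band names and commit messages are all made up for the demo, and it assumes git is installed.

```shell
#!/bin/sh
set -e
# Throwaway repo in a temp directory (all names below are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "bands = ['Incubus']" > index.rb
git add index.rb
git commit -q -m "initial band list"

# Three small commits on the feature branch
git checkout -q -b bands4tonight
echo "bands << 'Papooz'" >> index.rb
git commit -q -am "add Papooz"
echo "bands << 'Poolside'" >> index.rb
git commit -q -am "fixed typo"
echo "puts bands.sample" >> index.rb
git commit -q -am "print a sample band"
before=$(git rev-list --count master..HEAD)

# Move the branch pointer back to master but keep every change staged,
# then record all of it as a single commit
git reset --soft master
git commit -q -m "program returns a sample band"
after=$(git rev-list --count master..HEAD)

echo "commits ahead of master: before=$before after=$after"
```

The code in index.rb is exactly the same before and after; only the number of commits describing it changes.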

In practice

Setup

I’ll first create a random index.rb file and make sure git is initiated. This assumes you already have git on your computer and an account where the remote repository can live. I’ll also create a branch for my first feature.

# Create the project folder and move into it
mkdir music-app && cd music-app
# Create the file inside the project folder
touch index.rb
# Initiate git
git init
# Create a GitHub repo with the name of the folder (I’m using [hub](https://hub.github.com/))
hub create
# Check out of master and create a branch called bands4tonight
git checkout -b bands4tonight

What will we build?

I will create a small Ruby app that takes an array of my favorite bands and throws one at me at random.

bands = ['Incubus', 'Arctic Monkeys', 'Papooz', 'L’imperatrice', 'Poolside']    
puts bands.sample

I then go to the terminal to save my progress. God forbid I lose my beautiful work.

# Check everything is as I expect    
git status    
# Adds everything to the bucket    
git add .    
# Commit my changes    
git commit -m 'program returns a sample band'    
# Push my code to bands4tonight branch    
git push origin bands4tonight

git kung fu

At this point my code is found remotely in my branch. Great.

Meanwhile, let's say other developers are coding away features on different parts of the app, adding their favorite bands to the mix or just building new functionalities. I need to make sure I have the latest updated version of the code.

# Move to master branch
git checkout master
# Pull in locally the work the other devs have been doing
git pull 
# Move into my working branch
git checkout bands4tonight
# Make sure the code I just pulled in is now the base for my working code in my branch
git rebase master

At this point, my working branch has the latest code from master and acts as if I was always there. Yes, that's exactly what we wanted!

The real trick happens as I keep on adding more code to my feature.

Let's say this week I’m obsessed with a new band (which I'd recommend checking out btw 😉). So I go back into index.rb and add it to my list.

bands = ['Incubus', 'Arctic Monkeys', 'Papooz', 'L’imperatrice', 'Poolside', 'Señor Loop']

Back in the terminal, it's time to push my code to my remote branch to make the Pull Request.

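git status
git add .
# Replace my latest commit with one that also includes the new changes
git commit --amend -m 'program returns a sample band'
# The amended commit has a new hash, so the push has to be forced
git push --force origin bands4tonight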

By writing git commit --amend -m ‘..’, I’m folding my new work into my latest commit: git replaces that commit with a new one that holds both the old and the new changes, and moves my branch forward to it.

This is great because:

1) The versions I amended away stay reachable for a while through git reflog, so I can go back in time if I need to.
2) It creates one commit per feature so that someone checking out the codebase later on can easily see all the code where one feature is built.
3) It doesn't fatten the codebase with lots of commits, but rather only shows the one where all the code is seen together.
4) It doesn't generate an extra merge commit that isn't actually needed.

Also important to note is that because I amended a commit I had already pushed, I am playing wizard with git's regular timeline. This means I have to force the push, otherwise git won't allow it.
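If you want to watch this happen without touching a real project, here is a small sketch under a few assumptions: the "remote" is just a bare repository in a temp folder standing in for GitHub, the band list and demo identity are invented, and git is installed. It amends a pushed commit, shows that a plain push gets rejected, and then forces it.

```shell
#!/bin/sh
set -e
# A bare repo plays the role of the remote (all names below are illustrative)
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/bands4tonight   # start directly on the feature branch
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

# First version of the feature, pushed to the remote branch
echo "bands = ['Incubus', 'Arctic Monkeys']" > index.rb
git add index.rb
git commit -q -m "program returns a sample band"
git push -q origin bands4tonight
old=$(git rev-parse HEAD)

# Add a band and fold the change into the same commit
echo "bands = ['Incubus', 'Arctic Monkeys', 'Señor Loop']" > index.rb
git add index.rb
git commit -q --amend -m "program returns a sample band"
new=$(git rev-parse HEAD)

# A plain push is refused because the remote still points at the old commit
if git push -q origin bands4tonight 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi

# Forcing the push makes the remote branch point at the amended commit
git push -q --force origin bands4tonight
remote_head=$(git ls-remote origin refs/heads/bands4tonight | cut -f1)

# The old commit is gone from the branch but still listed in the reflog
git --no-pager reflog -n 2
echo "plain push: $plain_push"
```

On a branch that other people also push to, git push --force-with-lease is a more careful choice than --force, since it refuses to overwrite work you haven't seen yet.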

(Disclaimer here: I was afraid of forcing the push the first time I did it too. But a smart dev told me that at the end of the day, I was only working in my own branch anyways, one nobody else was building on, so allll shall be fine under the Tuscan sun.)

And just like that, once our Pull Request is merged into master, you can run git log and see the clean, elegant timeline we've just created without any unnecessary merge commits.

Because at the end, your git tree is the only tree that is better off without leaves or branches. 🌳

Hope this was helpful :) spare some love if it was 💙

Top comments (13)

Amin • Jan 28 '20

Hello Juliette, nice article. This is by far the best explanation about the differences between rebase and merge in git!

I found on GitHub that you can do a squash merge. I learned to love it because it can squash (as its name indicates) all commits into one. From there, I always write something like Fixed an issue where the cat wasn't cute enough (#152) in the squash commit message. This allows having as many commits in the base branch as there are pull requests, keeping the history count low on the base branch. And if I need more details (meaning more commits history for that particular feature), I can click on the #152 which refers to the pull request ID.

What do you think? Is it a good idea in your opinion?

David Canning • Jan 28 '20

Good article, not sure I like the amend, think I prefer to squash just before rebasing back to master just so I can get to all my commits while working on feature branch.

_juliettech • Jan 30 '20

Yeah, I've used squash as well. I just feel like amending is easier and safer, especially if you're just starting to play with rebasing.

What would you say is the added benefit of squashing rather than amending?

David Canning • Jan 31 '20 • Edited

I'd say the main benefit of doing squash is that you still have multiple commits in your history up until you do the squash, so if you screw something up you can still look back through your commits to either get back to before, or look at the changes to figure out what you did. With the amend, you're adding your new changes to the previous commit, so you've only got the one commit in your history the whole time, no way to look at previous commits, because there aren't any.

Andrew Clayton • Feb 6 '20

Often --amend is exactly what you want.

Maybe you need to re-word the commit message, add a Reviewed-by tag etc... or perhaps you just need to make a code change or you forgot to git add [-p] something...

I would say I use --amend a lot more often than squashing commits.

The right tool for the job and all that...