Imagine this: you've been paged to investigate a production incident, and after some digging, you identify the commit with the breaking code. You decide to revert the change:
$ git revert 1a2b3c
Unfortunately, in doing so, a new bug is introduced! As it turns out, hidden in that old "broken" commit was some code that another part of the app depended upon, and when you reverted those lines, it left the site once again in a broken state. Oh dear.
How can situations like this be avoided? To answer this question, we first need to examine how these types of commits come to be.
A Common Git Flow
Let's take a look at a common git flow when building a new feature:
- Create a new branch off of main.
- Create commits as you go to save your work and fix bugs you find along the way.
- When the feature is complete, make a pull request.
- Merge branch into main once PR is approved.
This flow might feel quite familiar to you, and that's fine. It's how many of us were taught to work with git. However, there are two problems with this approach. The first problem, which we've already discussed, is that some commits may contain incomplete work when you simply commit as you go. This makes reverting quite risky.
The second problem is that it can make reviewing pull requests very tedious. For example, what if you've been asked to review a recent PR in which the author states that, on top of adding a new feature, they fixed an unrelated bug as well? The PR consists of changes across dozens of files. Looking at each commit individually does not reveal which changes pertain to the bug fix and which are for the new feature. Additionally, you notice some changes that seem unrelated to anything in the description of the PR. Clearly, this will not be a quick review.
Now, as lovely as it would be for each commit to neatly relate to only one change, that's a tall order to fill when you're in the midst of development. Tangents and rewrites are just part of the process. Our work is rarely so linear, and our git commits tend to reflect this.
So how can we guarantee that our git history is tidy and easily reviewable while also accepting the somewhat tangential nature of development? By modifying this basic git flow just slightly, we can create a better process that accomplishes just this.
An Improved Git Flow
The following approach was inspired by my coworker, Dan Wendorf, whose git flow tends to revolve around one core principle: do the work first, clean up the commits later. The benefit of this flow is that it separates the engineering work from the commit writing. In the end, we'll be left with a sequence of commits that are logically grouped, each relating to one main change in the code, thus cleaning up our git history and paving the way for a quicker PR review.
We can break it down into three steps, as follows.
Step 1: Make your changes
The first step isn't too different from before. Start by creating a new branch and getting to work on making your changes. Don't worry too much about writing descriptive commit messages just yet, as these won't be included in your final PR. For now a simple "work in progress" or "WIP" message will do, or something that will help you remember what was in that commit, like "WIP: Started building new model". The purpose of these commits is to make sure you don't lose work and to provide some general guideposts along the path of that work.
$ git checkout -b my-feature-branch
...make changes...
$ git commit -m"WIP"
...make more changes...
$ git commit -m"WIP"
...make even more changes...
$ git commit -m"WIP"
In this step, it's okay to leave the codebase in a broken state or to commit half-baked features. This will all get cleaned up later.
Step 2: Reset
Once you've finished making your changes, it's time to prepare your work for some "git clean up." To do this, we'll run the following command:
$ git reset origin/main
Without any extra arguments, git reset won't change the working tree, so your code won't change: all the work you've done will still be there. But because you've reset to an older commit, git status will show all the changes you've made since you started building your feature. It will look like you did all the work but never made any of those "WIP" commits earlier.
$ git reset origin/main
Unstaged changes after reset:
M src/components/Footer/Footer.tsx
M src/components/Nav/Nav.css
M src/components/Nav/Nav.tsx
M src/components/Posts/Post.tsx
M src/components/Posts/PostList.tsx
$ git status
On branch feature-branch
Your branch is behind 'origin/feature-branch' by 3 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: src/components/Footer/Footer.tsx
modified: src/components/Nav/Nav.css
modified: src/components/Nav/Nav.tsx
modified: src/components/Posts/Post.tsx
modified: src/components/Posts/PostList.tsx
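One thing to watch for: the command above assumes origin/main hasn't moved since you branched. If it has, resetting to it makes other people's newer changes show up in git status as though you had undone them. Here is a minimal sketch of one way around that, resetting to the commit where your branch forked instead. It runs in a throwaway repository, with a local main standing in for origin/main; the file names and the base_commit variable are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Dev"
git config commit.gpgsign false

# Initial commit on main (stand-in for origin/main).
echo "nav { color: blue; }" > nav.css
git add nav.css
git commit -q -m "Initial commit"
git branch -M main

# Feature branch with two WIP commits: one edit, one new file.
git checkout -q -b my-feature-branch
echo "nav { color: darkblue; }" > nav.css
git commit -q -am "WIP"
echo "footer" > footer.txt
git add footer.txt
git commit -q -m "WIP"

# Meanwhile, main moves ahead with an unrelated change.
git checkout -q main
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "Unrelated change on main"
git checkout -q my-feature-branch

# Reset to the commit where the branch forked from main,
# rather than to wherever main happens to be now.
base_commit=$(git merge-base main HEAD)
git reset -q "$base_commit"

status=$(git status --porcelain)
echo "$status"
```

In this sketch, git status lists nav.css as modified and footer.txt as untracked. Files that were first created in a WIP commit come back as untracked after the reset, so remember to git add them when you build your new commits.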
In case you get in over your head here, don't worry: you can always bring back your original commits! Every commit you make lives in your .git folder, even after a reset. Even though it might seem like they've disappeared, they're still there, hiding.
If you want to go back to a commit where things weren't broken, git reflog will show you a timeline of every commit you've referenced in your local repository, even across branches. Run git reflog to find the commit you want to return to and then run git reset <commit-sha>. This command will point the HEAD of your current branch to that commit, and you're back in business!
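As a rough sketch of that recovery, the script below makes a WIP commit in a throwaway repository, resets it away, and then uses the reflog to get it back. The file name and variable names are hypothetical, and a local main stands in for origin/main. HEAD@{1} is git's shorthand for "where HEAD pointed one move ago", which here is the WIP commit from just before the reset.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Dev"
git config commit.gpgsign false

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M main

# One WIP commit on a feature branch.
git checkout -q -b my-feature-branch
echo "v1 plus feature work" > app.txt
git commit -q -am "WIP"
wip_tip=$(git rev-parse HEAD)

# The Step 2 reset: the WIP commit is no longer on the branch...
git reset -q main

# ...but the reflog still remembers it.
git reflog -n 3

# Point the branch back at the commit HEAD referenced before the reset.
previous=$(git rev-parse 'HEAD@{1}')
git reset -q "$previous"
restored_tip=$(git rev-parse HEAD)
```

In a real repository you would more likely read the SHA off the git reflog output and pass it to git reset yourself; the HEAD@{1} shortcut only works when the reset was the most recent thing you did.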
From here, we're ready to start making our new commits.
Step 3: Create new, logically-grouped commits
Now, take a look at all the files you've changed. Are there any that you can logically group? For example, all the dependency updates or changes related to a particular model. There's no "right" way to group files, so use your best judgment here. Add these files to your staging area, and make a commit describing the changes.
$ git add src/components/Nav/Nav.css
$ git add src/components/Nav/Nav.tsx
$ git commit -m"Added new styles to navigation"
$ git add src/components/Posts/Post.tsx
$ git add src/components/Posts/PostList.tsx
$ git commit -m"Updated author images on posts"
$ git add src/components/Footer/Footer.tsx
$ git commit -m"Fixed responsive bug in footer"
If you haven't changed many files, you might not need more than one commit, but we can often make our pull requests much easier to review by splitting up our changes into human-readable, easy-to-follow commits.
What if the same file contains multiple changes that should be grouped separately? It's possible to stage part of a file using git add --patch (or git add -p). Some code editors also provide a way to stage a range of changes rather than a whole file.
Be mindful of not leaving your codebase in a broken state during this step. Remember, a huge reason we're cleaning up our commits in the first place is so that nothing will break if we ever want to revert our changes. After making one of these new commits, you can git stash the rest of the unstaged changes and test that everything's still in working order. If you realize you should have included another file in that commit, you can git stash pop to bring back the other changes, git add the missing file, and perform a git commit --amend. This command replaces the last commit with a new one that keeps the same description and contains both the old commit's changes and the change you just added.
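The stash, test, and amend loop might look roughly like the sketch below. It uses a throwaway repository with two made-up files, and it passes --no-edit to git commit --amend so the existing message is reused without opening an editor; that flag is my addition, not something the steps above require.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Dev"
git config commit.gpgsign false

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# Two files changed, but only one makes it into the new commit.
echo "a, updated" > a.txt
echo "b, updated" > b.txt
git add a.txt
git commit -q -m "Update a and b"

# Set the leftover changes aside so only committed code is on disk.
git stash -q
clean_after_stash=$(git status --porcelain)

# ...run your test suite here and notice that b.txt was needed...

# Bring the leftovers back and fold the missing file into the last commit.
git stash pop -q
git add b.txt
git commit -q --amend --no-edit

git log --oneline
```

Note that a plain git stash leaves untracked files in place. If some of your remaining changes are brand-new files, git stash -u will set those aside as well.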
The Final Result
Once you've split your work into logically grouped commits, you're ready to create your pull request! The final result is a set of changes that your colleague can review one commit at a time in manageable chunks.
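One practical wrinkle: if you pushed your WIP commits before resetting (the git status output in Step 2 suggests the branch had been pushed), a plain git push will be rejected, because the branch's history has been rewritten. A hedged sketch of one way to handle this follows, using git push --force-with-lease, which refuses to overwrite the remote branch if someone else has pushed to it since you last fetched. The bare repository below is a local stand-in for a real remote, and the file name is hypothetical.

```shell
#!/bin/sh
set -eu

# A local bare repository plays the role of the remote.
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "dev@example.com"
git config user.name "Example Dev"
git config commit.gpgsign false
git remote add origin "$work/origin.git"

echo "nav { color: blue; }" > nav.css
git add nav.css
git commit -q -m "Initial commit"
git branch -M main
git push -q origin main

# Step 1: WIP commit, pushed so the work is backed up.
git checkout -q -b my-feature-branch
echo "nav { color: darkblue; }" > nav.css
git commit -q -am "WIP"
git push -q -u origin my-feature-branch

# Steps 2 and 3: reset, then recommit with a descriptive message.
git reset -q origin/main
git add nav.css
git commit -q -m "Added new styles to navigation"

# History was rewritten, so the push must be forced; the lease check
# makes git refuse if the remote branch changed behind your back.
git push -q --force-with-lease origin my-feature-branch

local_tip=$(git rev-parse HEAD)
remote_tip=$(git rev-parse origin/my-feature-branch)
```

Only force-push branches that you alone are working on; on a shared branch, rewriting history affects everyone who has pulled it.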
The benefit of this git flow is that it allows for the fluidity of typical development while also providing some much-needed order to maintain the repository's history.
Top comments (48)
Cleaning up your local commit history before pushing is the norm in any half serious project. Happy to see you introduced the concept to so many new people!
However (and feel free to ignore):
Using the git reset command for this purpose is like cutting your toenails with a lawnmower whilst blindfolded. This job would be so much faster, easier, and far less risky with git rebase instead. No thumping headache or teetering on the edge of insanity afterwards either!
Use fixup/amend commits to modify changes you've committed earlier (but have not yet pushed). These commits receive a special mark that Git will look for during a rebase with the autosquash option. Git will figure out how to get these changes in with the rest, and in such a way that no one will ever even know they happened.
No matter how bad or embarrassing your brainfarts ever get, Git will be there to help you cover up, forget, and move on. A friend like none other :)
Another one of rebase's neat tricks is its interactive option. It'll present you with a list of your recent local commits and let you rename/reorder/undo/or otherwise modify them as you see fit. All together in one go. Git will even warn you if your musical chairs session would result in conflicts, allowing you to abort and puzzle some more to avoid unnecessary merge commits.
Just like that you're done, no weird out-of-branch experiences, no worries about rewriting history, caught up that old Git again, got your commit in a state that's actually somewhat presentable, and all that within a fraction of the time.
Git Commit docs here: git-scm.com/docs/git-commit
Git Rebase docs here: git-scm.com/docs/git-rebase
Git grokking :)
git rebase can save time in many situations, but it won't help much if your rebase plan looks like this:
Some real-world story…
I was debugging to find out why my Git deployment pipeline failed, so I had to make a lot of small tweaks to the pipeline script, commit, and push to see if the CI build was working again. Since frustration was rising, having a quick feedback loop was very important, and so any extra cognitive load had to be reduced.
Composing a commit message required context switching between "problem-solving mode" and "reflecting-and-summarizing mode", which incurs a significant load on my brain, so it had to be skipped. In the end my Git log looked like this (this is from an actual project):
In this case, to clean up the history, git reset would make it easier. I don't think that git reset is dangerous or risky at all: it never touches the working directory, and if I mess up, git reflog always has my back. (git reset --hard, on the other hand, is a whole other story.)
This reached a state so far beyond a messy Git history that it turned into a work of art. Linus Torvalds would be proud!
I agree that the reset tool would have been a solid choice here. Comparing diffs/patches works well if you have only a few unknowns. Looks like you had slightly more to cover here..
My first thought too was "why not rebase?" but, and maybe I'm just doing it wrong, I've always had trouble if I want to change a commit, or if I want to put only some of the changes from one commit into another. I like this reset approach as an option at least to get my commit history the way I want.
If you start modifying commits during a rebase (which you likely will), then you run into the potential for conflicts, especially if you are reordering.
If you get into such a situation and don't understand it, abort:
$ git rebase --abort
Take smaller steps during the rebase. Move one commit (only a few commits back). Make only one change. Then rebase again for more changes.
Your bio checks out, good advice to take an incremental approach here!
Using fixup and amend commits won't eliminate these situations completely, but they'll definitely help towards keeping track of the commits without having to dive into their individual diffs.
Author here, hello!
I didn't mention it in the article, but I also use rebasing quite often! Especially to keep my branch up to date. You can also use interactive rebasing to achieve the same goal; it really just comes down to what works for you. Using rebase -i is helpful if you know that at least some of your commits can be left intact, or don't need as much modifying. In fact, once you get comfortable enough with it, using rebase -i can be even faster than doing a full reset, because you're not having to re-write as many commits.
That said, staying focused on development while also thinking about "Wait - should this work be in a separate commit?" requires some mental task-switching, which can be taxing on your working memory if you have an overactive neurodivergent brain like I do. I've found that the method described in the article frees up so much mental bandwidth because I can hyperfocus on one goal at a time, be that writing code, or organizing my commits.
If the goal is ultimately to have a sequence of clear, concise commits that are easy to review (or revert), then it really doesn't matter which method you choose. :)
Absolutely do what works best for you. As long as your pushed changes fit the branching strategy you're golden. Well, and don't waste too much time putting back together the history you *just fixed* of course.
I'll try to get you an example sometime next week. You might not be sold on it, but at least you'll know how it works!
Totally agree with that comment.
Notice some good stuff you may also use:
- git reset HEAD^ in order to reset the last commit (useful to split a commit in the middle of a rebase). Note that some IDEs provide built-in tooling for that.
- git push --force-with-lease instead of --force; that's gonna check if anyone pushed in the meantime (they should not have, but I consider it good practice since it will warn you).
I agree. Rebase is better option!
Very nice post. I use this process as well.
The only step I sometimes do differently is: Before resetting my feature branch, I copy it. I branch off the latest commit of my feature branch into a "temp" branch. Then I follow your steps.
However, if I mess up afterwards, I can just reset my feature branch again and fast-forward merge my temp branch into my feature branch, bringing it back to the original commits before I reset anything. This means I don't have to mess around with the reflog, which can be quite complicated.
Thanks and keep up the good work!
Instead of copying, you could make an empty commit (and tag it, maybe), so you would have the exact point-in-time where to go back if needed. Could be something like:
Your branch-copy approach is interesting as well, because you can just checkout to the backup branch and go back to where you wanted.
nice write up, it is a lot of reading though... lol. I follow a very similar process myself, adhering to the traditional "commit small, commit often" concept as closely as possible.
if I end up resolving unrelated bugs while working on a feature branch, I normally do the following (assuming a "single main branch" workflow, using a "fast-forward merge only" strategy with no merge commits):
- general-fixes from master
- feature-x
- feature-x onto general-fixes to verify things
- general-fixes into master
- feature-x
..quite often this happens multiple times a day, so it doesn't take long for it to become habitual.
..and I apologise ahead of time, but I just need to rant a little bit here: I really, really, really wish I could work with people that understand and appreciate Git. :(
one of my work colleagues is of the opinion that rebasing branches after cherry picking commits actually creates duplicate commits.. but he also manually synchronises entire folders of code between multiple projects, and refuses to use any code formatting tools.. so, meh. :insert_epic_facepalm:
I really understand your rant, since I always went through this. But I try to advocate for git and better git flows as much as possible.
I love to commit often, too. I am a branch addict and create sub branches for branches I'm working on, only to (try to) better organize my job. But this git reset approach really seems to work for me. Can't wait to try this soon!
As a branch addict, I always create WIP branches only to check them out later, some specific file that worked and so on. Again, with this git reset approach I believe this will not be necessary anymore.
I am learning to code and I think this post will be a great solution for future projects I am a part of. Thanks!
Glad you found this helpful!
If the goal is to end up with a series of well organized, self-contained, logically sensible commits, doing it while you are coding them, with fresh memory of what is for what, is still better than accumulating it all until the end, starting over and trying to sort through a big pile of changes.
Remember when you are wrapping up a feature branch, you have just gone through a lot of stress of coding, and are likely under a lot of pressure to deliver. At that point, I doubt many people would have the appetite to sort the 74 diff blocks in that one file into several separated commits. They would just lump them all up into one big commit.
This is a really interesting approach which I hadn't considered. Thank you. Reviewing some responses, it seems that git shares a problem found in other tool advocate camps: righteous zealots suggesting there is one single right way to do things. One of my favorite things about git is that there are so many different ways to do things, permitted by the many different commands available. This flow that you proposed is interesting and I could see it being useful, but I don't have to automatically jump on the bandwagon of saying this is how it ought to be done, nor do I think that you were suggesting so. Similar to the many commands that git offers, this workflow is one of many workflows that might make sense at any given point, alongside others at different times, in different branches, and in different projects.
Hi Annie
Thank you for the short nice article. It is very helpful.
One thing you should do to avoid misunderstanding and keep your article as simple as it is:
keep the name of the branch as it is.
At the beginning of your article you create my-feature-branch
But later, when you check the status after the reset, the branch name is different:
This is confusing and will make newbies less confident.
It would be a pity if inexperienced people got confused
Thank you for sharing :-)
According to my well-working standards, my git-workflow is as follows (let's take repoA as my project's repo, and local as local laptop/work platform):
Make changes, then
Terrible idea.
It can be used as "last chance" when you totally messed up your commits but you should never use it for normal process.
Just commit often (it's your branch after all) and use "git squash" or "git rebase -i" before creating pull request.
If you made a mistake during squash/rebase: "git reflog -> git reset" and try again.
We squash merge PRs
why squashing merge PRs? Simply merging is perfectly fine.
Probably because he doesn't want to "pollute" the destination branch with all the detailed and sometimes maybe incomplete commits. So, only one commit containing all the previous commits as comments may be better for his needs :-)