Denys Grybov

Posted on Mar 7, 2021

On git. History

#git #teamwork

Main development
Post-review stage
Synchronization to a history
- Merge
- Rebase
- Cherry-pick
- Improvements
Cookbook

Rewriting shared git history is an anti-pattern: it affects every contributor. Feature branches, however, are not part of history. A feature branch is nothing more than a proposal to adjust history: you put your changes on top of what already exists, and until they are merged you are free to manipulate them in any way. This article is entirely about updating, reordering, and otherwise manipulating commits easily.

This is a follow-up to my first article, which explains WHY you need to keep the history clean.

Main development

Break it down

You have created a branch and are working on your task. If the work involves changing different modules, commit the changes to each module separately. The main reason is to keep a message like "Updated module A" away from files that belong to module B. It also makes the commits easier to update later.

Another good practice is to have only one commit per file, or at least per code area (function/method). Your pull/merge request has a defined scope, which most likely means one change for each code area. Exceptions occur, but let's focus on the majority of situations.

⚠️ One general note related to all future changes: after pushing your branch to some remote, you will need to use force push to update it after most of the following manipulations.
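As a minimal sketch of that note, the script below rewrites an already pushed commit and then updates the remote branch. Everything in it is a throwaway assumption made for illustration: the temporary directory, the bare repository standing in for a real remote, the branch name feature, and the committer identity. It uses --force-with-lease, a more cautious variant of a plain force push.

```shell
# Throwaway setup: a bare repository stands in for the real remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"
git checkout -q -b feature
git remote add origin "$work/remote.git"

echo "first draft" > module-a
git add module-a
git commit -q -m "Update module A"
git push -q -u origin feature

# Rewrite the pushed commit; local and remote branches now diverge.
echo "second draft" > module-a
git add module-a
git commit -q --amend --no-edit

# A plain "git push" would be rejected here. --force-with-lease overwrites
# the remote branch only if nobody else has pushed to it in the meantime.
git push -q --force-with-lease origin feature
```

The lease check is what makes this safer on a branch that a colleague might also push to: if the remote moved since your last fetch, the push is refused instead of silently discarding their work.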

amend

"Small fix" is one of the most widespread commit messages. It makes absolutely no sense: it fixes a bug that didn't exist before you started working on the feature and will not exist after you merge your change into the main branch. It is just a follow-up. And this is where you need

git commit --amend

It updates your last commit with the staged files. Instead of creating another pointless commit, you simply update the previous one.

See detailed example

Let's review a situation where you committed two files, module-a and module-b, and then realized that module-c needs to be delivered in the same commit.

0. Let's see which files the latest commit contains

git --no-pager show --pretty="" --name-only HEAD

module-a
module-b

1. You stage changes to adjust a commit

git add module-c

2. You amend the last commit

git commit --amend --no-edit

The --no-edit flag skips editing the commit message. If you want to update the message, leave the flag out and an editor prompt appears. Alternatively, you can set a new message without a prompt by adding the -m "Your new message" option.

3. Let's check what the latest commit contains now

git --no-pager show --pretty="" --name-only HEAD

module-a
module-b
module-c

Amending a commit this way changes its hash but keeps the original author date (only the committer date is updated).

If you only need to update the message of the latest commit, run:

git commit --amend -m "Brand new updated commit message"

This is useful for fixing a typo or updating a link to the ticket, or for adding a keyword that an automated process on GitHub reacts to.

See detailed example

0. Let's see the two latest commits in our branch

git --no-pager log --oneline -2
c3855f2 (HEAD -> some-branch) Fix <bug description>
b7192d4 Implement new feature A

1. Now, we need to add a link to our issue tracker

git commit --amend -m "Fix #42 <bug description>"

2. Let's see what changed

git --no-pager log --oneline -2
16f20a0 (HEAD -> some-branch) Fix #42 <bug description>
b7192d4 Implement new feature A

As you can see, both the message and the commit hash changed.

When you need to update one of the earlier commits instead, interactive rebase will help you.

reset

reset has several modes; the two that matter here are hard and soft.

A --hard reset removes unwanted changes. First, it lets you discard a wrong approach to your task. Second, you can use it to cancel a commit that was applied from another branch with cherry-pick, as long as it is still on top.
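A small sketch of discarding a wrong approach with a hard reset. The temporary repository, the file contents, and the commit messages are assumptions made up for illustration.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"

echo "v1" > module-a
git add module-a
git commit -q -m "Implement module A"

# An experiment that turns out to be the wrong approach.
echo "wrong approach" > module-a
git commit -q -am "Try another approach"

# Drop the last commit together with its changes in the working tree.
git reset -q --hard HEAD~1
cat module-a
```

Unlike a soft reset, this throws the changes away, so it only makes sense when you are sure you no longer need them.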

A --soft reset lets you add changes step by step and then remove the commits while keeping the changes themselves as a pure diff against the original main branch. That is useful for making significant changes and trying different approaches, or simply for committing "WIP", pushing the code, and continuing on another machine. With a soft reset you can commit whatever you want and then turn your last five commits into plain staged changes.

git reset --soft HEAD~5

But don't overuse this approach. The more changes you accumulate, the more work it takes to organize them properly. It's better to split commits by module or subtask while you work and then reorder and squash them using interactive rebase. My suggestion is to keep at most one WIP commit and split it into meaningful ones once you continue working on the task.
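Here is a minimal sketch of the WIP workflow with a soft reset. The repository, the file notes.txt, and the commit messages are hypothetical; three WIP commits are used instead of five to keep the script short.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Three throwaway commits made while experimenting.
for i in 1 2 3; do
  echo "step $i" >> notes.txt
  git commit -q -am "WIP $i"
done

# Remove the three commits but keep all of their changes staged.
git reset --soft HEAD~3
git status --short

# Commit the accumulated changes once, with a meaningful message.
git commit -q -m "Describe the finished change"
git --no-pager log --oneline
```

After the reset, the index holds the combined diff of all the WIP commits, so it can be committed as a whole or unstaged and split up with interactive staging.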

Staging patches

While you most likely shouldn't make multiple commits for one file within one PR, sometimes multiple commits are reasonable. For example, when you need to integrate modules A and B into C, you can commit A, then B, then C. Alternatively, you can do it in two commits: the changes to module A together with its integration into C, and the same for B. This is where you need interactive staging.

While git add module-c stages the whole file, adding the -p flag lets you decide what to do with every hunk.

git add -p module-c

Your options are:

y - stage this hunk
n - do not stage this hunk
q - quit; do not stage this hunk or any of the remaining ones
a - stage this hunk and all later hunks in the file
d - do not stage this hunk or any of the later hunks in the file
g - select a hunk to go to
/ - search for a hunk matching the given regex
j - leave this hunk undecided, see next undecided hunk
J - leave this hunk undecided, see next hunk
k - leave this hunk undecided, see previous undecided hunk
K - leave this hunk undecided, see previous hunk
s - split the current hunk into smaller hunks
e - manually edit the current hunk
? - print help

If you want to preview the changes in a file first, you can run:

git difftool

It opens your configured diff tool and shows what was changed, hunk by hunk.

This article is about using git itself, without any external GUI. Interactive staging, though, is the one git feature that may be more reasonable to use from your IDE. Keep in mind that the git tools integrated into IDEs can be less precise than the CLI when showing hunks.

Post-review stage

After a pull/merge request is reviewed, the developer gets comments. This is the time when "address review comments" commits appear. But once again, your branch is not part of history, and neither are your commits.

Your goal is to implement something: fix a bug, implement a feature, perform a refactoring. Addressing a reviewer's comment is not your goal, so there's no reason to mention the review stage at all. You are changing a product, so describe the changes to the product, relative to the existing history. There's no need to mention comments, bugs introduced in your own branch, or missed edge cases (unless the edge case is covered by a really significant independent change).

Your goal is to add a complete change to history, so there is no need to split commits into "before review" and "after review". You can quickly provide a tidy change using interactive rebase, one of your best friends in the git ecosystem.

Interactive rebase

Interactive rebase lets you manipulate existing commits: you can reorder them, edit them, remove extra ones, and squash them.

Let's see how to modify existing commits to apply changes that address review comments.

To start an interactive rebase covering the last 7 commits, run:

git rebase -i HEAD~7

Edit

If you need to make small updates, like removing logging or fixing a typo, your best option is to switch pick to edit in the interactive rebase todo list. Git then stops at each selected commit, in the original order. Once it has rolled back to a commit chosen for editing, you can make your changes and then run

git add .
git commit --amend [-m "Your new updated message"]
git rebase --continue

Keep in mind that if you make a significant change, you will probably have to resolve conflicts while git applies the following commits. That's why it is so important to touch each code block only once within a pull/merge request.
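The edit flow can be sketched end to end in a script. Normally you change pick to edit by hand in the editor that git opens; here, purely so the example runs unattended, the GIT_SEQUENCE_EDITOR environment variable points at a sed command that rewrites the first line of the todo list. The repository, file contents, and commit messages are assumptions for illustration.

```shell
# Throwaway repository with three commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"

for m in a b c; do
  echo "$m with debug logging" > "module-$m"
  git add "module-$m"
  git commit -q -m "Update module $m"
done

# Open the last two commits for rebase and mark the first listed one
# ("Update module b") as "edit" instead of "pick".
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i HEAD~2

# Git has stopped at "Update module b": fix it and amend.
echo "b" > module-b
git add module-b
git commit -q --amend --no-edit

# Replay the remaining commit on top of the amended one.
GIT_EDITOR=true git rebase --continue
git --no-pager log --oneline
```

The commit count and messages stay the same; only the content of the edited commit (and the hashes from that point on) change.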

Commit per comment

If you need to apply multiple significant changes, I suggest addressing review comments one per commit (if a comment spans several modules, split the change per module and address the comment in several commits). This lets you fold the changes into the existing commits later, without mentioning extra fixes, and leaves only meaningful and essential changes in the request's scope.

To make it easier to integrate these changes into existing commits, I suggest giving the new commits temporary names that explicitly show where each one belongs.

Squash/fixup

Interactive rebase allows not only editing commits but also reordering them. So after addressing review comments one per commit, you can place each new commit right after the original change it belongs to, and then mark the follow-up commits with squash or fixup. Both merge the changes of the marked commit into the previous one. The difference is that squash asks you to provide a new commit message for the combined commit, while fixup merges only the code changes and keeps the message of the original commit.
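Git can also do the naming and reordering for you. This is a variation on the manual approach described above, not a replacement the article prescribes: git commit --fixup creates a commit named "fixup! <original subject>", and git rebase -i --autosquash moves it right after its target and marks it as fixup. In the sketch below the repository and commit contents are made up, and GIT_SEQUENCE_EDITOR=true simply accepts the generated todo list without opening an editor.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"

echo "readme" > README
git add README
git commit -q -m "Initial commit"

echo "a" > module-a
git add module-a
git commit -q -m "Update module A"

echo "b" > module-b
git add module-b
git commit -q -m "Update module B"

# Address a review comment about module A as its own commit,
# named automatically after the commit it belongs to (HEAD~1).
echo "a, reviewed" > module-a
git commit -q -a --fixup=HEAD~1

# Reorder and fold the fixup commit into "Update module A".
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3
git --no-pager log --oneline
```

The result is the same three commits as before the review, with the reviewed change living inside "Update module A".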

Synchronization to a history

There are two main reasons to synchronize your branch with history: conflicts and improvements. Conflicts mean that you touched the same code area as someone whose changes have already landed in history. Improvements mean that you need functionality in your branch that was recently added to history: helper functions, build optimizations, a refactoring that improves performance. Here are the most useful ways to do it.

Merge

A branch can be integrated in one of two ways: with a merge commit or with a rebase. A merge commit integrates a branch into a target branch by creating a commit on top of both. It lets you resolve all of the conflicts at once. A merge commit is a perfect result of a pull/merge request.

To merge the develop branch into yours, run:

git merge develop

Or you can merge it directly from a remote:

git pull origin develop
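To see what a merge commit looks like in the log, here is a small local sketch. The repository, the branch name feature, and the files are assumptions for illustration; develop is the branch name used above.

```shell
# Throwaway repository with a develop branch and a feature branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"
git checkout -q -b develop

echo "a" > module-a
git add module-a
git commit -q -m "Add module A"

git checkout -q -b feature
echo "b" > module-b
git add module-b
git commit -q -m "Add module B"

# Meanwhile, develop receives a new helper.
git checkout -q develop
echo "helper" > helper
git add helper
git commit -q -m "Add helper"

# Bring develop into the feature branch with a merge commit.
git checkout -q feature
git merge -q --no-edit develop
git --no-pager log --oneline --graph
```

The graph shows a new commit with two parents on top of the feature branch, which is exactly the extra entry that rebasing avoids.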

Rebase

While creating a merge commit is a great final step of working in a branch, bringing changes from the main branch into your feature branch can be done in a more sophisticated way: rebase. Rebase applies all your changes on top of another branch. It means that running

git rebase develop

takes the current "develop" branch and applies all your locally created commits on top of it, one by one. It also means that if there are conflicts in some code area, you have to resolve them every time git applies a commit that touches that area. This is why it's crucial to have only one commit for every line of code and not to override it in multiple commits.

You can also rebase your branch onto the remote version of a branch using:
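The same situation can be sketched locally with rebase instead of merge. As before, the repository, the feature branch, and the files are hypothetical and exist only for the demonstration.

```shell
# Throwaway repository with a develop branch and a feature branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"
git checkout -q -b develop

echo "a" > module-a
git add module-a
git commit -q -m "Add module A"

git checkout -q -b feature
echo "b" > module-b
git add module-b
git commit -q -m "Add module B"

# Meanwhile, develop receives a new helper.
git checkout -q develop
echo "helper" > helper
git add helper
git commit -q -m "Add helper"

# Replay the feature commits on top of the updated develop.
git checkout -q feature
git rebase -q develop
git --no-pager log --oneline --graph
```

The log is now a straight line: "Add module B" sits directly on top of "Add helper", with no merge commit in between.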

git pull origin develop --rebase

Merging lets you resolve all conflicts at once, which makes it more attractive at first glance. But a merge commit in the middle of your branch blocks you from manipulating your local changes, while our goal is to keep the history clean.

Cherry-pick

Cherry-pick is an excellent command that applies a commit from another branch on top of yours. With cherry-pick, you can take an important update that exists in another branch without pulling or merging the whole branch, which keeps your branch's scope relevant. So right after someone pushes a change, you can apply that commit on top of your branch.

git cherry-pick <commit-hash>

The problem that cherry-pick solves is "Hey, if you need my change, just copy it from my open PR".
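A short sketch of that situation: a colleague's branch contains one commit you need and one you don't. The branch names colleague and feature, the files, and the commit messages are all made up for illustration.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # hypothetical identity
git config user.name "Dev"
git checkout -q -b develop

echo "a" > module-a
git add module-a
git commit -q -m "Add module A"

# A colleague's branch: one useful commit, one unrelated commit.
git checkout -q -b colleague
echo "faster build" > build-config
git add build-config
git commit -q -m "Speed up build"
echo "unfinished" > module-x
git add module-x
git commit -q -m "Start module X"

# Your own branch, started from develop.
git checkout -q develop
git checkout -q -b feature

# Take only the build improvement, not the whole branch.
fix=$(git rev-parse colleague~1)
git cherry-pick "$fix"
git --no-pager log --oneline
```

Your branch now has the build improvement as its own commit (with a new hash), while the unfinished module X work stays out of scope.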

Improvements

While working on a task, a developer may do some refactoring, either of a task-related module or of a more common one. It can also be a fix or improvement that helps the rest of the team, like speeding up the build process. It's good to put such modifications into a separate commit and share it with the other contributors.

The first and most straightforward way to share such changes is cherry-picking. After pushing your changes, you can describe the improvement in a chat and share the commit hash. Alternatively, you can create a separate branch and push the improvement there, but only if that change will soon be merged into the main branch. Otherwise, there's no need for a separate branch.

Cookbook

I created a gist to collect short step-by-step guides to keeping history clean. Please make it better by suggesting your own tutorials.
