DEV Community

Cover image for Git Hygiene
Valerii Udodov
Valerii Udodov

Posted on • Originally published at valerii-udodov.com

Git Hygiene

git

Not by coincidence Git has become a default version control system for many of us. Git has proven itself to be predictable and reliable. Especially when it comes to conflict management, history traversal, and other daily developer routines.

I have collected some recommendations that will enhance your day-to-day experience of working with Git. And hopefully, reveal some hidden features that have been laying around all this time.

This article is not an ultimate Git guide by any means. Rather it is a set of common sense practices for using Git effectively.

Branching

Strategy

There are ultimately two main branching strategies feature branches and trunk-based development. Everything else is just a variation of one or another. The biggest distinguishable difference between them is a scope of change that gets merged into the trunk branch (aka master, main, core, mainline).

Feature Branching is a poor man's modular architecture, instead of building systems with the ability to easily swap in and out features at runtime/deploytime they couple themselves to the source control providing this mechanism through manual merging.
Dan Bodart

Use trunk-based development when possible. I won’t argue about the advantages of CI/CD, at least not in this article. I’ll just say that as the name suggest you can’t have CI/CD without Continuous Integration (CI). And by definition of the CI, you can’t have it without trunk-based development. Hence you can’t have CI/CD without trunk-based development.

ℹ️ Trunk-based development is a version control management practice where developers merge small, frequent updates to a core “trunk” or main branch…
Atlassian Trunk-based development definition

ℹ️ Continuous integration is a DevOps software development practice where developers regularly merge their code changes into a central repository, after which automated builds and tests are run. Continuous integration most often refers to the build or integration stage of the software release process…
AWS CI definition

The most reasonable alternative to trunk-based development is Gitflow. Given all the modern tooling for automation of testing, deployment, and releases Gitflow happens to be useful, but rarely. Deployment Hell that will unavoidably follow using Gitflow will likely cross any benefit you may think of.

Trunk-based development

trunk-based development
The diagram above was inspired by LeRoy's illustration

Unfortunately trunk-based development is not always available right away. As the name suggests it is a process. Development process. And it has certain entrance criteria.

  • It should be possible to develop in small batches. It may be not only the developer’s prerogative as the work scope depends on planning. If the work is sliced in small chunks during the team planning then developing in small batches is a no brainer

  • The code review process should be balanced.

    • Avoid overly heavy code-review process. Otherwise, developers will attempt to squeeze in as much work as possible to minimise the number of code reviews they have to go through.
    • Somewhat opposed to the previous point, don’t make it a box-ticking exercise
    • Code review is work and has to be recognised as such. It should be assigned and executed consciously.
    • Code review should be a priority. Active branch lifetime should be minimised. Hence merged asap. So if the code is ready it should be reviewed asap.
  • The trunk branch should always stay healthy. Trunk branch health is an integral aspect of team performance. Thus there should be no way to break trunk branch.

  • Reviews, automated tests, and builds should be exhaustive and efficient.

Naming

The most common use case for branch names is when we are looking for a specific branch to checkout. Consequently, the more discoverable branch is, the better.

Usually, we know either the ticket number or the gist of work that has been (or will be) done there. However, we rarely know the exact name. It is often a good idea to incorporate a ticket number together with a distilled summary of the work. E.g. 123654-form-buttons-styling-uplift or fixing-query-parameters-encoding-bug-344400

Clean up

Another pretty obvious one, but often overseen. Once the branch is no longer needed, it must be removed. The lower the number of obsolete branches, the better discoverability of active branches. Nowadays it is very easily automated task

Commits

Message

I like to think about a branch as a story that the developer tells about the work that has been done. Where the branch name is the name of the story. And every commit message is a sentence that builds the story. The frequency of commits and the content of commit messages should make sense in the context of the story you are trying to tell and should somewhat follow one another. Writing a commit message should be an intentional process, not a side-effect of using Git.

📌 Commit process must be seamless. Try to eliminate all possible obstacles.
Make sure you can commit from your IDE, and memorise the shortcut. Remove all unnecessary commit hooks, especially long-running ones.
Commit should not feel like a burden, it should feel like saving a file.

Size

You should commit often, therefore the size of each commit must be reasonably small. Base your commits around logical units of change. As long as you do that, the commit message will be accurate and obvious. Make sure you don’t get carried away as commit messages must stay concise, remember it is just a sentence in the story, not the story on its own.

Commit History

We have slightly touched base on the commit history before. Depending on the scope and the context we will be perceiving and using commit history differently.

ℹ️ To avoid any confusion I will call main branch that all the work is based on trunk branch. And any branch that is based from the trunk branch but not yet merged back in: Work in Progress branch or WIP branch.

In the context of the WIP branch (e.g. if you are reviewing someone's work) we are interested in work that has been done in the scope of a given ticket/task/story, hence the more details provided the better.

However, when we are looking at the trunk branch we are looking at the history in the scope of the project as a whole. And hardly there will be much value in a very granular view.

WIP Branch History

Generally ambiguous commit history on this level is a side-effect of poorly planned work. You must know at least in broad strokes how you are planning to tackle the task. As long as the work is reasonably planned using commits to craft a well-written story will be a natural and effortless act.

Take a look at 12332-uplifting-checkout-form branch history

12332-uplifting-checkout-form
- introduced a theme provider
- updated buttons to use the theme provider
- updated text fields to use theme provider
- removed redundant info banner
- fixed confirmation text typo
- addressed review comments
Enter fullscreen mode Exit fullscreen mode

It tells a story, isn’t it? By looking at the branch and commit messages you can see what has happened, if you require more details you can get down to the code level. Try to avoid any code specifics in the commit message, variable names, classes, etc., anything that will be anyway visible on the code level.

Ideally, you should be able to review what has been done by only looking at the commit history, although I would not recommend doing so.

Trunk Branch History

The most often use case for commit history on this level is during the process of figuring out wtf have happened with this code…

🙃 git blame ruining friendships since 2005

I think this will sound familiar to you... You are working on a ticket and you stumble across a piece of code that you need to use/change/refactor. It works somehow but looks rather confusing and illogical. The code has been there for months or perhaps years. It has been written by someone who left the company ages ago... And yet there you are one on one with it…

💡 What is worse than no documentation? Wrong documentation.
Comments explaining how code works are no better than documentation. Both get stale very fast. However, comments not only are rarely updated but often get copied all over the code base, completely losing even remote value.

The biggest advantage of having fluent and articulative history is the ability to time travel and answer the very important question of why? something has been done in a certain way. You can go to “day 1” and see how this whole space ended up where you found it. If you are lucky enough you will find ticket numbers attached to merge commits, so you’ll be able to get even more context.

💡 Mark important events (such as releases and versions) in the history with tags. You can use annotated tags to store some extra metadata.

Merge Strategy

In most cases, I’d suggest using Squash Merge as a trunk branch default merge strategy. Squash Merges will clean the trunk branch history from unnecessary noise that otherwise will leak from WIP branches history. Squash Merges will protect trunk branch integrity and concision. It will keep the history linear, which is much easier to read and rebase on.

When all the work has been done and changes have been reviewed. You are one click away from the merge, it is hard to find a good reason to preserve the WIP branch history anyway.

Merge Conflicts

While you are working on your WIP branch you shouldn’t be overwhelmed with conflicts. If you are, it is typically a consequence of sloppy work planning and mediocre orchestration. The WIP branch lifetime should be short.

However, if you encountered merge conflicts… Rebase. As said before linear history is much easier to read and comprehend. Hence much easier to traverse while traversing the history through the git blame

💡 Before rebasing make sure you know what has changed since you started your work or bring someone who knows. Otherwise, there’s a high chance you will end up with a broken WIP branch.

Bottom line ___

I have tried to distill recommendations and keep them as generic as possible, avoiding any specifics related to the Git VCS provider, automation tools, or company processes. Said that given any specific tool, any of these practices or recommendations can be (and should be) improved significantly.

Hence the final advice: don’t keep your process agnostic to tools, rely on them and use them to the maximum extent.

Top comments (3)

Collapse
 
dbolser profile image
Dan Bolser

Can't see how to cheer articles, so... Yay!

Collapse
 
jnareb profile image
Jakub Narębski

I disagree with some of recommendations here.

First, trunk-based workflow has the disadvantage that for more involved features, that need to be created in small incremental steps, requires that you incorporate version control tasks into your application in the form of feature toggles.

This is the opposite problem of feature branches done wrong, where branching substitutes for modularity.

In summary, both trunk-based workflow and feature branches have their advantages and disadvantages.

Second, I strongly disagree with the suggestion of squash merges. Unless you have a very messy history, I recommend against their use; and if you have messy history, interactive rebase before merge is your friend.

Squash merges lose information, and lead to large commits - something that you recommend against. While arguably that is not the problem for the code review process, which should be done before the merge, it would mean that browsing through history, and using git bisect to find bugs will be less effective.

If complex history is a problem, use a better viewer. git log (and thus also gitk) has --first-parent option for a reason.

Collapse
 
vudodov profile image
Valerii Udodov

Hey Jakub,

Thanks for a great comment, appreciate your input.

I didn’t want my suggestion regarding the trunk-based development to sound anything like “one size fits all”. I can see scenarios where feature branching for long-running projects is… I wouldn’t say preferable, rather unavoidable.

Nevertheless, I wouldn’t necessarily consider feature-toggling a disadvantage. And even if I will, I would still recommend using trunk-based development whenever it is possible.
With all its possible advantages (which I personally wasn’t fortunate enough to experience in my practice), due to its nature: feature branching will require merging orchestration and extra testing efforts afterwards every release… Which overweights any possible advantage feature branching has to offer.

Regarding squash-merge, I can see why you wouldn’t want to use it with feature-branching. However with trunk-based development there’d very low (if any) advantage in preserving history of every branch, but the noise created would be rather disturbing.

All in all I guess we always reflect on our own experience. And I think a lot would be added to it by processes, release requirements and expectation that surrounding development routine.