James Nesta

Posted on Jul 28

How Should You Use an Auto-Formatter?

Introduction

Many code projects use an auto-formatter (like Prettier). Auto-formatters are fantastic, because they save developers an enormous amount of time and keep the codebase consistent. The rest of this blog assumes that you want to use an auto-formatter.

But at what point should auto-formatters be applied to the code? There are 3 stages to choose from, and you can choose one or more of them:

On Save - The code editor (IDE) auto-formats the code as soon as a file is saved.
On Pre-Commit - A pre-commit hook is used to ensure that all code is formatted before a commit.
In CI (Continuous Integration) - A robot auto-formats the code as soon as a commit is pushed to the remote repository (using GitHub Actions, a GitLab runner, etc.).

Different code projects use different stages. But what combination is best? This blog will discuss the tradeoffs involved.

Goals

In this analysis, we want to optimize for developer experience (DX). In other words, we want our developers to be as productive as possible. At the same time, we still want to ensure that bad code does not slip through the cracks.

Time Before Discrepancy Discovery

One way to think about DX is to frame it in terms of "the amount of time that passes before things become auto-formatted". Or, more generally, "the amount of time that passes before discrepancies between the written code and the correct code are discovered".

In general, we want to minimize this time, because finding discrepancies as soon as possible allows us to fix those problems and iterate faster. Let's consider the three stages of auto-formatting checks with respect to time-to-discovery. We can plot them like so:

(Chart: time-to-discovery for each stage, from shortest to longest: on save, on pre-commit, in CI.)

In this framing, we see that "on save" is the best.

Formatting on Save vs Formatting on Type

Auto-formatting on save has a low time-to-discovery, because developers typically save a file once they are done modifying it. This means that the developer can immediately see what the auto-formatter did to their code, if anything. But it is worth mentioning that there is a hypothetical option that is even faster: auto-formatting on type. In other words, you could imagine that the developer's IDE is configured such that the auto-formatter is summoned after every keystroke, instead of just being summoned on save. (This would be a dot on the extreme left side of the above chart.)

It may go without saying, but this option would not actually result in a better DX. Since auto-formatters move code around, having your cursor jump all over the place as you type would be confusing and would probably reduce your productivity. And since auto-formatted code has an identical AST, the developer does not really gain anything by seeing the formatting changes that early in the process.
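To make the "identical AST" point concrete, here is a minimal sketch. It uses Python's built-in ast module as a stand-in, since Prettier itself works on JavaScript/TypeScript; the sample function and the same_ast helper are made up for illustration.

```python
import ast

# Two spellings of the same function: one messy, one tidy.
UNFORMATTED = "def add( a,b ):\n    return (a+b)\n"
FORMATTED = "def add(a, b):\n    return a + b\n"


def same_ast(left: str, right: str) -> bool:
    """Return True when both sources parse to the same syntax tree."""
    # ast.dump() leaves out line and column attributes by default,
    # so only the structure of the code is compared.
    return ast.dump(ast.parse(left)) == ast.dump(ast.parse(right))


print(same_ast(UNFORMATTED, FORMATTED))  # True
print(same_ast(FORMATTED, "def add(a, b):\n    return a - b\n"))  # False
```

Whitespace and redundant parentheses disappear at the parsing stage, which is why a formatter can rewrite them without changing what the program does.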

Manual Formatting vs Auto-Formatting on Save

Some developers do not like formatting on save and instead prefer to explicitly format the code with a hotkey. In this workflow, they would have to press both the formatting hotkey and the save hotkey once they are finished modifying a file. But this workflow has a failure mode: what if they forget to format the file? Now, formatting errors can leak into the codebase. The beauty of auto-formatting on save is that it avoids this failure mode.

We could also ask: what is the benefit of this extra, intermediate step? Since auto-formatted code has an identical AST, there is no real reason to ever want unformatted code on disk. In weird situations where you really don't want the auto-formatter to be summoned (like when troubleshooting with whitespace), it is trivial to use the "Save Without Formatting" feature of the editor (which is mapped to Ctrl+K, then Ctrl+Shift+S by default in VSCode).

For this reason, I think all developers should be encouraged to use auto-formatting on save. But if a developer really does not like auto-formatting on save, that's okay too, because repositories should also be checking for formatting in CI to ensure that nothing slips through the cracks. (More on that later.)

Formatting on Save vs Formatting on Pre-Commit

Both auto-formatting on save and auto-formatting on pre-commit accomplish the goal of preventing unformatted code from reaching the remote repository. So which is better?

Tool Visibility - Formatting on Save Wins

Auto-formatting on save is nice because the developer gets to immediately see what the auto-formatter did, if anything. But when auto-formatting on pre-commit, the changes are performed in an invisible fashion.

Realistically, since auto-formatted code has an identical AST, this distinction does not matter much. (Meaning that the developer probably won't care about formatting-related-changes in either context.) But there's an argument for having tooling be as transparent as possible, because it helps us understand what it is doing. This is especially helpful when the tooling fails and you need to debug it!

Installation Automation - It Depends

Turning on auto-formatting on save in an editor can be somewhat automated. For example, if the team uses VSCode, then ".vscode/settings.json" and ".vscode/extensions.json" files can be introduced to the repository to provide good defaults and extension notifications, respectively. Other editors like WebStorm have similar features (via ".idea/workspace.xml").
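As a rough sketch of what those two VSCode files might contain for a Prettier project, the script below writes them out. In practice the files are simply committed to the repository; the script and its write_vscode_defaults helper are hypothetical and only spell out the likely contents. The setting keys and the Prettier extension ID are the standard ones, but the exact defaults a team wants are an assumption.

```python
import json
from pathlib import Path

# Standard VSCode setting keys; "esbenp.prettier-vscode" is the Prettier extension ID.
SETTINGS = {
    "editor.formatOnSave": True,
    "editor.defaultFormatter": "esbenp.prettier-vscode",
}
# VSCode prompts developers to install the extensions recommended here.
EXTENSIONS = {"recommendations": ["esbenp.prettier-vscode"]}


def write_vscode_defaults(repo_root: Path) -> None:
    """Write .vscode/settings.json and .vscode/extensions.json under repo_root."""
    vscode_dir = repo_root / ".vscode"
    vscode_dir.mkdir(parents=True, exist_ok=True)
    (vscode_dir / "settings.json").write_text(json.dumps(SETTINGS, indent=2) + "\n")
    (vscode_dir / "extensions.json").write_text(json.dumps(EXTENSIONS, indent=2) + "\n")
```

Calling write_vscode_defaults(Path(".")) from the repository root would create both files, which can then be committed like any other source file.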

The installation of pre-commit hooks can only be automated in certain cases:

When using the Python pre-commit framework, pre-commit hooks have to be manually installed by the developer after the repository is cloned.
When using the Husky pre-commit framework, pre-commit hook installation can be somewhat automated by way of an npm prepare script. (In other words, they are automatically installed when the developer installs dependencies with e.g. npm ci.)

So, with respect to automation, Husky is probably the clearest winner (contingent upon a JavaScript/TypeScript codebase in which developers reflexively install dependencies after cloning).

Editor Agnosticism - It Depends

When considering DX, it is desirable for a codebase to be editor-agnostic. In other words, if a codebase is not strongly coupled to a specific editor, then developers are free to use the specific IDE that they will be the most productive in. So, if we care about editor agnosticism, then pre-commit hooks are the clear winner.

On the other hand, if everyone on the team uses the same editor, then you aren't really gaining much by being editor-agnostic. And at the time of this writing (July 2024), most of the industry has consolidated around VSCode for many languages. (Some even consider VSCode to be better than paid-for alternatives like WebStorm.) Languages like Java won't necessarily have one-editor-to-rule-them-all, but if your language does, then the benefit of being editor-agnostic is obviously lessened.

Configuration Maintenance - Tied

Both editor-specific files and pre-commit files must live inside of the repository. Thus, the maintenance burden between them is a tie.

(Of course, it is worth noting that the tie would be broken in the case of the repository having to maintain two or more sets of editor files. This is not common and I wouldn't recommend doing this.)

Git Commit Velocity - Tied

It is common for skilled software developers to make lots of tiny commits. The basic idea is that smaller commits are easier to understand for others on the team, and allow for more granular reverting when things go wrong. This is documented in books like Clean Code by Robert Martin.

Obviously, auto-formatting on save does not interfere with developers making commits. But pre-commit hooks do, and are notorious for inhibiting productivity. (Some developers have even said that they will quit outright when pre-commit hooks are mandatory at the organization.)

However, while it is true that long-running pre-commit hooks inhibit productivity, this is not the case for hooks that run near-instantaneously. And code formatters fall into this category. For this reason, the negative stigma against pre-commit hooks is not relevant when considering auto-formatters. Thus, Git commit velocity is not really a concern for either option.
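One reason a formatting hook can stay fast is that it only needs to look at the files staged for the commit, not the whole repository. Here is a minimal sketch of such a hook, written as a plain script rather than through Husky or the pre-commit framework. The extension list and the function names are assumptions made for illustration.

```python
import subprocess

# Assumption: these are the file types the team wants Prettier to handle.
FORMATTABLE = (".js", ".jsx", ".ts", ".tsx", ".json", ".css", ".md")


def select_formattable(paths: list[str]) -> list[str]:
    """Keep only the staged paths that Prettier should format."""
    return [p for p in paths if p.endswith(FORMATTABLE)]


def staged_files() -> list[str]:
    """List files that are added, copied, or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def main() -> int:
    files = select_formattable(staged_files())
    if not files:
        return 0  # nothing to format, so the commit proceeds immediately
    # Format in place, then re-stage so the commit contains the formatted text.
    # Caveat: this re-stages whole files, including hunks left unstaged on purpose.
    subprocess.run(["npx", "prettier", "--write", *files], check=True)
    subprocess.run(["git", "add", "--", *files], check=True)
    return 0

# A real hook file would end with: raise SystemExit(main())
```

The re-staging caveat in the comments is one example of the extra behavior a hook introduces that format on save does not have.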

Complexity - Formatting on Save Wins

Ideally, we would want to pick the solution that introduces the least complexity into the developer workflow. This should probably be the most important consideration!

Overall, we could probably say that formatting on save is less complex than formatting in a pre-commit hook, because most of the heavy lifting is done by the editor and the extension. Take the case of Prettier, the most common formatter for JavaScript/TypeScript. Since it requires Node.js, we want to run it in a context that abstracts away the installation of the runtime.

In the case of IDEs, the runtime is bundled by default. Thus, the Prettier extension can leverage the runtime without having to do anything additional.
In the case of pre-commit hooks, the runtime either has to be installed manually by every user (in the case of Husky) or is handled in Python (in the case of pre-commit). Running Node.js through Python/pyenv introduces an additional layer of complexity, which can lead to issues like OpenSSL library incompatibilities.

Overall, formatting on save is much less likely to go wrong. And when things do go wrong, it is much easier to debug.

Using Format on Save and Format on Pre-Commit Together

Even if you decide that formatting on save is slightly better than formatting on pre-commit, you might decide to use both solutions at the same time. But is that a good idea?

For - Defence in Depth

The argument to use both is a defence in depth strategy:

If we just use format on save, it is possible that a developer's editor can become misconfigured, and unformatted code can slip into the codebase.
If we just use format on pre-commit, it is possible that a developer's pre-commit hook can become misconfigured, and unformatted code can slip into the codebase.
If we use both, then each step guards against failures at the other step.
It is still technically possible that the developer's editor and pre-commit hook can both be misconfigured, but this is much less likely than just one component failing.
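The "much less likely" claim can be made concrete with a little arithmetic. The numbers below are hypothetical, and the calculation assumes that the two misconfigurations happen independently of each other.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only to make the arithmetic concrete.
p_editor_broken = Fraction(1, 20)  # 5% chance the editor setup is misconfigured
p_hook_broken = Fraction(1, 20)    # 5% chance the pre-commit hook is misconfigured

# Assuming the two failures are independent, both must fail at once
# for unformatted code to get past the local checks.
p_both_broken = p_editor_broken * p_hook_broken

print(float(p_editor_broken))  # 0.05
print(float(p_both_broken))    # 0.0025
```

Under these assumptions, stacking the two checks cuts the chance of a local miss from 5% to 0.25%, but it never reaches zero.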

Against - We Have To Use CI Anyway

Consider the case where the developer's editor and pre-commit hook are both misconfigured. Because this case exists, if we want 100% coverage, we have no choice but to also check for formatting in CI.

Thus, lowering the probability of pushing unformatted code to the remote does not really buy us much of anything, because CI is going to catch the bug in either case.

(To be clear, this is not an argument for only having CI checks, because that is terrible DX. Rather, the argument is that formatting on pre-commit doesn't give us any additional DX wins on top of formatting on save.)

Against - More Complexity is Bad

As engineers, we should strive to minimize the complexity in our codebases and in our toolchains. It is pretty hard to justify the complexity increase from adding a pre-commit hook when the payoff is essentially just a duplication of existing functionality.

The Downsides of CI Checks

In the visualization from earlier on, we called out CI as being the "worst" option because it is the option that is furthest away from the developer. But let's color this in a little bit to illustrate exactly how the bad DX comes about.

CI runs can take a long time. A common scenario is something like the following:

Alice finishes work on her feature branch. She pushes to remote, and knows from past experience that CI will take about 30 minutes to complete.
In order to stay productive, Alice immediately switches to a new branch in order to work on a new feature.
30 minutes pass, and Alice is now thinking deeply on a problem related to the new feature.
Alice gets an email notification that CI has failed.
Alice now has to perform a very costly context switch away from her problem and into where she was 30 minutes ago, in order to address the problem that CI is complaining about.
Alice finds and fixes the error. She pushes to remote.
Alice now performs another costly context switch back to her new branch.
30 minutes pass, and Alice gets another email notification that CI has failed.
The cycle repeats.

In order to avoid these kinds of time drains, we want to push the checks closer to the developer, if possible. Thankfully, for the case of auto-formatting, it is possible to do that with minimal other downsides.

CI Checks are Still Mandatory

Using CI is not free: there is a small-ish maintenance cost, a medium-ish complexity cost, and the potential cost of hosting the platform/runners. But if we want to ensure that no bad code slips through the cracks, we don't have a choice: we must check formatting in CI, regardless of which other stages we use.

But the point here is that we can use CI as a last-resort check, instead of as the primary check. By doing this, we get the best of both worlds: good DX and 100% validation.
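A last-resort CI check can be very small. The sketch below assumes a Prettier project and wraps the formatter's check mode in a function. The formatting_ok name and the injectable run parameter are illustrative choices, not part of any CI platform.

```python
import subprocess
from typing import Callable, Sequence

# `prettier --check` exits with a non-zero code when any file is not formatted.
CHECK_COMMAND = ("npx", "prettier", "--check", ".")


def formatting_ok(run: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Return True when the formatter reports that every file is formatted."""
    # `run` defaults to subprocess.call, which returns the command's exit code.
    return run(CHECK_COMMAND) == 0
```

A CI job would call formatting_ok() and fail the build when it returns False. Note that this only checks the code and never rewrites it, so the fix still happens on the developer's machine.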

Conclusion

Given the previous discussion, I think the following is true:

We have to check formatting in CI to ensure proper coverage (because it is possible to get around the other types of checks).
We have to check formatting before pushing to remote to ensure a good DX.
Using auto-format on save is arguably superior to auto-format on pre-commit (mostly for visibility and complexity reasons).
Using both auto-format on save and auto-format on pre-commit has no real benefit (and greatly increases the complexity of the tooling).

Top comments (1)

David Blass • Jul 28

Agree! Format on save by default, always check in CI.

Do whatever else you want locally, but I don't see the upside to mandating precommit hooks. They're less immediate than on-save and less well-enforced than CI.
