EvvyTools

Posted on Jul 4

How to Add an Automated SQL Formatting Check to Your CI Pipeline

#tools #webdev #productivity

If your team has agreed on a SQL style but nobody's enforcing it, you'll drift back to inconsistent formatting within a month. New hires don't know the unwritten convention, tired developers on a Friday skip the extra step, and the style guide quietly stops mattering because nothing actually checks it. The fix is to enforce it in CI instead of trusting everyone to remember every time.

Photo by Wolfgang Weiser on Pexels

Step 1: Pick a formatter and lock its config

Whatever formatter your team uses, pin its version and its configuration (keyword case, indent width, comma style) in a config file committed to the repo. If the formatter's defaults can drift between versions, an unpinned dependency will silently reformat your entire codebase on the next npm install or pip install, which is its own kind of chaos and defeats the whole point of having a consistent standard in the first place.

Treat this config file the same way you'd treat an ESLint or Prettier config: check it in, review changes to it deliberately, and don't let it get bumped as a side effect of an unrelated dependency update.

Step 2: Add a check script, not a fix script

The CI check should fail the build if formatting doesn't match, not silently reformat and commit on your behalf. A script that runs the formatter against every changed .sql file, diffs the output against the committed version, and exits non-zero on any difference is enough:

for f in $(git diff --name-only origin/main -- '*.sql'); do
  sql-formatter "$f" > /tmp/formatted.sql
  diff -q "$f" /tmp/formatted.sql || exit 1
done

This keeps formatting as a deliberate developer action (run the formatter locally, commit the result) rather than something CI does silently on your behalf, which avoids surprising force-pushes or unexpected diffs showing up on merge that nobody remembers authorizing.

Step 3: Scope it to changed files only, not the whole repo

Running the check against every SQL file in the repo on every PR will fail on day one if your codebase has years of inconsistently formatted queries sitting in it already. Scope the check to files touched in the current diff so you're only enforcing the new convention going forward, and clean up the historical backlog separately, in its own dedicated PR, on its own schedule, reviewed on its own merits as a pure formatting change.

Step 4: Document the exception path

Some generated SQL (migrations produced by an ORM, vendor-provided scripts, third-party seed data) shouldn't be reformatted, either because it's not really "yours" to restyle or because reformatting it would make future diffs against the upstream source harder to read. Exclude those paths explicitly in the CI config rather than letting developers add ad-hoc # noformat comments that nobody remembers the exact syntax for six months later.

Step 5: Make the fix path obvious in the failure message

A CI failure that just says "formatting check failed" sends people hunting for what to do next, usually into a Slack thread asking someone else how to fix it. Have the script print the exact command to run locally to reformat and pass the check, so the failure resolves in one copy-paste instead of a back-and-forth conversation that costs more time than the original formatting issue would have.

Step 6: Decide what happens on a formatting-only PR

Once the check exists, you'll eventually get a pull request that's purely a reformat with no logic changes, either from the initial rollout or from someone cleaning up a file they happened to be touching anyway. Treat these as their own category: fast to review (diff the AST or the query's semantic content if your tooling supports it, not just the text), but still worth a real look, since a bug in the formatter itself could theoretically alter behavior in an edge case. Don't let "it's just formatting" become an excuse to merge without any review at all.

Step 7: Revisit the config occasionally, but rarely

A locked formatting config is a feature, not a limitation, but it shouldn't be permanently frozen either. If the team's needs change (a new database engine gets adopted, a new hire strongly prefers a different comma style and makes a good case for it) revisit the config deliberately, as its own decision, rather than letting it drift through unrelated PRs. The goal is stability with an explicit, rare escape hatch, not an unchangeable rule enforced for its own sake.

Why this is worth the setup cost

The one-time cost of wiring this up is small compared to the recurring cost of relitigating the same case-and-comma argument every few months, which is what happens by default once a style guide has no enforcement behind it. Once this check exists, new team members follow the convention automatically because the CI system teaches them on their very first PR, rather than a wiki page they may or may not have read closely.

Once this is running, the case-versus-lowercase, leading-comma-versus-trailing debates mostly disappear, because the CI check makes the team's choice binding instead of aspirational. For background on why formatters disagree on these defaults in the first place, see this breakdown of SQL formatter keyword case and indent conventions. If you want to spot-check formatting output manually before wiring up CI, EvvyTools has a browser-based SQL formatter that's a fast way to see the target output for a given query before you commit your team to a specific config.

For general CI pipeline patterns beyond SQL specifically, GitHub's Actions documentation and GitLab's CI/CD documentation both cover the check-and-fail pattern used here in more general terms, and either is a good reference if you're setting up your very first automated formatting gate. If your stack also includes a general-purpose linter, the ESLint documentation is a useful comparison point for how a mature ecosystem structures the same "check, don't auto-fix in CI" pattern for application code.

Handling the rollout without blocking every open PR at once

If you're adding this check to a repository that already has open pull requests in flight, don't flip the CI gate to blocking on day one. Every open PR touching a .sql file will suddenly fail, not because of anything the author did wrong on that PR specifically, but because the check is now retroactively applying to code written before the standard existed. Run the check in warn-only mode for a week or two first, let people see and fix the failures at their own pace, then switch it to a hard block once the backlog of in-flight PRs has mostly cleared out naturally.

This staged rollout matters more than it might seem. A hard, blocking gate that lands with zero warning reads as hostile to whoever's PR happens to be open at the time, even though the goal (consistent formatting) is genuinely good for the whole team. A short warning period costs almost nothing and avoids that entirely preventable friction.

Measuring whether the gate is actually working

After a month or two of the check running, it's worth actually looking at whether it's doing its job. A useful signal: are formatting-only PRs still showing up regularly, or has the codebase stabilized into consistent formatting with the check mostly just confirming what developers are already producing locally before they push? If formatting-only fixup commits keep appearing regularly, that's a sign either the local tooling isn't wired in well enough (developers are relying on CI to catch what a pre-commit hook should catch earlier) or the config needs another look. Either way, the CI check gives you a concrete, measurable signal instead of a vague sense that "formatting seems better now."

DEV Community