benjamin

Your repo has whitespace problems you can't see — I built a zero-dep CLI that finds and fixes them all

Whitespace problems are the ones you can't see until they bite. A pull request where half the "changes" are trailing-space diffs. A shell script that breaks in CI because someone's editor saved it CRLF. A .env with a UTF-8 BOM that makes the first variable name mysteriously not match. A file with no final newline that turns one-line changes into two-line diffs forever.

None of it shows up on screen. All of it shows up in git blame.

Today, catching this takes three or four tools stitched together — and I got tired of that, so I built wssweep: one zero-config command that finds all the common whitespace smells and, with --fix, cleans them in place.

$ npx wssweep

  src/app.js  (2)
      14: trailing-whitespace  trailing whitespace
       -  missing-final-newline  no newline at end of file
  config.yml  (1)
       -  mixed-eol  mixed line endings (CRLF×3, LF×1)

  ✖ 3 whitespace issues in 2 files  (mixed-eol=1, missing-final-newline=1, trailing-whitespace=1)

$ npx wssweep --fix     # clean them

It checks seven things: trailing whitespace, mixed CRLF/LF line endings, lone CRs, a missing final newline, extra trailing blank lines, a UTF-8 BOM, and tabs mixed with spaces in one indent. Non-zero exit on findings, so it's a CI gate. pip install wssweep gets the same tool in Python — byte-for-byte identical output and fixes.
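
As a rough illustration of what three of those checks could look like on raw bytes, here is a minimal Python sketch. It is not wssweep's source: the function name `scan` and the rule name `utf8-bom` are assumptions made for the example, while `trailing-whitespace` and `missing-final-newline` are taken from the sample output above.

```python
import re
from typing import Optional

BOM = b"\xef\xbb\xbf"
# Split on CRLF, lone CR, or LF only (CRLF first so it counts as one break).
EOL = re.compile(rb"\r\n|\r|\n")
# "Whitespace" means exactly space and tab.
TRAILING = re.compile(rb"[ \t]+$")


def scan(data: bytes) -> list[tuple[Optional[int], str]]:
    """Return (line_number_or_None, rule) findings for three checks."""
    findings: list[tuple[Optional[int], str]] = []
    if data.startswith(BOM):
        findings.append((None, "utf8-bom"))  # hypothetical rule name
    for number, line in enumerate(EOL.split(data), start=1):
        if TRAILING.search(line):
            findings.append((number, "trailing-whitespace"))
    # A non-empty file should end with some line terminator.
    if data and not data.endswith((b"\n", b"\r")):
        findings.append((None, "missing-final-newline"))
    return findings
```

A CI wrapper around something like this would exit non-zero whenever the findings list is non-empty.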

Why not editorconfig-checker / pre-commit / prettier?

Because each does part of it:

  • editorconfig-checker reports — but you have to author an .editorconfig first, and it can't fix anything.
  • pre-commit's trailing-whitespace / end-of-file-fixer / mixed-line-ending hooks do fix, but only inside the pre-commit framework, and they're three separate hooks. Nobody runs them ad-hoc on a fresh checkout.
  • prettier fixes whitespace only as a side effect of reformatting all your code, and won't touch files it can't parse.
  • dos2unix does line endings and nothing else.

wssweep is the one npx/pip command, no config and no framework, that does the whole set at once and drops into any CI regardless of toolchain.

The opinions that make it zero-config

A whitespace tool with no config has to make the right calls by default, or it's noise:

  • A consistently-CRLF file is fine — only a file mixing CRLF and LF is flagged. .bat/.cmd even keep CRLF when fixed, so it never breaks a Windows script.
  • In Markdown, a line ending in exactly two spaces is a hard line break — semantically meaningful — so the trailing-whitespace check is skipped in .md. (BOM, final-newline, and EOL checks still apply.)
  • mixed-indentation is report-only. Auto-converting tabs↔spaces needs a tab-width guess that can silently wreck alignment, so wssweep tells you and leaves it.
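
The first two defaults can be sketched in a few lines of Python. This is an assumed reading of the rules above, not the tool's code; `eol_counts`, `is_mixed_eol`, and `skips_trailing_whitespace` are hypothetical names.

```python
import re

# CRLF must come first in the alternation so it is counted as one terminator.
EOL = re.compile(rb"\r\n|\r|\n")


def eol_counts(data: bytes) -> dict[str, int]:
    """Count each kind of line terminator in the raw bytes."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    for match in EOL.finditer(data):
        token = match.group()
        if token == b"\r\n":
            counts["CRLF"] += 1
        elif token == b"\r":
            counts["CR"] += 1
        else:
            counts["LF"] += 1
    return counts


def is_mixed_eol(data: bytes) -> bool:
    """Flag only files that contain both CRLF and LF; pure CRLF is fine."""
    counts = eol_counts(data)
    return counts["CRLF"] > 0 and counts["LF"] > 0


def skips_trailing_whitespace(path: str) -> bool:
    """Assumption: the Markdown exemption is decided by the .md extension."""
    return path.endswith(".md")
```

On the `config.yml` example from the sample run, `eol_counts` would report three CRLFs and one LF, which is what makes the file mixed.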

The fun part: --fix that two languages agree on, to the byte

I wanted a Node build and a Python build that produce identical reports and write identical bytes on --fix. Whitespace is the worst possible domain for that, because every shortcut leaks platform behavior:

  • str.splitlines() in Python splits on VT, FF, NEL, U+2028, U+2029 too — it would invent phantom lines a split(/\r\n|\r|\n/) never sees. Forbidden.
  • \s matches different things in JS and Python (NBSP, the BOM char, …). So "whitespace" here means exactly [ \t] — explicit, never \s.
  • Python's text-mode file IO silently translates \r\n to \n on read and \n to os.linesep on write, which would rewrite the very bytes the tool inspects. So everything is raw bytes, and the scan runs on a latin-1 (byte-faithful) view where every byte maps 1:1 to a code point.
  • --fix builds the corrected bytes, compares to the original, and writes only if they differ — atomically (temp file + rename), preserving the file mode, never touching a binary or skipped file. And it's idempotent: run it twice, the second run does nothing.
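
The build, compare, then write-atomically flow from the last bullet might look roughly like this in Python. It is a sketch under stated assumptions, not wssweep's implementation: it covers only three fixes (BOM, trailing whitespace, final newline), works on `bytes` directly instead of the latin-1 view, always appends LF, and ignores the Markdown and .bat/.cmd special cases. `fixed_bytes` and `fix_file` are hypothetical names.

```python
import os
import re
import stat
import tempfile

BOM = b"\xef\xbb\xbf"
# Runs of space/tab directly before a line terminator or the end of the data.
TRAILING = re.compile(rb"[ \t]+(?=[\r\n]|\Z)")


def fixed_bytes(data: bytes) -> bytes:
    """Build the corrected bytes for an assumed subset of the fixes."""
    out = data[len(BOM):] if data.startswith(BOM) else data
    out = TRAILING.sub(b"", out)
    if out and not out.endswith((b"\n", b"\r")):
        out += b"\n"  # assumption: LF; the real tool's choice may differ
    return out


def fix_file(path: str) -> bool:
    """Rewrite path only if its bytes change. Returns True when it wrote."""
    with open(path, "rb") as fh:  # binary mode: no newline translation
        original = fh.read()
    updated = fixed_bytes(original)
    if updated == original:
        return False  # nothing to do, so the file is never touched
    mode = stat.S_IMODE(os.stat(path).st_mode)
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)  # same directory, same filesystem
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(updated)
        os.chmod(tmp, mode)    # carry over the original permission bits
        os.replace(tmp, path)  # atomic rename over the original
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

Calling `fix_file` a second time on the same path returns `False`, because the rebuilt bytes already equal what is on disk.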

A differential test runs --fix from each build on its own copy of the same tree, then cmps every file: zero differing bytes.
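
The comparison half of such a test could be sketched as below, assuming each build has already been run with --fix on its own copy of the tree. `relative_files` and `differing_files` are hypothetical helpers, not part of wssweep.

```python
import os


def relative_files(tree: str) -> set[str]:
    """Collect every file path under tree, relative to tree."""
    found = set()
    for root, _dirs, files in os.walk(tree):
        for name in files:
            found.add(os.path.relpath(os.path.join(root, name), tree))
    return found


def differing_files(tree_a: str, tree_b: str) -> list[str]:
    """Return relative paths that are missing on one side or differ in bytes."""
    files_a, files_b = relative_files(tree_a), relative_files(tree_b)
    differing = set(files_a ^ files_b)  # present in only one tree
    for rel in files_a & files_b:
        with open(os.path.join(tree_a, rel), "rb") as fa:
            bytes_a = fa.read()
        with open(os.path.join(tree_b, rel), "rb") as fb:
            bytes_b = fb.read()
        if bytes_a != bytes_b:
            differing.add(rel)
    return sorted(differing)
```

The test passes when `differing_files` returns an empty list for the two fixed trees.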

Install

npx wssweep --fix      # Node, zero deps
pip install wssweep    # Python, zero deps, identical fixes

MIT licensed, both builds open source:

Run npx wssweep on your current project. How many trailing-whitespace lines and missing newlines is it hiding? (Mine's never zero.)