<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Michael Rawls, Jr.</title>
    <description>The latest articles on DEV Community by Michael Rawls, Jr. (@mrjr0101).</description>
    <link>https://dev.to/mrjr0101</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3623644%2F19b37285-b1de-42b6-a983-20e0e44493b1.png</url>
      <title>DEV Community: Michael Rawls, Jr.</title>
      <link>https://dev.to/mrjr0101</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mrjr0101"/>
    <language>en</language>
    <item>
      <title>I built a CLI to fix the encoding/newline/whitespace noise that pollutes your diffs</title>
      <dc:creator>Michael Rawls, Jr.</dc:creator>
      <pubDate>Tue, 10 Mar 2026 19:31:56 +0000</pubDate>
      <link>https://dev.to/mrjr0101/i-built-a-cli-to-fix-the-encodingnewlinewhitespace-noise-that-pollutes-your-diffs-4da5</link>
      <guid>https://dev.to/mrjr0101/i-built-a-cli-to-fix-the-encodingnewlinewhitespace-noise-that-pollutes-your-diffs-4da5</guid>
      <description>&lt;h1&gt;
  
  
  I built a CLI to fix the encoding/newline/whitespace noise that pollutes your diffs
&lt;/h1&gt;

&lt;p&gt;Every team I have worked on eventually hits the same invisible problem.&lt;/p&gt;

&lt;p&gt;Someone on Windows commits a file. Someone on Mac pulls it. The diff shows 200 changed lines.&lt;br&gt;
Nothing actually changed. It was trailing spaces, CRLF endings, a BOM, or a file that got&lt;br&gt;
re-saved in a different encoding. The code review is useless because the real changes are&lt;br&gt;
buried in whitespace noise.&lt;/p&gt;

&lt;p&gt;I got tired of fixing this manually on every project, so I built&lt;br&gt;
&lt;a href="https://pypi.org/project/code-normalizer-pro/" rel="noopener noreferrer"&gt;code-normalizer-pro&lt;/a&gt; -- a CLI that handles&lt;br&gt;
all of it in one pass.&lt;/p&gt;


&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;One command normalizes an entire directory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Converts encoding to UTF-8 (handles UTF-16, UTF-8-BOM, Windows-1252, Latin-1, and more)&lt;/li&gt;
&lt;li&gt;Fixes line endings -- CRLF to LF&lt;/li&gt;
&lt;li&gt;Strips trailing whitespace from every line&lt;/li&gt;
&lt;li&gt;Ensures a single newline at end of file&lt;/li&gt;
&lt;li&gt;Skips binary files automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It works on Python, JavaScript, TypeScript, Go, Rust, C, C++, and Java files.&lt;/p&gt;
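&lt;p&gt;As a rough illustration of the newline and whitespace rules in the list above (not the tool's actual source), here is a minimal Python sketch. The function name &lt;code&gt;normalize_text&lt;/code&gt; is hypothetical, and it assumes that "trailing whitespace" means spaces and tabs and that an empty file stays empty.&lt;/p&gt;

```python
def normalize_text(text):
    """Apply the newline and whitespace rules to already-decoded text."""
    # Convert CRLF line endings to LF.
    text = text.replace("\r\n", "\n")
    # Strip trailing spaces and tabs from every line.
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    # Drop trailing blank lines so the file ends with exactly one newline.
    while lines and lines[-1] == "":
        lines.pop()
    # An empty (or whitespace-only) input stays empty: an assumption of this sketch.
    return "\n".join(lines) + "\n" if lines else ""
```

&lt;p&gt;Encoding conversion is left out here; it has to happen before these rules can run on text at all.&lt;/p&gt;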


&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;code-normalizer-pro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Requires Python 3.10+. The core has no dependencies beyond tqdm, which draws the progress bars.&lt;/p&gt;


&lt;h2&gt;
  
  
  Basic usage
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# See what would change without touching anything&lt;/span&gt;
code-normalizer-pro /path/to/project &lt;span class="nt"&gt;--dry-run&lt;/span&gt;

&lt;span class="c"&gt;# Fix everything in-place&lt;/span&gt;
code-normalizer-pro /path/to/project &lt;span class="nt"&gt;--in-place&lt;/span&gt;

&lt;span class="c"&gt;# Specific extensions only&lt;/span&gt;
code-normalizer-pro /path/to/project &lt;span class="nt"&gt;-e&lt;/span&gt; .py &lt;span class="nt"&gt;-e&lt;/span&gt; .js &lt;span class="nt"&gt;--in-place&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Dry-run output looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Scanning /path/to/project...
  [changed] src/utils.py  -- trailing whitespace (34 chars), CRLF endings
  [changed] src/main.js   -- encoding: windows-1252 -&amp;gt; utf-8
  [skip]    assets/logo.png -- binary
  [ok]      tests/test_core.py

Total: 47 files | 2 changed | 1 skipped | 44 already clean
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing is written in dry-run mode. You see exactly what would happen.&lt;/p&gt;




&lt;h2&gt;
  
  
  Parallel processing for large codebases
&lt;/h2&gt;

&lt;p&gt;Sequential mode processes about 20-30 files per second. For a small repository that is fine.&lt;br&gt;
For anything over a few thousand files, use parallel mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;code-normalizer-pro /path/to/project &lt;span class="nt"&gt;--parallel&lt;/span&gt; &lt;span class="nt"&gt;--in-place&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It uses all available CPU cores by default. You can cap it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;code-normalizer-pro /path/to/project &lt;span class="nt"&gt;--parallel&lt;/span&gt; &lt;span class="nt"&gt;--workers&lt;/span&gt; 4 &lt;span class="nt"&gt;--in-place&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benchmarks on a Python codebase averaging 200 lines per file:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;100 files&lt;/th&gt;
&lt;th&gt;500 files&lt;/th&gt;
&lt;th&gt;1000 files&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sequential&lt;/td&gt;
&lt;td&gt;3.2s&lt;/td&gt;
&lt;td&gt;16.8s&lt;/td&gt;
&lt;td&gt;33.5s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parallel (4 workers)&lt;/td&gt;
&lt;td&gt;1.1s&lt;/td&gt;
&lt;td&gt;4.3s&lt;/td&gt;
&lt;td&gt;7.1s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parallel (8 workers)&lt;/td&gt;
&lt;td&gt;0.8s&lt;/td&gt;
&lt;td&gt;2.9s&lt;/td&gt;
&lt;td&gt;4.8s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  SHA256 caching for repeat runs
&lt;/h2&gt;

&lt;p&gt;On the first run it processes everything and writes a &lt;code&gt;.normalize-cache.json&lt;/code&gt; file.&lt;br&gt;
On every run after that, unchanged files are skipped entirely.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;code-normalizer-pro /path/to/project &lt;span class="nt"&gt;--cache&lt;/span&gt; &lt;span class="nt"&gt;--in-place&lt;/span&gt; &lt;span class="nt"&gt;--parallel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Second run output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;All discovered files were unchanged and skipped by cache.
Cached hits: 1000
Total runtime: 0.8s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters a lot in CI. If your normalize step runs on every push but only 5 files&lt;br&gt;
actually changed, it finishes in under a second instead of about 30 seconds.&lt;/p&gt;
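&lt;p&gt;To make the idea concrete, here is a minimal sketch of a SHA256 skip cache in Python. It is not the tool's implementation: the function names and the cache layout (a JSON object mapping each path to its digest) are assumptions made for illustration.&lt;/p&gt;

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def load_cache(cache_file):
    """Load the path-to-digest mapping, or start empty if no cache exists yet."""
    cache_file = Path(cache_file)
    if not cache_file.exists():
        return {}
    return json.loads(cache_file.read_text(encoding="utf-8"))


def split_by_cache(paths, cache_file):
    """Split paths into (needs_processing, cache_hits) using the stored digests."""
    cache = load_cache(cache_file)
    todo, hits = [], []
    for path in paths:
        # A file is a hit only if its current digest matches the recorded one.
        if cache.get(str(path)) == sha256_of(path):
            hits.append(path)
        else:
            todo.append(path)
    return todo, hits


def record_clean(paths, cache_file):
    """Store the digest of each normalized file so the next run can skip it."""
    cache = load_cache(cache_file)
    for path in paths:
        cache[str(path)] = sha256_of(path)
    Path(cache_file).write_text(json.dumps(cache, indent=2), encoding="utf-8")
```

&lt;p&gt;Hashing still reads every file, but it skips decoding, rewriting, and disk writes, which is where most of the time goes.&lt;/p&gt;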


&lt;h2&gt;
  
  
  Pre-commit hook
&lt;/h2&gt;

&lt;p&gt;This is the part that actually enforces standards across a team.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run once inside any git repo&lt;/span&gt;
code-normalizer-pro &lt;span class="nt"&gt;--install-hook&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That writes a pre-commit hook that checks staged files before every commit.&lt;br&gt;
If any file needs normalization, the commit is blocked and the fix command is printed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Checking 5 staged file(s)...

Files that need normalization:
  src/feature.py
  src/utils.js

Run: code-normalizer-pro src/feature.py src/utils.js --in-place
Or:  git commit --no-verify  (to skip this check)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The developer fixes the files, re-stages, and commits. No config file required.&lt;br&gt;
The hook uses the Python interpreter that installed the package, so it works&lt;br&gt;
in virtualenvs without extra setup.&lt;/p&gt;


&lt;h2&gt;
  
  
  CI integration
&lt;/h2&gt;

&lt;p&gt;Add a normalization check to your pipeline in about 10 lines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub Actions:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Code hygiene check&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;normalize-check&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v5&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.11"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install code-normalizer-pro&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;code-normalizer-pro . --dry-run --parallel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--dry-run&lt;/code&gt; flag currently exits 0 regardless of whether it finds violations,&lt;br&gt;
so the workflow above reports problems but never fails. A &lt;code&gt;--fail-on-changes&lt;/code&gt; flag is on the roadmap -- for now, if you need CI to fail on&lt;br&gt;
violations, you can grep the output for &lt;code&gt;[changed]&lt;/code&gt; lines and exit 1 accordingly. Match the bracketed marker rather than the bare word, because the summary line also contains "changed". I will&lt;br&gt;
document the workaround in the README until the flag ships.&lt;/p&gt;
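&lt;p&gt;Until that flag exists, one way to build the workaround is a small Python wrapper. This is a sketch under assumptions: it relies on the dry-run report format shown earlier in this post (lines starting with &lt;code&gt;[changed]&lt;/code&gt;), and the names &lt;code&gt;changed_lines&lt;/code&gt; and &lt;code&gt;main&lt;/code&gt; are made up for the example.&lt;/p&gt;

```python
import subprocess


def changed_lines(dry_run_output):
    """Return the report lines that the dry run marked as [changed]."""
    return [
        line.strip()
        for line in dry_run_output.splitlines()
        # Match the bracketed marker only; the summary line also says "changed".
        if line.strip().startswith("[changed]")
    ]


def main(target="."):
    """Run the dry run on target and return an exit code for CI."""
    result = subprocess.run(
        ["code-normalizer-pro", target, "--dry-run"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Echo the report so it still shows up in the CI log.
    print(result.stdout, end="")
    # A non-zero return value is what makes the CI step fail.
    return 1 if changed_lines(result.stdout) else 0
```

&lt;p&gt;Save it as a script that ends with &lt;code&gt;raise SystemExit(main())&lt;/code&gt; and run that script in place of the last workflow step.&lt;/p&gt;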


&lt;h2&gt;
  
  
  Interactive mode
&lt;/h2&gt;

&lt;p&gt;If you are normalizing a codebase for the first time and want to review each change&lt;br&gt;
before it gets written:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;code-normalizer-pro /path/to/project &lt;span class="nt"&gt;--interactive&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It shows a diff for each file and waits for &lt;code&gt;y / n / d (show full diff) / q (quit)&lt;/code&gt;.&lt;br&gt;
Useful when you are not sure what you are about to change in a legacy codebase.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I learned building this
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Encoding detection is hard.&lt;/strong&gt; UTF-16 files without a BOM are indistinguishable from&lt;br&gt;
binary garbage unless you do heuristic analysis. I ended up with a layered approach --&lt;br&gt;
check for BOM first, then try a candidate list in order, then fall back to binary&lt;br&gt;
detection. There are still edge cases.&lt;/p&gt;
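&lt;p&gt;The layered idea can be sketched in a few lines of Python. This is an illustration, not the detection code that ships in the package: the candidate list, the NUL-character check, and the name &lt;code&gt;detect_encoding&lt;/code&gt; are all assumptions of the sketch.&lt;/p&gt;

```python
import codecs

# BOM signatures. UTF-32 is listed before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Hypothetical candidate order, strictest codec first.
CANDIDATES = ["utf-8", "cp1252"]


def detect_encoding(data):
    """Return a codec name for the bytes, or None if they look binary."""
    # Layer 1: a BOM settles the question immediately.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # Layer 2: try each candidate; accept the first clean, NUL-free decode.
    for name in CANDIDATES:
        try:
            text = data.decode(name)
        except UnicodeDecodeError:
            continue
        if "\x00" not in text:
            return name
    # Layer 3: nothing decoded cleanly, so treat the file as binary.
    return None
```

&lt;p&gt;A UTF-16 file without a BOM falls through to the binary branch in this sketch, which is exactly the kind of edge case that needs extra heuristics.&lt;/p&gt;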

&lt;p&gt;&lt;strong&gt;ProcessPoolExecutor and in-place writes need careful handling.&lt;/strong&gt; When you spawn&lt;br&gt;
workers, backup creation has to happen before dispatch -- not inside the worker --&lt;br&gt;
otherwise parallel mode silently skips backups. This is a known bug in the current&lt;br&gt;
release that I am fixing next.&lt;/p&gt;
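&lt;p&gt;The ordering described above, backups in the parent process first and workers second, might look like the following sketch. It is not the fix that will ship: the &lt;code&gt;.bak&lt;/code&gt; suffix, the stand-in worker, and the &lt;code&gt;executor_cls&lt;/code&gt; parameter are assumptions added for illustration.&lt;/p&gt;

```python
import shutil
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def backup_file(path):
    """Copy a file to a sibling .bak file and return the backup path."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)
    return backup


def strip_trailing_whitespace(path):
    """Stand-in worker: rewrite one file in place. It never creates backups."""
    path = Path(path)
    lines = path.read_bytes().split(b"\n")
    path.write_bytes(b"\n".join(line.rstrip(b" \t") for line in lines))
    return str(path)


def normalize_with_backups(paths, workers=None, executor_cls=ProcessPoolExecutor):
    """Back up every file in the parent process, then dispatch the workers."""
    paths = [Path(p) for p in paths]
    # All backups exist before any worker is allowed to touch a file.
    backups = [backup_file(p) for p in paths]
    with executor_cls(max_workers=workers) as pool:
        done = list(pool.map(strip_trailing_whitespace, paths))
    return done, backups
```

&lt;p&gt;On platforms that start workers by spawning a fresh interpreter, call &lt;code&gt;normalize_with_backups&lt;/code&gt; from under an &lt;code&gt;if __name__ == "__main__":&lt;/code&gt; guard. Passing a thread pool as &lt;code&gt;executor_cls&lt;/code&gt; is handy for tests.&lt;/p&gt;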

&lt;p&gt;&lt;strong&gt;Cache path matters.&lt;/strong&gt; The cache file should live in the target directory,&lt;br&gt;
not in the current working directory. If you run the tool from a different working directory each time, the&lt;br&gt;
cache never hits. Also on the fix list.&lt;/p&gt;
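&lt;p&gt;One way to anchor the cache to the target instead of the working directory is to resolve the target path first. This is a sketch, and it makes two assumptions: that the cache belongs inside the target directory, and that a single-file target uses its parent directory. The name &lt;code&gt;cache_path_for&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
from pathlib import Path

CACHE_NAME = ".normalize-cache.json"


def cache_path_for(target):
    """Return a cache location that depends only on the target, not on CWD."""
    # resolve() makes the path absolute, so the result is the same no matter
    # which directory the command was started from.
    target = Path(target).resolve()
    base = target if target.is_dir() else target.parent
    return base / CACHE_NAME
```

&lt;p&gt;With this, running the tool on the same target from two different directories gives the same cache path.&lt;/p&gt;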




&lt;h2&gt;
  
  
  Current state and roadmap
&lt;/h2&gt;

&lt;p&gt;This is v3.0.1-alpha.1. It works and I use it on my own projects daily.&lt;br&gt;
These are the rough edges I am actively fixing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--parallel --in-place&lt;/code&gt; skips backups (data safety issue -- high priority)&lt;/li&gt;
&lt;li&gt;Cache file lands in CWD instead of target directory&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--dry-run&lt;/code&gt; exits 0 even when violations are found (need &lt;code&gt;--fail-on-changes&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;--version&lt;/code&gt; flag yet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coming next:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.gitignore&lt;/code&gt; pattern support (skip files the project already ignores)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--git-staged&lt;/code&gt; mode (normalize only what is staged, like the pre-commit hook does)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--fail-on-changes&lt;/code&gt; for CI&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--version&lt;/code&gt; flag&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it and tell me what is missing
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;code-normalizer-pro
code-normalizer-pro &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--dry-run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two things I want to know from anyone who tries it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Does it work in your CI setup? If it broke something, I want to know exactly how.&lt;/li&gt;
&lt;li&gt;What language or workflow is missing that would make this useful to you?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Source is on GitHub: &lt;a href="https://github.com/MRJR0101/code-normalizer-pro" rel="noopener noreferrer"&gt;https://github.com/MRJR0101/code-normalizer-pro&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you hit a bug or have a feature request, open an issue. I respond to everything.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Python 3.10+. No required external dependencies other than tqdm.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;MIT license.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>codequality</category>
      <category>opensource</category>
      <category>cli</category>
    </item>
  </channel>
</rss>
