DEV Community

BabbaWaagen
Auditing repository bloat without auto-fixing anything

Repository Cleanup and Auditing

Over time, repositories tend to accumulate things that were never part of the original plan.

Not necessarily because someone made a mistake, but because repos live for years:

  • build outputs get committed temporarily
  • files get copied instead of reused
  • directories stick around after refactors
  • ownership changes and context gets lost

None of this is dramatic on its own, but it adds up.

The Problem I Ran Into

I wanted a way to see the current state of a repository without:

  • deleting anything
  • rewriting history
  • enforcing opinions
  • adding heavy configuration

Most tools I tried either focused on auto-cleanup or required a lot of setup.
I just wanted an audit.

A Small CLI for Repo Audits

So I built a small CLI tool in Go called repo-clean.

It scans a repository and reports:

  • large files
  • duplicate files (by hash)
  • commonly unwanted directories

It does not:

  • auto-delete files
  • modify the repo
  • push best practices
  • hide details

It only reports what is already there.

Example Usage

repo-clean scan .

For CI or automation, JSON output is available:

repo-clean scan . --json

This makes it easy to plug into pipelines, reports, or checks without changing developer workflows.
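
The post does not document the JSON schema, so a cautious first step when wiring the output into a pipeline is to inspect its shape rather than hard-code field names. The Go sketch below reads a report from standard input and lists its top-level keys. It assumes the report is a JSON object at the top level, and the `topLevelKeys` helper is hypothetical, not part of repo-clean.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"sort"
)

// topLevelKeys decodes a JSON object and returns its top-level keys, sorted.
// It makes no assumption about what those keys are named.
// (Hypothetical helper; assumes the document is an object, not an array.)
func topLevelKeys(data []byte) ([]string, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("decode report: %w", err)
	}
	keys := make([]string, 0, len(doc))
	for k := range doc {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}
	keys, err := topLevelKeys(data)
	if err != nil {
		// A non-zero exit lets a CI step fail when the report is not valid JSON.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, k := range keys {
		fmt.Println(k)
	}
}
```

Saved as, say, `inspect.go`, it could be run as `repo-clean scan . --json | go run inspect.go`. Once the real field names are known from the tool's own output, a stricter check with typed structs becomes a reasonable next step.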

Why No Auto-Fixes?

This is a deliberate choice.

In many teams, especially with older or shared repositories:

  • not everything can be removed immediately
  • some artifacts are temporarily tolerated
  • decisions require human context

repo-clean is meant to surface information, not make decisions.

When This Is Useful

This tool is intentionally narrow, but works well for:

  • legacy repositories
  • CI audits
  • periodic repo hygiene checks
  • understanding why a repo feels heavier over time

It is not a replacement for .gitignore or good Git practices.
It's an audit tool for the current state.

Open Source

The project is open source (MIT):

https://github.com/Bladiostudio/repo-clean

Feedback, issues, and improvements are welcome.