Repository Cleanup and Auditing
Over time, repositories tend to accumulate things that were never part of the original plan.
Not necessarily because someone made a mistake, but because repos live for years:
- build outputs get committed temporarily
- files get copied instead of reused
- directories stick around after refactors
- ownership changes and context gets lost
None of this is dramatic on its own, but it adds up.
The Problem I Ran Into
I wanted a way to see the current state of a repository without:
- deleting anything
- rewriting history
- enforcing opinions
- adding heavy configuration
Most tools I tried either focused on auto-cleanup or required a lot of setup.
I just wanted an audit.
A Small CLI for Repo Audits
So I built a small CLI tool in Go called repo-clean.
It scans a repository and reports:
- large files
- duplicate files (by hash)
- commonly unwanted directories
It does not:
- auto-delete files
- modify the repo
- push best practices
- hide details
It only reports what is already there.
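To make the report categories concrete, here is a minimal Go sketch of a read-only scan that finds large files and duplicate files by hash. It is not repo-clean's actual source: the names `fileInfo`, `scan`, `largeFiles`, and `duplicates`, the choice of SHA-256, and the 5 MiB threshold are all assumptions made for illustration. Detection of unwanted directories is left out to keep the example short.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// fileInfo is a hypothetical record for one scanned file.
type fileInfo struct {
	Path string
	Size int64
	Hash string // hex-encoded SHA-256 of the file contents
}

// hashFile streams a file through SHA-256 so large files are never
// loaded into memory in one piece.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// scan walks root without modifying anything, skips .git directories,
// and records size and hash for every regular file.
func scan(root string) ([]fileInfo, error) {
	var files []fileInfo
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir
			}
			return nil
		}
		if !d.Type().IsRegular() {
			return nil // ignore symlinks, sockets, devices
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		sum, err := hashFile(path)
		if err != nil {
			return err
		}
		files = append(files, fileInfo{Path: path, Size: info.Size(), Hash: sum})
		return nil
	})
	return files, err
}

// largeFiles returns the files whose size is at least threshold bytes.
func largeFiles(files []fileInfo, threshold int64) []fileInfo {
	var out []fileInfo
	for _, f := range files {
		if f.Size >= threshold {
			out = append(out, f)
		}
	}
	return out
}

// duplicates groups paths by hash and keeps only groups with more than
// one path. Groups and their members are sorted for stable output.
func duplicates(files []fileInfo) [][]string {
	byHash := make(map[string][]string)
	for _, f := range files {
		byHash[f.Hash] = append(byHash[f.Hash], f.Path)
	}
	var groups [][]string
	for _, paths := range byHash {
		if len(paths) > 1 {
			sort.Strings(paths)
			groups = append(groups, paths)
		}
	}
	sort.Slice(groups, func(i, j int) bool { return groups[i][0] < groups[j][0] })
	return groups
}

func main() {
	files, err := scan(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, "scan failed:", err)
		os.Exit(1)
	}
	for _, f := range largeFiles(files, 5<<20) { // 5 MiB, an arbitrary threshold
		fmt.Printf("large: %s (%d bytes)\n", f.Path, f.Size)
	}
	for _, g := range duplicates(files) {
		fmt.Println("duplicate group:", g)
	}
}
```

The sketch only opens files for reading and prints what it finds, which matches the report-only idea: nothing is deleted, moved, or rewritten.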
Example Usage
repo-clean scan .
For CI or automation, JSON output is available:
repo-clean scan . --json
This makes it easy to plug into pipelines, reports, or checks without changing developer workflows.
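As a rough illustration of the pipeline case, the Go sketch below runs the command shown above and summarizes its JSON output. The output schema is not described in this post, so the code assumes only that the top level is a JSON object and counts the entries in whatever array fields it finds. The helper name `arrayLengths` is hypothetical, and the assumption about the top-level shape may not hold for the real tool.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// arrayLengths decodes a JSON object and returns the length of every
// top-level field that holds an array. It assumes nothing about field
// names. A field whose value is null is reported with length 0.
func arrayLengths(raw []byte) (map[string]int, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, fmt.Errorf("output is not a JSON object: %w", err)
	}
	counts := make(map[string]int)
	for key, value := range doc {
		var items []json.RawMessage
		if err := json.Unmarshal(value, &items); err == nil {
			counts[key] = len(items)
		}
	}
	return counts, nil
}

func main() {
	// Assumes repo-clean is on PATH; the arguments are the ones shown above.
	out, err := exec.Command("repo-clean", "scan", ".", "--json").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "repo-clean failed:", err)
		os.Exit(1)
	}
	counts, err := arrayLengths(out)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for key, n := range counts {
		fmt.Printf("%s: %d entries\n", key, n)
	}
}
```

A CI job could print this summary as an informational step without failing the build, which keeps the audit separate from any enforcement.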
Why No Auto-Fixes?
This is a deliberate choice.
In many teams, especially with older or shared repositories:
- not everything can be removed immediately
- some artifacts are temporarily tolerated
- decisions require human context
repo-clean is meant to surface information, not make decisions.
When This Is Useful
This tool is intentionally narrow, but works well for:
- legacy repositories
- CI audits
- periodic repo hygiene checks
- understanding why a repo feels heavier over time
It is not a replacement for .gitignore or good Git practices.
It's an audit tool for the current state.
Open Source
The project is open source (MIT):
https://github.com/Bladiostudio/repo-clean
Feedback, issues, and improvements are welcome.