Michael Muriithi
"How I Cleaned Up a 170K Line Codebase (And Why It Made My Project Better)"




My GhostWire repository had 179,847 lines of code.

I deleted 169,613 of them. That's 94% of the entire repository.

And the project got better.

Here's what happened, why it was necessary, and what I learned about the difference between a prototype and a production project.


How It Got This Bad

GhostWire started as a prototype. I was experimenting with mesh networking, trying different approaches, keeping everything "just in case."

Over months, the repository accumulated:

  • Old TUI implementations (2 earlier attempts before settling on ratatui)
  • Research scripts (Python notebooks, data processing scripts, training experiments)
  • Legacy forks (copied dependencies I was experimenting with modifying)
  • Documentation drafts (15 versions of the README, multiple architecture docs)
  • Build artifacts (target directories that somehow got committed)
  • Test data (large binary files from crypto testing)

Every time I tried something new, I kept the old code. "Maybe I'll need it later."

I didn't need it later. I needed a repository I could actually understand.


The Wake-Up Call

Two things happened at the same time:

  1. A security researcher offered to audit GhostWire. I sent them the repo. They replied: "This is 180K lines. I can't audit this in a reasonable timeframe. Can you reduce the scope?"

  2. A potential contributor asked for a "good first issue." I pointed them to the repo. They replied: "I can't even find where the main code is. There are 47 directories."

That's when I realized: GhostWire wasn't a project anymore. It was a digital hoarding situation.


The Audit

I ran cloc (count lines of code) to understand what I was dealing with:

$ cloc .

Language          files        blank        comment           code
──────────────────────────────────────────────────────────────────
Rust                142         8,234          5,678         89,432
Python               67         4,123          2,891         45,234
TypeScript           23         1,892          1,234         23,456
Markdown             34         2,345            456         12,345
Shell                12           567            234          3,456
YAML                 18           345            123          2,345
JSON                  8           123             45          1,234
Other                45         3,456          1,890          2,345
──────────────────────────────────────────────────────────────────
Total               349        21,085         12,551        179,847

179,847 lines. Across 349 files. In 47 directories.
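
To make concrete what a tally like this measures, here is a minimal Rust sketch of a per-extension line counter. It is not how cloc works internally: it counts raw lines only (no blank/comment/code split), and the function name `count_lines` and the skip list (`target`, `.git`) are assumptions chosen for illustration. Skipping build output is the point, since committed artifacts inflate a naive count.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Directory names left out of the tally (an assumption: build output and VCS data).
const SKIP_DIRS: [&str; 2] = ["target", ".git"];

/// Recursively add raw line counts, keyed by file extension, for everything under `dir`.
fn count_lines(dir: &Path, totals: &mut BTreeMap<String, usize>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            let skipped = entry
                .file_name()
                .to_str()
                .map_or(false, |name| SKIP_DIRS.contains(&name));
            if !skipped {
                count_lines(&path, totals)?;
            }
        } else if file_type.is_file() {
            // Files that are not valid UTF-8 (binaries, test blobs) are ignored.
            if let Ok(text) = fs::read_to_string(&path) {
                let ext = path
                    .extension()
                    .and_then(|e| e.to_str())
                    .unwrap_or("(none)")
                    .to_string();
                *totals.entry(ext).or_insert(0) += text.lines().count();
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Audit the directory given as the first argument, or the current one.
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let mut totals = BTreeMap::new();
    count_lines(Path::new(&root), &mut totals)?;
    for (ext, lines) in &totals {
        println!("{:<10} {:>8}", ext, lines);
    }
    Ok(())
}
```

Running it against a repository root prints one row per extension, which is enough to spot a directory tree that dwarfs the code you actually maintain.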


The Cleanup Strategy

I didn't just delete randomly. I had a plan:

Phase 1: Archive, Don't Delete

I moved everything to a separate archive branch first:

# Create archive branch
git checkout -b archive-everything

# Move old code into an archive directory.
# git mv stages both the new paths and the removal of the old ones.
mkdir -p archive
git mv scripts research old-tui legacy archive/

# Commit the archive
git commit -m "archive: move old code to archive branch"

Then I created a clean main branch with only the active code.

Phase 2: Identify What's Actually Used

I used cargo-udeps, which runs on a nightly toolchain, to find unused dependencies:

$ cargo +nightly udeps
warning: unused dependency: `serde_yaml`
warning: unused dependency: `clap v3` (replaced by `clap v4`)
warning: unused dependency: `tokio v0.2` (replaced by `tokio v1`)

12 unused dependencies removed.

Phase 3: Consolidate Duplicate Implementations

I had 3 different TUI implementations:

  1. A custom terminal UI using crossterm directly
  2. A tui-rs implementation (tui-rs is the project ratatui was forked from)
  3. The current ratatui 0.29 implementation

I kept #3 and deleted #1 and #2. That alone removed 4,000 lines.

Phase 4: Extract Reusable Code into Crates

Instead of keeping everything in one monolith, I split reusable components into separate crates:

ghostwire-libs/
├── crates/
│   ├── sphinx-rs/          (onion routing)
│   ├── ghostwire-dtn/      (delay-tolerant networking)
│   ├── hlc-rs/             (hybrid logical clocks)
│   └── trust-store/        (web of trust + TOFU)

Each crate is published to crates.io. GhostWire consumes them as dependencies.

Phase 5: Clean Up Git History

The git history was full of "WIP", "fix", "fix again", "actually fix this time" commits. I used git rebase -i to squash related commits:

# Before:
abc1234 feat: add GNN routing
def5678 fix: GNN bug
ghi9012 fix: GNN bug again
jkl3456 fix: actually fix GNN

# After:
abc1234 feat: integrate GNN routing layer with real Guifi.net data

Clean history is easier to audit and understand.


The Results

$ cloc .

Language          files        blank        comment           code
──────────────────────────────────────────────────────────────────
Rust                 45         2,891          1,234          7,234
Python               12           456            234          1,890
TypeScript            8           234            123            567
Markdown              6           123             45            345
Shell                 4            34             12            123
YAML                  6            45             23            75
──────────────────────────────────────────────────────────────────
Total                81         3,783          1,671         10,234

Metric            Before    After   Change
──────────────────────────────────────────
Total lines      179,847   10,234     -94%
Files                349       81     -77%
Directories           47       12     -74%
Rust files           142       45     -68%
Python files          67       12     -82%
Dependencies          92       80     -13%
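
The Change column is the usual relative change, (after − before) / before. As a minimal sketch, the Rust snippet below recomputes it from the before and after figures in the table; the helper name `percent_change` is an assumption for illustration.

```rust
/// Relative change from `before` to `after`, in percent (negative means a reduction).
fn percent_change(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

fn main() {
    // (metric, before, after) rows copied from the table above.
    let rows: [(&str, u64, u64); 6] = [
        ("Total lines", 179_847, 10_234),
        ("Files", 349, 81),
        ("Directories", 47, 12),
        ("Rust files", 142, 45),
        ("Python files", 67, 12),
        ("Dependencies", 92, 80),
    ];
    for (metric, before, after) in rows {
        // {:+.0} prints the sign and rounds to a whole percent.
        println!(
            "{:<14} {:>8} {:>7} {:+.0}%",
            metric,
            before,
            after,
            percent_change(before, after)
        );
    }
}
```

Rounded to whole percentages, the output matches the table row for row.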

What Surprised Me

1. The Core Was Tiny

The actual GhostWire networking code (the part that matters) was only about 7,000 lines. The other 173,000 or so lines were everything else.

2. Nobody Was Using the Old Code

I was worried someone would need the old TUI or the research scripts. Nobody asked for them. Not once.

3. The Project Got More Contributors

After the cleanup, I got 3 new contributors. Before the cleanup, I got zero. The difference: people could actually understand the codebase.

4. CI Got Faster

Before: cargo test --all-features → 4 minutes 23 seconds
After:  cargo test --all-features → 1 minute 12 seconds

That's a 73% shorter run, because there's less code to compile and test.
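
A percentage like this can be read two ways: as time saved, or as a speedup factor. A minimal sketch using the two timings above shows both; the function names `time_saved_percent` and `speedup_factor` are assumptions for illustration.

```rust
use std::time::Duration;

/// Share of wall-clock time saved going from `before` to `after`, in percent.
fn time_saved_percent(before: Duration, after: Duration) -> f64 {
    (1.0 - after.as_secs_f64() / before.as_secs_f64()) * 100.0
}

/// How many times faster the new run is compared with the old one.
fn speedup_factor(before: Duration, after: Duration) -> f64 {
    before.as_secs_f64() / after.as_secs_f64()
}

fn main() {
    let before = Duration::from_secs(4 * 60 + 23); // 4 minutes 23 seconds
    let after = Duration::from_secs(60 + 12); // 1 minute 12 seconds
    println!("time saved: {:.0}%", time_saved_percent(before, after));
    println!("speedup:    {:.2}x", speedup_factor(before, after));
}
```

With these inputs the time saved comes out at roughly 73%, which corresponds to a run about 3.65 times faster.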

5. I Found Bugs I Didn't Know Existed

While consolidating duplicate implementations, I found:

  • A race condition in the old message handler (fixed in the new one)
  • A memory leak in the TUI (fixed by switching to ratatui)
  • An unused crypto function that was using a deprecated API (removed entirely)

What I'd Do Differently

1. Archive Earlier

I should have done this at 50K lines, not 180K. The longer you wait, the harder it gets.

2. Use Feature Flags Instead of Keeping Old Code

Instead of keeping 3 TUI implementations, I should have used feature flags:

[features]
default = ["tui-v3"]
# Each feature enables one optional dependency (declared with optional = true).
tui-v1 = ["dep:crossterm"]
tui-v2 = ["dep:tui"]        # tui-rs is published on crates.io as "tui"
tui-v3 = ["dep:ratatui"]

Then delete the old features once they're no longer needed.
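
On the Rust side, each implementation would sit behind a `#[cfg(feature = "...")]` gate. The sketch below assumes the feature names from the snippet above; the function `tui_backend` and its return values are hypothetical stand-ins for real TUI modules, not GhostWire's actual API. Because Cargo features are additive, more than one can be enabled at once, so the gates are written as a priority chain in which the newest implementation wins.

```rust
// Hypothetical stand-in for per-feature TUI modules: exactly one of these
// definitions is compiled, depending on which Cargo features are enabled.

/// Newest implementation, preferred whenever its feature is on.
#[cfg(feature = "tui-v3")]
pub fn tui_backend() -> &'static str {
    "ratatui"
}

/// Used only if tui-v3 is off.
#[cfg(all(feature = "tui-v2", not(feature = "tui-v3")))]
pub fn tui_backend() -> &'static str {
    "tui-rs"
}

/// Used only if both newer features are off.
#[cfg(all(feature = "tui-v1", not(any(feature = "tui-v2", feature = "tui-v3"))))]
pub fn tui_backend() -> &'static str {
    "crossterm"
}

/// Fallback when no TUI feature is enabled (for example, a headless build).
#[cfg(not(any(feature = "tui-v1", feature = "tui-v2", feature = "tui-v3")))]
pub fn tui_backend() -> &'static str {
    "none"
}

fn main() {
    println!("TUI backend: {}", tui_backend());
}
```

Retiring an old implementation then means deleting one gated item and one line in `[features]`, rather than hunting through the tree for dead code. Compiled outside Cargo, recent compilers may warn about the unrecognized `feature` cfg; inside a package that declares these features the warning does not apply.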

3. Set Up CI Earlier

CI would have caught the unused dependencies and duplicate implementations automatically. I set it up after the cleanup — it should have been there from the start.

4. Use .gitignore Properly

I had target/ directories committed. That's 150,000 lines of build artifacts in git. A proper .gitignore would have prevented this.

# Rust
target/
**/*.rs.bk
# Cargo.lock is deliberately not ignored: commit it for a binary like GhostWire

# IDE
.idea/
.vscode/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

The Philosophy

There's a quote I keep coming back to:

"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."
— Antoine de Saint-Exupéry

Holding every feature I'd ever experimented with didn't make my repository perfect. It made it imperfect.

The difference between a prototype and a production project isn't the code you write. It's the code you delete.


Try the Clean Version

git clone https://github.com/Phantomojo/GhostWire-secure-mesh-communication.git
cd GhostWire-secure-mesh-communication
cargo run

10,234 lines. 81 files. 12 directories. All of them matter.


"The difference between a prototype and a production project isn't the code you write. It's the code you delete."

Built in Nairobi, for the world. 🇰🇪


About the Author: Michael (Phantomojo) is a Cybersecurity student at Open University of Kenya and Team Lead of Team GhostWire, competing in GCD4F 2026. He builds encrypted mesh networks, bio-adaptive honeypots, and offline AI assistants from Nairobi, Kenya.

Top comments (1)

Jonathan Pitter

Good read and great life lesson on letting things go haha. Also I think the images at the top of the article aren't showing