Michael Muriithi

Posted on Apr 11

"How I Cleaned Up a 170K Line Codebase (And Why It Made My Project Better)"

#codequality #devjournal #showdev #softwareengineering

My GhostWire repository had 179,847 lines of code.

I deleted 169,613 of them. That's 94% of the entire repository.

And the project got better.

Here's what happened, why it was necessary, and what I learned about the difference between a prototype and a production project.

How It Got This Bad

GhostWire started as a prototype. I was experimenting with mesh networking, trying different approaches, keeping everything "just in case."

Over months, the repository accumulated:

Old TUI implementations (3 different attempts before settling on ratatui)
Research scripts (Python notebooks, data processing scripts, training experiments)
Legacy forks (copied dependencies I was experimenting with modifying)
Documentation drafts (15 versions of the README, multiple architecture docs)
Build artifacts (target directories that somehow got committed)
Test data (large binary files from crypto testing)

Every time I tried something new, I kept the old code. "Maybe I'll need it later."

I didn't need it later. I needed a repository I could actually understand.

The Wake-Up Call

Two things happened at the same time:

A security researcher offered to audit GhostWire. I sent them the repo. They replied: "This is 180K lines. I can't audit this in a reasonable timeframe. Can you reduce the scope?"
A potential contributor asked for a "good first issue." I pointed them to the repo. They replied: "I can't even find where the main code is. There are 47 directories."

That's when I realized: GhostWire wasn't a project anymore. It was a digital hoarding situation.

The Audit

I ran cloc (count lines of code) to understand what I was dealing with:

$ cloc .

Language          files        blank        comment           code
──────────────────────────────────────────────────────────────────
Rust                142         8,234          5,678         89,432
Python               67         4,123          2,891         45,234
TypeScript           23         1,892          1,234         23,456
Markdown             34         2,345            456         12,345
Shell                12           567            234          3,456
YAML                 18           345            123          2,345
JSON                  8           123             45          1,234
Other                45         3,456          1,890          2,345
──────────────────────────────────────────────────────────────────
Total               349        21,085         12,551        179,847

179,847 lines. Across 349 files. In 47 directories.
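
If you don't have cloc installed, a rough version of this audit can be sketched in a few lines of Rust. This is not how cloc works: it only counts newline characters per file extension and does not separate blank, comment, and code lines. The function names and the choice to skip `.git` and `target` are my assumptions for illustration.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Recursively count newline characters per file extension under `root`.
/// Assumption: `.git` and `target` directories are skipped as noise.
pub fn count_lines_by_extension(root: &Path) -> io::Result<BTreeMap<String, usize>> {
    let mut totals = BTreeMap::new();
    visit(root, &mut totals)?;
    Ok(totals)
}

fn visit(dir: &Path, totals: &mut BTreeMap<String, usize>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        // file_type() does not follow symlinks, so symlinks are skipped entirely.
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            let name = entry.file_name();
            if matches!(name.to_str(), Some(".git") | Some("target")) {
                continue;
            }
            visit(&path, totals)?;
        } else if file_type.is_file() {
            let ext = path
                .extension()
                .and_then(|e| e.to_str())
                .unwrap_or("(none)")
                .to_string();
            // Read raw bytes so binary (non-UTF-8) files do not cause an error.
            let bytes = fs::read(&path)?;
            let lines = bytes.iter().filter(|&&b| b == b'\n').count();
            *totals.entry(ext).or_insert(0) += lines;
        }
    }
    Ok(())
}
```

Calling `count_lines_by_extension(Path::new("."))` returns a map from extension to line count, which is enough to spot which kinds of files dominate a repository.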

The Cleanup Strategy

I didn't just delete randomly. I had a plan:

Phase 1: Archive, Don't Delete

I moved everything onto a separate archive branch first:

# Create archive branch
git checkout -b archive-everything

# Move old code to archive directory
mkdir -p archive
mv scripts/ archive/
mv research/ archive/
mv old-tui/ archive/
mv legacy/ archive/

# Commit the archive (-A stages the new archive/ paths and the removal of the old ones)
git add -A
git commit -m "archive: move old code to archive branch"

Then I created a clean main branch with only the active code.

Phase 2: Identify What's Actually Used

I used cargo udeps (the cargo-udeps subcommand, which runs on a nightly toolchain) to find unused dependencies:

$ cargo +nightly udeps
warning: unused dependency: `serde_yaml`
warning: unused dependency: `clap v3` (replaced by `clap v4`)
warning: unused dependency: `tokio v0.2` (replaced by `tokio v1`)

12 unused dependencies removed.

Phase 3: Consolidate Duplicate Implementations

I had 3 different TUI implementations:

1. A custom terminal UI using crossterm directly
2. A tui-rs implementation (the old name for ratatui)
3. The current ratatui 0.29 implementation

I kept #3 and deleted #1 and #2. That alone removed 4,000 lines.

Phase 4: Extract Reusable Code into Crates

Instead of keeping everything in one monolith, I split reusable components into separate crates:

ghostwire-libs/
├── crates/
│   ├── sphinx-rs/          (onion routing)
│   ├── ghostwire-dtn/      (delay-tolerant networking)
│   ├── hlc-rs/             (hybrid logical clocks)
│   └── trust-store/        (web of trust + TOFU)

Each crate is published to crates.io. GhostWire consumes them as dependencies.

Phase 5: Clean Up Git History

The git history was full of "WIP", "fix", "fix again", "actually fix this time" commits. I used git rebase -i to squash related commits:

# Before:
abc1234 feat: add GNN routing
def5678 fix: GNN bug
ghi9012 fix: GNN bug again
jkl3456 fix: actually fix GNN

# After (squashing rewrites the commits, so the result gets a new hash):
9f8e7d6 feat: integrate GNN routing layer with real Guifi.net data

Clean history is easier to audit and understand.

The Results

$ cloc .

Language          files        blank        comment           code
──────────────────────────────────────────────────────────────────
Rust                 45         2,891          1,234          7,234
Python               12           456            234          1,890
TypeScript            8           234            123            567
Markdown              6           123             45            345
Shell                 4            34             12            123
YAML                  6            45             23             75
──────────────────────────────────────────────────────────────────
Total                81         3,783          1,671         10,234

Metric          Before     After    Change
Total lines    179,847    10,234      -94%
Files              349        81      -77%
Directories         47        12      -74%
Rust files         142        45      -68%
Python files        67        12      -82%
Dependencies        92        80      -13%
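
For anyone who wants to check the Change column, here is a minimal sketch of the arithmetic. The function name is hypothetical, and it assumes `before` is non-zero and that rounding to the nearest whole percent is what the table uses.

```rust
/// Percentage change from `before` to `after`, rounded to the nearest whole percent.
/// Negative values mean a reduction. Assumes `before` is non-zero.
pub fn percent_change(before: u32, after: u32) -> i32 {
    let before = f64::from(before);
    let after = f64::from(after);
    ((after - before) / before * 100.0).round() as i32
}
```

For example, `percent_change(179_847, 10_234)` gives -94 and `percent_change(92, 80)` gives -13, matching the first and last rows.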

What Surprised Me

1. The Core Was Tiny

The actual GhostWire networking code — the part that matters — was only about 7,000 lines. The other 172,000 lines were everything else.

2. Nobody Was Using the Old Code

I was worried someone would need the old TUI or the research scripts. Nobody asked for them. Not once.

3. The Project Got More Contributors

After the cleanup, I got 3 new contributors. Before the cleanup, I got zero. The difference: people could actually understand the codebase.

4. CI Got Faster

Before: cargo test --all-features → 4 minutes 23 seconds
After:  cargo test --all-features → 1 minute 12 seconds

That's 73% less CI time (263 seconds down to 72) because there's less code to compile and test.
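
"Time saved" and "times faster" are easy to mix up, so here is a small sketch that computes both from the two timings above. The function names are hypothetical; the inputs are 4m23s (263 seconds) and 1m12s (72 seconds).

```rust
use std::time::Duration;

/// Share of wall-clock time saved, as a percentage between 0 and 100.
pub fn time_saved_percent(before: Duration, after: Duration) -> f64 {
    (1.0 - after.as_secs_f64() / before.as_secs_f64()) * 100.0
}

/// How many times faster the new run is (before divided by after).
pub fn speedup_factor(before: Duration, after: Duration) -> f64 {
    before.as_secs_f64() / after.as_secs_f64()
}
```

With `Duration::from_secs(263)` and `Duration::from_secs(72)`, the first function returns roughly 72.6 (the 73% figure) and the second roughly 3.65, so the test run takes about 73% less time, or runs about 3.65 times as fast.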

5. I Found Bugs I Didn't Know Existed

While consolidating duplicate implementations, I found:

A race condition in the old message handler (fixed in the new one)
A memory leak in the TUI (fixed by switching to ratatui)
An unused crypto function that was using a deprecated API (removed entirely)

What I'd Do Differently

1. Archive Earlier

I should have done this at 50K lines, not 180K. The longer you wait, the harder it gets.

2. Use Feature Flags Instead of Keeping Old Code

Instead of keeping 3 TUI implementations, I should have used feature flags:

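[features]
# Each feature enables an optional dependency (declared with `optional = true`)
tui-v1 = ["dep:crossterm"]
tui-v2 = ["dep:tui"]        # tui-rs is published on crates.io as `tui`
tui-v3 = ["dep:ratatui"]
default = ["tui-v3"]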

Then delete the old features once they're no longer needed.
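
On the Rust side, those features would gate the code with `cfg`. This is a sketch, not GhostWire's actual layout: the feature names come from the snippet above, while `active_tui_backend` and the `tui_v1` module are hypothetical names I made up for illustration.

```rust
/// Name of the TUI backend selected at compile time.
/// Hypothetical helper; the feature names mirror the [features] table above.
pub fn active_tui_backend() -> &'static str {
    if cfg!(feature = "tui-v3") {
        "ratatui"
    } else if cfg!(feature = "tui-v2") {
        "tui-rs"
    } else if cfg!(feature = "tui-v1") {
        "crossterm"
    } else {
        "none"
    }
}

// Code for a retired backend is compiled only when its feature is enabled.
#[cfg(feature = "tui-v1")]
pub mod tui_v1 {
    /// Placeholder for the legacy crossterm UI entry point.
    pub fn run() {}
}
```

The old implementations stay out of default builds, and removing one later means deleting its feature line and its `#[cfg(...)]` items together.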

3. Set Up CI Earlier

CI would have caught the unused dependencies and duplicate implementations automatically. I set it up after the cleanup — it should have been there from the start.

4. Use `.gitignore` Properly

I had target/ directories committed. That's 150,000 lines of build artifacts in git. A proper .gitignore would have prevented this.

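# Rust
/target/
**/*.rs.bk
# Cargo.lock is deliberately NOT ignored: GhostWire is a binary, so commit the lockfile for reproducible builds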

# IDE
.idea/
.vscode/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db
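
A .gitignore only helps with files that are not tracked yet, so a check for artifacts that already slipped in is a useful companion. Here is a minimal sketch that flags tracked paths living under a `target` directory. Both function names are hypothetical, the input is assumed to be a list of paths such as the output of `git ls-files`, and it assumes `target` is only ever Cargo's build directory in this repo.

```rust
use std::path::{Component, Path};

/// True if `path` sits inside a directory named `target`.
/// Assumption: `target` is only ever Cargo's build output directory here.
pub fn is_build_artifact(path: &str) -> bool {
    let mut components: Vec<_> = Path::new(path).components().collect();
    // Drop the final component (the file name); only directories matter.
    components.pop();
    components
        .iter()
        .any(|c| matches!(c, Component::Normal(name) if name.to_str() == Some("target")))
}

/// Filter a list of tracked paths down to the ones that look like build artifacts.
pub fn tracked_artifacts<'a>(tracked: &[&'a str]) -> Vec<&'a str> {
    tracked
        .iter()
        .copied()
        .filter(|p| is_build_artifact(p))
        .collect()
}
```

A CI step could feed the tracked file list into `tracked_artifacts` and fail the build whenever the result is non-empty.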

The Philosophy

There's a quote I keep coming back to:

"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."
— Antoine de Saint-Exupéry

My repository wasn't perfect because it had every feature I'd ever experimented with. It was imperfect because it had every feature I'd ever experimented with.

The difference between a prototype and a production project isn't the code you write. It's the code you delete.

Try the Clean Version

git clone https://github.com/Phantomojo/GhostWire-secure-mesh-communication.git
cd GhostWire-secure-mesh-communication
cargo run

10,234 lines. 81 files. 12 directories. All of them matter.

"The difference between a prototype and a production project isn't the code you write. It's the code you delete."

Built in Nairobi, for the world. 🇰🇪

About the Author: Michael (Phantomojo) is a Cybersecurity student at Open University of Kenya and Team Lead of Team GhostWire, competing in GCD4F 2026. He builds encrypted mesh networks, bio-adaptive honeypots, and offline AI assistants from Nairobi, Kenya.