repofuse: Compress your codebase into ~1500 tokens for LLM consumption


The Problem

You're working on a large codebase. You want to ask an LLM a question about it — refactor this module, explain that architecture, find security issues. So you dump the relevant files into the prompt.

Then you hit the token limit. You trim files, lose context, and get worse answers. Or you pay for a long-context model like GPT-4-32k on every question.

If you're using an AI coding assistant (Copilot, Cursor, Claude Code, whatever), either your AI is seeing a shallow slice of your repo or you're burning money on context.

The Approach

I built repofuse, a zero-dependency Python tool that compresses your entire codebase into a ~1500-token structured summary. It doesn't dump source files. It extracts the signals an LLM actually needs:

Module tree — what files exist and how they're organized
Dependency graph — who imports what, module coupling
Risk-ranked functions — dead code, high cyclomatic complexity, security-sensitive patterns

All in a single JSON block the LLM can consume in one shot.
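
To make the extraction idea concrete, here is a minimal sketch of how import edges and a rough complexity score can be pulled from Python source using only the standard library. This is not repofuse's actual implementation; the function names, the sample source, and the branch-counting heuristic are all assumptions made for illustration.

```python
import ast

# Node types that add a decision point. This is a rough approximation of
# cyclomatic complexity (an assumption for this sketch), not repofuse's metric.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)


def extract_imports(source: str) -> list[str]:
    """Return the sorted module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)


def rough_complexity(source: str) -> dict[str, int]:
    """Map each function name to 1 + the number of branch nodes inside it."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores


# Hypothetical module used only to exercise the two helpers.
SAMPLE = '''
import os
from db.models import User

def lookup(user_id):
    if user_id is None:
        return None
    for user in User.all():
        if user.id == user_id:
            return user
    return None
'''

print(extract_imports(SAMPLE))
print(rough_complexity(SAMPLE))
```

Running something like this over every file gives you the raw material for a dependency graph and a risk ranking. A real tool would also need to resolve module names to file paths and handle relative imports, which this sketch skips.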

Show, Don't Tell

pip install repofuse
cd your-project
repofuse fuse --output context.json

This takes a project with hundreds of files and produces a ~1500-token summary. You can then feed it directly to an LLM:

cat context.json | your-llm-cli "Find security vulnerabilities in this codebase"

Or paste it into your AI assistant's system prompt. The key insight: the LLM gets a map, not a dump. It knows what's there, how things connect, and where risk lives, without reading every line.
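
If you'd rather build the prompt in code than on the command line, the sketch below shows one way to embed the summary in a system prompt. The `build_system_prompt` helper and the prompt wording are hypothetical, and the inline dictionary stands in for the parsed contents of context.json so the example runs on its own.

```python
import json


def build_system_prompt(context: dict) -> str:
    """Embed the fuse summary in a system prompt as compact JSON."""
    # Compact separators drop the whitespace that pretty-printing adds,
    # which keeps the token count down.
    summary = json.dumps(context, separators=(",", ":"), sort_keys=True)
    return (
        "You are reviewing a codebase. The JSON below is a structural "
        "summary, not source code. Ask for specific files if you need them."
        "\n\n" + summary
    )


# Stand-in for the parsed context.json; in practice you would load the file
# with json.load instead of hard-coding a dictionary.
context = {
    "tree": {"src/api/": ["routes.py", "middleware.py"]},
    "dependencies": {"src/api/routes.py": ["src/db/models.py"]},
}

prompt = build_system_prompt(context)
print(prompt)
```

The resulting string can go wherever your LLM client accepts a system message. Your question then goes in the user message as usual.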

What's Inside

The output is a structured JSON block. Here's a simplified version of what you get:

{
  "tree": {
    "src/api/": ["routes.py", "middleware.py"],
    "src/db/": ["models.py", "migrations.py"]
  },
  "dependencies": {
    "src/api/routes.py": ["src/db/models.py", "src/lib/auth.py"]
  },
  "risk": [
    {
      "function": "handle_payment",
      "file": "src/payments/processor.py",
      "risk": "high",
      "reason": "cyclomatic complexity 15, no input validation"
    }
  ]
}

The LLM gets the architecture and the hotspots without wading through implementation details. If it needs more detail on a specific function, it can ask, and you've already saved 95% of the tokens a full dump would cost.
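
The "it can ask" step is easy to script as well. Here is a minimal sketch that reads the risk section and lists the files worth sending next. It assumes the simplified schema shown above (a "risk" list whose entries carry "function", "file", "risk", and "reason"); the real output may contain more fields, and `files_to_inspect` is a hypothetical helper, not part of repofuse.

```python
import json

# The simplified example from above, trimmed to the "risk" section.
SAMPLE = json.loads("""
{
  "risk": [
    {
      "function": "handle_payment",
      "file": "src/payments/processor.py",
      "risk": "high",
      "reason": "cyclomatic complexity 15, no input validation"
    }
  ]
}
""")


def files_to_inspect(context: dict, level: str = "high") -> list[str]:
    """Return the unique files that contain functions flagged at `level`."""
    seen = []
    for entry in context.get("risk", []):
        if entry.get("risk") == level and entry["file"] not in seen:
            seen.append(entry["file"])
    return seen


for path in files_to_inspect(SAMPLE):
    print(f"Send next: {path}")
```

This keeps the follow-up targeted: you send the one or two files behind the flagged functions, not the whole repo.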

Where It Shines

Monorepos with hundreds of files that overflow every context window
CI/CD pipelines — generate the fuse file on every commit, keep your AI assistant always in sync
AI agents that need structured understanding of unfamiliar codebases
Cost-sensitive teams — every token you don't send is money saved

Trade-offs

It's not a replacement for reading the code yourself. If you need line-by-line accuracy, dump the files. But if you're asking architectural questions — "how does auth work in this project?" or "which modules depend on the legacy data layer?" — the fuse file gives you better answers than a handful of files you guessed at.