DEV Community

massiron

Originally published at deepstrain.dev

repofuse: Compress your codebase into ~1500 tokens for LLM consumption

The Problem

You're working on a large codebase. You want to ask an LLM a question about it — refactor this module, explain that architecture, find security issues. So you dump the relevant files into the prompt.

Then the context window overflows and you hit the token limit. You trim files, lose context, and get worse answers. Or you pay for a larger context window (GPT-4-32k, for example) on every question.

If you're using an AI coding assistant (Copilot, Cursor, Claude Code, whatever), either your AI is seeing a shallow slice of your repo or you're burning money on context.

The Approach

I built repofuse — a zero-dependency Python tool that compresses your entire codebase into a ~1500 token structured summary. It doesn't dump source files. It extracts the signals an LLM actually needs:

  • Module tree — what files exist and how they're organized
  • Dependency graph — who imports what, module coupling
  • Risk-ranked functions — dead code, high cyclomatic complexity, security-sensitive patterns

All in a single JSON block the LLM can consume in one shot.
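
To make those signals concrete, here is a minimal sketch of how import edges and a rough complexity score can be pulled out of a file with Python's standard `ast` module. This is not repofuse's actual implementation: the function names, the sample source, and the set of node types counted as branches are all assumptions made for illustration.

```python
import ast

# Node types counted as decision points in this rough estimate.
# This rule set is an assumption, not necessarily what repofuse uses.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def imported_modules(source: str) -> list[str]:
    """Return the sorted module names imported anywhere in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # node.module is None for "from . import x", so those are skipped.
            found.add(node.module)
    return sorted(found)


def rough_complexity(source: str) -> dict[str, int]:
    """Map each function name to 1 + the number of branching nodes inside it."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores


# Hypothetical sample module, used only to exercise the two helpers.
SAMPLE = '''
import os
from db.models import User

def handle(user, amount):
    if user is None or amount <= 0:
        return None
    for item in range(amount):
        if item % 2:
            continue
    return user
'''

print(imported_modules(SAMPLE))   # ['db.models', 'os']
print(rough_complexity(SAMPLE))   # {'handle': 5}
```

Running helpers like these over every file gives the raw material for a dependency map and a ranked list of functions. Only the aggregated result needs to reach the LLM, not the source.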

Show, Don't Tell

pip install repofuse
cd your-project
repofuse fuse --output context.json

This takes a project with hundreds of files and produces a summary of roughly 1500 tokens. You can then feed it directly to an LLM:

cat context.json | your-llm-cli "Find security vulnerabilities in this codebase"

Or pipe it into your AI assistant's system prompt. The key insight: the LLM gets a map, not a dump. It knows what's there, how things connect, and where risk lives — without reading every line.
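
If you go the system-prompt route, one way to do it is sketched below. The prompt wording and the `build_system_prompt` helper are hypothetical, and no particular LLM client is assumed. The function only turns the parsed fuse file into a string you can pass to whatever API you use.

```python
import json


def build_system_prompt(context: dict) -> str:
    """Embed a parsed fuse file in a system prompt (wording is illustrative)."""
    # Re-serialize without extra whitespace, since indentation costs tokens too.
    compact = json.dumps(context, separators=(",", ":"))
    return (
        "You are reviewing a codebase. The JSON below is a structural summary, "
        "not the source itself. Ask for specific files if you need more detail.\n\n"
        + compact
    )


# Tiny stand-in for the contents of context.json.
example = {"tree": {"src/": ["a.py"]}}
print(build_system_prompt(example))
```

In practice you would load the real file first, for example with `json.loads(Path("context.json").read_text())`, and pass the result in.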

What's Inside

The output is a structured JSON block. Here's a simplified version of what you get:

{
  "tree": {
    "src/api/": ["routes.py", "middleware.py"],
    "src/db/": ["models.py", "migrations.py"]
  },
  "dependencies": {
    "src/api/routes.py": ["src/db/models.py", "src/lib/auth.py"]
  },
  "risk": [
    {
      "function": "handle_payment",
      "file": "src/payments/processor.py",
      "risk": "high",
      "reason": "cyclomatic complexity 15, no input validation"
    }
  ]
}

The LLM gets the architecture and the hotspots without wading through implementation details. If it needs more detail on a specific function, it can ask, and by then you've already saved roughly 95% of the tokens a full source dump would have cost.
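
That follow-up step can also be scripted. The sketch below assumes the simplified schema shown above (the real output may differ) and picks out the files worth sending in full. The second risk entry and the `files_to_expand` helper are made up for illustration.

```python
import json

# Simplified fuse data: the first entry comes from the example above,
# the second is a hypothetical low-risk entry added for contrast.
FUSE = json.loads("""
{
  "risk": [
    {"function": "handle_payment", "file": "src/payments/processor.py",
     "risk": "high", "reason": "cyclomatic complexity 15, no input validation"},
    {"function": "format_date", "file": "src/lib/dates.py",
     "risk": "low", "reason": "unused"}
  ]
}
""")


def files_to_expand(fuse: dict, level: str = "high") -> list[str]:
    """Return the distinct files holding functions flagged at `level`, in order."""
    seen = []
    for entry in fuse.get("risk", []):
        if entry.get("risk") == level and entry["file"] not in seen:
            seen.append(entry["file"])
    return seen


print(files_to_expand(FUSE))  # ['src/payments/processor.py']
```

Only those files then need to be added to the conversation, which keeps the expensive part of the prompt small.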

Where It Shines

  • Monorepos with hundreds of files that overflow every context window
  • CI/CD pipelines — generate the fuse file on every commit, keep your AI assistant always in sync
  • AI agents that need structured understanding of unfamiliar codebases
  • Cost-sensitive teams — every token you don't send is money saved
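
For the CI/CD case, a small wrapper script might look like the sketch below. It reuses the `repofuse fuse --output` call from earlier. Everything else is an assumption: the helper names, the budget value, and the four-characters-per-token estimate, which is only a rough heuristic and not a real tokenizer.

```python
import subprocess
from pathlib import Path

TOKEN_BUDGET = 2000  # hypothetical ceiling, chosen for illustration


def approx_tokens(text: str) -> int:
    """Very rough estimate: about four characters per token (a heuristic)."""
    return len(text) // 4


def check_budget(tokens: int, budget: int = TOKEN_BUDGET) -> bool:
    """True when the summary still fits the budget."""
    return tokens <= budget


def regenerate(output: str = "context.json") -> int:
    """Rebuild the fuse file with the CLI call shown earlier; return its rough size."""
    subprocess.run(["repofuse", "fuse", "--output", output], check=True)
    return approx_tokens(Path(output).read_text(encoding="utf-8"))
```

A CI step could call `regenerate()` on every commit and fail the job when `check_budget` returns `False`, so a summary that has quietly grown gets noticed early.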

Trade-offs

It's not a replacement for reading the code yourself. If you need line-by-line accuracy, dump the files. But if you're asking architectural questions — "how does auth work in this project?" or "which modules depend on the legacy data layer?" — the fuse file gives you better answers than a handful of files you guessed at.

Try It

Free, open-source, pip installable:

pip install repofuse

Repo: https://github.com/massiron/repofuse

Docs: https://deepstrain.dev

No registration. No API key. Just Python stdlib.

I'd love to hear where it breaks for your codebase — file an issue or drop a comment.
