If you've ever hit a token limit trying to feed a codebase to an LLM, you know the pain: truncate the files and lose critical context, or pay for more tokens than makes sense.
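Repofuse addresses this by compressing your entire codebase into a structured context pack of roughly 1,500 tokens: module tree, dependency graph, and risk-ranked function list in one portable JSON block. For a codebase that would otherwise cost about 30,000 tokens as raw source, that works out to a 95% token savings; the exact figure depends on the size of your project.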
Let's walk through it.
Install
pip install repofuse
Zero dependencies — pure Python, stdlib only. Works with any Python project on Linux, macOS, or Windows.
One-shot run
repofuse .
The JSON is written to stdout, so you can redirect it to a file, pipe it to a clipboard tool, or feed it directly to an LLM:
repofuse . > context-pack.json
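If you'd rather feed the pack to an LLM from a script, here is a minimal sketch of one way to do it. The helper names `load_pack` and `build_prompt` and the prompt wording are illustrative assumptions, not part of repofuse; the only thing taken from above is the `context-pack.json` file name.

```python
import json
from pathlib import Path


def load_pack(path: str = "context-pack.json") -> dict:
    """Read a context pack written by `repofuse . > context-pack.json`."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def build_prompt(pack: dict, question: str) -> str:
    """Prefix a question with the pack, serialized compactly to save tokens.

    Hypothetical helper: the prompt wording is an assumption, adjust to taste.
    """
    pack_json = json.dumps(pack, separators=(",", ":"), sort_keys=True)
    return (
        "You are given a structural summary of a Python codebase as JSON.\n"
        f"{pack_json}\n\n"
        f"Question: {question}"
    )
```

Calling `build_prompt(load_pack(), "Which module should I read first?")` gives you a single string to paste into a chat or pass to whatever client library you use.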
The output
A context pack contains three sections:
- module_tree: All source files arranged as a tree, with line counts per file. Your LLM sees the project skeleton immediately.
- dependency_graph: Edges between modules (who imports whom). Enables reasoning about coupling and change impact.
- risk_ranked_functions: Functions sorted by structural risk indicators (cyclomatic complexity, branch count, number of imports). Lets the LLM focus attention on the most critical code.
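To make the "who imports whom" idea concrete, here is a rough sketch of how import names can be collected from a single file with the standard-library `ast` module. This is not repofuse's actual implementation, which isn't shown in this post; `imported_modules` is a hypothetical helper, and resolving module names to file paths is left out.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Collect the module names a piece of Python source imports.

    Illustrative only: bare relative imports (`from . import x`) carry no
    module name and are skipped, and names are not resolved to file paths.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names
```

Running this over every file and pairing each file with the modules it imports yields the kind of edge list the dependency graph describes.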
Example snippet:
{
  "module_tree": {
    "src/app.py": 120,
    "src/models.py": 85
  },
  "dependency_graph": [
    {"from": "src/app.py", "to": "src/models.py"}
  ],
  "risk_ranked_functions": [
    {"name": "process_payment", "file": "src/payments.py", "risk_score": 0.87, "reason": "High cyclomatic complexity, 5 conditional branches, 3 external imports"}
  ]
}
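The dependency edges are also easy to use directly. As a sketch of the "change impact" idea, the function below walks the `from`/`to` edges backwards to find every module that directly or transitively imports a changed file. It assumes the edge shape shown in the snippet above; `impacted_modules` is a hypothetical helper, not a repofuse API.

```python
from collections import defaultdict, deque


def impacted_modules(edges: list[dict], changed: str) -> list[str]:
    """Return modules that directly or transitively import `changed`."""
    # Reverse index: module -> set of modules that import it.
    importers = defaultdict(set)
    for edge in edges:
        importers[edge["to"]].add(edge["from"])

    seen = set()
    queue = deque([changed])
    while queue:  # breadth-first walk over the reversed edges
        module = queue.popleft()
        for importer in importers.get(module, ()):
            if importer not in seen:
                seen.add(importer)
                queue.append(importer)
    seen.discard(changed)  # an import cycle could lead back to the start
    return sorted(seen)
```

With the example pack, passing `pack["dependency_graph"]` and `"src/models.py"` would report `src/app.py` as affected.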
CI integration
Add it to your CI pipeline so every commit ships an up-to-date context pack:
# .github/workflows/context-pack.yml
name: Update context pack
on: [push]
permissions:
  contents: write  # needed for the auto-commit step when GITHUB_TOKEN defaults to read-only
jobs:
  pack:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install repofuse
      - run: repofuse . > context-pack.json
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: "auto: update context pack"
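If you want the pipeline to fail loudly when the generated file looks wrong, a small check script can run before the commit step. The sketch below is an assumption on my part rather than anything repofuse ships: `validate_pack` is a hypothetical helper, the section names come from the output described above, and the `max_chars` budget is an arbitrary threshold you would tune yourself.

```python
import json

REQUIRED_SECTIONS = ("module_tree", "dependency_graph", "risk_ranked_functions")


def validate_pack(text: str, max_chars: int = 8000) -> list[str]:
    """Return a list of problems found in a context pack; empty means OK."""
    try:
        pack = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(pack, dict):
        return ["top-level value is not a JSON object"]

    problems = [
        f"missing section: {section}"
        for section in REQUIRED_SECTIONS
        if section not in pack
    ]
    # Crude size guard; character count is only a rough proxy for tokens.
    if len(text) > max_chars:
        problems.append(f"pack is {len(text)} chars, over budget of {max_chars}")
    return problems
```

In CI you could read `context-pack.json`, call `validate_pack` on its contents, and exit with a non-zero status if the returned list is not empty.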
When would you use this?
- Your team's monorepo has hundreds of files, and Claude or GPT-4 keeps forgetting the module structure after two files.
- You're building an AI agent that needs to understand a codebase before writing code in it. A context pack is far more reliable than a few random source files.
- You're onboarding to a new repo and want to dump the whole thing into an AI chat in one shot.
Limitations (honest ones)
- Python only: repofuse parses Python AST. It won't read TypeScript, Go, or Rust (yet).
- Static analysis only: risk scores are based on structural metrics, not runtime data. A function with a high risk score might be perfectly safe if it's well-tested.
- Tree + deps + risk, not code: the output summarizes your source files rather than containing them, so you still need the actual code for line-level details. The context pack is a map, not the territory.
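To show what "structural metrics" means in practice, here is a minimal sketch that counts branching constructs per function using the standard-library `ast` module. It is an illustration of the general technique, not repofuse's scoring formula (which this post doesn't spell out); `branch_counts` and the choice of node types in `BRANCH_NODES` are my own assumptions.

```python
import ast

# Node types treated as "branches" here; this selection is an assumption.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def branch_counts(source: str) -> dict[str, int]:
    """Map each function name to the number of branching nodes inside it.

    Nested functions are counted on their own and also contribute to the
    function that contains them.
    """
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts[node.name] = sum(
                isinstance(child, BRANCH_NODES) for child in ast.walk(node)
            )
    return counts
```

A count like this says nothing about tests or runtime behavior, which is exactly the limitation described above: it flags code that is structurally busy, not code that is actually broken.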
Try it
pip install repofuse && repofuse .
Repo: github.com/massiron/repofuse
Docs: deepstrain.dev
Free and open-source.