If you use Claude Code, Cursor, or Aider, you know the pain.
You ask it to fix a bug and it dumps 15 files into context. 45K tokens wasted.
I built PMC Engine - AST-level compression that scores every symbol and sends only what's needed.
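To make "scores every symbol" concrete, here is a minimal sketch of the general idea using only Python's standard `ast` module. It is not PMC Engine's actual scoring. The word-overlap heuristic and the helper names `identifier_words`, `score_symbols`, and `select_symbols` are assumptions made up for illustration.

```python
import ast
import re


def identifier_words(text):
    """Split text into lowercase words, breaking snake_case and camelCase."""
    return {part.lower() for part in re.findall(r"[A-Za-z][a-z0-9]*", text)}


def score_symbols(source, task):
    """Score each top-level function/class by word overlap with the task text.

    Hypothetical heuristic: a real engine would likely use richer signals.
    """
    task_words = identifier_words(task)
    scored = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            words = identifier_words(node.name)
            words |= identifier_words(ast.get_docstring(node) or "")
            scored.append((len(words & task_words), node.name))
    # Highest score first, ties broken alphabetically for stable output.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def select_symbols(source, task, limit=1):
    """Keep the names of the best-scoring symbols that match at all."""
    return [name for score, name in score_symbols(source, task)[:limit] if score > 0]


SAMPLE = '''
def parse_token(raw):
    """Decode an auth token."""
    return raw.split(".")

def render_page(template):
    return template.upper()

class TokenCache:
    """Cache for decoded token values."""
    pass
'''

if __name__ == "__main__":
    print(select_symbols(SAMPLE, "fix crash when token parse fails", limit=2))
```

Only the selected symbols would then be sent to the model. Everything else stays out of context.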
The Numbers
FastAPI (48 files, 33K LOC, DeepSeek V4 Flash):
- Simple fix: 45K -> 7.1K (84%)
- Refactor: 85K -> 7.8K (91%)
- Complex: 148K -> 7.7K (95%)
- Avg: 91.8% reduction, 100% quality, <5ms
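The percentages above can be checked from the rounded token counts. A quick sketch, assuming "K" means thousands of tokens: the 91.8% figure appears to match the pooled reduction (total tokens before vs. after) rather than the plain mean of the three per-task percentages, which comes out near 89.9%. That reading is an inference from the rounded numbers, not something stated in the benchmark.

```python
# Token counts (before, after) taken from the list above, "K" read as thousands.
RUNS = {
    "simple fix": (45_000, 7_100),
    "refactor": (85_000, 7_800),
    "complex": (148_000, 7_700),
}


def reduction(before, after):
    """Percentage of tokens removed."""
    return 100.0 * (1 - after / before)


per_task = {name: reduction(b, a) for name, (b, a) in RUNS.items()}
mean_of_tasks = sum(per_task.values()) / len(per_task)

total_before = sum(b for b, _ in RUNS.values())
total_after = sum(a for _, a in RUNS.values())
pooled = reduction(total_before, total_after)

if __name__ == "__main__":
    for name, pct in per_task.items():
        print(f"{name}: {pct:.1f}%")
    print(f"mean of per-task reductions: {mean_of_tasks:.1f}%")
    print(f"pooled reduction: {pooled:.1f}%")
```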
Quick Start
pip install pmc-engine
pmc index ./my-project
pmc serve --port 8080
Top comments (1)
AST compression is the clever, underused lever - instead of dumping whole files into context, you send the structural skeleton (signatures, types, call graph) and only expand the bodies the task actually needs. 96% is plausible because most of what people paste into a model is irrelevant implementation detail; the model usually needs the shape, not every line. Treating code as a tree you can prune beats treating it as a flat blob of tokens.
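A minimal sketch of the "skeleton plus selective expansion" idea, using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). The `skeleton` function and its `expand` parameter are hypothetical names for illustration and say nothing about how PMC Engine does it. The sketch keeps signatures and type annotations and leaves call-graph extraction out.

```python
import ast


def skeleton(source, expand=()):
    """Render source with function bodies replaced by `...`.

    Functions whose names appear in `expand` keep their full bodies.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name not in expand:
            # Drop the implementation, keep the signature and annotations.
            node.body = [ast.Expr(value=ast.Constant(value=Ellipsis))]
    return ast.unparse(tree)


SRC = '''
def load(path: str) -> bytes:
    with open(path, "rb") as fh:
        return fh.read()

def checksum(data: bytes) -> int:
    total = 0
    for b in data:
        total += b
    return total % 256
'''

if __name__ == "__main__":
    # Per-task pruning: only the function the task touches is expanded.
    print(skeleton(SRC, expand={"checksum"}))
```

Calling `skeleton(SRC)` with nothing expanded gives the static-summary variant. Passing a per-task `expand` set gives the dynamic one.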
The thing I'd flag: AST-aware context selection pairs beautifully with model routing. Once you've cut the context 96%, the remaining work is small enough that cheap models handle most of it, compounding the savings. That combination (smart context + per-step routing) is exactly how Moonshift (a multi-agent pipeline shipping a prompt to a real SaaS) keeps a full build at a flat ~$3. Context compression and routing are the two biggest levers, and they multiply. Genuinely sharp technique. Are you doing the AST pruning per task (only expanding touched nodes) or as a static summary? The dynamic per-task version is where the 96% really lives.