"Just paste the repo into the model" runs into a hard wall: the context window. Paste too much and you get a truncation error, or — worse — the model silently drops the earliest files and answers from a partial picture. The fix is to treat "will it fit?" as a number you compute before you paste.
Step 1: estimate tokens without calling an API
You don't need a network round-trip to get a usable estimate. For source code, a blend of two signals usually lands within ~5–10% of what real BPE tokenizers report:
- Characters ÷ ~3.6 — code tokenizes denser than prose (more punctuation and identifiers).
- Count of word/symbol runs × ~1.15 — a second signal that corrects the char estimate on symbol-heavy files.
Average the two and you have a fast, offline token estimate. Good enough to answer "does it fit?"
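The blend described above can be sketched in a few lines of Python. This is a minimal illustration, not ctxpack's actual code: the function name `estimate_tokens` and the regex that defines a "word/symbol run" are assumptions, since the article only gives the two ratios.

```python
import re

# Assumption: a "run" is a stretch of word characters, or a stretch of
# non-space symbol characters. The article does not define it precisely.
RUN_PATTERN = re.compile(r"\w+|[^\w\s]+")

CHARS_PER_TOKEN = 3.6   # ratio quoted in the article for source code
TOKENS_PER_RUN = 1.15   # multiplier quoted in the article


def estimate_tokens(text: str) -> int:
    """Average a character-based and a run-based token estimate."""
    by_chars = len(text) / CHARS_PER_TOKEN
    by_runs = len(RUN_PATTERN.findall(text)) * TOKENS_PER_RUN
    return round((by_chars + by_runs) / 2)


sample = "def add(a, b):\n    return a + b\n"
print(estimate_tokens(sample))
```

Because it needs only the standard library, a check like this can run in a pre-commit hook or a CI step before anything is sent to a model.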
Step 2: check it against the model you're targeting
Context windows vary a lot, so budget against the specific model:
| Model | Context |
|---|---|
| Claude (Fable 5 / Opus / Sonnet) | 200K |
| GPT-5 | 400K |
| GPT-4.1 | 1M |
| Gemini 2.5 Pro | 1M |
Report the bundle as a percentage of the target window — "48K tokens = 24% of 200K" tells you at a glance whether you have room left for the actual conversation.
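As a rough sketch, the percentage report could look like this in Python. The window sizes are copied from the table above, with "1M" taken to mean 1,000,000 tokens. The dictionary keys, the function names, and the `reserve` parameter (headroom kept for the conversation itself) are hypothetical.

```python
# Window sizes copied from the table above; confirm them against the
# provider's documentation before relying on them.
CONTEXT_WINDOWS = {
    "claude": 200_000,
    "gpt-5": 400_000,
    "gpt-4.1": 1_000_000,
    "gemini-2.5-pro": 1_000_000,
}


def budget_report(tokens: int, model: str) -> str:
    """Describe a bundle as a share of the target model's window."""
    window = CONTEXT_WINDOWS[model]
    percent = 100 * tokens / window
    return f"{tokens // 1000}K tokens = {percent:.0f}% of {window // 1000}K"


def fits(tokens: int, model: str, reserve: int = 0) -> bool:
    """True if the bundle plus reserved headroom fits in the window."""
    return tokens + reserve <= CONTEXT_WINDOWS[model]


print(budget_report(48_000, "claude"))
print(fits(48_000, "claude", reserve=40_000))
```

Passing a non-zero `reserve` makes the check stricter, so a bundle that technically fits but leaves no room for follow-up questions is flagged early.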
Step 3: if it's over budget, trim by importance — not at random
When a repo is too big, the naive move (truncate the end) throws away whichever files happen to be last. Better: omit the largest file bodies first, but keep every file listed. The model still sees the full project map (so it knows payments/refund.ts exists) even if that file's body didn't make the cut.
With ctxpack this is one flag:
```
npx github:trongtruong110-ux/ctxpack . --fit 60000 -o context.md
ctxpack: 220 files packed
tokens: ~59,400
trimmed: 34 file(s) omitted to fit 60,000 tokens
```
Every file is still named in the index; only the biggest bodies are dropped to hit the budget.
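The trimming idea itself is small enough to sketch. The Python below is an illustration of "drop the largest bodies first, keep every path in the index" and is not taken from ctxpack's source. `SourceFile`, `trim_to_budget`, and `render_index` are hypothetical names, and the sketch ignores the tokens used by the index itself.

```python
from dataclasses import dataclass


@dataclass
class SourceFile:
    path: str
    tokens: int  # estimated token count of the file body


def trim_to_budget(files, budget):
    """Drop the largest bodies until the total fits; return (kept, omitted)."""
    kept = sorted(files, key=lambda f: f.tokens)  # smallest first
    omitted = []
    total = sum(f.tokens for f in kept)
    while kept and total > budget:
        largest = kept.pop()  # the last element is the largest remaining body
        total -= largest.tokens
        omitted.append(largest)
    return kept, omitted


def render_index(kept, omitted):
    """List every path, marking the files whose bodies were dropped."""
    lines = [f.path for f in kept]
    lines += [f"{f.path} (body omitted)" for f in omitted]
    return "\n".join(sorted(lines))


files = [
    SourceFile("payments/refund.ts", 5000),
    SourceFile("src/index.ts", 700),
    SourceFile("src/util.ts", 300),
]
kept, omitted = trim_to_budget(files, budget=1200)
print(render_index(kept, omitted))
```

Largest-first is a deliberately simple policy: it frees the most budget per omitted file, at the cost of sometimes dropping a large file that matters. A real tool might weight by relevance as well as size.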
The habit
- Estimate before you paste — "does it fit?" is answerable up front.
- Budget per model — a bundle that fits Gemini may blow Claude's window.
- Trim by size, keep the map — a partial bundle that still lists every file beats a truncated one that hides what's missing.
ctxpack is MIT-licensed and free: https://github.com/trongtruong110-ux/ctxpack. How do you currently decide what to include when a repo is too big for one prompt?
Top comments (2)
Treating context fit as a number to compute before pasting anything is the right habit, especially now that silent truncation is often more dangerous than a hard failure. The offline estimate is pragmatic because it gives teams a cheap preflight check instead of waiting until a long agent run starts making decisions from a partial repo snapshot. What matters after that is deciding what gets trimmed first and being able to verify which files the agent actually saw, since “fits in theory” and “used the right context” are different questions. That is where I’ve found trace tooling like agent-inspect useful around coding-agent workflows. I’d be curious whether you think the next step is better token budgeting, or better repo summarization layers so agents stop needing the raw code volume in the first place.
Interesting project. For me, it's important to apply best practices, such as keeping the number of lines of code under 400 or establishing a clear folder structure.