DEV Community

Will your codebase fit in the context window? How to measure it (and trim to fit)

"Just paste the repo into the model" runs into a hard wall: the context window. Paste too much and you get a truncation error, or — worse — the model silently drops the earliest files and answers from a partial picture. The fix is to treat "will it fit?" as a number you compute before you paste.

Step 1: estimate tokens without calling an API

You don't need a network round-trip to get a usable estimate. For source code, a blend of two signals typically lands within ~5–10% of what real BPE tokenizers report:

  • Characters ÷ ~3.6: code produces more tokens per character than prose (more punctuation and identifiers).
  • Count of word/symbol runs × ~1.15: a second signal that corrects the character estimate on symbol-heavy files.

Average the two and you have a fast, offline token estimate. Good enough to answer "does it fit?"
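
As a rough illustration, the blend above can be sketched in a few lines of Python. This is not ctxpack's implementation; the function name `estimate_tokens` and the definition of a "run" (a maximal stretch of word characters, or a maximal stretch of non-space symbols) are assumptions made for the example, and the two constants are simply the rules of thumb from the list.

```python
import re

# Rule-of-thumb constants from the list above; tune them against a real tokenizer.
CHARS_PER_TOKEN = 3.6
TOKENS_PER_RUN = 1.15

# Assumption: a "run" is a stretch of word characters or a stretch of symbols.
RUN_PATTERN = re.compile(r"\w+|[^\w\s]+")


def estimate_tokens(text: str) -> int:
    """Average a character-based and a run-based token estimate."""
    if not text:
        return 0
    by_chars = len(text) / CHARS_PER_TOKEN
    by_runs = len(RUN_PATTERN.findall(text)) * TOKENS_PER_RUN
    return round((by_chars + by_runs) / 2)


if __name__ == "__main__":
    # 12 characters and 5 runs: (12 / 3.6 + 5 * 1.15) / 2 is about 4.5
    print(estimate_tokens("const x = 1;"))  # prints 5
```

Summing `estimate_tokens` over every file you plan to include gives the bundle total to compare against a budget.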

Step 2: check it against the model you're targeting

Context windows vary a lot, so budget against the specific model:

Model                               Context window
Claude (Fable 5 / Opus / Sonnet)    200K tokens
GPT-5                               400K tokens
GPT-4.1                             1M tokens
Gemini 2.5 Pro                      1M tokens

Report the bundle as a percentage of the target window — "48K tokens = 24% of 200K" tells you at a glance whether you have room left for the actual conversation.
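
The percentage report is simple arithmetic. Here is a minimal sketch, assuming the window sizes from the table above; the dictionary keys and the `budget_report` helper are hypothetical names chosen for the example, not part of any tool.

```python
# Context window sizes in tokens, copied from the table above.
# The keys are illustrative labels, not official model identifiers.
CONTEXT_WINDOWS = {
    "claude": 200_000,
    "gpt-5": 400_000,
    "gpt-4.1": 1_000_000,
    "gemini-2.5-pro": 1_000_000,
}


def budget_report(tokens: int, model: str) -> str:
    """Describe a bundle as a share of the target model's window."""
    window = CONTEXT_WINDOWS[model]
    percent = tokens / window * 100
    return f"{tokens / 1000:.0f}K tokens = {percent:.0f}% of {window // 1000}K"


if __name__ == "__main__":
    print(budget_report(48_000, "claude"))  # prints: 48K tokens = 24% of 200K
```

Whatever share the bundle takes, the remainder has to cover your instructions, the model's replies, and any follow-up turns.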

Step 3: if it's over budget, trim by importance — not at random

When a repo is too big, the naive move (truncate the end) throws away whichever files happen to be last. Better: omit the largest file bodies first, but keep every file listed. The model still sees the full project map (so it knows payments/refund.ts exists) even if that file's body didn't make the cut.

With ctxpack this is one flag:

npx github:trongtruong110-ux/ctxpack . --fit 60000 -o context.md
ctxpack: 220 files packed
  tokens: ~59,400
  trimmed: 34 file(s) omitted to fit 60,000 tokens

Every file is still named in the index; only the biggest bodies are dropped to hit the budget.
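
To make the idea concrete, here is a minimal sketch of "drop the largest bodies first, keep every path in the index". It is an illustration of the strategy described above, not ctxpack's actual code. The function names are hypothetical, the per-file token counts are assumed to come from whatever estimator you use, and the tokens spent on the index itself are ignored for simplicity.

```python
def choose_omissions(token_counts: dict[str, int], budget: int) -> list[str]:
    """Return the paths whose bodies to omit, largest first, until the rest fits."""
    total = sum(token_counts.values())
    omitted = []
    # Largest files first; ties are broken by path so the result is deterministic.
    for path, size in sorted(token_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if total <= budget:
            break
        omitted.append(path)
        total -= size
    return omitted


def render_index(token_counts: dict[str, int], omitted: list[str]) -> str:
    """List every file, marking the ones whose bodies were dropped."""
    skipped = set(omitted)
    lines = []
    for path in sorted(token_counts):
        note = " (body omitted)" if path in skipped else ""
        lines.append(f"{path}{note}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical token counts for a three-file project.
    counts = {"payments/refund.ts": 900, "README.md": 100, "src/app.ts": 300}
    dropped = choose_omissions(counts, budget=500)
    print(render_index(counts, dropped))
```

With a 500-token budget, only `payments/refund.ts` loses its body, yet it still appears in the index, so the model knows the file exists and can ask for it.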

The habit

  1. Estimate before you paste — "does it fit?" is answerable up front.
  2. Budget per model — a bundle that fits Gemini may blow Claude's window.
  3. Trim by size, keep the map — a partial bundle that still lists every file beats a truncated one that hides what's missing.

ctxpack is MIT-licensed and free: https://github.com/trongtruong110-ux/ctxpack. How do you currently decide what to include when a repo is too big for one prompt?

Top comments (2)

Raju Dandigam

Treating context fit as a number to compute before pasting anything is the right habit, especially now that silent truncation is often more dangerous than a hard failure. The offline estimate is pragmatic because it gives teams a cheap preflight check instead of waiting until a long agent run starts making decisions from a partial repo snapshot. What matters after that is deciding what gets trimmed first and being able to verify which files the agent actually saw, since “fits in theory” and “used the right context” are different questions. That is where I’ve found trace tooling like agent-inspect useful around coding-agent workflows. I’d be curious whether you think the next step is better token budgeting, or better repo summarization layers so agents stop needing the raw code volume in the first place.

Med Marrouchi

Interesting project. For me, it's important to apply best practices, such as keeping the number of lines of code under 400, for example, or establishing a clear folder structure.