My codebase was 83% of Cursor's context window and I didn't know it.
I found out because I wrote a script to check. Took about an hour. Here's what it found.
The problem
When Claude Code or Cursor loses track of your conventions mid-task, the usual advice is "your codebase is too big for the context window." But that's vague. How big is too big? Which files are the problem?
I wanted actual numbers, not vibes.
The script
context-scanner.py walks your project directory, estimates tokens per file (1 token per 4 characters, rough but useful), and shows where you land against each model's context limit.
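To make the estimate concrete, here is a minimal sketch of the walk-and-count step. It is not the actual context-scanner.py; the function names, the `SKIP_DIRS` set, and the choice to skip files that fail to decode as UTF-8 are all assumptions for illustration. Only the 4-characters-per-token heuristic comes from the description above.

```python
import os

CHARS_PER_TOKEN = 4  # the rough heuristic described above
SKIP_DIRS = {".git", "node_modules", "__pycache__"}  # assumed exclusions


def estimate_tokens(text: str) -> int:
    """Estimate tokens as character count divided by 4 (rounded down)."""
    return len(text) // CHARS_PER_TOKEN


def scan(root: str) -> dict[str, int]:
    """Walk `root` and return {relative_path: estimated_tokens}."""
    counts = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as fh:
                    text = fh.read()
            except (UnicodeDecodeError, OSError):
                continue  # assumed: binary or unreadable files are ignored
            counts[os.path.relpath(path, root)] = estimate_tokens(text)
    return counts


def top_files(counts: dict[str, int], n: int = 3) -> list[tuple[str, int]]:
    """Return the n largest files by estimated tokens, biggest first."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `top_files(scan("."))` would give the kind of per-file ranking shown in the output below. A real tokenizer would give different numbers per model, so treat these as ballpark figures.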
python3 context-scanner.py .
Output on my project:
=== Context Scanner ===
Total tokens (est): 52.9k
Files scanned: 34
--- Context window usage ---
Claude 3.5 Sonnet █████ 26.4% ✓ fits
GPT-4o ████████ 41.3% ✓ fits
Cursor default ████████████████ 82.6% ⚠ tight
Gemini 1.5 Pro █ 5.3% ✓ fits
--- Top files (token hogs) ---
10.4k package-lock.json
6.6k multi-agent-templates.md
6.3k agent-prompt-playbook.md
Fine for Claude and GPT-4o, tight for Cursor. And package-lock.json is chewing through 10k tokens, which is pure noise.
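The percentages are just the estimated total divided by each context limit. A rough sketch of that step follows. The limits in `LIMITS` are assumptions: they are the values that reproduce the percentages in the output above (for example, 52.9k / 64k is about 82.6%), not numbers taken from the script. The 75% "tight" threshold and the bar width are likewise made up for illustration.

```python
# Context limits in tokens. Assumed values: they reproduce the percentages
# shown in the scanner output, but are not taken from the script itself.
LIMITS = {
    "Claude 3.5 Sonnet": 200_000,
    "GPT-4o": 128_000,
    "Cursor default": 64_000,
    "Gemini 1.5 Pro": 1_000_000,
}

TIGHT_THRESHOLD = 0.75  # hypothetical cutoff for the "tight" warning
BAR_WIDTH = 20          # hypothetical: one block per 5% of the window


def usage(total_tokens: int, limit: int) -> float:
    """Fraction of the context window the codebase would occupy."""
    return total_tokens / limit


def status(fraction: float) -> str:
    """Classify a usage fraction as fits, tight, or over."""
    if fraction >= 1.0:
        return "over"
    if fraction >= TIGHT_THRESHOLD:
        return "tight"
    return "fits"


def report(total_tokens: int) -> list[str]:
    """Build one text line per model with a bar, a percentage, and a status."""
    lines = []
    for model, limit in LIMITS.items():
        frac = usage(total_tokens, limit)
        # Always draw at least one block, and cap the bar at full width.
        blocks = min(BAR_WIDTH, max(1, int(frac * BAR_WIDTH)))
        lines.append(f"{model:<18} {'█' * blocks} {frac:.1%} {status(frac)}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(52_900)))
```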
I added it to .claudeignore. The estimated total dropped by about 20% (10.4k of 52.9k tokens), and the fix took maybe 30 seconds.
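For a sense of the arithmetic, here is a small sketch that applies ignore patterns to per-file counts and measures the drop. The counts for the three named files come from the output above; the `other-files` bucket is simply the remainder (52.9k minus those three). The helper names are hypothetical, and `fnmatch` is only a simplified stand-in for how real ignore files match paths.

```python
from fnmatch import fnmatch


def apply_ignores(counts: dict[str, int], patterns: list[str]) -> dict[str, int]:
    """Drop every file whose path matches any of the glob patterns."""
    return {
        path: tokens
        for path, tokens in counts.items()
        if not any(fnmatch(path, pattern) for pattern in patterns)
    }


def reduction(before: dict[str, int], after: dict[str, int]) -> float:
    """Fraction of estimated tokens removed by the ignore patterns."""
    total_before = sum(before.values())
    return (total_before - sum(after.values())) / total_before


# Per-file estimates from the scanner output; "other-files" is the remainder.
counts = {
    "package-lock.json": 10_400,
    "multi-agent-templates.md": 6_600,
    "agent-prompt-playbook.md": 6_300,
    "other-files": 29_600,
}

kept = apply_ignores(counts, ["package-lock.json"])
print(f"{sum(kept.values())} tokens left, {reduction(counts, kept):.1%} removed")
```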
The conventions generator
While I was at it, I wrote a second script. It reads your codebase and spits out a CONVENTIONS.md documenting your actual patterns: import style, error handling, naming, framework, package manager.
python3 conventions-gen.py . > CONVENTIONS.md
Feed that to Claude Code at session start and it stops guessing your conventions from scratch every time.
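The detection behind a generator like this can be quite simple. The sketch below is not conventions-gen.py; it is a guess at how two of the listed checks (package manager and import style) might work, assuming a JavaScript project. The lockfile table, the regexes, and all function names are illustrative assumptions.

```python
import os
import re

# Lockfile name -> package manager (assumed mapping for a JavaScript project).
LOCKFILES = {
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
    "pnpm-lock.yaml": "pnpm",
}

# Rough patterns: `import x from "y"` versus `require("y")`.
ESM_IMPORT = re.compile(r"^\s*import\s.+\sfrom\s+['\"]", re.MULTILINE)
CJS_REQUIRE = re.compile(r"\brequire\(\s*['\"]")


def detect_package_manager(root: str) -> str:
    """Guess the package manager from whichever lockfile exists in root."""
    for lockfile, manager in LOCKFILES.items():
        if os.path.exists(os.path.join(root, lockfile)):
            return manager
    return "unknown"


def detect_import_style(sources: list[str]) -> str:
    """Report whichever import form appears more often across the sources."""
    esm = sum(len(ESM_IMPORT.findall(src)) for src in sources)
    cjs = sum(len(CJS_REQUIRE.findall(src)) for src in sources)
    if esm == 0 and cjs == 0:
        return "unknown"
    return "ES modules (import)" if esm >= cjs else "CommonJS (require)"


def render_conventions(package_manager: str, import_style: str) -> str:
    """Format the findings as a short Markdown document."""
    return "\n".join([
        "# CONVENTIONS",
        "",
        f"- Package manager: {package_manager}",
        f"- Import style: {import_style}",
    ])
```

Majority counting is a blunt instrument: a codebase in the middle of a migration would show a mix, and the output would only report the more common form.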
Get them
Both are free at builtbyzac.com/tools.html. No install, no signup. Just Python files you drop into your project.
Context scanner is ~60 lines, conventions generator is ~90. Easy to adapt.
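The thing that surprised me: the total count alone isn't that useful. The per-file breakdown is. In my case package-lock.json was the single biggest offender, at about a fifth of the total. Usually it's a lockfile or a build artifact, something the agent never needed to read in the first place. Build artifacts belong in .gitignore anyway; a lockfile should stay committed but be kept out of the agent's context.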
If this is useful, I also documented what I have learned about managing agent context, memory patterns, and multi-agent coordination: payhip.com/b/6rRkT. $29, about 40 pages.