Last June, Karpathy posted something that got 2.3 million views. His point: context engineering matters more than prompt engineering. He called it "the delicate art and science of filling the context window with just the right information for the next step."
Then last week he posted about building structured markdown knowledge bases that LLMs can reason over. Also went viral.
Both ideas point at the same problem: your AI is only as good as the context you give it. And right now, most of us are giving it terrible context.
## the problem nobody's measuring
Every time you start a Claude Code session, it spends the first chunk of time just figuring out your project. Reading files. Grepping for routes. Opening package.json. Exploring the import graph. Finding your schema. Checking your env vars.
I started measuring how many tokens this costs. On a real 92-file monorepo (Hono + Drizzle, 4 workspaces): ~66,000 tokens. Every session. Not cached between sessions.
On a 53-file project: ~46,000 tokens. On a 40-file project: ~26,000.
That's your AI burning through your context window (the "RAM" in Karpathy's analogy) just to understand the project before it does anything you actually asked for.
## what context engineering looks like in practice
If you follow Karpathy's framing, the solution is obvious: don't let your AI waste context exploring. Pre-compile the context it needs and hand it over at session start.
That's what I built. `npx codesight` scans your codebase via AST parsing and generates a structured context map (routes, schema, components, dependency graph, env vars, middleware, hot files) in one markdown file your AI reads immediately.
```bash
npx codesight
```
One command. Zero dependencies. It borrows TypeScript from your own node_modules for the compiler API. Falls back to regex for non-TS projects.
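The "borrows TypeScript from your node_modules" trick can be sketched roughly like this (a hedged reconstruction, not codesight's actual source; `pickParser` is a name I made up for illustration):

```javascript
// Sketch (not codesight's real implementation): resolve the host project's
// own TypeScript install, and fall back to regex mode when it isn't there.
function pickParser(projectRoot) {
  try {
    // `paths` makes require.resolve search the project's node_modules
    const tsPath = require.resolve('typescript', { paths: [projectRoot] });
    return { mode: 'ast', compiler: require(tsPath) };
  } catch {
    return { mode: 'regex', compiler: null };
  }
}

const parser = pickParser(process.cwd());
console.log(parser.mode); // 'ast' if typescript is installed here, else 'regex'
```

This is why the tool itself can ship with zero dependencies: the heavy compiler comes from the project being scanned.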
## the numbers
Real production codebases. Not toy demos.
| Project | Files | codesight Output | Manual Exploration | Reduction |
|---|---|---|---|---|
| SaaS A (Hono + Drizzle monorepo) | 92 | 5,129 tokens | ~66,040 tokens | 12.9x |
| SaaS B (raw HTTP + Drizzle) | 53 | 3,945 tokens | ~46,020 tokens | 11.7x |
| SaaS C (Hono + Drizzle, 3 workspaces) | 40 | 2,865 tokens | ~26,130 tokens | 9.1x |
Average: 11.2x. Your AI reads 3-5K tokens of structured context instead of burning 26-66K tokens exploring.
## why AST matters
Regex-based tools guess at your code structure. AST parsing actually understands it.
When TypeScript is in your project, codesight uses the real TypeScript compiler API. This means:
- Follows `router.use('/prefix', subRouter)` chains (regex misses nested routers)
- Combines NestJS `@Controller('users')` + `@Get(':id')` into `/users/:id`
- Parses tRPC `router({ users: userRouter })` nesting correctly
- Extracts Drizzle field types from `.primaryKey().notNull()` chains
- Detects middleware in route handler chains: `app.get('/path', auth, handler)`
- Filters out false positives like `c.get('userId')` that regex would match as routes
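To see why that last point matters, here's a minimal, hypothetical regex of the kind a naive scanner might use. It matches Hono's context getter just as happily as a real route registration, which is exactly the class of false positive an AST walk can rule out by checking what the receiver actually is:

```javascript
// A naive route-matching regex of the kind a regex-based scanner might use
// (hypothetical; not taken from any real tool).
const routeRe = /\b\w+\.(get|post|put|delete)\(\s*['"]([^'"]+)['"]/;

const realRoute = `app.get('/users/:id', handler)`;
const falsePositive = `const id = c.get('userId')`; // Hono context getter, not a route

console.log(routeRe.test(realRoute));     // true
console.log(routeRe.test(falsePositive)); // true -- regex can't tell the difference
```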
Zero false positives across all three benchmark projects.
25+ frameworks detected. 8 ORMs parsed. React/Vue/Svelte components with props.
## blast radius: context engineering for changes
Karpathy's framing isn't just about initial context. It's about giving your AI the right information for "the next step." When the next step is changing a file, your AI needs to know what breaks.
```bash
npx codesight --blast src/db/index.ts
```
BFS through the import graph. Shows every transitively affected file, route, and model.
On BuildRadar, changing the database module correctly identified 10 affected files, 33 routes, and all 12 models. Three hops deep.
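The core of a blast-radius query is simple enough to sketch: a BFS over the reverse import graph, starting from the changed file. This is my own illustration with a made-up miniature graph, not codesight's implementation:

```javascript
// BFS over the *reverse* import graph (importer -> importee edges inverted),
// starting from the file being changed.
function blastRadius(reverseGraph, changedFile) {
  const affected = new Set();
  const queue = [changedFile];
  while (queue.length > 0) {
    const file = queue.shift();
    for (const importer of reverseGraph[file] || []) {
      if (!affected.has(importer)) {
        affected.add(importer); // transitively affected
        queue.push(importer);   // keep walking up the graph
      }
    }
  }
  return [...affected];
}

// Invented miniature project: "file": [files that import it]
const reverseGraph = {
  'src/db/index.ts': ['src/models/user.ts', 'src/models/post.ts'],
  'src/models/user.ts': ['src/routes/users.ts'],
  'src/models/post.ts': ['src/routes/posts.ts'],
  'src/routes/users.ts': ['src/app.ts'],
  'src/routes/posts.ts': ['src/app.ts'],
};

console.log(blastRadius(reverseGraph, 'src/db/index.ts'));
// every file reachable from the db module: both models, both routes, the app entry
```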
Your AI reads this before touching the file. That's context engineering applied to refactoring.
## the wiki layer (Karpathy's latest idea, automated)
Karpathy's April 3rd post was about structured markdown wikis that LLMs can reason over. codesight v1.6.2 added `--wiki`, which does exactly this for your codebase:
```bash
npx codesight --wiki
```
It generates a wiki knowledge base in `.codesight/wiki/`: an `index.md` (~200 tokens) plus individual articles per topic. Your AI reads the index at session start, then pulls the one relevant article for each question.
Without codesight: AI reads 26-66K tokens exploring.
With codesight: AI reads 3-5K tokens (the full map).
With `--wiki`: AI reads ~200 tokens at start, then ~160-350 per question.
Combined reduction: ~91x.
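The ~91x figure is plausible on the post's own numbers. A back-of-envelope check (my reconstruction, not the author's exact formula), using the table's average manual-exploration cost against one index read plus one article read:

```javascript
// Rough check of the ~91x claim: average manual exploration from the table,
// divided by a wiki session of one index read plus one article read.
const manual = [66040, 46020, 26130]; // tokens, from the benchmark table
const avgManual = manual.reduce((a, b) => a + b, 0) / manual.length;

const low = avgManual / (200 + 350);  // index + heaviest article
const high = avgManual / (200 + 160); // index + lightest article

console.log(Math.round(low), Math.round(high)); // 84 128 -- ~91x sits inside this band
```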
## it generates context for everything
One command creates context files for every major AI tool:
```bash
npx codesight --init
```
- `CLAUDE.md` for Claude Code
- `.cursorrules` for Cursor
- `codex.md` for OpenAI Codex
- `AGENTS.md` for Codex agents
- `.github/copilot-instructions.md` for GitHub Copilot
Each pre-filled with your actual project structure.
## MCP server mode
```bash
npx codesight --mcp
```
Runs as a Model Context Protocol server. Your AI queries specific context on demand instead of loading everything. Session caching: the first call scans, and subsequent calls return instantly.
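That session-caching behavior amounts to memoizing the scan. A minimal sketch of the idea (not codesight's actual implementation; names invented):

```javascript
// Sketch of session caching: the first query triggers the expensive scan,
// later queries are served from the in-memory cache.
let cache = null;
let scans = 0;

function scanProject() {
  scans += 1; // stand-in for the expensive AST scan
  return { routes: ['/users/:id'], models: ['user'] };
}

function query(section) {
  if (cache === null) cache = scanProject(); // first call scans
  return cache[section];                     // subsequent calls are instant
}

console.log(query('routes')); // triggers the scan
console.log(query('models')); // served from cache
console.log(scans);           // 1 -- only one scan for the whole session
```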
## the relationship to caveman mode
The caveman prompt trick reduces output tokens (what the AI says back).
codesight reduces input/exploration tokens (what the AI reads to understand your project).
Caveman = make the AI talk less.
codesight = give the AI exactly what it needs to know.
They're complementary. Use both.
## try it
```bash
npx codesight
```
Zero deps. MIT. ~200ms scan time. Works with any Node.js project (and has regex fallback for Python, Go, Ruby, Rust, Java, Kotlin, Elixir, PHP).
https://github.com/Houseofmvps/codesight
If it saves you tokens, a star helps others find it too.
Karpathy defined the skill. This tool automates it.
Built by Kailesk Khumar, solo founder of houseofmvps.com.