Creatman
The context problem nobody talks about: why AI coding agents waste 80% of tokens on files they already read yesterday

Every AI coding agent — Claude Code, Cursor, Codex, Gemini CLI — starts every session completely blind. It doesn't know your projects. It doesn't know your servers. It doesn't remember that you spent three hours yesterday debugging the payment system.

So it greps. It reads file after file. It SSHs into your server to check what's running. It asks you "which project?" for the hundredth time. By the time it's oriented, you've burned half your context window on reconnaissance.

I manage 15 projects across 4 VPS servers. That reconnaissance was burning hours of my time and most of the context window every day. So I built a fix.

The pattern: hierarchical context

The idea is dead simple. Instead of the agent searching bottom-up (grep everything → read files → build understanding), give it a top-down map:

Level 0: Project Map    — knows ALL your projects       (~2KB, always loaded)
Level 1: Project Detail — architecture of one project   (~5KB, on demand)
Level 2: Source Files   — actual code                   (only when needed)

That's it. Three files instead of fifty. The agent reads the map, knows where to look, and goes straight to the answer.

What this looks like in practice

Without hierarchy

You: "What payment methods does Project A support?"

Agent:

  1. Greps C:\Users\ for anything payment-related (3 tool calls)
  2. Finds 6 candidate files, reads them all (6 tool calls)
  3. Realizes it's the wrong project, searches more (4 tool calls)
  4. SSHs into your server to read the production config (2 tool calls)
  5. Finally answers — 15+ tool calls, 80K+ tokens, 8 minutes

With hierarchy

You: "What payment methods does Project A support?"

Agent:

  1. Reads Level 0 — sees Project A is at ~/projects/a/ (already loaded, 0 calls)
  2. Reads ~/projects/a/CLAUDE.md — sees "Payments: Stars + CryptoCloud" (1 call)
  3. Answers immediately — 1 tool call, ~15K tokens, 10 seconds

Same question. Same agent. Same model. The only difference is a 2KB file that says "here's where everything is."

Setting it up (10 minutes)

Step 1: Create your project map

Add this to your agent's global instruction file (~/.claude/CLAUDE.md for Claude Code, .cursorrules for Cursor, AGENTS.md for Codex):

## Project Map

| Project | Local path | Server |
|---------|-----------|--------|
| **Auth Service** | `~/projects/auth/` | prod-1:/root/auth/ |
| **Landing Page** | `~/projects/landing/` | Cloudflare Pages |
| **Mobile App** | `~/projects/mobile/` | — |
| **Admin Panel** | `~/projects/admin/` | prod-1 (Docker) |

### Servers
| Name | IP | Key |
|------|-----|-----|
| prod-1 | x.x.x.x | ~/.ssh/prod |
| staging | y.y.y.y | ~/.ssh/staging |

### Rule
Read project CLAUDE.md before reading source files.

This is your Level 0. It's ~2KB. The agent loads it automatically at the start of every session.
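If you have many projects, you can generate the first draft of this table instead of typing it. The sketch below scans a projects directory and emits the markdown rows; the directory layout and function name are assumptions, and the Server column can't be discovered locally, so it's left as a placeholder to fill in by hand.

```python
from pathlib import Path

def build_project_map(projects_root: str) -> str:
    """Emit a markdown table of projects found under projects_root.

    A rough starting point for the Level 0 map: one row per
    subdirectory, with the Server column left for manual editing.
    """
    rows = ["| Project | Local path | Server |",
            "|---------|-----------|--------|"]
    root = Path(projects_root).expanduser()
    if root.is_dir():
        for p in sorted(root.iterdir()):
            if p.is_dir() and not p.name.startswith("."):
                rows.append(f"| **{p.name}** | `{p}/` | — |")
    return "\n".join(rows)

# Usage: print(build_project_map("~/projects")), then paste the
# output into your global instruction file and fill in the servers.
```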

Step 2: Add CLAUDE.md to each project

In each project root, create a context file:

# Auth Service — CLAUDE.md

## Status: LIVE
API for user authentication. Handles OAuth, JWT, rate limiting.

## Tech Stack
Python 3.12, FastAPI, PostgreSQL, Redis

## Key Files
- main.py — entry point, route registration
- auth/jwt.py — token generation and validation  
- auth/oauth.py — Google/GitHub OAuth providers
- models/user.py — SQLAlchemy user model

## Deployment
- Server: prod-1 (x.x.x.x)
- Service: auth-service.service
- Logs: journalctl -u auth-service -f

This is Level 1. ~3-5KB per project. The agent reads it when you mention the project and immediately knows the architecture.
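Writing these by hand is fine for a couple of projects; for fifteen, a stub generator helps. This is a minimal sketch (the helper name and template are mine, not part of any tool) that drops an empty CLAUDE.md skeleton into a project for you to fill in:

```python
from pathlib import Path

# Skeleton mirroring the Level 1 sections above; every value is a
# TODO because only you know the real architecture.
TEMPLATE = """\
# {name} — CLAUDE.md

## Status: TODO
One-line description of what this service does.

## Tech Stack
TODO

## Key Files
- TODO

## Deployment
- Server: TODO
- Service: TODO
- Logs: TODO
"""

def scaffold_claude_md(project_dir: str) -> Path:
    """Create a CLAUDE.md stub in project_dir unless one already exists."""
    root = Path(project_dir).expanduser()
    target = root / "CLAUDE.md"
    if not target.exists():
        target.write_text(TEMPLATE.format(name=root.name.title()))
    return target
```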

Step 3 (optional): Add Graphify for code navigation

Graphify turns your codebase into a knowledge graph. Run it once per project:

pip install graphify
cd ~/projects/auth

In your AI agent:

/graphify .
graphify claude install

Now the agent has Level 1.5 — a structural map of your code. Before grepping, it consults the graph and knows exactly which file to read.
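To make "consults the graph" concrete: a code graph is just nodes (functions, classes) with file locations, so answering "where is X defined?" becomes a lookup instead of a grep. Graphify's actual output format isn't shown here, so this sketch assumes a simple hypothetical `graph.json` shape; adapt the keys to whatever your indexer emits.

```python
import json
from pathlib import Path

def find_definition(graph_path: str, symbol: str) -> list[str]:
    """Return the files that define `symbol`, according to the graph.

    Assumes a graph.json shaped like:
      {"nodes": [{"name": "...", "kind": "function", "file": "..."}]}
    This is a hypothetical format for illustration, not Graphify's
    documented schema.
    """
    graph = json.loads(Path(graph_path).read_text())
    return [n["file"] for n in graph.get("nodes", [])
            if n.get("name") == symbol]
```

One lookup like this replaces the grep-everything step: the agent reads one small JSON file instead of walking the source tree.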

Step 4 (optional): Connect Claude Desktop via MCP

If you use Claude Desktop, add Graphify as an MCP server:

{
  "mcpServers": {
    "graphify": {
      "command": "python",
      "args": ["-m", "graphify.serve", "/path/to/graphify-out/graph.json"]
    }
  }
}

Claude Desktop then calls query_graph automatically when you ask about your projects. No prompting needed — it just works.

Real test results

I ran the same questions with and without the hierarchy. Same model (Haiku — the cheapest), same machine, same projects.

"What is the architecture of Project A?"

| | Blind agent | With hierarchy |
|---|---|---|
| Tool calls | 12 | 1 |
| Behavior | Grep → read 4 files → build answer | Read CLAUDE.md → answer |
| Accuracy | Correct | Correct |

"Which of my projects use library X?"

| | Blind agent | With hierarchy |
|---|---|---|
| Tool calls | 44 | 2 |
| Behavior | Scan entire disk | Targeted grep in known paths |
| Accuracy | Missed 1 of 3 projects | Found all 3 |

"Where is Project B deployed? Service name? Logs?"

| | Blind agent | With hierarchy |
|---|---|---|
| Tool calls | 9 | 0 |
| Behavior | Read configs + SSH into server | Answered from context |
| SSH needed | Yes | No |

In the second test, the blind agent actually missed a project that the hierarchy-equipped agent found. More context didn't just save tokens — it produced better answers.

Why this works

AI coding agents are fundamentally search engines. When you ask a question, they search for the answer. The quality of the answer depends on the quality of the search.

Without context, the agent searches blind: grep everything, read everything, hope to find the right files. With a hierarchy, the search is directed: check the map, go to the right project, read the right file.

This isn't a new idea. It's how humans navigate codebases — you don't grep -r your company's entire monorepo every time someone asks about a service. You know which repo, which module, which file. The hierarchy gives the agent the same knowledge.

What this is NOT

  • Not a framework. It's a pattern — three markdown files.
  • Not a token compression tool. The savings come from not reading files, not from compressing them.
  • Not a replacement for Graphify. Graphify handles code-level navigation. This handles project-level navigation. They complement each other.
  • Not magic. If your project doesn't have a CLAUDE.md, the agent still greps. You have to write the context files.

The full setup

Templates, scripts, and multi-platform guides:

github.com/CreatmanCEO/ai-context-hierarchy

Includes:

  • Level 0 and Level 1 templates
  • Conversation indexing scripts (Claude Code sessions + Desktop export → searchable markdown)
  • VPS sync command template
  • Platform-specific setup for Claude Code, Cursor, Codex, Gemini CLI

Bonus: conversation indexing

Your past conversations with the AI contain architectural decisions, debugging sessions, deployment notes. But the agent can't search them.

The repo includes parsers that convert Claude Code session logs (.jsonl) and Claude Desktop exported chats into markdown files with YAML frontmatter:

---
title: "Fixed payment webhook"
date: 2026-04-14
project: auth-service
topics: ["webhook", "cryptocloud", "cloudflare"]
files_touched: ["payments.py", "webhook.py"]
---

Index these with Graphify and the agent can find "what did we decide about the payment flow last week" without you re-explaining it.
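The conversion itself is straightforward. This is a simplified sketch of the idea, not the repo's actual parser: it assumes each `.jsonl` line is an object with `role` and `content` keys, which is a stand-in for the real session log schema (that schema varies between tools and versions).

```python
import json
from pathlib import Path

def session_to_markdown(jsonl_path: str, title: str, project: str) -> str:
    """Render a session log as markdown with YAML frontmatter.

    Assumes each .jsonl line is an object with "role" and "content"
    keys; a simplified stand-in for real Claude Code session logs.
    """
    turns = [json.loads(line)
             for line in Path(jsonl_path).read_text().splitlines()
             if line.strip()]
    front = f'---\ntitle: "{title}"\nproject: {project}\n---\n\n'
    body = "\n\n".join(f"**{t['role']}:** {t['content']}" for t in turns)
    return front + body
```

The frontmatter fields are what make the files searchable later: an indexer can filter by `project` or `title` without parsing the conversation body.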

Start here

  1. Write a project map (5 minutes)
  2. Add CLAUDE.md to your main project (5 minutes)
  3. Ask the agent about your project in a new session
  4. Watch it answer without grepping

That's the whole thing. No dependencies, no installation, no configuration. Just three markdown files that turn your blind agent into one that knows where to look.


Built with Graphify for code-level navigation. Source and templates: ai-context-hierarchy.
