DEV Community

vamshidhar reddy

I built a linter that proves 74% of your AGENTS.md is wasting your AI agent's time

[Image: ctxlint command-line output]

If you use Claude Code, Cursor, Codex, or Gemini CLI, you probably have an AGENTS.md or CLAUDE.md sitting in your project root.

It gets loaded into every single session. Every token in it competes for attention with your actual task.

And most of them are full of junk.

The problem I kept seeing

I was reviewing context files across open-source repos and noticed the same patterns everywhere:

  • Directory trees that agents discover with ls in 200ms
  • "Built with React 18, TypeScript, and Tailwind CSS" — readable from package.json
  • References to `src/middleware/auth.ts` — a file that was renamed to `src/auth/middleware.ts` three months ago
  • "Use 2-space indentation" when .prettierrc already enforces it
  • Entire sections copy-pasted from the README

None of this helps the agent. All of it costs tokens on every session.

What I built

ctxlint — a CLI linter for AI agent context files.

```shell
npx @ctxlint/ctxlint check
```

8 deterministic rules. Zero LLM dependency. Pure filesystem analysis. Runs in milliseconds.

It supports AGENTS.md, CLAUDE.md, GEMINI.md, .cursorrules, and copilot-instructions.md.

The 8 rules

| Rule | Severity | What it catches |
| --- | --- | --- |
| `stale-file-ref` | error | References to files or directories that no longer exist |
| `stale-command` | error | Build/test scripts that don't match your actual package.json |
| `no-directory-tree` | error | Embedded directory structures agents discover on their own |
| `no-inferable-stack` | warn | Tech stack descriptions already in your config files |
| `redundant-readme` | warn | Sections that duplicate your README.md |
| `no-style-guide` | info | Coding style rules that belong in eslint/prettier/ruff |
| `max-lines` | warn | Files over 200 lines (production teams keep theirs under 60) |
| `token-budget` | warn | Token cost estimate with signal-to-noise breakdown |

Every rule runs against your actual codebase — it checks whether referenced files exist, whether npm scripts are real, whether your linter config already covers a style rule.

What it looks like

Running ctxlint check on a typical bloated context file:

```text
AGENTS.md

  ✗ no-directory-tree  Lines 4-7 contain a directory tree (~14 tokens)
     Agents discover file structure via ls/find — this just adds noise.

  ✗ stale-command  `npm run test:e2e` — script does not exist
     Available: dev, build, test, lint, typecheck

  ✗ stale-file-ref  `src/config/auth.ts` does not exist
     Stale refs mislead agents into searching for ghost files.

  ⚠ token-budget  24 lines, ~104 tokens
     Signal: 40 tokens (38%) ✓  Noise: 64 tokens (62%) ✗
     Ratio: 0.38 (poor)  Monthly: $0.09 → $0.03 (67% saved)

  Summary: 3 errors, 1 warning, 0 info
```
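The signal/noise math itself is easy to approximate. Here's a sketch assuming the common ~4-characters-per-token heuristic — ctxlint's exact estimator and classification may differ:

```javascript
// Rough token-budget sketch, assuming ~4 chars/token for English/markdown.
// How lines get classified as signal vs. noise is the hard part (the rules
// above do that); this just shows the arithmetic on top.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function tokenBudget(signalLines, noiseLines) {
  const signal = estimateTokens(signalLines.join("\n"));
  const noise = estimateTokens(noiseLines.join("\n"));
  return { signal, noise, ratio: +(signal / (signal + noise)).toFixed(2) };
}
```

With 40 signal tokens against 64 noise tokens you get the 0.38 ratio shown in the report above.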

I tested it on real repos

I didn't just test against toy fixtures. I cloned 8 popular open-source repos that have real context files maintained by their teams, and ran ctxlint against each one.

| Repo | Context file | Key findings |
| --- | --- | --- |
| next.js | AGENTS.md | Directory tree present. Multiple stale file refs — paths are relative to `packages/next/` but written as root-relative |
| langchain | CLAUDE.md | Directory tree. Stale monorepo paths. Parent-relative refs (`../`) that don't resolve from root |
| codex | AGENTS.md | 5 stale file refs — files live under the `codex-rs/` subdirectory but refs assume root |
| ruff | CLAUDE.md | Borderline redundant-readme overlap. Architectural naming convention flagged (false positive — we fixed it) |
| anthropic-cookbook | CLAUDE.md | Directory tree flagged correctly |

Overall precision: 91%. Zero crashes across all repos.

The most common issue by far: stale file references in monorepos. People write paths assuming root, but the files actually live two directories deep in a workspace package.

What surprised me

Directory trees are universal and universally useless. Almost every auto-generated context file has one. Agents don't use them — they run ls and find themselves. It's the single biggest source of token waste.

Stale commands are dangerous, not just wasteful. When your context file says npm run test:e2e but that script was renamed to test:integration last month, the agent runs the stale command, gets an error, spends tokens debugging a non-existent script, and then discovers the right one on its own. You paid triple.

The hardest rule to get right was redundant-readme. I use trigram overlap to detect similarity between context file sections and README sections. At 40% threshold, it catches real duplication but occasionally flags sections that share vocabulary without actually saying the same thing. Still tuning this one.
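The overlap check can be sketched like this — word-level trigrams with a containment-style ratio, as an illustration of the approach rather than ctxlint's exact math:

```javascript
// Word-level trigram overlap between two text sections (illustrative sketch).
function trigrams(text) {
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const grams = new Set();
  for (let i = 0; i + 2 < words.length; i++) {
    grams.add(words.slice(i, i + 3).join(" "));
  }
  return grams;
}

function overlap(a, b) {
  const ta = trigrams(a), tb = trigrams(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  const shared = [...ta].filter(g => tb.has(g)).length;
  // Divide by the smaller set so a short section fully contained
  // in a long README still scores high.
  return shared / Math.min(ta.size, tb.size);
}
```

The failure mode described above falls out of this directly: two sections that share phrasing ("run the test suite", "open a pull request") share trigrams even when they make different points.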

Style guide rules are always inferable. If you have .prettierrc or .eslintrc in your repo, every style rule in your context file is redundant. The agent reads the formatter output, not your prose.

The technical decisions

Why rule-based instead of LLM-powered? Because the problem is deterministic. Checking if src/auth.ts exists is fs.existsSync(). Checking if npm run test:e2e is a real script is JSON parsing. Using an LLM to do filesystem checks would be slower, more expensive, and less reliable.
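The stale-command check reduces to exactly that kind of deterministic lookup. A hedged sketch of the assumed shape, not ctxlint's internals:

```javascript
// Hypothetical stale-command check: a script reference is stale when it
// isn't a key under "scripts" in package.json.
function findStaleCommands(contextText, packageJsonText) {
  const scripts = Object.keys(JSON.parse(packageJsonText).scripts ?? {});
  const referenced = [...contextText.matchAll(/npm run ([\w:-]+)/g)].map(m => m[1]);
  return referenced.filter(name => !scripts.includes(name));
}

const pkg = JSON.stringify({ scripts: { dev: "vite", build: "vite build", test: "vitest" } });
// Flags "test:e2e" because it isn't a declared script.
console.log(findStaleCommands("Run `npm run test:e2e` before merging.", pkg));
```

No model call, no network, no ambiguity — just JSON parsing and a set lookup.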

Why zero dependencies beyond commander? Keeps the install fast, the attack surface small, and npx @ctxlint/ctxlint check works without polluting node_modules. No chalk (ANSI codes directly), no glob (recursive readdir), no YAML parser (regex for key fields in pyproject.toml).

Why synchronous I/O? A linter runs once and exits. fs.readFileSync is simpler than async/await chains and fast enough — the entire analysis takes under 100ms on repos with 10,000+ files.

Try it

```shell
# Check your context file
npx @ctxlint/ctxlint check

# Generate a minimal one from scratch
npx @ctxlint/ctxlint init --dry-run

# Strip the bloat from an existing file
npx @ctxlint/ctxlint slim --dry-run AGENTS.md

# Check for drift since last update
npx @ctxlint/ctxlint diff
```

What's next — and where I need help

This is early stage. 116 npm downloads on day 1 tells me the problem is real, but the tool needs work:

  • Monorepo awareness — the biggest source of false positives. Paths in context files often assume a workspace root, not the repo root
  • Python/Rust/Go ecosystems — currently strongest on Node.js projects. Need pyproject.toml, Cargo.toml, and go.mod command extraction
  • VS Code extension — inline diagnostics instead of CLI output
  • GitHub Action — run ctxlint in CI and fail on stale refs

Good first issues are tagged in the repo. Whether it's a one-line regex fix or a full new rule, contributions are welcome.

ctxlint npm package

GitHub: github.com/vamshidhar199/Ctxlint
npm: npmjs.com/package/@ctxlint/ctxlint


What's in your context file that you're not sure belongs there? Run npx @ctxlint/ctxlint check and share what it finds — I'm curious how it performs on projects I haven't tested yet.

Top comments (7)

Alois Sečkár

I like the idea. Just about that "116 npm downloads on day 1" - many, if not all, come from automated tools that monitor npmjs and test new packages for vulnerabilities and malicious code. So do not overestimate it ;) But I wish you luck with your project and hope you'll get natural numbers soon.

Benjamin Eckstein

Hey Vamshidhar,

The stale-file-ref rule landed immediately — I've watched agents spend multiple turns chasing a path that was renamed six months ago. But the signal-to-noise framing is the real insight. It's not just that noise wastes tokens — there's actual research showing context length hurts output quality even when the relevant info is right there. The noise competes for attention in a way that degrades the answer.

Ran into the same problem from a different angle: MCP tool schemas. My Atlassian server was loading 33 tools I'd explicitly disabled — around 10K tokens burning before my first prompt, every session. Same fix as your slim command: cut what the agent can discover itself (codewithagents.de/en/blog/the-22k-...). The disabledTools setting does nothing about it — you actually have to remove the server.

The no-directory-tree rule is the one I'd enforce first. Everyone adds them. Agents never use them.

Archit Mittal

The signal-to-noise ratio framing is the right way to think about this. I've been maintaining CLAUDE.md files across multiple automation projects and the drift problem is real — you refactor a module, rename a directory, and the context file becomes a source of hallucination fuel rather than guidance.

The stale-command rule is the one that would save me the most pain. I've watched agents burn 3-4 tool calls trying to run a script that was renamed two sprints ago, then "discover" the correct command and act like nothing happened. That's not just token waste — it's latency in interactive sessions where you're waiting on the agent.

One thing I'd push back on slightly: the no-inferable-stack rule needs nuance. Declaring "this project uses n8n workflows with Claude API calls" in a context file isn't redundant even if package.json has the deps listed — it's telling the agent the architectural intent, not just the dependency graph. The agent knowing you chose n8n over Temporal for orchestration changes how it reasons about your error handling patterns. Raw deps don't convey that.

Would love to see the diff command expanded into a pre-commit hook that blocks PRs when context files reference paths changed in the diff. That would catch staleness at the source instead of during lint.

Thomas Landgraf

@itskondrat's pushback is the interesting one here — there IS a difference between "the agent can discover this" and "the agent should treat this as a constraint." A bloated context file often mixes both, and a linter that treats all redundancy as waste will flag intentional constraints as noise.

The approach I've been exploring with SPECLAN (I'm the creator) takes this to the structural extreme: instead of one context file with everything, each requirement is its own Markdown file with YAML frontmatter. The agent loads only the specific requirement it's implementing, plus its parent feature for scope. Constraints live in the frontmatter (status, acceptance criteria, parent ID), not prose paragraphs that compete with instructions for token space.

The 74% waste number probably holds for monolithic files because they're doing double duty — they're both "here's what exists" (discovery, which the agent could figure out from code) and "here's what matters" (constraints, which it can't). Splitting those into separate files eliminates the linting problem entirely because each file has a single purpose.

Mykola Kondratiuk

not sure linting for token efficiency is the right frame. some of that "redundant" context is there because you need the agent to treat it as a constraint, not just a discovery. there’s a difference between "the agent can find this" and "the agent should prioritize this".

Marc Verchiani • Edited

Nice work and interesting project

stale-file-ref and stale-command are genuinely useful. Dead refs in context files are a real problem, especially in monorepos where paths drift after refactors. The agent wastes tokens chasing ghost files, that's real cost.

For the other rules I'm less sure — I'd need to do the testing as well:

no-directory-tree : imho depends on the repo. On a monorepo with 200+ dirs, a targeted tree saves the agent 10 recursive ls calls. The rule should probably check repo size before flagging.

no-inferable-stack : "FastAPI + Next.js 16 + Supabase" in a context file gives immediate framing without the agent parsing package.json, pyproject.toml and docker-compose. The token cost should be less than the 3 file reads.

That said, I've noticed agents re-read config files anyway even when the info is already in the context. They're trained to verify rather than trust static context.

Apex Stack

This is exactly the kind of tooling the agent ecosystem needs right now. I maintain a pretty large CLAUDE.md for a project with 15+ scheduled agents, and the stale-file-ref rule alone would catch so many phantom references that accumulate as the codebase evolves.

The token-budget analysis is the real killer feature though. Most people have no idea how much of their context file is signal vs. noise — and when every token competes for attention in the context window, the noise isn't just wasteful, it actively degrades the agent's output quality.

Curious about one thing: does ctxlint handle nested includes or references between context files? E.g., when your CLAUDE.md references a glossary.md or other project files via @-mentions — does it validate those transitive references too?
