DEV Community

Ville Takanen


How I Learned to Love the AGENTS.md

My experience with agent instruction files began when GitHub Copilot started supporting copilot-instructions.md. Suddenly, I could steer. Add a few lines about our architecture and conventions — and the stochastic beast would follow. It felt like a superpower.

So naturally, I drove harder.

The 8,000-Word Monster

The file grew the way these files always do. The agent picks a deprecated library, so you add a line. It gets an import wrong: another line. A colleague adds their preferences. Someone pastes in the style guide, just in case.

By the time we chose AGENTS.md as our main file (symlinked to CLAUDE.md for Claude Code), it was 8,000 words long. Every design pattern, every edge case, every principle we had learned the hard way.
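For anyone wanting to reproduce the single-source setup, here is a minimal sketch of the symlink step. It assumes AGENTS.md is the real file and CLAUDE.md is the link pointing at it; the function name `link_agent_files` is hypothetical, and the demo runs in a temporary directory rather than a real repository.

```python
import tempfile
from pathlib import Path


def link_agent_files(repo_root: Path, source: str = "AGENTS.md", alias: str = "CLAUDE.md") -> Path:
    """Make `alias` a relative symlink to `source` inside repo_root."""
    target = repo_root / source
    link = repo_root / alias
    if not target.is_file():
        raise FileNotFoundError(f"{target} does not exist")
    if link.is_symlink():
        link.unlink()  # replace an old link
    elif link.exists():
        # Refuse to overwrite a real file; merging it is a human decision.
        raise FileExistsError(f"{link} is a regular file, not a symlink")
    # A relative target keeps the link valid wherever the repo is cloned.
    link.symlink_to(source)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "AGENTS.md").write_text("Project: demo\n", encoding="utf-8")
        link = link_agent_files(root)
        print(link.name, "->", link.resolve().name)
```

On Windows, creating symlinks may need extra privileges, so treat this as a Unix-oriented sketch.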

And the results began to decline. The agent explored more widely, spent tokens on reasoning about irrelevant constraints, and confidently made mistakes in areas we had over-specified. The file intended to make the agent smarter was actually confusing it.

"AGENTS.md Is Harmful" (It's Not)

Around this time, ETH Zurich published a study (Gloaguen et al., 2026) testing instruction files across 138 real repositories. The hot takes wrote themselves: "context files reduce agent performance!" Twitter declared the concept dead.

That's not what the study says.

LLM-generated context files — like those produced by /init commands — lower success rates and increase costs by over 20%. Human-written files improve outcomes (+4%) but only if they are minimal and precise. The mechanism: agents follow instructions faithfully, so every unnecessary instruction expands the search space. The agent dutifully considers constraints that don't apply and wastes tokens on reasoning that doesn't help.

The study isn't saying you shouldn't write an AGENTS.md. It's saying that most of what people include in it is noise.

Learning to Think Like an Agent

Studying what makes a good AGENTS.md taught me something unexpected. It didn't just make me a better agent driver. It helped me understand how an agent "thinks."

Agents don't build mental models. They read, tokenize, and generate. Every token of instruction competes for attention with every token of the actual task. Context is zero-sum.

That reframed the whole problem. Not documentation — signal engineering. What's the minimum context that produces the maximum behavioral shift?

The Pink Elephant Problem

The first thing I cut was every "don't do X" line.

Tell an agent "Don't use Zustand" and see what happens. The agent thinks about Zustand for the next 200 tokens. You've made it salient. It might avoid it — probably — but you've focused attention on avoidance instead of guidance.

I found dozens of these. "Don't use class components." "Avoid raw SQL." "Never import from legacy." Every line anchored the agent to the exact thing I wanted it to ignore.

The fix: only say what to use. "State lives in local-storage" is a constraint. "Don't use Redux" is noise. One guides. One confuses.
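Hunting for these lines by eye gets tedious, so here is a rough sketch of a scanner that surfaces prohibition-style lines for rewording. The pattern list and the name `find_negative_rules` are my own assumptions, not part of any tool, and the heuristic is deliberately crude: it only finds candidates, and it will also flag any hard boundaries you decide to keep.

```python
import re

# Crude heuristic: words that usually open a prohibition. Extend to taste.
NEGATION = re.compile(r"\b(?:don['’]t|do not|never|avoid)\b", re.IGNORECASE)


def find_negative_rules(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for lines phrased as prohibitions."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if NEGATION.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = """\
Don't use class components.
Avoid raw SQL.
State lives in local-storage.
Never import from legacy.
"""

for lineno, line in find_negative_rules(sample):
    print(f"line {lineno}: consider rephrasing as a positive rule: {line}")
```

The positive rule in the sample passes untouched; the three prohibitions are reported with their line numbers.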

Toolchain First

The second cut was everything a tool already enforces. If Biome catches it, why is it in AGENTS.md? If tsc --strict rejects it, why write a rule?

Every remaining line had to face one test: can a tool enforce this? If yes, delete it. What remained was pure judgment: the decisions no linter can make.
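The "can a tool enforce this?" test can be sketched as a tiny triage pass. This is only an illustration: the `Rule` type, the `triage` function, and the example entries are hypothetical, and the tool mapping is filled in by hand rather than detected from biome.json or tsconfig.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    text: str
    enforced_by: Optional[str] = None  # tool that already catches this, if any


def triage(rules):
    """Split rules into (keep, delete): keep only what no tool enforces."""
    keep = [r for r in rules if r.enforced_by is None]
    delete = [r for r in rules if r.enforced_by is not None]
    return keep, delete


# Illustrative entries; which tool covers which rule is an assumption here.
rules = [
    Rule("Keep imports sorted", enforced_by="Biome"),
    Rule("No implicit any", enforced_by="tsc --strict"),
    Rule("Check the spec first"),
    Rule("Ask before changing an API contract"),
]

keep, delete = triage(rules)
for rule in delete:
    print(f"delete (covered by {rule.enforced_by}): {rule.text}")
for rule in keep:
    print(f"keep: {rule.text}")
```

The value is less in the code than in the column it forces you to fill in: every rule either names its enforcing tool or admits it is a judgment call.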

What's Left: 170 Words

After months of cutting, here's the structure. It fits on one screen.

Mission — Domain context that the agent can't infer from code.

Project: Real-time inventory management for warehouse staff.
Domain: logistics, low-bandwidth, fault tolerance.
Core: sync local state with the central DB when connectivity is restored.

Toolchain — Commands. No philosophy. Just facts.

- pnpm test        — vitest (fails below 85% coverage)
- pnpm lint        — Biome (see biome.json)
- pnpm e2e         — Playwright

Judgment Boundaries — Three tiers. No interpretation required.

NEVER: modify schema without migration, add deps without discussion
ASK:   change API contract, delete files
ALWAYS: run tests before commit, check the spec first

Everything an agent needs. Nothing it doesn't.
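One way to stop the file from creeping back toward 8,000 words is a word budget. The sketch below is an assumption on my part, not something the workflow above requires: the 200-word ceiling and the names `count_words` and `within_budget` are hypothetical, and words are counted by simple whitespace splitting.

```python
WORD_BUDGET = 200  # hypothetical ceiling, a little above the 170-word file


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def within_budget(text: str, budget: int = WORD_BUDGET) -> bool:
    """True when the instruction text fits inside the budget."""
    return count_words(text) <= budget


# The Judgment Boundaries block from above, used as sample input.
boundaries = """\
NEVER: modify schema without migration, add deps without discussion
ASK:   change API contract, delete files
ALWAYS: run tests before commit, check the spec first
"""

print(f"{count_words(boundaries)} words, within budget: {within_budget(boundaries)}")
```

In practice you would read AGENTS.md from disk and fail a pre-commit hook or CI step when `within_budget` returns False.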

What Changed

Since going minimal, three things have happened. Agent output quality went up — not because agents got smarter, but because I stopped confusing them. I stopped fighting the agent — when the file fits in your head, you can reason about what it'll do. And I started understanding agents — the discipline of writing minimal context forced me to think about attention, salience, and how LLMs process instructions.

Your AGENTS.md shouldn't be a manual. It should be a compass.


The AGENTS.md specification has the full pattern with research references. For the theory behind minimal context, see Context Engineering.
