Viljami Kuosmanen for epilot

Posted on Feb 4

We Made Coding Agents Actually Reliable By Fixing One Thing

#ai #agents #coding #documentation

Last week, Vercel published research showing that giving coding agents a compact index of your documentation dramatically outperforms letting them search for answers on demand. Their eval results: 100% task success rate with the index approach, versus only 79% when agents had to actively look things up.

Same agent, same tasks, different approach to context. The difference between working and not working.

The insight clicked immediately. If we could give Claude Code reliable access to our institutional knowledge without forcing it to decide when to look things up, it would fundamentally change how people work in our codebase.

So we built it.

The Context Problem

Coding agents have a token limit: a ceiling on how much information they can process at once. Think of it like working memory. You can't hand Claude Code your entire codebase and documentation library upfront. It's too much.

The traditional solution is skills: the coding agent decides when it needs information and actively looks it up. "I need to know about authentication, let me search for that." Sounds reasonable. In practice it creates three problems:

Decision paralysis - the agent has to decide when to look up docs, and it often guesses wrong
Async delay - every lookup is a round-trip that breaks flow
Sequencing conflicts - exploring code vs. consulting docs creates timing issues

Vercel's approach flips this: give Claude Code a compressed index of what documentation exists and where to find it, before it starts work. The index is small enough to fit in context every turn, so it always knows what's available. When it needs details, it reads the specific file directly. No decisions, no lookups, no conflicts.

Why Compression Matters

The key innovation is compression. A full documentation tree - all the folder structures, file names, categories - takes significant space. Too much to include in every conversation turn.

The compressed index uses a simple pipe-delimited format that shrinks this by ~80%:

[epilot Docs Index]|root: ./.epilot-docs|00-general:{tech-stack.md,business-context.md,ci-cd.md}|01-apis:{api-design.md,calling-apis.md}|02-epilot360-microfrontends:{env-vars.md,local-dev.md}|...

That single line tells the agent:

Where the docs live (.epilot-docs/)
What categories exist (00-general, 01-apis, etc.)
What files are in each category

It's a table of contents, not the full book. But it's enough. The agent sees the map, understands what's available, and pulls specific files when needed. The decision-making load disappears.
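To make the format concrete, here is a minimal sketch of how such a line could be decoded back into a root path and a category map. The function name parse_docs_index and the regular expression are illustrative assumptions, not part of epilot's or Vercel's tooling, and the agent itself never needs a parser: it simply reads the line.

```python
import re

# Matches one category segment such as "01-apis:{api-design.md,calling-apis.md}".
# The pattern is an assumption based on the sample line in this post.
SEGMENT = re.compile(r"^(?P<name>[^:{}]+):\{(?P<files>[^{}]*)\}$")


def parse_docs_index(line: str) -> dict:
    """Split a pipe-delimited index line into its root and category map."""
    root = None
    categories = {}
    for part in line.split("|"):
        part = part.strip()
        if part.startswith("root:"):
            root = part[len("root:"):].strip()
            continue
        match = SEGMENT.match(part)
        if match:
            files = [f for f in match.group("files").split(",") if f]
            categories[match.group("name")] = files
        # Anything else (the bracketed title, a trailing "...") is skipped.
    return {"root": root, "categories": categories}


sample = (
    "[epilot Docs Index]|root: ./.epilot-docs"
    "|00-general:{tech-stack.md,business-context.md,ci-cd.md}"
    "|01-apis:{api-design.md,calling-apis.md}"
    "|02-epilot360-microfrontends:{env-vars.md,local-dev.md}|..."
)
index = parse_docs_index(sample)
print(index["root"])                # ./.epilot-docs
print(sorted(index["categories"]))  # the three category names
```

Joining the root, a category, and a filename gives the path the agent would open, for example ./.epilot-docs/01-apis/calling-apis.md.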

Why This Matters

The accuracy improvement is significant (100% vs 79% task success), but it's not the real story.

The real story is what happens when you lower the barrier to contribution. Proper context enables the person closest to the problem to fix it, regardless of job title. Your PM can fix bugs. Designers can adjust component behavior. Support engineers can patch data issues. You remove bottlenecks. Context and authority live in the same person.

Coding agents are the equalizer. But they're only as good as the context you give them.

Most companies will throw Claude Code at their codebase and wonder why results are inconsistent. The agent hallucinates patterns. Makes incorrect assumptions. Writes code that doesn't match conventions.

The difference is context. Structured, compressed, always-available context about how your codebase works.

How to Build Your Own

The pattern is straightforward. Here's what we did at epilot:

Curate agent-friendly documentation - organize your internal knowledge: conventions, APIs, architectural patterns, framework usage, code style
Structure by domain - group related docs (general, backend, frontend, infrastructure, etc.)
Use descriptive filenames - the agent sees filenames in the compressed index before opening files. api-design.md is better than guidelines.md. error-handling.md is better than errors.md. Make filenames searchable and specific.
Automate updates - pull live data where possible (OpenAPI specs, schema definitions, framework docs)
Generate compressed index - use a simple format (pipe-delimited works well) that reduces the doc tree by ~80%
Embed in agent context - add the index to your CLAUDE.md or AGENTS.md file (the context files Claude Code reads)
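The "generate compressed index" step can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not our actual tool: build_docs_index is a hypothetical name, it only looks one directory level deep, and it only picks up .md files.

```python
import tempfile
from pathlib import Path


def build_docs_index(root: Path, title: str = "Docs Index") -> str:
    """Return a one-line, pipe-delimited index of the Markdown files under root.

    Assumes each immediate subdirectory of root is one category.
    """
    parts = [f"[{title}]", f"root: ./{root.name}"]
    for category in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(f.name for f in category.glob("*.md"))
        if files:
            parts.append(category.name + ":{" + ",".join(files) + "}")
    return "|".join(parts)


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    for rel in ["00-general/tech-stack.md",
                "01-apis/api-design.md",
                "01-apis/calling-apis.md"]:
        path = docs / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("# placeholder\n")
    index_line = build_docs_index(docs)

print(index_line)
# [Docs Index]|root: ./docs|00-general:{tech-stack.md}|01-apis:{api-design.md,calling-apis.md}
```

Sorting categories and filenames keeps the output stable between runs, so regenerating the index only produces a diff when the docs actually change.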

One thing worth highlighting: we don't just include our internal documentation. We also package docs for the frameworks and libraries we heavily use - single-spa, openapi-backend, openapi-client-axios, i18next, and Volt UI (our custom design system). When Claude Code needs to know how i18next pluralization works or how to register a single-spa parcel, it already has the answer. No hallucination, no outdated Stack Overflow posts, just accurate framework documentation.

What This Enables

We're already seeing daily usage across the team. Developers context-switch between services faster. Non-engineers contribute directly instead of filing tickets.

But the real potential is broader: if compressed context improves coding agents for technical documentation, why stop there?

Runbooks and incident response - on-call engineers with instant access to procedures
Customer domain knowledge - support teams with context on product behavior
Business logic - product decisions and their rationale, preserved and accessible

The pattern is the same: curate the knowledge, compress it, embed it in context, let the agent work.

The Constraints Are Disappearing

For decades, contributing to a codebase required deep technical knowledge. You needed to understand the language, the frameworks, the architectural patterns, the implicit conventions. The barrier was high.

Coding agents lower it. Claude Code, Cursor, and similar tools don't replace engineers. They make technical knowledge more accessible. With the right tooling and the right context, a PM can fix bugs. A designer can adjust styling logic. A support engineer can patch data issues.

The question isn't whether this is possible. It's how fast you can adapt.

Organizations that enable broader contribution will move faster than those that don't. The tools exist. The research is clear. What's missing is execution.

Start Simple

You don't need to document everything upfront. Start with the knowledge that causes the most friction:

Code style conventions - how you write TypeScript, naming patterns, file structure
Common patterns - how you handle authentication, API calls, error handling
Framework specifics - non-obvious usage of your frameworks and libraries
Internal APIs - if you have OpenAPI specs, even better

Create a simple doc structure:

docs/
  00-general/
    code-style.md
    tech-stack.md
  01-apis/
    api-design.md
    calling-apis.md
  02-backend/
    error-handling.md
    database-patterns.md

Generate the compressed index (pipe-delimited format, one segment per directory). Add it to your CLAUDE.md or AGENTS.md file - the context files that Claude Code and other coding agents read on startup. Done.
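If you regenerate the index regularly, you'll want the embed step to be repeatable. One possible way to do that, sketched below, is to keep the index between two marker comments and replace whatever sits between them on each run. The marker strings and the upsert_index name are assumptions made for this example, not a convention Claude Code or any other agent requires.

```python
# Hypothetical markers; any pair of unique strings would work.
START = "<!-- docs-index:start -->"
END = "<!-- docs-index:end -->"


def upsert_index(context_text: str, index_line: str) -> str:
    """Insert the index between marker comments, replacing any previous copy."""
    block = f"{START}\n{index_line}\n{END}"
    start = context_text.find(START)
    end = context_text.find(END)
    if start != -1 and end > start:
        # Replace the existing block so repeated runs do not stack copies.
        return context_text[:start] + block + context_text[end + len(END):]
    # No block yet: append one, separated by a blank line.
    return context_text.rstrip("\n") + "\n\n" + block + "\n"


# Index line for the example docs/ tree above, files sorted alphabetically.
index_line = (
    "[Docs Index]|root: ./docs"
    "|00-general:{code-style.md,tech-stack.md}"
    "|01-apis:{api-design.md,calling-apis.md}"
    "|02-backend:{database-patterns.md,error-handling.md}"
)
original = "# Project notes\n\nRun tests before committing.\n"
once = upsert_index(original, index_line)
twice = upsert_index(once, index_line)
print(once == twice)  # True: running it again changes nothing
```

In practice you would read CLAUDE.md or AGENTS.md from disk, pass its contents through a function like this, and write the result back, ideally from a script that runs whenever the docs change.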

The compressed index approach works! 🎉 Vercel's evals back it up: 100% task success versus 79% without it. We've validated it internally. Now it's about whether you'll adopt it before your competitors do.

Top comments (5)

PEACEBINFLOW • Feb 5

This really nails something a lot of teams are feeling but haven’t articulated well yet: the problem isn’t “agents aren’t smart enough,” it’s that we keep forcing them to decide when to be smart.

The compressed index idea is deceptively simple, but it flips the mental model in exactly the right way. You’re not asking the agent to hunt for truth mid-task; you’re giving it a map of the terrain up front so it can reason with confidence. That shift—from reactive lookup to always-on situational awareness—is the difference between flailing and flowing.

What resonated most for me is how this reframes reliability. Reliability doesn’t come from tighter prompts or more guardrails, it comes from reducing cognitive branching. When the agent always knows what exists, it stops hallucinating gaps and starts behaving like a teammate who’s read the handbook.

Also +1 to including framework and library docs alongside internal ones. That’s huge. A lot of “AI mistakes” are really just context asymmetry—humans know the non-obvious rules of the stack, the agent doesn’t. This closes that gap cleanly without bloating the context window.

This feels less like a Claude-specific trick and more like a general design principle: make structure cheap, decisions expensive, and access predictable. Once you do that, a lot of the “agent unreliability” narrative just… disappears.

Really solid work. This is one of those posts that quietly changes how people build.

Viljami Kuosmanen epilot • Feb 5

Absolutely! The same approach of course works with any coding agent, not just Claude Code. The tool we built already works out of the box with AGENTS.md as well as CLAUDE.md, depending on which one you have in the local repo.

Eka Terima epilot • Feb 5

Interesting read! Now even I can contribute to the codebase🤓

Salaria Labs • Feb 4

Reliability is the real bottleneck for agents right now, not intelligence.

Most teams I know stopped using agents because of unpredictability.

Interesting approach here — did it noticeably reduce retries or human corrections?

Viljami Kuosmanen epilot • Feb 4

Yes, huge difference! I no longer find myself having to stop the agent to tell it to go read docs, or to correct it on how I want things done. Since it knows what docs are available, it proactively checks the docs relevant to the task and understands our patterns & specific context.