Most conversations about coding agents eventually become model comparisons.
Which model is better?
Which one writes cleaner code?
Which one follows instructions better?
Which one can handle a larger repo?
Those questions matter.
But the more I use coding agents, the more I think we are missing a bigger part of the problem:
Agents often fail not because the model is bad, but because the repo is bad for agents.
A coding agent does not land in a clean abstract problem space.
It lands in your repository.
And your repository is part of the prompt.
The repo is part of the prompt
When an agent starts working in a codebase, it inherits everything around it.
It inherits the folder structure.
It inherits the naming conventions.
It inherits the missing docs.
It inherits the stale setup instructions.
It inherits the test suite that may or may not run.
It inherits the internal patterns that only a few people on the team understand.
It inherits the weird config files.
It inherits the shortcuts, inconsistencies, and half-written decisions that accumulated over time.
Then we ask it to make a safe change.
And when it gets something wrong, we blame the model.
Sometimes that is fair. Sometimes the model really does miss the point.
But often, the repo gave it a bad environment to work in.
“The model was bad” is sometimes the wrong diagnosis
I have seen agents fail in ways that look like intelligence problems at first.
For example:
- It rewrites too much code.
- It ignores a project convention.
- It adds a dependency the team would never approve.
- It cannot figure out how to run tests.
- It touches files it should probably avoid.
- It follows stale setup instructions.
- It makes a change that looks plausible but does not match how the repo works.
It is easy to call that a model failure.
But if a human engineer joined the same project with no onboarding doc, no architecture notes, no contribution guide, no clear test command, and no explanation of the team’s conventions, we would not call that engineer bad on day one.
We would say the repo has poor onboarding.
Agents have the same problem.
They just fail faster.
And they fail with more confidence.
Good agent workflows are boring
The best coding-agent workflows I have seen are not magical.
They are usually boring in a very useful way.
The repo has:
- A clear AGENTS.md or equivalent instruction file
- Documented setup steps
- Pinned tools and versions
- Obvious test commands
- CI that gives useful feedback
- Clear project conventions
- Safe boundaries around secrets and config
- Enough context for the agent to understand what “good” looks like
That does not make the agent perfect.
But it changes the failure mode.
Instead of guessing wildly, the agent has rails. It can make a bounded change, run the expected checks, and explain what it did.
That is a very different workflow from “here is a vague task, please modify a repo you barely understand.”
Repo readiness should be measurable
We already measure many parts of software quality.
We measure test coverage.
We measure lint errors.
We measure build status.
We scan dependencies.
We scan for leaked secrets.
We track CI health.
But when it comes to coding agents, we still use vague language:
“This repo works pretty well with agents.”
“Claude Code struggles here.”
“Codex keeps making weird changes.”
“Cursor is good in this repo but bad in that one.”
Those observations are useful, but they are not operational.
If agents are going to work inside real repositories, we need better questions:
- Does the repo explain its own conventions?
- Can an agent find the correct setup path?
- Are tool versions pinned?
- Are test commands discoverable?
- Are dangerous commands documented or isolated?
- Are secrets protected from casual reads?
- Is there enough project context to avoid broad guessing?
- Can this be checked repeatedly as the repo changes?
That last question matters.
Agent-readiness should not be a one-time cleanup. It should be something that can improve, regress, and be checked in CI like anything else.
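To make the questions above concrete, here is a minimal sketch of what a repeatable check could look like. It only tests whether certain files exist, and the check names and candidate file lists are assumptions chosen for illustration, not a standard. A real repo would swap in its own conventions.

```python
from pathlib import Path

# Hypothetical check names and candidate paths (assumptions for illustration).
# A check passes if at least one of its candidate paths exists in the repo.
CHECKS = {
    "instruction_file": ["AGENTS.md", "CLAUDE.md"],
    "pinned_tools": [".tool-versions", ".nvmrc", ".python-version"],
    "ci_config": [".github/workflows", ".gitlab-ci.yml"],
    "test_entrypoint": ["Makefile", "package.json", "pyproject.toml"],
}


def readiness_report(repo_root):
    """Return {check_name: bool} for the repo at repo_root."""
    root = Path(repo_root)
    return {
        name: any((root / candidate).exists() for candidate in candidates)
        for name, candidates in CHECKS.items()
    }


def missing_checks(repo_root):
    """Return the names of failing checks, sorted so output is stable."""
    report = readiness_report(repo_root)
    return sorted(name for name, ok in report.items() if not ok)
```

Because the result depends only on the files in the repo, a CI job could call `missing_checks(".")` on every push and fail the build when the list is not empty. That is what lets readiness regress visibly instead of silently.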
This is not about replacing engineers
I do not think the practical future is “agents do everything and engineers disappear.”
The more useful version is this:
Agents become part of the engineering workflow.
They help with refactors, tests, docs, migrations, investigations, repetitive changes, and implementation work.
But if they are going to be part of the workflow, the workflow has to be designed for them.
Most repositories today are still designed for humans who already have context.
Humans have Slack history, team memory, onboarding calls, product knowledge, and years of accumulated judgment.
Agents do not have that unless we write it down somewhere they can use.
That means the repository has to carry more of its own context.
The next layer of developer tooling
I think this is where a lot of developer tooling is heading.
Not just better chat interfaces.
Not just better editors.
Not just bigger context windows.
But repo-level systems that make codebases safer and more understandable for agentic work.
That means:
- Clearer instructions
- Better defaults
- Deterministic checks
- Safer automation boundaries
- Better CI integration
- More explicit governance around what agents can and cannot do
This is also the idea I have been exploring with a small open-source project called Charter.
Charter is an offline CLI that gives a repository a deterministic 0–100 readiness score for coding-agent workflows.
It checks areas like:
- Context
- Secrets
- MCP (Model Context Protocol) and tool safety
- Environment setup
- CI
- Tests
- Governance
Then it points to concrete gaps that make agents more likely to fail.
No LLM scoring.
No network calls.
Same repo, same score.
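To illustrate what "same repo, same score" means in practice, here is a rough sketch of a deterministic weighted score. This is not Charter's actual algorithm. The category names echo the list above, but the weights, the function names, and the input format are all assumptions made up for this example.

```python
# Hypothetical weights (assumptions, not Charter's real weighting).
WEIGHTS = {
    "context": 20,
    "secrets": 20,
    "tool_safety": 15,
    "environment": 15,
    "ci": 10,
    "tests": 10,
    "governance": 10,
}


def readiness_score(results):
    """Map per-category pass fractions (0.0 to 1.0) to a 0-100 integer.

    Categories missing from `results` count as 0. There is no randomness,
    no network access, and no model call, so equal inputs give equal scores.
    """
    total = sum(WEIGHTS.values())
    earned = 0.0
    for category, weight in WEIGHTS.items():
        fraction = results.get(category, 0.0)
        fraction = min(max(fraction, 0.0), 1.0)  # clamp out-of-range input
        earned += weight * fraction
    return round(100 * earned / total)


def gaps(results):
    """Return the categories that are not fully passing, sorted by name."""
    return sorted(c for c in WEIGHTS if results.get(c, 0.0) < 1.0)
```

For example, `readiness_score({"context": 1.0, "secrets": 0.5})` earns 20 + 10 of 100 possible points, so it returns 30, and `gaps` lists every category except `context`.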
The project is still early, but the framing has been useful for me:
If agents keep failing in a repo, maybe the repo is part of the problem.
GitHub: https://github.com/use-charter/charter
Final thought
Better models will help.
They will reason better. They will follow instructions better. They will make fewer mistakes.
But they will still inherit the environment we give them.
If the repo is confusing, unsafe, undocumented, and hard to verify, even a very good agent can make things worse.
The model matters.
The repo matters too.