In Q1 2026, developers still build the agent harness.
By Q3 2026 or 2027, the LLM builds its own harness.
Today, every AI coding agent — Claude Code, Cursor, Codex, Gemini CLI, Aider, you name it — depends on the same hidden layer:
the files that brief the agent before it starts work.
- AGENTS.md
- CLAUDE.md
- .cursor/rules
- SKILLS/
- MCP server lists
- memory schemas
- test commands
- lint commands
- “Do not touch these paths.”
- “Require human approval before this.”
Different IDE, same boilerplate.
Different repo, same boilerplate.
Different agent, same boilerplate.
That is the agent harness problem.
The hidden work behind AI coding agents
Most people talk about the coding agent itself.
But in practice, the quality of an AI coding session often depends on the context layer around the agent.
Before the agent starts coding, it needs to know:
- what kind of project this is
- what framework it uses
- what files are important
- what commands run tests
- what commands run linting
- what paths should not be touched
- what tools are available
- what memory should persist
- what failure modes to avoid
- what coding conventions to follow
- when human approval is required
Without this layer, even strong coding agents can make subtle mistakes.
With this layer, the same agent can behave much more consistently.
That layer is what I call the harness.
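To make the list above concrete, here is a minimal sketch of that context as a data structure, rendered into an instruction file. The class name `HarnessContext`, its fields, and the output layout are all assumptions made for illustration. They are not harnessforge's schema or any agent's required format.

```python
from dataclasses import dataclass, field


@dataclass
class HarnessContext:
    """Hypothetical container for the facts an agent needs before it starts."""

    project_type: str
    framework: str
    test_command: str
    lint_command: str
    protected_paths: list[str] = field(default_factory=list)
    approval_required_for: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the context as a short instruction file."""
        lines = [
            "# Project context",
            f"- Project type: {self.project_type}",
            f"- Framework: {self.framework}",
            f"- Run tests with: `{self.test_command}`",
            f"- Run linting with: `{self.lint_command}`",
        ]
        lines += [f"- Do not touch: `{p}`" for p in self.protected_paths]
        lines += [f"- Ask a human before: {a}" for a in self.approval_required_for]
        return "\n".join(lines) + "\n"


# Example values are made up for the demo.
context = HarnessContext(
    project_type="Python CLI app",
    framework="click",
    test_command="pytest",
    lint_command="ruff check .",
    protected_paths=["migrations/"],
    approval_required_for=["deleting files"],
)
print(context.to_markdown())
```

Once the facts live in one structure, the same content can be rendered for whichever agent you happen to use.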
Why this still exists in 2026
In theory, the LLM should be able to inspect a repo and generate all of this itself.
In practice, we are not fully there yet.
The models are smart enough to do real coding work, but they do not yet reliably produce correct project-specific ground truth from scratch on every fresh repo.
They can do it sometimes.
Not always.
So the human stays in the loop.
We write the same repo instructions again.
We copy the same rules across projects.
We maintain separate files for Claude Code, Cursor, Codex-style agents, Continue, Windsurf, and others.
Small work per repo.
Painful in aggregate.
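Much of that repetition is mechanical, so it can be scripted. The sketch below writes one shared rules text to several agent-specific locations. The `AGENT_FILES` mapping is an assumption for illustration: the first two paths are file names mentioned in this post, the Cursor file name is hypothetical, and each tool's documentation is the authority on where it actually looks.

```python
import tempfile
from pathlib import Path

# Illustrative targets only; verify each path against the tool's own docs.
AGENT_FILES = {
    "generic": "AGENTS.md",
    "claude": ".claude/CLAUDE.md",
    "cursor": ".cursor/rules/project-rules.md",  # hypothetical file name
}


def write_shared_rules(repo_root: Path, rules: str) -> list[Path]:
    """Write the same rules text to every agent-specific file."""
    written = []
    for relative in AGENT_FILES.values():
        target = repo_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(rules, encoding="utf-8")
        written.append(target)
    return written


# Demo in a throwaway directory so no real repo is modified.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_shared_rules(Path(tmp), "Run tests with `pytest`.\n")
    names = sorted(p.relative_to(tmp).as_posix() for p in paths)
    print(names)
```

With a single source of truth, a rule change is one edit instead of one edit per agent per repo.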
The future: self-generating harnesses
I think this is temporary.
Soon, the coding model should be able to:
- read the repo
- understand the task
- detect the project type
- generate the right harness
- connect the right tools
- create memory schemas
- write validation scripts
- refine the loop until the task is complete
At that point, the harness layer disappears as a separately authored artifact.
But until then, developers still need a bridge.
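The last step in that list, refining until the task is complete, is a generate-and-validate loop. Here is a minimal sketch of the control flow, with toy functions standing in for the model call and the validation script. The names `refine_until_valid`, `toy_generate`, and `toy_validate` are hypothetical, and a real loop would call an actual model and real checks.

```python
from typing import Callable


def refine_until_valid(
    generate: Callable[[list[str]], str],
    validate: Callable[[str], list[str]],
    max_rounds: int = 3,
) -> tuple[str, list[str]]:
    """Regenerate a harness draft until validation reports no problems."""
    problems: list[str] = []
    draft = ""
    for _ in range(max_rounds):
        draft = generate(problems)   # feed the last round's problems back in
        problems = validate(draft)
        if not problems:
            break
    return draft, problems


# Toy stand-in for a model call: it only adds the test command once told to.
def toy_generate(problems: list[str]) -> str:
    draft = "Project type: python\n"
    if "missing test command" in problems:
        draft += "Test command: pytest\n"
    return draft


# Toy stand-in for a validation script.
def toy_validate(draft: str) -> list[str]:
    return [] if "Test command:" in draft else ["missing test command"]


final_draft, remaining = refine_until_valid(toy_generate, toy_validate)
print(final_draft)
print(remaining)
```

The `max_rounds` cap matters: without it, a validator the generator can never satisfy would loop forever.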
I built harnessforge
I built harnessforge to test this idea.
It is a local, open-source harness generator for AI coding agents.
It is not another coding agent.
Your coding agent stays the brain.
harnessforge just lays down the ground truth the agent reads before work begins.
Run:
uvx harnessforge init
or install:
pip install harnessforge
It runs locally, makes no network calls by default, and within a few seconds inspects your repo and generates the startup files that AI coding agents commonly read.
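This post does not describe how harnessforge inspects a repo, so the following is not its implementation. It is a hedged sketch of one common local approach: looking for well-known marker files at the repo root. The `MARKERS` table and the `detect_project_types` function are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Illustrative marker files; not harnessforge's actual detection rules.
MARKERS = {
    "pyproject.toml": "python",
    "package.json": "node",
    "Cargo.toml": "rust",
    "go.mod": "go",
}


def detect_project_types(repo_root: Path) -> list[str]:
    """Return the project types whose marker file exists at the repo root."""
    found = {kind for name, kind in MARKERS.items() if (repo_root / name).is_file()}
    return sorted(found)


# Demo against a throwaway directory that looks like a small Python project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pyproject.toml").write_text("[project]\nname = 'demo'\n", encoding="utf-8")
    detected = detect_project_types(root)
    print(detected)
```

A check like this only reads the local filesystem, which is what makes a no-network default possible.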
What it generates
Depending on the project and blueprint, harnessforge can generate files such as:
- AGENTS.md
- SOUL.md
- TOOLS.md
- MEMORY.md
- SKILLS/
- .claude/CLAUDE.md
- .cursor/rules
- .continue/
- .windsurf/rules
- blueprint-specific validators
The goal is simple:
give the coding agent a stronger starting point.
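What the blueprint-specific validators check is not spelled out here, so treat the following as a generic sketch of the idea rather than harnessforge's behavior. It reports harness files that are missing or empty. The `REQUIRED_FILES` list reuses file names from the list above, and `find_missing_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

# File names taken from the list above; which ones are required is an assumption.
REQUIRED_FILES = ["AGENTS.md", "TOOLS.md", "MEMORY.md"]


def find_missing_files(repo_root: Path, required: list[str]) -> list[str]:
    """Return required files that are absent or contain only whitespace."""
    missing = []
    for name in required:
        path = repo_root / name
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            missing.append(name)
    return missing


# Demo: one good file, one empty file, one file that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "AGENTS.md").write_text("Run tests with pytest.\n", encoding="utf-8")
    (root / "TOOLS.md").write_text("", encoding="utf-8")
    missing = find_missing_files(root, REQUIRED_FILES)
    print(missing)
```

A check like this could run in CI so that a stale or deleted context file is caught before an agent session starts.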
Current blueprints
The current version includes these blueprints:
rag-agent
For retrieval systems, knowledge-base agents, citation enforcement, and grounded responses.
finance-agent
For finance or stock-related agents, including market-data handling and validation rules around trade execution safety.
support-agent
For customer support flows such as intent detection, knowledge-base lookup, ticket creation, escalation, and ticket lineage.
workflow-agent
For multi-step orchestration with tool logs, idempotency, and validation structure.
python-cli-app
A default blueprint for greenfield Python CLI projects.
Why this matters
The important idea is not the specific files.
The important idea is that coding agents need a reliable project-specific operating context.
Today, we manually maintain that context.
Tomorrow, the model may generate it automatically.
harnessforge is meant to sit in the middle.
A bridge, not a moat.
Use it now.
Throw it away when the models catch up.
Example workflow
uvx harnessforge init
Then open Claude Code, Cursor, Codex, Gemini CLI, Aider, or another coding agent inside the repo.
The agent now has project-specific context files to read before it starts work.
Instead of starting from a blank repo, the agent starts with:
- project rules
- tool definitions
- memory structure
- validation expectations
- blueprint-specific failure modes
- agent-specific startup files
The coding agent still writes the code.
The harness just gives it the right context.
The bet
My bet is:
2026 Q1: developers still build the agent harness.
2026 Q3 / 2027: the LLM builds its own harness.
Until that happens, a local deterministic harness generator can make AI coding workflows more reliable.
GitHub:
https://github.com/jcaiagent7143-ui/harnessforge
PyPI:
https://pypi.org/project/harnessforge/
I would love feedback from developers using Claude Code, Cursor, Codex, Gemini CLI, Aider, Continue, Windsurf, or other coding agents in real repos.
How are you managing your agent harness today?
Are you manually maintaining AGENTS.md, CLAUDE.md, .cursor/rules, MCP configs, memory files, and validation rules?
Or do you think the next generation of coding models will generate this layer automatically?