Originally published in longer form on Substack. This DEV version is adapted for software engineers and platform practitioners who want the practical takeaway quickly.
Most AI harness work focuses on execution.
That makes sense. Teams need better context management, tool access, workflow boundaries, verification, memory, and sub-agent coordination. Without those pieces, coding agents become unreliable fast.
But there is a different failure mode that those harness improvements do not solve:
an agent can operate inside a well-designed execution harness and still produce the wrong architecture.
That is the missing layer.
The Real Problem Is Not Just Code Quality
Ask an agent to design a small SaaS product and it will often produce something that is technically coherent and operationally excessive at the same time.
You get things like:
- microservices where a monolith would do
- Kubernetes where managed PaaS is the obvious fit
- heavyweight observability and rollout machinery for a team with no real platform capacity
- provider choices that quietly add lock-in or operational burden
- reliability mechanisms sized for a much larger organization
None of that is necessarily irrational.
It is just architecture optimized for an imaginary team.
That is what happens when the harness governs what the agent can see and do, but not what kinds of systems it is allowed to design.
What the Harness Usually Misses
Most organizations already have architectural constraints, whether they write them down well or not:
- cost ceilings
- preferred cloud/SaaS providers
- approved deployment models
- auth and identity boundaries
- operational limits
- compliance expectations
- explicit exclusions
The problem is that these often live in:
- docs
- ADRs
- wiki pages
- tribal memory
- architecture review meetings
That is not enough for agent-driven workflows.
If those constraints are not machine-readable and enforceable, the agent is still reasoning inside an underconstrained design space.
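"Machine-readable" can be as plain as a frozen data structure the harness checks against. A minimal sketch in Python; every field name and value here is a hypothetical example, not a real schema:

```python
# Organizational constraints as data instead of wiki prose.
# All names and values are illustrative, not a real schema.
from dataclasses import dataclass


@dataclass(frozen=True)
class OrgConstraints:
    monthly_cost_ceiling_usd: int
    preferred_providers: tuple[str, ...]
    approved_deployment_models: tuple[str, ...]
    auth_boundary: str                  # e.g. "oidc"
    ops_capacity: str                   # e.g. "none", "shared", "dedicated"
    compliance: tuple[str, ...] = ()
    excluded: tuple[str, ...] = ()      # explicit exclusions


constraints = OrgConstraints(
    monthly_cost_ceiling_usd=200,
    preferred_providers=("some-paas", "some-managed-db"),
    approved_deployment_models=("paas-monolith",),
    auth_boundary="oidc",
    ops_capacity="none",
    excluded=("kubernetes", "self-hosted-observability"),
)

# Because constraints are data, enforcement is a plain membership check,
# not a conversation:
assert "kubernetes" in constraints.excluded
```

Once constraints exist in this form, the agent is no longer reasoning in an underconstrained space; every design proposal can be validated mechanically.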
What I Mean by "Architecture Inside the Harness"
The core idea is simple:
The harness should not only manage execution. It should also constrain architecture.
In practice, that means three pieces:
1. A pattern registry
Architectural knowledge has to live somewhere reusable.
A pattern in the registry can encode:
- what constraints it supports
- what non-functional requirement (NFR) thresholds it can satisfy
- what it provides and requires
- what config decisions it exposes
- what cost and adoption trade-offs it carries
That turns architecture knowledge from conversation into versioned policy.
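A registry entry could look roughly like the sketch below. The field names, pattern name, and values are invented for illustration; the point is that each property from the list above becomes a typed, versioned field rather than a paragraph in a doc:

```python
# Sketch of one pattern-registry entry. Field names and values are
# hypothetical, not taken from any real registry schema.
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    supports_constraints: frozenset[str]  # e.g. {"low-cost", "no-ops-team"}
    nfr_thresholds: dict                  # thresholds this pattern can satisfy
    provides: frozenset[str]              # capabilities it offers downstream
    requires: frozenset[str]              # capabilities it depends on
    config_decisions: tuple[str, ...]     # knobs the adopter must set
    tradeoffs: str                        # cost/adoption notes for reviewers
    version: str = "1.0.0"                # registry entries are versioned policy


REGISTRY = {
    "paas-monolith": Pattern(
        name="paas-monolith",
        supports_constraints=frozenset({"low-cost", "no-ops-team"}),
        nfr_thresholds={"p99_latency_ms": 500, "availability": 0.995},
        provides=frozenset({"http-api", "background-jobs"}),
        requires=frozenset({"managed-postgres"}),
        config_decisions=("instance_size", "region"),
        tradeoffs="cheap to run; limited horizontal scale",
    ),
}
```

The `version` field is what makes this policy rather than documentation: a change to a pattern's thresholds or trade-offs is a reviewable version bump, not a silent edit.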
2. A deterministic architecture compiler
The compiler takes a canonical spec and selects patterns based on explicit rules.
The key property is determinism.
Given the same inputs, it should produce the same outputs. That gives teams something they can actually review and approve. It also makes architectural change visible as a diff instead of as implementation drift discovered too late.
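To make the determinism property concrete, here is a toy selector: explicit rules, no model calls, stable output ordering. The spec keys, rules, and pattern names are invented for illustration:

```python
# A toy deterministic architecture compiler: explicit rules only,
# no model calls. Spec keys and pattern names are illustrative.
def compile_architecture(spec: dict) -> list[str]:
    """Map a canonical spec to a sorted list of selected pattern names."""
    selected = set()
    if spec.get("traffic") == "low" and spec.get("ops_team") is False:
        selected.add("paas-monolith")
    if spec.get("auth") == "oidc":
        selected.add("oidc-auth")
    if "file-uploads" in spec.get("features", []):
        selected.add("object-storage")
    # Explicit exclusions always win, so rejections are visible by omission.
    selected -= set(spec.get("excluded", []))
    return sorted(selected)  # stable ordering: same inputs, same diffable output


spec = {
    "traffic": "low",
    "ops_team": False,
    "auth": "oidc",
    "features": ["file-uploads"],
    "excluded": ["kubernetes"],
}

# The key property: two runs on the same spec are identical.
assert compile_architecture(spec) == compile_architecture(spec)
```

The `sorted()` at the end is doing real work here: a stable, canonical output is what makes the result diffable across recompiles.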
3. Workflow rules around the compiler
The compiler alone is not enough.
You also need workflow discipline that tells the agent:
- when to compile
- when planning has surfaced a real architecture change
- when re-approval is required
- when implementation is allowed to proceed
That is what turns architecture from documentation into a control point.
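The workflow discipline above can be sketched as plain gating logic. State names and rules here are invented; the shape is what matters, namely that the agent's next allowed action is decided by explicit state, not by its own judgment:

```python
# A sketch of workflow rules around the compiler. State keys and
# step names are hypothetical, not from any real harness.
def next_step(state: dict) -> str:
    """Decide what the agent is allowed to do next."""
    if not state.get("compiled"):
        return "compile"                  # no architecture yet: compile first
    if state.get("planning_changed_architecture"):
        return "recompile-and-reapprove"  # planning surfaced a real change
    if not state.get("approved"):
        return "await-approval"           # humans sign off on the diff
    return "implement"                    # inside the contract: proceed


assert next_step({}) == "compile"
assert next_step({"compiled": True, "approved": True}) == "implement"
```

A gate like this is the control point: implementation is a state the workflow reaches only after compilation and approval, never a default.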
Why Determinism Matters
At the architecture layer, the problem is not mainly creativity. It is governance.
That is why deterministic behavior matters more than people often expect.
It gives you:
- reproducibility
- auditability
- explicit assumptions
- explicit exclusions
- a recompile-and-diff path when constraints change
For senior engineers and platform teams, that is much more useful than a model producing a plausible design summary in slightly different words each time.
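The recompile-and-diff path in the list above needs nothing exotic. Because compiler output is canonical text, a stock diff tool shows the architectural delta; the pattern lists below are illustrative:

```python
# When constraints change, recompiling and diffing makes the
# architectural delta explicit. The pattern lists are illustrative.
import difflib

approved = ["managed-postgres", "oidc-auth", "paas-monolith"]
recompiled = ["managed-postgres", "oidc-auth", "paas-monolith", "worker-queue"]

diff = list(
    difflib.unified_diff(approved, recompiled, "approved", "recompiled", lineterm="")
)
for line in diff:
    print(line)  # the added "worker-queue" shows up as a "+" line
```

A reviewer approves that diff the same way they would approve a code change, which is exactly the review surface a plausible-but-varying model summary cannot provide.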
A Concrete Example
I used this approach in a Bird ID application workflow.
The product itself was simple: users upload bird photos, an AI model identifies likely species, and results are stored in per-user history.
The important part was not the feature list. It was the operating context:
- hosted PaaS backend
- managed Postgres
- OIDC for auth
- object storage for uploads
- low traffic
- strong cost sensitivity
- no real ops team
Once those became compiler inputs, the architecture was constrained mechanically rather than conversationally.
That made it much easier to reject patterns that would have been technically valid but wrong for the project:
- heavyweight deployment patterns
- overly complex topology choices
- infrastructure layers that added operational cost without real payoff
The downstream effect mattered too. The approved architecture could then be handed to planning and implementation as an explicit contract instead of a loose design memo.
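To make "constrained mechanically" concrete: the operating context above might be expressed as compiler inputs roughly like this. The keys, values, and rejection rule are hypothetical sketches, not taken from the actual arch-compiler:

```python
# The Bird ID operating context as explicit compiler inputs.
# Keys and pattern names are hypothetical, not from the real arch-compiler.
bird_id_spec = {
    "backend": "hosted-paas",
    "database": "managed-postgres",
    "auth": "oidc",
    "storage": "object-storage",
    "traffic": "low",
    "cost_sensitivity": "high",
    "ops_team": False,
}


def reject_overweight(candidates: list[str], spec: dict) -> list[str]:
    """Mechanically drop patterns sized for teams this project does not have."""
    heavyweight = {"kubernetes", "service-mesh", "multi-region-failover"}
    if spec["ops_team"] is False or spec["cost_sensitivity"] == "high":
        return [c for c in candidates if c not in heavyweight]
    return candidates


candidates = ["hosted-paas", "kubernetes", "managed-postgres", "service-mesh"]
surviving = reject_overweight(candidates, bird_id_spec)
# Only patterns that fit the operating context survive the filter.
```

Patterns like Kubernetes are not rejected because anyone argued against them; they are rejected because the spec says there is no ops team to run them.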
The Real Deliverable Is Not Better Documentation
The main output of this style of harnessing is not prettier architecture docs.
The real output is an enforceable boundary between architecture and implementation.
That boundary matters because implementation agents are good at creating drift quickly.
If the architecture says:
- OAuth2/OIDC with PKCE
- hosted PaaS
- managed Postgres
- monolithic service topology
then implementation should not quietly reintroduce:
- server-side session state
- new provider choices
- new persistence layers
- unnecessary distributed complexity
Without a hard boundary, those changes show up as "implementation details." In practice, they are architecture changes.
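A hard boundary can start as a simple gate that checks an implementation plan against the approved contract and reports violations. Everything named here (the contract keys, the forbidden list, the plan shape) is a hypothetical sketch:

```python
# A minimal drift gate: implementation plans are checked against the
# approved contract before work proceeds. All names are illustrative.
APPROVED = {
    "auth": "oidc-pkce",
    "hosting": "hosted-paas",
    "database": "managed-postgres",
    "topology": "monolith",
}

FORBIDDEN_REINTRODUCTIONS = {"server-side-sessions", "new-persistence-layer"}


def check_drift(plan: dict) -> list[str]:
    """Return a list of violations; an empty list means the plan is in contract."""
    violations = []
    for key, approved_value in APPROVED.items():
        if plan.get(key, approved_value) != approved_value:
            violations.append(f"{key}: {plan[key]} != approved {approved_value}")
    for item in plan.get("introduces", []):
        if item in FORBIDDEN_REINTRODUCTIONS:
            violations.append(f"reintroduces {item}")
    return violations


drifting_plan = {
    "auth": "oidc-pkce",
    "topology": "microservices",
    "introduces": ["server-side-sessions"],
}
violations = check_drift(drifting_plan)  # both drifts are caught, not debated
```

The useful property is that "implementation detail" stops being a judgment call: anything the gate flags is, by definition, an architecture change requiring re-approval.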
What Platform Teams Should Take From This
If you are building internal agent workflows, the practical lesson is:
do not stop at context engineering.
Context engineering improves what the agent can see. Tool engineering improves what the agent can do. But neither is enough to keep the system architecture aligned with actual team constraints.
Platform teams need something stronger:
- explicit architecture inputs
- deterministic architecture selection
- approval and re-approval boundaries
- implementation workflows that are forced to stay inside the contract
That is what architecture inside the harness gives you.
Closing
The value of a harness is not only that it makes agents more capable.
The value is that it bounds the solution space so capability is applied in the right direction.
If the architecture layer stays implicit, fast agents will simply accelerate architectural drift.
If the architecture layer becomes explicit, reviewable, and enforceable, then agent speed becomes much easier to trust.
That is the argument: architecture is the missing layer in AI harness engineering.
Links
- Longer Substack version: https://inetgas.substack.com/p/ai-harness-engineering-at-the-architecture
- Architecture Compiler: https://github.com/inetgas/arch-compiler
- Bird ID case study: https://github.com/inetgas/arch-compiler-ai-harness-in-action