The Agent Is the Harness, Not the Model — and Why That Reorganizes Software Engineering

#agents #ai #career #softwareengineering

TL;DR — Every AI system decomposes into two things that matter: the model and the harness (the code wrapping it). Claude Code, GitHub Copilot, ChatGPT — those are harnesses, not models. Right now only frontier labs build both halves. That won't last. As harness engineering becomes its own discipline — domain-specialized, model-agnostic — it absorbs most of what we currently call software engineering. The app store becomes the agent store, and our job shifts from writing code for humans to writing harnesses that automate human workflows.

I keep coming back to one formula whenever someone asks where AI engineering is actually headed:

Agent = Model × Harness

It sounds almost too simple. But it draws a line that clears up a surprising amount of confusion — about what an agent is, about who builds them, and about what our jobs become.

The distinction: model vs harness

Claude Code is an agent. GitHub Copilot is an agent. Their CLIs are agents. ChatGPT is an agent.

None of those is a model. They're harnesses: the software infrastructure built on top of an LLM that turns raw next-token prediction into something that plans, calls tools, holds context, retries, and ships a result.

The clean way to hold it in your head:

If GPT-5.5 is the model, then ChatGPT is the harness wrapping it.

The model is one ingredient. The harness is the dish.

This isn't pedantry. Separating the two gives you the only two levers that matter at the highest level of any AI system:

The model — the raw reasoning. Swappable. Improving on a curve you don't control.
The harness — goals, loops, tools, memory, evals, the product surface. The part you actually engineer.

(There's a third ingredient — data — but it's implicit in both. It trains the model and it flows through the harness.)

Almost every interesting engineering decision in an AI product lives in the harness, not the model. Which prompts. Which tools, with which guardrails. How the loop terminates. What gets remembered. How failures get caught. You don't train GPT — you wrap it. The wrapping is the work.
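To make those decisions concrete, here is a minimal sketch of a harness loop. Everything in it is an assumption for illustration: the `Harness` class, the `CALL`/`FINAL` text protocol, and the scripted stand-in for a model are all hypothetical, not any vendor's API. The point is only that termination, memory, tools, and guardrails are ordinary code you write around the model.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a model call: it reads the transcript so far and
# replies with either "CALL <tool> <arg>" or "FINAL <answer>".
Model = Callable[[list[str]], str]


@dataclass
class Harness:
    model: Model
    tools: dict[str, Callable[[str], str]]            # which tools exist
    max_steps: int = 5                                # how the loop terminates
    memory: list[str] = field(default_factory=list)   # what gets remembered

    def run(self, goal: str) -> str:
        self.memory.append(f"GOAL {goal}")
        for _ in range(self.max_steps):
            reply = self.model(self.memory)
            if reply.startswith("FINAL "):
                return reply[len("FINAL "):]
            parts = reply.split(" ", 2)
            if len(parts) != 3 or parts[0] != "CALL" or parts[1] not in self.tools:
                # Guardrail: a malformed or unknown tool call is recorded, not executed.
                self.memory.append(f"ERROR rejected: {reply}")
                continue
            _, name, arg = parts
            self.memory.append(f"RESULT {name} {self.tools[name](arg)}")
        return "gave up: step budget exhausted"


def scripted_model(transcript: list[str]) -> str:
    """Toy model: asks for one tool call, then answers with its result."""
    for line in reversed(transcript):
        if line.startswith("RESULT word_count "):
            return "FINAL " + line.split(" ", 2)[2]
    return "CALL word_count the wrapping is the work"


def word_count(text: str) -> str:
    return str(len(text.split()))


agent = Harness(model=scripted_model, tools={"word_count": word_count})
answer = agent.run("count the words")
```

Here `answer` ends up as `"5"`. Swap `scripted_model` for a model that only ever emits garbage and the same harness returns the step-budget message instead of looping forever, which is the kind of failure handling the model itself never gives you.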

Why this framing matters now

Here's the part that turns a definition into a thesis.

Today, only the frontier labs build both halves. OpenAI builds GPT and ChatGPT. Anthropic builds Claude and Claude Code. The model and the harness ship from the same building, by the same company, as one bundle.

That is a temporary arrangement. It's what the early phase of every platform looks like — the people who make the engine also make the only car.

It won't stay that way, because the harness is separable from the model, and we're already watching the split begin.

GitHub Copilot is the clearest preview. It's a harness that isn't tied to a single model: you can point it at different frontier models underneath. The harness is the product; the model is a swappable backend. That's the shape of the future, generalized: harnesses as first-class products, model-agnostic, increasingly specialized to the domain the agent operates in.
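In code, "the model is a swappable backend" can be as plain as programming against an interface. The sketch below is illustrative only: `ModelBackend`, `ReviewHarness`, and the two toy backends are hypothetical names, and the backends just transform strings rather than calling any real model.

```python
from typing import Protocol


class ModelBackend(Protocol):
    """Anything with a complete() method can sit under the harness."""

    def complete(self, prompt: str) -> str: ...


class UppercaseBackend:
    """Hypothetical stand-in for one vendor's model."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class ReverseBackend:
    """Hypothetical stand-in for a different vendor's model."""

    def complete(self, prompt: str) -> str:
        return prompt[::-1]


class ReviewHarness:
    """The product: same prompt template and post-processing for any backend."""

    def __init__(self, backend: ModelBackend) -> None:
        self.backend = backend

    def review(self, diff: str) -> str:
        prompt = f"review: {diff}"
        return self.backend.complete(prompt).strip()


# Same harness, two different backends underneath.
first = ReviewHarness(UppercaseBackend()).review("x = 1")
second = ReviewHarness(ReverseBackend()).review("x = 1")
```

Nothing in `ReviewHarness` knows which backend it got. That is the whole trick: the prompt template, the post-processing, and the product surface stay put while the model underneath changes.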

A coding harness wants tool access, repo context, and test loops. A legal harness wants citation discipline and retrieval grounding. A support harness wants state verification and escalation paths. A finance harness wants determinism and audit trails. Same underlying models — radically different harnesses, because the domain is where the engineering lives.
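One way to picture domain specialization is a per-domain set of checks applied to whatever the model drafts. This is a rough sketch under assumptions of my own: the two rules, the `DOMAIN_CHECKS` registry, and the `vet` helper are hypothetical, and real citation or escalation checks would be far stricter.

```python
import re
from typing import Callable, Optional

# Each check returns an error message, or None if the draft passes.
Check = Callable[[str], Optional[str]]


def requires_citation(draft: str) -> Optional[str]:
    """Legal-style rule (assumed): the answer must carry a [n] citation marker."""
    return None if re.search(r"\[\d+\]", draft) else "missing citation"


def requires_escalation_path(draft: str) -> Optional[str]:
    """Support-style rule (assumed): the reply must mention a way to reach a human."""
    return None if "escalate" in draft.lower() else "no escalation path"


# Hypothetical registry: same model underneath, different checks per domain.
DOMAIN_CHECKS: dict[str, list[Check]] = {
    "legal": [requires_citation],
    "support": [requires_escalation_path],
}


def vet(domain: str, draft: str) -> list[str]:
    """Run the domain's checks and collect every failure."""
    results = (check(draft) for check in DOMAIN_CHECKS[domain])
    return [msg for msg in results if msg is not None]


draft = "The clause is enforceable [1]."
legal_problems = vet("legal", draft)      # passes the legal rule
support_problems = vet("support", draft)  # fails the support rule
```

The same draft is acceptable in one harness and rejected in another. The model didn't change; the domain rules did.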

The claim: harness engineering absorbs software engineering

So here's the bold version, and I'll own that it's bold:

As harness engineering matures into its own discipline, it consumes most of what we currently call software engineering.

Call it agentic engineering, call it harness engineering — the name matters less than the shift. The center of gravity of the work moves from writing the deterministic logic ourselves to engineering the system that wraps a model so it can do the non-deterministic parts well.

I'm not the only one pointing at this. Dario Amodei has said versions of this publicly — that an enormous fraction of code, and of knowledge work generally, is heading toward being written and operated by these systems rather than typed by hand. You don't have to accept the most aggressive timeline to see the direction.

And we're already seeing the early traces:

Companies are bolting chatbots and agents onto their existing products — first as a side feature, a widget in the corner.
Then those capabilities stop being bolt-ons and bleed into the core offering. The agent stops being the thing beside the product and becomes a primary way you use the product.

Follow that to its conclusion and you get an agentic app ecosystem. Think app store — but for agents. An agent store.

It happens on a spectrum, not a cliff

I want to be precise here, because the lazy version of this take is "AI replaces all code," and that's wrong.

The realistic version is a spectrum:

Foundational business processes stay code- and determinism-heavy. Payments, ledgers, auth, anything where "approximately right" is a defect — that stays deterministic code, and it should. You do not want a probabilistic model freelancing your double-entry accounting.
The human-driven parts get automated by agents. All the judgment, glue, triage, and workflow that never got automated because only a person could do it — that's exactly the territory agents move into. Not by replacing the deterministic core, but by filling the gaps around it that used to require a human in the loop.
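Here is a small sketch of that split, assuming a toy double-entry ledger. The `Ledger` class is the deterministic core; `agent_propose` is a hypothetical stand-in for an agent turning a free-text note into a proposed entry. The account names and the note format are made up for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    account: str
    debit: Decimal = Decimal("0")
    credit: Decimal = Decimal("0")


class Ledger:
    """Deterministic core: refuses any entry whose debits and credits differ."""

    def __init__(self) -> None:
        self.entries: list[list[Line]] = []

    def post(self, lines: list[Line]) -> None:
        debits = sum((line.debit for line in lines), Decimal("0"))
        credits = sum((line.credit for line in lines), Decimal("0"))
        if not lines or debits != credits:
            raise ValueError(f"unbalanced entry: {debits} != {credits}")
        self.entries.append(lines)


def agent_propose(note: str) -> list[Line]:
    """Hypothetical agent step: reads the amount from the end of the note."""
    amount = Decimal(note.split()[-1])
    return [Line("office_supplies", debit=amount), Line("cash", credit=amount)]


ledger = Ledger()
ledger.post(agent_propose("bought printer paper for 42.50"))

# A bad proposal never reaches the books, whoever (or whatever) wrote it.
try:
    ledger.post([Line("cash", debit=Decimal("10")),
                 Line("revenue", credit=Decimal("9"))])
    rejected = False
except ValueError:
    rejected = True
```

The agent does the fuzzy part (reading a human's note), and the ledger keeps the invariant no matter what the agent proposes. The balanced entry is posted and the unbalanced one is rejected.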

So the work doesn't vanish. It changes shape. Our job stops being only "write code for other humans to use" and increasingly becomes "write systems that automate human workflows via agents." The deterministic spine remains; the soft tissue around it gets agentic.

What this means for our roles

If you zoom out, the discipline splits cleanly along the same line as the formula:

Developers move to the harness side. ML folks own the model side.

Model side — the people training, fine-tuning, evaluating, and improving the raw reasoning engine. This stays specialized and stays with the people who do ML.
Harness side — the people designing goals, wiring tools, closing feedback loops, building the eval and observability layers, and shaping the domain-specific product the agent lives inside. This is where most developers end up.

That's not a downgrade for software engineers. It's a relocation. The harness is where correctness, safety, latency, cost, and user trust are actually decided. The model gives you capability; the harness decides whether that capability becomes a product or a liability.

The honest caveat

I'll flag the part it's tempting to oversell. None of this means "models do everything and engineers go home." The opposite, really: as models get more capable, the harness becomes more important, not less, because a more capable model with a sloppy harness is a more capable way to fail. The leverage of good harness engineering goes up as the underlying model improves.

That's the whole bet. The model is improving on its own, on a curve you don't control. Whether your agent — your product, your company, your role — improves is a question about your harness.

Agent = Model × Harness. The model half is being handed to all of us for free, and it's getting better every quarter. The harness half is the part we get to engineer. That's where the next decade of this work lives — and that's where I'd be placing my bets.

This is the long-form version of a thought I first posted on LinkedIn. If you want the short, punchy take and the discussion around it, see the original LinkedIn post. For the companion piece on what actually goes inside a harness (goals, loops, tools, lens, and evals) and why your eval layer is part of the agent rather than a tool beside it, see Agent = Model × Harness: Your Eval Layer Is Part of the Agent.