Who Owns Your Harness? The Layer Above the Model

#aiarchitecture

Lately I find myself less interested in which model wins and far more interested in who owns the layer above the model. The benchmark wars — GLM versus Claude versus GPT versus Qwen versus DeepSeek — are already yesterday's conversation. Models improve fast and get cheaper faster. Open-source is closing the gap ahead of every schedule people drew a year ago. So "which model should we use?" is not the question. It never was.

The question that actually determines whether your company survives a bad quarter in AI policy is this: can we replace the model tomorrow? If the honest answer is no, you have already made the most expensive architectural decision of the decade without noticing.

You think you're buying AI. You're wiring an operating system.

Most companies believe they are buying AI the way they buy a database or a cloud region — a component, swappable, bounded. What they are actually doing is wiring their entire execution layer around a single vendor. The prompts. The memory. The agents. The routing logic. The evals. And then every integration on top: Slack, Jira, GitHub, the internal tools, the accumulated company knowledge that no one wrote down anywhere else.

Bit by bit, the model stops being a model. It becomes the operating system of the business. Every workflow assumes its quirks. Every prompt is tuned to its behavior. Every engineer's mental model of "how our AI works" is really a mental model of one vendor's API. That is where lock-in begins — not in a contract clause, but in a thousand small couplings nobody tracked.

Lock-in stopped being a commercial inconvenience

For most of software history, lock-in was a negotiating problem. You paid a switching cost, you grumbled, you migrated over a quarter. Annoying, survivable. That era is over for AI. Lock-in became a business-continuity risk, and recent events proved it in the harshest way possible.

A single US export-control order took Anthropic's top models — Mythos 5 and Fable 5 — offline for two weeks. Not just for foreign users. To stay compliant, they were pulled for everyone worldwide, the United States included. Every company that had wired its product around those models lost its core capability overnight, through no decision of its own.

Days later, GPT-5.6 shipped only as a gated, US-only preview, after Washington reportedly asked OpenAI to hold the launch. Two data points, one lesson: the model under your product can go dark on a government's timeline, not yours.

Closed is closed the moment someone decides it's closed to you — and that someone may not be your vendor.

That last clause is the whole point. You can have a perfect relationship with your model provider, pay every invoice on time, and still lose access because a regulator three time zones away signed an order. Your vendor's goodwill is irrelevant when the constraint sits above the vendor. This is the shift I described in the model wars ending and the clearance wars beginning: capability now ships when it clears, not when it's ready.

The default has to flip: open-source first

The conclusion writes itself. The default posture must invert. Open-source first — not because open models win every benchmark, because they do not yet, but because open weights are the only ones nobody can switch off. A file of weights sitting on your own hardware does not care what any government decides next week. It is inert, and it is yours.

Keep closed models for the frontier edge, the genuinely hard workloads where the quality gap earns its premium. But the core you cannot afford to lose should sit on weights you actually hold. This is the same architecture I lay out in routing by task, not vendor: self-host the core, pay for frontier only where it creates value you can't get elsewhere.

Above the model sits the harness

And above every model — open or closed — sits the harness. This is the layer that actually matters, and it should be vendor-agnostic by design. The harness owns:

Memory — what your system remembers across sessions, users, and workflows.
Context — how the right information reaches the model at the right moment.
Routing — which task goes to which model, and the fallback when one goes dark.
Permissions — who and what is allowed to do which action.
Tools — the integrations that let the model act on your actual systems.
Evals — how you know quality held after you swapped a model underneath.
Orchestration — the logic that ties it all into something that works.

The harness is the part that actually knows how your company works. The model is a replaceable engine bolted into it. If your harness is well-built and vendor-agnostic, swapping models is a config change and a re-run of your evals. If it is not — if the harness and the vendor are the same thing — then a model going dark takes your whole business with it.

The highest-ROI call of the decade

Building your own harness is slower and costlier today. There is no way around that. It is more engineering, more discipline, more upfront investment than plugging into one vendor's SDK and shipping. That is exactly why most teams will not do it until they are forced to.

But it may be one of the highest-ROI calls of the decade. Models come and go. Governments reshuffle who gets access to what, and on what timeline. Your company brain — the accumulated knowledge, workflows, and judgment encoded in your execution layer — should depend on neither. It should sit in a harness you own, feeding whichever model happens to be best, cheapest, and available this month.

Key takeaways

The right question is not "which model do you use?" but "can you replace it tomorrow?"
Companies think they're buying AI; they're wiring their whole execution layer around one vendor.
Lock-in became a business-continuity risk — a single export order pulled Anthropic's top models worldwide for two weeks.
Closed is closed the moment someone decides it's closed to you, and that someone may not be your vendor.
Default to open-source for the core; open weights are the only ones nobody can switch off.
Own the harness — memory, context, routing, permissions, tools, evals, orchestration — and models become swappable engines.

We are going to stop asking "which model do you use?" and start asking who owns your harness. I write about that transition and the architecture it demands across my essays on AI infrastructure and execution. The teams that build the harness now will be the ones still running when the next model goes dark.