Don't Ask AI to Stop Guessing. Design a System Where It Doesn't Need To.

#ai #webdev #programming #productivity

Designing out the guesswork.

Part of a short series (1 of 3) on engineering lessons from building governed, AI-assisted production systems. Each piece takes one real failure and the architectural idea it forced. The examples are ours; the principle is meant to be transferable.

We removed a capability that genuinely existed — from a tool whose entire job was to represent capabilities fairly. Not because the model hallucinated. Because our architecture made guessing the rational thing to do.

It reasoned correctly from the inputs it was given. The inputs were the problem — and so was our first instinct about how to fix them. This is a write-up of what failed, why the obvious fix didn't work, and the pattern we ended up with. None of it is about prompting.

The failure

We run a tool that compares options for a buyer and recommends the best fit. It's supposed to be neutral. One of the inputs to that tool is a set of capabilities — what each option can actually do.

An automated agent, doing maintenance work, updated that capability set. It removed one of our own capabilities, on the grounds that it had been discontinued. It cited its source: a handover note from an earlier work session that said, in passing, that the capability had been "parked."

The capability had not been parked. It was live, published, and in active use. But for the duration of that change, a tool that was supposed to be neutral became quietly unfair — it stopped representing something that genuinely existed.

The agent was not careless. If you read the note it was working from, you would have drawn the same conclusion. The note was wrong, and nothing in the workflow forced a check against the thing that was actually true.

Why the obvious fix doesn't work

The obvious fix is the one everyone reaches for: tell the model to be more careful. Add instructions. "Verify before asserting." "Do not rely on summaries." "Check the source of truth." Strengthen the prompt.

We tried versions of this. It moves the failure rate; it does not remove the failure. And once you sit with why, that becomes obvious too.

An LLM reasons over the context it's handed. If the nearest, most fluent description of reality is a summary, the model will use the summary — not because it's lazy, but because the summary is right there and reads authoritatively. "Be careful" is an instruction to expend extra effort against an unspecified target. It competes with every other instruction in the context, and it degrades exactly when you most need it: under long context, time pressure, or a confidently-worded but stale narrative.

The deeper issue is that we were treating a systems problem as a behaviour problem. We had two descriptions of reality — an authoritative one (the live system of record) and a convenient one (prose) — and we left it to the model's judgement to pick the right one every single time. That's not a judgement we should have delegated. The model didn't have a guessing problem. We had given it a reason to guess.

What we changed

We stopped trying to make the model choose correctly between sources, and instead made the authoritative source the only path to a fact.

Two ideas did the work.

First: rank the sources explicitly, and let the ranking — not the model — resolve conflicts.

authoritative = resolve(
    production,   # system of record — authoritative
    derived,      # computed from production
    override,     # a cited fact, only where production is silent
    narrative,    # summaries, handovers — NEVER authoritative
)

# A lower tier is consulted only when every higher tier is silent.
# Production truth can never silently fall through to a lower one.
assert not (production.has_answer and authoritative.source != PRODUCTION)

The point of the assert is not defensive coding. It's a statement of intent: when production has an answer, nothing below it gets a vote. Prose can inform where production is silent, but it can never override — or quietly stand in for — a fact the system of record already holds. And — this is the part that bit us — absence has to be proven from the authoritative source, not inferred from a summary that failed to mention it.

Second: derive facts, don't assert them.

The capability set is no longer something a human or an agent edits by hand. It is computed from the live system of record at build time.

capabilities = derive_from(system_of_record.published_items())
# absence of a capability is established by its absence in (system_of_record),
# never by its absence in a document.

Once capabilities are derived, the class of bug we hit becomes structurally impossible. You cannot remove a live capability by editing a note, because notes are no longer in the path. The system self-corrects whenever the system of record changes. Nobody has to remember to keep the description in sync, because there is no separate description to keep in sync.

What we deliberately did NOT automate

This is the part I'd most want a sceptical reader to notice, because it's where restraint mattered more than cleverness.

We automated the source of truth. We did not automate the decision.

When the derived capability set and someone's expectation disagree, the system does not silently "fix" anything. It surfaces the discrepancy and stops. A human decides whether the difference is a genuine change, a mistake, or an intended exception. We never gave the pipeline the authority to assert a new fact about the world — only to derive facts from a system that already holds them, and to flag when something looks off.

The temptation, once you've built a resolver, is to let it auto-resolve everything. We didn't, because "what exists" is a fact (derivable) but "what should exist" is a decision (not). Collapsing those two is how you build a system that is confidently, automatically wrong.

There's a clean line underneath all of this:

Facts come from systems. Decisions come from people. A pipeline may derive facts and flag conflicts; it may never decide what is true.

The behaviour / witness boundary

We still rely on behaviour — the agent is expected to reason from the authoritative source. But we no longer trust behaviour to be the only line of defence. There is one small machine check that runs in the pipeline: it recomputes the capability set from the system of record and fails the build if what we're about to ship has drifted from what the source actually exposes.

That check doesn't make the model behave. It makes the invariant observable. If behaviour silently regresses, the witness fails loudly before anything reaches a user. Behaviour governs; the check is evidence that the invariant still holds. We were careful not to confuse the two — a passing check is not proof of good judgement, only proof that one specific, decidable property is intact.

Lessons for other engineering teams

Most "AI reliability" problems are ambiguity-surface problems. Before you reach for a better prompt, ask whether the model even had an unambiguous source to reason from. If two descriptions of reality are in scope, you've already lost — you're just waiting to find out when.
Make the correct path the only easy path. "Be careful" is a tax on every future inference. Removing the alternative source is a one-time structural change. Prefer structure over diligence; diligence doesn't scale and doesn't survive context pressure.
Derive, don't describe. Any fact that exists authoritatively somewhere should be computed from that place, not transcribed. Every transcription is a copy that will eventually disagree with the original.
Rank your sources before you need to. The conflict between a system of record and a convenient summary is not exotic — it's the default condition of any system with documentation. Decide the precedence in advance, in code, so no human or model has to adjudicate it in the moment.
This generalises well beyond LLMs. Replace "the agent" with "a new engineer" or "a cron job." The same fix — single authoritative source, derived not described, conflicts surfaced not resolved — removes the same class of error. The LLM just made the latent design flaw fail faster and more visibly.

The reframe that mattered for us was small but total. We had been asking, "how do we get the model to stop guessing?" The better question was, "why is the model in a position where guessing is reasonable?" Once we removed the reason, the behaviour took care of itself.

The Engineering Principle

An agent guesses when its inputs leave room for guessing. Don't instruct it to stop — remove the ambiguity. Derive facts from the system of record, rank every source so prose can never outrank truth, and surface conflicts instead of resolving them automatically. Don't ask AI to stop guessing. Design a system where it doesn't need to.