<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Solomon Mithra</title>
    <description>The latest articles on DEV Community by Solomon Mithra (@solomon-mithra).</description>
    <link>https://dev.to/solomon-mithra</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3740925%2F87afef24-7222-426b-bbef-76be69a286a3.png</url>
      <title>DEV Community: Solomon Mithra</title>
      <link>https://dev.to/solomon-mithra</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/solomon-mithra"/>
    <language>en</language>
    <item>
      <title>Treat AI Output as Untrusted Input</title>
      <dc:creator>Solomon Mithra</dc:creator>
      <pubDate>Sat, 31 Jan 2026 09:48:45 +0000</pubDate>
      <link>https://dev.to/solomon-mithra/treat-ai-output-as-untrusted-input-1lhp</link>
      <guid>https://dev.to/solomon-mithra/treat-ai-output-as-untrusted-input-1lhp</guid>
      <description>&lt;p&gt;In every serious system we build, there’s a rule we don’t argue with:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User input is untrusted.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We validate it.&lt;br&gt;&lt;br&gt;
We sanitize it.&lt;br&gt;&lt;br&gt;
We enforce boundaries before it’s allowed to do anything meaningful.&lt;/p&gt;

&lt;p&gt;Yet when it comes to AI systems, many teams quietly abandon this rule.&lt;/p&gt;


&lt;h2&gt;The dangerous assumption&lt;/h2&gt;

&lt;p&gt;In production AI systems, model output often flows directly into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;customer-facing responses&lt;/li&gt;
&lt;li&gt;financial decisions&lt;/li&gt;
&lt;li&gt;workflow automation&lt;/li&gt;
&lt;li&gt;compliance-sensitive paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The implicit assumption is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“The model did what we asked, so the output must be okay.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is where things go wrong.&lt;/p&gt;

&lt;p&gt;When failures happen, the postmortem usually says:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“The prompt wasn’t strict enough”&lt;/li&gt;
&lt;li&gt;“We should retry more”&lt;/li&gt;
&lt;li&gt;“The model hallucinated”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But those aren’t root causes.&lt;/p&gt;


&lt;h2&gt;The real failure is the boundary&lt;/h2&gt;

&lt;p&gt;The model didn’t break the system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The system trusted the model.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;From a systems perspective, AI output is just another external data source:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;probabilistic&lt;/li&gt;
&lt;li&gt;non-deterministic&lt;/li&gt;
&lt;li&gt;not guaranteed to respect invariants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That puts it in the same category as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;user input&lt;/li&gt;
&lt;li&gt;webhook payloads&lt;/li&gt;
&lt;li&gt;third-party API responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We don’t &lt;em&gt;trust&lt;/em&gt; those.&lt;br&gt;&lt;br&gt;
We &lt;strong&gt;verify&lt;/strong&gt; them.&lt;/p&gt;


&lt;h2&gt;Why prompts and retries don’t solve this&lt;/h2&gt;

&lt;p&gt;Prompts are instructions, not enforcement.&lt;/p&gt;

&lt;p&gt;Retries increase the chance of a better answer, but they don’t guarantee:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structural correctness&lt;/li&gt;
&lt;li&gt;compliance&lt;/li&gt;
&lt;li&gt;safety&lt;/li&gt;
&lt;li&gt;consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using one LLM to judge another only adds another probabilistic component to the system.&lt;/p&gt;

&lt;p&gt;None of these create a &lt;strong&gt;hard stop&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;The correct production architecture&lt;/h2&gt;

&lt;p&gt;Once you see it, it’s hard to unsee.&lt;/p&gt;

&lt;p&gt;LLM → Verification Layer → System&lt;/p&gt;

&lt;p&gt;The verification layer runs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;after generation&lt;/li&gt;
&lt;li&gt;before delivery&lt;/li&gt;
&lt;li&gt;outside the model’s control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Its job is not to be smart.&lt;/p&gt;

&lt;p&gt;Its job is to be &lt;strong&gt;strict&lt;/strong&gt;.&lt;/p&gt;
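&lt;p&gt;As a minimal sketch of that layer (all names here are invented for illustration, and the checks are deliberately simplistic), the boundary can be as small as a single function that sits between the model call and everything downstream:&lt;/p&gt;

```typescript
// Hypothetical verification gate: the model's raw text goes in,
// an explicit decision comes out. Nothing downstream touches raw output.
type Decision =
  | { kind: "allow"; value: unknown }
  | { kind: "block"; reason: string };

function verifyModelOutput(raw: string): Decision {
  // 1. Contract: the output must parse as JSON at all.
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { kind: "block", reason: "not valid JSON" };
  }
  // 2. Policy: deterministic rules run on the text (here, a toy SSN check).
  if (/\b\d{3}-\d{2}-\d{4}\b/.test(raw)) {
    return { kind: "block", reason: "possible SSN in output" };
  }
  // 3. Explicit decision: only verified output is allowed through.
  return { kind: "allow", value: parsed };
}
```

&lt;p&gt;The specific checks are not the point; the point is that the gate runs after generation and outside the model’s control.&lt;/p&gt;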


&lt;h2&gt;What verification actually means&lt;/h2&gt;

&lt;p&gt;In practice, verification enforces three things:&lt;/p&gt;
&lt;h3&gt;1. Contracts&lt;/h3&gt;

&lt;p&gt;Does the output match the structure your system expects?&lt;/p&gt;

&lt;p&gt;If not, it doesn’t proceed.&lt;/p&gt;
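&lt;p&gt;For example, a hand-rolled type guard can be the contract (a schema library would work equally well; the &lt;code&gt;RefundDraft&lt;/code&gt; shape is invented for illustration):&lt;/p&gt;

```typescript
// Invented example shape: what the system expects the model to produce.
interface RefundDraft {
  orderId: string;
  amountCents: number;
}

// The type guard is the contract: either the output has exactly the
// structure the system expects, or it does not proceed.
function isRefundDraft(value: unknown): value is RefundDraft {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { orderId?: unknown; amountCents?: unknown };
  if (typeof v.orderId !== "string") return false;
  if (typeof v.amountCents !== "number") return false;
  return Number.isInteger(v.amountCents);
}
```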
&lt;h3&gt;2. Policies&lt;/h3&gt;

&lt;p&gt;Does the output violate any deterministic rules?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;compliance language&lt;/li&gt;
&lt;li&gt;PII exposure&lt;/li&gt;
&lt;li&gt;secret leakage&lt;/li&gt;
&lt;li&gt;unsafe markup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If yes, the system blocks or rewrites explicitly.&lt;/p&gt;
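&lt;p&gt;Deterministic here means plain predicates: the same input always yields the same verdict. A toy sketch (these regexes are illustrative only; real policy sets are broader and tested):&lt;/p&gt;

```typescript
// Illustrative deterministic policies: each is a named plain predicate.
const policies = [
  ["guarantee-language", (text: string) => /\bguaranteed?\b/i.test(text)],
  ["email-pii", (text: string) => /[\w.+-]+@[\w-]+\.[\w.]+/.test(text)],
  ["secret-leak", (text: string) => /\b(api[_-]?key|sk-[A-Za-z0-9]+)\b/i.test(text)],
] as const;

// Returns the names of every policy the text violates.
function violatedPolicies(text: string): string[] {
  return policies.filter(([, check]) => check(text)).map(([name]) => name);
}
```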
&lt;h3&gt;3. Explicit decisions&lt;/h3&gt;

&lt;p&gt;Every response results in a clear outcome:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;allow&lt;/li&gt;
&lt;li&gt;block&lt;/li&gt;
&lt;li&gt;rewrite&lt;/li&gt;
&lt;li&gt;audit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No silent failures.&lt;br&gt;&lt;br&gt;
No “probably fine.”&lt;/p&gt;
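&lt;p&gt;One way to make “probably fine” unrepresentable is a discriminated union: downstream code must switch on the outcome to get at the text at all. The names below are invented for this sketch:&lt;/p&gt;

```typescript
// Every verified response ends in exactly one of these outcomes.
type Verdict =
  | { outcome: "allow"; text: string }
  | { outcome: "rewrite"; text: string; original: string }
  | { outcome: "block"; reason: string };

// Invented audit helper: every verdict is recorded, including allows,
// so the system can later explain why a response went through.
function audit(verdict: Verdict): string {
  const at = new Date().toISOString();
  return JSON.stringify({ at, ...verdict });
}
```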


&lt;h2&gt;Why this changes everything&lt;/h2&gt;

&lt;p&gt;Once AI output is treated as untrusted input:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;simpler models become viable&lt;/li&gt;
&lt;li&gt;failures become predictable&lt;/li&gt;
&lt;li&gt;compliance becomes enforceable&lt;/li&gt;
&lt;li&gt;incidents are caught &lt;em&gt;before&lt;/em&gt; damage is done&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model becomes a suggestion engine, not a source of truth.&lt;/p&gt;

&lt;p&gt;That’s exactly where probabilistic systems belong.&lt;/p&gt;


&lt;h2&gt;This isn’t about safety; it’s about systems&lt;/h2&gt;

&lt;p&gt;This isn’t a moral argument.&lt;br&gt;&lt;br&gt;
It’s a production one.&lt;/p&gt;

&lt;p&gt;Every mature system enforces trust at boundaries.&lt;/p&gt;

&lt;p&gt;AI systems are no different.&lt;/p&gt;


&lt;h2&gt;Final principle&lt;/h2&gt;

&lt;p&gt;If your system cannot deterministically explain &lt;strong&gt;why&lt;/strong&gt; an AI response was allowed,&lt;br&gt;&lt;br&gt;
then it should not have been allowed.&lt;/p&gt;



&lt;p&gt;If you’re interested in enforcing this boundary in real systems,&lt;br&gt;&lt;br&gt;
&lt;strong&gt;&lt;a href="https://gateia.co" rel="noopener noreferrer"&gt;Gateia&lt;/a&gt;&lt;/strong&gt; is an open-source TypeScript SDK built specifically for post-generation verification:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;gateia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Built to be boring.&lt;br&gt;
Built to be strict.&lt;br&gt;
Built for production.&lt;/p&gt;




</description>
      <category>ai</category>
      <category>infrastructure</category>
      <category>security</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Probability Is a Liability in Production</title>
      <dc:creator>Solomon Mithra</dc:creator>
      <pubDate>Fri, 30 Jan 2026 03:39:21 +0000</pubDate>
      <link>https://dev.to/solomon-mithra/probability-is-a-liability-in-production-41md</link>
      <guid>https://dev.to/solomon-mithra/probability-is-a-liability-in-production-41md</guid>
      <description>&lt;p&gt;Large Language Models are impressive.&lt;br&gt;&lt;br&gt;
They’re also &lt;strong&gt;probabilistic&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Production systems are not.&lt;/p&gt;

&lt;p&gt;That mismatch is where most AI failures actually happen.&lt;/p&gt;


&lt;h2&gt;AI failures are usually trust failures&lt;/h2&gt;

&lt;p&gt;When AI systems fail in production, it’s rarely dramatic.&lt;/p&gt;

&lt;p&gt;It’s not “the model crashed.”&lt;br&gt;&lt;br&gt;
It’s quieter and more dangerous:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;malformed JSON reaches a parser
&lt;/li&gt;
&lt;li&gt;guarantee language slips into a response
&lt;/li&gt;
&lt;li&gt;PII leaks into customer-facing text
&lt;/li&gt;
&lt;li&gt;unsafe markup reaches a client
&lt;/li&gt;
&lt;li&gt;assumptions are violated silently
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are &lt;strong&gt;trust failures&lt;/strong&gt;, not intelligence failures.&lt;/p&gt;


&lt;h2&gt;We validate inputs. We don’t verify outputs.&lt;/h2&gt;

&lt;p&gt;Every serious system treats user input as untrusted.&lt;/p&gt;

&lt;p&gt;We validate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;types&lt;/li&gt;
&lt;li&gt;formats&lt;/li&gt;
&lt;li&gt;invariants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We fail closed when validation fails.&lt;/p&gt;
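&lt;p&gt;This is the familiar version of the rule, applied to user input: parse, check invariants, and throw on anything else. A minimal sketch (the bounds are invented for illustration):&lt;/p&gt;

```typescript
// Fail-closed input validation: bad input never produces a value,
// it produces an error the caller must handle.
function parseQuantity(input: string): number {
  const n = Number(input);
  // Type and format: must be an integer, not NaN, not "12abc".
  if (!Number.isInteger(n)) throw new Error("quantity must be an integer");
  // Invariant: must be within the range the business allows.
  if (!(n >= 1) || n > 1000) throw new Error("quantity out of range");
  return n;
}
```

&lt;p&gt;The argument of these posts is simply that model output deserves the same discipline.&lt;/p&gt;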

&lt;p&gt;But AI output often skips this step entirely.&lt;/p&gt;

&lt;p&gt;Instead, teams rely on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompts&lt;/li&gt;
&lt;li&gt;retries&lt;/li&gt;
&lt;li&gt;“the model usually behaves”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s not a safety model.&lt;br&gt;&lt;br&gt;
That’s hope.&lt;/p&gt;

&lt;p&gt;An LLM is just another untrusted computation.&lt;/p&gt;


&lt;h2&gt;Compliance is enforced at boundaries&lt;/h2&gt;

&lt;p&gt;This is the key insight.&lt;/p&gt;

&lt;p&gt;Databases aren’t “GDPR-aware.”&lt;br&gt;&lt;br&gt;
APIs aren’t “SOC2-aware.”&lt;br&gt;&lt;br&gt;
Users aren’t trusted.&lt;/p&gt;

&lt;p&gt;Compliance is enforced at &lt;strong&gt;boundaries&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;validation layers
&lt;/li&gt;
&lt;li&gt;policy checks
&lt;/li&gt;
&lt;li&gt;explicit allow/block decisions
&lt;/li&gt;
&lt;li&gt;audit logs
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI systems need the same treatment.&lt;/p&gt;

&lt;p&gt;Trying to make AI “behave” by adding more AI only increases uncertainty.&lt;/p&gt;


&lt;h2&gt;Deterministic verification beats AI judging AI&lt;/h2&gt;

&lt;p&gt;Many AI safety tools rely on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLMs evaluating LLMs&lt;/li&gt;
&lt;li&gt;probabilistic moderation&lt;/li&gt;
&lt;li&gt;confidence scores&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That fails quietly.&lt;/p&gt;

&lt;p&gt;A verifier should:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;never hallucinate
&lt;/li&gt;
&lt;li&gt;never guess
&lt;/li&gt;
&lt;li&gt;never be creative
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It should be boring and correct.&lt;/p&gt;


&lt;h2&gt;Gateia: verifying AI output before it ships&lt;/h2&gt;

&lt;p&gt;This is why I built &lt;strong&gt;&lt;a href="https://www.npmjs.com/package/gateia" rel="noopener noreferrer"&gt;Gateia&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Gateia does &lt;strong&gt;not&lt;/strong&gt; generate AI output.&lt;br&gt;&lt;br&gt;
It does &lt;strong&gt;not&lt;/strong&gt; orchestrate agents.&lt;br&gt;&lt;br&gt;
It does &lt;strong&gt;not&lt;/strong&gt; manage prompts or models.&lt;/p&gt;

&lt;p&gt;Gateia runs &lt;strong&gt;after generation&lt;/strong&gt; and answers one question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Is this output allowed to enter my system?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It enforces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;schema contracts
&lt;/li&gt;
&lt;li&gt;deterministic safety &amp;amp; compliance policies
&lt;/li&gt;
&lt;li&gt;explicit pass / warn / block decisions
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is auditable.&lt;br&gt;&lt;br&gt;
Failures are explicit.&lt;br&gt;&lt;br&gt;
Security fails closed.&lt;/p&gt;
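&lt;p&gt;In shape, the pattern looks like this. To be clear, this is a generic sketch of post-generation gating, &lt;em&gt;not&lt;/em&gt; Gateia’s actual API; every name here is invented, and the package docs are the source of truth for real signatures:&lt;/p&gt;

```typescript
// Generic sketch of a pass / warn / block gate (names invented).
type GateResult = { decision: "pass" | "warn" | "block"; reasons: string[] };

function gateOutput(raw: string): GateResult {
  const reasons: string[] = [];
  // Contract: the output must be a JSON object at all.
  try {
    const parsed = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) {
      reasons.push("contract: expected a JSON object");
    }
  } catch {
    reasons.push("contract: not valid JSON");
  }
  // Policy: a deterministic compliance rule (toy example).
  if (/\bguaranteed\b/i.test(raw)) reasons.push("policy: guarantee language");
  // Fail closed: any contract breach blocks; a policy hit downgrades to warn.
  if (reasons.some((r) => r.startsWith("contract"))) {
    return { decision: "block", reasons };
  }
  return { decision: reasons.length === 0 ? "pass" : "warn", reasons };
}
```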


&lt;h2&gt;A missing layer, not a framework&lt;/h2&gt;

&lt;p&gt;Gateia isn’t an orchestration framework.&lt;br&gt;&lt;br&gt;
It’s deliberately narrow.&lt;/p&gt;

&lt;p&gt;Every production AI system eventually needs a gate — either by design or after an incident.&lt;/p&gt;

&lt;p&gt;Verification is not exciting.&lt;br&gt;&lt;br&gt;
But it is inevitable.&lt;/p&gt;


&lt;h2&gt;Final thought&lt;/h2&gt;

&lt;p&gt;AI doesn’t fail in production because it’s not smart enough.&lt;/p&gt;

&lt;p&gt;It fails because &lt;strong&gt;we trust probability where we should enforce rules&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Production systems don’t need smarter models.&lt;br&gt;&lt;br&gt;
They need stronger boundaries.&lt;/p&gt;



&lt;p&gt;If you’re interested in deterministic verification for AI outputs,&lt;br&gt;&lt;br&gt;
Gateia is available as an open-source TypeScript SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;gateia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>ai</category>
      <category>typescript</category>
      <category>opensource</category>
      <category>security</category>
    </item>
  </channel>
</rss>
