A design protocol born from DeFi infrastructure, now applied to AI systems
The Problem
You've built an AI agent. It works — sometimes brilliantly.
But then it starts doing things you didn't ask for.
- It makes assumptions and acts on them
- It fills in missing data instead of saying "I don't know"
- It optimizes when you only asked it to observe
- It gives confident answers when it should refuse
This isn't a model problem. It's an architecture problem.
Your agent has no boundary contract.
Where This Idea Came From
I built a DeFi risk observer for Aave v3 — a system that watches on-chain positions and reports liquidation risk in real time.
The hardest design decision wasn't the data model or the state machine.
It was this question:
When should the system refuse to output anything at all?
In DeFi, a wrong answer isn't just useless — it can cause real financial loss. So I designed a system that explicitly separates:
- What is verified (direct from protocol)
- What is derived (computed from verified data)
- What is estimated (approximate, labeled as such)
- What should be refused (uncertain, inconsistent, or unsafe to show)
When I applied this same philosophy to an AI agent I was building for content automation — something completely unrelated to DeFi — the agent's overreach dropped significantly.
The principle transferred. The boundary contract worked.
The Core Principle
Refusal over Uncertainty. Boundary over Prediction. Observability over Automation.
Most AI systems are designed to always produce output. Silence feels like failure. Uncertainty gets smoothed over. Gaps get filled with plausible-sounding content.
The result: agents that confidently do the wrong thing.
A boundary contract inverts this default.
The Four Layers
Every AI output can be classified into one of four trust layers:
1. VERIFIED
Directly observable. The system retrieved this from a reliable source and can confirm it.
"The article was published on June 1, 2026."
2. CONSISTENT
Derived deterministically from verified data. The logic is transparent and repeatable.
"Based on the publication date, this is within the 30-day window."
3. ESTIMATED
An approximation. Useful, but explicitly labeled as such. Not to be treated as fact.
"The reading time is approximately 4 minutes."
4. REFUSED
The system cannot produce a trustworthy output. It says nothing rather than something wrong.
Output withheld. Reason: source data inconsistent.
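One way to make the four layers concrete is to attach the label to every output as data rather than as free text. The following is a minimal sketch, not the specification itself: the names `TrustLayer`, `LabeledOutput`, and `render` are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TrustLayer(Enum):
    VERIFIED = "VERIFIED"      # retrieved from a reliable source and confirmed
    CONSISTENT = "CONSISTENT"  # derived deterministically from verified data
    ESTIMATED = "ESTIMATED"    # approximation, must be labeled as such
    REFUSED = "REFUSED"        # no trustworthy output can be produced


@dataclass(frozen=True)
class LabeledOutput:
    """Hypothetical container pairing an output with its trust layer."""
    layer: TrustLayer
    value: Optional[str] = None   # withheld (None) when layer is REFUSED
    reason: Optional[str] = None  # required when layer is REFUSED

    def __post_init__(self) -> None:
        # A refusal carries a reason and no value; every other layer needs a value.
        if self.layer is TrustLayer.REFUSED:
            if self.value is not None or not self.reason:
                raise ValueError("REFUSED carries a reason and no value")
        elif self.value is None:
            raise ValueError(f"{self.layer.value} output needs a value")

    def render(self) -> str:
        if self.layer is TrustLayer.REFUSED:
            return f"[REFUSED] Output withheld. Reason: {self.reason}"
        return f"[{self.layer.value}] {self.value}"


print(LabeledOutput(TrustLayer.ESTIMATED,
                    "The reading time is approximately 4 minutes.").render())
print(LabeledOutput(TrustLayer.REFUSED, reason="source data inconsistent").render())
```

Because the constructor rejects a REFUSED output that still carries a value, a refusal cannot leak content by accident.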
The State Model
Pair the trust layer with an observable state:
| State | Meaning |
|---|---|
| STABLE | Operating within safe boundaries |
| WATCH | Approaching a boundary; caution advised |
| BOUNDARY_APPROACHING | Near-limit; intervention may be needed |
| DEGRADED | Output possible but quality is reduced |
| REFUSAL | Output withheld intentionally |
These aren't errors. REFUSAL is a feature, not a failure.
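The state model can be sketched as an enumeration that downstream code inspects before releasing anything. This is an illustrative sketch only: the helper names `may_emit` and `status_line` are assumptions, and the only rule encoded is the one stated in the table (REFUSAL withholds output).

```python
from enum import Enum


class State(Enum):
    STABLE = "Operating within safe boundaries"
    WATCH = "Approaching a boundary; caution advised"
    BOUNDARY_APPROACHING = "Near-limit; intervention may be needed"
    DEGRADED = "Output possible but quality is reduced"
    REFUSAL = "Output withheld intentionally"


def may_emit(state: State) -> bool:
    """Output is released in every state except REFUSAL."""
    return state is not State.REFUSAL


def status_line(state: State) -> str:
    """Render the state so a human or a downstream system can read it."""
    return f"state={state.name} ({state.value})"


for s in State:
    print(status_line(s), "| emit:", may_emit(s))
```

Treating REFUSAL as an ordinary enum member, rather than as an exception, keeps it on the normal code path, which matches the idea that refusal is a feature and not a failure.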
Applying This to AI Agents
Here's a practical example. Suppose your agent summarizes recent news articles.
Without a boundary contract:
- Missing article → agent invents plausible content
- Stale data → agent presents it as current
- Conflicting sources → agent picks one and ignores the other
With a boundary contract:
- Missing article → REFUSED with reason: "Source unavailable"
- Stale data → ESTIMATED with label: "Data may be outdated"
- Conflicting sources → DEGRADED with label: "Sources inconsistent"
The agent becomes honest about what it knows and doesn't know.
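The three cases above can be checked before the summarizer is ever called. The sketch below is one possible gate, under stated assumptions: the `Source` type, its `key_claim` field (a stand-in for whatever conflict check you actually use), the 24-hour freshness window, and the `gate` function are all hypothetical and not part of the original system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Source:
    text: Optional[str]   # None when the article could not be fetched
    fetched_at: datetime
    key_claim: str = ""   # hypothetical field used to spot conflicts


def gate(sources: list[Source], now: datetime,
         max_age: timedelta = timedelta(hours=24)) -> tuple[str, str]:
    """Return (label, note) describing how a summary may be presented."""
    # Missing article: refuse instead of inventing content.
    if not sources or any(s.text is None for s in sources):
        return ("REFUSED", "Source unavailable")
    # Conflicting sources: surface the conflict instead of picking a side.
    if len({s.key_claim for s in sources}) > 1:
        return ("DEGRADED", "Sources inconsistent")
    # Stale data: allow output, but label it.
    if any(now - s.fetched_at > max_age for s in sources):
        return ("ESTIMATED", "Data may be outdated")
    # Sources are present, fresh, and in agreement.
    return ("VERIFIED", "")


now = datetime(2026, 6, 2, 12, 0)
fresh = Source("Article body", now - timedelta(hours=1), "rate cut")
print(gate([fresh, Source(None, now)], now))  # ('REFUSED', 'Source unavailable')
```

The checks run from most to least severe, so a batch that is both stale and incomplete is refused rather than merely labeled.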
The System Prompt Pattern
Here's a minimal implementation in a system prompt:
You are an observer agent. Your role is to report state, not to act.
For every output, classify it as one of:
- VERIFIED: directly confirmed from source
- CONSISTENT: derived from verified data
- ESTIMATED: approximate — label it clearly
- REFUSED: do not output if data is missing, inconsistent, or unsafe
Rules:
- Never fill gaps with assumptions
- Never produce output when sources conflict
- Never optimize, advise, or act — only observe and report
- When in doubt, refuse
Refusal is correct behavior. Silence is safer than a confident wrong answer.
This single addition changed the behavior of my agents more than any other prompt engineering technique I've tried.
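A prompt alone cannot guarantee that the model labels everything, so it can help to check the reply in code. The sketch below assumes a reply format that the prompt above does not actually specify: one statement per line, each prefixed with `LABEL:`. The function name `enforce_labels` and the fallback message are likewise assumptions for illustration.

```python
LABELS = ("VERIFIED", "CONSISTENT", "ESTIMATED", "REFUSED")


def enforce_labels(reply: str) -> list[tuple[str, str]]:
    """Split a reply into (label, text) pairs; unlabeled lines become REFUSED."""
    checked = []
    for line in reply.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so text such as "12:30" survives.
        label, sep, text = line.partition(":")
        if sep and label.strip() in LABELS:
            checked.append((label.strip(), text.strip()))
        else:
            # Unclassified output is treated as untrusted and withheld.
            checked.append(("REFUSED", "unlabeled output withheld"))
    return checked


reply = (
    "VERIFIED: The article was published on June 1, 2026.\n"
    "It will probably go viral.\n"
    "ESTIMATED: The reading time is approximately 4 minutes."
)
for label, text in enforce_labels(reply):
    print(f"[{label}] {text}")
```

Here the unlabeled second line is withheld rather than passed through, which applies the "when in doubt, refuse" rule on the consuming side as well.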
Why This Works
The underlying principle is simple:
The protocol restricts transitions, not states.
An AI agent can end up in a bad state through external circumstances — bad data, ambiguous input, conflicting context. That's unavoidable.
What you can control is whether the agent acknowledges that state and handles it explicitly, or papers over it with confident-sounding output.
The boundary contract makes the agent's epistemic state legible — to you, and to downstream systems.
What I'm Releasing
I've formalized this into a document:
Boundary Contract for AI Systems v0.1
It includes:
- The full trust layer specification (VERIFIED / CONSISTENT / ESTIMATED / REFUSED)
- The state model with transition rules
- System prompt templates for common agent patterns
- The Non-Advisory Integrity Clause (what your agent must never do)
- Refusal protocol with trigger conditions
Available on Gumroad: https://arcthree.gumroad.com/l/etb-boundary-contract
Final Thought
The most reliable AI systems I've seen have one thing in common:
They know what they don't know.
Building that awareness in requires explicit design. It doesn't happen by default.
A boundary contract is how you make it intentional.
Built on the UEH (Universal Exchange Adapters) design philosophy.
Originally developed for DeFi risk observation infrastructure.
GitHub: github.com/ueh-labs/ueh-observer