Why Complex SaaS Needs a White Box Protocol for AI
Beyond UI Generation — where humans and AI communicate through meaning, not pixels
TL;DR — Give AI a White Box, Not a Black Box
Most AI agents interact with web apps through Black Box methods: consuming DOM dumps or screenshots, then guessing what to click.
But HTML was never designed for machines. From the AI's perspective, the DOM is noise where business logic is faintly buried.
This essay argues for a White Box approach:
Instead of making agents reverse-engineer the UI, expose a Semantic State Layer that reveals the application's structure, rules, state, and valid transitions directly.
This is not about replacing UI. It's about giving AI agents a proper interface — what I call an Intelligence Interface (II) — alongside the traditional User Interface.
This post introduces Manifesto, an open-source engine that implements this philosophy with a concrete protocol: @manifesto-io/*.
1. Black Box: The Current State of AI + Web Apps
Here's how most teams "add AI" to their web apps today:
- Use LangChain, AutoGPT, or browser automation
- Drive Playwright or Puppeteer
- Dump the DOM or screenshot into the model
- Hope it figures out what to click
This is the Black Box approach. The agent sees only the rendered surface and must infer everything else.
What's Wrong with DOM Dumps?
Consider a typical Material UI form field:
<div class="MuiFormControl-root css-1u3bzj6">
<label class="MuiInputLabel-root">Product Name</label>
<div class="MuiInputBase-root">
<input aria-invalid="false" type="text" class="MuiInputBase-input" value="" />
</div>
<p class="MuiFormHelperText-root">This field is required.</p>
</div>
From an agent's perspective:
| Problem | Impact |
|---|---|
| Token waste | 90% of tokens are class names and wrappers |
| Missing constraints | "Required" survives only as helper text. What's the max length? |
| No dependencies | Does this field depend on others? |
| No causality | Submit is disabled — but why? |
The agent is forced to guess. A CSS refactor breaks everything. A layout change confuses the model. The logic was never exposed — only its visual projection.
Signal < 10%. Noise > 90%.
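One rough way to sanity-check that ratio is to compare the visible text in a fragment with its total length. The sketch below does this with a naive regex tag stripper. `visibleTextRatio` is a hypothetical helper, and character counts are only a proxy for model tokens, so the exact figure will vary by tokenizer and component library.

```typescript
// Hypothetical helper: share of characters in an HTML fragment that are
// visible text rather than tags, attributes, or class names.
// Character counts are a rough proxy for tokens, not a real tokenizer.
function visibleTextRatio(html: string): number {
  const total = html.replace(/\s+/g, " ").trim().length;
  if (total === 0) return 0;
  const text = html
    .replace(/<[^>]*>/g, " ") // drop every tag, including its attributes
    .replace(/\s+/g, " ")
    .trim();
  return text.length / total;
}

// The Material UI field from the example above.
const muiField = `
<div class="MuiFormControl-root css-1u3bzj6">
  <label class="MuiInputLabel-root">Product Name</label>
  <div class="MuiInputBase-root">
    <input aria-invalid="false" type="text" class="MuiInputBase-input" value="" />
  </div>
  <p class="MuiFormHelperText-root">This field is required.</p>
</div>`;

// Only the label and helper text count as visible text here.
const ratio = visibleTextRatio(muiField);
```

For a fragment like this one, the visible text is a small fraction of what the model has to read, and none of it states the max length or the dependencies.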
2. White Box: Exposing the Application's Brain
The alternative is a White Box protocol.
Instead of showing HTML, the engine exposes a Semantic Snapshot — a structured representation of the application's internal state that agents can read directly.
{
"topology": {
"viewId": "product-create",
"mode": "create",
"sections": [
{ "id": "basic", "title": "Basic Info", "fields": ["name", "productType"] },
{ "id": "shipping", "title": "Shipping", "fields": ["shippingWeight"] }
]
},
"state": {
"form": { "isValid": false, "isDirty": false },
"fields": {
"name": {
"value": "",
"meta": { "valid": false, "hidden": false, "disabled": false, "errors": ["Required"] }
},
"productType": {
"value": "PHYSICAL",
"meta": { "valid": true, "hidden": false, "disabled": false, "errors": [] }
},
"shippingWeight": {
"value": null,
"meta": { "valid": true, "hidden": false, "disabled": false, "errors": [] }
}
}
},
"constraints": {
"name": { "required": true, "minLength": 2, "maxLength": 100 },
"shippingWeight": { "min": 0, "max": 2000, "dependsOn": ["productType"] }
},
"interactions": [
{ "id": "updateField:name", "intent": "updateField", "target": "name", "available": true },
{ "id": "updateField:productType", "intent": "updateField", "target": "productType", "available": true },
{ "id": "submit", "intent": "submit", "available": false, "reason": "Name is required" }
]
}
Now the agent has:
- Topology: Screen structure, sections, field hierarchy
- State: Current values, validity, visibility, errors — per field
- Constraints: Required, min/max, dependencies
- Interactions: What actions are available, and why some are blocked
No guessing. No inference. The agent reads the application's brain directly.
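To make "reads directly" concrete, here is a minimal sketch of how an agent-side helper might query such a snapshot. The interfaces are my assumption, inferred from the JSON above rather than taken from the package's published types, and `blockedInteractions` and `invalidFields` are hypothetical helper names.

```typescript
// Assumed shapes, inferred from the example snapshot above.
// The real @manifesto-io types may differ.
interface FieldMeta {
  valid: boolean;
  hidden: boolean;
  disabled: boolean;
  errors: string[];
}
interface Interaction {
  id: string;
  intent: string;
  target?: string;
  available: boolean;
  reason?: string;
}
interface SemanticSnapshot {
  state: {
    form: { isValid: boolean; isDirty: boolean };
    fields: Record<string, { value: unknown; meta: FieldMeta }>;
  };
  interactions: Interaction[];
}

// Hypothetical helper: every blocked interaction, paired with its stated reason.
function blockedInteractions(snapshot: SemanticSnapshot): string[] {
  return snapshot.interactions
    .filter((i) => !i.available)
    .map((i) => `${i.id}: ${i.reason ?? "no reason given"}`);
}

// Hypothetical helper: ids of fields that currently fail validation.
function invalidFields(snapshot: SemanticSnapshot): string[] {
  return Object.entries(snapshot.state.fields)
    .filter(([, field]) => !field.meta.valid)
    .map(([id]) => id);
}

const exampleSnapshot: SemanticSnapshot = {
  state: {
    form: { isValid: false, isDirty: false },
    fields: {
      name: { value: "", meta: { valid: false, hidden: false, disabled: false, errors: ["Required"] } },
      productType: { value: "PHYSICAL", meta: { valid: true, hidden: false, disabled: false, errors: [] } },
    },
  },
  interactions: [
    { id: "updateField:name", intent: "updateField", target: "name", available: true },
    { id: "submit", intent: "submit", available: false, reason: "Name is required" },
  ],
};

const blocked = blockedInteractions(exampleSnapshot); // ["submit: Name is required"]
const invalid = invalidFields(exampleSnapshot); // ["name"]
```

Both answers come from plain property lookups. Nothing is scraped or inferred from layout.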
3. A Real Use Case: "Where Do I Select the Week?"
🎮 See it in action: Manifesto Playground. Try changing field values and watch the semantic state update in real time.
Here's a scenario from a complex SaaS scheduling interface:
User: "I see a date picker, but where do I select which week?"
AI Chatbot: "The week selector only appears when you set frequency to 'Weekly'. Right now it's set to 'Daily'. Should I change it for you?"
For this to work, the AI needs to know:
- A field called weekSelector exists
- It's currently hidden
- It becomes visible when frequency === 'WEEKLY'
- The current value of frequency is 'DAILY'
No amount of DOM parsing gives you this reliably. But a Semantic Snapshot does:
{
"fields": {
"frequency": {
"value": "DAILY",
"meta": { "hidden": false }
},
"weekSelector": {
"value": null,
"meta": { "hidden": true },
"visibleWhen": "frequency === 'WEEKLY'"
}
}
}
The AI reads this and knows — without inference — exactly why the field is hidden and what would make it appear.
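As a sketch of how a chatbot might turn that snapshot into the answer above, the helper below looks up a field and reports why it is hidden. `explainVisibility` and the `FieldEntry` shape are assumptions for illustration, and `visibleWhen` is treated as an opaque string that is quoted, never evaluated.

```typescript
// Assumed shape, mirroring the snippet above. `visibleWhen` is treated as an
// opaque, human-readable expression string and is never evaluated here.
interface FieldEntry {
  value: unknown;
  meta: { hidden: boolean };
  visibleWhen?: string;
}

// Hypothetical helper an assistant could call before answering the user.
function explainVisibility(fields: Record<string, FieldEntry>, fieldId: string): string {
  // Own-property check so names like "constructor" are not mistaken for fields.
  const known = Object.prototype.hasOwnProperty.call(fields, fieldId);
  const field: FieldEntry | undefined = known ? fields[fieldId] : undefined;
  if (field === undefined) return `There is no field named "${fieldId}".`;
  if (!field.meta.hidden) return `"${fieldId}" is already visible.`;
  if (field.visibleWhen === undefined) {
    return `"${fieldId}" is hidden, and no visibility rule is exposed.`;
  }
  return `"${fieldId}" is hidden. It appears when ${field.visibleWhen}.`;
}

const schedulingFields: Record<string, FieldEntry> = {
  frequency: { value: "DAILY", meta: { hidden: false } },
  weekSelector: { value: null, meta: { hidden: true }, visibleWhen: "frequency === 'WEEKLY'" },
};

const answer = explainVisibility(schedulingFields, "weekSelector");
// answer: "weekSelector" is hidden. It appears when frequency === 'WEEKLY'.
```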
4. The Protocol Loop
Manifesto implements a continuous feedback loop between the engine and AI agents:
┌─────────────────────────────────────────────────────────────────────┐
│                                                                     │
│   [Context Injection] → [Reasoning] → [Action Dispatch] → [Delta]   │
│            ▲                                                  │     │
│            └─────────────── Continuous Snapshots ─────────────┘     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
Step by Step:
1. Context Injection: Engine exports a Semantic Snapshot
   - Topology (sections, fields, hierarchy)
   - State (values, validity, visibility)
   - Constraints (what's blocked and why)
   - Interactions (available intents with reasons)
2. Reasoning: Agent plans next action based on snapshot
3. Action Dispatch: Agent calls abstract intents, not DOM events
   - updateField, submit, reset, validate
4. Delta Feedback: Engine returns what changed
   - Not just "success" but the actual state diff
   - Agent learns causality: "I changed X, and Y became hidden"
5. Loop continues with updated snapshot
This is fundamentally different from "click and hope." The agent operates on structured meaning with predictable feedback.
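The loop can be sketched end to end with a toy in-memory engine. Everything here is a stand-in: `createToySession` is not the Manifesto runtime, `plan` replaces the model with a hard-coded rule, and the snapshot and delta shapes are deliberately simplified assumptions.

```typescript
type Intent =
  | { type: "updateField"; fieldId: string; value: unknown }
  | { type: "submit" };

interface Snapshot {
  values: Record<string, unknown>;
  canSubmit: boolean;
  submitted: boolean;
}
interface Delta {
  changed: string[]; // top-level snapshot keys that changed
}
interface Session {
  snapshot(): Snapshot;
  dispatch(intent: Intent): Delta;
}

// Toy engine (an assumption, not the real runtime): one rule, name needs 2+ chars.
function createToySession(): Session {
  let state: Snapshot = { values: { name: "" }, canSubmit: false, submitted: false };
  return {
    snapshot: () => state,
    dispatch(intent) {
      const before = state;
      if (intent.type === "updateField") {
        const values = { ...state.values, [intent.fieldId]: intent.value };
        const name = values["name"];
        state = { ...state, values, canSubmit: typeof name === "string" && name.length >= 2 };
      } else if (state.canSubmit) {
        state = { ...state, submitted: true };
      }
      const keys = Object.keys(state) as (keyof Snapshot)[];
      return { changed: keys.filter((k) => before[k] !== state[k]) };
    },
  };
}

// Stand-in for the model: choose the next intent from the snapshot alone.
function plan(snapshot: Snapshot): Intent | null {
  if (snapshot.submitted) return null;
  if (!snapshot.canSubmit) return { type: "updateField", fieldId: "name", value: "Desk Lamp" };
  return { type: "submit" };
}

function runLoop(session: Session, maxSteps = 10): string[] {
  const log: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const snapshot = session.snapshot(); // 1. context injection
    const intent = plan(snapshot); // 2. reasoning
    if (intent === null) break;
    const delta = session.dispatch(intent); // 3. action dispatch
    log.push(`${intent.type} -> ${delta.changed.join(",")}`); // 4. delta feedback
  }
  return log;
}

const log = runLoop(createToySession());
// log: ["updateField -> values,canSubmit", "submit -> submitted"]
```

The planner never touches a selector or a coordinate. It only sees state and returns an intent, so the same loop would still work if the form were rendered completely differently.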
5. The API: Exploration and Execution
Manifesto exposes this protocol through @manifesto-io/ai:
Exploration Mode: "What can I do here?"
import { createInteroperabilitySession } from '@manifesto-io/ai'
const session = createInteroperabilitySession({
runtime, // FormRuntime instance
viewSchema, // View definition
entitySchema, // Entity definition
})
// Get current semantic snapshot
const snapshot = session.snapshot()
// snapshot.interactions tells the agent:
// - submit: available=false, reason="Name is required"
// - updateField:name: available=true
// - updateField:productType: available=true
The agent now knows the current state and exactly what actions are valid.
Execution Mode: "Change it to digital"
const result = session.dispatch({
type: 'updateField',
fieldId: 'productType',
value: 'DIGITAL',
})
if (result._tag === 'Ok') {
const { snapshot, delta } = result.value
// delta shows exactly what changed:
// {
// fields: {
// productType: { value: 'DIGITAL' },
// shippingWeight: { hidden: true },
// fulfillmentType: { hidden: true }
// },
// interactions: {
// 'updateField:shippingWeight': { available: false, reason: 'Field is hidden' }
// }
// }
}
The agent doesn't just get "success." It gets a delta showing the causal chain: changing productType to DIGITAL caused shippingWeight to become hidden.
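For intuition, a delta like this can be derived by diffing the field state before and after a dispatch. The sketch below is one way to do that, under my own simplified `FieldView` shape. It is not Manifesto's actual diffing code, and it compares values shallowly.

```typescript
// Simplified, assumed field shape for this sketch only.
interface FieldView {
  value: unknown;
  hidden: boolean;
}
type Fields = Record<string, FieldView>;
type FieldDelta = Record<string, Partial<FieldView>>;

// Report only the properties that differ between two field maps.
// Values are compared with !==, so nested objects would need a deep compare.
function diffFields(before: Fields, after: Fields): FieldDelta {
  const delta: FieldDelta = {};
  for (const [id, next] of Object.entries(after)) {
    const prev: FieldView | undefined = before[id];
    const change: Partial<FieldView> = {};
    if (prev === undefined || prev.value !== next.value) change.value = next.value;
    if (prev === undefined || prev.hidden !== next.hidden) change.hidden = next.hidden;
    if (Object.keys(change).length > 0) delta[id] = change;
  }
  return delta;
}

const beforeDispatch: Fields = {
  productType: { value: "PHYSICAL", hidden: false },
  shippingWeight: { value: null, hidden: false },
};
const afterDispatch: Fields = {
  productType: { value: "DIGITAL", hidden: false },
  shippingWeight: { value: null, hidden: true },
};

const fieldDelta = diffFields(beforeDispatch, afterDispatch);
// fieldDelta: { productType: { value: "DIGITAL" }, shippingWeight: { hidden: true } }
```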
LLM Tool Export
Convert the snapshot into OpenAI/Claude-compatible function definitions:
import { toToolDefinitions } from '@manifesto-io/ai'
const tools = toToolDefinitions(snapshot, { omitUnavailable: true })
// Returns JSON-Schema tool definitions:
// - updateField (with enum of available fields)
// - submit (if form is valid)
// - reset
// - validate
This enables agents to interact with forms through standard function-calling interfaces.
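To give a feel for what such an export could contain, here is a hand-rolled sketch that turns the available interactions into JSON-Schema-style tool definitions. `sketchToolDefinitions` and the `ToolDefinition` shape are hypothetical, so the real output of toToolDefinitions may be structured differently.

```typescript
interface Interaction {
  id: string;
  intent: string;
  target?: string;
  available: boolean;
  reason?: string;
}

// Assumed, provider-neutral tool shape with a JSON Schema `parameters` object.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: { type: "object"; properties: Record<string, unknown>; required: string[] };
}

// Hypothetical converter: unavailable interactions are omitted entirely,
// so the model is never offered an action it cannot take.
function sketchToolDefinitions(interactions: Interaction[]): ToolDefinition[] {
  const tools: ToolDefinition[] = [];
  const editableFields: string[] = [];
  const otherIntents: string[] = [];

  for (const i of interactions) {
    if (!i.available) continue;
    if (i.intent === "updateField") {
      if (i.target !== undefined) editableFields.push(i.target);
    } else if (!otherIntents.includes(i.intent)) {
      otherIntents.push(i.intent);
    }
  }

  if (editableFields.length > 0) {
    tools.push({
      name: "updateField",
      description: "Set the value of one editable field.",
      parameters: {
        type: "object",
        properties: { fieldId: { type: "string", enum: editableFields }, value: {} },
        required: ["fieldId", "value"],
      },
    });
  }
  for (const intent of otherIntents) {
    tools.push({
      name: intent,
      description: `Dispatch the "${intent}" intent.`,
      parameters: { type: "object", properties: {}, required: [] },
    });
  }
  return tools;
}

const tools = sketchToolDefinitions([
  { id: "updateField:name", intent: "updateField", target: "name", available: true },
  { id: "updateField:productType", intent: "updateField", target: "productType", available: true },
  { id: "submit", intent: "submit", available: false, reason: "Name is required" },
]);
// One tool: updateField, with fieldId restricted to "name" or "productType".
```

Restricting `fieldId` to an enum of currently editable fields means an invalid target is ruled out by the schema before the model ever produces a call.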
6. Safety Rails: The Hallucination Firewall
The Problem with Black Box
When an agent manipulates DOM directly:
- It can click anything, including elements it shouldn't
- It can input invalid values
- It can trigger actions outside its permission
- Failures are silent or cryptic
Manifesto's Safety Rails
Hallucination Firewall: Every agent action is validated before execution.
const result = session.dispatch({
type: 'updateField',
fieldId: 'nonexistent', // ❌ Unknown field
value: 'test',
})
// result._tag === 'Err'
// result.error === 'Field not found: nonexistent'
// State unchanged — no side effects
What gets rejected:
- Unknown fields → Err
- Type mismatches (string to number field) → Err
- Hidden field updates → Err
- Disabled field updates → Err
- Unauthorized actions → Err
Atomic Rollback: On any failure, the previous snapshot remains intact. No partial mutations.
Deterministic Contracts: Same input + same state = same output. Agents can plan reliably.
This is capability-based access control for AI. The agent only sees and can only act on what's explicitly permitted.
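The validate-then-apply idea behind these rails can be sketched in a few lines. This is my own minimal illustration, not the engine's implementation: `guardedUpdate`, `FormState`, and the error messages other than "Field not found" are assumptions, and authorization checks are left out.

```typescript
type Result<T, E> = { _tag: "Ok"; value: T } | { _tag: "Err"; error: E };

interface FieldState {
  value: unknown;
  type: "string" | "number";
  hidden: boolean;
  disabled: boolean;
}
type FormState = Readonly<Record<string, Readonly<FieldState>>>;

interface UpdateField {
  type: "updateField";
  fieldId: string;
  value: unknown;
}

const err = (error: string): Result<FormState, string> => ({ _tag: "Err", error });

// Every check runs before any state is built, so a rejected action
// leaves the previous state untouched (nothing to roll back).
function guardedUpdate(state: FormState, action: UpdateField): Result<FormState, string> {
  // Own-property check: "constructor" or "toString" must not pass as fields.
  const known = Object.prototype.hasOwnProperty.call(state, action.fieldId);
  const field = known ? state[action.fieldId] : undefined;
  if (field === undefined) return err(`Field not found: ${action.fieldId}`);
  if (field.hidden) return err(`Field is hidden: ${action.fieldId}`);
  if (field.disabled) return err(`Field is disabled: ${action.fieldId}`);
  if (typeof action.value !== field.type) {
    return err(`Type mismatch for ${action.fieldId}: expected ${field.type}`);
  }
  // Build a new state object; the caller's state is never mutated.
  return { _tag: "Ok", value: { ...state, [action.fieldId]: { ...field, value: action.value } } };
}

const initial: FormState = {
  name: { value: "", type: "string", hidden: false, disabled: false },
  shippingWeight: { value: 0, type: "number", hidden: true, disabled: false },
};

const rejected = guardedUpdate(initial, { type: "updateField", fieldId: "nonexistent", value: "test" });
const accepted = guardedUpdate(initial, { type: "updateField", fieldId: "name", value: "Desk Lamp" });
```

Because the function is pure, the determinism claim follows directly: the same state and the same action always yield the same result.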
7. The Schema Layer
The Semantic Snapshot is derived from a declarative schema. Here's how it looks:
Entity Schema (Domain truth)
import { entity, field, enumValue } from '@manifesto-io/schema'
export const productTypes = [
enumValue('PHYSICAL', 'Physical Product'),
enumValue('DIGITAL', 'Digital Product'),
] as const
export const productEntity = entity('product', 'Product', '1.0.0')
.fields(
field.string('name', 'Product Name')
.required('Product name is required')
.min(2).max(100)
.build(),
field.enum('productType', 'Product Type', productTypes)
.required()
.defaultValue('PHYSICAL')
.build(),
field.number('shippingWeight', 'Shipping Weight (kg)')
.min(0).max(2000)
.build(),
)
.build()
View Schema (UI behavior)
import {
view, section, layout, viewField,
on, actions, $, fieldEquals,
} from '@manifesto-io/schema'
export const productCreateView = view('product-create', 'Create Product', '1.0.0')
.entityRef('product')
.mode('create')
.sections(
section('basic')
.title('Basic Information')
.layout(layout.grid(2, '1rem'))
.fields(
viewField.textInput('name', 'name')
.label('Product Name')
.build(),
viewField.select('productType', 'productType')
.label('Product Type')
.reaction(
on.change()
.when(fieldEquals('productType', 'DIGITAL'))
.do(
actions.updateProp('shippingWeight', 'hidden', true)
)
)
.reaction(
on.change()
.when(['!=', $.state('productType'), 'DIGITAL'])
.do(
actions.updateProp('shippingWeight', 'hidden', false)
)
)
.build(),
viewField.numberInput('shippingWeight', 'shippingWeight')
.label('Shipping Weight (kg)')
.dependsOn('productType')
.props({ min: 0, max: 2000 })
.build(),
)
.build(),
)
.build()
The schema captures:
- Dependencies: .dependsOn('productType')
- Reactions: on.change().when(...).do(...)
- Business rules: DIGITAL hides shipping fields
All of this is introspectable. The engine reads the schema, builds a DAG of dependencies, and exports the current state as a Semantic Snapshot.
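As an illustration of the DAG step, the sketch below takes a map from each field to the fields it depends on and returns an order in which every field is evaluated after its dependencies (Kahn's algorithm). The input shape and the `evaluationOrder` name are assumptions, so the engine's internal graph may be built differently.

```typescript
// deps maps a field id to the ids it depends on, e.g. collected from
// .dependsOn(...) calls. This input shape is an assumption for the sketch.
function evaluationOrder(deps: Record<string, string[]>): string[] {
  const nodes: string[] = [];
  const addNode = (id: string): void => {
    if (!nodes.includes(id)) nodes.push(id);
  };
  for (const [field, parents] of Object.entries(deps)) {
    addNode(field);
    parents.forEach(addNode);
  }

  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const n of nodes) {
    indegree.set(n, 0);
    dependents.set(n, []);
  }
  for (const [field, parents] of Object.entries(deps)) {
    for (const parent of parents) {
      indegree.set(field, (indegree.get(field) ?? 0) + 1);
      dependents.get(parent)?.push(field);
    }
  }

  // Kahn's algorithm: repeatedly emit nodes with no unresolved dependencies.
  const queue = nodes.filter((n) => indegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined) break;
    order.push(current);
    for (const child of dependents.get(current) ?? []) {
      const remaining = (indegree.get(child) ?? 0) - 1;
      indegree.set(child, remaining);
      if (remaining === 0) queue.push(child);
    }
  }

  // Anything left unresolved means the "DAG" contains a cycle.
  if (order.length !== nodes.length) throw new Error("Dependency cycle detected");
  return order;
}

const order = evaluationOrder({
  shippingWeight: ["productType"],
  name: [],
  productType: [],
});
// productType is always ordered before shippingWeight.
```

Rejecting cycles up front is what lets the runtime re-evaluate only downstream fields after a change and still terminate.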
8. Why Not Existing Tools?
| Tool | Strength | Gap |
|---|---|---|
| XState | State machines | No UI semantics, no agent protocol |
| Zod | Validation | No field dependencies, no visibility rules |
| React Hook Form | Form state | Business logic buried in components |
| MCP | Tool invocation | No UI domain logic, no snapshot protocol |
The missing piece is a layer that captures:
- Why a field is hidden (not just that it is)
- What conditions enable an action
- How fields relate to each other
- What changed after an action (delta feedback)
This is UI domain logic. None of the above expose it in a machine-readable protocol.
Manifesto fills that gap.
9. UI for Humans, II for Agents
For decades we've built User Interfaces that:
- Look good on screen
- Feel responsive
- Work across devices
That still matters. But it's no longer enough.
Software now needs both a UI for humans and an II — Intelligence Interface — for agents.
| Layer | Consumer | Content |
|---|---|---|
| UI | Humans | Pixels, clicks, visual feedback |
| II | Agents | Semantic Snapshot, intent dispatch, delta feedback |
Manifesto's architecture:
┌─────────────────────────────────────────────────────────────┐
│                        Schema Layer                         │
│   ┌─────────────┬─────────────┬─────────────────────────┐   │
│   │   Entity    │    View     │    Reactions & Rules    │   │
│   └─────────────┴─────────────┴─────────────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│                    Engine (DAG Runtime)                     │
├───────────────────────┬─────────────────────────────────────┤
│      UI Renderer      │   AI Protocol (@manifesto-io/ai)    │
│    (React/Vue/etc)    │     Snapshot + Dispatch + Delta     │
└───────────────────────┴─────────────────────────────────────┘
            ↓                              ↓
          Humans                         Agents
Define the schema once. Generate both UI and II from it.
10. What Is an "AI-Native Application"?
To me, an AI-native application has these properties:
- White Box, not Black Box: the engine exposes semantic state, not just rendered output
- UI is a projection: a visual representation of state, not the source of truth
- Agents interact with meaning: through structured snapshots and intent dispatch
- Protocol over DOM: actions are validated, deterministic, and return deltas
- Safety by design: hallucination firewall, atomic rollback, capability-based access
This doesn't mean abandoning UI. It means recognizing that UI alone is insufficient when your users include both humans and machines.
The Road Ahead
We're at an inflection point.
For decades, software was built for human consumption. UI was the interface, and it was enough.
Now, AI agents are becoming first-class users. They don't need pixels and click events. They need:
- Structured state
- Explicit constraints
- Causal feedback
- Safe execution boundaries
The teams that build for this will have AI integrations that are:
- Interpretable: Agents understand intent, not just surface
- Deterministic: Same input, same output
- Debuggable: Trace exactly what changed and why
- Safe: Hallucinations rejected, not silently executed
The teams that don't will find their AI integrations perpetually fragile — dependent on screenshot parsing and prompt hacks that break with every redesign.
Conclusion
HTML is a great language for humans.
For AI, it's a noisy encoding of things it shouldn't have to reverse-engineer.
AI doesn't need your pixels. It needs your meaning.
That meaning should be exposed as a Semantic State Layer — a White Box protocol where agents can read state, dispatch intents, and receive causal feedback.
Manifesto is my attempt to build that layer.
GitHub: github.com/eggplantiny/manifesto-io
Playground: manifesto-io-playground.vercel.app
Package: @manifesto-io/* — The interoperability protocol for agents
Don't feed HTML to your agents.
Give them a White Box: state, intent, and semantics.
