Why UI Generation Is Not Enough in the AI Era
A philosophy of software architecture from the perspective of AI agents
For the past decade, UI generation has been the flagship promise of developer productivity. Visual builders, code templates, declarative schemas—all designed to accelerate the creation of user interfaces.
But something fundamental has changed.
The users of our software are no longer just humans. AI agents now interact with applications, trigger workflows, manipulate data, and evolve systems. And here's the uncomfortable truth: UI was never designed for them.
This essay argues that in the AI era, generating UI is no longer the bottleneck. The bottleneck is the absence of machine-readable semantics. And until we address this, AI agents will continue to guess, hallucinate, and fail.
The Lens Matters: Human Semantics ≠ Machine Semantics
Before we proceed, let me be explicit about the lens through which I'm viewing software:
This is not a critique of modern frontend development.
React, Vue, and Svelte are sophisticated frameworks. They organize code into semantic components. A <UserForm /> component is meaningful—to a human developer who can read its implementation.
But an AI agent cannot open UserForm.tsx and understand:
- What fields exist inside
- Which fields depend on others
- What validation rules apply
- When a field becomes disabled
- What side effects occur on submit
The component is a black box.
Modern UI frameworks are semantic for humans. They are opaque for machines.
This distinction is the foundation of everything that follows.
Why AI Fails: The Interpretation Gap
When an AI agent "mis-clicks" or "hallucinates form behavior," the instinct is to blame model accuracy.
My view is different.
AI fails because applications do not expose their internal semantics.
Consider what happens when an AI agent encounters a typical web form:
- It sees a screenshot (or DOM structure)
- It identifies visual elements: buttons, inputs, dropdowns
- It infers what those elements do based on labels and context
- It executes actions based on those inferences
Step 3 is where everything breaks.
The AI has no way of knowing:
- That selecting "Digital Product" should hide the "Shipping Weight" field
- That "Subcategory" depends on "Category" and should be disabled until Category is selected
- That "Freight" shipping type enforces a minimum weight of 50kg
- That "Final Price" is auto-calculated from "Price" and "Discount Rate"
- That SKU must be validated against a server endpoint for uniqueness
These rules are scattered across:
- `useEffect` hooks
- Custom validation schemas
- Service layer calls
- Global state management
- Ad-hoc conditional rendering
- Backend business logic
- Tribal knowledge in developers' heads
The AI sees the surface. The meaning is buried.
Any system that forces an AI to guess will inevitably break.
What AI Agents Actually Need
Let me reframe the problem.
Humans need:
- Buttons to click
- Inputs to fill
- Visual feedback to understand state
AI agents need:
- The intent behind a button
- The semantics of a field
- The rules governing a workflow
- The dependencies between values
- The effects triggered by a change
UI expresses almost none of this in a machine-readable way.
What AI agents truly need is not a screen. It's an Intent Graph—a declarative representation of:
| UI Concept | Semantic Equivalent |
|---|---|
| Button | Action with preconditions and effects |
| Input field | Typed semantic value with constraints |
| Screen | View into a domain model |
| Form submission | State transition with validation |
| Conditional visibility | Dependency graph |
The UI can be generated, regenerated, or discarded. But the Intent Graph remains constant.
That is what the AI should interact with.
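One way to make the table concrete is to sketch the Intent Graph as plain data. The type names below are my own illustration, not an existing API:

```typescript
// Hypothetical Intent Graph shapes, mirroring the table above.
type Action = {
  id: string;
  preconditions: string[]; // e.g. "all fields valid"
  effects: string[];       // e.g. "product is persisted"
};

type SemanticField = {
  id: string;
  type: 'string' | 'number' | 'enum';
  constraints: Record<string, unknown>; // typed value constraints
  dependsOn: string[];                  // dependency-graph edges
  visibleWhen?: string;                 // conditional visibility as a queryable expression
};

// A "screen" is just a view into the domain model.
type View = {
  id: string;
  fields: SemanticField[];
  actions: Action[];
};

const productCreate: View = {
  id: 'product-create',
  fields: [
    { id: 'productType', type: 'enum', constraints: { options: ['PHYSICAL', 'DIGITAL'] }, dependsOn: [] },
    { id: 'shippingWeight', type: 'number', constraints: { min: 0 }, dependsOn: ['productType'], visibleWhen: "productType != 'DIGITAL'" },
  ],
  actions: [{ id: 'submit', preconditions: ['all fields valid'], effects: ['product created'] }],
};
```

Regenerate the UI however you like; an agent reading `productCreate` still sees the same dependencies and preconditions.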
A Concrete Example: Declarative View Schema
Let me show you what this looks like in practice.
Here's a traditional React form (simplified):
import { useState, useEffect } from 'react'

function ProductForm() {
  const [productType, setProductType] = useState('PHYSICAL')
  const [weight, setWeight] = useState(0)
  const [fulfillment, setFulfillment] = useState('STANDARD')

  // Business rule buried in an effect: FREIGHT enforces a 50kg minimum.
  useEffect(() => {
    if (fulfillment === 'FREIGHT' && weight < 50) {
      setWeight(50)
    }
  }, [fulfillment, weight])

  return (
    <form>
      <select value={productType} onChange={e => setProductType(e.target.value)}>
        <option value="PHYSICAL">Physical</option>
        <option value="DIGITAL">Digital</option>
      </select>
      {/* Business rule buried in conditional rendering: digital products have no weight. */}
      {productType !== 'DIGITAL' && (
        <input
          type="number"
          value={weight}
          onChange={e => setWeight(Number(e.target.value))}
        />
      )}
    </form>
  )
}
An AI agent looking at this has to:
- Parse the JSX
- Trace the `useState` dependencies
- Understand the `useEffect` logic
- Infer conditional rendering rules
Now compare this to a declarative semantic schema:
const productView = view('product-create')
.fields(
viewField.select('productType')
.options(['PHYSICAL', 'DIGITAL'])
.reaction(
on.change()
.when(fieldEquals('productType', 'DIGITAL'))
.do(actions.updateProp('shippingWeight', 'hidden', true))
),
viewField.numberInput('shippingWeight')
.dependsOn('productType', 'fulfillmentType')
.reaction(
on.change()
.when(and(
fieldEquals('fulfillmentType', 'FREIGHT'),
['<', $.state('shippingWeight'), 50]
))
.do(actions.setValue('shippingWeight', 50))
)
)
The difference is profound.
In the declarative schema:
- Dependencies are explicit: `dependsOn('productType', 'fulfillmentType')`
- Reactions are declared, not imperative: `on.change().when(...).do(...)`
- Business rules are first-class: "FREIGHT requires minimum 50kg"
- Visibility conditions are queryable: `hidden: true` when productType is DIGITAL
An AI agent reading this schema knows—without inference, without guessing—exactly how the form behaves.
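To see what "no guessing" buys an agent, suppose the schema above were available as plain data. The object shape below is a hypothetical flattening of the DSL, not its real output:

```typescript
// A plain-data rendering of the visibility rules (illustrative shape).
const schema = {
  fields: [
    { id: 'productType', hiddenWhen: null },
    { id: 'shippingWeight', hiddenWhen: { field: 'productType', equals: 'DIGITAL' } },
  ],
};

// Pure function: which fields are visible for a given form state?
function visibleFields(s: typeof schema, state: Record<string, string>): string[] {
  return s.fields
    .filter(f => !(f.hiddenWhen && state[f.hiddenWhen.field] === f.hiddenWhen.equals))
    .map(f => f.id);
}
```

An agent can now answer "which fields will I see if I pick DIGITAL?" by evaluating data, not by parsing JSX or screenshots.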
Why Existing Technologies Aren't Enough
You might ask: don't we already have semantic technologies?
| Technology | What It Captures | What It Misses |
|---|---|---|
| OpenAPI | API shape, request/response | UI state, field dependencies |
| GraphQL | Data relationships | Workflow logic, validation rules |
| JSON Schema | Data structure, constraints | View binding, reactive behavior |
| MCP | Tool contracts | Inter-field dependencies, UI state |
| React/Vue | Component structure | Business logic (encapsulated) |
Each of these is a fragment of semantics.
None provides a unified model that spans:
- Data (Entity)
- View (Presentation + Interaction)
- Rules (Validation, Constraints)
- Dependencies (Field relationships)
- Actions (State transitions)
- Workflows (Multi-step processes)
This is the missing layer. I call it the Semantic Execution Layer—a unified runtime where meaning is explicit, queryable, and executable.
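As a rough sketch, a unified model spanning those six concerns might look like the following. Every name here is illustrative; no such library is implied:

```typescript
// Hypothetical skeleton of a unified semantic model.
interface SemanticModel {
  entities: Record<string, { fields: Record<string, { type: string }> }>;  // Data
  views: Record<string, { entity: string; fields: string[] }>;             // View
  rules: Array<{ field: string; constraint: string }>;                     // Rules
  dependencies: Array<{ from: string; to: string }>;                       // Dependencies
  actions: Record<string, { preconditions: string[]; effects: string[] }>; // Actions
  workflows: Record<string, string[]>;                                     // Workflows
}

const model: SemanticModel = {
  entities: { product: { fields: { sku: { type: 'string' }, price: { type: 'number' } } } },
  views: { 'product-create': { entity: 'product', fields: ['sku', 'price'] } },
  rules: [{ field: 'sku', constraint: 'unique (server-validated)' }],
  dependencies: [{ from: 'subcategory', to: 'category' }],
  actions: { submit: { preconditions: ['all rules pass'], effects: ['product persisted'] } },
  workflows: { 'create-product': ['fill-form', 'validate', 'submit'] },
};
```

The point is not the particular shape but that all six concerns live in one queryable structure instead of six scattered places.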
The Architecture of Meaning
Here's how I envision the stack:
┌─────────────────────────────────────────┐
│ Natural Language │ ← User or AI intent
├─────────────────────────────────────────┤
│ Semantic Layer │ ← Intent Graph / View Schema
│ ┌─────────┬──────────┬──────────────┐ │
│ │ Entity │ View │ Reactions │ │
│ │ Schema │ Schema │ & Rules │ │
│ └─────────┴──────────┴──────────────┘ │
├─────────────────────────────────────────┤
│ Execution Runtime │ ← Interprets & executes
├─────────────────────────────────────────┤
│ UI Rendering │ API Calls │ ← Projections
└─────────────────────────────────────────┘
In this architecture:
- Entity Schema defines domain truth: what data exists, what constraints apply
- View Schema defines presentation and interaction: how fields are displayed, how they react
- Reactions define behavior: when X changes, do Y
- Execution Runtime interprets these schemas and manages state
- UI is just one possible rendering—a projection of the semantic layer
The AI agent interacts with the Semantic Layer directly. It doesn't need to parse UI. It doesn't need to guess.
Inference vs. Guarantee
There's a counterargument I often hear:
"But LLMs are getting really good at understanding UI! They can look at screenshots and figure out what to do."
This is true. And it's also insufficient.
LLMs can infer meaning from UI. But inference is probabilistic. It's a prediction, not a guarantee.
For simple forms, inference works. For complex business logic—cascading dependencies, conditional validation, multi-step workflows—inference fails unpredictably.
Consider the difference (the reliability figures here are rough illustrations, not measurements):
| Approach | Reliability | Debuggability | Determinism |
|---|---|---|---|
| Screenshot → LLM inference | ~70-80% | Low | No |
| DOM parsing → heuristics | ~80-85% | Medium | Partial |
| Semantic schema → direct access | ~99%+ | High | Yes |
If you're building a demo, inference is fine.
If you're building production systems where AI agents execute real transactions, move real money, and affect real outcomes—you need guarantees.
Semantic schemas provide those guarantees.
UI Is Ephemeral. Semantics Endure.
Let me be clear about what I'm not saying.
I'm not saying UI is useless. UI is essential—for humans.
I'm not saying we should stop building beautiful interfaces. We should.
What I'm saying is this:
UI is one projection of a deeper structure. AI agents need access to that deeper structure directly.
Today's software is built like this:
- Design the UI
- Implement the logic (scattered everywhere)
- Hope the AI can figure it out
Tomorrow's software should be built like this:
- Define the semantic model (Entity + View + Rules)
- Generate UI as one projection
- Expose semantics to AI agents as another projection
- Let both humans and machines interact with the same underlying truth
The semantic layer becomes the single source of truth.
UI becomes a rendering concern—important, but not foundational.
Implications for Software Architecture
If you accept this philosophy, several implications follow:
1. Schema-First Development
Define your semantic schemas before writing UI code. The schema is the specification. The UI is an implementation detail.
2. Explicit Dependencies
Never hide field dependencies in useEffect hooks. Declare them in the schema where they can be queried and reasoned about.
3. Reactions as Data
Business rules should be expressed as declarative reactions, not imperative code. on.change().when(...).do(...) is data. An if statement buried in a component is not.
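The contrast can be shown side by side. Both snippets below are illustrative; the declarative shape is a hypothetical serialization, not a real DSL:

```typescript
// Imperative: the rule is buried in code and invisible to tooling.
function onFulfillmentChange(fulfillment: string, weight: number): number {
  if (fulfillment === 'FREIGHT' && weight < 50) return 50;
  return weight;
}

// Declarative: the same rule as a plain data structure.
const reaction = {
  on: 'change',
  when: {
    all: [
      { field: 'fulfillmentType', equals: 'FREIGHT' },
      { field: 'shippingWeight', lt: 50 },
    ],
  },
  do: { setValue: { field: 'shippingWeight', to: 50 } },
};
// An agent can answer "what happens when fulfillmentType becomes FREIGHT?"
// by reading `reaction`, without executing anything.
```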
4. AI-Native APIs
Expose your semantic layer to AI agents. Not just REST endpoints, but the full Intent Graph: fields, dependencies, constraints, valid actions, expected outcomes.
5. Runtime Introspection
Build execution runtimes that can answer questions like:
- "What fields are currently visible?"
- "What would happen if I set this value?"
- "What are the preconditions for this action?"
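A minimal sketch of such a runtime, covering the first two questions; the schema shape and API names are assumptions of mine:

```typescript
// Hypothetical introspectable runtime over a field schema.
type FormState = Record<string, number | string>;
type Field = { id: string; hiddenWhen?: (s: FormState) => boolean; min?: number };

class IntrospectableRuntime {
  constructor(private fields: Field[], private state: FormState) {}

  // "What fields are currently visible?"
  visibleFields(): string[] {
    return this.fields.filter(f => !f.hiddenWhen?.(this.state)).map(f => f.id);
  }

  // "What would happen if I set this value?" Simulate without committing.
  simulateSet(id: string, value: number): { accepted: boolean; corrected?: number } {
    const field = this.fields.find(f => f.id === id);
    if (field && field.min !== undefined && value < field.min) {
      return { accepted: false, corrected: field.min };
    }
    return { accepted: true };
  }
}

const rt = new IntrospectableRuntime(
  [
    { id: 'productType' },
    { id: 'shippingWeight', min: 50, hiddenWhen: s => s.productType === 'DIGITAL' },
  ],
  { productType: 'PHYSICAL' },
);
```

With the freight example from earlier, `rt.simulateSet('shippingWeight', 10)` reports the correction to 50 before the agent ever commits a value.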
The Road Ahead
We're at an inflection point.
For decades, software has been built for human consumption. UI was the interface, and it was sufficient.
Now, AI agents are becoming first-class users of our systems. They don't need pixels and click events. They need structured meaning.
The companies that recognize this shift early—that invest in semantic layers, declarative schemas, and AI-native architectures—will build systems that are:
- Interpretable: AI agents can understand intent, not just surface
- Deterministic: Behavior is predictable, not probabilistic
- Evolvable: Schemas can be versioned, migrated, and extended
- Debuggable: When something fails, you can trace exactly why
The companies that don't will find their AI integrations perpetually fragile—dependent on prompt engineering hacks and screenshot parsing tricks that break with every UI change.
Conclusion
UI generation was the endgame for human developers.
For AI agents, it's not enough.
What we need is a Semantic Execution Layer—a unified representation of data, view, rules, and behavior that both humans and machines can interact with.
UI is a projection of this layer. An important one, but just one of many.
As AI accelerates, the question shifts:
"What does my application look like?" → "What does my application mean?"
The first question produces screens.
The second question produces understanding.
And in the age of AI agents, understanding is everything.
UI is transient. Semantics endure.
Top comments (7)
Great article! You've hit on something critical that most of the industry is missing.
The "Intent Graph" concept you describe — machine-readable semantics that both humans and AI can reason about — is exactly what we're building with Almadar (almadar.io).
Our approach: instead of generating UI directly from prompts, we use JSON schemas as the single source of truth. These schemas define:
- Entities (your data model)
- State machines (behavior as "traits")
- Pages (UI containers)
The UI is just a projection of the schema. Change the schema and everything updates — API, database, frontend. Everything derives from the schema.
Would love your thoughts on whether declarative schemas should include execution semantics (state machines) or stay purely structural?
Thanks for reading! It’s awesome to connect with someone tackling the exact same problem. Almadar's approach of using JSON schemas as the single source of truth resonates perfectly with what I'm trying to solve.
To answer your question: Yes, I strongly believe schemas should include execution semantics — but absolutely not as opaque imperative code (like "if/else in components").
For Manifesto/MEL-style systems, the core philosophy is separating deterministic state transitions from explicit effect boundaries.
If we want AI to reason about the world safely, the schema must provide:
- `computed` = pure/deterministic (so AI can simulate and reason before acting)
- `available when` = explicit preconditions (Affordances)
- `patch` = explicit state diffs (debuggable and inspectable)

Side effects (I/O) should be declared strictly at the boundaries, so an agent knows exactly what mutates the real world versus what is just a pure state computation.
Here is a quick example of how it looks in Manifesto:
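The original snippet is not preserved here; below is a hypothetical reconstruction as plain data, using only the primitives described above (`computed`, `available when`, `patch`, boundary effects) and the `canPublish` rule quoted elsewhere in this thread:

```typescript
// Hypothetical reconstruction of a Manifesto-style declaration as plain
// data. The syntax is illustrative; only the primitive names come from
// the surrounding discussion.
const publishDoc = {
  // computed: pure and deterministic, so an agent can simulate it
  computed: {
    canPublish: ['and', ['eq', 'doc.status', 'review'], ['gt', ['len', 'doc.sections'], 0]],
  },
  // available when: explicit precondition (affordance)
  availableWhen: 'canPublish',
  // patch: an explicit, inspectable state diff
  patch: { 'doc.status': 'published' },
  // side effects (I/O) declared strictly at the boundary
  effects: [{ io: 'persist', entity: 'doc' }],
};
```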
So yes, schemas must carry behavior, but entirely as queryable semantics, keeping pure logic strictly separated from side effects.
How does Almadar currently handle the boundary between the state machine logic and actual side effects (like DB writes or external API calls)? Would love to hear how you're approaching that!
Great question,
This is exactly the design tension we spent a lot of time on. The short answer is: Almadar's schema carries full execution semantics, and the boundary between pure logic and side effects is structurally enforced by the schema itself, not by convention.
Why S-Expressions?
Before diving in, a quick note on the syntax you'll see below. Almadar uses S-expressions — the notation that originated in Lisp in the 1950s — for all guards, effects, and computed values. An S-expression is just a nested list where the first element is the operator:
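Reconstructing the (missing) snippet as the JSON array form the reply describes:

```typescript
// An S-expression as a plain JSON array: the first element is the
// operator, the rest are arguments (which may themselves be expressions).
const guard = ['and', ['=', 'entity.status', 'review'], ['>', 'entity.count', 0]];
```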
This is the JSON equivalent of Lisp's `(and (= entity.status "review") (> entity.count 0))`.
We chose S-expressions for a specific reason: they're data, not code. Because they're plain JSON arrays, they can be parsed, validated, transformed, and serialized by any language — the Rust compiler statically analyzes them, the TypeScript runtime evaluates them, and an AI agent can reason about them without needing a language-specific parser. They're homoiconic: the representation is the AST. That's what makes the entire schema machine-queryable — guards, effects, and computed values are all just data structures you can walk, validate, and simulate.
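To make "walk, validate, and simulate" concrete, here is a minimal pure evaluator over such arrays. The operator set and the `@`-prefixed binding convention are illustrative assumptions, not Almadar's actual implementation:

```typescript
// A tiny, pure S-expression evaluator: no I/O, just data in, value out.
type SExpr = string | number | boolean | SExpr[];
type Bindings = Record<string, string | number>;

function evalGuard(expr: SExpr, bindings: Bindings): string | number | boolean {
  if (!Array.isArray(expr)) {
    // Resolve @-prefixed bindings; other atoms are literals.
    if (typeof expr === 'string' && expr.startsWith('@')) {
      const v = bindings[expr];
      if (v === undefined) throw new Error(`unbound: ${expr}`);
      return v;
    }
    return expr;
  }
  const [op, ...args] = expr;
  const vals = args.map(a => evalGuard(a, bindings));
  switch (op) {
    case 'and': return vals.every(Boolean);
    case '=':   return vals[0] === vals[1];
    case '>':   return (vals[0] as number) > (vals[1] as number);
    default:    throw new Error(`unknown operator: ${String(op)}`);
  }
}
```

Because the evaluator is a pure function over the binding context, an agent can simulate any guard before acting.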
Here's how it maps to the concepts you described:
Guards = Pure/Deterministic (your `computed`)
Guards are S-expressions that evaluate to a boolean — they're pure, side-effect-free, and the runtime (or compiler) can evaluate them without touching the outside world.
Such a guard is equivalent to your `computed canPublish = and(eq(doc.status, "review"), gt(len(sections), 0))`. Guards reference entity state via bindings (`@entity.*`, `@payload.*`, `@user.*`) and use a well-defined S-expression algebra (comparison, logic, math, collection ops). An AI agent can fully simulate guard evaluation without any I/O — the evaluator is a pure function over the binding context.
Transitions = Explicit Preconditions (your `available when`)
A transition is only valid from a specific state, triggered by a specific event, and gated by a guard.
The state machine itself acts as the affordance system — you can only `PUBLISH` when you're in the `review` state, AND the guard passes. The compiler even has a `--simulate` mode that does BFS over all reachable states to verify what's possible from where, without running anything.
Effects = Declared Side Effects at the Boundary (your `patch` + effect)
This is where Almadar's design is very deliberate. Effects are typed, declared in the schema, and partitioned by where they execute:
- `set` (your `patch`)
- `render-ui`
- `navigate`
- `persist`
- `fetch`
- `call-service`
- `emit`
- `notify`
render-ui/navigate/notify, and the client ignorespersist/fetch/call-service. So the schema declares what mutates the real world (persist,call-service), what's a pure state computation (set, guards), and what's a UI concern (render-ui). An agent reading the schema knows exactly which effects touch external systems.Here's how your Manifesto example would look in Almadar:
The
setis yourpatch(pure state diff). Thepersistandemitare your declared side effects at the boundary. The guard is youravailable when. All queryable, all inspectable, all simulatable.The Dual Execution Model
What makes this work in practice is that every event flows through the same pipeline on both client and server simultaneously:
The client gets instant UI feedback (optimistic), the server handles the real-world mutations, and the response reconciles them. But both sides are executing the same schema — it's not two separate codebases, it's one schema with partitioned effect execution.
Compile-Time Verification
The Rust compiler validates this entire model at compile time:
- Expressions are parsed into a typed representation (`OirExpr`) and validated
- Bindings (`@entity.field`) are checked against actual entity definitions
- Simulation mode (`--simulate`) traces all reachable states via BFS
- Execution mode (`--execute`) runs generated test cases against the state machine without a browser

So the schema isn't just documentation — it's the executable specification that both the compiler and runtime enforce.
Where We Differ from Manifesto
The main structural difference is that Almadar uses state machines as its organizing principle rather than standalone actions. Every action exists within a state context — you can only `PUBLISH` from the `review` state. This gives you not just preconditions per action, but a complete graph of all possible state trajectories, which the compiler can statically analyze.
The tradeoff is that Manifesto's `action` declarations are more concise for simple cases. At the same time, Almadar's state machine model pays off when you have complex multi-step flows (wizards, checkout pipelines, game loops) where the state context matters.
Looking forward to discussing this some more.
Thank you for the incredibly deep and detailed explanation. Almadar's approach of structurally separating guards, transitions, and effects, while treating all execution semantics as data via S-expressions, is highly impressive. I strongly resonate with the philosophy that "the schema is the executable specification itself."
However, where I took a slightly different approach when designing Manifesto is that I do not completely equate the semantic model with the execution model.
Manifesto begins with the following axiom:
`compute(Snapshot, intent) = projection`
Here, the `Snapshot` represents a coordinate in the domain's semantic space, and the `intent` is a movement vector within that space. The projection (whether UI, backend, DB, etc.) is merely the computed result, not the semantics itself.
Almadar’s state machine model clearly offers powerful advantages for execution path integrity and compile-time static analysis. In contrast, Manifesto places the semantic space at an ontological layer prior to projection, intentionally pushing side-effects outside the boundaries of its interpretation.
Ultimately, this distinction seems to stem from a foundational difference in perspective: whether to view the system as "Semantic = Executable Graph" or "Semantic = Navigable Coordinate Space."
This leaves me with one question. Since Almadar has such an elegant S-expression-based data structure, do you envision scenarios where the schema itself is modified at runtime, or where the state machine structure is dynamically reconfigured by an AI agent? I'd love to hear your thoughts on the potential of your semantic model evolving beyond a fixed executable specification into a space that is explorable and extensible at runtime.
Your reply gave me a much better understanding of how Manifesto is designed. I might be wrong here, but the problem with staying at the semantic level and not providing concrete, verifiable primitives in your language is that it introduces too much non-determinism (correct me if I misunderstood you). When I was constructing Almadar, my main constraints were these: agents should be able to construct this language, but at all times, a human can read what the agent produces and easily infer the intent and execution model of what it produced. Also, agents still hallucinate, so you need a fixed set of primitives with which they can work. That is why I have the state machine as a first-class object in the language itself.
Now, regarding your question: yes, the agent can modify the state machine itself (it's a `.orb` file), but the result must pass our deterministic validator (via the `almadar` CLI's `validate` command). That said, runtime modification isn't something I've tested — in my use case I construct the `.orb` schema file, compile it after running all the validations, and deploy it. The runtime I've built also just takes the `.orb` file and runs it.
But your question sparked some ideas in my head.
Awesome discussion btw. I really enjoy discussing this with you. It's nice to talk to someone who is thinking and working on similar things.
Your concern makes sense, and I think you’re pointing at a very real tension.
I agree that staying purely at the semantic level without grounding it in concrete, verifiable primitives can introduce non-determinism. That’s something I’m actively thinking about as well.
In Manifesto, my goal is not to leave semantics abstract, but to design a layer where:
So in a way, I’m trying to reconcile both perspectives:
Semantic as a coordinate space, but with explicit structural invariants that prevent it from collapsing into probabilistic interpretation.
Your emphasis on state machines as first-class primitives is a very strong anchor for determinism. I’m exploring whether a similar guarantee can be achieved while keeping the semantic layer slightly more ontologically prior to execution.
Your comment actually clarifies the exact design boundary I need to formalize more rigorously.
Really appreciate the push on this.