Jung Sungwoo

Posted on Dec 3, 2025

Why UI Generation Is Not Enough in the AI Era

#ai #architecture #discuss

Why UI Generation Is Not Enough in the AI Era

A philosophy of software architecture from the perspective of AI agents

For the past decade, UI generation has been the flagship promise of developer productivity. Visual builders, code templates, declarative schemas—all designed to accelerate the creation of user interfaces.

But something fundamental has changed.

The users of our software are no longer just humans. AI agents now interact with applications, trigger workflows, manipulate data, and evolve systems. And here's the uncomfortable truth: UI was never designed for them.

This essay argues that in the AI era, generating UI is no longer the bottleneck. The bottleneck is the absence of machine-readable semantics. And until we address this, AI agents will continue to guess, hallucinate, and fail.

The Lens Matters: Human Semantics ≠ Machine Semantics

Before we proceed, let me be explicit about the lens through which I'm viewing software:

This is not a critique of modern frontend development.

React, Vue, and Svelte are sophisticated frameworks. They organize code into semantic components. A <UserForm /> component is meaningful—to a human developer who can read its implementation.

But an AI agent cannot open UserForm.tsx and understand:

What fields exist inside
Which fields depend on others
What validation rules apply
When a field becomes disabled
What side effects occur on submit

The component is a black box.

Modern UI frameworks are semantic for humans. They are opaque for machines.

This distinction is the foundation of everything that follows.

Why AI Fails: The Interpretation Gap

When an AI agent "mis-clicks" or "hallucinates form behavior," the instinct is to blame model accuracy.

My view is different.

AI fails because applications do not expose their internal semantics.

Consider what happens when an AI agent encounters a typical web form:

It sees a screenshot (or DOM structure)
It identifies visual elements: buttons, inputs, dropdowns
It infers what those elements do based on labels and context
It executes actions based on those inferences

Step 3 is where everything breaks.

The AI has no way of knowing:

That selecting "Digital Product" should hide the "Shipping Weight" field
That "Subcategory" depends on "Category" and should be disabled until Category is selected
That "Freight" shipping type enforces a minimum weight of 50kg
That "Final Price" is auto-calculated from "Price" and "Discount Rate"
That SKU must be validated against a server endpoint for uniqueness

These rules are scattered across:

useEffect hooks
Custom validation schemas
Service layer calls
Global state management
Ad-hoc conditional rendering
Backend business logic
Tribal knowledge in developers' heads

The AI sees the surface. The meaning is buried.

Any system that forces an AI to guess will inevitably break.

What AI Agents Actually Need

Let me reframe the problem.

Humans need:

Buttons to click
Inputs to fill
Visual feedback to understand state

AI agents need:

The intent behind a button
The semantics of a field
The rules governing a workflow
The dependencies between values
The effects triggered by a change

UI expresses almost none of this in a machine-readable way.

What AI agents truly need is not a screen. It's an Intent Graph—a declarative representation of:

UI Concept	Semantic Equivalent
Button	Action with preconditions and effects
Input field	Typed semantic value with constraints
Screen	View into a domain model
Form submission	State transition with validation
Conditional visibility	Dependency graph

The UI can be generated, regenerated, or discarded. But the Intent Graph remains constant.

That is what the AI should interact with.

A Concrete Example: Declarative View Schema

Let me show you what this looks like in practice.

Here's a traditional React form (simplified):

function ProductForm() {
  const [productType, setProductType] = useState('PHYSICAL')
  const [weight, setWeight] = useState(0)
  const [fulfillment, setFulfillment] = useState('STANDARD')

  useEffect(() => {
    if (fulfillment === 'FREIGHT' && weight < 50) {
      setWeight(50)
    }
  }, [fulfillment, weight])

  return (
    <form>
      <select value={productType} onChange={e => setProductType(e.target.value)}>
        <option value="PHYSICAL">Physical</option>
        <option value="DIGITAL">Digital</option>
      </select>

      {productType !== 'DIGITAL' && (
        <input 
          type="number" 
          value={weight}
          onChange={e => setWeight(Number(e.target.value))}
        />
      )}
    </form>
  )
}

An AI agent looking at this has to:

Parse the JSX
Trace the useState dependencies
Understand the useEffect logic
Infer conditional rendering rules

Now compare this to a declarative semantic schema:

const productView = view('product-create')
  .fields(
    viewField.select('productType')
      .options(['PHYSICAL', 'DIGITAL'])
      .reaction(
        on.change()
          .when(fieldEquals('productType', 'DIGITAL'))
          .do(actions.updateProp('shippingWeight', 'hidden', true))
      ),

    viewField.numberInput('shippingWeight')
      .dependsOn('productType', 'fulfillmentType')
      .reaction(
        on.change()
          .when(and(
            fieldEquals('fulfillmentType', 'FREIGHT'),
            ['<', $.state('shippingWeight'), 50]
          ))
          .do(actions.setValue('shippingWeight', 50))
      )
  )

The difference is profound.

In the declarative schema:

Dependencies are explicit: dependsOn('productType', 'fulfillmentType')
Reactions are declared, not imperative: on.change().when(...).do(...)
Business rules are first-class: "FREIGHT requires minimum 50kg"
Visibility conditions are queryable: hidden: true when productType is DIGITAL

An AI agent reading this schema knows—without inference, without guessing—exactly how the form behaves.

Why Existing Technologies Aren't Enough

You might ask: don't we already have semantic technologies?

Technology	What It Captures	What It Misses
OpenAPI	API shape, request/response	UI state, field dependencies
GraphQL	Data relationships	Workflow logic, validation rules
JSON Schema	Data structure, constraints	View binding, reactive behavior
MCP	Tool contracts	Inter-field dependencies, UI state
React/Vue	Component structure	Business logic (encapsulated)

Each of these is a fragment of semantics.

None provides a unified model that spans:

Data (Entity)
View (Presentation + Interaction)
Rules (Validation, Constraints)
Dependencies (Field relationships)
Actions (State transitions)
Workflows (Multi-step processes)

This is the missing layer. I call it the Semantic Execution Layer—a unified runtime where meaning is explicit, queryable, and executable.

The Architecture of Meaning

Here's how I envision the stack:

┌─────────────────────────────────────────┐
│           Natural Language              │  ← User or AI intent
├─────────────────────────────────────────┤
│         Semantic Layer                  │  ← Intent Graph / View Schema
│  ┌─────────┬──────────┬──────────────┐  │
│  │ Entity  │  View    │  Reactions   │  │
│  │ Schema  │  Schema  │  & Rules     │  │
│  └─────────┴──────────┴──────────────┘  │
├─────────────────────────────────────────┤
│         Execution Runtime               │  ← Interprets & executes
├─────────────────────────────────────────┤
│    UI Rendering    │    API Calls       │  ← Projections
└─────────────────────────────────────────┘

In this architecture:

Entity Schema defines domain truth: what data exists, what constraints apply
View Schema defines presentation and interaction: how fields are displayed, how they react
Reactions define behavior: when X changes, do Y
Execution Runtime interprets these schemas and manages state
UI is just one possible rendering—a projection of the semantic layer

The AI agent interacts with the Semantic Layer directly. It doesn't need to parse UI. It doesn't need to guess.

Inference vs. Guarantee

There's a counterargument I often hear:

"But LLMs are getting really good at understanding UI! They can look at screenshots and figure out what to do."

This is true. And it's also insufficient.

LLMs can infer meaning from UI. But inference is probabilistic. It's a prediction, not a guarantee.

For simple forms, inference works. For complex business logic—cascading dependencies, conditional validation, multi-step workflows—inference fails unpredictably.

Consider the difference:

Approach	Reliability	Debuggability	Determinism
Screenshot → LLM inference	~70-80%	Low	No
DOM parsing → heuristics	~80-85%	Medium	Partial
Semantic schema → direct access	~99%+	High	Yes

If you're building a demo, inference is fine.

If you're building production systems where AI agents execute real transactions, move real money, and affect real outcomes—you need guarantees.

Semantic schemas provide those guarantees.

UI Is Ephemeral. Semantics Endure.

Let me be clear about what I'm not saying.

I'm not saying UI is useless. UI is essential—for humans.

I'm not saying we should stop building beautiful interfaces. We should.

What I'm saying is this:

UI is one projection of a deeper structure. AI agents need access to that deeper structure directly.

Today's software is built like this:

Design the UI
Implement the logic (scattered everywhere)
Hope the AI can figure it out

Tomorrow's software should be built like this:

Define the semantic model (Entity + View + Rules)
Generate UI as one projection
Expose semantics to AI agents as another projection
Let both humans and machines interact with the same underlying truth

The semantic layer becomes the single source of truth.

UI becomes a rendering concern—important, but not foundational.

Implications for Software Architecture

If you accept this philosophy, several implications follow:

1. Schema-First Development

Define your semantic schemas before writing UI code. The schema is the specification. The UI is an implementation detail.

2. Explicit Dependencies

Never hide field dependencies in useEffect hooks. Declare them in the schema where they can be queried and reasoned about.

3. Reactions as Data

Business rules should be expressed as declarative reactions, not imperative code. on.change().when(...).do(...) is data. An if statement buried in a component is not.

4. AI-Native APIs

Expose your semantic layer to AI agents. Not just REST endpoints, but the full Intent Graph: fields, dependencies, constraints, valid actions, expected outcomes.

5. Runtime Introspection

Build execution runtimes that can answer questions like:

"What fields are currently visible?"
"What would happen if I set this value?"
"What are the preconditions for this action?"

The Road Ahead

We're at an inflection point.

For decades, software has been built for human consumption. UI was the interface, and it was sufficient.

Now, AI agents are becoming first-class users of our systems. They don't need pixels and click events. They need structured meaning.

The companies that recognize this shift early—that invest in semantic layers, declarative schemas, and AI-native architectures—will build systems that are:

Interpretable: AI agents can understand intent, not just surface
Deterministic: Behavior is predictable, not probabilistic
Evolvable: Schemas can be versioned, migrated, and extended
Debuggable: When something fails, you can trace exactly why

The companies that don't will find their AI integrations perpetually fragile—dependent on prompt engineering hacks and screenshot parsing tricks that break with every UI change.

Conclusion

UI generation was the endgame for human developers.

For AI agents, it's not enough.

What we need is a Semantic Execution Layer—a unified representation of data, view, rules, and behavior that both humans and machines can interact with.

UI is a projection of this layer. An important one, but just one of many.

As AI accelerates, the question shifts:

"What does my application look like?" → "What does my application mean?"

The first question produces screens.

The second question produces understanding.

And in the age of AI agents, understanding is everything.

UI is transient. Semantics endure.

Top comments (7)

Osama Alghanmi • Feb 20 • Edited

Great article! You've hit on something critical that most of the industry is missing.
The "Intent Graph" concept you describe machine-readable semantics that both humans and AI can reason about, is exactly what we're building with Almadar (almadar.io).
Our approach: instead of generating UI directly from prompts, we use JSON schemas as the single source of truth. These schemas define:
Entities (your data model)
State machines (behavior as "traits")
Pages (UI containers)
The UI is just a projection of the schema. Change the schema, everything updates API, database, frontend. Everything derives from the schema.
Would love your thoughts on whether declarative schemas should include execution semantics (state machines) or stay purely structural?

Jung Sungwoo • Feb 25

Thanks for reading! It’s awesome to connect with someone tackling the exact same problem. Almadar's approach of using JSON schemas as the single source of truth resonates perfectly with what I'm trying to solve.
To answer your question: Yes, I strongly believe schemas should include execution semantics — but absolutely not as opaque imperative code (like "if/else in components").
For Manifesto/MEL-style systems, the core philosophy is separating deterministic state transitions from explicit effect boundaries.

If we want AI to reason about the world safely, the schema must provide:

computed = pure/deterministic (so AI can simulate and reason before acting)
available when = explicit preconditions (Affordances)
patch = explicit state diffs (debuggable and inspectable)

Side effects (I/O) should be declared strictly at the boundaries, so an agent knows exactly what mutates the real world versus what is just a pure state computation.
Here is a quick example of how it looks in Manifesto:

computed canPublish = and(eq(doc.status, "review"), gt(len(sections), 0))

action publish() available when canPublish {
    onceIntent {
        patch doc.status = "published"
        effect persist(doc)
        effect emit("doc.published", { ... })
    }
}

So yes, schemas must carry behavior, but entirely as queryable semantics, keeping pure logic strictly separated from side effects.

How does Almadar currently handle the boundary between the state machine logic and actual side effects (like DB writes or external API calls)? Would love to hear how you're approaching that!

Osama Alghanmi • Feb 25

Great question,
This is exactly the design tension we spent a lot of time on. The short answer is: Almadar's schema carries full execution semantics, and the boundary between pure logic and side effects is structurally enforced by the schema itself, not by convention.

Why S-Expressions?

Before diving in, a quick note on the syntax you'll see below. Almadar uses S-expressions — the notation that originated in Lisp in the 1950s — for all guards, effects, and computed values. An S-expression is just a nested list where the first element is the operator:

["and", ["=", "@entity.status", "review"], [">", "@entity.count", 0]]

This is the JSON equivalent of Lisp's (and (= entity.status "review") (> entity.count 0)).

We chose S-expressions for a specific reason: they're data, not code. Because they're plain JSON arrays, they can be parsed, validated, transformed, and serialized by any language — the Rust compiler statically analyzes them, the TypeScript runtime evaluates them, and an AI agent can reason about them without needing a language-specific parser. They're homoiconic: the representation is the AST. That's what makes the entire schema machine-queryable — guards, effects, and computed values are all just data structures you can walk, validate, and simulate.

Here's how it maps to the concepts you described:

Guards = Pure/Deterministic (your `computed`)

Guards are S-expressions that evaluate to a boolean — they're pure, side-effect-free, and the runtime (or compiler) can evaluate them without touching the outside world:

"guard": ["and",
  ["=", "@entity.status", "review"],
  [">", ["count", "@entity.sections"], 0]
]

This is equivalent to your computed canPublish = and(eq(doc.status, "review"), gt(len(sections), 0)). Guards reference entity state via bindings (@entity.*, @payload.*, @user.*) and use a well-defined S-expression algebra (comparison, logic, math, collection ops). An AI agent can fully simulate guard evaluation without any I/O — the evaluator is a pure function over the binding context.

Transitions = Explicit Preconditions (your `available when`)

A transition is only valid from a specific state, triggered by a specific event, and gated by a guard:

{
  "from": "review",
  "to": "published",
  "event": "PUBLISH",
  "guard": ["and", ["=", "@entity.status", "review"], [">", ["count", "@entity.sections"], 0]],
  "effects": [
    ["set", "@entity.id", "status", "published"],
    ["persist", "update", "Document", "@entity.id", { "status": "published" }],
    ["emit", "DOC_PUBLISHED", { "docId": "@entity.id" }]
  ]
}

The state machine itself acts as the affordance system — you can only PUBLISH when you're in the review state, AND the guard passes. The compiler even has a --simulate mode that does BFS over all reachable states to verify what's possible from where, without running anything.

Effects = Declared Side Effects at the Boundary (your `patch` + `effect`)

This is where Almadar's design is very deliberate. Effects are typed, declared in the schema, and partitioned by where they execute:

Effect	Runs On	Side Effect?
`set`	Both	State mutation (your `patch`)
`render-ui`	Client only	UI update
`navigate`	Client only	Route change
`persist`	Server only	DB write
`fetch`	Server only	DB read
`call-service`	Server only	External API
`emit`	Both	Event bus publish
`notify`	Client only	Toast/notification

The runtime enforces this partition — the server ignores render-ui/navigate/notify, and the client ignores persist/fetch/call-service. So the schema declares what mutates the real world (persist, call-service), what's a pure state computation (set, guards), and what's a UI concern (render-ui). An agent reading the schema knows exactly which effects touch external systems.

Here's how your Manifesto example would look in Almadar:

{
  "from": "review",
  "to": "published",
  "event": "PUBLISH",
  "guard": ["and", ["=", "@entity.status", "review"], [">", ["count", "@entity.sections"], 0]],
  "effects": [
    ["set", "@entity.id", "status", "published"],
    ["persist", "update", "Document", "@entity.id", { "status": "published" }],
    ["emit", "DOC_PUBLISHED", { "docId": "@entity.id", "publishedAt": "@now" }]
  ]
}

The set is your patch (pure state diff). The persist and emit are your declared side effects at the boundary. The guard is your available when. All queryable, all inspectable, all simulatable.

The Dual Execution Model

What makes this work in practice is that every event flows through the same pipeline on both client and server simultaneously:

User Action → Event → Guard Check → Transition → Effects → UI Update
                                         ↓
                              Client executes: render-ui, navigate, set
                              Server executes: persist, fetch, call-service, set

The client gets instant UI feedback (optimistic), the server handles the real-world mutations, and the response reconciles them. But both sides are executing the same schema — it's not two separate codebases, it's one schema with partitioned effect execution.

Compile-Time Verification

The Rust compiler validates this entire model at compile time:

Guard expressions are parsed into a typed AST (OirExpr) and validated
Entity field bindings (@entity.field) are checked against actual entity definitions
Circular event chains (emit → listen → emit loops) are detected via DFS
Render simulation (--simulate) traces all reachable states via BFS
Headless test execution (--execute) runs generated test cases against the state machine without a browser

So the schema isn't just documentation — it's the executable specification that both the compiler and runtime enforce.

Where We Differ from Manifesto

The main structural difference is that Almadar uses state machines as its organizing principle rather than standalone actions. Every action exists within a state context — you can only PUBLISH from the review state. This gives you not just preconditions per action, but a complete graph of all possible state trajectories, which the compiler can statically analyze.

The tradeoff is that Manifesto's action declarations are more concise for simple cases. At the same time, Almadar's state machine model pays off when you have complex multi-step flows (wizards, checkout pipelines, game loops) where the state context matters.

Looking forward to discussing this some more.

Jung Sungwoo • Feb 25

Thank you for the incredibly deep and detailed explanation. Almadar's approach of structurally separating guards, transitions, and effects, while treating all execution semantics as data via S-expressions, is highly impressive. I strongly resonate with the philosophy that "the schema is the executable specification itself."

However, where I took a slightly different approach when designing Manifesto is that I do not completely equate the semantic model with the execution model.

Manifesto begins with the following axiom:

compute(Snapshot, intent) = projection

Here, the Snapshot represents a coordinate in the domain's semantic space, and the intent is a movement vector within that space. The projection (whether UI, backend, DB, etc.) is merely the computed result, not the semantics itself.

Because of this, I intended to design the semantic layer not as a fixed executable state machine, but rather as a topological semantic space that an AI can directly navigate and modify. This is why MEL (Manifesto Expression Language) is treated as a first-class citizen and made mutable at runtime. In my view, an AI shouldn't just be an agent that triggers state transitions; it needs the ability to read the structure of the state space itself, simulate it, and dynamically reconfigure it at runtime if necessary.

Almadar’s state machine model clearly offers powerful advantages for execution path integrity and compile-time static analysis. In contrast, Manifesto places the semantic space at an ontological layer prior to projection, intentionally pushing side-effects outside the boundaries of its interpretation.

Ultimately, this distinction seems to stem from a foundational difference in perspective: whether to view the system as "Semantic = Executable Graph" or "Semantic = Navigable Coordinate Space."

This leaves me with one question. Since Almadar has such an elegant S-expression-based data structure, do you envision scenarios where the schema itself is modified at runtime, or where the state machine structure is dynamically reconfigured by an AI agent? I'd love to hear your thoughts on the potential of your semantic model evolving beyond a fixed executable specification into a space that is explorable and extensible at runtime.

Osama Alghanmi • Feb 25 • Edited

Your reply gave me a much better understanding of how Manifesto is designed. I might be wrong here, but the problem with staying at the semantic level and not providing concrete, verifiable primitives in your language is that it introduces too much non-determinism (correct me if I misunderstood you). When I was constructing Almadar, my main constraints were these: agents should be able to construct this language, but at all times, a human can read what the agent produces and easily infer the intent and execution model of what it produced. Also, agents still hallucinate, so you need a fixed set of primitives with which they can work. That is why I have the state machine as a first-class object in the language itself.

Now, regarding your question, yes, the agent can modify the state machine itself (it's a .orb file), but it must pass the deterministic validator we have (via the almadar cli, validate command). But again, modifying things at runtime is not something I tested, because in my use case, I usually construct the .orb or schema file -> compile it after running all the validations -> deploy it.

I haven't really tested modifying the schema at runtime (I have a runtime built, but I also usually pass the .orb or schema file to it, and it just runs it).

But your question sparked some ideas in my head.

Awesome discussion btw. I really enjoy discussing this with you. It's nice to talk to someone who is thinking and working on similar things.

Jung Sungwoo • Feb 25

Your concern makes sense, and I think you’re pointing at a very real tension.

I agree that staying purely at the semantic level without grounding it in concrete, verifiable primitives can introduce non-determinism. That’s something I’m actively thinking about as well.

In Manifesto, my goal is not to leave semantics abstract, but to design a layer where:

The semantic space remains navigable and inspectable
The primitives are still constrained and structurally verifiable
Deterministic reasoning is preserved at the execution boundary

So in a way, I’m trying to reconcile both perspectives:
Semantic as a coordinate space, but with explicit structural invariants that prevent it from collapsing into probabilistic interpretation.

Your emphasis on state machines as first-class primitives is a very strong anchor for determinism. I’m exploring whether a similar guarantee can be achieved while keeping the semantic layer slightly more ontologically prior to execution.

Your comment actually clarifies the exact design boundary I need to formalize more rigorously.
Really appreciate the push on this.

Some comments may only be visible to logged-in visitors. Sign in to view all comments.

Why UI Generation Is Not Enough in the AI Era

The Lens Matters: Human Semantics ≠ Machine Semantics

Why AI Fails: The Interpretation Gap

What AI Agents Actually Need

A Concrete Example: Declarative View Schema

Why Existing Technologies Aren't Enough

The Architecture of Meaning

Inference vs. Guarantee

UI Is Ephemeral. Semantics Endure.

Implications for Software Architecture

1. Schema-First Development

2. Explicit Dependencies

3. Reactions as Data

4. AI-Native APIs

5. Runtime Introspection

The Road Ahead

Conclusion

Why S-Expressions?

Guards = Pure/Deterministic (your computed)

Transitions = Explicit Preconditions (your available when)

Effects = Declared Side Effects at the Boundary (your patch + effect)

The Dual Execution Model

Compile-Time Verification

Where We Differ from Manifesto

Guards = Pure/Deterministic (your `computed`)

Transitions = Explicit Preconditions (your `available when`)

Effects = Declared Side Effects at the Boundary (your `patch` + `effect`)