Why UI Generation Is Not Enough in the AI Era

Jung Sungwoo

A philosophy of software architecture from the perspective of AI agents


For the past decade, UI generation has been the flagship promise of developer productivity. Visual builders, code templates, declarative schemas—all designed to accelerate the creation of user interfaces.

But something fundamental has changed.

The users of our software are no longer just humans. AI agents now interact with applications, trigger workflows, manipulate data, and evolve systems. And here's the uncomfortable truth: UI was never designed for them.

This essay argues that in the AI era, generating UI is no longer the bottleneck. The bottleneck is the absence of machine-readable semantics. And until we address this, AI agents will continue to guess, hallucinate, and fail.


The Lens Matters: Human Semantics ≠ Machine Semantics

Before we proceed, let me be explicit about the lens through which I'm viewing software:

This is not a critique of modern frontend development.

React, Vue, and Svelte are sophisticated frameworks. They organize code into semantic components. A <UserForm /> component is meaningful—to a human developer who can read its implementation.

But an AI agent cannot open UserForm.tsx and understand:

  • What fields exist inside
  • Which fields depend on others
  • What validation rules apply
  • When a field becomes disabled
  • What side effects occur on submit

The component is a black box.

Modern UI frameworks are semantic for humans. They are opaque for machines.

This distinction is the foundation of everything that follows.


Why AI Fails: The Interpretation Gap

When an AI agent "mis-clicks" or "hallucinates form behavior," the instinct is to blame model accuracy.

My view is different.

AI fails because applications do not expose their internal semantics.

Consider what happens when an AI agent encounters a typical web form:

  1. It sees a screenshot (or DOM structure)
  2. It identifies visual elements: buttons, inputs, dropdowns
  3. It infers what those elements do based on labels and context
  4. It executes actions based on those inferences

Step 3 is where everything breaks.

The AI has no way of knowing:

  • That selecting "Digital Product" should hide the "Shipping Weight" field
  • That "Subcategory" depends on "Category" and should be disabled until Category is selected
  • That "Freight" shipping type enforces a minimum weight of 50kg
  • That "Final Price" is auto-calculated from "Price" and "Discount Rate"
  • That SKU must be validated against a server endpoint for uniqueness

These rules are scattered across:

  • useEffect hooks
  • Custom validation schemas
  • Service layer calls
  • Global state management
  • Ad-hoc conditional rendering
  • Backend business logic
  • Tribal knowledge in developers' heads

The AI sees the surface. The meaning is buried.

Any system that forces an AI to guess will inevitably break.


What AI Agents Actually Need

Let me reframe the problem.

Humans need:

  • Buttons to click
  • Inputs to fill
  • Visual feedback to understand state

AI agents need:

  • The intent behind a button
  • The semantics of a field
  • The rules governing a workflow
  • The dependencies between values
  • The effects triggered by a change

UI expresses almost none of this in a machine-readable way.

What AI agents truly need is not a screen. It's an Intent Graph—a declarative representation of:

UI Concept              Semantic Equivalent
Button                  Action with preconditions and effects
Input field             Typed semantic value with constraints
Screen                  View into a domain model
Form submission         State transition with validation
Conditional visibility  Dependency graph

The UI can be generated, regenerated, or discarded. But the Intent Graph remains constant.

That is what the AI should interact with.
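As a sketch, one hypothetical shape for an Intent Graph action might look like this (all names and fields are illustrative, not a real API):

```javascript
// Hypothetical Intent Graph node: a "submit" button expressed as an action
// with explicit preconditions and effects instead of a pixel target.
const submitAction = {
  kind: 'action',
  id: 'product.create.submit',
  preconditions: [
    { field: 'price', rule: 'gt', value: 0 },
    { field: 'sku', rule: 'serverValidated' },
  ],
  effects: [
    { type: 'stateTransition', from: 'draft', to: 'created' },
    { type: 'apiCall', endpoint: 'POST /products' },
  ],
};

// An agent answers "can I run this action?" by checking preconditions
// against the current state: a data lookup, not a guess.
function unmetPreconditions(action, state) {
  return action.preconditions.filter((p) => {
    if (p.rule === 'gt') return !(state[p.field] > p.value);
    return !(state.serverValidated || []).includes(p.field);
  });
}
```

Given `{ price: 0, serverValidated: ['sku'] }`, the agent learns that only the price rule blocks submission.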


A Concrete Example: Declarative View Schema

Let me show you what this looks like in practice.

Here's a traditional React form (simplified):

import { useState, useEffect } from 'react'

function ProductForm() {
  const [productType, setProductType] = useState('PHYSICAL')
  const [weight, setWeight] = useState(0)
  const [fulfillment, setFulfillment] = useState('STANDARD')

  // Business rule buried in a hook: FREIGHT requires a minimum weight of 50kg.
  useEffect(() => {
    if (fulfillment === 'FREIGHT' && weight < 50) {
      setWeight(50)
    }
  }, [fulfillment, weight])

  return (
    <form>
      <select value={productType} onChange={e => setProductType(e.target.value)}>
        <option value="PHYSICAL">Physical</option>
        <option value="DIGITAL">Digital</option>
      </select>

      {/* Business rule buried in JSX: digital products have no shipping weight. */}
      {productType !== 'DIGITAL' && (
        <input
          type="number"
          value={weight}
          onChange={e => setWeight(Number(e.target.value))}
        />
      )}
    </form>
  )
}

An AI agent looking at this has to:

  1. Parse the JSX
  2. Trace the useState dependencies
  3. Understand the useEffect logic
  4. Infer conditional rendering rules

Now compare this to a declarative semantic schema:

const productView = view('product-create')
  .fields(
    viewField.select('productType')
      .options(['PHYSICAL', 'DIGITAL'])
      .reaction(
        on.change()
          .when(fieldEquals('productType', 'DIGITAL'))
          .do(actions.updateProp('shippingWeight', 'hidden', true))
      ),

    viewField.numberInput('shippingWeight')
      .dependsOn('productType', 'fulfillmentType')
      .reaction(
        on.change()
          .when(and(
            fieldEquals('fulfillmentType', 'FREIGHT'),
            ['<', $.state('shippingWeight'), 50]
          ))
          .do(actions.setValue('shippingWeight', 50))
      )
  )

The difference is profound.

In the declarative schema:

  • Dependencies are explicit: dependsOn('productType', 'fulfillmentType')
  • Reactions are declared, not imperative: on.change().when(...).do(...)
  • Business rules are first-class: "FREIGHT requires minimum 50kg"
  • Visibility conditions are queryable: hidden: true when productType is DIGITAL

An AI agent reading this schema knows—without inference, without guessing—exactly how the form behaves.
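To make "queryable" concrete, here is a minimal sketch of the visibility rule as plain data that an agent can evaluate directly (the shapes are hypothetical, not a real library):

```javascript
// The form's visibility rule as plain data, mirroring the schema above.
const schema = {
  fields: {
    productType:    { type: 'select', options: ['PHYSICAL', 'DIGITAL'] },
    shippingWeight: { type: 'number', hiddenWhen: { productType: 'DIGITAL' } },
  },
};

// Deterministic visibility check: a data lookup, not an inference.
function isVisible(schema, fieldName, state) {
  const cond = schema.fields[fieldName].hiddenWhen;
  if (!cond) return true;
  return !Object.entries(cond).every(([k, v]) => state[k] === v);
}

isVisible(schema, 'shippingWeight', { productType: 'DIGITAL' });  // false
isVisible(schema, 'shippingWeight', { productType: 'PHYSICAL' }); // true
```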


Why Existing Technologies Aren't Enough

You might ask: don't we already have semantic technologies?

Technology   What It Captures              What It Misses
OpenAPI      API shape, request/response   UI state, field dependencies
GraphQL      Data relationships            Workflow logic, validation rules
JSON Schema  Data structure, constraints   View binding, reactive behavior
MCP          Tool contracts                Inter-field dependencies, UI state
React/Vue    Component structure           Business logic (encapsulated)

Each of these is a fragment of semantics.

None provides a unified model that spans:

  • Data (Entity)
  • View (Presentation + Interaction)
  • Rules (Validation, Constraints)
  • Dependencies (Field relationships)
  • Actions (State transitions)
  • Workflows (Multi-step processes)

This is the missing layer. I call it the Semantic Execution Layer—a unified runtime where meaning is explicit, queryable, and executable.


The Architecture of Meaning

Here's how I envision the stack:

┌─────────────────────────────────────────┐
│           Natural Language              │  ← User or AI intent
├─────────────────────────────────────────┤
│         Semantic Layer                  │  ← Intent Graph / View Schema
│  ┌─────────┬──────────┬──────────────┐  │
│  │ Entity  │  View    │  Reactions   │  │
│  │ Schema  │  Schema  │  & Rules     │  │
│  └─────────┴──────────┴──────────────┘  │
├─────────────────────────────────────────┤
│         Execution Runtime               │  ← Interprets & executes
├─────────────────────────────────────────┤
│    UI Rendering    │    API Calls       │  ← Projections
└─────────────────────────────────────────┘

In this architecture:

  1. Entity Schema defines domain truth: what data exists, what constraints apply
  2. View Schema defines presentation and interaction: how fields are displayed, how they react
  3. Reactions define behavior: when X changes, do Y
  4. Execution Runtime interprets these schemas and manages state
  5. UI is just one possible rendering—a projection of the semantic layer

The AI agent interacts with the Semantic Layer directly. It doesn't need to parse UI. It doesn't need to guess.
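A minimal sketch of what such an execution runtime could look like, with hypothetical reaction shapes (illustrative, not a real library):

```javascript
// Hypothetical declarative reaction: FREIGHT fulfillment enforces a 50kg minimum.
const reactions = [
  {
    on:   'fulfillmentType',
    when: (s) => s.fulfillmentType === 'FREIGHT' && s.shippingWeight < 50,
    do:   (s) => ({ ...s, shippingWeight: 50 }),
  },
];

// The runtime interprets reactions as data; no useEffect, no hidden logic.
function setField(state, field, value) {
  let next = { ...state, [field]: value };
  for (const r of reactions) {
    if (r.on === field && r.when(next)) next = r.do(next);
  }
  return next;
}

const before = { fulfillmentType: 'STANDARD', shippingWeight: 10 };
const after  = setField(before, 'fulfillmentType', 'FREIGHT');
// after.shippingWeight === 50: the rule fired deterministically.
```

Because the rules live in data rather than in component code, the same interpreter can back a rendered form, a headless API call, or an AI agent's action.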


Inference vs. Guarantee

There's a counterargument I often hear:

"But LLMs are getting really good at understanding UI! They can look at screenshots and figure out what to do."

This is true. And it's also insufficient.

LLMs can infer meaning from UI. But inference is probabilistic. It's a prediction, not a guarantee.

For simple forms, inference works. For complex business logic—cascading dependencies, conditional validation, multi-step workflows—inference fails unpredictably.

Consider the difference:

Approach                         Reliability  Debuggability  Determinism
Screenshot → LLM inference       ~70-80%      Low            No
DOM parsing → heuristics         ~80-85%      Medium         Partial
Semantic schema → direct access  ~99%+        High           Yes

If you're building a demo, inference is fine.

If you're building production systems where AI agents execute real transactions, move real money, and affect real outcomes—you need guarantees.

Semantic schemas provide those guarantees.


UI Is Ephemeral. Semantics Endure.

Let me be clear about what I'm not saying.

I'm not saying UI is useless. UI is essential—for humans.

I'm not saying we should stop building beautiful interfaces. We should.

What I'm saying is this:

UI is one projection of a deeper structure. AI agents need access to that deeper structure directly.

Today's software is built like this:

  • Design the UI
  • Implement the logic (scattered everywhere)
  • Hope the AI can figure it out

Tomorrow's software should be built like this:

  • Define the semantic model (Entity + View + Rules)
  • Generate UI as one projection
  • Expose semantics to AI agents as another projection
  • Let both humans and machines interact with the same underlying truth

The semantic layer becomes the single source of truth.

UI becomes a rendering concern—important, but not foundational.


Implications for Software Architecture

If you accept this philosophy, several implications follow:

1. Schema-First Development

Define your semantic schemas before writing UI code. The schema is the specification. The UI is an implementation detail.

2. Explicit Dependencies

Never hide field dependencies in useEffect hooks. Declare them in the schema where they can be queried and reasoned about.

3. Reactions as Data

Business rules should be expressed as declarative reactions, not imperative code. on.change().when(...).do(...) is data. An if statement buried in a component is not.

4. AI-Native APIs

Expose your semantic layer to AI agents. Not just REST endpoints, but the full Intent Graph: fields, dependencies, constraints, valid actions, expected outcomes.

5. Runtime Introspection

Build execution runtimes that can answer questions like:

  • "What fields are currently visible?"
  • "What would happen if I set this value?"
  • "What are the preconditions for this action?"

The Road Ahead

We're at an inflection point.

For decades, software has been built for human consumption. UI was the interface, and it was sufficient.

Now, AI agents are becoming first-class users of our systems. They don't need pixels and click events. They need structured meaning.

The companies that recognize this shift early—that invest in semantic layers, declarative schemas, and AI-native architectures—will build systems that are:

  • Interpretable: AI agents can understand intent, not just surface
  • Deterministic: Behavior is predictable, not probabilistic
  • Evolvable: Schemas can be versioned, migrated, and extended
  • Debuggable: When something fails, you can trace exactly why

The companies that don't will find their AI integrations perpetually fragile—dependent on prompt engineering hacks and screenshot parsing tricks that break with every UI change.


Conclusion

UI generation was the endgame for human developers.

For AI agents, it's not enough.

What we need is a Semantic Execution Layer—a unified representation of data, view, rules, and behavior that both humans and machines can interact with.

UI is a projection of this layer. An important one, but just one of many.

As AI accelerates, the question shifts:

"What does my application look like?" → "What does my application mean?"

The first question produces screens.

The second question produces understanding.

And in the age of AI agents, understanding is everything.


UI is transient. Semantics endure.

Top comments (7)

Osama Alghanmi • Edited

Great article! You've hit on something critical that most of the industry is missing.

The "Intent Graph" concept you describe (machine-readable semantics that both humans and AI can reason about) is exactly what we're building with Almadar (almadar.io).

Our approach: instead of generating UI directly from prompts, we use JSON schemas as the single source of truth. These schemas define:

  • Entities (your data model)
  • State machines (behavior as "traits")
  • Pages (UI containers)

The UI is just a projection of the schema. Change the schema, and everything updates: API, database, frontend. Everything derives from the schema.
Would love your thoughts on whether declarative schemas should include execution semantics (state machines) or stay purely structural?

Jung Sungwoo

Thanks for reading! It’s awesome to connect with someone tackling the exact same problem. Almadar's approach of using JSON schemas as the single source of truth resonates perfectly with what I'm trying to solve.
To answer your question: Yes, I strongly believe schemas should include execution semantics — but absolutely not as opaque imperative code (like "if/else in components").
For Manifesto/MEL-style systems, the core philosophy is separating deterministic state transitions from explicit effect boundaries.

If we want AI to reason about the world safely, the schema must provide:

  • computed = pure/deterministic (so AI can simulate and reason before acting)
  • available when = explicit preconditions (Affordances)
  • patch = explicit state diffs (debuggable and inspectable)

Side effects (I/O) should be declared strictly at the boundaries, so an agent knows exactly what mutates the real world versus what is just a pure state computation.
Here is a quick example of how it looks in Manifesto:

computed canPublish = and(eq(doc.status, "review"), gt(len(sections), 0))

action publish() available when canPublish {
    onceIntent {
        patch doc.status = "published"
        effect persist(doc)
        effect emit("doc.published", { ... })
    }
}

So yes, schemas must carry behavior, but entirely as queryable semantics, keeping pure logic strictly separated from side effects.

How does Almadar currently handle the boundary between the state machine logic and actual side effects (like DB writes or external API calls)? Would love to hear how you're approaching that!

Osama Alghanmi

Great question.
This is exactly the design tension we spent a lot of time on. The short answer is: Almadar's schema carries full execution semantics, and the boundary between pure logic and side effects is structurally enforced by the schema itself, not by convention.

Why S-Expressions?

Before diving in, a quick note on the syntax you'll see below. Almadar uses S-expressions — the notation that originated in Lisp in the 1950s — for all guards, effects, and computed values. An S-expression is just a nested list where the first element is the operator:

["and", ["=", "@entity.status", "review"], [">", "@entity.count", 0]]

This is the JSON equivalent of Lisp's (and (= entity.status "review") (> entity.count 0)).

We chose S-expressions for a specific reason: they're data, not code. Because they're plain JSON arrays, they can be parsed, validated, transformed, and serialized by any language — the Rust compiler statically analyzes them, the TypeScript runtime evaluates them, and an AI agent can reason about them without needing a language-specific parser. They're homoiconic: the representation is the AST. That's what makes the entire schema machine-queryable — guards, effects, and computed values are all just data structures you can walk, validate, and simulate.

Here's how it maps to the concepts you described:

Guards = Pure/Deterministic (your computed)

Guards are S-expressions that evaluate to a boolean — they're pure, side-effect-free, and the runtime (or compiler) can evaluate them without touching the outside world:

"guard": ["and",
  ["=", "@entity.status", "review"],
  [">", ["count", "@entity.sections"], 0]
]

This is equivalent to your computed canPublish = and(eq(doc.status, "review"), gt(len(sections), 0)). Guards reference entity state via bindings (@entity.*, @payload.*, @user.*) and use a well-defined S-expression algebra (comparison, logic, math, collection ops). An AI agent can fully simulate guard evaluation without any I/O — the evaluator is a pure function over the binding context.
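To make that concrete, here is a toy evaluator for just the operators used above (a sketch, not the actual Almadar runtime):

```javascript
// Toy S-expression guard evaluator: a pure function over a binding context.
// Supports only the operators used in the example above.
function evalSExpr(expr, bindings) {
  if (typeof expr === 'string' && expr.startsWith('@')) {
    // Resolve bindings like "@entity.status" against the context object.
    return expr.slice(1).split('.').reduce((o, k) => o?.[k], bindings);
  }
  if (!Array.isArray(expr)) return expr; // literal value
  const [op, ...args] = expr;
  const vals = args.map((a) => evalSExpr(a, bindings));
  switch (op) {
    case 'and':   return vals.every(Boolean);
    case '=':     return vals[0] === vals[1];
    case '>':     return vals[0] > vals[1];
    case 'count': return vals[0].length;
    default: throw new Error(`unknown operator: ${op}`);
  }
}

const guard = ['and',
  ['=', '@entity.status', 'review'],
  ['>', ['count', '@entity.sections'], 0]];

evalSExpr(guard, { entity: { status: 'review', sections: ['intro'] } }); // true
```

No I/O anywhere: the same function runs in a compiler, a runtime, or an agent's simulation loop.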

Transitions = Explicit Preconditions (your available when)

A transition is only valid from a specific state, triggered by a specific event, and gated by a guard:

{
  "from": "review",
  "to": "published",
  "event": "PUBLISH",
  "guard": ["and", ["=", "@entity.status", "review"], [">", ["count", "@entity.sections"], 0]],
  "effects": [
    ["set", "@entity.id", "status", "published"],
    ["persist", "update", "Document", "@entity.id", { "status": "published" }],
    ["emit", "DOC_PUBLISHED", { "docId": "@entity.id" }]
  ]
}

The state machine itself acts as the affordance system — you can only PUBLISH when you're in the review state, AND the guard passes. The compiler even has a --simulate mode that does BFS over all reachable states to verify what's possible from where, without running anything.

Effects = Declared Side Effects at the Boundary (your patch + effect)

This is where Almadar's design is very deliberate. Effects are typed, declared in the schema, and partitioned by where they execute:

Effect        Runs On      Side Effect?
set           Both         State mutation (your patch)
render-ui     Client only  UI update
navigate      Client only  Route change
persist       Server only  DB write
fetch         Server only  DB read
call-service  Server only  External API
emit          Both         Event bus publish
notify        Client only  Toast/notification
The runtime enforces this partition — the server ignores render-ui/navigate/notify, and the client ignores persist/fetch/call-service. So the schema declares what mutates the real world (persist, call-service), what's a pure state computation (set, guards), and what's a UI concern (render-ui). An agent reading the schema knows exactly which effects touch external systems.
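A toy sketch of how that partition can be enforced mechanically (illustrative, not the actual runtime):

```javascript
// Hypothetical effect-side table: the schema declares where each effect runs.
const EFFECT_SIDE = {
  'set': 'both', 'emit': 'both',
  'render-ui': 'client', 'navigate': 'client', 'notify': 'client',
  'persist': 'server', 'fetch': 'server', 'call-service': 'server',
};

// Each side executes only the effects assigned to it (or to both).
function effectsFor(side, effects) {
  return effects.filter(([op]) => {
    const s = EFFECT_SIDE[op];
    return s === 'both' || s === side;
  });
}

const effects = [
  ['set', '@entity.id', 'status', 'published'],
  ['persist', 'update', 'Document', '@entity.id', { status: 'published' }],
  ['emit', 'DOC_PUBLISHED', { docId: '@entity.id' }],
];

effectsFor('client', effects).map(([op]) => op); // ['set', 'emit']
```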

Here's how your Manifesto example would look in Almadar:

{
  "from": "review",
  "to": "published",
  "event": "PUBLISH",
  "guard": ["and", ["=", "@entity.status", "review"], [">", ["count", "@entity.sections"], 0]],
  "effects": [
    ["set", "@entity.id", "status", "published"],
    ["persist", "update", "Document", "@entity.id", { "status": "published" }],
    ["emit", "DOC_PUBLISHED", { "docId": "@entity.id", "publishedAt": "@now" }]
  ]
}

The set is your patch (pure state diff). The persist and emit are your declared side effects at the boundary. The guard is your available when. All queryable, all inspectable, all simulatable.

The Dual Execution Model

What makes this work in practice is that every event flows through the same pipeline on both client and server simultaneously:

User Action → Event → Guard Check → Transition → Effects → UI Update
                                         ↓
                              Client executes: render-ui, navigate, set
                              Server executes: persist, fetch, call-service, set

The client gets instant UI feedback (optimistic), the server handles the real-world mutations, and the response reconciles them. But both sides are executing the same schema — it's not two separate codebases, it's one schema with partitioned effect execution.

Compile-Time Verification

The Rust compiler validates this entire model at compile time:

  • Guard expressions are parsed into a typed AST (OirExpr) and validated
  • Entity field bindings (@entity.field) are checked against actual entity definitions
  • Circular event chains (emit → listen → emit loops) are detected via DFS
  • Render simulation (--simulate) traces all reachable states via BFS
  • Headless test execution (--execute) runs generated test cases against the state machine without a browser

So the schema isn't just documentation — it's the executable specification that both the compiler and runtime enforce.

Where We Differ from Manifesto

The main structural difference is that Almadar uses state machines as its organizing principle rather than standalone actions. Every action exists within a state context — you can only PUBLISH from the review state. This gives you not just preconditions per action, but a complete graph of all possible state trajectories, which the compiler can statically analyze.

The tradeoff is that Manifesto's action declarations are more concise for simple cases, while Almadar's state machine model pays off when you have complex multi-step flows (wizards, checkout pipelines, game loops) where the state context matters.

Looking forward to discussing this some more.

Jung Sungwoo

Thank you for the incredibly deep and detailed explanation. Almadar's approach of structurally separating guards, transitions, and effects, while treating all execution semantics as data via S-expressions, is highly impressive. I strongly resonate with the philosophy that "the schema is the executable specification itself."

However, where I took a slightly different approach when designing Manifesto is that I do not completely equate the semantic model with the execution model.

Manifesto begins with the following axiom:

compute(Snapshot, intent) = projection

Here, the Snapshot represents a coordinate in the domain's semantic space, and the intent is a movement vector within that space. The projection (whether UI, backend, DB, etc.) is merely the computed result, not the semantics itself.

Because of this, I intended to design the semantic layer not as a fixed executable state machine, but rather as a topological semantic space that an AI can directly navigate and modify. This is why MEL (Manifesto Expression Language) is treated as a first-class citizen and made mutable at runtime. In my view, an AI shouldn't just be an agent that triggers state transitions; it needs the ability to read the structure of the state space itself, simulate it, and dynamically reconfigure it at runtime if necessary.

Almadar’s state machine model clearly offers powerful advantages for execution path integrity and compile-time static analysis. In contrast, Manifesto places the semantic space at an ontological layer prior to projection, intentionally pushing side-effects outside the boundaries of its interpretation.

Ultimately, this distinction seems to stem from a foundational difference in perspective: whether to view the system as "Semantic = Executable Graph" or "Semantic = Navigable Coordinate Space."

This leaves me with one question. Since Almadar has such an elegant S-expression-based data structure, do you envision scenarios where the schema itself is modified at runtime, or where the state machine structure is dynamically reconfigured by an AI agent? I'd love to hear your thoughts on the potential of your semantic model evolving beyond a fixed executable specification into a space that is explorable and extensible at runtime.

Osama Alghanmi • Edited

Your reply gave me a much better understanding of how Manifesto is designed. I might be wrong here, but the problem with staying at the semantic level and not providing concrete, verifiable primitives in your language is that it introduces too much non-determinism (correct me if I misunderstood you). When I was constructing Almadar, my main constraints were these: agents should be able to construct this language, but at all times, a human can read what the agent produces and easily infer the intent and execution model of what it produced. Also, agents still hallucinate, so you need a fixed set of primitives with which they can work. That is why I have the state machine as a first-class object in the language itself.

Now, regarding your question: yes, an agent can modify the state machine itself (it's a .orb file), but the result must pass our deterministic validator (the validate command in the Almadar CLI). That said, modifying the schema at runtime isn't something I've tested. In my use case I construct the .orb schema file, compile it after running all the validations, and deploy it; the runtime I've built just takes the compiled schema and runs it.

But your question sparked some ideas in my head.

Awesome discussion btw. I really enjoy discussing this with you. It's nice to talk to someone who is thinking and working on similar things.

Jung Sungwoo

Your concern makes sense, and I think you’re pointing at a very real tension.

I agree that staying purely at the semantic level without grounding it in concrete, verifiable primitives can introduce non-determinism. That’s something I’m actively thinking about as well.

In Manifesto, my goal is not to leave semantics abstract, but to design a layer where:

  • The semantic space remains navigable and inspectable
  • The primitives are still constrained and structurally verifiable
  • Deterministic reasoning is preserved at the execution boundary

So in a way, I’m trying to reconcile both perspectives:
Semantic as a coordinate space, but with explicit structural invariants that prevent it from collapsing into probabilistic interpretation.

Your emphasis on state machines as first-class primitives is a very strong anchor for determinism. I’m exploring whether a similar guarantee can be achieved while keeping the semantic layer slightly more ontologically prior to execution.

Your comment actually clarifies the exact design boundary I need to formalize more rigorously.
Really appreciate the push on this.
