Nishchya Verma

Designing a UI That AI Can Actually Understand (CortexUI Deep Dive)

CortexUI is an AI-native interface system that turns UI into a contract for intelligent agents. You can explore the source on GitHub and browse the docs and demos at cortexui.llcortex.ai.

If you want AI to operate a UI reliably, you have to stop making it guess.

That is the shortest possible explanation of CortexUI.

The longer explanation is more interesting.

Most web automation today works by inference. The system looks at the DOM, searches for a likely button, reads labels, tracks layout, maybe uses screenshots, and tries to decide what to do next. It works until the interface changes. Then the guessing starts to fall apart.

CortexUI fixes that by giving the UI its own explicit machine-readable layer.

Here is the simplest example:

<button
  data-ai-id="save-profile"
  data-ai-role="action"
  data-ai-action="save-profile"
  data-ai-state="idle"
>
  Save Profile
</button>

This looks almost trivial.

It is not.

Those four attributes turn a generic button into a deterministic action contract.

data-ai-id="save-profile" gives the element a stable identity.

This is how agents, tests, and runtime tools refer to the exact element across renders and redesigns.

data-ai-role="action" tells the system what kind of thing this is.

Not just “an element.” An action trigger.

data-ai-action="save-profile" declares intent.

This is the operation being performed, independent of the visible button label.

data-ai-state="idle" exposes the current machine state.

If it becomes loading, an agent knows to wait. If it becomes error, an agent knows the operation failed. If it becomes success, it knows the action completed.

That is the contract layer in miniature.
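To make the contract concrete, here is a minimal sketch of how those four attributes map to a structured action object. In a browser you would read them from `element.dataset`; here a plain object stands in for the element so the sketch runs anywhere, and the helper name `readActionContract` is illustrative, not part of CortexUI's published API.

```javascript
// Turn data-ai-* attributes into an action contract object.
// dataset mirrors what element.dataset exposes in the browser:
// data-ai-id -> aiId, data-ai-role -> aiRole, and so on.
function readActionContract(dataset) {
  return {
    id: dataset.aiId,
    role: dataset.aiRole,
    action: dataset.aiAction,
    state: dataset.aiState,
  };
}

// Stand-in for the <button> above.
const saveButton = {
  aiId: "save-profile",
  aiRole: "action",
  aiAction: "save-profile",
  aiState: "idle",
};

const contract = readActionContract(saveButton);
// contract.state is "idle", so an agent knows the action can be triggered now.
```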

Once a page is annotated like this, CortexUI’s runtime can read those attributes and expose a structured API in the browser.

The basic idea looks like this:

window.__CORTEX_UI__.getAvailableActions()

And the result is not a raw element tree. It is a usable representation of what the UI can do right now.

Something like:

[
  {
    id: "save-profile",
    action: "save-profile",
    state: "idle",
    section: "profile-form"
  }
]
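Given a result in that shape, an agent's decision logic becomes a plain filter rather than a guess. A small sketch, using the array shape from the example above; `isTriggerable` is an illustrative helper, not a CortexUI API:

```javascript
// A getAvailableActions()-style result, as in the example above.
const actions = [
  { id: "save-profile", action: "save-profile", state: "idle", section: "profile-form" },
  { id: "delete-user", action: "delete-user", state: "disabled", section: "danger-zone" },
];

// Only idle actions are safe to fire; loading, disabled, or error
// mean the agent should wait or recover instead.
function isTriggerable(a) {
  return a.state === "idle";
}

const triggerable = actions.filter(isTriggerable).map((a) => a.id);
// → ["save-profile"]
```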

That changes the interaction model completely.

Instead of asking:
“Can I find a button that probably means save?”

An agent can ask:
“What actions are currently available?”
“Is save-profile one of them?”
“What state is it in?”
“What screen am I on?”
“What form fields are required before I trigger it?”

That is the difference between heuristic automation and deterministic automation.

A More Realistic Example

CortexUI gets more useful as soon as you move beyond a single button.

Imagine a profile editing flow.

You have:

  • a form with fields
  • a save action
  • a status banner
  • a table of related records
  • a confirmation dialog for destructive actions

A CortexUI-style screen can declare all of that.

A form might look like this:

<form data-ai-role="form" data-ai-id="edit-profile-form">
  <input
    data-ai-role="field"
    data-ai-id="profile-name"
    data-ai-field-type="text"
    data-ai-required="true"
  />

  <input
    data-ai-role="field"
    data-ai-id="profile-email"
    data-ai-field-type="email"
    data-ai-required="true"
  />

  <button
    data-ai-role="action"
    data-ai-id="save-profile"
    data-ai-action="save-profile"
    data-ai-state="idle"
  >
    Save Profile
  </button>
</form>

Now the runtime can return the form schema directly:

window.__CORTEX_UI__.getFormSchema("edit-profile-form")

Which means an agent can learn:

  • which fields exist
  • what type each field expects
  • which fields are required
  • what action submits the form

No label scraping. No input guessing. No hardcoded selectors.

That is especially important because forms are where AI often fails in production. A human can see that one field is for billing zip code and another is for email. An agent looking at generic markup often cannot do that reliably. CortexUI removes that ambiguity by encoding the schema directly into the interface.
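The payoff of a declared schema is that an agent can validate its inputs before ever touching the submit action. A sketch under one assumption: the schema object shape below is inferred from the markup above, and the actual `getFormSchema()` return shape may differ.

```javascript
// Hypothetical shape of a getFormSchema("edit-profile-form") result,
// derived from the data-ai-* attributes in the form markup above.
const schema = {
  id: "edit-profile-form",
  fields: [
    { id: "profile-name", type: "text", required: true },
    { id: "profile-email", type: "email", required: true },
  ],
  submitAction: "save-profile",
};

// Which required fields are still empty, given the agent's values so far?
function missingRequiredFields(schema, values) {
  return schema.fields
    .filter((f) => f.required && !values[f.id])
    .map((f) => f.id);
}

const values = { "profile-name": "Ada Lovelace" }; // email not filled in yet
const missing = missingRequiredFields(schema, values);
// → ["profile-email"] — the agent knows not to trigger save-profile yet.
```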

Runtime: The Missing Bridge

The runtime is one of the most practical parts of the system.

CortexUI installs a browser API that acts like an inspection layer over the UI:

window.__CORTEX_UI__.getScreenContext()
window.__CORTEX_UI__.getAvailableActions()
window.__CORTEX_UI__.getFormSchema("edit-profile-form")
window.__CORTEX_UI__.getVisibleEntities()
window.__CORTEX_UI__.getRecentEvents()

That gives an agent a real operating loop.

  1. Orient itself with getScreenContext()
  2. Discover available operations with getAvailableActions()
  3. Read form structure with getFormSchema()
  4. Inspect visible business objects with getVisibleEntities()
  5. Verify outcomes with getRecentEvents()

This matters because AI agents do not just need to act. They need to know whether acting worked.

A typical sequence might look like this:

  • agent sees it is on settings
  • finds save-profile
  • notices the button is idle
  • fills required fields from getFormSchema()
  • triggers the action
  • observes state move to loading
  • waits for an action_completed event
  • confirms success instead of assuming it

That is how you remove guesswork.
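That sequence can be sketched against a mocked runtime. Everything below (the mock's internals, the synchronous state transition, the `triggerAction` helper, and the `action_completed` event name) is illustrative; only the query method names come from the examples in this post.

```javascript
// A tiny mock of the runtime, just enough to walk the observe-act-verify loop.
function makeMockRuntime() {
  let state = "idle";
  const events = [];
  return {
    getAvailableActions: () => [{ id: "save-profile", action: "save-profile", state }],
    // Hypothetical trigger helper. In reality the idle -> loading -> success
    // transition is asynchronous; here it is collapsed so the sketch runs anywhere.
    triggerAction: (id) => {
      state = "loading";
      state = "success";
      events.push({ type: "action_completed", action: id });
    },
    getRecentEvents: () => events,
  };
}

function runSaveFlow(runtime) {
  // Orient: is save-profile available, and is it idle?
  const save = runtime.getAvailableActions().find((a) => a.id === "save-profile");
  if (!save || save.state !== "idle") return "not-ready";

  // Act.
  runtime.triggerAction(save.id);

  // Verify the outcome instead of assuming it.
  const done = runtime
    .getRecentEvents()
    .some((e) => e.type === "action_completed" && e.action === save.id);
  return done ? "confirmed" : "unverified";
}

const result = runSaveFlow(makeMockRuntime());
// → "confirmed"
```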

Demo Scenario: Form, Table, Dialog

A useful way to picture CortexUI is to imagine a realistic admin screen.

At the top is a user form.

In the middle is a table of related entities.

At the bottom is a destructive action that opens a confirm dialog.

In traditional UI automation, each of those pieces would require different heuristics. The form would be inferred from labels. The table rows would be identified by structure. The dialog would be detected visually. Success and failure would be inferred from DOM changes.

In CortexUI, all three pieces can expose the same kind of contract.

The table can declare entity type and row identity:

<table data-ai-role="table" data-ai-id="users-table">
  <tr data-ai-entity="user" data-ai-entity-id="user-42">
    ...
  </tr>
</table>

That means an agent can reason about the row as a real user, not just “the second row in a table.”
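A short sketch of what that buys you: the agent resolves a business object by type and id, not by row position. The entity objects below mimic what `getVisibleEntities()` might return for the table above; the exact shape is an assumption.

```javascript
// Hypothetical getVisibleEntities()-style result for the users table above.
const visibleEntities = [
  { type: "user", id: "user-42", source: "users-table" },
  { type: "user", id: "user-43", source: "users-table" },
];

// Resolve an entity by type and identity, independent of row order.
function findEntity(entities, type, id) {
  return entities.find((e) => e.type === type && e.id === id) ?? null;
}

const target = findEntity(visibleEntities, "user", "user-42");
// The agent can act on user-42 even if the table is re-sorted or restyled.
```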

A confirm dialog can declare that it is a modal and that it is expanded:

<div
  data-ai-role="modal"
  data-ai-id="delete-user-dialog"
  data-ai-state="expanded"
>
  ...
</div>

That means the system knows it is in a confirmation state. It does not have to infer that from a darkened backdrop and a centered box.

Once you start seeing UI this way, the benefit compounds quickly. The same interface becomes understandable to developers, test suites, runtime tooling, and AI agents using the same contract vocabulary.

What the Developer Experience Looks Like

The practical developer story is straightforward.

You can install the packages:

pnpm add @cortexui/components @cortexui/runtime @cortexui/ai-contract

Use CortexUI components such as ActionButton, FormField, DataTable, StatusBanner, and ConfirmDialog.

Or, if you need lower-level control, use the primitives and add the contract manually.

The important shift is that state is no longer just visual. If a save operation is loading, your component should expose that in aiState. If a dialog is open, it should say so. If a field is required, the contract should say so. The UI becomes self-describing by design.
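One way to picture "state is no longer just visual" is a helper that derives the contract attributes from component state, so the markup can never drift from the machine state. This is a hypothetical helper, not a CortexUI component API; only the attribute names come from the examples in this post.

```javascript
// Derive the data-ai-* attributes an ActionButton-style component should
// render from its current state. If the component re-renders while loading,
// the contract updates with it.
function contractAttributes({ id, action, aiState }) {
  return {
    "data-ai-id": id,
    "data-ai-role": "action",
    "data-ai-action": action,
    "data-ai-state": aiState, // e.g. "idle" | "loading" | "success" | "error"
  };
}

// While a save is in flight, the same call site yields an updated contract:
const attrs = contractAttributes({
  id: "save-profile",
  action: "save-profile",
  aiState: "loading",
});
// attrs["data-ai-state"] is "loading" — agents see the real machine state.
```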

That ends up helping more than just AI.

Tests get more stable because they can target semantic identifiers instead of class names.

Debugging gets easier because the runtime can tell you what the page believes is currently true.

Automation gets less fragile because the meaning of the interface survives restyles and refactors.

And AI systems stop operating your app like a tourist reading street signs in a language they barely know.

Why This Is Powerful

The deepest value of CortexUI is not that it adds new components.

It changes the role of the interface.

Instead of being a surface that must be interpreted, the UI becomes a system that can declare itself.

That matters for AI today because the current alternative is a pile of heuristics.

But it matters even more in the long run because software is clearly moving toward a world where interfaces are used by multiple kinds of operators: humans, copilots, agents, tests, automations, and devtools.

Those operators need more than visuals.

They need a contract.

CortexUI treats that contract as a first-class part of interface design.

And once you see the browser that way, it becomes hard to go back.

To explore the system in detail, visit the CortexUI project on GitHub and the website at cortexui.llcortex.ai.

For a quick demo of CortexUI in action, see the demos at cortexui.llcortex.ai.
