There's a quiet shift happening in how AI tools interact with users, and most developers haven't noticed yet. Just one week after Anthropic published the MCP Apps spec, OpenAI launched a huge marketing campaign around OpenClaw and soaked up all the attention, at least for a while.
For the past two years, every AI assistant has been stuck behind the same interface: a text box. You ask a question, you get text back. Maybe some markdown. Maybe a code block. Maybe an image.
Claude, GPT, Copilot, Gemini, every local model: they all render into the same narrow pipe. MCP Apps changes that.
## What MCP Apps actually is
MCP Apps is a protocol extension that lets MCP tool results include interactive UIs. Actual interactive components running inside the AI host's sandbox.
The mechanics are straightforward:
- Your MCP tool returns a `structuredContent` payload alongside the normal text `content`
- The host loads a `ui://` resource (an HTML page you provide) into a sandboxed iframe
- The host forwards the tool result to your iframe via `postMessage` using a JSON-RPC 2.0 protocol (`ui/*` methods)
- Your renderer mounts the UI inside the iframe
- The UI can call tools back, send messages to the conversation, request display mode changes, and resize itself
The text content still serves the LLM's reasoning and hosts without rendering support. The UI is a progressive enhancement. Let's be honest: natural language is not always the most convenient or fastest way to express what you want. A click or a tap on a button can be much faster.
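To make that concrete, here's a minimal sketch of the wire shape a tool result might take. The patient data and field names are hypothetical; only the split between text `content` and `structuredContent` comes from the spec as described above:

```typescript
// Hypothetical tool result: the text serves the LLM, while
// structuredContent feeds the UI. All field values are illustrative.
const result = {
  content: [
    { type: "text", text: "Found 2 active patients." },
  ],
  structuredContent: {
    patients: [
      { name: "Alice", status: "active" },
      { name: "Bob", status: "active" },
    ],
  },
};
```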
## The spec behind it
The MCP Apps protocol (spec version 2026-01-26) is an official extension to MCP maintained by Anthropic. It defines a JSON-RPC 2.0 message layer over postMessage between a sandboxed iframe and its host. The spec covers:
- **Handshake**: `ui/initialize` request/response with protocol version negotiation, app capabilities, and host capabilities
- **Lifecycle**: `tool-input`, `tool-result`, `tool-cancelled`, `tool-input-partial` (streaming), `host-context-changed`, `resource-teardown`
- **Actions**: `ui/open-link`, `ui/message`, `ui/request-display-mode`, `ui/update-model-context`, `tools/call`
- **Sizing**: `ui/notifications/size-changed` (reactive, from app to host), `ui/notifications/preferred-size` (declarative hints)
- **Security**: CSP via `_meta.ui.csp` on resource content items, Permission Policy for camera/microphone/geolocation/clipboard
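As a rough illustration of that message layer, here's what the handshake could look like from inside the iframe. Only the JSON-RPC 2.0 framing, the `ui/initialize` method, and the spec version come from the spec; the parameter field names are assumptions:

```typescript
// Sketch of the ui/initialize handshake from the app side.
// Everything rides on postMessage as JSON-RPC 2.0.
window.parent.postMessage(
  {
    jsonrpc: "2.0",
    id: 1,
    method: "ui/initialize",
    params: {
      protocolVersion: "2026-01-26",
      capabilities: {}, // app capabilities (field names assumed)
    },
  },
  "*" // a real app should pin the host's origin instead of "*"
);

window.addEventListener("message", (event) => {
  const msg = event.data;
  if (msg?.jsonrpc === "2.0" && msg.id === 1 && msg.result) {
    // Host replied: version negotiated, host capabilities known.
    console.log("host capabilities:", msg.result);
  }
});
```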
The spec is deliberately renderer-agnostic. It defines how the host and the iframe talk to each other. What you put inside the iframe is entirely your choice. The host doesn't parse your component tree or validate your DOM structure. It sends you JSON-RPC messages and expects JSON-RPC messages back.
This is a deliberate design decision and the reason multiple rendering approaches can coexist. The Anthropic ext-apps SDK, a prefab renderer, a raw React app, a Svelte component -- all valid. The protocol doesn't care.
Current host implementations:
| Host | Status | Notes |
|---|---|---|
| VS Code Copilot Chat | Shipping | Full spec support, CSP via `<meta>` tag, `acquireVsCodeApi()` transport |
| Claude Desktop | Shipping | Full spec support, CSP via HTTP headers on sandboxed origin (`{hash}.claudemcpcontent.com`). Can send results before init completes |
| ChatGPT | Shipping | Full spec support, CSP via HTTP headers on sandboxed origin (`{slug}.oaiusercontent.com`) |
Three independent hosts shipping the same protocol make it clear: this is not a proposal, it's infrastructure.
## Why this matters
Here's the thing I think people are going to underestimate: this turns every MCP server into a full-stack application.
An MCP server is already a backend that exposes typed tools over stdio or HTTP. It already has access to databases, APIs, file systems. The only thing missing was a frontend.
Now it has one that lives inside the AI conversation, gets tool arguments and results pushed to it automatically, can call tools back on its own MCP server via the host, inherits the host's theme for free, and works across VS Code, Claude Desktop, ChatGPT, and anything else that implements the spec.
That last point is the critical one. Write once, render everywhere. The protocol is the same across hosts. The sandboxing model is the same. The postMessage bridge is the same.
## Two different things driven by the same idea
MCP Apps is going to do to AI tooling what PWAs were supposed to do to mobile apps. PWAs were competing against native apps that users already loved. MCP Apps are filling a vacuum. There is no existing standard for rendering interactive UIs inside AI conversations. The alternative is pasting JSON into the chat. MCP Apps are going to be "AI Native Apps".
The bar is low. And the protocol is good enough.
## What it looks like in practice
Here's a complete MCP tool that returns an interactive patient list with search, sorting, and click-to-view detail:
```typescript
import { display, autoTable, Column, H1 } from '@maxhealth.tech/prefab'

async function listPatients(args) {
  const patients = await db.query('SELECT * FROM patients')
  return display(
    Column({ gap: 6 }, [
      H1('Patients'),
      autoTable(patients),
    ]),
    { title: 'Patient List' }
  )
}
```
`display()` wraps the component tree into the MCP wire format, and the host renders it. The table is interactive -- search, sort, row selection -- and you wrote no frontend code.
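For intuition, the payload `display()` emits might look something like this. The exact schema is a guess, not the library's documented output; only the `$prefab` wire-format name (mentioned later in this post) and the content/structuredContent split are taken from the text:

```typescript
// Hypothetical illustration of the wire payload display() might
// produce: normal MCP text content plus a $prefab component tree
// for the renderer. The schema shown here is assumed.
const payload = {
  content: [{ type: "text", text: "Patients: 3 rows" }],
  structuredContent: {
    $prefab: {
      type: "Column",
      props: { gap: 6 },
      children: [
        { type: "H1", props: { text: "Patients" } },
        { type: "AutoTable", props: { rows: [] /* patient rows */ } },
      ],
    },
  },
};
```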
But you don't even need to compose components. The auto-renderers infer the right UI from your data shape:
```typescript
import { display } from '@maxhealth.tech/prefab/mcp'
import { autoTable, autoChart, autoForm, autoMetrics, autoDetail } from '@maxhealth.tech/prefab'

// Array of objects -> searchable, sortable table
return display(autoTable(patients))

// Array with numeric fields -> line/bar chart with axes and tooltips
return display(autoChart(salesData, { xAxis: 'month', title: 'Revenue' }))

// Schema fields -> validated form that submits back to an MCP tool
return display(autoForm([
  { name: 'name', label: 'Name', required: true },
  { name: 'email', label: 'Email', type: 'email' },
  { name: 'role', label: 'Role', type: 'select', options: ['admin', 'user'] },
], 'save_user'))

// Key-value object -> formatted detail card
return display(autoDetail({ name: 'Alice', status: 'active', lastSeen: '2026-04-30' }))

// Object with numeric values -> metric cards with labels
return display(autoMetrics({ patients: 1284, appointments: 47, waitTime: '12min' }))
```
Each auto-renderer picks columns, axes, labels, and formatting based on what it finds in the data. You pass an array or object, you get a production-quality UI back. When you outgrow them, you drop down to the component API and build exactly what you want.
And when you want to tweak the look without writing full custom components, every element accepts utility classes:
```typescript
import { display, Column, H1, Badge, autoTable } from '@maxhealth.tech/prefab'

return display(
  Column({ gap: 6, cssClass: 'p-6 max-w-4xl' }, [
    H1('Patient Dashboard'),
    Badge({ label: '47 active', variant: 'success', cssClass: 'text-sm' }),
    autoTable(patients, { cssClass: 'rounded-lg shadow-md' }),
  ]),
  { title: 'Dashboard', layout: { preferredHeight: 600 } }
)
```
The built-in CSS ships ~200 Tailwind-compatible utility classes (padding, margin, flex, grid, gap, typography, colors, borders, shadows, max-height, overflow). No Tailwind dependency, no build step, no purge config. They're there in the 15KB stylesheet that the CDN serves alongside the renderer.
Three levels of control: auto-renderers for zero-config, utility classes for visual tweaks, full component API for complete custom UIs.
The HTML renderer is only 80 kB and is loaded with a single script tag:
```html
<div id="root"></div>
<script src="https://cdn.jsdelivr.net/npm/@maxhealth.tech/prefab@0.2/dist/renderer.auto.min.js"></script>
```
## The protocol is the product
The real value is the protocol itself.
`ui/initialize` handshake. `ui/notifications/tool-result` for pushing data. `ui/notifications/size-changed` for responsive layout. `ui/open-link` for navigation. `ui/message` for sending messages back into the conversation. All JSON-RPC 2.0 over `postMessage`.
Anyone can build a renderer against this protocol. React, Svelte, vanilla JS, raw DOM manipulation. The host sends JSON-RPC messages to an iframe and expects JSON-RPC messages back. Your rendering stack is your business.
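To show how small a custom renderer can be, here's a vanilla sketch of an iframe app that listens for tool results and dumps them into the DOM. The method name comes from the spec; the params shape is an assumption:

```typescript
// Minimal custom renderer: listen for tool-result notifications
// from the host and render whatever structured content arrives.
window.addEventListener("message", (event) => {
  const msg = event.data;
  if (msg?.jsonrpc !== "2.0") return;

  if (msg.method === "ui/notifications/tool-result") {
    // Params shape assumed for illustration.
    const root = document.getElementById("root");
    if (root) root.textContent = JSON.stringify(msg.params, null, 2);
  }
});
```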
This is why I think MCP Apps will win where similar attempts failed. It's a host protocol that allows any rendering approach, and that makes adoption straightforward on both sides.
## What's missing
It's early. The spec is dated 2026-01-26. Here's what's still rough:
- **CSP implementation varies by host.** All hosts read `_meta.ui.csp` from the content item returned by `readResource`, but the enforcement mechanism differs. VS Code injects a `<meta>` tag. Claude Desktop and ChatGPT enforce via HTTP headers on a sandboxed origin. The spec standardizes the declaration format, but you should test your CSP config against each host you target.
- **No standard component format.** The protocol defines the transport; the payload is up to you. Every renderer invents its own component schema. (We use a `$prefab` JSON wire format, but nothing stops someone from using entirely different components.)
- **Permission Policy support varies.** Camera, microphone, geolocation, and clipboard-write access go through the iframe `allow` attribute. Hosts report what they support in `hostCapabilities.sandbox.permissions`, but not all hosts honor all permissions yet.
- **Buffering and timing are tricky.** Claude Desktop can send `tool-result` before the `ui/initialize` response arrives. If your renderer doesn't buffer, you lose the first result. This took us several hours to debug. A workaround is sketched below.
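Here's a minimal sketch of that buffering workaround, assuming the handshake request used `id: 1` as in the earlier example; the dispatch logic is illustrative:

```typescript
// Queue messages that arrive before the ui/initialize response,
// then replay them in order once the handshake completes.
let initialized = false;
const pending: any[] = [];

function handle(msg: any) {
  if (msg.method === "ui/notifications/tool-result") {
    // Render msg.params here; shape assumed for illustration.
    console.log("tool result:", msg.params);
  }
}

window.addEventListener("message", (event) => {
  const msg = event.data;
  if (msg?.id === 1 && msg.result) {
    initialized = true;
    pending.splice(0).forEach(handle); // flush in arrival order
    return;
  }
  if (initialized) {
    handle(msg);
  } else {
    pending.push(msg);
  }
});
```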
These are solvable problems. The architecture is sound.
## The prediction
Within 12 months, we will see a market emerging around MCP Apps.
MCP Apps hosts will compete on rendering quality, theme support, and permission handling. MCP servers will compete on UI polish. And the protocol will quietly become the standard that holds it all together. MCP hosts may eventually replace the need for a web browser altogether.