Carl Pei said it at SXSW last week. His company, Nothing, makes smartphones. He stood on stage and told the room: "The future is not the agent usin...
But does having a well designed (REST) API not solve/address this?
Reminds me of a dev.to article that I came across some time ago, where someone said:
"What if I do not design a GUI or a frontend for my system but just let an AI agent, talking to the API, be my 'user interface' ?"
That sounded like a fascinating idea - maybe not an "AI agent" but more a kind of "chat bot" ...
The "no GUI, just API as UI" idea is exactly the direction the piece is pointing and it's closer than most people think. The REST API is the right foundation. What changes is what the API needs to say about itself.
When a human developer consumes an API, they read the docs, infer the intent, ask Slack. When an agent consumes an API, it needs the contract to be explicit — what's callable, what the boundaries are, what failure looks like, what it's permitted to do on behalf of the user. The API design question shifts from "how do I make this easy to call" to "how do I make this safe to delegate."
The chat bot framing is close but slightly off — the interesting case isn't a bot with conversational UX, it's an agent with structured permissions acting autonomously. Less chat, more contract.
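To make "less chat, more contract" concrete, here's a minimal sketch of what an agent-facing contract could declare: what's callable, what scopes the agent needs, and what failure looks like. All names are hypothetical; no real framework is assumed.

```python
from dataclasses import dataclass, field

# Hypothetical capability contract: names and fields are illustrative.
@dataclass(frozen=True)
class Capability:
    name: str                       # what's callable
    scopes: tuple[str, ...]         # what the agent must be permitted to do
    failure_modes: tuple[str, ...]  # what failure looks like, declared upfront

@dataclass
class AgentContract:
    capabilities: dict[str, Capability] = field(default_factory=dict)

    def register(self, cap: Capability) -> None:
        self.capabilities[cap.name] = cap

    def authorize(self, name: str, granted_scopes: set[str]) -> bool:
        """An action is safe to delegate only if every required scope was granted."""
        cap = self.capabilities.get(name)
        return cap is not None and set(cap.scopes) <= granted_scopes

contract = AgentContract()
contract.register(Capability(
    name="refund_order",
    scopes=("orders:write", "payments:refund"),
    failure_modes=("order_not_found", "refund_window_expired"),
))

print(contract.authorize("refund_order", {"orders:write"}))                      # → False
print(contract.authorize("refund_order", {"orders:write", "payments:refund"}))  # → True
```

The point of the sketch: "easy to call" is a docs problem, "safe to delegate" is a data structure the agent can check before acting.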
Well, you point the agent at the API docs and let it read them (agents tend to be pretty good at that), and you're halfway there - whether all of the other investment makes sense economically depends on your goals.
Yeah the "bot" idea was that the user can have a conversation with the bot (by way of an alternative UI), and then the bot (or agent) talks to the API (maybe via an MCP server?) - I'd have to look up what the original article said exactly.
The MCP server is the bridge and it changes what "the API" means. Instead of a developer calling endpoints, you have a protocol that declares capabilities, describes what's callable, and defines the contract upfront. The bot doesn't need to read API docs. It gets a structured description of what the system can do. That's closer to what agent-native design actually requires than a well-documented REST API alone.
Do protocols or standards exist for that?
MCP is the closest thing to a standard right now — Anthropic's Model Context Protocol defines how agents discover capabilities, call tools, and receive structured responses. It's gaining adoption fast: Claude Code, Cursor, and several other agents already support it natively. The OpenAPI spec covers the REST layer but doesn't address agent-specific concerns like capability declarations, permission scoping, or failure contracts. There's active work on agent identity and trust standards but nothing settled yet. Early days, but MCP is where the convergence is happening.
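For a feel of what "a structured description of what the system can do" looks like, here's a declaration shaped like an MCP `tools/list` response (each tool carries a name, a description, and a JSON Schema for its input, per the spec). The tool itself, `create_invoice`, is a made-up example:

```python
import json

# Illustrative tool declaration in the shape of an MCP tools/list response.
# The "create_invoice" tool and its fields are hypothetical.
tool_declaration = {
    "tools": [
        {
            "name": "create_invoice",
            "description": "Create a draft invoice for a customer.",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string"},
                    "amount_cents": {"type": "integer", "minimum": 1},
                },
                "required": ["customer_id", "amount_cents"],
            },
        }
    ]
}

# The agent never reads prose docs here: it receives this structure and
# knows exactly what is callable and with what arguments.
print(json.dumps(tool_declaration, indent=2))
```

Compare that to an OpenAPI spec: the endpoint shape is there, but the capability framing (what the agent may do on the user's behalf) is what MCP-style declarations add on top.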
Once an agent is your primary “user”, you’re forced to admit the real product is the contract, and the UI is just a skin and oversight layer on top. In my own projects, as soon as I design clear agent‑first capabilities (what it can do, limits, and failure modes), the human UI naturally becomes thinner and more about audit/override than driving every step.
"Audit and override rather than driving every step" is the governance clock applied to interface design. The UI's job shifts from primary control surface to reconciliation layer — you're not operating the system through it, you're verifying the agent operated correctly and intervening when it didn't. That's a fundamentally different design brief than thirty years of UX has been optimizing for.
Your article is the first of its kind that I have seen. Well done and very proud to have been able to read it.
Keep going ! More people should read this and think long and hard about the implications that this brings.
Thank you — means something coming from someone thinking carefully about cloud security. The implications run deep.
You are welcome :)
This is one of those reframes that sounds obvious once stated but has real practical consequences. We've been thinking through this at Othex while designing our AI-assisted workflows.
The "agent using a human interface" pattern is fragile by design — you're adding a layer of brittleness (UI parsing, click coordination) when the underlying problem is just data and actions. It's like building a robot to push elevator buttons instead of just wiring the elevator directly.
The harder question for us has been: what do you do when the "real interface" doesn't exist? A lot of legacy systems only have the human-facing UI as the accessible surface. No API, no webhook, nothing. In those cases, browser automation isn't laziness — it's the only option.
But for greenfield and modern systems, 100% agree. Design the agent interface first, the human interface second. It changes what the system is fundamentally about.
The legacy systems case is the honest gap in the piece. Browser automation for legacy UI-only systems isn't the fragile pattern — it's a reasonable bridge while the underlying contract gets extracted. The fragility argument applies to greenfield systems that choose the UI-only path when they didn't have to.
The harder version of your question: what happens when a legacy system gets acquired or integrated and the new owners discover there's no API surface? The retrofit isn't just adding endpoints — it's excavating a contract that was never explicitly designed, because the UI handled all the ambiguity that a contract would have to resolve. That work is expensive and almost always gets deferred until an agent breaks something by working around the UI.
Agent-first for greenfield is the easy win. Legacy contract extraction is the actual problem.
This is the clearest articulation of something I've been struggling to explain to my team. We've been building internal tools with traditional CRUD UIs, and recently started adding MCP server endpoints so our agents can interact with the same data. The funny thing is -- the MCP interface ended up being simpler and more honest about what the system actually does than the UI ever was.
The UI hides complexity behind modals and multi-step wizards. The agent interface forced us to define: what are the actual operations? What are the real constraints? What are the failure modes? It's like the agent API became the source of truth and the UI became a skin on top of it.
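That forced-definition step can be sketched in a few lines: the operation, its real constraints, and its failure modes stated once, with the UI as just one caller. Everything here (names, the quota rule) is hypothetical.

```python
# Hypothetical "agent API as source of truth": constraints and failure
# modes are explicit instead of being enforced by wizard flow.

class QuotaExceeded(Exception):
    """A declared failure mode, not a modal buried three wizard steps deep."""

def create_project(name: str, existing: set[str], quota: int = 10) -> set[str]:
    # The actual contract, stated upfront:
    if not (1 <= len(name) <= 64):
        raise ValueError("name must be 1-64 characters")
    if name in existing:
        raise ValueError("name already taken")
    if len(existing) >= quota:
        raise QuotaExceeded(f"limit of {quota} projects reached")
    return existing | {name}

# The agent and the human UI hit the same contract:
projects = create_project("billing-revamp", set())
try:
    create_project("billing-revamp", projects)
except ValueError as err:
    print(err)  # the UI renders this message; the agent branches on it
```

The multi-step wizard is then a presentation choice layered over this, not where the rules live.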
I think the transition is gonna be rougher than people expect for SaaS products that only have a UI and no real API. They'll basically need to rebuild their product contract from scratch.
"The agent API became the source of truth and the UI became a skin on top of it" — that's the transition stated precisely, and it happened faster than most people expect because the MCP forcing function is real.
The UI wasn't hiding complexity out of negligence. Modals and wizards were solving a genuine human problem — progressive disclosure, the ability to change your mind mid-flow, context at each step. Strip those away and you don't get a simpler product. You get the product's actual contract, which was always there underneath the chrome.
Your point about UI-only SaaS products is the one that should make people nervous. They don't just need to add an API. They need to excavate a contract that may never have been explicitly designed — because the UI was handling all the ambiguity that a contract would have to resolve.
As someone building an AI coding terminal right now, this reframing is spot on. The tools that win won't be agents clicking buttons for us -- they'll be the ones with native agent APIs from day one. The UI becomes the debug view, not the primary interface.
"The UI becomes the debug view" . That's the governance clock applied to interface design. The UI's job shifts from navigation to reconciliation: you're not operating the system through it, you're verifying the agent operated correctly. Build the agent API first, then the UI as the oversight layer.
Been thinking hard about this.
Building for agents isn't about abandoning the human interface, it's about recognising that the contract was always the real product; the UI was just the only way to express it.
"The contract was always the real product." That's the piece in one sentence.
And it explains why retrofitting is harder than building agent-native from scratch. You're not adding a new interface to an existing product. You're excavating the implicit contract that was always there but never had to be stated because humans resolved the ambiguity through context and judgment. Agents can't. The retrofit work is making legible what was always implicit, and implicit contracts are harder to surface than they look.