Part 1 of "The Great Decoupling" series
Have you ever stopped to consider why software has a user interface at all?
Not the philosophical "what is an interface" question, but the practical one: why does every SaaS product ship with a specific arrangement of screens, buttons, and workflows that every customer must learn and adapt to?
The honest answer is a historical accident. Software needed interfaces because humans needed to operate it, and the technology for anything else didn't exist. We've spent decades refining these interfaces — better layouts, smoother interactions, more intuitive navigation — without questioning whether the interface itself was the right abstraction.
That assumption is dissolving faster than most people realize.
What I've been exploring recently — and what I believe represents a genuine architectural inflection point — is the emergence of capability-first systems where the interface becomes an ephemeral rendering layer rather than a product feature. This isn't an incremental improvement. It's a fundamental restructuring of what software is.
What follows is my synthesis of where I see this heading — informed by three decades of watching technology transitions play out, but still very much a thesis about the future rather than a description of the present. I've been wrong before, and I'll be wrong again. But I've also learned to recognize when architectural assumptions are shifting beneath us, and I believe we're in one of those moments now.
The Core Problem: We've Been Shipping Frozen Interactions
Let me paint a picture. A product manager at a mid-sized company needs to understand why Q3 revenue dipped in the enterprise segment. Today, this triggers a multi-application odyssey: log into Salesforce, export pipeline data, pivot to the BI tool, cross-reference with the financial system, maybe ping someone in Slack who knows where that one spreadsheet lives. Each application presents its own interface, its own mental model, its own friction.
The question worth asking: why does the product manager need to learn five interfaces to answer one question?
The traditional response has been better integration — connect the systems, sync the data, build dashboards that pull from multiple sources. This helps, but it treats symptoms rather than causes. The fundamental problem is that we've been shipping frozen interactions. Every SaaS product encodes specific workflows in specific screens, and customers bend their processes to match. The interface is the product, which means changing how you work means changing products.
Consider what this implies. Salesforce doesn't just provide CRM capabilities — it provides the Salesforce way of doing CRM, frozen into its interface. Notion doesn't just provide document collaboration — it provides the Notion way, with its specific blocks and layouts and mental models. Users don't just adopt functionality; they adopt an entire interaction paradigm.
This made sense when humans were the only consumers of software. Interfaces were necessary translation layers between human intent and machine capability. But that constraint — human as sole operator — is the assumption now breaking down.
The Moment of Decoupling
Here's where things get interesting. What if the product wasn't the interface? What if it were the underlying capability — the ability to query pipeline data, generate forecasts, track customer health — and the interface was generated dynamically based on context, user preference, and modality?
This isn't speculative. It's already happening in narrow contexts.
When you ask Claude to visualize data and it generates a custom chart in seconds, that's not "using a BI tool." That's generating a BI tool instance for a specific question. The dashboard isn't a product feature — it's a byproduct of a capability (data access plus visualization rendering) meeting a moment of intent.
When a developer asks an AI assistant to query a database and format the results, they're not learning SQL syntax or navigating a database GUI. They're expressing intent and receiving a contextual rendering of the result.
When a support agent asks their AI copilot to pull up customer history, the interface that appears isn't the CRM's standard screen — it's a synthesized view optimized for the specific question being answered.
Each of these is a small example. But they share a pattern: the capability exists independently of its presentation. The interface becomes a rendering choice, not a product feature.
The Model Context Protocol — MCP — standardizes this pattern. And once it's standardized, everything downstream shifts.
MCP: The Protocol Layer for Capabilities
For those less familiar with MCP, here's the essential mental model. MCP defines a standard way for AI systems to discover and invoke capabilities exposed by external servers. An MCP server publishes tools (executable operations with typed parameters), resources (data sources the AI can query), and prompts (reusable interaction templates). An MCP client — typically an AI assistant or agent — connects to servers, discovers available capabilities, and invokes them as needed.
// An MCP server exposing order management capabilities
const orderTools = {
  getOrderHistory: {
    description: 'Retrieve order history for a customer',
    parameters: {
      customerId: { type: 'string', description: 'Customer identifier' },
      dateRange: {
        type: 'object',
        properties: {
          start: { type: 'string', format: 'date' },
          end: { type: 'string', format: 'date' }
        }
      },
      status: {
        type: 'array',
        items: { type: 'string', enum: ['pending', 'shipped', 'delivered', 'cancelled'] }
      }
    },
    execute: async (params, context) => {
      // Implementation details...
      return { orders, totalCount, summary };
    }
  },
  exportOrders: {
    description: 'Export orders in various formats',
    parameters: {
      format: { type: 'string', enum: ['csv', 'xlsx', 'json'] },
      filters: { /* ... */ }
    },
    execute: async (params, context) => {
      // Generate export, return download URL
    }
  }
};
If you've read my earlier post on capability-based architecture, this should look familiar. It's essentially the defineActions pattern with a standardized protocol wrapper. The same action definition serves human interfaces (buttons, forms) and AI interfaces (tool invocations). One implementation, multiple consumers, zero duplication.
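To make the one-implementation-multiple-consumers point concrete, here's a minimal sketch in TypeScript. The names (ActionDef, fetchOrders, handleFormSubmit, invokeTool) are illustrative, not the actual defineActions API or the MCP SDK; the point is only that a single definition can back both a form handler and a tool invocation.

// Illustrative only: a single action definition consumed by two different clients.

type ActionDef<P, R> = {
  description: string;
  parameters: Record<string, unknown>;   // JSON-Schema-style parameter spec
  execute: (params: P) => Promise<R>;
};

// Hypothetical data-layer call standing in for a real query
async function fetchOrders(customerId: string): Promise<{ id: string; total: number }[]> {
  return []; // stub
}

const getOrderHistory: ActionDef<{ customerId: string }, { orders: { id: string; total: number }[] }> = {
  description: 'Retrieve order history for a customer',
  parameters: { customerId: { type: 'string' } },
  execute: async ({ customerId }) => ({ orders: await fetchOrders(customerId) }),
};

// Consumer 1: a human-facing form submits to the same execute()
async function handleFormSubmit(formData: FormData) {
  return getOrderHistory.execute({ customerId: String(formData.get('customerId')) });
}

// Consumer 2: an AI client invokes the same definition as a named tool
async function invokeTool(name: string, args: { customerId: string }) {
  if (name === 'getOrderHistory') return getOrderHistory.execute(args);
  throw new Error(`Unknown tool: ${name}`);
}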
But MCP's significance isn't the protocol mechanics — it's what standardization enables.
When Anthropic released MCP in November 2024, adoption happened at unprecedented speed. OpenAI announced support within months. Google, Microsoft, and AWS followed. By late 2025, Anthropic had donated MCP to the Linux Foundation's Agentic AI Foundation, with governance designed to prevent vendor lock-in.
The ecosystem metrics tell the story: thousands of MCP servers in the registry, nearly 100 million monthly SDK downloads, major SaaS platforms racing to expose their functionality via MCP. This isn't a protocol looking for adoption — it's adoption looking for maturity.
The Great Decoupling
Here's the architectural shift I believe we're witnessing: the permanent decoupling of capability from presentation.
In the current model, SaaS products bundle three things together:
- Capabilities — the actual functionality (manage orders, analyze data, send messages)
- Data — the information the capability operates on
- Interface — the specific screens and workflows users interact with
The interface has been the product's face, its brand, its differentiation. This bundling made sense when humans were the only consumers. But when AI agents become primary consumers, the interface bundle breaks apart.
Consider what happens when your AI assistant can directly invoke the getOrderHistory capability. You don't need the vendor's orders screen. You don't need to learn their navigation. You just ask your question and get an answer, rendered in whatever format suits your current context — a table, a chart, a voice summary, a spatial visualization in AR glasses.
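To ground that, here's roughly what such an invocation looks like on the wire. This is a simplified sketch of MCP's JSON-RPC tools/call exchange; the field values are made up for illustration.

// Simplified sketch of an MCP tools/call exchange (illustrative values).

// What the assistant (MCP client) sends to the capability server:
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'getOrderHistory',
    arguments: {
      customerId: 'cust_123',
      dateRange: { start: '2025-07-01', end: '2025-09-30' },
      status: ['delivered']
    }
  }
};

// What comes back: structured data with no opinion about presentation.
// The client decides whether this becomes a table, a chart, or a spoken sentence.
const response = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    content: [
      { type: 'text', text: '{"totalCount": 47, "summary": "47 orders, up 12% month over month"}' }
    ]
  }
};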
The capability remains essential. The data remains essential. The interface becomes... optional. Or more precisely, the interface becomes a contextual rendering layer generated on demand rather than shipped in advance.
This is why I believe the interface isn't just less important in this future — it's fundamentally different in kind. It's not a product feature but a rendering choice. The same capability might appear as a dashboard widget, a conversational exchange, a voice command response, or a data payload for another agent. Same capability server, different clients.
What "Contextual Rendering" Actually Means
Let me make this concrete. Contextual rendering isn't just "different devices get different layouts." It's the interface being generated based on:
User context: What role does this person have? What's their expertise level? What are they trying to accomplish right now? A CFO asking about revenue trends gets a different rendering than a sales rep asking the same underlying question.
This is already visible in Microsoft's Copilot ecosystem. The same underlying AI capabilities render differently in Excel (formulas and data analysis), Word (document generation and editing), Teams (meeting summaries and action items), and Outlook (email triage and drafting). Same capability engine, different contextual presentations based on where the user is and what they're doing. Microsoft isn't building four separate AI products — they're building one capability layer with context-aware rendering.
Session context: What happened earlier in this conversation? What's already been established? The rendering builds on prior context rather than starting fresh with each interaction.
Watch what ChatGPT and Claude are doing with projects and memory. These aren't just features — they're early implementations of session context that persists across interactions. When Claude remembers that you're working on a specific codebase, or ChatGPT recalls your preferences from previous conversations, the rendering adapts. The responses build on an established context rather than starting cold. We're seeing session context evolve from "within this conversation" to "across all our interactions."
Modality context: Are they at a desktop with a large screen? On mobile while walking? In a car using voice? Wearing AR glasses? The same capability renders appropriately for each.
Preference context: Does this user prefer tables or charts? Detailed or summary views? Do they want the AI to explain its reasoning or just give the answer?
Again, Claude's memory and ChatGPT's customization features are early moves here — learning that you prefer concise answers, or that you want code examples in Python rather than JavaScript. The preference layer is thin today but growing.
The capability doesn't change. The data doesn't change. But the presentation adapts fluidly to the moment.
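Here's a small sketch of what that selection might look like in code. The context shape and the rendering choices are hypothetical; the point is that the decision happens at render time, per request, not at product design time.

// Hypothetical sketch: same capability result, presentation chosen per request.

type RenderContext = {
  role: 'cfo' | 'sales_rep' | 'analyst';             // user context
  modality: 'desktop' | 'mobile' | 'voice' | 'ar';   // modality context
  prefers: 'table' | 'chart' | 'summary';            // preference context
  establishedFacts: string[];                        // session context
};

type OrderHistory = { orders: { id: string; total: number }[]; totalCount: number };

function render(result: OrderHistory, ctx: RenderContext): string {
  const revenue = result.orders.reduce((sum, o) => sum + o.total, 0);

  if (ctx.modality === 'voice') {
    // Voice gets a single spoken sentence, not a layout
    return `You had ${result.totalCount} orders, totaling $${revenue.toFixed(0)}.`;
  }
  if (ctx.role === 'cfo' || ctx.prefers === 'summary') {
    // Executives and summary-preferring users get the headline number
    return `Summary: ${result.totalCount} orders, $${revenue.toFixed(2)} total revenue.`;
  }
  // Everyone else gets the detail, laid out as a simple table
  return result.orders.map(o => `${o.id}\t$${o.total.toFixed(2)}`).join('\n');
}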
This is what I mean by the interface becoming ephemeral. It's not a fixed artifact shipped with the product. It's generated fresh for each interaction, optimized for the specific context of that moment. And crucially, the major AI platforms are already building this way, which tells you something about where the industry sees the future heading.
The Multi-Modal Default
Here's something I find genuinely exciting about this shift. Once you decouple capability from presentation, multi-modal becomes the default rather than the exception.
Today, supporting voice, AR, or chat-based interaction requires building separate products or integrations. Each modality is a project. But when capabilities are exposed through a standard protocol, each modality is just another rendering client.
The same getOrderHistory capability:
- Renders as a data table in a desktop dashboard
- Becomes a voice summary in the car: "You had 47 orders last month, up 12% from the previous month."
- Appears as a spatial visualization in AR glasses
- Returns structured data to another AI agent for further processing
One capability, many renderings, zero duplication. The patterns that enable AI integration turn out to be exactly what you need for genuine multi-modal applications.
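As a sketch, those renderings can be nothing more than small formatters over the same structured result. The function names and the data shape below are assumptions, not any particular vendor's API.

// Hypothetical: one structured result, multiple rendering clients.

type OrderStats = { lastMonth: number; priorMonth: number };

const stats: OrderStats = { lastMonth: 47, priorMonth: 42 };

// Dashboard client: rows for a data table
function asTableRows(s: OrderStats): string[][] {
  return [['Last month', String(s.lastMonth)], ['Prior month', String(s.priorMonth)]];
}

// Voice client: a single spoken sentence
function asVoiceSummary(s: OrderStats): string {
  const change = Math.round(((s.lastMonth - s.priorMonth) / s.priorMonth) * 100);
  return `You had ${s.lastMonth} orders last month, up ${change}% from the previous month.`;
}

// Agent-to-agent client: pass the structured data straight through
function asAgentPayload(s: OrderStats): string {
  return JSON.stringify(s);
}

// An AR client would be one more formatter, mapping the same numbers to a spatial scene.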
The Transition Has Already Started
Look around, and you'll see the early signals:
AI assistants generating visualizations on demand rather than users navigating to dashboards. Every custom chart Claude or ChatGPT generates is a micro-instance of this pattern.
Microsoft's Copilot ecosystem treats AI capabilities as a layer that renders contextually across applications. They're not building separate AI features for each Office app — they're building a capability layer that presents differently based on context. This is the architectural pattern, implemented at scale.
Memory and project features in ChatGPT and Claude, building persistent session and preference context. These aren't just convenience features — they're the foundation for contextual rendering that adapts to you, not just to the current request.
Voice interfaces are gaining sophistication as they move from simple command-response to genuine capability invocation. Alexa asking Salesforce for pipeline updates isn't navigating Salesforce's UI — it's invoking a capability.
SaaS platforms racing to expose MCP servers. Stripe, Salesforce, ServiceNow, Plaid — the major platforms are building capability interfaces alongside (and eventually instead of) building more UI features.
Enterprise interest in "headless" architectures. The API-first movement prepared the ground; capability-first extends it to AI-native consumption.
The frozen interface isn't melting slowly. It's being disrupted by a fundamentally different model of human-software interaction.
Where This Leads
The decoupling of capability from presentation is just the first domino. Once you see software as capability providers with ephemeral interfaces, deeper questions emerge:
If external SaaS products expose capabilities via MCP, why wouldn't internal enterprise systems do the same? And if they do, what happens to the distinction between "our software" and "their software"?
If capabilities become standardized and interchangeable, where does value live? The interface was the product's differentiation — what replaces it?
And if the interface no longer binds users to specific products, what happens to the data those products have accumulated? Who owns it? Who should?
These questions lead somewhere profound — a restructuring not just of how software gets built, but of how it's sold, owned, and governed.
That's where we're headed in Part 2.
Next in the series: *The Great Decoupling: The Enterprise Capability Graph* — *When internal systems and external SaaS become indistinguishable*
References
MCP Protocol and Adoption: Anthropic introduced the Model Context Protocol in November 2024. By late 2025, the ecosystem had grown to approximately 2,000 servers in the MCP Registry and 97+ million monthly SDK downloads. Source: Model Context Protocol Blog - One Year of MCP
Industry Adoption: OpenAI announced MCP support in March 2025, with Sam Altman noting "people love MCP and we are excited to add support across our products." Google DeepMind, Microsoft, and AWS followed with native MCP integration. Source: Wikipedia - Model Context Protocol
Linux Foundation Governance: Anthropic donated MCP to the Agentic AI Foundation under the Linux Foundation in late 2025, co-founded with Block and OpenAI. Source: Anthropic - Donating the Model Context Protocol
SaaS Platform MCP Adoption: Major platforms, including Stripe, Salesforce (Agentforce), and Plaid, have released MCP integrations. Source: FinTech Magazine - Plaid Boosts Fintech with Claude AI Integration
MCP as Industry Standard: Analysis of why MCP achieved rapid adoption over competing approaches. Source: The New Stack - Why the Model Context Protocol Won
