Anthropic announced the Model Context Protocol in November 2024. A year and a half later it is the closest thing the agent ecosystem has to a standard, with Anthropic's reference servers, a long tail of community implementations, IDE-level integrations in Cursor and Zed, and a browser-side variant, WebMCP, in incubation at a W3C community group. The protocol is no longer the question. The question is what to build on top of it.
David Soria Parra, the MCP co-creator at Anthropic, recently gave a talk on the design lessons of the protocol's first year — the things teams keep doing wrong, and the design discipline the second year is going to require. The talk is the kind of artifact you don't normally get from a protocol author at this stage of an ecosystem's life: a clear, opinionated summary of where the mistakes are concentrated and what the corrective shape looks like.
This piece is my summary of those lessons, with the design rationale I think a developer building an MCP server in May 2026 actually needs. There are six of them.
1. Don't wrap REST as MCP
This is the design mistake every team makes once.
A REST API is a contract for developers, built around CRUD operations on resources. The verbs are HTTP verbs. The granularity is one resource operation at a time: GET /users, POST /orders, PATCH /documents/:id. That granularity is correct for a developer writing a frontend. It is wrong for an agent.
An agent wants to perform an action — a piece of work the user actually cares about. Prepare a customer for onboarding. Find the contract and check it for risk clauses. Draft a response to the open support ticket. Each of those actions, in CRUD terms, is a sequence of dozens of REST calls, each of which the agent has to plan, sequence, and check the result of. The agent ends up doing the work of an ORM and a controller, in a context window that has better things to do.
The bad version of an MCP server looks like this:
```python
# DON'T: REST-style MCP server (one tool per API method)
# `server` is assumed to be an MCP server object that provides a tool() decorator.

@server.tool()
def get_user(user_id: str) -> dict: ...

@server.tool()
def create_order(user_id: str, items: list) -> dict: ...

@server.tool()
def patch_document(doc_id: str, fields: dict) -> dict: ...

@server.tool()
def list_permissions(user_id: str) -> list: ...

# ... seventeen more atomic CRUD wrappers
```
The good version is the same backend exposing tools at the granularity the agent actually thinks in:
```python
# DO: action-shaped MCP server (one tool per meaningful task)
# OnboardingResult, ContractReview and DraftResponse are assumed to be
# structured result types defined elsewhere in the server.

@server.tool()
def prepare_user_for_onboarding(user_id: str) -> OnboardingResult:
    """Run the full onboarding checklist: provision account, send welcome
    email, set initial permissions, attach to default workspace, audit-log."""
    ...

@server.tool()
def find_and_review_contract(party: str, period: str) -> ContractReview:
    """Locate the contract, parse it, flag clauses by risk category,
    summarize key terms, return a structured review."""
    ...

@server.tool()
def draft_customer_response(ticket_id: str) -> DraftResponse:
    """Pull the ticket history, the customer's account context, and the
    current state of the issue; produce a draft response with citations."""
    ...
```
The two servers wrap the same backend. The agent's behavior against them is not comparable. The first one produces a long chain of tool calls and a high probability of getting one of them wrong. The second one is one call, returns a structured result, and the agent moves on to the next step of its actual task.
The shift required is conceptual, not technical. You are no longer designing an interface for a developer to make CRUD calls. You are designing an interface for an agent to perform business operations. The CRUD layer still exists underneath; it is not what your MCP server exposes.
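To make "the CRUD layer still exists underneath" concrete, here is a minimal sketch of an action-shaped tool that runs its own sequence of low-level steps and hands the agent one structured result. The helper functions (`create_account`, `send_welcome_email`, `set_permissions`) and the fields of `OnboardingResult` are hypothetical stand-ins for illustration, not part of any real API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class OnboardingResult:
    """Structured outcome the agent receives from a single tool call."""
    user_id: str
    completed: List[str] = field(default_factory=list)
    failed_step: Optional[str] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.failed_step is None


# Hypothetical CRUD-level helpers; a real server would call its backend here.
def create_account(user_id: str) -> None: ...
def send_welcome_email(user_id: str) -> None: ...
def set_permissions(user_id: str) -> None: ...


Step = Tuple[str, Callable[[str], None]]

STEPS: List[Step] = [
    ("create_account", create_account),
    ("send_welcome_email", send_welcome_email),
    ("set_permissions", set_permissions),
]


def prepare_user_for_onboarding(user_id: str, steps: List[Step] = STEPS) -> OnboardingResult:
    """Run every step in order; stop at the first failure and report it."""
    result = OnboardingResult(user_id=user_id)
    for name, step in steps:
        try:
            step(user_id)
        except Exception as exc:  # report the failure instead of raising
            result.failed_step = name
            result.error = str(exc)
            break
        result.completed.append(name)
    return result
```

The agent makes one call and reads `ok`, `completed`, and `failed_step`. The sequencing and error handling that would otherwise occupy its context window live in the server.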
2. Progressive discovery — don't dump every tool into context
The second mistake is also a context-window problem.
An MCP server with thirty tools, connected naively, dumps thirty tool descriptions into the agent's context at the start of every conversation. With several MCP servers connected, the count climbs to a hundred or more. The cost is real on every dimension:
- The model has to choose between many tools, which it does worse than choosing between few
- Per-request input-token cost goes up
- Latency goes up, because the prompt is longer
- The risk of an incorrect tool call goes up, because the model has more confusable options
- The actual data the agent needs gets squeezed out by descriptions of tools it will never invoke
The corrective is progressive discovery — the agent fetches the tools it needs as it needs them, rather than receiving the full catalog at session start. The MCP server presents a small index. The agent reads the index, decides what it needs, asks for the specific tool descriptions, and only then has those descriptions in its context.
In practice this means a developer building an MCP server in 2026 should be thinking not only about which tools to expose but about how the agent finds them. A flat list of forty tools is rarely the right answer. A short categorized index with two-line descriptions, plus an on-demand expansion mechanism, is closer.
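One way to picture the index-plus-expansion idea is a pair of functions the server exposes as ordinary tools: one returns a short categorized index, the other returns full descriptions for only the names the agent asks about. This is a sketch under assumptions. `CATALOG`, `list_tool_index`, and `describe_tools` are hypothetical names, and the pattern is one possible design, not a mechanism the talk prescribes.

```python
from typing import Dict, List, Tuple

# Hypothetical catalog: tool name -> (category, one-line summary, full description).
CATALOG: Dict[str, Tuple[str, str, str]] = {
    "prepare_user_for_onboarding": (
        "accounts",
        "Run the onboarding checklist for a user.",
        "Provision the account, send the welcome email, set initial "
        "permissions, attach the default workspace, and write an audit entry. "
        "Args: user_id (str).",
    ),
    "find_and_review_contract": (
        "legal",
        "Locate a contract and flag risky clauses.",
        "Locate the contract for a party and period, parse it, flag clauses "
        "by risk category, and return a structured review. "
        "Args: party (str), period (str).",
    ),
    "draft_customer_response": (
        "support",
        "Draft a reply to a support ticket.",
        "Pull the ticket history and account context, then produce a draft "
        "response with citations. Args: ticket_id (str).",
    ),
}


def list_tool_index() -> Dict[str, List[str]]:
    """Small index sent at session start: category -> 'name: summary' lines."""
    index: Dict[str, List[str]] = {}
    for name, (category, summary, _full) in sorted(CATALOG.items()):
        index.setdefault(category, []).append(f"{name}: {summary}")
    return index


def describe_tools(names: List[str]) -> Dict[str, str]:
    """On-demand expansion: full descriptions for only the requested tools."""
    unknown = [n for n in names if n not in CATALOG]
    if unknown:
        raise KeyError(f"unknown tools: {unknown}")
    return {n: CATALOG[n][2] for n in names}
```

With this shape, the agent's context holds the one-line summaries up front and pays for a full description only when it has decided a tool is relevant.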
3. Skills and MCP are not competitors
Anthropic also ships a Skills system — a different mechanism for giving an agent a procedural capability. The natural reaction is to ask which one wins. The answer that came out of the talk is that neither wins; they are answers to different questions.
Skills are good for local, procedural, CLI-utility-driven work. Take this video file, run it through ffmpeg with these flags, put the result in this directory. The capability is a self-contained procedure that does not need authorization or audit. The cost of building it is low. The cost of running it is also low.
MCP is the right tool when you need the things a procedure cannot provide on its own: authorization, role-based access control, audit logs, observability, stable contracts between the agent and an external system, lifecycle management of the tool itself. If the action involves customer data, finances, or anything that lives in an enterprise system, you want MCP. If the action is a deterministic local procedure, Skills are simpler.
The decision rule is: use the simplest mechanism that solves the problem. MCP servers are not free to build, run, or maintain. If a Skill works, ship the Skill.
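The decision rule can be written down as a small checklist. The sketch below is only an illustration of the reasoning in this section; the `Requirements` fields and the `choose_mechanism` function are hypothetical names, and a real decision will weigh more than four booleans.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical checklist for one capability an agent needs."""
    touches_external_system: bool = False  # customer data, finances, enterprise systems
    needs_authorization: bool = False      # per-user or role-based access control
    needs_audit_trail: bool = False
    needs_stable_contract: bool = False    # interface other systems depend on


def choose_mechanism(req: Requirements) -> str:
    """Return 'mcp' only when the capability needs what a local procedure cannot give."""
    needs_mcp = (
        req.touches_external_system
        or req.needs_authorization
        or req.needs_audit_trail
        or req.needs_stable_contract
    )
    return "mcp" if needs_mcp else "skill"
```

A local ffmpeg conversion sets none of the flags and comes out as a Skill. Anything that reads customer records sets at least one and comes out as MCP.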
4. Enterprise needs a gateway
The largest organizational mistake is letting every team build its own MCP server independently. After twelve months, the result is a sprawl of half-documented servers with inconsistent authorization models, no shared audit trail, and a security team that cannot tell you which agents have access to which systems. This is shadow IT for agents, and the standard response — we'll catalog them later — does not work, for all the reasons it has not worked in any prior generation of integration sprawl.
The corrective is a connectivity-and-gateway layer: a single platform-side component that sits between the agents and the enterprise systems and handles, in one place:
- Tool registration and lifecycle
- Authorization policies (which agent, calling on whose behalf, can invoke which tool)
- Audit logging of every tool call
- Rate limits and quotas
- Observability and tracing
- Versioning and deprecation of tool definitions
This is not a novel architectural pattern. It is API-gateway design applied to a slightly different problem. The pieces that are different are agent identity, multi-step delegated authority, and async tool execution; identity and delegation are the subject of lesson five. The frame that is the same: a single chokepoint, consistent policy, an auditable trail.
A company that does not invest in this layer in 2026 will spend most of 2027 trying to put it in retroactively, while incidents in the meantime keep paging the security team.
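The gateway responsibilities listed in this section fit in a surprisingly small skeleton. The sketch below is an in-memory toy under stated assumptions: the `Gateway` class, its allow-list policy, and its per-agent call counter are all hypothetical simplifications, and a production gateway would persist its log and use a much richer policy model.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set, Tuple


@dataclass
class Gateway:
    """Single chokepoint: registry, policy check, quota, audit log."""
    max_calls_per_agent: int = 100
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    # Allowed (agent_id, tool_name) pairs; a real policy would be richer.
    allowed: Set[Tuple[str, str]] = field(default_factory=set)
    call_counts: Dict[str, int] = field(default_factory=dict)
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def allow(self, agent_id: str, tool_name: str) -> None:
        self.allowed.add((agent_id, tool_name))

    def call(self, agent_id: str, on_behalf_of: str, tool_name: str, **args: Any) -> Any:
        """Check policy and quota, invoke the tool, and record the outcome."""
        entry = {"ts": time.time(), "agent": agent_id, "user": on_behalf_of,
                 "tool": tool_name, "args": args, "outcome": "pending"}
        self.audit_log.append(entry)  # rejected calls are logged too
        if tool_name not in self.tools:
            entry["outcome"] = "unknown_tool"
            raise KeyError(f"unknown tool: {tool_name}")
        if (agent_id, tool_name) not in self.allowed:
            entry["outcome"] = "denied"
            raise PermissionError(f"{agent_id} may not call {tool_name}")
        used = self.call_counts.get(agent_id, 0)
        if used >= self.max_calls_per_agent:
            entry["outcome"] = "rate_limited"
            raise RuntimeError(f"{agent_id} exceeded its quota")
        self.call_counts[agent_id] = used + 1
        try:
            result = self.tools[tool_name](**args)
        except Exception:
            entry["outcome"] = "error"
            raise
        entry["outcome"] = "ok"
        return result
```

Every call, allowed or not, leaves exactly one audit entry naming the agent, the user it acted for, the tool, and the outcome, which is the property the security team needs.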
5. Agent identity is the unsolved part
This is the lesson the talk treated most carefully, because it is the one with the fewest solutions.
The traditional auth pattern is: user authenticates → user clicks a button → system performs an action. The user's identity, the user's intent, and the system's action are all colocated in a single moment.
Agents break that pattern in several directions at once. The agent acts on the user's behalf, but the action may happen minutes or hours after the user's last interaction. The agent may chain through multiple systems, each of which sees a different intermediate identity. The agent may continue a task overnight while the user is offline. The agent may be one of several agents working on the same task, with no clean way to attribute a specific tool call to a specific initiating user action.
The open questions are concrete:
- Who exactly called the tool? The agent, the user, the system that invoked the agent?
- On whose behalf? With what claim chain?
- With what permissions? The agent's own, the user's, or some intersection?
- Can the agent continue without re-asking the user? Under what conditions?
- How do you revoke access mid-task, when the agent is in the middle of a multi-step operation that has already done some of its work?
- How do you show an auditor, six months later, what actually happened?
These questions are not resolved by MCP itself. MCP carries the tool-call protocol, not the identity-and-authorization chain. The industry is mostly bolting OAuth flows, signed delegation tokens, and ad-hoc audit logging on top, and each of those is its own engineering effort with no canonical answer yet. The talk frames this as the unsolved layer. That framing is honest.
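To show what answering even a few of these questions involves, here is a sketch of a delegation record and an authorization check that emits an audit entry. Everything in it is an assumption made for illustration: the `Delegation` fields, the choice of intersecting user and agent scopes, and the in-memory `REVOKED` set are one possible design, not a standard and not something MCP defines.

```python
import time
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Optional, Set


@dataclass(frozen=True)
class Delegation:
    """Hypothetical record of 'agent X acts for user Y until time T'."""
    delegation_id: str
    user_id: str
    agent_id: str
    user_scopes: FrozenSet[str]   # what the user may do
    agent_scopes: FrozenSet[str]  # what the agent was granted
    expires_at: float             # Unix timestamp

    def effective_scopes(self) -> FrozenSet[str]:
        # One possible answer to "with what permissions?": the intersection.
        return self.user_scopes & self.agent_scopes


REVOKED: Set[str] = set()  # delegation ids revoked mid-task


def authorize(d: Delegation, scope: str, now: Optional[float] = None) -> Dict[str, Any]:
    """Decide one tool call and return an audit record explaining the decision."""
    now = time.time() if now is None else now
    if d.delegation_id in REVOKED:
        allowed, reason = False, "delegation revoked"
    elif now >= d.expires_at:
        allowed, reason = False, "delegation expired"
    elif scope not in d.effective_scopes():
        allowed, reason = False, "scope not in user/agent intersection"
    else:
        allowed, reason = True, "ok"
    return {"at": now, "caller": d.agent_id, "on_behalf_of": d.user_id,
            "delegation": d.delegation_id, "scope": scope,
            "allowed": allowed, "reason": reason}
```

Checking the revocation set on every call is what lets access be withdrawn in the middle of a multi-step task. It does nothing about the work already done, which is part of why this layer is still open.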
6. The MCP server is not your old API
Pulling the threads together: the lesson at the center of all five preceding ones is that an MCP server is a different kind of artifact from the REST API the same team probably built last year, even when both expose the same backend.
The REST API is a contract for software developers. It is granular, resource-oriented, and designed to compose into application logic written by humans who know which calls to make in what order. The MCP server is a contract for agents. It should be coarse-grained at the level of meaningful tasks, discoverable on demand, gated by an enterprise-aware authorization model, and instrumented for an audit trail that survives multi-step delegated execution.
The teams that have figured this out in the first year of MCP did so by treating "design for an agent" as a different design problem from "design for a frontend." The teams that have not figured it out yet are mostly the teams whose first MCP server was a one-to-one wrapper of an existing REST API, shipped because the Jira ticket said "expose this as MCP" and the path of least resistance was to wrap the methods. Those servers are now what their teams have to maintain.
The corrective in May 2026 is a short, opinionated list:
- Tools as actions, not endpoints. Design the verb the agent actually wants. Compose the CRUD calls underneath.
- Atomic-but-composable. Each tool does one meaningful thing. The agent can chain tools, but each chain link is a complete unit on its own.
- Bounded tool sets per task. Don't expose 40 tools to a workflow that needs 4. Curate the surface.
- Progressive discovery as default. Tools loaded into context only when relevant.
- Audit and access from day one. Bolt it on later and you'll bolt it on poorly.
- CLI where CLI is enough. Skills where Skills are enough. MCP where you need the contract.
- Gateway for enterprise. No exceptions. The cost of skipping this is paid in incidents.
A year and a half in, MCP is a working protocol with real adoption and real production traffic. It is also a category in which most teams are still learning the design discipline. The protocol authors have now said publicly what they have been telling teams in private for a year: it is not a CRUD wrapper, it is not a silver bullet, and the parts of the problem that look easy at the start are the parts the second year will be spent retrofitting.
The teams that ship the second-year version of this design discipline get to compound on the foundation. The teams that ship the first-year version get to do it again.
Top comments (1)
The action-vs-endpoint framing is the one I'd attach a caveat to. Collapsing seventeen CRUD calls into prepare_user_for_onboarding is right for the agent's context budget, but it also collapses seventeen recovery points into a single opaque outcome. When step 12 of 17 fails, the agent has no way to reason about the state it left behind; it just sees a tool that didn't return cleanly. We've found coarse tools only hold up in practice when they return structured partial-failure (what completed, what rolled back, what's now in limbo), so the agent can decide whether to retry or escalate rather than re-run the whole action. That actually compounds with your lesson 5: delegated authority over a long multi-step action is exactly where "which sub-step was the user authorized for" gets murky. Curious whether you'd put step-level audit in the gateway layer you describe, or whether that detail has to live inside the tool itself.