DEV Community

kirandeepjassal-crypto
kirandeepjassal-crypto

Posted on • Originally published at prepstack.co.in

MCP Deep Dive, Part 2: Inside the Model Context Protocol Architecture (Hosts, Clients, Servers)

Most teams meet MCP as "a way to give your model tools" and stop there. That framing will cost you. Model Context Protocol is a small distributed system with three roles and three primitives, and once you see the architecture clearly, every later question — auth, scale, versioning, streaming — has an obvious home. This part draws the map.

This is Part 2 of a 15-part deep dive. In Part 1 we made the case for adopting MCP: it turns N×M integration glue into N+M (on Mattrx, our .NET/Azure SaaS, that meant 14 bespoke clients collapsing to 3 MCP servers). Now we open the box.

TL;DR

Concept Ad-hoc agent (before) MCP architecture (after)
Roles One process does everything Host, Client, Server — separated
Connection One shared client, mixed state One client per server, isolated
Capabilities Tools only, conflated Tools, Resources, Prompts
Versioning Assumed; breaks on change Negotiated at initialize
Transport Hardcoded stdio (local) or Streamable HTTP+SSE
Mattrx topology Tangled 1 host → 3 servers on Azure
  • Three roles: the host owns the agent loop and its clients, a client owns exactly one server connection, a server owns capabilities.
  • Three primitives: tools (model-controlled actions), resources (app-controlled data by URI), prompts (user-controlled templates).
  • Capability negotiation at initialize lets servers ship new versions without breaking clients.
  • Two transports: stdio for local/co-located, Streamable HTTP + SSE for remote/multi-tenant.
  • Modeling reads as resources held context tokens at 3.5k (down from 14k); the handshake keeps tool-call error rate at 0.8%.

The one mental shift: stop thinking "MCP = tool calling." Think "MCP = three roles and three primitives." Get the roles right and the hard parts (auth at the boundary, scaling the server, versioning the contract) stop being architecture problems and become configuration.

1. The three roles: Host, Client, Server

Before, the first agent was a god object — host, client, and server fused:

// BEFORE: orchestration, tool logic, and data access fused.
public sealed class InsightsAgent(IChatModel model, AppDbContext db, IReportService reports)
{
    public async Task<string> AnswerAsync(string goal, CancellationToken ct)
    {
        var data   = await db.Campaigns.Where(/* ... */).ToListAsync(ct); // it IS the data layer
        var plan   = await model.PlanAsync(goal, ct);                     // it IS the host
        var report = await reports.CreateAsync(/* ... */, ct);            // it IS the action layer
        // Everything welded together; nothing reusable or securable independently.
    }
}
Enter fullscreen mode Exit fullscreen mode

After, MCP names three roles and keeps them apart. The host owns the loop and one client per server:

// AFTER: the host owns the loop and a client per server. That's all it owns.
public sealed class InsightsHost(IReadOnlyList<IMcpClient> clients, IChatModel model)
{
    public async Task<AgentAnswer> RunAsync(AiPrincipal p, string goal, CancellationToken ct)
    {
        var tools = new List<McpTool>();
        foreach (var client in clients)
            tools.AddRange(await client.ListToolsAsync(ct));      // gather from every server

        var plan = await model.PlanAsync(goal, tools, ct);

        foreach (var call in plan.ToolCalls)
        {
            var client = clients.First(c => c.Owns(call.ToolName)); // dispatch to the owner
            call.Result = await client.CallToolAsync(call.ToolName, call.Arguments, ct);
        }
        return await model.SynthesizeAsync(goal, plan, ct);
    }
}
Enter fullscreen mode Exit fullscreen mode

The server is the mirror image — it declares capabilities and is ignorant of who calls it:

// AFTER: the server owns capabilities. No agent, no model, no idea who's calling.
builder.Services
    .AddMcpServer(o => o.ServerInfo = new() { Name = "mattrx-analytics", Version = "2.4.0" })
    .WithHttpTransport()                 // Streamable HTTP + SSE
    .WithTools<AnalyticsTools>()         // model-controlled actions
    .WithResources<CampaignResources>()  // app-controlled data
    .WithPrompts<AnalyticsPrompts>();    // user-controlled templates
Enter fullscreen mode Exit fullscreen mode

The host↔server boundary is exactly where auth, rate-limiting, and audit belong — and now there is a boundary.

2. The three primitives: Tools, Resources, Prompts

MCP gives servers three primitives, each with a different controller:

  • Tools — model-controlled. The model decides when to call them.
  • Resources — application-controlled. The host attaches relevant data to context, addressed by URI. The model reads; it does not "call."
  • Prompts — user-controlled. Reusable templates the user (or UI) selects deliberately.
// Resource: URI-addressable data the host attaches to context — not a tool call.
[McpServerResource(UriTemplate = "mattrx://analytics/campaigns/{campaignId}")]
[Description("A campaign record for the caller's tenant.")]
public async Task<ResourceContents> GetCampaign(string campaignId, CancellationToken ct)
{
    var c = await campaigns.GetAsync(principal.TenantId, campaignId, ct);
    return ResourceContents.Json(c); // tenant bound from the principal, as always
}
Enter fullscreen mode Exit fullscreen mode

The win is the control model, not the syntax. A resource lets the host attach exactly the campaign record the user is looking at — by URI — instead of the model guessing which tool to call and pulling back more than it needs. That move is part of how we hold context tokens at 3.5k (down from 14k).

3. Capability negotiation: the initialize handshake

Every MCP session opens with initialize, where both sides exchange a protocol version and their capabilities:

// initialize result (server -> client)
{
  "jsonrpc": "2.0", "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": { "subscribe": true },
      "prompts": {}
    },
    "serverInfo": { "name": "mattrx-analytics", "version": "2.4.0" }
  }
}
Enter fullscreen mode Exit fullscreen mode
// AFTER: adapt to negotiated capabilities — never hardcode them.
var session = await client.InitializeAsync(ct);
if (session.Server.Supports(ServerCapability.Resources))
    await PreloadCampaignResourcesAsync(client, ct);   // only if advertised
if (session.Server.Supports(ServerCapability.Prompts))
    await RegisterPromptMenuAsync(client, ct);
Enter fullscreen mode Exit fullscreen mode

The handshake is small but it's the load-bearing wall: it's what lets mattrx-analytics ship a v2.5 with a new tool while a v3.0 client and a v3.1 client both keep working. Independent versioning behind this handshake is a major reason the tool-call error rate fell to 0.8%.

4. Transports: stdio vs Streamable HTTP + SSE

Transport is a property of the deployment, not the server. The same server code runs two ways:

  • stdio — the server is a child process of the host. No network, no auth, lowest latency. Local dev and co-located tools.
  • Streamable HTTP + SSE — a single HTTP endpoint that upgrades to Server-Sent Events for streaming. For anything crossing a network or trust boundary, with TLS, OAuth, and horizontal scale.
// Local dev: stdio child process, zero auth.
builder.Services.AddMcpServer().WithStdioTransport().WithTools<AnalyticsTools>();

// Production: Streamable HTTP + SSE, behind the gateway, multi-tenant.
builder.Services.AddMcpServer()
    .WithHttpTransport(o => o.Stateless = false)  // keep session for SSE streams
    .WithTools<AnalyticsTools>()
    .WithResources<CampaignResources>();
Enter fullscreen mode Exit fullscreen mode

Choose transport by trust boundary. Inside one process, stdio is simpler and faster and needs no auth. Across a network — between tenants, between your service and a partner's assistant — you need HTTP with TLS and OAuth. Same AnalyticsTools, different wiring. (In prod, our 3 servers run as Azure Container Apps: read p95 120 ms, report-enqueue p95 90 ms, streaming first-token p95 ~300 ms.)

When the full architecture is overkill

  • A single co-located tool. Use stdio and tools only; resources/prompts/negotiation are ceremony for a one-tool dev helper.
  • Data you always need anyway. If the host attaches the same small record every time, a tool (or inlining) can beat a resource with a URI scheme.
  • Subscriptions / listChanged you won't honor. Don't advertise capabilities no client reacts to — unused capabilities are lies in the handshake.
  • Over-splitting servers. Three servers by domain is right; fourteen micro-servers re-create the N×M mess with extra deployment overhead.
  • stdio across a trust boundary. In production that usually means you co-located a tool you should have isolated.

The model to carry forward

Host owns the loop. Client owns the connection. Server owns the capability. Three roles, three primitives (tool, resource, prompt), one handshake. That sentence is the entire architecture, and almost every MCP bug we've hit was a violation of it.

  • Draw the host/client/server boundary before writing code. Most MCP confusion is role confusion.
  • Pick the primitive by control model. "Who decides to invoke this — model, app, or user?" names the primitive.
  • Choose transport by trust boundary. stdio inside a process, HTTP+SSE across a network — and put auth exactly where the network starts.

Originally published on PrepStack. Mapping your own host/client/server boundaries and want a sanity check? Reach me at randhir.jassal[at]gmail.com.

Top comments (0)