<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Veera Ravindra Divi</title>
    <description>The latest articles on DEV Community by Veera Ravindra Divi (@ravi3divi).</description>
    <link>https://dev.to/ravi3divi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3999404%2Ffb3cbb8f-648d-4779-a267-41036c64d29e.jpeg</url>
      <title>DEV Community: Veera Ravindra Divi</title>
      <link>https://dev.to/ravi3divi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ravi3divi"/>
    <language>en</language>
    <item>
      <title>MCP just deleted the session: what the July 28 spec breaks in your server</title>
      <dc:creator>Veera Ravindra Divi</dc:creator>
      <pubDate>Mon, 29 Jun 2026 02:32:20 +0000</pubDate>
      <link>https://dev.to/ravi3divi/mcp-just-deleted-the-session-what-the-july-28-spec-breaks-in-your-server-49c</link>
      <guid>https://dev.to/ravi3divi/mcp-just-deleted-the-session-what-the-july-28-spec-breaks-in-your-server-49c</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;The July 28 MCP rewrite goes stateless. It fixes your load balancer and does nothing for the bill you actually pay: tool-schema context bloat.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you shipped an MCP server in the last year, you built it around a lie the spec told you: that a connection is a thing you can hold onto. On July 28, 2026, that assumption gets deleted.&lt;/p&gt;

&lt;p&gt;The 2026-07-28 release candidate locked on May 21, and it goes stateless. No more &lt;code&gt;initialize&lt;/code&gt;/&lt;code&gt;initialized&lt;/code&gt; handshake. No more &lt;code&gt;Mcp-Session-Id&lt;/code&gt; header pinning a client to one process. The whole "open a session, keep it warm, route everything back to the same box" model that every tutorial taught you is now legacy. (It's still an RC until July 28, so treat the wire details below as the candidate, not the carved-in-stone final — but the SDK teams are already migrating against it.)&lt;/p&gt;

&lt;p&gt;I want to make two arguments at once. First: going stateless is the right call, and it's overdue. Second: it fixes the operational pain everyone complained about while doing nothing for the thing that's actually expensive about MCP. Those aren't in tension. You can ship a beautiful round-robin deployment and still set your token budget on fire.&lt;/p&gt;

&lt;h2&gt;What actually changed on the wire&lt;/h2&gt;

&lt;p&gt;Here's the old request path. A client connects, does the handshake, gets a session ID, and from then on every request carries that ID and has to land on the same server instance that minted it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;/mcp&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;span class="na"&gt;Mcp-Session-Id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;4e9c1a7f-2b3d-44a8-9f10-0c2d6a1b88ef&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"search_repo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"q"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"retry"&lt;/span&gt;&lt;span class="p"&gt;}}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;Mcp-Session-Id&lt;/code&gt; is the load-balancer tax. It forces sticky sessions or a shared session store, because instance B has no idea what instance A negotiated during &lt;code&gt;initialize&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The new path drops the session entirely. Client metadata that used to be negotiated once at connection setup now rides per-request in a &lt;code&gt;_meta&lt;/code&gt; field, and two new required headers, &lt;code&gt;Mcp-Method&lt;/code&gt; and &lt;code&gt;Mcp-Name&lt;/code&gt;, let gateways and rate-limiters route without parsing the JSON body.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;/mcp&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;span class="na"&gt;Mcp-Method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tools/call&lt;/span&gt;
&lt;span class="na"&gt;Mcp-Name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;search_repo&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"search_repo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"q"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"retry"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"_meta"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"client"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"my-agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"3.1.0"&lt;/span&gt;&lt;span class="p"&gt;}}}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
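
&lt;p&gt;To make the routing claim concrete, here is a minimal sketch of how a gateway could derive a rate-limit bucket from those two headers without reading the body. The function names and the per-method limits are hypothetical, not part of the RC. Only the header names come from the example above.&lt;/p&gt;

```python
from typing import Mapping, Optional

# Hypothetical per-method limits; the numbers are illustrative only.
LIMITS_PER_MINUTE = {"tools/call": 60, "tools/list": 600}
DEFAULT_LIMIT = 120


def routing_key(headers: Mapping[str, str]) -> Optional[str]:
    """Build a rate-limit bucket key from headers alone, never touching the body."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    method = lowered.get("mcp-method")
    if not method:
        return None  # caller decides whether to reject or fall back to body parsing
    name = lowered.get("mcp-name", "")
    return f"{method}:{name}" if name else method


def limit_for(headers: Mapping[str, str]) -> int:
    """Look up the (hypothetical) per-minute limit for the request's method."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return LIMITS_PER_MINUTE.get(lowered.get("mcp-method", ""), DEFAULT_LIMIT)
```

&lt;p&gt;With the request shown above, &lt;code&gt;routing_key&lt;/code&gt; would produce &lt;code&gt;tools/call:search_repo&lt;/code&gt;, which is enough to throttle one noisy tool without deserializing JSON at the edge.&lt;/p&gt;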



&lt;p&gt;Capability exchange that used to happen in &lt;code&gt;initialize&lt;/code&gt; now happens through a &lt;code&gt;server/discover&lt;/code&gt; call any instance can answer. List and resource responses also pick up &lt;code&gt;ttlMs&lt;/code&gt; and &lt;code&gt;cacheScope&lt;/code&gt; so a gateway can cache them like HTTP. That last part matters more than it looks — I'll come back to it.&lt;/p&gt;

&lt;h2&gt;Deleting the session store&lt;/h2&gt;

&lt;p&gt;The migration most people will underestimate is server state. If your server looks like this, every line of it is now dead weight:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# BEFORE: stateful, 2025-11-25 style
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;uuid&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid4&lt;/span&gt;

&lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;  &lt;span class="c1"&gt;# in-memory, or worse, Redis you now have to operate
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;  &lt;span class="c1"&gt;# str, so the key matches the header value on lookup&lt;/span&gt;
    &lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;caps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;capabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                     &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clientInfo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sessionId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;capabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SERVER_CAPS&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_tools_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mcp-Session-Id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;  &lt;span class="c1"&gt;# KeyError on the wrong box
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After, there is no session map and no &lt;code&gt;on_initialize&lt;/code&gt;. Each handler is pure with respect to the request. It reads what it needs from &lt;code&gt;_meta&lt;/code&gt; and answers:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# AFTER — stateless, 2026-07-28 RC style
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_discover&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;capabilities&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SERVER_CAPS&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;   &lt;span class="c1"&gt;# any instance can answer
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_tools_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_meta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# travels with the request
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;run_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arguments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The payoff is exactly what the spec authors advertised: you can put this behind "a plain round-robin load balancer" with no sticky routing and no shared store. Server-to-client prompts that used to need a persistent SSE stream now return an &lt;code&gt;InputRequiredResult&lt;/code&gt; carrying echoed state, so any instance can resume the round trip. Kill your Redis session cache. Kill your sticky-session annotations. This is a genuinely smaller system to operate.&lt;/p&gt;
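
&lt;p&gt;The article does not show the wire shape of &lt;code&gt;InputRequiredResult&lt;/code&gt;, so the following is only a sketch of the general "echoed state" idea under stated assumptions: the server serializes and signs whatever it needs to resume, the client sends it back unchanged, and any instance holding the same secret can verify it. The helper names, the token format, and the shared &lt;code&gt;SECRET&lt;/code&gt; are all hypothetical.&lt;/p&gt;

```python
import base64
import hashlib
import hmac
import json

# Assumption: every instance is deployed with the same signing secret.
SECRET = b"replace-me-with-a-real-secret"


def pack_state(state: dict) -> str:
    """Serialize and sign state so the client can echo it back untouched."""
    body = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig


def unpack_state(token: str) -> dict:
    """Verify the signature and recover the state on whichever instance gets it."""
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("echoed state failed signature check")
    return json.loads(base64.urlsafe_b64decode(body.encode()))
```

&lt;p&gt;Signing matters because echoed state passes through the client. Without a check, a client could edit the state and have the server resume a round trip it never started.&lt;/p&gt;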

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/images%2Fdiagram-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/images%2Fdiagram-1.png" alt="Stateful vs stateless MCP request path" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Tasks got demoted, and the lifecycle flipped&lt;/h2&gt;

&lt;p&gt;Tasks were an experimental core feature. In the RC they're reclassified as an opt-in extension, and &lt;code&gt;tasks/list&lt;/code&gt; is gone — the spec is blunt that it "can't be scoped safely without sessions." If you can't pin a client to a box, you can't safely enumerate "its" tasks.&lt;/p&gt;

&lt;p&gt;The lifecycle inverts. Instead of the server tracking long-running work against a session, &lt;code&gt;tools/call&lt;/code&gt; returns a task handle and the &lt;em&gt;client&lt;/em&gt; becomes the driver, polling and steering with &lt;code&gt;tasks/get&lt;/code&gt;, &lt;code&gt;tasks/update&lt;/code&gt;, and &lt;code&gt;tasks/cancel&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. client calls a long-running tool&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"tools/call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"reindex"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"all"&lt;/span&gt;&lt;span class="p"&gt;}}}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;// 2. server hands back a task handle (illustrative shape)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"task"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"tsk_91af"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"working"&lt;/span&gt;&lt;span class="p"&gt;}}}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;// 3. client drives it from any instance&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"tasks/get"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"tsk_91af"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you leaned on &lt;code&gt;tasks/list&lt;/code&gt; to rebuild a dashboard of in-flight work, that pattern is over. You now own task identity — persist the handles client-side (or in a real datastore your tools write to), because the protocol won't remember them for you.&lt;/p&gt;
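
&lt;p&gt;One possible shape for "owning task identity" on the client side is a small local table of handles that the dashboard reads instead of calling &lt;code&gt;tasks/list&lt;/code&gt;. This is a sketch, not a prescribed design. The class name, the table layout, and the &lt;code&gt;completed&lt;/code&gt; status used in the usage note are assumptions. Only the &lt;code&gt;working&lt;/code&gt; status and the handle format come from the example above.&lt;/p&gt;

```python
import sqlite3
import time


class TaskHandleStore:
    """Client-side record of task handles, since the protocol no longer lists them."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            "id TEXT PRIMARY KEY, tool TEXT, status TEXT, updated_at REAL)"
        )

    def record(self, task_id: str, tool: str, status: str) -> None:
        # Upsert: the same handle is rewritten each time a poll returns a new status.
        self.db.execute(
            "INSERT OR REPLACE INTO tasks VALUES (?, ?, ?, ?)",
            (task_id, tool, status, time.time()),
        )
        self.db.commit()

    def in_flight(self) -> list:
        """Return (id, tool) pairs for every task still marked as working."""
        rows = self.db.execute(
            "SELECT id, tool FROM tasks WHERE status = 'working' ORDER BY id"
        )
        return [tuple(r) for r in rows]
```

&lt;p&gt;Call &lt;code&gt;record&lt;/code&gt; once when &lt;code&gt;tools/call&lt;/code&gt; returns a handle and again after each &lt;code&gt;tasks/get&lt;/code&gt; poll. &lt;code&gt;in_flight&lt;/code&gt; then gives the dashboard its list. Pass a file path instead of the in-memory default if the handles need to survive a restart.&lt;/p&gt;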

&lt;h2&gt;The error code and auth changes that'll trip your tests&lt;/h2&gt;

&lt;p&gt;Two smaller changes will fail your integration suite quietly if you don't grep for them.&lt;/p&gt;

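&lt;p&gt;The resource-not-found code moves from the MCP-specific &lt;code&gt;-32002&lt;/code&gt; to the standard JSON-RPC &lt;code&gt;-32602&lt;/code&gt; (Invalid params). If you have assertions or client branches matching on &lt;code&gt;-32002&lt;/code&gt;, they'll silently stop firing. And because &lt;code&gt;-32602&lt;/code&gt; is also what you get back for any other bad-parameter error, the code alone no longer tells you the resource is missing.&lt;/p&gt;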

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gd"&gt;- if (err.code === -32002) handleMissingResource();
&lt;/span&gt;&lt;span class="gi"&gt;+ if (err.code === -32602) handleMissingResource();
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
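
&lt;p&gt;If a client has to talk to servers on both spec versions during the migration, a one-line swap is not enough. Here is a hedged sketch of a compatibility check, written in Python rather than the JavaScript of the diff above. The function name is hypothetical, and narrowing &lt;code&gt;-32602&lt;/code&gt; to &lt;code&gt;resources/read&lt;/code&gt; is my assumption about where "missing resource" is the likely meaning.&lt;/p&gt;

```python
# Codes taken from the article: -32002 (old, MCP-specific) and -32602 (new).
LEGACY_RESOURCE_NOT_FOUND = -32002
INVALID_PARAMS = -32602


def is_missing_resource(error: dict, method: str) -> bool:
    """True when a JSON-RPC error on a resource read means the resource is absent.

    -32602 is the generic JSON-RPC "Invalid params" code, so it is only treated
    as "missing resource" for the method where that is the likely meaning.
    """
    code = error.get("code")
    if code == LEGACY_RESOURCE_NOT_FOUND:
        return True  # server still on the older spec
    return code == INVALID_PARAMS and method == "resources/read"
```

&lt;p&gt;Once every server you talk to is on the new spec, the legacy branch can be deleted.&lt;/p&gt;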



&lt;p&gt;On auth, six SEPs pull MCP toward plain OAuth/OIDC. The one that bites first is mandatory &lt;code&gt;iss&lt;/code&gt; validation per RFC 9207: the authorization response now has to carry the issuer identifier as an &lt;code&gt;iss&lt;/code&gt; parameter, and your client has to check that it exactly matches the issuer of the authorization server it sent the request to. If you hand-rolled an OAuth client that ignored &lt;code&gt;iss&lt;/code&gt;, it was already exposed to mix-up attacks and is now non-conformant. Roots, Sampling, and Logging also enter deprecation on 12-month removal clocks. They are not gone in July, but don't build anything new on them.&lt;/p&gt;
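
&lt;p&gt;For a hand-rolled client, the issuer check can be as small as the sketch below. It assumes the authorization response arrives as query parameters on the redirect URL and that the client recorded the expected issuer when it started the flow. The function name is hypothetical. RFC 9207 calls for a simple string comparison, so the sketch deliberately does no URL normalization.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlparse


def check_issuer(redirect_url: str, expected_issuer: str) -> str:
    """Return the authorization code only if `iss` matches the expected issuer."""
    params = parse_qs(urlparse(redirect_url).query)
    issuers = params.get("iss", [])
    # Exactly one iss value, compared as a plain string with no normalization.
    if len(issuers) != 1 or issuers[0] != expected_issuer:
        raise ValueError("authorization response issuer mismatch")
    codes = params.get("code", [])
    if len(codes) != 1:
        raise ValueError("authorization response has no usable code")
    return codes[0]
```

&lt;p&gt;Run the check before the token request, so a code minted by the wrong authorization server is never sent anywhere.&lt;/p&gt;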

&lt;h2&gt;Why none of this fixes the actual disease&lt;/h2&gt;

&lt;p&gt;Here's my contrarian read. Everything above is plumbing. It makes MCP cheaper to &lt;em&gt;operate&lt;/em&gt; and does almost nothing about what makes MCP expensive to &lt;em&gt;run&lt;/em&gt;: the model pays for your tool schemas on every single turn.&lt;/p&gt;

&lt;p&gt;MCP co-creator David Soria Parra put it plainly in April 2026 — "a significant portion of the context window is consumed before the model does any actual reasoning." Statelessness doesn't touch that. Arguably it makes the framing worse: by deleting the session you delete the one place a server could have remembered "this client already saw these 40 tool definitions," so the obvious answer becomes re-sending schemas and re-sending context on more requests, not fewer.&lt;/p&gt;

&lt;p&gt;I pulled traces on one of my own agents after reading the RC. The tool-schema block plus re-sent prior context regularly ate well over half the prompt before a single user token showed up. That's not a spec bug — it's the cost model. And the stateless rewrite's headline win is operational, not token-economic.&lt;/p&gt;
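
&lt;p&gt;A rough way to put a number on that share for your own agent is sketched below. It is not how I measured mine, and it uses a crude four-characters-per-token heuristic instead of a real tokenizer, so treat the result as an order-of-magnitude estimate. All names here are hypothetical.&lt;/p&gt;

```python
import json

CHARS_PER_TOKEN = 4  # crude heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def schema_share(tools: list, rest_of_prompt: str) -> float:
    """Fraction of the estimated prompt taken by serialized tool definitions."""
    schema_tokens = estimate_tokens(json.dumps(tools, separators=(",", ":")))
    other_tokens = estimate_tokens(rest_of_prompt)
    return schema_tokens / (schema_tokens + other_tokens)
```

&lt;p&gt;Feed it the tool list your client actually sends and the rest of a representative prompt. For a real figure, swap &lt;code&gt;estimate_tokens&lt;/code&gt; for your model provider's tokenizer.&lt;/p&gt;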

&lt;p&gt;The one lever the RC actually hands you here is the new &lt;code&gt;ttlMs&lt;/code&gt; and &lt;code&gt;cacheScope&lt;/code&gt; on list/resource responses. That's the part to obsess over. A gateway that caches tool listings instead of re-fetching them per call is the closest thing in this release to addressing context cost — and it's buried under the routing-headers announcement. Use it. Cache aggressively at the edge, trim your tool surface to what the agent actually calls, and stop shipping 40 tools when the agent uses 6.&lt;/p&gt;
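
&lt;p&gt;Here is a minimal sketch of both halves of that advice: cache a tool listing for as long as its &lt;code&gt;ttlMs&lt;/code&gt; allows, and trim it to the tools the agent uses. It assumes &lt;code&gt;ttlMs&lt;/code&gt; sits at the top level of the response and that tools are listed under a &lt;code&gt;tools&lt;/code&gt; key. The class and function names are hypothetical, and the sketch ignores &lt;code&gt;cacheScope&lt;/code&gt;, so it is only safe as a per-client cache.&lt;/p&gt;

```python
import time
from typing import Callable, Optional


class ToolListCache:
    """Cache one tools/list response for as long as its ttlMs allows."""

    def __init__(self, fetch: Callable[[], dict], clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._clock = clock
        self._value: Optional[dict] = None
        self._expires_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            # Assumption: ttlMs is a top-level field; a missing value means no caching.
            ttl_ms = self._value.get("ttlMs", 0)
            self._expires_at = now + ttl_ms / 1000.0
        return self._value


def trim_tools(listing: dict, used: set) -> dict:
    """Keep only the tools the agent actually calls."""
    kept = [t for t in listing.get("tools", []) if t["name"] in used]
    return {**listing, "tools": kept}
```

&lt;p&gt;A shared gateway cache would also need to key entries by whatever &lt;code&gt;cacheScope&lt;/code&gt; says, so one client's listing is never served to another.&lt;/p&gt;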

&lt;h2&gt;The takeaway&lt;/h2&gt;

&lt;p&gt;The RC window, from the May 21 lock to July 28, is roughly ten weeks. The work is mechanical: delete &lt;code&gt;initialize&lt;/code&gt;, delete the session store, move client info into &lt;code&gt;_meta&lt;/code&gt;, add &lt;code&gt;Mcp-Method&lt;/code&gt;/&lt;code&gt;Mcp-Name&lt;/code&gt;, rewrite Tasks as client-driven, fix the &lt;code&gt;-32002&lt;/code&gt; → &lt;code&gt;-32602&lt;/code&gt; assertions, and validate &lt;code&gt;iss&lt;/code&gt;. Do it, and your deployment gets simpler and cheaper to operate.&lt;/p&gt;

&lt;p&gt;But don't confuse a simpler deployment with a cheaper agent. Statelessness solved the load balancer's problem. The model's problem, paying rent on your whole tool catalog every turn, is still sitting there, and deleting the session removed the one place a server could have remembered what a client had already seen. Migrate for the operations win. Then go fight the real bill.&lt;/p&gt;




&lt;p&gt;Are you migrating before July 28 or waiting for the final spec to land? And has anyone actually measured what their tool schemas cost per turn — what's your number? Drop it in the comments.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>webdev</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
