
Armaan Sandhu


The MCP SDK's EventStore Lives in Memory. Here's What Happens When Your Server Restarts.

I Built a Python Package to Fix SSE Resumability in the MCP SDK

Your MCP server crashed. Your client reconnected. Every event from that session? Gone.


The Gap

The Model Context Protocol Python SDK ships with a built-in EventStore that powers SSE stream resumability — when a client reconnects with a Last-Event-ID header, the server replays the events it missed. This works great in development.

The catch: that store lives entirely in memory.

Restart the process, roll a new deployment, or (in a multi-worker setup) have the reconnecting client land on a different pod, and the event history is gone. The store was local to the process that wrote the events, so the replay silently returns nothing.

This isn't a bug in the SDK. It's a scope decision — the in-memory store is a correct, useful default for single-process development. But the moment you deploy to production, you need something durable.

That's the gap mcp-persist fills.
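To make the failure mode concrete, here is a minimal sketch using only the standard library. The two toy classes and their method names (store_event, replay_after) are hypothetical stand-ins, not the SDK's EventStore interface or mcp-persist's implementation; a "restart" is simulated by discarding the store object and building a new one.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Callable


class InMemoryStore:
    """Toy store: events live in a list owned by this object (this process)."""

    def __init__(self) -> None:
        self._events: list[tuple[int, str, str]] = []  # (event_id, stream_id, payload)

    def store_event(self, stream_id: str, payload: str) -> int:
        event_id = len(self._events) + 1
        self._events.append((event_id, stream_id, payload))
        return event_id

    def replay_after(self, last_event_id: int) -> list[str]:
        # Look up which stream the last seen event belonged to.
        stream = next((s for i, s, _ in self._events if i == last_event_id), None)
        if stream is None:
            return []  # unknown ID: nothing to replay
        return [p for i, s, p in self._events if s == stream and i > last_event_id]

    def close(self) -> None:
        pass  # nothing to release


class SqliteStore:
    """Toy durable store with the same interface, backed by a file."""

    def __init__(self, path: Path) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "stream_id TEXT NOT NULL, payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def store_event(self, stream_id: str, payload: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO events (stream_id, payload) VALUES (?, ?)",
            (stream_id, payload),
        )
        self._conn.commit()
        return cur.lastrowid

    def replay_after(self, last_event_id: int) -> list[str]:
        row = self._conn.execute(
            "SELECT stream_id FROM events WHERE id = ?", (last_event_id,)
        ).fetchone()
        if row is None:
            return []
        rows = self._conn.execute(
            "SELECT payload FROM events WHERE stream_id = ? AND id > ? ORDER BY id",
            (row[0], last_event_id),
        ).fetchall()
        return [r[0] for r in rows]

    def close(self) -> None:
        self._conn.close()


def simulate_restart(make_store: Callable[[], object]) -> list[str]:
    """Write three events, 'restart', then ask for everything after the first."""
    store = make_store()
    first_id = store.store_event("stream-1", "a")
    store.store_event("stream-1", "b")
    store.store_event("stream-1", "c")
    store.close()

    store = make_store()  # a fresh object stands in for the restarted process
    try:
        return store.replay_after(first_id)
    finally:
        store.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "events.db"
    memory_result = simulate_restart(InMemoryStore)
    sqlite_result = simulate_restart(lambda: SqliteStore(db_path))

print(memory_result)  # []
print(sqlite_result)  # ['b', 'c']
```

The in-memory version does not raise an error after the restart. It simply has no record of the ID it was handed, which is why the failure is easy to miss.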


What It Does

mcp-persist adds three drop-in EventStore backends: SQLite, Redis, and PostgreSQL. All three survive process restarts, and Redis and PostgreSQL also work across multi-worker deployments. Pick the one that fits your infrastructure; the API is identical across all three.

pip install "mcp-persist[sqlite]"   # no external service needed
pip install "mcp-persist[redis]"    # for multi-worker deployments
pip install "mcp-persist[postgres]" # for teams already running Postgres

The Two-Line Setup

Wiring resumability by hand is tedious — you need a store, a StreamableHTTPSessionManager, a Starlette lifespan to open and close both, and a Mount. The with_persistence() helper collapses all of that. Pass your FastMCP instance, get back a runnable ASGI app:

import uvicorn
from mcp.server.fastmcp import FastMCP
from mcp_persist import with_persistence

mcp = FastMCP(name="MyServer")

app = with_persistence(mcp, backend="sqlite", url="events.db", ttl=3600)
uvicorn.run(app, host="127.0.0.1", port=8000)  # MCP endpoint at /mcp

Switching to Redis means changing only the backend and url arguments:

app = with_persistence(mcp, backend="redis", url="redis://localhost:6379", ttl=3600)

You can also configure the backend entirely from the environment: set MCP_PERSIST_BACKEND, MCP_PERSIST_URL, and MCP_PERSIST_TTL, then call with_persistence(mcp) with no backend arguments. Useful for keeping deployment config out of source.
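As a rough illustration of environment-driven configuration, the sketch below resolves the three variables named above into a settings object. The resolve_settings helper, the PersistSettings class, the "explicit arguments win" precedence, and the 3600-second default TTL are all assumptions made for the example; the package's actual resolution logic may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PersistSettings:
    backend: str
    url: str
    ttl: int


def resolve_settings(
    backend: Optional[str] = None,
    url: Optional[str] = None,
    ttl: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> PersistSettings:
    """Hypothetical helper: explicit arguments win, the environment fills gaps."""
    env = os.environ if env is None else env
    backend = backend or env.get("MCP_PERSIST_BACKEND")
    url = url or env.get("MCP_PERSIST_URL")
    if backend is None or url is None:
        raise ValueError("backend and url must come from arguments or the environment")
    # Assumed default of one hour when no TTL is given anywhere.
    raw_ttl = ttl if ttl is not None else env.get("MCP_PERSIST_TTL", "3600")
    return PersistSettings(backend=backend, url=url, ttl=int(raw_ttl))


fake_env = {
    "MCP_PERSIST_BACKEND": "redis",
    "MCP_PERSIST_URL": "redis://localhost:6379",
    "MCP_PERSIST_TTL": "600",
}
from_env = resolve_settings(env=fake_env)
overridden = resolve_settings(backend="sqlite", url="events.db", env=fake_env)
print(from_env)
print(overridden)
```

Passing a plain dict as env keeps the helper easy to test without touching the real process environment.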


Resumability Without Touching the Server: PersistenceProxy

Sometimes you can't modify the MCP server. It might be third-party, in a different language, or a binary you don't own. For those cases, mcp-persist ships a proxy you run in front of the server.

The proxy forwards requests upstream, intercepts SSE responses, persists every event to a store, and assigns its own event IDs. A client that reconnects with Last-Event-ID gets its missed events replayed from the store. The upstream needs no event store of its own.

# point at a running server
mcp-persist-proxy --upstream http://localhost:8001 \
    --backend sqlite --url events.db --port 8000

# or start the server as a subprocess and proxy it
mcp-persist-proxy --backend redis --url redis://localhost:6379 \
    --port 8000 --upstream-port 8001 -- uvicorn my_server:app --port 8001

Clients connect to the proxy address instead of the server's. Nothing on the client side changes — resumability rides the standard SSE Last-Event-ID header.

One important caveat: the proxy adds resumability against a stable upstream. If the upstream server itself restarts, that's a clean break — the proxy can replay what it already stored, but can't carry the old connection over to the new server.
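The intercept-and-relabel step can be sketched in a few lines. This is an illustration of the idea under simplifying assumptions, not the proxy's code: parse_sse and relabel_and_persist are hypothetical names, the "store" is a plain list, and the whole upstream response is handled as one string instead of a live stream.

```python
from itertools import count
from typing import Iterator


def parse_sse(raw: str) -> Iterator[dict[str, str]]:
    """Yield one dict of fields per SSE event (events end at a blank line)."""
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        fields: dict[str, str] = {}
        data_lines: list[str] = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # skip empty lines and SSE comments
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if name == "data":
                data_lines.append(value)
            else:
                fields[name] = value
        if data_lines:
            fields["data"] = "\n".join(data_lines)
            yield fields


def relabel_and_persist(
    raw: str, store: list[tuple[int, str]], ids: Iterator[int]
) -> str:
    """Replace upstream event IDs with proxy-assigned ones and record each event."""
    frames = []
    for event in parse_sse(raw):
        proxy_id = next(ids)
        store.append((proxy_id, event["data"]))  # persist before forwarding
        lines = [f"id: {proxy_id}"]
        if "event" in event:
            lines.append(f"event: {event['event']}")
        lines.extend(f"data: {part}" for part in event["data"].split("\n"))
        frames.append("\n".join(lines) + "\n\n")
    return "".join(frames)


upstream = (
    "id: up-7\ndata: hello\n\n"
    "id: up-8\nevent: message\ndata: line1\ndata: line2\n\n"
)
store: list[tuple[int, str]] = []
forwarded = relabel_and_persist(upstream, store, count(1))
print(forwarded)
print(store)  # [(1, 'hello'), (2, 'line1\nline2')]
```

Because the client only ever sees proxy-assigned IDs, the Last-Event-ID it sends on reconnect can be looked up in the proxy's store without involving the upstream.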


Choosing a Backend

| Deployment | Backend |
| --- | --- |
| Single process, no external service needed | SQLite |
| Multiple workers or replicas behind a load balancer | Redis |
| Already running Postgres, or need multi-node durability | PostgreSQL |
| Serverless / ephemeral filesystem | Redis or PostgreSQL |

One nuance worth calling out: any replica count greater than one needs a shared store. A local SQLite file lives on one machine's filesystem, so replicas on other pods or hosts never see it. Behind a load balancer, a reconnecting client might land on a pod that has never seen its events, and the resume silently returns nothing. SQLite is for a genuine single process.


Other Features Worth Knowing

Real-time streaming. Each store can push new events to in-process consumers as they're written, using subscribe(). Redis uses pub/sub, Postgres uses LISTEN/NOTIFY, and SQLite falls back to polling.
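For a sense of what a polling fallback looks like, here is a small sketch against a throwaway in-memory SQLite table. The poll_new_events generator, the table layout, and the polling interval are assumptions for illustration; this is not the package's subscribe() implementation or signature.

```python
import asyncio
import sqlite3
from typing import AsyncIterator


async def poll_new_events(
    conn: sqlite3.Connection, after_id: int = 0, interval: float = 0.05
) -> AsyncIterator[tuple[int, str]]:
    """Yield rows with an id greater than the last one seen, re-checking forever."""
    last_seen = after_id
    while True:
        # Note: this query is synchronous and briefly blocks the event loop.
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id", (last_seen,)
        ).fetchall()
        for event_id, payload in rows:
            last_seen = event_id
            yield event_id, payload
        await asyncio.sleep(interval)


async def demo() -> list[str]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
    )

    async def writer() -> None:
        for payload in ("a", "b", "c"):
            conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
            conn.commit()
            await asyncio.sleep(0.01)

    writer_task = asyncio.create_task(writer())
    seen: list[str] = []
    subscription = poll_new_events(conn, interval=0.01)
    async for _event_id, payload in subscription:
        seen.append(payload)
        if len(seen) == 3:
            break
    await subscription.aclose()  # stop the generator before closing the connection
    await writer_task
    conn.close()
    return seen


seen_payloads = asyncio.run(demo())
print(seen_payloads)  # ['a', 'b', 'c']
```

Polling trades latency for simplicity: a new event is noticed at most one interval late, whereas pub/sub and LISTEN/NOTIFY push it as soon as it is written.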

Cross-backend migration. migrate(source, destination) copies events from one store to another — say, SQLite to Postgres as a deployment grows — streaming oldest-first and preserving per-stream ordering.

Compression. Pass compression="gzip" to compress large payloads before storage. Decompression on read is automatic and independent of the setting, so you can enable it on a rolling deploy without touching existing data.
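One way a reader can decompress regardless of the current setting is to sniff the stored bytes. The sketch below checks for the two-byte gzip magic number; the encode_payload and decode_payload helpers are hypothetical, and the package may mark compressed rows differently.

```python
import gzip
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def encode_payload(payload: str, compression: Optional[str] = None) -> bytes:
    """Write path: compress only when the setting asks for it."""
    raw = payload.encode("utf-8")
    if compression == "gzip":
        return gzip.compress(raw)
    return raw


def decode_payload(blob: bytes) -> str:
    """Read path: ignores the setting and inspects the stored bytes instead."""
    if blob[:2] == GZIP_MAGIC:
        blob = gzip.decompress(blob)
    return blob.decode("utf-8")


message = '{"jsonrpc": "2.0", "id": 1, "result": {"ok": true}}'
old_row = encode_payload(message)                      # written before the rollout
new_row = encode_payload(message, compression="gzip")  # written after the rollout

print(decode_payload(old_row) == message)  # True
print(decode_payload(new_row) == message)  # True
```

The check is safe for UTF-8 text because 0x8b is a continuation byte and can never directly follow the single-byte character 0x1f in valid UTF-8, so an uncompressed payload cannot begin with the gzip magic number.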

Readiness probes. Every store exposes await store.ping() for health checks — Redis PING, Postgres/SQLite SELECT 1. Drop it into a /health endpoint.
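A framework-neutral sketch of such a probe follows. Only await store.ping() comes from the description above; the readiness helper, the one-second timeout, the response shape, and the assumption that ping() raises on failure are illustrative choices. The stub stores exist only to exercise the helper.

```python
import asyncio


async def readiness(store, timeout: float = 1.0) -> tuple[int, dict[str, str]]:
    """Return an HTTP-style (status, body) pair based on store.ping()."""
    try:
        await asyncio.wait_for(store.ping(), timeout=timeout)
    except Exception as exc:  # assumed: ping() raises when the backend is down
        return 503, {"status": "unavailable", "error": type(exc).__name__}
    return 200, {"status": "ok"}


class HealthyStore:
    async def ping(self) -> None:
        return None


class BrokenStore:
    async def ping(self) -> None:
        raise ConnectionError("backend unreachable")


class SlowStore:
    async def ping(self) -> None:
        await asyncio.sleep(10)  # never answers within the probe timeout


healthy = asyncio.run(readiness(HealthyStore()))
broken = asyncio.run(readiness(BrokenStore()))
slow = asyncio.run(readiness(SlowStore(), timeout=0.01))
print(healthy)  # (200, {'status': 'ok'})
print(broken)   # (503, {'status': 'unavailable', 'error': 'ConnectionError'})
print(slow)     # (503, {'status': 'unavailable', 'error': 'TimeoutError'})
```

Wrapping the ping in a timeout matters for probes: a hung backend should fail the check quickly instead of holding the health request open.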

Metrics. Pass any object with on_store_event, on_replay, and on_error methods as metrics= to emit to Prometheus, Datadog, or whatever you're already using.
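A metrics object of that shape might look like the sketch below. The three method names come from the paragraph above, but the arguments each hook receives are not documented here, so the methods accept anything; CountingMetrics itself is a hypothetical example class.

```python
from collections import Counter


class CountingMetrics:
    """Duck-typed metrics sink that only counts how often each hook fires."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    # The hook arguments are an unknown here, so accept and ignore them.
    def on_store_event(self, *args, **kwargs) -> None:
        self.counts["store_event"] += 1

    def on_replay(self, *args, **kwargs) -> None:
        self.counts["replay"] += 1

    def on_error(self, *args, **kwargs) -> None:
        self.counts["error"] += 1


metrics = CountingMetrics()
metrics.on_store_event("stream-1")
metrics.on_store_event("stream-1")
metrics.on_replay(events=2)
metrics.on_error(RuntimeError("boom"))
print(dict(metrics.counts))  # {'store_event': 2, 'replay': 1, 'error': 1}
```

In a real deployment the counter increments would be swapped for calls into whatever metrics client is already in use.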


Benchmark Snapshot

These were measured on a Ryzen AI 7 350 machine with Redis 8.8 and Postgres 18.4 running in local containers (--events 5000 --concurrency 500):

| Backend | Store latency (p50) | Store throughput | Replay of 1k events |
| --- | --- | --- | --- |
| SQLite | 57 µs | 23,517 ev/s | 6.5 ms |
| Redis | 66 µs | 7,857 ev/s | 8.8 ms |
| Postgres | 626 µs | 7,427 ev/s | 6.6 ms |

SQLite's numbers look suspiciously good. The reason is structural — it runs in-process with no network hop, so every write skips a round-trip entirely. The tradeoff is that it's single-writer, so those numbers don't scale across processes.

You can run the benchmark yourself:

uv run python benchmarks/benchmark.py --events 5000 --concurrency 500

Try It

pip install "mcp-persist[sqlite]"
Then define a tool and serve it:
from mcp.server.fastmcp import FastMCP
from mcp_persist import with_persistence
import uvicorn

mcp = FastMCP(name="MyServer")

@mcp.tool()
def hello(name: str) -> str:
    return f"Hello, {name}!"

app = with_persistence(mcp, backend="sqlite", url="events.db", ttl=3600)
uvicorn.run(app, host="127.0.0.1", port=8000)

Connect any MCP client to http://localhost:8000/mcp. Kill and restart the server. The client resumes exactly where it left off.


GitHub: github.com/Ar-maan05/mcp-persist

PyPI: pypi.org/project/mcp-persist

If this is useful to you, a star helps a lot. And if you're building on MCP and hitting edge cases around resumability — open an issue or send a PR. The contributing guide has everything you need to get started.

