Raghava Chellu

Bringing Async MCP to Google Cloud Run — Introducing cloudrun-mcp

When you design distributed AI or agentic workloads on Google Cloud Run, you often juggle three recurring problems:

  • How to authenticate workloads securely
  • How to maintain long-lived, event-driven sessions
  • How to stream model context data efficiently without blocking threads

cloudrun-mcp solves all three in one lightweight Python SDK.

What is MCP (Model Context Protocol)?

MCP (Model Context Protocol) is an emerging open standard for exchanging context between AI models, tools, and environments.

Think of it as “WebSockets for AI knowledge.”

Instead of hardcoding API calls, your model connects to an MCP server and streams structured events such as:

  • context.create
  • document.attach
  • agent.reply

For developers deploying AI agents on Cloud Run, GKE, or hybrid workloads, an async client is essential for scalability.

Introducing cloudrun-mcp

Async MCP (Model Context Protocol) client for Cloud Run.

Built by Raghava Chellu (February 2026), cloudrun-mcp brings:

  • First-class async streaming
  • Automatic Cloud Run authentication
  • Agentic-AI-friendly APIs

to your production workloads.

How It Works

Under the hood:

  • The client uses aiohttp to maintain an HTTP/1.1 keep-alive streaming session.
  • Inside Cloud Run, it queries the metadata service to obtain a signed JWT:
http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=<your-audience>
  • Each event from the MCP server arrives as a Server-Sent Event (SSE).
  • The SDK yields events as a Python async iterator, ready for real-time AI reasoning loops.
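That last step, turning a stream of SSE lines into a Python async iterator, can be sketched in a few lines. This is a simplified, hypothetical illustration (the `sse_events` and `fake_stream` names are mine, not the SDK's internals); in production the line source would be an aiohttp streaming response body rather than a stub:

```python
import asyncio
import json

async def sse_events(lines):
    """Parse raw SSE lines into event dicts.

    `lines` is any async iterable of text lines, e.g. the body of an
    aiohttp streaming response. Only `data:` fields are handled here;
    a full parser would also track `event:` and `id:` fields.
    """
    async for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):])

async def fake_stream():
    # Stand-in for an aiohttp response body iterator.
    for raw in ['data: {"event": "context.create", "status": "ok"}',
                "",  # SSE events are separated by blank lines
                'data: {"event": "model.done"}']:
        yield raw

async def collect():
    return [event async for event in sse_events(fake_stream())]

print(asyncio.run(collect()))
```

Because `sse_events` is itself an async generator, downstream code consumes it with plain `async for`, exactly like `client.events()` in the usage example below.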

Installation

pip install cloudrun-mcp

Requirements

  • Python ≥ 3.10
  • Deployed on GCP (Cloud Run / GKE / GCE) with metadata-server access

Usage Example

import asyncio
from cloudrun_mcp import MCPClient

async def main():
    client = MCPClient(base_url="https://your-mcp-server.run.app")

    async for event in client.events():
        print(event)

asyncio.run(main())

Typical Output Stream

{"event":"context.create","status":"ok"}
{"event":"model.response","content":"42"}
{"event":"model.done"}

That’s it — you’ve connected an async agent running on Cloud Run to an MCP backend and are receiving real-time context updates.
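Since each line in that stream is a standalone JSON object, a consumer can route events by their `event` field. Here is a minimal dispatch sketch; the `dispatch` helper and handler table are illustrative, not part of the SDK:

```python
import json

def dispatch(raw_line, handlers):
    """Route one JSON event line to a handler keyed by its `event` field."""
    event = json.loads(raw_line)
    handler = handlers.get(event["event"], lambda e: None)  # ignore unknown events
    return handler(event)

handlers = {
    "model.response": lambda e: e["content"],
    "model.done": lambda e: "done",
}

print(dispatch('{"event":"model.response","content":"42"}', handlers))  # 42
```

A real agent would register coroutines instead of lambdas, but the routing idea is the same.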

Why Async MCP Matters

AI workloads are evolving from simple request-response APIs to long-running reasoning graphs.

Synchronous I/O becomes a bottleneck.

cloudrun-mcp leverages Python’s asyncio to keep event loops responsive across:

  • Streaming token generation
  • Function-calling orchestration
  • Multi-model chains

It’s especially powerful for Agentic AI, where orchestrators consume continuous model context (tool outputs, planning updates, memory events).
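The "responsive event loop" claim is easy to see in miniature: because every `await` yields control, several streams share one thread instead of blocking each other. The snippet below is a generic asyncio illustration (the `stream_tokens` name and the planner/tools labels are invented for the example):

```python
import asyncio

async def stream_tokens(name, tokens, log):
    # Each coroutine yields at every await, so multiple streams
    # interleave cooperatively on a single thread.
    for tok in tokens:
        await asyncio.sleep(0)  # stand-in for awaiting network I/O
        log.append(f"{name}:{tok}")

async def main():
    log = []
    await asyncio.gather(
        stream_tokens("planner", ["step1", "step2"], log),
        stream_tokens("tools", ["call", "result"], log),
    )
    return log

print(asyncio.run(main()))
```

With synchronous I/O, the "tools" stream could not make progress until "planner" finished; under asyncio the two interleave, which is what keeps long reasoning loops responsive.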

Authentication Deep Dive

The SDK automatically:

  • Discovers the metadata endpoint
  • Retrieves an ID token targeting your MCP server
  • Injects it into request headers
Authorization: Bearer <token>
  • Refreshes tokens every ~55 minutes

No OAuth flows.
No key.json files.
Perfect for production micro-agents.
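The metadata endpoint is only reachable from inside GCP, but the refresh logic itself is straightforward to sketch. The `TokenCache` below is a hypothetical illustration of the caching pattern described above, with the HTTP fetch injected so it runs anywhere; it is not the SDK's actual code. In production, `fetch` would GET the metadata identity URL with the `Metadata-Flavor: Google` header:

```python
import time

REFRESH_INTERVAL = 55 * 60  # refresh ~5 minutes before the 1-hour expiry

class TokenCache:
    """Cache an ID token and re-fetch it every ~55 minutes.

    `fetch` is any callable returning a fresh token string;
    `clock` is injectable so the refresh logic is testable.
    """
    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch
        self._clock = clock
        self._token = None
        self._fetched_at = None

    def token(self):
        now = self._clock()
        if self._token is None or now - self._fetched_at >= REFRESH_INTERVAL:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token

    def auth_header(self):
        return {"Authorization": f"Bearer {self.token()}"}
```

Injecting the clock and the fetcher keeps the 55-minute policy in one place and lets the refresh behavior be verified without a metadata server.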

Streaming with Back-Pressure Control

async for event in client.events(buffer=32):
    await handle_event(event)
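The `buffer` argument presumably bounds an internal event queue. The standard-library way to get that behavior is a bounded `asyncio.Queue`, where a full buffer makes the producer wait until the consumer catches up. A self-contained sketch under that assumption (the producer/consumer names are illustrative):

```python
import asyncio

async def producer(queue, n):
    # With maxsize set, put() suspends once the buffer is full,
    # so a slow consumer throttles the producer: back-pressure.
    for i in range(n):
        await queue.put(i)
    await queue.put(None)  # sentinel: stream finished

async def slow_consumer(queue):
    seen = []
    while (item := await queue.get()) is not None:
        await asyncio.sleep(0.001)  # simulate slow event handling
        seen.append(item)
    return seen

async def main():
    queue = asyncio.Queue(maxsize=32)  # mirrors events(buffer=32)
    _, seen = await asyncio.gather(producer(queue, 100), slow_consumer(queue))
    return seen

print(len(asyncio.run(main())))
```

Without the `maxsize` bound, a fast MCP server could grow the queue without limit; with it, memory use stays capped at roughly `buffer` undelivered events.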

Typical Deployment Pattern

[MCP Clients] <--SSE--> [cloudrun-mcp SDK] <--Auth--> [Cloud Run Service]
         \
          ↳ [Agent Processors / Vector DB / PubSub Pipelines]

cloudrun-mcp acts as the async bridge between Cloud identity and AI reasoning streams.

Real-World Use Cases

🔹 Event-Driven AI Agents

Agents listening to MCP streams and triggering workflows automatically.

🔹 LLM Orchestration Pipelines

Streaming intermediate reasoning steps to dashboards.

🔹 IoT Telemetry Ingestion

Continuous SSE device streams pushed to Pub/Sub.

🔹 Hybrid Edge Inference

Bridge local MCP hubs with Cloud Run decision services.

Design Philosophy

The SDK follows three principles:

  • Async First — built entirely on asyncio
  • Zero Secrets — uses Workload Identity exclusively
  • Agentic Friendly — integrates with frameworks like LangChain or CrewAI
