
xbill for Google Developer Experts

Posted on • Originally published at xbill999.Medium on

Building with the Google Cloud Run MCP Server for Multi-Agent A2A Deployment

Leveraging the Google Agent Development Kit (ADK) and the underlying Gemini LLM to build Multi-Agent Applications with A2A protocol support using the Python programming language. This A2A application was deployed to Google Cloud Run and managed with the official Google Cloud Run MCP server.

Aren’t There a Billion Python Agent Demos?

Yes, there are.

Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a multi-agent test bed for building, debugging, and deploying multi-agent applications.

So is this the real Slim Shady?

So what is different about this lab compared to all the others out there?

This is one of the first deep dives into a Multi-Agent application leveraging the advanced tooling of Gemini CLI. The starting point for the demo was an existing Codelab, which was updated and re-engineered with Gemini CLI.

The original Codelab is here:

Building a Multi-Agent System | Google Codelabs

Python Version Management

One of the downsides of Python's wide deployment has been managing language versions across platforms and keeping a supported version in place.

The pyenv tool enables deploying consistent versions of Python:

GitHub - pyenv/pyenv: Simple Python version management

As of this writing, the mainstream Python version is 3.13. To validate your current Python:

python --version
Python 3.13.13
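Beyond the shell check, a bootstrap script can enforce a minimum interpreter version programmatically. A minimal sketch (the 3.11 floor here is an illustrative assumption, not a project requirement):

```python
import sys

# Minimum version is an assumption for illustration, not a project requirement.
MIN_VERSION = (3, 11)

def check_python(min_version=MIN_VERSION):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version
```

A setup script can call this before installing dependencies and bail out early with a clear message instead of failing later with an obscure syntax error.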

Google Cloud Run

Google Cloud Run is a fully managed, serverless compute platform that enables you to run containerized applications and services without managing infrastructure. It automatically scales up or down, even to zero, based on traffic, allowing you to pay only for resources used, billed in 100-millisecond increments.
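As a back-of-the-envelope illustration of that granularity, per-request compute time rounds up to the nearest 100 ms billing increment:

```python
import math

def billable_ms(duration_ms, granularity_ms=100):
    """Round request time up to the nearest billing increment."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms

# e.g. a 230 ms request is billed as 300 ms of compute time
```

This is only a sketch of the rounding behavior; actual Cloud Run pricing also depends on CPU/memory allocation and per-request fees.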

More info is available here:

Cloud Run

Official Google MCP Servers

Google provides MCP servers for all the main components of GCP. The full details are here:

Google Cloud MCP servers overview | Google Cloud Documentation

More info is here:

Announcing official MCP support for Google services | Google Cloud Blog

Note that this MCP server exposes MCP tools for working with the underlying Cloud Run services managed by Google. It is *not* just using Cloud Run to deploy your own MCP services.

Gemini CLI

If it is not pre-installed, you can install the Gemini CLI to interact with the source files and provide real-time assistance:

npm install -g @google/gemini-cli

Testing the Gemini CLI Environment

Once you have all the tools and the correct Node.js version in place, you can test starting Gemini CLI. You will need to authenticate with an API key or your Google Account:

▝▜▄ Gemini CLI v0.33.1
    ▝▜▄
   ▗▟▀ Logged in with Google /auth
  ▝▀ Gemini Code Assist Standard /upgrade no sandbox (see /docs) /model Auto (Gemini 3) | 239.8 MB

Node Version Management

Gemini CLI needs a consistent, up-to-date version of Node.js. The nvm command can be used to get a standard Node environment:

GitHub - nvm-sh/nvm: Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions

Agent Development Kit

The Google Agent Development Kit (ADK) is an open-source, Python-based framework designed to streamline the creation, deployment, and orchestration of sophisticated, multi-agent AI systems. It treats agent development like software engineering, offering modularity, state management, and built-in tools (like Google Search) to build autonomous agents.

The ADK can be installed from here:

Agent Development Kit (ADK)

Agent Skills

Gemini CLI can be customized to work with ADK agents. Both an ADK documentation MCP server and specific Agent Skills are available.

More details are here:

Agent Development Kit (ADK)

To list the Agent Skills in Gemini CLI:

> /skills list

and to list the ADK documentation MCP server:

> /mcp list
Configured MCP servers:
🟢 adk-docs-mcp (from adk-docs-ext) - Ready (2 tools)
  Tools:
  - mcp_adk-docs-mcp_fetch_docs
  - mcp_adk-docs-mcp_list_doc_sources

Where do I start?

The strategy for starting multi-agent development is an incremental, step-by-step approach.

First, the basic development environment is set up with the required system variables and a working Gemini CLI configuration.

Then, the ADK multi-agent system is built, debugged, and tested locally. Finally, the entire solution is deployed to Google Cloud Run.

Setup the Basic Environment

At this point you should have a working Python environment and a working Gemini CLI installation. All of the relevant code examples and documentation are available on GitHub.

The next step is to clone the GitHub repository to your local environment:

cd ~
git clone https://github.com/xbill9/multi-agent

Then run init2.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

source init2.sh

If your session times out or you need to re-authenticate, you can run the set_env.sh script to reset your environment variables:

source set_env.sh

Variables like PROJECT_ID need to be set up for use in the various build scripts, so the set_env.sh script can be used to reset the environment if your session times out.
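A build script can guard against a timed-out session by failing fast when the expected variables are missing. A minimal sketch (PROJECT_ID matches the scripts above; GOOGLE_CLOUD_LOCATION is an assumed second variable for illustration):

```python
import os

# PROJECT_ID is used by the build scripts; GOOGLE_CLOUD_LOCATION is assumed here.
REQUIRED_VARS = ["PROJECT_ID", "GOOGLE_CLOUD_LOCATION"]

def missing_vars(env=os.environ):
    """Return the required variables that are unset or empty."""
    return [v for v in REQUIRED_VARS if not env.get(v)]

if missing_vars():
    print("Missing variables; run: source set_env.sh")
```

This mirrors what a shell `[ -z "$PROJECT_ID" ]` guard would do, but gives one consolidated error instead of failing on the first unset variable.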

Finally install the packages and dependencies:

make install

Verify The ADK Installation

To verify the setup, run the ADK CLI locally with the researcher agent:

xbill@penguin:~/multi-agent/agents$ adk run researcher
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
Log setup complete: /tmp/agents_log/agent.20260410_174725.log
To access latest log: tail -F /tmp/agents_log/agent.latest.log
{"asctime": "2026-04-10 17:47:25,496", "name": "root", "levelname": "INFO", "message": "Logging initialized for researcher", "filename": "logging_config.py", "lineno": 54, "service": "researcher", "log_level": "INFO"}
{"asctime": "2026-04-10 17:47:25,496", "name": "researcher.agent", "levelname": "INFO", "message": "Initialized researcher agent with model: gemini-2.5-flash", "filename": "agent.py", "lineno": 85}
{"asctime": "2026-04-10 17:47:25,497", "name": "google_adk.google.adk.cli.utils.envs", "levelname": "INFO", "message": "Loaded .env file for researcher at /home/xbill/multi-agent/agents/researcher/.env", "filename": "envs.py", "lineno": 83}
{"asctime": "2026-04-10 17:47:25,497", "name": "google_adk.google.adk.cli.utils.local_storage", "levelname": "INFO", "message": "Using per-agent session storage rooted at /home/xbill/multi-agent/agents", "filename": "local_storage.py", "lineno": 84}
{"asctime": "2026-04-10 17:47:25,497", "name": "google_adk.google.adk.cli.utils.local_storage", "levelname": "INFO", "message": "Using file artifact service at /home/xbill/multi-agent/agents/researcher/.adk/artifacts", "filename": "local_storage.py", "lineno": 110}
{"asctime": "2026-04-10 17:47:25,498", "name": "google_adk.google.adk.cli.utils.service_factory", "levelname": "INFO", "message": "Using in-memory memory service", "filename": "service_factory.py", "lineno": 266}
{"asctime": "2026-04-10 17:47:25,501", "name": "google_adk.google.adk.cli.utils.local_storage", "levelname": "INFO", "message": "Creating local session service at /home/xbill/multi-agent/agents/researcher/.adk/session.db", "filename": "local_storage.py", "lineno": 60}
Running agent researcher, type exit to exit.
[user]:

Test The ADK Web Interface

This tests the ADK agent interactions from a browser:

xbill@penguin:~/multi-agent/agents$ adk web --host 0.0.0.0
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
2026-04-10 17:49:11,850 - INFO - service_factory.py:266 - Using in-memory memory service
2026-04-10 17:49:11,850 - INFO - local_storage.py:84 - Using per-agent session storage rooted at /home/xbill/multi-agent/agents
2026-04-10 17:49:11,850 - INFO - local_storage.py:110 - Using file artifact service at /home/xbill/multi-agent/agents/.adk/artifacts
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/cli/fast_api.py:198: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super().__init__()
INFO: Started server process [16063]
INFO: Waiting for application startup.

Then use the web interface, either on the loopback address 127.0.0.1 or the catch-all address 0.0.0.0, depending on your environment:

Special note for Google Cloud Shell deployments: add a CORS allow_origins exemption to allow the ADK web interface to run:

adk web --host 0.0.0.0 --allow_origins 'regex:.*'
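The `regex:.*` exemption simply whitelists every origin. Conceptually (the Cloud Shell preview hostname below is a hypothetical example):

```python
import re

# The exemption `--allow_origins 'regex:.*'` whitelists every origin.
allow = re.compile(r".*")

# An illustrative Cloud Shell preview origin (hypothetical hostname):
origin = "https://8000-cs-12345678-default.cs-us-central1.cloudshell.dev"
assert allow.fullmatch(origin) is not None
```

A tighter pattern scoped to `.cloudshell.dev` origins would be safer for anything beyond a throwaway lab environment.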

Configure Cloud Run MCP Server

Basic setup instructions are available here:

Authenticate to Google and Google Cloud MCP servers | Google Cloud Documentation

A sample Cloud Run MCP server script is here:

source cloudrun-mcp.sh

This will enable the Cloud Run MCP server:

--- Setting up Cloud Run MCP for project: aisprint-491218 ---
Enabling Services...
Operation "operations/acat.p2-289270257791-6c5ec831-6f9a-4d44-b221-9ad9f3734f5e" finished successfully.


Testing the MCP server:

gemini
/mcp list

🟢 mcp_cloudrun - Ready (5 tools, 1 prompt)
  Tools:
  - mcp_cloudrun_deploy_service_from_archive
  - mcp_cloudrun_deploy_service_from_file_contents
  - mcp_cloudrun_deploy_service_from_image
  - mcp_cloudrun_get_service
  - mcp_cloudrun_list_services
  Prompts:
  - deploy

Multi-Agent Design

The multi-agent deployment consists of 5 agents:

  • Researcher
  • Judge
  • Orchestrator
  • Content Builder
  • Course Builder
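Conceptually, the orchestration is a research/judge loop followed by content building. A plain-Python sketch of the control flow (stubbed callables, not the ADK implementation):

```python
# Plain-Python sketch of the research -> judge -> build pipeline.
# The real system uses ADK agents communicating over A2A.

def run_pipeline(topic, research, judge, build, max_iterations=3):
    """Loop research/judge until the judge passes, then build content."""
    findings = None
    for _ in range(max_iterations):
        findings = research(topic)       # Researcher agent
        if judge(findings) == "pass":    # Judge agent gates quality
            break
    return build(findings)               # Content Builder agent
```

The Orchestrator plays the role of this function, and the Course Builder front end sits on top of the whole pipeline.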

This article provides a breakdown of the multi-agent architecture:

Multi-Agent A2A with the Agent Development Kit (ADK), Cloud Run, and Gemini CLI

Running/Testing/Debugging Locally

The main Makefile has been extended with extensive targets for managing the agents on the local development environment.

The key targets include:

xbill@penguin:~/multi-agent$ make help
Available commands:
  install - Install all dependencies for root, agents, and app
  start - Start all services locally (alias for start-local)
  stop - Stop all local services (alias for stop-local)
  run - Start all services locally (alias for start-local)
  local - Show local service URLs
  start-local - Start all local services in background
  stop-local - Stop all local processes
  test - Run all tests (pytest)
  test-researcher - Test the Researcher agent directly
  test-judge - Test the Judge agent directly
  test-orchestrator - Test the Orchestrator logic
  lint - Run linting checks (ruff)
  deploy - Deploy all services to Cloud Run
  destroy - Delete all Cloud Run services
  clean - Remove caches and logs

First, check for locally running agents:

xbill@penguin:~/multi-agent$ make status
Checking status of locally running agents and servers...
--- Network Status ---
No services listening on expected ports (8000-8004, 5173).
--- Process Status ---
No matching processes found.
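The `make status` target essentially probes the expected ports (8000-8004 and 5173). The same check can be sketched in a few lines of stdlib Python (an illustration, not the Makefile's actual implementation):

```python
import socket

# Ports checked by `make status`: backend, four agents, and the Vite frontend.
PORTS = [8000, 8001, 8002, 8003, 8004, 5173]

def listening(port, host="127.0.0.1", timeout=0.25):
    """Return True if something accepts TCP connections on the port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def status():
    """Map each expected port to whether a service is listening on it."""
    return {p: listening(p) for p in PORTS}
```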

Then all the agents can be started together:

xbill@penguin:~/multi-agent$ make start
Stopping any existing agent and server processes...
Starting all agents in background...
Waiting for sub-agents to start...
All agents started. Logs: researcher.log, judge.log, content_builder.log, orchestrator.log
Starting App Backend in background...
Starting Frontend dev server in background...
All services started. Logs: researcher.log, judge.log, content_builder.log, orchestrator.log, backend.log, frontend.log
Frontend: http://localhost:5173
Backend: http://localhost:8000

make status

--- Local Service URLs ---
Frontend: http://localhost:5173
Backend: http://localhost:8000 (main app)
Agents:
  Researcher: http://localhost:8001
  Judge: http://localhost:8002
  Content Builder: http://localhost:8003
  Orchestrator: http://localhost:8004

The entire project can be linted and tested as a unit:

> make lint
✦ I will run the linting checks again as requested.
✦ All 32 tests passed successfully.

Then connect to the front end locally, and the entire agent system will run in the local environment.

Local Logging / Debugging

Gemini CLI has full access to the local agent logs for debugging and troubleshooting:

✦ The local logs confirm a successful multi-agent run for the topic "the history of the internet." Here is a summary of the execution flow:

   * Orchestrator (orchestrator.log): Successfully managed the pipeline. It captured the topic, coordinated the Research-Judge loop, and escalated to
     content generation after the first iteration.
   * Researcher (researcher.log): Gathered approximately 9.2KB of research data using the google_search tool. It completed its task with a 200 OK
     response.
   * Judge (judge.log): Evaluated the findings and gave a "pass" status. The judge noted that the research was "comprehensive, accurate, and
     well-structured," making it an "excellent foundation for educational content."
   * Content Builder (content_builder.log): Successfully transformed the validated research into a structured Markdown course module and finished
     execution.
   * Backend & Frontend (backend.log, frontend.log): The FastAPI backend (port 8000) and Vite frontend (port 5173) are running and communicating with
     the orchestrator as expected.

  Notable Observation:
   * A minor warning appeared in researcher.log: Researcher callback: 'response' not found or has no parts. However, this did not impact the
     successful completion of the research or the overall pipeline.

  All services are currently active and logged at the DEBUG level as specified in run_local.sh.

Deploying to Google Cloud Run

The project-level Makefile has targets for managing the agent deployment to serverless endpoints:

xbill@penguin:~/multi-agent$ make help
Available commands:
  ...
  deploy - Deploy all services to Cloud Run
  destroy - Delete all Cloud Run services
  ...

A utility script checks the deployment to Cloud Run:

make status

--- Cloud Deployment Status ---
Checking deployment status for AI Course Creator services...
   SERVICE REGION URL LAST DEPLOYED BY LAST DEPLOYED AT
✔ content-builder us-central1 https://content-builder-1056842563084.us-central1.run.app xbill@glitnir.com 2026-04-10T20:18:40.053541Z
✔ course-creator us-central1 https://course-creator-1056842563084.us-central1.run.app xbill@glitnir.com 2026-04-10T20:19:39.704879Z
✔ judge us-central1 https://judge-1056842563084.us-central1.run.app xbill@glitnir.com 2026-04-10T20:18:40.417046Z
✔ orchestrator us-central1 https://orchestrator-1056842563084.us-central1.run.app xbill@glitnir.com 2026-04-10T20:19:01.850264Z
✔ researcher us-central1 https://researcher-1056842563084.us-central1.run.app xbill@glitnir.com 2026-04-10T20:18:38.584952Z

You can submit the build for cloud deployment:

xbill@penguin:~/multi-agent$ make deploy
Building all images using Cloud Build for project comglitn...
gcloud builds submit --project "comglitn" --config cloudbuild.yaml .

Once the containers are deployed, you can then get the endpoints:

./deploy.sh orchestrator
Using project comglitn.
Using compute region us-central1.
Deploying orchestrator...
Deploying container to Cloud Run service [orchestrator] in project [comglitn] region [us-central1]
✓ Deploying... Done.                                                                                                                             
  ✓ Creating Revision...                                                                                                                         
  ✓ Routing traffic...                                                                                                                           
  ✓ Setting IAM Policy...                                                                                                                        
Done.                                                                                                                                            
Service [orchestrator] revision [orchestrator-00002-9jg] has been deployed and is serving 100 percent of traffic.
Service URL: https://orchestrator-1056842563084.us-central1.run.app
make[1]: Leaving directory '/home/xbill/multi-agent'
Deploying course-creator app...
make[1]: Entering directory '/home/xbill/multi-agent'
./deploy.sh course-creator
Using project comglitn.
Using compute region us-central1.
Deploying course-creator...
Deploying container to Cloud Run service [course-creator] in project [comglitn] region [us-central1]
✓ Deploying... Done.                                                                                                                             
  ✓ Creating Revision...                                                                                                                         
  ✓ Routing traffic...                                                                                                                           
  ✓ Setting IAM Policy...                                                                                                                        
Done.                                                                                                                                            
Service [course-creator] revision [course-creator-00002-f74] has been deployed and is serving 100 percent of traffic.
Service URL: https://course-creator-1056842563084.us-central1.run.app
make[1]: Leaving directory '/home/xbill/multi-agent'


xbill@penguin:~/multi-agent$ make endpoint
Service URLs:
NAME URL
content-builder https://content-builder-fgasxpwzoq-uc.a.run.app
course-creator https://course-creator-fgasxpwzoq-uc.a.run.app
judge https://judge-fgasxpwzoq-uc.a.run.app
orchestrator https://orchestrator-fgasxpwzoq-uc.a.run.app
researcher https://researcher-fgasxpwzoq-uc.a.run.app

The service will be visible in the Cloud Run console:

Running the Web Interface

Start a connection to the Cloud Run deployed app:

https://course-creator-fgasxpwzoq-uc.a.run.app

Then connect to the app:

Then use the online course generator:

Google Cloud Run MCP Server with Gemini CLI

Once the entire agent system has been deployed, the Cloud Run MCP server can be used for visibility into the application directly from Gemini CLI:

🟢 mcp_cloudrun - Ready (5 tools, 1 prompt)

  Tools:
  - mcp_cloudrun_deploy_service_from_archive
  - mcp_cloudrun_deploy_service_from_file_contents
  - mcp_cloudrun_deploy_service_from_image
  - mcp_cloudrun_get_service
  - mcp_cloudrun_list_services
  Prompts:
  - deploy

The status can be checked:

✦ Using the mcp_cloudrun_list_services tool for project aisprint-491218 in us-central1, I've confirmed the following services are currently deployed:

   - course-creator: https://course-creator-wgcq55zbfq-uc.a.run.app
   - orchestrator: https://orchestrator-wgcq55zbfq-uc.a.run.app
   - judge: https://judge-wgcq55zbfq-uc.a.run.app
   - researcher: https://researcher-wgcq55zbfq-uc.a.run.app
   - content-builder: https://content-builder-wgcq55zbfq-uc.a.run.app


and in-depth service status:

> use the MCP call mcp_cloudrun_get_service course-creator
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Action Required │
│ │
│ ? get_service (mcp_cloudrun MCP Server) {"project":"aisprint-491218","region":"us-central1","name":"course-creator"} │
│ │
│ MCP Server: mcp_cloudrun │
│ Tool: get_service │
│ │
│ MCP Tool Details: │
│ (press Ctrl+O to expand MCP tool details) │
│ Allow execution of MCP tool "get_service" from server "mcp_cloudrun"? │
│ │
│ 1. Allow once │
│ 2. Allow tool for this session │
│ 3. Allow all server tools for this session │
│ ● 4. Allow tool for all future sessions │
│ 5. No, suggest changes (esc) │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

✦ The details for the course-creator service in project aisprint-491218 (region us-central1) are as follows:

   - URI: https://course-creator-wgcq55zbfq-uc.a.run.app
   - Status: Ready (Succeeded)
   - Image: gcr.io/aisprint-491218/course-creator
   - Environment Variables:
     - AGENT_SERVER_URL: https://orchestrator-wgcq55zbfq-uc.a.run.app
     - AGENT_NAME: orchestrator
     - GOOGLE_CLOUD_PROJECT: aisprint-491218
   - Port: 8080 (http1)
   - Last Updated: 2026-04-12T17:03:11.304251Z
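The AGENT_SERVER_URL variable shown above is how the course-creator front end finds the orchestrator. A hedged sketch of such a lookup, with the local port 8004 fallback taken from the local service URLs earlier (the function name is hypothetical, not the project's actual code):

```python
import os

def orchestrator_url(env=os.environ):
    """Resolve the orchestrator endpoint the front-end app talks to."""
    # AGENT_SERVER_URL is set on the Cloud Run service (see get_service output);
    # fall back to the local orchestrator port used by `make start`.
    return env.get("AGENT_SERVER_URL", "http://localhost:8004")
```

Keeping the endpoint in an environment variable is what lets the same container image run unchanged both locally and on Cloud Run.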

Final Gemini CLI Code Review

As a final step — Gemini CLI was used for a full code review of the project:

✦ This multi-agent project is a well-engineered application of ADK 2.5 and the A2A protocol. The separation of specialized agents (Researcher,
  Judge, Content Builder) coordinated by a central Orchestrator demonstrates a mature microservice-oriented design.

Key Strengths
   * Coordinated Orchestration: The use of SequentialAgent and LoopAgent creates a robust, iterative research-judge cycle.
   * Resilient State Flow: The strategy of passing data through event content and "recovering" it via before_agent_callback heuristics is a clever
     way to handle state across distributed, independent session stores.
   * Polished Streaming: The web app's SSE implementation, specifically the greedy overlap deduplication (merge_strings) and system message
     cleanup, ensures a high-quality user experience despite the inherent noise in multi-agent LLM streams.
   * Cloud-Native Readiness: Using Identity Tokens for authenticated service-to-service communication and middleware for dynamic A2A URL rewriting
     makes the system ready for production deployment on Cloud Run.
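The "greedy overlap deduplication" called out in the review can be illustrated in a few lines. This is a sketch of the technique only, not the project's actual merge_strings implementation:

```python
def merge_strings(a, b):
    """Append b to a, dropping the longest prefix of b that a already ends with."""
    # Try the largest possible overlap first (greedy), shrinking until a match.
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return a + b[k:]
    return a + b

# Useful when consecutive SSE chunks repeat the tail of the previous chunk:
# merge_strings("hello wor", "world!") -> "hello world!"
```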

Summary

The Agent Development Kit (ADK) was used to build a multi-agent system with A2A support using the Gemini Flash LLM Model. This application was tested locally with Gemini CLI and then deployed to Google Cloud Run. The Official Google Cloud MCP server was used to review the deployment directly from Gemini CLI. Finally, Gemini CLI was used for a complete project code review.
