DEV Community: xbill

Meet 'Devto-Blogger': The Hermes Agent Skill That Automatically Writes Your Technical Blog Posts

xbill — Fri, 29 May 2026 15:52:00 +0000

If you are an open-source maintainer, developer advocate, or builder, you know the cycle: you build an amazing tool, but writing the launch blog post, documentation, or tutorial takes hours.

For the Hermes Agent Challenge, I wanted to build something that solves this exact problem. I built devto-blogger, a custom, prompt-driven skill for the brand new Hermes Agent by Nous Research.

It autonomously scans any workspace or codebase, analyzes the architecture, and drafts a fully-structured, rich Markdown technical article ready for publication on DEV.

🚀 What is Hermes Agent?

Hermes Agent is an open-source agentic system built by Nous Research (the lab behind the famous Hermes LLM models). Unlike basic coding copilots or simple chatbot wrappers, Hermes is:

Environment-Aware: It runs sandboxed in Docker, Modal, Daytona, SSH, or locally.
Connected: It interfaces with Telegram, Discord, Slack, WhatsApp, and more.
Closed Learning Loop: It has persistent memory and creates custom skills on the fly from its own experience.

🔧 The Entry: The `devto-blogger` Skill

In Hermes Agent, a "skill" is defined by a simple, declarative Markdown file (SKILL.md) located in the ~/.hermes/skills/ directory. By utilizing a prompt-driven skill structure, we can guide the agent's behavior globally without writing complex Python orchestration scripts.

Here is the custom skill I designed and installed for this challenge:

---
name: devto-blogger
description: "Scan the codebase and generate a comprehensive Dev.to technical blog post draft."
version: 1.0.0
author: Hermes Agent Developer
license: MIT
platforms: [linux, macos, windows]
metadata:
  hermes:
    tags: [devto, blogging, documentation, markdown, technical-writing]
    related_skills: [plan, design-md]
---

# Dev.to Technical Blogger Skill

Use this skill when you need to write an in-depth technical post, review, or tutorial about the active workspace or codebase.

## Core Behavior
1. **Codebase Inspection**: Scan repository structure, configuration, and dependencies.
2. **Drafting Strategy**: Write a high-quality technical article suitable for DEV.
3. **Content Structure**: Outline introduction, the problem, technical walkthrough, and implementation code snippets.
4. **Output File**: Save the draft under `drafts/devto-submission.md`.

💡 How It Works Under the Hood

When I run Hermes in my workspace and instruct it to write a post:

hermes -z "Generate a blog post about our new project"

Activation: Hermes detects that the task matches the devto-blogger skill signature.
Analysis: The agent uses its built-in codebase tools to list directories, view file structures, and parse key files (like package.json, requirements.txt, or config files).
Drafting: It synthesizes the information and writes a custom, highly engaging Dev.to markdown draft directly to the /drafts folder.

🏆 Why This Matters for the Open Source Community

Writing documentation and developer outreach content is often the most neglected part of software development. By leveraging Hermes Agent:

No More Cold Starts: Get a comprehensive first draft in seconds.
Accurate Code Snippets: The agent reads the exact code from your workspace, ensuring snippets are correct.
Zero Effort Context: Because Hermes is fully sandboxed, it acts directly on your working environment.

🛠️ Try It Yourself!

To install Hermes Agent and configure your own custom skills:

# Install Hermes Agent
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash

# Run setup
hermes setup

Place the SKILL.md template above in ~/.hermes/skills/creative/devto-blogger/SKILL.md and start automating your developer outreach!

Built with love using Hermes Agent during the 2026 Hermes Agent Challenge.

Gemma 4 Deployment with NVIDIA L4, MCP, Cloud Run, and Antigravity CLI

xbill — Thu, 28 May 2026 20:13:09 +0000

This article provides a step by step deployment guide for Gemma 4 to a Google Cloud Run hosted GPU enabled system,. A suite of Python MCP tools is built to simplify management of the vLLM hosted Gemma 4 deployment with Antigravity CLI.

What is this project trying to Do?

This project is a DevOps/SRE assistant that uses a Gemma 4 model hosted on Cloud Run with GPU. It provides tools to provision the Docker container and deploy the model, as well as for observability and performance testing.

This project is similar to a previous project that targeted GPU hosted Gemma4 instances on GCP:

Gemma-SRE: Self-Hosted vLLM Infrastructure Agent

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

Full Installation Instructions

The detailed installation instructions for Antigravity CLI are here:

Getting Started with Antigravity CLI

Python MCP Documentation

The official GitHub Repo provides samples and documentation for getting started:

GitHub - modelcontextprotocol/python-sdk: The official Python SDK for Model Context Protocol servers and clients

Where do I start?

The strategy for starting MCP development for model management is a incremental step by step approach.

First, the basic development environment is setup with the required system variables, and a working Antigravity CLI configuration.

Then, a minimal Python MCP Server is built with stdio transport. This server is validated with Antigravity CLI in the local environment.

This setup validates the connection from Antigravity CLI to the local server via MCP. The MCP client (Antigravity CLI) and the Python MCP server both run in the same local environment.

Setup the Basic Environment

At this point you should have a working Python environment and a working Antigravity CLI installation. The next step is to clone the GitHub samples repository with support scripts:

cd ~
git clone https://github.com/xbill9/gemma4-tips

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

gpu-vllm-devops-agent
source init.sh

If your session times out or you need to re-authenticate- you can run the set_env.sh script to reset your environment variables:

gpu-vllm-devops-agent
source set_env.sh

Variables like PROJECT_ID need to be setup for use in the various build scripts- so the set_env script can be used to reset the environment if you time-out.

Model Management Tool with MCP Stdio Transport

One of the key features that the standard MCP libraries provide is abstracting various transport methods.

The high level MCP tool implementation is the same no matter what low level transport channel/method that the MCP Client uses to connect to a MCP Server.

The simplest transport that the SDK supports is the stdio (stdio/stdout) transport — which connects a locally running process. Both the MCP client and MCP Server must be running in the same environment.

The connection over stdio will look similar to this:

# Initialize FastMCP server
mcp = FastMCP("Self-Hosted vLLM DevOps Agent")

Running the Python Code

First- switch the directory with the Python version of the MCP sample code:

~/gemma4-tips/gpu-vllm-devops-agent

Run the release version on the local system:

make install
Processing ./.

The project can also be linted:

xbill@penguin:~/gemma4-tips/gpu-vllm-devops-agent$ make lint
ruff check .
All checks passed!
ruff format --check .
6 files already formatted
mypy .
Success: no issues found in 6 source files

And a test run:

xbill@penguin:~/gemma4-tips/gpu-vllm-devops-agent$ make test
python test_agent.py
2026-05-28 12:34:44,071 - vllm-devops-agent - INFO - Initializing DevOps Agent MCP Server...
.........2026-05-28 12:34:44,110 - vllm-devops-agent - INFO - Querying Cloud Run model with prompt: 'Hello...'
2026-05-28 12:34:44,110 - vllm-devops-agent - INFO - Model response: 'Response from Gemma...'
.2026-05-28 12:34:44,112 - vllm-devops-agent - INFO - Querying model with stats with prompt: 'Hello...'
2026-05-28 12:34:44,112 - vllm-devops-agent - INFO - Model response with stats: TTFT=0.000s, TotalTime=0.000s
.......
----------------------------------------------------------------------
Ran 17 tests in 0.023s

OK

MCP stdio Transport

One of the key features that the MCP protocol provides is abstracting various transport methods.

The high level tool MCP implementation is the same no matter what low level transport channel/method that the MCP Client uses to connect to a MCP Server.

In this project Antigravity CLI is used as the MCP client to interact with the Python MCP server code.

Antigravity CLI mcp_config.json

A sample MCP server file is provided in the .agents directory:

{
  "mcpServers": {
    "gpu-devops-agent": {
      "command": "python3",
      "args": [
        "/home/xbill/gemma4-tips/gpu-vllm-devops-agent/server.py"
      ],
      "env": {
        "GOOGLE_CLOUD_PROJECT": "aisprint-491218",
        "GOOGLE_CLOUD_LOCATION": "us-east4",
        "VLLM_BASE_URL": "https://vllm-gemma-4-e4b-it-289270257791.us-east4.run.app",
        "MODEL_NAME": "/mnt/models/gemma-4-E4B-it"
      }
    }
  }
}

Validation with Antigravity CLI

The final connection test uses Antigravity CLI as a MCP client with the Python code providing the MCP server:

MCP Servers

Configured (~/.gemini/antigravity-cli/mcp_config.json)
> ✓ developer-knowledge Tools: search_documents, answer_query, get_documents

Plugins (~/.gemini/antigravity-cli/plugins)
   ✓ google-dev-knowledge Tools: search_documents, answer_query, get_documents
   ✓ gpu-devops-agent Tools: save_hf_token, get_vllm_endpoint, list_vertex_models, list_bucket_models,
                       analyze_cloud_logging, +21 more

Getting Started with Gemma 4 on GPU

Cloud Run GPU is tied to quota limits:

> check cloud run quota in us-east4

This will return the current quotas:

  Presenting Quotas Clearly
  Here are the Google Cloud Run quotas for project aisprint-491218 in the us-east4 region:

  ### 📋 Core Resource Quotas

  • Services ( ServicesPerProject ): 1,000 limit per region.
  • Jobs ( JobsPerProject ): 1,000 limit per region.
  • Worker Pools ( WorkerPoolsPerProject ): 1,000 limit per region.
  • Active Revisions ( ActiveRevisionsPerProject ): 4,000 limit.
  • Running Executions ( RunningExecutionsPerProject ): 1,000 limit.
  • Tags ( TagsPerProject ): 2,000 limit.

  ### ⚡ Compute & Storage Allocations

  • Total CPU Allocation ( CpuAllocPerProjectRegion ): 5,000,000 milli vCPUs (equivalent to 5,000 vCPUs).
  • Total Memory Allocation ( MemAllocPerProjectRegion ): 10,737,418,240,000 bytes (equivalent to 10 TiB).
  • Max Per-Instance Ephemeral Disk ( MaxPerInstanceEphemeralDiskSizePerProjectRegion ): 10,737,418,240 bytes 
  (equivalent to 10 GiB).
  • Direct VPC Egress Instance Limit ( MaxInstancesLimitWithDirectVpcEgressPerProjectRegion ): 100 instances.

  ### 🎮 GPU Quotas

  • Nvidia L4 GPUs without zonal redundancy ( NvidiaL4GpuAllocNoZonalRedundancyPerProjectRegion ): 3 GPUs.
  • Nvidia L4 GPUs with zonal redundancy ( NvidiaL4GpuAllocPerProjectRegion ): No specific limits applied.
  • Nvidia RTX Pro 6000 GPUs ( NvidiaRtxPro6000GpuAllocNoZonalRedundancyPerProjectRegion ): No specific limits applied.

  ### ⏱️ API Rate Limits (per minute)

  • Read Requests ( ReadPerMinutePerProjectRegion ): 3,000 requests/min.
  • Write Requests ( WritePerMinutePerProjectRegion ): 180 requests/min.
  • Job Run Requests ( JobRunPerMinutePerProjectRegion ): 180 requests/min.
  • Inlined Source Deployments ( InlinedSourceDeployPerMinutePerProjectRegion ): 5 requests/min.
  • Source Deploy Bytes ( SourceDeployBytesPerMinutePerProjectRegion ): 2,000,000,000 bytes (equivalent to 2 GB /
  min).

You can also use the Antigravity CLI to submit a quota increase:

> request RTX Pro 6000 GPUs cloud run quota of 3 in us-east4
● google-dev-knowledge/search_documents(Quota increase command search)
● Read(/home/xbill/.gemini/antigravity-cli/brain/5a6...4e9cc0a/.system_generated/steps/32/output.txt) (ctrl+o to expand)

▸ Thought for 1s, 511 tokens
  Clarifying The Scope

● Search(Searching for L4 GPU metric)
● Read(/home/xbill/.gemini/antigravity-cli/brain/5a65b024-1b52-4d0b-b816-652984e9cc0a/scratch/run_quotas.yaml)
● Search(Grep search for RTX Pro 6000)
● Read(/home/xbill/.gemini/antigravity-cli/brain/5a65b024-1b52-4d0b-b816-652984e9cc0a/scratch/run_quotas.yaml)
● Bash(gcloud beta quotas preferences create --service=run.googleapis.com --project=aisprint-491218 --quota-
id=NvidiaRtx...)
● Bash(gcloud beta quotas preferences create --service=run.googleapis.com --project=aisprint-491218 -...) (ctrl+o to
expand)

  I have submitted requests to increase both RTX Pro 6000 GPU quotas to 3 GPUs (represented as 3000 milli GPUs in the
  API) in us-east4 for project aisprint-491218 .

  Here are the details of the submitted quota preferences:

  ### 1. With Zonal Redundancy

  • Quota ID: NvidiaRtxPro6000GpuAllocPerProjectRegion 
  • Service: run.googleapis.com 
  • Region: us-east4 
  • Requested Limit: 3000 milli GPUs (3 GPUs)
  • Preference Name: projects/aisprint-491218/locations/global/quotaPreferences/bdf815c3-a947-4069-91ae-8b6c6c0dd345 
  • Status: Submitted (reconciling)

  ### 2. Without Zonal Redundancy

  • Quota ID: NvidiaRtxPro6000GpuAllocNoZonalRedundancyPerProjectRegion 
  • Service: run.googleapis.com 
  • Region: us-east4 
  • Requested Limit: 3000 milli GPUs (3 GPUs)
  • Preference Name: projects/aisprint-491218/locations/global/quotaPreferences/2f6ccda1-76a5-4867-8f89-78948cd725b8 
  • Status: Submitted (reconciling)

The Official vLLM repo also has Gemma4 specific information:

Releases · vllm-project/vllm

Model Lifecycle Management via MCP

The MCP tools provide a complete suite of agent-oriented operations for managing vLLM deployment on Cloud Run or a TPU.

Overview of MCP tools :

 ### ⚙️ Active Environment Configuration

  • Project ID: aisprint-491218 
  • Region: us-east4 
  • Model Name: /mnt/models/gemma-4-E4B-it 
  • Active Serving Endpoint: Cloud Run Service https://vllm-gemma-4-e4b-it-289270257791.us-east4.run.app
  • GCS Storage Bucket: aisprint-491218-bucket 
  ──────
  ### 🧰 Summary of Available MCP Tools

  #### 🐳 Infrastructure & Deployment

  • deploy_vllm : Deploys vLLM to Cloud Run GPU (NVIDIA L4 in us-east4 ).
  • destroy_vllm : Deletes the Cloud Run vLLM service.
  • status_vllm : Checks the status of the Cloud Run vLLM service.
  • update_vllm_scaling : Configures min/max instances.
  • get_vllm_deployment_config / get_vllm_gpu_deployment_config : Outputs deployment configs for Cloud Run and GKE
  (NVIDIA L4).
  • check_gpu_quotas : Checks GPU resource availability and quotas.

  #### 📦 Model Management

  • list_vertex_models / list_bucket_models : Lists available model registries and GCS weight locations.
  • save_hf_token : Secures your Hugging Face API token in Secret Manager.
  • get_vertex_ai_model_copy_instructions / get_huggingface_model_copy_instructions : Guide commands to ingest
  models.

  #### 📊 Monitoring, Diagnostics & Benchmarks

  • get_system_status : Pulls high-level dashboard metrics and active settings.
  • verify_model_health : Verifies latency and connectivity.
  • run_benchmark : Performs concurrency sweeps and latency profiling.
  • analyze_cloud_logging / analyze_gpu_logs : Diagnoses SRE errors and parses system logs using the self-hosted
  model.
  • suggest_sre_remediation : Automatically generates remediation steps for detected errors.

Deploy The Model

The MCP server tools are used to deploy the model to Cloud Run with TPU:

deploy the model

  The container logs show that the vLLM engine initialized correctly and began loading the model checkpoint files from
  the GCS FUSE mount ( /mnt/models/gemma-4-E4B-it ) at 18:25:53 .

  Since the weights are read on-demand over the network via FUSE and the engine has to compile CUDA graphs, the
  initialization phase takes several minutes before the startup health probe gets a 200 OK at /health . I've set
  another 90-second status timer and will monitor its progress.

Checking System status

The status can be checked with an MCP tool:

> get_system_status

● gpu-devops-agent/get_system_status(System status check) (ctrl+o to expand)

  The system status is fully online:

  ### 🌀 GPU Cloud Run System Status

  • vLLM Health: 🟢 Online (Endpoint https://vllm-gemma-4-e4b-it-289270257791.us-east4.run.app)
  • Cloud Run Service Status: 🟢 Ready

Cross Check The Deployed Model

Once the model starts locally- the MCP tool allows for model verification:

> verify_model_health

● gpu-devops-agent/verify_model_health(Model health check) (ctrl+o to expand)

  The model health check has successfully passed:

  ### 🏥 Model Health Verification

  • Status: ✅ PASSED
  • Model Path: /mnt/models/gemma-4-E4B-it 
  • Latency: 1.35 seconds 
  • Response Snippet: "Hello! Yes, I am working. I am Gemma 4, a large la..."

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

and model details:

> get_active_models

● gpu-devops-agent/get_model_details(Model details retrieval) (ctrl+o to expand)

  The active model details from the running vLLM service are as follows:

  ### 🧩 Active Model details

  • Endpoint URL: https://vllm-gemma-4-e4b-it-289270257791.us-east4.run.app 
  • Active Model ID: /mnt/models/gemma-4-E4B-it 
  • Health Status: Healthy ✅

and the endpoint:

> get_endpoint

● gpu-devops-agent/get_endpoint(Endpoint retrieval) (ctrl+o to expand)

  The active endpoint URL for the Cloud Run vLLM deployment is:

  • Endpoint URL: https://vllm-gemma-4-e4b-it-289270257791.us-east4.run.app (🟢 Online)

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

and status:

> get_system_details

● Bash(gcloud run services describe vllm-gemma-4-e4b-it --region us-east4 --project aisprint-491218 -...) (ctrl+o to
expand)

  Here are the detailed configuration and system specs of the active Cloud Run vLLM deployment:

  ### ⚙️ Hardware & Compute Resources

  • Accelerator: NVIDIA L4 GPU (1 unit)
  • CPU Limit: 8 vCPUs
  • Memory Limit: 32 GiB 
  • Execution Environment: Gen2
  • Storage Mount: aisprint-491218-bucket (GCS Bucket mounted at /mnt/models via GCS Fused volume driver, read-
  only)

  ### 🚀 vLLM Engine Configuration

  • Container Image: vllm/vllm-openai:latest 
  • Entrypoint: python3 -m vllm.entrypoints.openai.api_server 
  • Startup Arguments:
      • --model=/mnt/models/gemma-4-E4B-it 
      • --dtype=bfloat16 
      • --max-model-len=16384 
      • --gpu-memory-utilization=0.95 
      • --kv-cache-dtype=fp8 
      • --limit-mm-per-prompt={} 
      • --tensor-parallel-size=1 
      • --max-num-seqs=8 
      • --max-num-batched-tokens=4096 
      • --enable-auto-tool-choice 
      • --tool-call-parser=gemma4 
      • --reasoning-parser=gemma4 

  ### 📈 Network & Autoscaling

  • Container Concurrency: 4 simultaneous requests
  • Autoscaling Target: Max Revision Scale = 1 
  • Startup Probe: HTTP GET /health on port 8000 (Initial delay: 180s , Timeout: 60s )

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Check the Cloud Run deployment status:

> show the cloud run docker status

● Bash(gcloud run services describe vllm-gemma-4-e4b-it --region us-east4 --project aisprint-491218 -...) (ctrl+o to
expand)

  In Cloud Run, the "docker container status" is tracked via its Service Conditions and Revisions.

  Here is the status of the container deployment:

  ### 📦 Container Service Status

  • Service URL: https://vllm-gemma-4-e4b-it-wgcq55zbfq-uk.a.run.app 
  • Active Container Image: vllm/vllm-openai:latest 
  • Latest Revision: vllm-gemma-4-e4b-it-00023-n4c (healthy and serving 100% of traffic)

  ### 🏥 System Status Conditions

   Condition Type | Status | Last Transition Time | Description
  -----------------------|-----------|------------------------|--------------------------------------------------------
    Ready | 🟢 True | 2026-05-28T18:32:41Z | The container is healthy, initialized, and accepting
                         | | | incoming requests.
    ConfigurationsReady | 🟢 True | 2026-05-28T18:23:35Z | Service parameters, environment variables, volumes,
                         | | | and hardware specs are valid.
    RoutesReady | 🟢 True | 2026-05-28T18:32:41Z | Traffic routing configuration and network load
                         | | | balancing are ready.

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Check the Cloud Run Service

Once Deployed — the model is visible as a standard Cloud Run Service:

Review the Model

Check the model details:

> what are the model details?

● gpu-devops-agent/get_model_details(Model details retrieval) (ctrl+o to expand)

  The deployed model details are as follows:

  ### 🧩 Active Serving Model

  • Model ID / Name: /mnt/models/gemma-4-E4B-it 
  • Object Type: model (OpenAI-compatible /v1/models format)
  • Serving Platform: vLLM 
  • Health Status: Healthy ✅

  ### ℹ️ About Gemma 4 (E4B-it)

  • Model Family: Gemma 4 (Google's lightweight, state-of-the-art open models built from the same research and
  technology used to create the Gemini models).
  • Variant: E4B-it (Experimental ~4 Billion parameter Instruction-Tuned model).
  • Optimization: Configured to run in bfloat16 precision with FP8 KV caching to optimize memory usage and processing
  speed on the NVIDIA L4 GPU.
  • Context Window: Configured for up to 16,384 tokens.

Once everything is online — you can send queries to the Model:

> query_gemma4_with_stats what is gemma4?

● gpu-devops-agent/query_gemma4_with_stats(Query Gemma 4) (ctrl+o to expand)

  Here is the response from the model, along with performance statistics:

  ### 💬 Model Response

  │ I am Gemma 4. I am a Large Language Model developed by Google DeepMind. I am an open weights model.

  ### 📊 Serving Performance Stats

  • Model ID: /mnt/models/gemma-4-E4B-it 
  • Time to First Token (TTFT): 0.094s (Excellent startup latency)
  • Tokens per Second: 25.63 tokens/s 
  • Total Generation Time: 1.07s

Benchmark the Local Model

The local model was benchmarked:

 The benchmark sweep has completed successfully. Here are the compiled results across all combinations of context
  window sizes and concurrency levels:

  ### 📈 Concurrency & Context Window Benchmark Sweep (Gemma 4 E4B-it)

   Context Window | Concurrency | Success Rate | Avg Latency (s) | Throughput (tok/s)
  -------------------|------------------------------------|-------------------|------------------|---------------------
   8 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.69 - 1.88s | 18.90 - 89.64 tok/s
   16 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.68 - 1.84s | 23.34 - 89.40 tok/s
   32 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.69 - 1.85s | 22.99 - 88.48 tok/s
   64 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.15 - 0.49s | 13.48 - 42.61 tok/s
   128 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.14 - 0.50s | 13.69 - 44.76 tok/s
   256 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.14 - 0.51s | 13.62 - 43.54 tok/s
   512 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.15 - 0.51s | 13.03 - 42.63 tok/s
   1024 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.16 - 0.51s | 12.22 - 41.20 tok/s
   2048 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.18 - 0.51s | 10.96 - 42.17 tok/s
   4096 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.21 - 0.56s | 17.45 - 39.08 tok/s
   8192 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.23 - 0.66s | 17.21 - 34.89 tok/s
   16384 | 1 / 2 / 4 / 8 / 16 / 32 / 64 / 128 | 100.0% | 0.27 - 0.81s | 14.18 - 27.59 tok/s

  ### 🔍 Key Findings

  • 100% Success Rate: Every request across all context window tests completed successfully without a single OOM or
  timeout.
  • Throughput Scaling: High concurrency configurations (16 to 128 concurrent users) hit optimal throughput peaks of 80
  to 90 tok/s for smaller context sizes, and remained highly stable around 25 to 27 tok/s even at the maximum 16,384
  context window.

The analysis is reviewed:

## 🔍 Key Performance Insights

### 1. Throughput Scaling (Left Chart)
* **High Efficiency at Low Concurrency** : With single-user requests (concurrency = 1), throughput is highly consistent acr
oss smaller context sizes.
* **Concurrency Scaling** : As concurrency climbs from 1 to 16, throughput scales up significantly, peaking around **80 to 
90 tokens/second** at smaller context windows. This proves the efficacy of vLLM's continuous batching implementation.
* **Large Context Handling** : At the maximum context window size of **16,384 tokens** , throughput remains extremely stable
 around **25 to 27 tokens/second** , demonstrating that the GPU is fully capable of serving long-context requests under hig
h loads without degradation.

### 2. Latency Characteristics (Right Chart)
* **Sub-Second Response Times** : Across all tested context windows (from 8 to 16,384 tokens), average latency remains well
 below 1 second for standard workloads (concurrency levels 1 to 32).
* **Predictable Scaling** : Even at maximum concurrency (128 concurrent users) and largest context sizes, latency stays und
er **1 second** on average, making this stack highly suitable for real-time applications and SRE automated diagnostic work
flows.

This is visualized:

Summary

The strategy for using MCP for Gemma 4 GPU deployment with Antigravity CLI was validated with a incremental step by step approach.

A minimal stdio transport MCP Server was started from Python source code and validated with Antigravity CLI running as a MCP client in the same local environment. This Python server provided all of the management tools to deploy and troubleshoot Cloud Run Model deployments.

Local Gemma 4 Deployment with MCP and Antigravity CLI

xbill — Thu, 28 May 2026 15:53:51 +0000

This article provides a step by step deployment guide for Gemma 4 to a 13th Gen Intel i7–1360P running Chrome OS Flex. A suite of Python MCP tools is built to simplify management of the Ollama hosted Gemma 4 deployment with Antigravity CLI.

What is this project trying to Do?

This project is a DevOps/SRE assistant that uses a Gemma 4 model self-hosted locally. It provides tools to provision the Docker container and deploy the model, as well as for observability and performance testing.

This project is similar to a previous project that targeted TPU hosted Gemma4 instances on GCP:

Self-hosted Gemma 4 on TPU with vLLM, MCP, ADK, and Gemini CLI

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

Full Installation Instructions

The detailed installation instructions for Antigravity CLI are here:

Getting Started with Antigravity CLI

Python MCP Documentation

The official GitHub Repo provides samples and documentation for getting started:

GitHub - modelcontextprotocol/python-sdk: The official Python SDK for Model Context Protocol servers and clients

Where do I start?

The strategy for starting MCP development for model management is a incremental step by step approach.

First, the basic development environment is setup with the required system variables, and a working Antigravity CLI configuration.

Then, a minimal Python MCP Server is built with stdio transport. This server is validated with Antigravity CLI in the local environment.

This setup validates the connection from Antigravity CLI to the local server via MCP. The MCP client (Gemini CLI) and the Python MCP server both run in the same local environment.

Setup the Basic Environment

At this point you should have a working Python environment and a working Gemini CLI installation. The next step is to clone the GitHub samples repository with support scripts:

cd ~
git clone https://github.com/xbill9/gemma4-tips

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

local-devops-agent
source init.sh

If your session times out or you need to re-authenticate- you can run the set_env.sh script to reset your environment variables:

cd local-devops-agent
source set_env.sh

Variables like PROJECT_ID need to be setup for use in the various build scripts- so the set_env script can be used to reset the environment if you time-out.

Model Management Tool with MCP Stdio Transport

One of the key features that the standard MCP libraries provide is abstracting various transport methods.

The high level MCP tool implementation is the same no matter what low level transport channel/method that the MCP Client uses to connect to a MCP Server.

The connection over stdio will look similar to this:

# Initialize FastMCP server
mcp = FastMCP("Self-Hosted vLLM DevOps Agent")

Running the Python Code

First- switch the directory with the Python version of the MCP sample code:

~/gemma4-tips/local-devops-agent

Run the release version on the local system:

make install
Processing ./.

The project can also be linted:

xbill@penguin:~/gemma4-tips/local-devops-agent$ make lint
ruff check .
All checks passed!
ruff format --check .
9 files already formatted
mypy .
Success: no issues found in 9 source files

And a test run:

xbill@penguin:~/gemma4-tips/local-devops-agent$ make test
python test_agent.py
.......2026-05-28 10:41:49,403 - vllm-devops-agent - INFO - Querying model with stats with prompt: 'Hi...'
2026-05-28 10:41:49,404 - vllm-devops-agent - INFO - Model response with stats: TTFT=0.000s, TotalTime=0.000s
.....
----------------------------------------------------------------------
Ran 12 tests in 0.038s

OK

Docker Interaction with MCP stdio Transport

One of the key features that the MCP protocol provides is abstracting various transport methods.

The high level tool MCP implementation is the same no matter what low level transport channel/method that the MCP Client uses to connect to a MCP Server.

In this project Gemini CLI is used as the MCP client to interact with the Python MCP server code.

Antigravity CLI mcp_config.json

A sample MCP server file is provided in the .agents directory:

{
  "mcpServers": {
    "local-devops-agent": {
      "command": "python3",
      "args": [
        "/home/xbill/gemma4-tips/local-devops-agent/server.py"
      ],
      "env": {
        "GOOGLE_CLOUD_PROJECT": "aisprint-491218",
        "MODEL_NAME": "google/gemma-4-E2B-it"
      }
    }
  }
}

Validation with Antigravity CLI

The final connection test uses Antigravity CLI as a MCP client with the Python code providing the MCP server:

agy
/mcp list
>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
MCP Servers

Plugins (~/.gemini/antigravity-cli/plugins)
> ✓ google-dev-knowledge Tools: search_documents, answer_query, get_documents
   ✓ local-devops-agent Tools: verify_model_health, save_hf_token, manage_docker, get_system_status, get_endpoint, +9
                         more

Getting Started with Gemma 4 Locally

As most local deployments are constrained- the Ollama engine is used in a Docker container:

Docker - Ollama

The Official vLLM repo also has Gemma4 specific information:

Release v0.19.1 · vllm-project/vllm

Model Lifecycle Management via MCP

The MCP tools provide a complete suite of agent-oriented operations for managing vLLM deployment on Cloud Run or a TPU.

Overview of MCP tools :

> print out mcp tools

  Here is the updated list of available MCP tools for the local Gemma 4 agent:

   Tool Name | Description | Code Location
  ---------------------------|-----------------------------------------------------------------------|-----------------
    analyze_local_logs | Analyzes the local container logs using Gemma 4 to find SRE/DevOps | server.py
                             | errors. |
    get_active_models | Gets the currently loaded models in Ollama's memory. | server.py
    get_docker_logs | Retrieves startup and execution logs from the Docker container. | server.py
    get_endpoint | Checks endpoint connectivity and returns the service URL. | server.py
    get_help | Provides help text and summarizes the configuration options. | server.py
    get_model_show_details | Gets parameters, architecture, license, and config for a model. | server.py
    get_system_details | Retrieves detailed information about the running local model, engine, | server.py
                             | and versions. |
    get_system_status | Displays a dashboard of container and serving health status. | server.py
    manage_docker | Starts, stops, restarts, or removes the container. | server.py
    query_gemma4 | Standard prompt query interface for the local model. | server.py
    query_gemma4_with_stats | Prompt interface returning performance metrics (latency, TTFT). | server.py
    run_benchmark | Runs the benchmark suite or vLLM benchmarking script. | server.py
    save_hf_token | Securely caches a Hugging Face API token. | server.py
    verify_model_health | Runs a deep health check verifying latency and response correctness. | server.py

Checking System status

The status can be checked with an MCP tool:

  ### 1. High-Level Dashboard: get_system_status 

  • Purpose: Checks the container status (via docker ps ) and model server availability to present a dashboard.
  • Function: server.py
  • Current Status:
      • vLLM Health: 🟢 Online (http://localhost:8000)
      • Docker Container Status: 🟢 Running

  ### 2. Service Connectivity Endpoint: get_endpoint 

  • Purpose: Verifies model service endpoint connectivity and reports the URL.
  • Function: server.py
  • Current Output:
  │ 🟢 Local vLLM/Ollama is Online at: http://localhost:8000 

  ### 3. Active Models in Memory: get_active_models 
  • Purpose: Checks which models are currently loaded in RAM/VRAM, along with context size and processor status (Ollama
  backend only).
  • Function: server.py
  • Current Output:
    NAME ID SIZE PROCESSOR CONTEXT UNTIL
  (Note: The active model list is currently empty, meaning no models are loaded in memory yet)

  ### 4. Deep Model Properties: get_model_show_details 

  • Purpose: Queries parameter details, license, system prompt templates, and architecture layers for a specific model
  (Ollama backend only).
  • Function: server.py

Deploy The Model Locally

Once the base Docker environment has been checked- the Gemma model can be deployed:

> deploy gemma4

● Read(/home/xbill/gemma4-tips/local-devops-agent/server.py)
● Read(/home/xbill/.gemini/antigravity-cli/mcp/local-devops-agent/manage_docker.json)
● local-devops-agent/manage_docker(Start Docker container)
● local-devops-agent/get_system_status(System status check)
● Read(/home/xbill/.gemini/antigravity-cli/mcp/local-devops-agent/verify_model_health.json)
● local-devops-agent/verify_model_health(Model health check) (ctrl+o to expand)

  The Gemma 4 local container stack has been successfully started and verified:

  1. Docker Container Start: Called server.py with action="start" to run or start the gemma4 container and
  pull the model.
  2. Status Check: Verified that the container status via server.py is 🟢 Running and online.
  3. Health Validation: Executed server.py, which passed with the following metrics:
      • Response: "Hello. Yes, I am working and ready to assist you." 
      • Latency: 39.10 seconds 

  The inference service is fully online at http://localhost:8000 and ready to serve requests.

  ### Summary of Work

  • Started the gemma4 Docker container using the server.py tool.
  • Verified status and deep endpoint health using server.py and server.py.

Cross Check The Deployed Model

Once the model starts locally- the MCP tool allows for model verification:

> Performs a deep health check by querying the model with a simple prompt and measuring response
    latency.

● local-devops-agent/verify_model_health(Model health check) (ctrl+o to expand)

  The health check has completed:

  • Status: ✅ Model health check PASSED
  • Response: "Hello! Yes, I am working and ready to assist you." 
  • Latency: 17.51 seconds 

  You can inspect the definition of this check in server.py within server.py.

  ### Summary of Work

  • Ran the server.py tool.
  • Verified the local Gemma 4 endpoint response latency and output.

and model details:

> get_active_models

● local-devops-agent/get_active_models(Active models check)
● Read(/home/xbill/.gemini/antigravity-cli/mcp/local-devops-agent/query_gemma4.json)
● local-devops-agent/query_gemma4(Query local model)
● local-devops-agent/get_active_models(Active models check) (ctrl+o to expand)

  The local model has been loaded, and the resource details from server.py are shown below:

  ### 📊 Active Loaded Models:

    NAME ID SIZE PROCESSOR CONTEXT UNTIL
    gemma4:e2b 7fbdbf8f5e45 7.6 GB 100% CPU 4096 4 minutes from now

and the endpoint:

> get_endpoint

● local-devops-agent/get_endpoint(Local agent endpoint check) (ctrl+o to expand)

  The local Gemma 4 model server endpoint checked by server.py is active:

  │ 🟢 Local vLLM/Ollama is Online at: http://localhost:8000

and status:

> get_system_details

● Read(/home/xbill/.gemini/antigravity-cli/mcp/local-devops-agent/get_system_details.json)
● local-devops-agent/get_system_details(System details check) (ctrl+o to expand)

  Here is the output from the renamed server.py tool:

  ### 🧩 Model Details (http://localhost:8000)

  Model Information ( /v1/models ):

    {
      "object": "list",
      "data": [
        {
          "id": "gemma4:e2b",
          "object": "model",
          "created": 1779979910,
          "owned_by": "library"
        }
      ]
    }

  Health Status ( /v1/models ):

  • Status: Healthy ✅

Check the Docker status:

> manage_docker status
● local-devops-agent/manage_docker(Check Docker status) (ctrl+o to expand)
  Here is the status of the local Docker container as returned by server.py with action="status" :
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS
  NAMES
    5b5917a97e98 ollama/ollama:latest "/bin/ollama serve" 53 minutes ago Up 53 minutes 0.0.0.0:8000-
  >11434/tcp, [::]:8000->11434/tcp gemma4
  ### Summary of Work
  • Invoked server.py with action="status" to query container metrics directly.

Review the Model

Check the model details:

  Here are the deep parameters and configurations for the loaded model retrieved by server.py:

  ### 🧩 Model Details for gemma4:e2b :

    Model
        architecture gemma4
        parameters 5.1B
        context length 131072
        embedding length 1536
        quantization Q4_K_M
        requires 0.20.0

      Capabilities
        completion
        vision
        audio
        tools
        thinking

      Parameters
        temperature 1
        top_k 64
        top_p 0.95

      License
        Apache License
        Version 2.0, January 2004
        ...

Once everything is online — you can send queries to the Model:

> query_gemma4_with_stats what is gemma4?

● Read(/home/xbill/.gemini/antigravity-cli/mcp/local-devops-agent/query_gemma4_with_stats.json)
● local-devops-agent/query_gemma4_with_stats(Query local model with stats) (ctrl+o to expand)

  Here is the response from querying the local model along with performance statistics, using server.py:

  ### 📊 Performance Stats

  • Time to First Token (TTFT): 10.262s 
  • Total Generation Time: 96.890s 
  • Tokens per Second: 2.53 tokens/s 
  • Total Tokens (approx.): 219 

  ### 💬 Model Response

  Gemma 4 is a family of large language models developed by Google DeepMind.

  Here are the key things to know about Gemma 4:

  1. Nature: It is a family of open-weights models, meaning the weights (the parameters that define the model) are
  publicly available, allowing researchers, developers, and the community to freely use, modify, and build upon the
  technology.
  2. Purpose: Like other large language models, Gemma 4 is designed to understand, generate, and process human language.
  3. Development: It is part of Google DeepMind's efforts to create powerful, responsible, and accessible AI models.
  4. Use Cases: It can be used for a wide variety of NLP tasks, including:
      • Text generation (writing stories, code, emails)
      • Translation
      • Summarization
      • Question answering
      • Code assistance

  In short, Gemma models represent an effort to make advanced large language model technology more accessible to the
  wider community.

Benchmark the Local Model

The local model was benchmarked:

  I have started the custom benchmark sweep using the parameters you requested:

  • Tokens: 4 , 8 , 16 , 32 , 64 
  • Concurrency (Concurrent Users): 1 , 2 , 4 

  Analyzing Task Results
  The custom benchmark sweep has completed successfully! Here is the performance data generated across the requested
  token lengths ( 4 , 8 , 16 , 32 , 64 ) and concurrent user counts ( 1 , 2 , 4 ) using the server.py suite:

  ### 📈 Benchmark Results Table

   Concurrency | Max Tokens | Success Rate | Avg Latency (s) | P95 Latency (s) | Throughput (req/s) | Gen Speed (tokens
  -------------|------------|--------------|-----------------|-----------------|--------------------|------------------
        1 | 4 | 100% | 1.14s | 1.73s | 0.87 | 3.49
        2 | 4 | 100% | 1.09s | 1.42s | 1.62 | 6.46
        4 | 4 | 100% | 1.18s | 1.42s | 2.10 | 8.42
        1 | 8 | 100% | 1.33s | 2.36s | 0.75 | 6.00
        2 | 8 | 100% | 1.69s | 2.15s | 0.93 | 7.46
        4 | 8 | 100% | 3.53s | 5.17s | 0.58 | 4.64
        1 | 16 | 100% | 2.58s | 4.40s | 0.39 | 6.20
        2 | 16 | 100% | 5.37s | 7.06s | 0.33 | 5.30
        4 | 16 | 100% | 5.65s | 8.49s | 0.35 | 5.65
        1 | 32 | 100% | 3.91s | 5.73s | 0.26 | 8.19
        2 | 32 | 100% | 8.73s | 11.34s | 0.20 | 6.46
        4 | 32 | 100% | 7.53s | 10.94s | 0.27 | 8.77
        1 | 64 | 100% | 6.77s | 10.13s | 0.15 | 9.46
        2 | 64 | 100% | 19.89s | 23.76s | 0.08 | 5.34
        4 | 64 | 100% | 20.35s | 30.30s | 0.10 | 6.34

  ### 🔍 Key Takeaways:

  1. Concurrency Scaling for Short Tokens: At small token lengths (e.g., 4 tokens), increasing concurrency from 1 to 4
  improves throughput from 0.87 req/s to 2.10 req/s with negligible impact on average latency.
  2. Resource Contention on Larger Sweeps: For longer token generations (e.g., 64 tokens), higher concurrency causes
  severe queuing and latency spikes on CPU execution, with average latency jumping from 6.77s (concurrency 1) to 20.35s
  (concurrency 4).
  3. Generation Speed: The generation speed remains relatively stable across most tests, hovering between 3.5 tokens/s 
  and 9.5 tokens/s.

This is visualized:

Summary

The strategy for using MCP for local Gemma 4 deployment with Antigravity CLI was validated with a incremental step by step approach.

Cross Language A2A Agent Benchmarking with Antigravity CLI

xbill — Wed, 27 May 2026 15:12:00 +0000

Building a Benchmarking Agent with A2A and MCP

This tutorial aims to build and test benchmarking Agents using the A2A protocol across several popular programming languages. The Master Orchestrator Agent is exposed via MCP to allow Antigravity CLI to be used as a MCP client.

Deja Vu — What is Old is New!

This paper is a re-visiting of the original benchmark series with Gemini CLI over Node, GO, and Python:

Cross Language A2A Agent Benchmarking with Gemini 3 and Gemini CLI

In this updated version, the Antigravity CLI is used over Node, GO, Python, and Rust.

Why would I need Multi-Language Support? Can’t I just use Python?

Most mature Agent development tools and libraries are Python based. Python allows for rapid prototyping and evaluation of approaches. Python is also an interpreted language- which has trade-offs in memory safety, and performance. Other languages like GO and Rust offer high performance and memory safe operations. With a language neutral communication protocol — the actual Agent implementation of each Agent can be coded in the most appropriate language.

What is this Approach actually Benchmarking?

The high level goal was to measure the actual time spent running an algorithm in the native language code inside the A2A agent. Each language had a slightly different implementation due to the language syntax. After running the algorithm- each Agent was instructed to calculate and return the elapsed time for cross language comparison.

What is the Agent Development Kit?

The Agent Development Kit (ADK) is a flexible and modular framework for developing and deploying AI agents. While optimized for Gemini and the Google ecosystem, ADK is model-agnostic , deployment-agnostic , and is built for compatibility with other frameworks. The official ADK is only currently available in the Python, GO, and Java programming languages.

Google’s Agent Stack in Action: ADK, MCP, A2A on Google Cloud

Google provides full documentation on the Agent Development Kit (ADK) here:

Agent Development Kit (ADK)

What is the A2A protocol?

The Agent2Agent (A2A) protocol, an open communication standard for AI agents, was initially introduced by Google in April 2025. It is specifically engineered to facilitate seamless interoperability within multi-agent systems, enabling AI agents developed by diverse providers or built upon disparate AI agent frameworks to communicate and collaborate effectively.

A good overview of the A2A protocol can be found here:

A2A Protocol

Language Support For the A2A Protocol

The official ADK for Python, GO, and Java provide built-in support for working with the A2A protocol. For other programming languages like JS, Rust, and .NET — 3rd party libraries are available to add support for the protocol.

The main source for A2A Language support is the GitHub repo:

GitHub - a2aproject/A2A: An open protocol enabling communication and interoperability between opaque agentic applications.

Tool Chain Setup for A2A

This article targets the Python, GO, Rust, and JavaScript environments.

The build tools for each language environment need to be in place. For building with Python- a working Python environment with 3.12 or later along with package management tools like uv or pip is required.

For building with GO — a recent version of the GO compiler (1.24.1 or later) is required.

For building with Rust, the Rust tool chain is required.

For building with Node / JavaScript — a working Node.js environment with Node version 20 or later and a functional npm tool is needed.

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

Checking the Developer Environment

Verify that all the prerequisite packages and compilers are installed — and clone the sample Github repo:

git clone https://github.com/xbill9/a2a-benchmark
cd a2a-benchmark

Once you have your Google Cloud Project and preferred authentication method — run the init.sh script to validate the setup:

xbill@penguin:~/a2a-benchmark$ source init.sh
Project ID file found, skipping.
--- Authentication Method ---
Do you want to use a Gemini API Key for authentication? (y/n): n
WARNING: Your active project does not match the quota project in your local Application Default Credentials file. This might result in unexpected quota issues.

To update your Application Default Credentials quota project, use the `gcloud auth application-default set-quota-project` command.
Updated property [core/project].
--- Setup complete ---

The set_env.sh script is provided to set common ADK environment variables:

xbill@penguin:~/a2a-benchmark$ source set_env.sh
--- Setting Google Cloud Environment Variables ---
Checking gcloud authentication status...
gcloud is authenticated.
Exported PROJECT_ID=comglitn
Exported PROJECT_NUMBER=1056842563084
Exported SERVICE_ACCOUNT_NAME=1056842563084-compute@developer.gserviceaccount.com
Exported GOOGLE_CLOUD_PROJECT=comglitn
Exported GOOGLE_GENAI_USE_VERTEXAI=TRUE
Exported GOOGLE_CLOUD_LOCATION=us-central1
Exported REPO_NAME=
Exported REGION=us-central1
--- Environment setup complete ---
xbill@penguin:~/a2a-benchmark$

Debugging API Permission Errors

If your application default credentials expires or your Google Cloud Authentication expires you will get an error. The workaround is to re-authenticate:

gcloud auth login
gcloud auth application-default login

Another common error is that the environment variables are not set correctly. Go the the root directory and re-run the set_env.sh to set the variables:

cd ~/adk-hello-world-a2a
source set_env.sh

A2A Debugging Tools -A2A Inspector

The A2A Inspector is a standalone tool that provides low level visibility into the A2A protocol. The GitHub is available here:

GitHub - a2aproject/a2a-inspector: Validation Tools for A2A Agents

A summary of the features of the A2A inspector can be found here:

A2A Protocol Documentation

To install the A2A Inspector:

cd ~
git clone https://github.com/a2aproject/a2a-inspector

Then follow the build instructions — you need uv , and a recent version of node :

a2a-inspector/README.md at main · a2aproject/a2a-inspector

Staring the A2A Inspector

Once the A2A inspector has been installed- you can validate the installation by using this URL:

http://127.0.0.1:5001/

Extending ADK Agents with A2A

The ADK provides several key tools to allow standard ADK Agents to run as standalone A2A agents — without the ADK — either in A2A Client or A2A Server mode. The Python ADK includes libraries and samples to extend a standard ADK agent to enable A2A protocol features. Instead of running the agent inside the ADK web utility- the agents are dual purposed with A2A to be able to run as dedicated agents with their own embedded Uvicorn web server.

I am confused- How Does this all Work?

When Agents are run in ADK mode — the ADK CLI or Web interface is used to directly interact with the Agents. The ADK UI is started in a well known port — usually 8000 and the Agents are accessed in that environment.

The ADK does not automatically expose the agent as an A2A agent. The basic agent code from the ADK needs to be extended and enabled to run as a standalone A2A Agent. Without the additional A2A function calls and a active standalone web server- the Agents will not be usable in A2A mode.

Mersenne Primes

In mathematics, a Mersenne prime is a prime number that is one less than a power of two. As of 2025, 52 Mersenne primes are known. An interesting thing about Mersenne primes is that they are the easiest natural numbers to prove to be primes, so they make up the largest category on the list of known prime numbers. This example was chosen as a good problem as the computation is CPU bound and gets exponentially longer for each higher number.

A2A Cross Language Agents

The a2a_benchmark Repo has sample scripts for running the ADK and various types of agents across programming languages. Each agent implements a Mersenne Prime Number generator with only basic optimization. The root directory of the a2a-benchmark directory contains several common agent development languages. These include:

benchmark-go (GO) A2A Server
benchmark-python (Python) A2A Server
benchmark-node (JavaScript) A2A Server
benchmark-rust (Rust) A2A Server
benchmark-master (Python) A2A Client / MCP Server

A2A Mersenne Prime Number Agent (GO)

This agent provides a basic Agent that generates Mersenne prime numbers. To run the agent — run the bench-go.sh script:

Running the Go project...
{"time":"2025-11-25T13:47:18.071713651-05:00","level":"INFO","msg":"Using Model ","model":"gemini-2.5-flash"}
{"time":"2025-11-25T13:47:18.07195886-05:00","level":"INFO","msg":"Starting A2A mersenne prime server","port":"8102"}
{"time":"2025-11-25T13:47:18.072116547-05:00","level":"INFO","msg":"Starting the web server: &{port:8102 writeTimeout:15000000000 readTimeout:15000000000 idleTimeout:60000000000}"}
{"time":"2025-11-25T13:47:18.072119644-05:00","level":"INFO","msg":""}
{"time":"2025-11-25T13:47:18.072121732-05:00","level":"INFO","msg":"Web servers starts on http://localhost:8102"}
{"time":"2025-11-25T13:47:18.072123829-05:00","level":"INFO","msg":" a2a: you can access A2A using jsonrpc protocol: http://localhost:8102"}
{"time":"2025-11-25T13:47:18.072125402-05:00","level":"INFO","msg":""}

The A2A inspector can be used to validate the Agent:

"id": "adk-88a8aefa-3070-48ff-b8d1-631af65ea57e",
  "name": "generate_primes",
  "response": {
    "result": "Elapsed time: 307.932µs"
  }

A2A Mersenne Prime Number Agent (Python)

This agent provides Python agent that implements the Mersenne generation algorithm.

This Agent can be checked with the ADK web interface:

what do you do

I can calculate Mersenne primes using the Lucas-Lehmer primality test. I can find the list of the first N Mersenne primes.

find the first 10 primes

It took 0.0002677440643310547 seconds to find the first 10 Mersenne primes.

To start the dedicated A2A version of the Python prime number generator Agent use the bench-python.sh script:

xbill@penguin:~/a2a-benchmark$ source bench-python.sh

INFO: Started server process [10495]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8101 (Press CTRL+C to quit)

A2A Mersenne Prime Number Agent (Node)

This agent provides a minimal agent to generate a prime number. To run the Agent use the bench-node.sh script:

/home/xbill/a2a-benchmark/benchmark-node
staring a2a bench node generate prime

The Agent can be validated with the A2A inspector on port 8103:

A typical session will look something like this:

what do you do
message
Found first 5 Mersenne primes in 0.09ms.

✅
find 10 primes
message
Found first 10 Mersenne primes in 0.81ms.

✅

A2A Mersenne Prime Number Agent (Rust)

The Mersenne primes algorithm was newly implemented in Rust:

A sample test script verified the A2A Agent skill:

xbill@penguin:~/a2a-benchmark$ source test-rust.sh 
Checking if Rust A2A agent is running on port 8104...
Rust agent is already running.
Running Test 1: Fetching Agent Card...
Agent Card Response:
{
  "name": "Mersenne Prime Agent Rust",
  "description": "A rust agent that builds a list of the first n Mersenne primes and reports the elapsed time. Configured with model: Not specified.",
  "protocolVersion": "0.3.0",
  "version": "0.1.0",
  "url": "http://0.0.0.0:8104/",
  "skills": [
    {
      "id": "find-mersenne-rust",
      "name": "Find Mersenne Primes in rust",
      "description": "Finds the list of the first n Mersenne primes in Rust",
      "tags": [
        "math",
        "benchmark"
      ]
    }
  ],
  "capabilities": {},
  "defaultInputModes": [],
  "defaultOutputModes": []
}
✅ Test 1 Passed: Agent Card is valid.
Running Test 2: Sending message/send request to calculate 5 Mersenne primes...
Response:
{
  "id": 1,
  "jsonrpc": "2.0",
  "result": {
    "contextId": "test-context-123",
    "kind": "message",
    "messageId": "f20c704c-bafa-4081-a18d-8d7984d4213c",
    "parts": [
      {
        "kind": "text",
        "text": "Found first 5 Mersenne primes in 0.11ms."
      }
    ],
    "role": "agent"
  }
}
✅ Test 2 Passed: Successfully calculated primes.
Running Test 3: Sending invalid RPC method...
Error Response:
{
  "error": {
    "code": -32601,
    "message": "Method not found: invalid/method"
  },
  "id": 2,
  "jsonrpc": "2.0"
}
✅ Test 3 Passed: Invalid method was rejected successfully.
All tests passed successfully!

MCP Master/Orchestrator Agent (Python)

The final agent follows a slightly different pattern. It provides a minimal agent with several functions (Tools/Skills). It has one root_agent and 3 remote agents connected over A2A. This agent also exposes an interface as a MCP server- allowing Antigravity CLI to be used as a MCP client.

The Master Agent is started with a similar script:

╭──────────────────────────────────────────────────────────────────────────────╮                        
                       │ │                        
                       │ ▄▀▀ ▄▀█ █▀▀ ▀█▀ █▀▄▀█ █▀▀ █▀█ │                        
                       │ █▀ █▀█ ▄▄█ █ █ ▀ █ █▄▄ █▀▀ │                        
                       │ │                        
                       │ FastMCP 2.13.1 │                        
                       │ │                        
                       │ │                        
                       │ 🖥 Server name: benchmark │                        
                       │ │                        
                       │ 📦 Transport: HTTP │                        
                       │ 🔗 Server URL: http://127.0.0.1:8100/mcp │                        
                       │ │                        
                       │ 📚 Docs: https://gofastmcp.com │                        
                       │ 🚀 Hosting: https://fastmcp.cloud │                        
                       │ │                        
                       ╰──────────────────────────────────────────────────────────────────────────────╯                        


[11/25/25 14:17:36] INFO Starting MCP server 'benchmark' with transport 'http' on [http://127.0.0.1:8100/mcp](http://127.0.0.1:8100/mcp) server.py:2055
INFO: Started server process [21826]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on [http://127.0.0.1:8100](http://127.0.0.1:8100) (Press CTRL+C to quit)

Antigravity CLI with MCP Connection

The Master Agent starts a FastMCP server interface. This allows Antigravity CLI to be used as a MCP client:

xbill@penguin:~/a2a-benchmark/.agents$ more mcp_config.json 
{
  "mcpServers": {
    "bench": {
      "serverUrl": "http://127.0.0.1:8100/mcp",
      "serverURL": "http://127.0.0.1:8100/mcp"
    }
  }
}

ADK Validation of the Master Agent

Because it is Python based- the Master Agent can also be viewed with the ADK web interface:

So What is all this Doing?

All of the setup and configuration allows you to start debugging and visualizing Agent flows using the A2A protocol. The Agents can be validated using the A2A Inspector. The Python Agents can be checked with the ADK web interfaces. The Master Agent can be validated with Antigravity CLI as a MCP client.

Time to Starting Running some Numbers!

Now that the individual Agents have been validated — open several Terminal windows and start up all the remaining agents:

benchmark-go.sh — (port 8102)
benchmark-python.sh-(port 8101)
benchmark-node.sh- (port 8103)
benchmark-rust.sh- (port 8104)
benchmark-master.sh — Orchestrator Agent (port 8100)

Start the Benchmarking Session

Once all the agents have been started — the Master agent on port 8100 is connected over MCP with Antigravity CLI:

  I have completed the benchmark run across all four language agents for primes 1 to 22.

  ### Summary of Actions Taken:

  1. Agent Startup & Verification: Automatically checked and started all four target language agents (Node.
  js/TypeScript, Rust, Go, Python) and verified they were active and listening on their respective A2A JSON-RPC ports.
  2. Benchmark Scripting: Created a customized benchmarking script run_benchmark_1_22.py that targets prime calculations
from
  $N = 1$ to $N = 22$.
  3. Execution: Ran the benchmark and recorded the exact computation time (within the tool call) and the total Agent-to-
  Agent (A2A) network/orchestration latency (RTT) for every prime count $N$.
  4. Data Plotting: Generated visualizations for both calculation times ( prime_calculation_times_1_22.png ) and
  overall latency ( a2a_latency_times_1_22.png ).
  5. Reporting: Compiled all calculations and visualizations into the analysis_results.md artifact.

  Please view analysis_results.md to inspect the table and embedded plots showing the comparison of calculation speeds
and
  latency characteristics!

Review Performance

The results of the benchmark are graphed:

With latency times:

Final Results

Antigravity CLI summarized the results of the Agent Benchmarking:

  Formulating Key Findings
  Here is a summary of the key findings from the benchmark results for primes 1 to 22:

  ### 1. Core Computation Performance (Lucas-Lehmer Prime Calculation)

  • Rust is the fastest: Computes $N=22$ in 195.96 ms.
  • Node.js (TS) is extremely close: Computes $N=22$ in 214.06 ms.
  • Go performs well: Computes $N=22$ in 279.54 ms.
  • Python is the slowest: Computes $N=22$ in 572.48 ms (roughly 3x slower than Rust).

  ### 2. A2A Framework & Protocol Overhead

  • Minimal Overhead (Rust & Node.js):
      • Rust and Node.js use lightweight custom HTTP wrappers. Their framework overhead is practically negligible (< 2 
      ms). For example, Node's total round-trip time (RTT) for $N=22$ is 215.78 ms (only 1.72 ms higher than the raw
      calculation time).
  • Significant Base Overhead (Go & Python):
      • Both Go and Python use the official google-adk library. This introduces a fixed baseline overhead per request
      (schema validation, logging, and metadata context processing).
      • Go incurs a consistent baseline RTT of ~1.2s to 1.6s.
      • Python incurs a consistent baseline RTT of ~2.3s to 3.0s.

and a deeper dive:

  Analyzing Agent Initialization
  Based on an analysis of the codebase and execution paths, here is the breakdown of why the benchmark produced these
  results:
  ──────
  ### Factor 1: The Massive Latency (RTT) Difference (Framework vs. Mocking)

  The most striking difference is that Rust and Node.js have near-zero overhead, whereas Go takes ~1.5s and Python     
  takes
  ~2.5s. This is entirely due to how the agents process the benchmark messages:

  1. Rust and Node.js (Mocked Agent Logic / Direct Execution):
      • Both main.rs and server.ts implement custom JSON-RPC endpoints that intercept incoming queries and
      bypass the LLM entirely.
      • They use simple regex to extract the number of primes directly from the text "Calculate the first {N} Mersenne
      primes" , execute the math, and return a mock agent response message. Since there are no LLM API calls, the
      overhead is practically zero.
  2. Go and Python (Real Agentic Execution via ADK):
      • Both Go (main.go) and Python (agent.py) use the official Agent Development Kit ( google-adk ) and
      hook up the Gemini 2.5 Flash model.
      • When they receive a message, the ADK runner executes a full agent reasoning loop. This involves two sequential 
      remote LLM calls (one to determine which tool to call, and another to formulate the final natural language
      response).
      • This remote LLM invocation via the internet accounts for the 1.5–3.0 second latency overhead.

  ──────
  ### Factor 2: Mathematical Computation Speeds (Lucas-Lehmer Loop)

  At higher values (e.g., $N=22$, where the Mersenne number $2^{9941}-1$ has 2,993 decimal digits and the Lucas-Lehmer
  test requires 9,939 iterations of squaring and modular reductions), we see clear performance tiers:

  1. Rust (195.96 ms):
      • Rust is compiled to optimized native machine code (using --release ). The num-bigint crate executes the big
      integer arithmetic without garbage collection overhead, making it the most efficient compiled loop.
  2. Node.js (214.06 ms):
      • V8's JIT compiler is extremely good at compiling JavaScript loops to native machine instructions. Additionally,
      V8's native BigInt implementation is heavily optimized at the engine level (written in assembly/C++).
  3. Go (279.54 ms):
      • While Go compiles to native code, its implementation in main.go performs heap allocations inside the hot
      loop:
        a := new(big.Int).Rsh(s, uint(p))
        b := new(big.Int).And(s, m)
      Allocating new big.Int objects inside a loop that runs thousands of times puts extra pressure on Go's runtime
      allocator and garbage collector, dragging down performance compared to Rust.
  4. Python (572.48 ms):
      • Python handles arbitrary-precision integers natively, but executing the loop in interpreted bytecode has a lot
      of interpreter overhead compared to compiled or JIT-compiled languages. 9,939 iterations of the loop in bytecode
      is the bottleneck.

Core Findings

For low numbers of Mersenne primes that execute quickly- all four languages are comparable. An unusual result was the good performance in the mid range from Node. This could have been from lower overhead of the Node A2A toolkit or potentially from JavaScript engine optimizations. As expected, GO and Rust did break out and delivered the best performance as the complexity scaled.

Summary

The goal of the demo/article was to get basic Agents implemented across multiple programming languages and benchmark the Agent performance finding Mersenne primes.

The Google Agent development kit (ADK) was presented along with the complimentary A2A (Agent to Agent) protocol. Three basic agents were presented — covering various combinations of programming languages and Agent implementation approaches.

Finally — a Master/Orchestrator agent was started to connect and delegate to the other agents via the A2A protocol. Antigravity CLI was used to connect to the Master Agent over MCP and execute the benchmarks.

Revisiting Benchmarking- Building a Rust A2A Agent

xbill — Wed, 27 May 2026 02:55:27 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

Cross Language A2A Agent Benchmarking with Gemini 3 and Antigravity CLI

Building a Benchmarking Agent with A2A and MCP

Deja Vu — What is Old is New

This paper is a re-visiting of the original benchmark series with Gemini CLI over Node, GO, and Python:

Cross Language A2A Agent Benchmarking with Gemini 3 and Gemini CLI

_Building a Benchmarking Agent with A2A and MCP_medium.com

In this updated version, the Antigravity CLI is used over Node, GO, Python, and Rust.

Why would I need Multi-Language Support? Can’t I just use Python?

What is this Approach actually Benchmarking?

What is the Agent Development Kit?

The Agent Development Kit (ADK) is a flexible and modular framework for developing and deploying AI agents. While optimized for Gemini and the Google ecosystem, ADK is model-agnostic, deployment-agnostic, and is built for compatibility with other frameworks. The official ADK is only currently available in the Python, GO, and Java programming languages.

Google’s Agent Stack in Action: ADK, MCP, A2A on Google Cloud

Google provides full documentation on the Agent Development Kit (ADK) here:

Agent Development Kit

_Build powerful multi-agent systems with Agent Development Kit_google.github.io

What is the A2A protocol?

A good overview of the A2A protocol can be found here:

A2A Protocol

_The official documentation for the Agent2Agent (A2A) protocol. The A2A protocol is an open standard that allows…_a2a-protocol.org

Language Support For the A2A Protocol

The main source for A2A Language support is the GitHub repo:

GitHub — a2aproject/A2A: An open protocol enabling communication and interoperability between…

_An open protocol enabling communication and interoperability between opaque agentic applications. — a2aproject/A2A_github.com

Tool Chain Setup for A2A

This article targets the Python, GO, Rust, and JavaScript environments.

For building with GO — a recent version of the GO compiler (1.24.1 or later) is required.

For building with Rust, the Rust tool chain is required.

For building with Node / JavaScript — a working Node.js environment with Node version 20 or later and a functional npm tool is needed.

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

_This article covers the initial setup and configuration for the Antigravity CLI on a stock Linux Environment._medium.com

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

Checking the Developer Environment

Verify that all the prerequisite packages and compilers are installed — and clone the sample Github repo:

git clone https://github.com/xbill9/a2a-benchmark

cd a2a-benchmark

Once you have your Google Cloud Project and preferred authentication method — run the init.sh script to validate the setup:

xbill@penguin:~/a2a-benchmark$ source init.sh

Project ID file found, skipping.

--- Authentication Method ---

Do you want to use a Gemini API Key for authentication? (y/n): n

WARNING: Your active project does not match the quota project in your local Application Default Credentials file. This might result in unexpected quota issues.

To update your Application Default Credentials quota project, use the gcloud auth application-default set-quota-project command.

Updated property [core/project].

--- Setup complete ---

The set_env.sh script is provided to set common ADK environment variables:

xbill@penguin:~/a2a-benchmark$ source set_env.sh

--- Setting Google Cloud Environment Variables ---

Checking gcloud authentication status...

gcloud is authenticated.

Exported PROJECT_ID=comglitn

Exported PROJECT_NUMBER=1056842563084

Exported SERVICE_ACCOUNT_NAME=1056842563084-compute@developer.gserviceaccount.com

Exported GOOGLE_CLOUD_PROJECT=comglitn

Exported GOOGLE_GENAI_USE_VERTEXAI=TRUE

Exported GOOGLE_CLOUD_LOCATION=us-central1

Exported REPO_NAME=

Exported REGION=us-central1

--- Environment setup complete ---

xbill@penguin:~/a2a-benchmark$

Debugging API Permission Errors

If your application default credentials expires or your Google Cloud Authentication expires you will get an error. The workaround is to re-authenticate:

gcloud auth login

gcloud auth application-default login

Another common error is that the environment variables are not set correctly. Go the the root directory and re-run the set_env.sh to set the variables:

cd ~/adk-hello-world-a2a

source set_env.sh

A2A Debugging Tools -A2A Inspector

The A2A Inspector is a standalone tool that provides low level visibility into the A2A protocol. The GitHub is available here:

GitHub — a2aproject/a2a-inspector: Validation Tools for A2A Agents

_Validation Tools for A2A Agents. Contribute to a2aproject/a2a-inspector development by creating an account on GitHub._github.com

A summary of the features of the A2A inspector can be found here:

A2A Protocol Documentation

_Documentation for A2A Protocol_a2aprotocol.ai

To install the A2A Inspector:

cd ~

git clone https://github.com/a2aproject/a2a-inspector

Then follow the build instructions — you need uv, and a recent version of node:

a2a-inspector/README.md at main · a2aproject/a2a-inspector

_Validation Tools for A2A Agents. Contribute to a2aproject/a2a-inspector development by creating an account on GitHub._github.com

Staring the A2A Inspector

Once the A2A inspector has been installed- you can validate the installation by using this URL:

http://127.0.0.1:5001/

Extending ADK Agents with A2A

I am confused- How Does this all Work?

Mersenne Primes

A2A Cross Language Agents

benchmark-go (GO) A2A Server
benchmark-python (Python) A2A Server
benchmark-node (JavaScript) A2A Server
benchmark-rust (Rust) A2A Server
benchmark-master (Python) A2A Client / MCP Server

A2A Mersenne Prime Number Agent (GO)

This agent provides a basic Agent that generates Mersenne prime numbers. To run the agent — run the bench-go.sh script:

Running the Go project...

{"time":"2025-11-25T13:47:18.071713651-05:00","level":"INFO","msg":"Using Model ","model":"gemini-2.5-flash"}

{"time":"2025-11-25T13:47:18.07195886-05:00","level":"INFO","msg":"Starting A2A mersenne prime server","port":"8102"}

{"time":"2025-11-25T13:47:18.072116547-05:00","level":"INFO","msg":"Starting the web server: &{port:8102 writeTimeout:15000000000 readTimeout:15000000000 idleTimeout:60000000000}"}

{"time":"2025-11-25T13:47:18.072119644-05:00","level":"INFO","msg":""}

{"time":"2025-11-25T13:47:18.072121732-05:00","level":"INFO","msg":"Web servers starts on http://localhost:8102"}

{"time":"2025-11-25T13:47:18.072123829-05:00","level":"INFO","msg":" a2a: you can access A2A using jsonrpc protocol: http://localhost:8102"}

{"time":"2025-11-25T13:47:18.072125402-05:00","level":"INFO","msg":""}

The A2A inspector can be used to validate the Agent:

"id": "adk-88a8aefa-3070-48ff-b8d1-631af65ea57e",

"name": "generate_primes",

"response": {

"result": "Elapsed time: 307.932µs"

}

A2A Mersenne Prime Number Agent (Python)

This agent provides Python agent that implements the Mersenne generation algorithm.

This Agent can be checked with the ADK web interface:

what do you do

I can calculate Mersenne primes using the Lucas-Lehmer primality test. I can find the list of the first N Mersenne primes.

find the first 10 primes

It took 0.0002677440643310547 seconds to find the first 10 Mersenne primes.

To start the dedicated A2A version of the Python prime number generator Agent use the bench-python.sh script:

xbill@penguin:~/a2a-benchmark$ source bench-python.sh

INFO: Started server process [10495]

INFO: Waiting for application startup.

INFO: Application startup complete.

INFO: Uvicorn running on http://0.0.0.0:8101 (Press CTRL+C to quit)

A2A Mersenne Prime Number Agent (Node)

This agent provides a minimal agent to generate a prime number. To run the Agent use the bench-node.sh script:

/home/xbill/a2a-benchmark/benchmark-node

staring a2a bench node generate prime

The Agent can be validated with the A2A inspector on port 8103:

A typical session will look something like this:

what do you do

message

Found first 5 Mersenne primes in 0.09ms.

✅

find 10 primes

message

Found first 10 Mersenne primes in 0.81ms.

✅

A2A Mersenne Prime Number Agent (Rust)

The Mersenne primes algorithm was newly implemented in Rust:

A sample test script verified the A2A Agent skill:

xbill@penguin:~/a2a-benchmark$ source test-rust.sh

Checking if Rust A2A agent is running on port 8104...

Rust agent is already running.

Running Test 1: Fetching Agent Card...

Agent Card Response:

{

"name": "Mersenne Prime Agent Rust",

"description": "A rust agent that builds a list of the first n Mersenne primes and reports the elapsed time. Configured with model: Not specified.",

"protocolVersion": "0.3.0",

"version": "0.1.0",

"url": "http://0.0.0.0:8104/",

"skills": [

{

"id": "find-mersenne-rust",

"name": "Find Mersenne Primes in rust",

"description": "Finds the list of the first n Mersenne primes in Rust",

"tags": [

"math",

"benchmark"

]

}

],

"capabilities": {},

"defaultInputModes": [],

"defaultOutputModes": []

}

✅ Test 1 Passed: Agent Card is valid.

Running Test 2: Sending message/send request to calculate 5 Mersenne primes...

Response:

{

"id": 1,

"jsonrpc": "2.0",

"result": {

"contextId": "test-context-123",

"kind": "message",

"messageId": "f20c704c-bafa-4081-a18d-8d7984d4213c",

"parts": [

{

"kind": "text",

"text": "Found first 5 Mersenne primes in 0.11ms."

}

],

"role": "agent"

}

}

✅ Test 2 Passed: Successfully calculated primes.

Running Test 3: Sending invalid RPC method...

Error Response:

{

"error": {

"code": -32601,

"message": "Method not found: invalid/method"

},

"id": 2,

"jsonrpc": "2.0"

}

✅ Test 3 Passed: Invalid method was rejected successfully.

All tests passed successfully!

MCP Master/Orchestrator Agent (Python)

The Master Agent is started with a similar script:

╭──────────────────────────────────────────────────────────────────────────────╮

│ │

│ ▄▀▀ ▄▀█ █▀▀ ▀█▀ █▀▄▀█ █▀▀ █▀█ │

│ █▀ █▀█ ▄▄█ █ █ ▀ █ █▄▄ █▀▀ │

│ │

│ FastMCP 2.13.1 │

│ │

│ │

│ 🖥 Server name: benchmark │

│ │

│ 📦 Transport: HTTP │

│ 🔗 Server URL: http://127.0.0.1:8100/mcp │

│ │

│ 📚 Docs: https://gofastmcp.com │

│ 🚀 Hosting: https://fastmcp.cloud │

│ │

╰──────────────────────────────────────────────────────────────────────────────╯

[11/25/25 14:17:36] INFO Starting MCP server 'benchmark' with transport 'http' on http://127.0.0.1:8100/mcp server.py:2055

INFO: Started server process [21826]

INFO: Waiting for application startup.

INFO: Application startup complete.

INFO: Uvicorn running on http://127.0.0.1:8100 (Press CTRL+C to quit)

Antigravity CLI with MCP Connection

The Master Agent starts a FastMCP server interface. This allows Antigravity CLI to be used as a MCP client:

xbill@penguin:~/a2a-benchmark/.agents$ more mcp_config.json

{

"mcpServers": {

"bench": {

"serverUrl": "http://127.0.0.1:8100/mcp",

"serverURL": "http://127.0.0.1:8100/mcp"

}

}

}

ADK Validation of the Master Agent

Because it is Python based- the Master Agent can also be viewed with the ADK web interface:

So What is all this Doing?

Time to Starting Running some Numbers!

Now that the individual Agents have been validated — open several Terminal windows and start up all the remaining agents:

benchmark-go.sh — (port 8102)
benchmark-python.sh-(port 8101)
benchmark-node.sh- (port 8103)
benchmark-rust.sh- (port 8104)
benchmark-master.sh — Orchestrator Agent (port 8100)

Start the Benchmarking Session

Once all the agents have been started — the Master agent on port 8100 is connected over MCP with Antigravity CLI:

I have completed the benchmark run across all four language agents for primes 1 to 22.

### Summary of Actions Taken:

Agent Startup & Verification: Automatically checked and started all four target language agents (Node. js/TypeScript, Rust, Go, Python) and verified they were active and listening on their respective A2A JSON-RPC ports.
Benchmark Scripting: Created a customized benchmarking script run_benchmark_1_22.py that targets prime calculations from $N = 1$ to $N = 22$.
Execution: Ran the benchmark and recorded the exact computation time (within the tool call) and the total Agent-to- Agent (A2A) network/orchestration latency (RTT) for every prime count $N$.
Data Plotting: Generated visualizations for both calculation times ( prime_calculation_times_1_22.png ) and overall latency ( a2a_latency_times_1_22.png ).
Reporting: Compiled all calculations and visualizations into the analysis_results.md artifact.

Please view analysis_results.md to inspect the table and embedded plots showing the comparison of calculation speeds

and

latency characteristics!

Review Performance

The results of the benchmark are graphed:

With latency times:

Final Results

Antigravity CLI summarized the results of the Agent Benchmarking:

Formulating Key Findings

Here is a summary of the key findings from the benchmark results for primes 1 to 22:

### 1. Core Computation Performance (Lucas-Lehmer Prime Calculation)

• Rust is the fastest: Computes $N=22$ in 195.96 ms.

• Node.js (TS) is extremely close: Computes $N=22$ in 214.06 ms.

• Go performs well: Computes $N=22$ in 279.54 ms.

• Python is the slowest: Computes $N=22$ in 572.48 ms (roughly 3x slower than Rust).

### 2. A2A Framework & Protocol Overhead

• Minimal Overhead (Rust & Node.js):

• Rust and Node.js use lightweight custom HTTP wrappers. Their framework overhead is practically negligible (< 2

ms). For example, Node's total round-trip time (RTT) for $N=22$ is 215.78 ms (only 1.72 ms higher than the raw

calculation time).

• Significant Base Overhead (Go & Python):

• Both Go and Python use the official google-adk library. This introduces a fixed baseline overhead per request

(schema validation, logging, and metadata context processing).

• Go incurs a consistent baseline RTT of ~1.2s to 1.6s.

• Python incurs a consistent baseline RTT of ~2.3s to 3.0s.

and a deeper dive:

Analyzing Agent Initialization

Based on an analysis of the codebase and execution paths, here is the breakdown of why the benchmark produced these

results:

──────

### Factor 1: The Massive Latency (RTT) Difference (Framework vs. Mocking)

The most striking difference is that Rust and Node.js have near-zero overhead, whereas Go takes ~1.5s and Python

takes

~2.5s. This is entirely due to how the agents process the benchmark messages:

Rust and Node.js (Mocked Agent Logic / Direct Execution): • Both main.rs and server.ts implement custom JSON-RPC endpoints that intercept incoming queries and bypass the LLM entirely. • They use simple regex to extract the number of primes directly from the text "Calculate the first {N} Mersenne primes" , execute the math, and return a mock agent response message. Since there are no LLM API calls, the overhead is practically zero.
Go and Python (Real Agentic Execution via ADK): • Both Go (main.go) and Python (agent.py) use the official Agent Development Kit ( google-adk ) and hook up the Gemini 2.5 Flash model. • When they receive a message, the ADK runner executes a full agent reasoning loop. This involves two sequential remote LLM calls (one to determine which tool to call, and another to formulate the final natural language response). • This remote LLM invocation via the internet accounts for the 1.5–3.0 second latency overhead.

──────

### Factor 2: Mathematical Computation Speeds (Lucas-Lehmer Loop)

At higher values (e.g., $N=22$, where the Mersenne number $2^{9941}-1$ has 2,993 decimal digits and the Lucas-Lehmer

test requires 9,939 iterations of squaring and modular reductions), we see clear performance tiers:

Rust (195.96 ms): • Rust is compiled to optimized native machine code (using --release ). The num-bigint crate executes the big integer arithmetic without garbage collection overhead, making it the most efficient compiled loop.
Node.js (214.06 ms): • V8's JIT compiler is extremely good at compiling JavaScript loops to native machine instructions. Additionally, V8's native BigInt implementation is heavily optimized at the engine level (written in assembly/C++).
Go (279.54 ms): • While Go compiles to native code, its implementation in main.go performs heap allocations inside the hot loop: a := new(big.Int).Rsh(s, uint(p)) b := new(big.Int).And(s, m) Allocating new big.Int objects inside a loop that runs thousands of times puts extra pressure on Go's runtime allocator and garbage collector, dragging down performance compared to Rust.
Python (572.48 ms): • Python handles arbitrary-precision integers natively, but executing the loop in interpreted bytecode has a lot of interpreter overhead compared to compiled or JIT-compiled languages. 9,939 iterations of the loop in bytecode is the bottleneck.

Core Findings

Summary

The goal of the demo/article was to get basic Agents implemented across multiple programming languages and benchmark the Agent performance finding Mersenne primes.

Pokemon Battle Simulator Napkin Challenge!

xbill — Tue, 26 May 2026 23:07:48 +0000

  ▄▀▀▄        Antigravity CLI 1.0.2
 ▀▀▀▀▀▀       xbill9@gmail.com (Google AI Ultra)
▀▀▀▀▀▀▀▀      Gemini 3.5 Flash (Medium)

▄▀▀ ▀▀▄ ~/pokemon
▄▀▀ ▀▀▄
────────────────────────────────────────────────────────────

use @pokemon_api.jpg as the system architecture and build out a pokemon battle simulator

● ListDir(/home/xbill/pokemon)
● Read(/home/xbill/pokemon/pokemon_api.jpg) (ctrl+o to expand)
▸ Thought for 7s, 852 tokens
Observing the Image

● Bash(node -v && npm -v && python3 --version) (ctrl+o to expand)

▸ Thought for 6s, 2.5k tokens
Considering Language Choice

● Create(/home/xbill/.gemini/antigravity-cli/brain/42...abf-b329-894a07ec5075/implementation_plan.md) (ctrl+o to expand)

I have created the implementation plan for the Pokemon Battle Simulator matching the architecture in the diagram.

Please review the implementation_plan.md and let me know if you approve or if you'd like any modifications before I
begin
building the simulator.

────────────────────────────────────────────────────────────

yes i approve
▸ Thought for 6s, 304 tokens
Initiating Execution Phase

● Create(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md)
● Create(/home/xbill/pokemon/package.json)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 7s, 364 tokens
Defining API Layer

● Create(/home/xbill/pokemon/src/api/pokeApi.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 7s, 1.4k tokens
Defining Move Database

● Create(/home/xbill/pokemon/src/api/moveDatabase.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 7s, 2.6k tokens
Defining Battle SDK Logic

● Create(/home/xbill/pokemon/src/sdk/battleSdk.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 1s, 373 tokens
Designing Color Utility

● Create(/home/xbill/pokemon/src/utils/colors.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 6s, 1.3k tokens
Constructing the CLI Router
● Create(/home/xbill/pokemon/src/cli/index.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md)
● Create(/home/xbill/pokemon/index.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 8s, 228 tokens
Defining Test Runner Goals

● Create(/home/xbill/pokemon/src/utils/test_runner.js)
● Bash(node src/utils/test_runner.js)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 8s, 142 tokens
Drafting Walkthrough Documentation

● Create(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/walkthrough.md)
● Edit(/home/xbill/.gemini/antigravity-cli/brain/42560c5f-0aec-4abf-b329-894a07ec5075/task.md) (ctrl+o to expand)

▸ Thought for 5s, 517 tokens
I have built out the Pokemon Battle Simulator matching the system architecture diagram in @pokemon_api.jpg .

Here is a summary of the components created:

• API Client (pokeApi.js): Connects to PokeAPI with local filesystem caching.
Verifying Project Completion
• Move Registry (moveDatabase.js): Predefined type databases and dynamic move generator prioritizing STAB.
• Gameplay SDK (battleSdk.js): Performs combat math, turn ordering, level scaling (Lv. 50 for team, Lv. 100 with 3x
HP for Bosses), and state execution.
• Color & Formatter (colors.js): Standard ANSI coloring, dynamic health bar drawing, and typewriter-style
delayed text printing.
• CLI Shell & Entry Point (index.js & index.js): Command router supporting /pick team , /stats , /run
battle , /help , and /exit .

### Playing the Simulator

To launch the game:

node index.js

For full details on the codebase and how features were tested, please refer to the walkthrough.md.

────────────────────────────────────────────────────────────

node index.js

▸ Thought for 5s, 420 tokens
Considering Interactive CLI Commands

● Bash(node index.js)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56) (ctrl+o to expand)

The simulator has successfully started and is running in the background:

╔════════════════════════════╗
║  POKEMON BATTLE SIMULATOR  ║
╚════════════════════════════╝
Welcome, Trainer! Prepare for epic battles.
Type /help to see available commands.

poke-sim>

You can interact with it by typing commands directly. If you want me to input a command for you, just tell me what to
type (e.g. /help , /pick team pikachu charizard , or /run battle )!

The simulator is ready and waiting at the poke-sim> prompt. Let me know if you would like me to send a command to
the process, or if you would like to run it directly in your own terminal.

────────────────────────────────────────────────────────────

/pick team squirtle mew lucario

● ManageTask(Input sent to task "42560c5f-0aec-4abf-b329-894a07ec5075/task-56".)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56) (ctrl+o to expand)
The team has been successfully selected:
• Squirtle
• Mew
• Lucario

The simulator is ready for your next command (e.g. /stats or /run battle ).

The team is loaded! What would you like to do next? You can check their stats with /stats or initiate the battle
with /run battle .

────────────────────────────────────────────────────────────

/run battle

● ManageTask(Input sent to task "42560c5f-0aec-4abf-b329-894a07ec5075/task-56".)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56) (ctrl+o to expand)

The simulator is asking you to choose a Boss Pokemon to face:

Choose a Boss Pokemon (Default: Mewtwo):

Which Pokemon would you like to battle? (Or press Enter to face Mewtwo).

────────────────────────────────────────────────────────────

mewtwo

The battle has begun! Here is the current field state:

════════════════════════════════════════════════════════════
⚡ BOSS MEWTWO [Lv. 100]
Type:  PSYCHIC
HP:   [█████████████████████████] 1011/1011
────────────────────────────────────────────────────────────
🟢 YOUR SQUIRTLE [Lv. 50]
Type:  WATER
HP:   [█████████████████████████]   111/111
════════════════════════════════════════════════════════════

What will Squirtle do?
  1) Hydro Pump [ WATER ] Power: 110 (Spec)
  2) Surf [ WATER ] Power: 90 (Spec)
  3) Waterfall [ WATER ] Power: 80 (Phys)
  4) Tera Blast [ FIRE ] Power: 110 (Spec)
  5) Switch Pokemon
  6) Run Away

Select an action (1-6):

Please choose a move (1-4), switch Pokemon (5), or run (6).

────────────────────────────────────────────────────────────

1

● ManageTask(Input sent to task "42560c5f-0aec-4abf-b329-894a07ec5075/task-56".)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56) (ctrl+o to expand)

Oh no! Mewtwo was too fast:

• Mewtwo used Psychic! Dealt 465 damage.
• Squirtle fainted!

You are forced to send out your next Pokemon. Here are your remaining members:
You must send out another Pokemon!
Choose a Pokemon:
1) Squirtle [Lv. 50] - FAINTED (Active)
2) Mew [Lv. 50] - HP: 167/167
3) Lucario [Lv. 50] - HP: 137/137

Please choose either 2 (Mew) or 3 (Lucario):

Please choose 2 for Mew or 3 for Lucario.

────────────────────────────────────────────────────────────

2

Mew is now on the field:

════════════════════════════════════════════════════════════
⚡ BOSS MEWTWO [Lv. 100]
Type:  PSYCHIC
HP:   [█████████████████████████] 1011/1011
────────────────────────────────────────────────────────────
🟢 YOUR MEW [Lv. 50]
Type:  PSYCHIC
HP:   [█████████████████████████]   167/167
════════════════════════════════════════════════════════════

What will Mew do?
  1) Psychic [ PSYCHIC ] Power: 90 (Spec)
  2) Zen Headbutt [ PSYCHIC ] Power: 80 (Phys)
  3) Psybeam [ PSYCHIC ] Power: 65 (Spec)
  4) Thunder Punch [ FIRE ] Power: 110 (Phys)
  5) Switch Pokemon
  6) Run Away

Select option (1-6):
Please select an action from 1 to 6.
────────────────────────────────────────────────────────────

3

● ManageTask(Input sent to task "42560c5f-0aec-4abf-b329-894a07ec5075/task-56".)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56) (ctrl+o to expand)

Mewtwo is incredibly powerful and fast!

• Mewtwo went first and used Thunder Punch! Dealt 195 damage.
• Mew fainted!

You only have Lucario left:
Choose a Pokemon:
1) Squirtle [Lv. 50] - FAINTED
2) Mew [Lv. 50] - FAINTED (Active)
3) Lucario [Lv. 50] - HP: 137/137

Please select 3 (Lucario) to send it out:

Please select 3 to send out Lucario.

────────────────────────────────────────────────────────────

3

● ManageTask(Input sent to task "42560c5f-0aec-4abf-b329-894a07ec5075/task-56".)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56)
● ManageTask(Input sent to task "42560c5f-0aec-4abf-b329-894a07ec5075/task-56".)
● ManageTask(Task: 42560c5f-0aec-4abf-b329-894a07ec5075/task-56) (ctrl+o to expand)
Lucario is now on the field!

════════════════════════════════════════════════════════════
⚡ BOSS MEWTWO [Lv. 100]
Type:  PSYCHIC
HP:   [█████████████████████████] 1011/1011
────────────────────────────────────────────────────────────
🟢 YOUR LUCARIO [Lv. 50]
Type:  FIGHT   STEEL
HP:   [█████████████████████████]   137/137
════════════════════════════════════════════════════════════

What will Lucario do?
  1) Close Combat [ FIGHT ] Power: 120 (Phys)
  2) Aura Sphere [ FIGHT ] Power: 80 (Spec)
  3) Flash Cannon [ STEEL ] Power: 80 (Spec)
  4) Thunder Punch [ FIRE ] Power: 110 (Phys)
  5) Switch Pokemon
  6) Run Away

Select option (1-6):

Please select an option from 1 to 6.

────────────────────────────────────────────────────────────

4

Mewtwo was a bit too powerful for our team!

• Mewtwo went first and used Thunder Punch (Fire-type move)! Dealt 520 damage.
• Lucario fainted!

Since all your Pokemon fainted, you were defeated:

╔══════════╗
║  DEFEAT  ║
╚══════════╝

💀 All your Pokemon fainted. Boss Mewtwo wins.

The game has successfully returned to the main lobby shell:

poke-sim>

You can now:

Search and pick a larger or different team (e.g. up to 6 Pokemon with /pick team ) to try to take on Mewtwo again.
Select a weaker boss (e.g. /run battle and choose pikachu or bulbasaur as the boss) to test out your team!
Exit the simulator with /exit .

What would you like to do next? You can try modifying your team, fighting another boss, or typing /exit to close
the simulator.

AWS Cloud Shell with Antigravity CLI

xbill — Tue, 26 May 2026 19:07:25 +0000

This article covers the initial setup and configuration for the Antigravity CLI on the AWS Cloud Shell. Why? Sometimes you need more than one Cloud!

What is Antigravity CLI?

The Google Antigravity CLI is a lightweight Terminal User Interface (TUI) that allows developers to invoke, monitor, and interact with Google Antigravity AI agents directly from the command line. It pairs with Antigravity 2.0 to offer the same core agentic capabilities — like multi-step reasoning, tool calling, and sub-agents — without the overhead of a desktop application.

More information is available here:

Google Antigravity

Amazon Cloud Shell

AWS CloudShell is a browser-based, pre-authenticated terminal that allows users to securely manage, explore, and interact with AWS resources directly from the AWS Management Console. It comes pre-installed with the AWS CLI, popular programming languages (Python, Node.js), and developer tools, removing the need for local configuration.

The full background information on the Cloud Shell is here:

Browser-based Shell - AWS CloudShell - AWS

Why would I want Antigravity CLI on AWS? Isn’t that a Google Thing?

Yes- Antigravity CLI leverages the Google Gemini models but it is also platform agnostic. Many applications are already cross-cloud so this enables familiar tools to be run natively on Amazon AWS.

Activating Amazon CloudShell

The AWS CloudShell comes preinstalled on all AWS Accounts.

To access the shell — click the Terminal Icon in the upper right hand corner of the console:

Then activate the full shell:

Note- the disk space in the environment is limited to 1GB.

Download and Install agy CLI

The main Antigravity download page is here:

Google Antigravity

For a terminal window/ shell install:

curl -fsSL https://antigravity.google/cli/install.sh | bash

This will start the installation process:

⠋ Detecting system environment...
✓ Platform detected: linux_amd64
⠋ Querying release repository...
✓ Latest available version: 1.0.2
⠋ Downloading release package...
✓ Download complete and checksum verified.
⠋ Extracting binary from archive...
⠋ Configuring shell environment...
I0526 14:30:04.912941 241 installer.go:27] Running Antigravity CLI setup...
I0526 14:30:04.913030 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.bashrc: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.913261 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.bashrc
I0526 14:30:04.913319 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.zshrc: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.913665 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.zshrc
I0526 14:30:04.913726 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.zprofile: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.913856 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.zprofile
I0526 14:30:04.913898 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.bash_profile: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.914038 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.bash_profile
I0526 14:30:04.914070 241 installer_unix.go:40] PATH verification: ~/.local/bin is correctly configured in active PATH environment.

✅ Antigravity CLI installed successfully at /home/cloudshell-user/.local/bin/agy
Run 'agy' to start the CLI
~ $

Testing the Antigravity CLI Environment

Once you have the package installed — you can test the startup of Antigravity CLI. You will need to authenticate with a Google Project or your Google Account:

Then step through the process:

And finally the One time code:

Initial Configuration

Once you have installed and authenticated. Step through initial configuration:

Are we There Yet!?!?

If all the stars align and you do the correct sacrifices to the pagan Gods then at this point you should be able to start up the agy CLI and get down to business.

The startup command is agy:

agy

All Right Already! Can you actually do something?

Fire up that terminal window and get to work:

> what does the aws command line tool do?

  The AWS Command Line Interface (AWS CLI) is a unified tool that allows you to manage and control Amazon Web Services
  (AWS) directly from your terminal or command-line shell.

  Instead of clicking through the web-based AWS Management Console, you can interact with almost all AWS services by
  running commands.

  ### Key Capabilities

  • Service Management: You can provision, configure, and delete AWS resources across services like Amazon S3 (storage),
  Amazon EC2 (virtual servers), DynamoDB (databases), IAM (identity/access management), and more.
  • Automation and Scripting: Because it runs in the shell, you can write scripts (using Bash, Zsh, or PowerShell) to
  automate complex tasks, deployments, backups, and resource monitoring.
  • API Integration: The CLI is a wrapper around the AWS REST APIs. It handles request signing, authentication,
  pagination, and output formatting (supporting JSON, YAML, text, or tables).

  ### Basic Command Structure

  AWS CLI commands follow a standard pattern:

    aws <service> <operation> [options-and-parameters]

  ### Common Examples

  1. List files in an Amazon S3 bucket:
    aws s3 ls s3://my-bucket-name

  2. Upload a file to Amazon S3:
    aws s3 cp document.pdf s3://my-bucket-name/

  3. List running Amazon EC2 instances:
    aws ec2 describe-instances --filters "Name=instance-state-name,Values=running"

  4. Get the identity details of the current CLI user/role:
    aws sts get-caller-identity

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Describe the AWS environment:

  Here is the environment information retrieved using the AWS CLI tool:

  ### 1. AWS CLI Version Info

  • CLI Version: 2.34.50 
  • Python Version: 3.14.4 
  • OS Platform: Linux (Amazon Linux 2023, run from AWS CloudShell)

  ### 2. AWS CLI Configuration

  • Active Profile: None set (default profile is active)
  • Access & Secret Keys: Configured via IAM role credentials (container-role)
  • Default Region: us-east-1 (defined by the AWS_REGION or AWS_DEFAULT_REGION environment variables)

And AWS specific commands:

> aws s3 ls

● Bash(aws s3 ls --no-cli-pager) (ctrl+o to expand)

  Here is the list of Amazon S3 buckets in your account:

  • bedrock-agentcore-runtime-106059658660-us-west-2-0jicfccoj2 (Created: 2026-05-08 13:52:46)
  • cdk-hnb659fds-assets-106059658660-us-east-1 (Created: 2026-05-06 01:16:31)
  • elasticbeanstalk-us-east-1-106059658660 (Created: 2026-05-08 20:56:05)
  • my-adk-comic-artifacts (Created: 2026-04-21 16:11:53)
  • xbill-test (Created: 2026-03-10 14:22:51)

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Antigravity Commands

Get high level agy e nvironment info:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Antigravity CLI general commands shortcuts (←/→ or tab to cycle)
Antigravity CLI understands your codebase, makes edits with your permission,
and executes commands — right from your terminal.

Version 1.0.2
xbill9@gmail.com (Google AI Ultra)
Workspace: ~
Project: /home/cloudshell-user
Conversation: 2182496d-3225-4864-a42d-af69db5684bc

You can switch between models:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Switch Model

  Gemini 3.5 Flash (Medium) (current)
  Gemini 3.5 Flash (High)
> Gemini 3.5 Flash (Low)      
  Gemini 3.1 Pro (Low)
  Gemini 3.1 Pro (High)
  Claude Sonnet 4.6 (Thinking)
  Claude Opus 4.6 (Thinking)
  GPT-OSS 120B (Medium)

Get Context info:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
└ Context Usage
◉ ◉ ◉ ◉ ◉ ◉ ◉ ◉ ◉ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Gemini 3.5 Flash (Low) · 28.4k/1.0M tokens
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ (2.7%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Token usage by category
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ User messages: 19 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ Agent responses: 4.5k tokens (0.4%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ Tool calls: 163 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⛁ System prompt: 7.7k tokens (0.7%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⛁ System tools: 15.4k tokens (1.5%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⛁ Subagents: 662 tokens (0.1%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Free space: 1.0M (97.3%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⊠ Checkpoint buffer: 571 tokens (not counted in usage)

Checkpoints (1) · /rewind
└ Checkpoint 1 (active, in context): Initializing Coding Workspace

Check your quota:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
└ Model Quota

  Gemini 3.5 Flash (Medium)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.5 Flash (High)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.5 Flash (Low)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.1 Pro (Low)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

Summary

Antigravity CLI was introduced as a compliment to the existing GUI based IDE versions. Antigravity CLI was then installed from the official Google sources to the AWS Cloud Shell. Sample AWS specific querys were tested and some common slash commands were highlighted.

AWS Cloud Shell with Antigravity CLI

xbill — Tue, 26 May 2026 14:48:49 +0000

This article covers the initial setup and configuration for the Antigravity CLI on the AWS Cloud Shell. Why? Sometimes you need more than one Cloud!

What is Antigravity CLI?

More information is available here:

Google Antigravity

Amazon Cloud Shell

The full background information on the Cloud Shell is here:

Browser-based Shell - AWS CloudShell - AWS

Why would I want Antigravity CLI on AWS? Isn’t that a Google Thing?

Yes- Antigravity CLI leverages the Google Gemini models but it is also platform agnostic. Many applications are already cross-cloud so this enables familiar tools to be run natively on Amazon AWS.

Activating Amazon CloudShell

The AWS CloudShell comes preinstalled on all AWS Accounts.

To access the shell — click the Terminal Icon in the upper right hand corner of the console:

Then activate the full shell:

Note- the disk space in the environment is limited to 1GB.

Download and Install agy CLI

The main Antigravity download page is here:

Google Antigravity

For a terminal window/ shell install:

curl -fsSL https://antigravity.google/cli/install.sh | bash

This will start the installation process:

⠋ Detecting system environment...
✓ Platform detected: linux_amd64
⠋ Querying release repository...
✓ Latest available version: 1.0.2
⠋ Downloading release package...
✓ Download complete and checksum verified.
⠋ Extracting binary from archive...
⠋ Configuring shell environment...
I0526 14:30:04.912941 241 installer.go:27] Running Antigravity CLI setup...
I0526 14:30:04.913030 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.bashrc: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.913261 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.bashrc
I0526 14:30:04.913319 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.zshrc: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.913665 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.zshrc
I0526 14:30:04.913726 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.zprofile: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.913856 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.zprofile
I0526 14:30:04.913898 241 installer.go:147] Appending PATH export to profile /home/cloudshell-user/.bash_profile: export PATH="/home/cloudshell-user/.local/bin:$PATH"
I0526 14:30:04.914038 241 installer.go:184] Successfully updated shell profile: /home/cloudshell-user/.bash_profile
I0526 14:30:04.914070 241 installer_unix.go:40] PATH verification: ~/.local/bin is correctly configured in active PATH environment.

✅ Antigravity CLI installed successfully at /home/cloudshell-user/.local/bin/agy
Run 'agy' to start the CLI
~ $

Testing the Antigravity CLI Environment

Once you have the package installed — you can test the startup of Antigravity CLI. You will need to authenticate with a Google Project or your Google Account:

Then step through the process:

And finally the One time code:

Initial Configuration

Once you have installed and authenticated. Step through initial configuration:

Are we There Yet!?!?

If all the stars align and you do the correct sacrifices to the pagan Gods then at this point you should be able to start up the agy CLI and get down to business.

The startup command is agy:

agy

All Right Already! Can you actually do something?

Fire up that terminal window and get to work:

> what does the aws command line tool do?

  The AWS Command Line Interface (AWS CLI) is a unified tool that allows you to manage and control Amazon Web Services
  (AWS) directly from your terminal or command-line shell.

  Instead of clicking through the web-based AWS Management Console, you can interact with almost all AWS services by
  running commands.

  ### Key Capabilities

  • Service Management: You can provision, configure, and delete AWS resources across services like Amazon S3 (storage),
  Amazon EC2 (virtual servers), DynamoDB (databases), IAM (identity/access management), and more.
  • Automation and Scripting: Because it runs in the shell, you can write scripts (using Bash, Zsh, or PowerShell) to
  automate complex tasks, deployments, backups, and resource monitoring.
  • API Integration: The CLI is a wrapper around the AWS REST APIs. It handles request signing, authentication,
  pagination, and output formatting (supporting JSON, YAML, text, or tables).

  ### Basic Command Structure

  AWS CLI commands follow a standard pattern:

    aws <service> <operation> [options-and-parameters]

  ### Common Examples

  1. List files in an Amazon S3 bucket:
    aws s3 ls s3://my-bucket-name

  2. Upload a file to Amazon S3:
    aws s3 cp document.pdf s3://my-bucket-name/

  3. List running Amazon EC2 instances:
    aws ec2 describe-instances --filters "Name=instance-state-name,Values=running"

  4. Get the identity details of the current CLI user/role:
    aws sts get-caller-identity

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Describe the AWS environment:

  Here is the environment information retrieved using the AWS CLI tool:

  ### 1. AWS CLI Version Info

  • CLI Version: 2.34.50 
  • Python Version: 3.14.4 
  • OS Platform: Linux (Amazon Linux 2023, run from AWS CloudShell)

  ### 2. AWS CLI Configuration

  • Active Profile: None set (default profile is active)
  • Access & Secret Keys: Configured via IAM role credentials (container-role)
  • Default Region: us-east-1 (defined by the AWS_REGION or AWS_DEFAULT_REGION environment variables)

And AWS specific commands:

> aws s3 ls

● Bash(aws s3 ls --no-cli-pager) (ctrl+o to expand)

  Here is the list of Amazon S3 buckets in your account:

  • bedrock-agentcore-runtime-106059658660-us-west-2-0jicfccoj2 (Created: 2026-05-08 13:52:46)
  • cdk-hnb659fds-assets-106059658660-us-east-1 (Created: 2026-05-06 01:16:31)
  • elasticbeanstalk-us-east-1-106059658660 (Created: 2026-05-08 20:56:05)
  • my-adk-comic-artifacts (Created: 2026-04-21 16:11:53)
  • xbill-test (Created: 2026-03-10 14:22:51)

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Antigravity Commands

Get high level agy e nvironment info:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Antigravity CLI general commands shortcuts (←/→ or tab to cycle)
Antigravity CLI understands your codebase, makes edits with your permission,
and executes commands — right from your terminal.

Version 1.0.2
xbill9@gmail.com (Google AI Ultra)
Workspace: ~
Project: /home/cloudshell-user
Conversation: 2182496d-3225-4864-a42d-af69db5684bc

You can switch between models:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Switch Model

  Gemini 3.5 Flash (Medium) (current)
  Gemini 3.5 Flash (High)
> Gemini 3.5 Flash (Low)      
  Gemini 3.1 Pro (Low)
  Gemini 3.1 Pro (High)
  Claude Sonnet 4.6 (Thinking)
  Claude Opus 4.6 (Thinking)
  GPT-OSS 120B (Medium)

Get Context info:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
└ Context Usage
◉ ◉ ◉ ◉ ◉ ◉ ◉ ◉ ◉ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Gemini 3.5 Flash (Low) · 28.4k/1.0M tokens
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ (2.7%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Token usage by category
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ User messages: 19 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ Agent responses: 4.5k tokens (0.4%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ Tool calls: 163 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⛁ System prompt: 7.7k tokens (0.7%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⛁ System tools: 15.4k tokens (1.5%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⛁ Subagents: 662 tokens (0.1%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Free space: 1.0M (97.3%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ⊠ Checkpoint buffer: 571 tokens (not counted in usage)

Checkpoints (1) · /rewind
└ Checkpoint 1 (active, in context): Initializing Coding Workspace

Check your quota:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
└ Model Quota

  Gemini 3.5 Flash (Medium)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.5 Flash (High)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.5 Flash (Low)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.1 Pro (Low)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

Summary

Azure Cloud Shell with Antigravity CLI

xbill — Tue, 26 May 2026 14:18:48 +0000

This article covers the initial setup and configuration for the Antigravity CLI on the Azure Cloud Shell environment.

What is Antigravity CLI?

More information is available here:

Google Antigravity

Azure Cloud Shell

Azure Cloud Shell is a browser-based, authenticated terminal for managing Azure resources, accessible directly through the Azure portal, shell.azure.com, or the mobile app. It provides a preconfigured environment with Bash or PowerShell, eliminating local installation of tools. It runs on a temporary, secure, Linux-based host.

The full background information on the Cloud Shell is here:

https://azure.microsoft.com/en-us/get-started/azure-portal/cloud-shell

More information is available here:

Get started with Azure Cloud Shell

Why would I want Antigravity CLI on Azure? Isn’t that a Google Thing?

Yes- Antigravity CLI leverages the Google Cloud console and Gemini models but it is also platform agnostic. Many applications are already cross-cloud so this enables familiar tools to be run natively on Microsoft Azure.

Activating Azure Cloud Shell

The Azure Cloud Shell comes preinstalled on all Azure Accounts.

To access the shell — click the Terminal Icon in the upper right hand corner of the console:

Then activate the full shell:

Note- the disk space in the environment is limited to 5GB- but this is sufficient for Antigravity CLI, Google Cloud Tools, and some customization.

Azure CLI

The Azure Command-Line Interface (CLI) is a cross-platform tool used to connect to Azure and execute administrative commands on Azure resources. It allows you to manage services like virtual machines, databases, and networking through a terminal using interactive prompts or scripts. [1, 2, 3]

More details are available here:

What is the Azure Developer CLI?

Download and Install agy CLI

The main Antigravity CLI download page is here:

Google Antigravity

For a terminal window/ shell install:

curl -fsSL https://antigravity.google/cli/install.sh | bash

This will start the installation process:

william [~]$ curl -fsSL https://antigravity.google/cli/install.sh | bash
⠋ Detecting system environment...
✓ Platform detected: linux_amd64
⠋ Querying release repository...
✓ Latest available version: 1.0.2
⠋ Downloading release package...
✓ Download complete and checksum verified.
⠋ Extracting binary from archive...
⠋ Configuring shell environment...
I0526 14:06:50.576172 5501 installer.go:27] Running Antigravity CLI setup...
I0526 14:06:50.632909 5501 installer_unix.go:40] PATH verification: ~/.local/bin is correctly configured in active PATH environment.

✅ Antigravity CLI installed successfully at /home/william/.local/bin/agy
Run 'agy' to start the CLI
william [~]$

Testing the Antigravity CLI Environment

Once you have the package installed — you can test the startup of Antigravity CLI. You will need to authenticate with a Google Project or your Google Account:

Then step through the process:

And finally the One time code:

Initial Configuration

Once you have installed and authenticated. Step through initial configuration:

Are we There Yet!?!?

If all the stars align and you do the correct sacrifices to the pagan Gods then at this point you should be able to start up the agy CLI and get down to business.

The startup command is agy:

Run 'agy' to start the CLI
william [~]$ agy

      ▄▀▀▄ Antigravity CLI 1.0.2
     ▀▀▀▀▀▀ xbill9@gmail.com (Google AI Ultra)
    ▀▀▀▀▀▀▀▀ Gemini 3.5 Flash (Medium)
   ▄▀▀ ▀▀▄ ~
  ▄▀▀ ▀▀▄

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
? for shortcuts

All Right Already! Can you actually do something with Antigravity CLI?

Fire up that terminal window and get to work:

> what is the azure cloud shell?

● WebSearch(Azure Cloud Shell definition features) (ctrl+o to expand)

  Azure Cloud Shell is a browser-accessible, authenticated, and pre-configured terminal environment hosted by Microsoft in the cloud.
  It is designed to let you manage and administer your Azure resources directly from your web browser without needing to install the
  Azure CLI or PowerShell locally.

  ### Key Features & Capabilities:

  1. Browser-Based and Accessible:
      • It requires no local software installations or complex setups.
      • You can access it from virtually anywhere via the Azure Portal https://portal.azure.com, directly at https://shell.azure.com,
      through the Azure Mobile App, or via integrations like the VS Code Azure Account extension.
  2. Choice of Environment:
      • Bash: A Linux-based environment (running on a container) ideal for standard bash scripting and using tools like git , make ,
      etc.
      • PowerShell: A Windows-friendly administrative shell that includes Azure Drive ( Azure: ), allowing you to navigate your Azure
      resources as if they were directories on a filesystem.
  3. Pre-configured Tools and Languages:
      • It comes pre-installed with common command-line utilities ( kubectl , terraform , ansible , docker , helm , git , etc.).
      • Includes programming runtimes and compilers (such as Python, Node.js, .NET, Go, and Java).
      • Features built-in text editors (such as nano , vim , and a built-in web-based graphical editor based on Monaco).
  4. Automatic Authentication:
      • Because it is authenticated through your Azure Portal session, you are automatically logged in when the terminal starts,
      allowing you to run management commands immediately without entering credentials.

Get detailed Azure Cloud Shell information:

> use the az command tool to get environment information
● ListDir(/home/william)
● Bash(az version)
● Bash(az account show) (ctrl+o to expand)

  I ran the az CLI to gather your environment information:

  ### Azure CLI Version ( az version )

    {
      "azure-cli": "2.86.0",
      "azure-cli-core": "2.86.0",
      "azure-cli-telemetry": "1.1.0",
      "extensions": {
        "ml": "2.42.0",
        "ssh": "2.0.8"
      }
    }

  ### Active Subscription and Account Information ( az account show )

    {
      "environmentName": "AzureCloud",
      "homeTenantId": "40482c55-d00d-4c6d-8903-643d76a74b9c",
      "id": "3db3ce66-50b6-4d11-91ef-5950cf4039ed",
      "isDefault": true,
      "managedByTenants": [],
      "name": "Azure subscription 1",
      "state": "Enabled",
      "tenantId": "40482c55-d00d-4c6d-8903-643d76a74b9c",
      "user": {
        "cloudShellID": true,
        "name": "live.com#xbill@glitnir.com",
        "type": "user"
      }
    }

──────────────────────────────────────────────────────────────────────────────────────────────────────────
>

Get high level agy e nvironment info:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────
Antigravity CLI general commands shortcuts (←/→ or tab to cycle)
Antigravity CLI understands your codebase, makes edits with your permission,
and executes commands — right from your terminal.

Version 1.0.2
xbill9@gmail.com (Google AI Ultra)
Workspace: ~
Project: /home/william

Quick Reference
  / Type / to see available commands

Keyboard: ↑/↓ Navigate ←/→ Switch View esc Close

esc to cancel

You can switch between models:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────
Switch Model

  Gemini 3.5 Flash (Medium)
  Gemini 3.5 Flash (High)
> Gemini 3.5 Flash (Low) (current)
  Gemini 3.1 Pro (Low)
  Gemini 3.1 Pro (High)
  Claude Sonnet 4.6 (Thinking)
  Claude Opus 4.6 (Thinking)
  GPT-OSS 120B (Medium)

Get Context info:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────
└ Context Usage
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Gemini 3.5 Flash (Low) · 0/1.0M tokens
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Estimated usage (awaiting generation)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ User messages: 0 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ Agent responses: 0 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ ◉ Tool calls: 0 tokens (0.0%)
□ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ □ Free space: 1.0M (100.0%)

Related: /artifact · /skill · /rewind

Check your quota:

>
──────────────────────────────────────────────────────────────────────────────────────────────────────────
└ Model Quota

  Gemini 3.5 Flash (Medium)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.5 Flash (High)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.5 Flash (Low)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.1 Pro (Low)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Gemini 3.1 Pro (High)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Claude Sonnet 4.6 (Thinking)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  Claude Opus 4.6 (Thinking)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  Quota available

  GPT-OSS 120B (Medium)
  ███████████ ███████████ ███████████ ███████████ ███████████ 100%
  (1–30 of 33 lines)

Summary

Antigravity CLI was introduced as a compliment to the existing GUI based IDE versions. Antigravity CLI was then installed to Azure Cloud Shell from the official Google sources. A sample query was tested and some common slash commands were highlighted.

Deploying a Rust A2A Agent to Azure Container Apps (ACA)

xbill — Fri, 22 May 2026 15:46:31 +0000

This article covers building Agent applications with the Agent to Agent (A2A) protocol using the Rust programming language. A minimally viable A2A Rust Agent application was debugged and validated locally. Then- the entire solution is deployed to Azure Container Apps (ACA).

Rust A2A? Isn’t that a Python Thing?

The bulk of A2A Agents are in Python. The A2A protocol is language independent.

Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a test bed for building, debugging, and deploying cross language applications.

So where is the Beef?

So what is different about this lab compared to all the others out there?

This is one of the first deep dives into a Rust A2A agent leveraging the standard Rust crates.

What is Rust?

Rust is a high performance, memory safe, compiled language:

Rust

Rust provides memory safe operations beyond C/C++ and also can provide exceptional performance gains as it is compiled directly to native binaries.

Rust Setup

Instructions to install Rust are available here:

Getting started

For a Linux like environment the command looks like this:

curl — proto ‘=https’ — tlsv1.2 -sSf https://sh.rustup.rs | sh

Rust also depends on a working C compiler and OpenSSL setup. For a Debian 12/13 system — install the basic tools for development:

sudo apt install build-essential
sudo apt install libssl-dev
sudo apt install pkg-config
sudo apt-get install libudev-dev
sudo apt install make
sudo apt install git

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

Azure Container Apps

Azure Container Apps is a fully managed, serverless Kubernetes-based application platform designed for building and deploying modern, containerized apps without managing complex infrastructure. It enables scaling from zero to high demand, supports microservices, and handles event-driven processing with built-in HTTPS and observability.

Full details are available here:

https://azure.microsoft.com/en-us/products/container-apps

Why would I want A2A with Azure? Isn’t that a Google Thing?

A2A is open source and platform agnostic. Many applications are already cross-cloud so this enables familiar tools to be run natively on Microsoft Azure.

Where do I start?

The strategy for starting Rust A2A development is a incremental step by step approach.

First, the basic development environment is setup with the required system variables, and a working Antigravity CLI configuration.

Then, a Rust A2A agent is built, debugged, and tested locally.

Setup the Basic Environment

At this point you should have a working Python environment and a working Antigravity CLI installation. All of the relevant code examples and documentation is available in GitHub.

The next step is to clone the GitHub repository to your local environment:

cd ~
git clone https://github.com/xbill9/a2a-hello-world
cd poly-aca-rust-azure

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

source init.sh

If your session times out or you need to re-authenticate- you can run the set_env.sh script to reset your environment variables:

source set_env.sh

Variables like PROJECT_ID need to be setup for use in the various build scripts- so the set_env script can be used to reset the environment if you time-out.

Rust A2A Libraries

There are several crates that provide A2A support. This project uses the a2a-rs crate:

crates.io: Rust Package Registry

Here is a sample Cargo.TOML:

[dependencies]
tokio = { version = "^1.37.0", features = ["full"] }
anyhow = "1.0.86"
a2a-rs = { version = "0.2.0", features = ["full"] }
futures = "0.3"
async-trait = "0.1.80"

Minimal System Information Tool Build

The first step is to build the basic tool directly with Rust. This allows the tool to be debugged and tested locally before adding the MCP layer.

First build the tool locally:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make build
Building the Rust project...
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.32s
xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$

then lint check the code:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make lint
Linting code...
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.27s
Checking formatting...

and run local tests:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make test
Running tests...
   Compiling a2a-server-rust v0.2.0 (/home/xbill/a2a-hello-world/poly-aca-rust-azure)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 1.39s
     Running unittests src/main.rs (target/debug/deps/a2a_server_rust-5d07a7739a477350)

running 2 tests
test tests::test_simple_agent_handler_creation ... ok
test tests::test_task_creation ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

The last step is to build the production version:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make release
Building Release...
    Finished `release` profile [optimized] target(s) in 0.26s
xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$

The A2A server can be started locally:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make start
Starting the A2A Rust server on port 8080...
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.17s
     Running `target/debug/a2a-server-rust`
🚀 Starting A2A Rust Server
==============================
🌐 Starting HTTP server on 0.0.0.0:8080...
🔗 HTTP server listening on http://0.0.0.0:8080
2026-05-21T20:01:10.860135Z INFO main ThreadId(01) start{server.address=0.0.0.0:8080 server.has_auth=false}: a2a_rs::adapter::transport::http::server: Starting HTTP server
2026-05-21T20:01:10.860358Z INFO main ThreadId(01) start{server.address=0.0.0.0:8080 server.has_auth=false}: a2a_rs::adapter::transport::http::server: HTTP server listening on 0.0.0.0:8080

Check The Local Agent Status

The project has a target to verify that the A2A server started:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make status
Checking Azure Container App status for a2a-app-penguin...
Name ProvisioningState FQDN
--------------- ------------------- ---------------------------------------------------------------
a2a-app-penguin Succeeded a2a-app-penguin.icyplant-a768d75c.westus2.azurecontainerapps.io
--- Service Status ---
Local (8080): ONLINE (A2A Rust Agent)

A2A Inspector

The A2A Inspector provides a tool to verify A2A operations.

Background information is here:

Announcing the A2A Inspector: A UI tool for A2A protocol development

GitHub Repo is here:

GitHub - a2aproject/a2a-inspector: Validation Tools for A2A Agents

Verify The Local A2A Installation

Start the A2A Inspector and use localhost:8080:

You should see the details of the Agent Card:

{
  "capabilities": {
    "pushNotifications": false,
    "stateTransitionHistory": false,
    "streaming": true
  },
  "defaultInputModes": [
    "text"
  ],
  "defaultOutputModes": [
    "text"
  ],
  "description": "An A2A agent using the a2a-rs crate",
  "documentationUrl": "https://example.org/docs",
  "name": "A2A Rust Agent",
  "preferredTransport": "JSONRPC",
  "protocolVersion": "0.3.0",
  "provider": {
    "organization": "Example Organization",
    "url": "https://example.org"
  },
  "skills": [
    {
      "description": "Echoes back the user's message",
      "examples": [
        "Echo: Hello World"
      ],
      "id": "echo",
      "inputModes": [
        "text"
      ],
      "name": "Echo Skill",
      "outputModes": [
        "text"
      ],
      "tags": [
        "echo",
        "respond"
      ]
    }
  ],
  "url": "http://0.0.0.0:8080",
  "version": "1.0.0"
}

Test the Local A2A Connection Locally

This step tests the A2A agent interactions with a test script:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make card
Fetching local agent card...
{
    "name": "A2A Rust Agent",
    "description": "An A2A agent using the a2a-rs crate",
    "url": "http://0.0.0.0:8080",
    "provider": {
        "organization": "Example Organization",
        "url": "https://example.org"
    },
    "version": "1.0.0",
    "protocolVersion": "0.3.0",
    "preferredTransport": "JSONRPC",
    "documentationUrl": "https://example.org/docs",
    "capabilities": {
        "streaming": false,
        "pushNotifications": false,
        "stateTransitionHistory": false
    },
    "defaultInputModes": [
        "text"
    ],
    "defaultOutputModes": [
        "text"
    ],
    "skills": [
        {
            "id": "echo",
            "name": "Echo Skill",
            "description": "Echoes back the user's message",
            "tags": [
                "echo",
                "respond"
            ],
            "examples": [
                "Echo: Hello World"
            ],
            "inputModes": [
                "text"
            ],
            "outputModes": [
                "text"
            ]
        }
    ]
}

and the A2A can be tested:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make a2a-local
Running local A2A echo test...
🚀 Testing A2A Echo Skill at http://localhost:8080
💬 Sending message: 'Hello from the test program!'
✅ Received echo: 'Echo: Hello from the test program!'
🌟 Success! The echo skill is working correctly.
xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$

So What Just Happened?

The Rust A2A agent was started locally. This agent provided a standard A2A agent card. Then, test scripts performed a A2A skills call against the locally running Rust A2A server. Because the A2A server in Rust provides standard tools- the A2A inspector could connect. The actual implementation language of the A2A code does not matter — as long as standard services are exposed.

Deploy to ACA Azure

Once the Agent has been validated and tested locally- the solution can be deployed to Azure ACA. Run the deploy target in the Makefile:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make deploy

Once the build is done — you can check the status:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make status
Checking Azure Container App status for a2a-app-penguin...
Name ProvisioningState FQDN
--------------- ------------------- ---------------------------------------------------------------
a2a-app-penguin Succeeded a2a-app-penguin.icyplant-a768d75c.westus2.azurecontainerapps.io
--- Service Status ---
Local (8080): ONLINE (A2A Rust Agent)
Remote (Cloud): ONLINE (A2A Rust Agent) - https://a2a-app-penguin.icyplant-a768d75c.westus2.azurecontainerapps.io
xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$

and get the endpoint:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make endpoint
a2a-app-penguin.icyplant-a768d75c.westus2.azurecontainerapps.io

Verify the Cloud Run Service

The Rust A2A service will be visible from the Azure console:

Testing Azure Deployment

The Makefile has several tools for validating the remote A2A server.

First — you can get the remote Agent card:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make card-remote
Fetching remote agent card...
{
    "name": "A2A Rust Agent",
    "description": "An A2A agent using the a2a-rs crate",
    "url": "http://0.0.0.0:8080",
    "provider": {
        "organization": "Example Organization",
        "url": "https://example.org"
    },
    "version": "1.0.0",
    "protocolVersion": "0.3.0",
    "preferredTransport": "JSONRPC",
    "documentationUrl": "https://example.org/docs",
    "capabilities": {
        "streaming": false,
        "pushNotifications": false,
        "stateTransitionHistory": false
    },
    "defaultInputModes": [
        "text"
    ],
    "defaultOutputModes": [
        "text"
    ],
    "skills": [
        {
            "id": "echo",
            "name": "Echo Skill",
            "description": "Echoes back the user's message",
            "tags": [
                "echo",
                "respond"
            ],
            "examples": [
                "Echo: Hello World"
            ],
            "inputModes": [
                "text"
            ],
            "outputModes": [
                "text"
            ]
        }
    ]
}

and run a basic test:

xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$ make a2a-remote
Running remote A2A echo test...
🚀 Testing A2A Echo Skill at https://a2a-app-penguin.icyplant-a768d75c.westus2.azurecontainerapps.io
💬 Sending message: 'Hello from the test program!'
✅ Received echo: 'Echo: Hello from the test program!'
🌟 Success! The echo skill is working correctly.
xbill@penguin:~/a2a-hello-world/poly-aca-rust-azure$

Crates.io

The full package is available on GitHub and crates.io:

https://crates.io/crates/a2a-server-rust-azure

The package details are here:

Summary

A complete A2A server was built using Rust. Basic validation was done with the A2A inspector. Next, test scripts were built that directly called the Rust A2A server. Finally, the complete solution was deployed to Azure ACA.

Deploying a Rust A2A Agent to AWS Lightsail

xbill — Fri, 22 May 2026 02:17:27 +0000

Rust A2A? Isn’t that a Python Thing?

The bulk of A2A Agents are in Python. The A2A protocol is language independent.

Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a test bed for building, debugging, and deploying cross language applications.

So is this the Real Deal(TM)?

So what is different about this lab compared to all the others out there?

This is one of the first deep dives into a Rust A2A agent leveraging the standard Rust crates.

What is Rust?

Rust is a high performance, memory safe, compiled language:

Rust

Rust provides memory safe operations beyond C/C++ and also can provide exceptional performance gains as it is compiled directly to native binaries.

Rust Setup

Instructions to install Rust are available here:

Getting started

For a Linux like environment the command looks like this:

curl — proto ‘=https’ — tlsv1.2 -sSf https://sh.rustup.rs | sh

Rust also depends on a working C compiler and OpenSSL setup. For a Debian 12/13 system — install the basic tools for development:

sudo apt install build-essential
sudo apt install libssl-dev
sudo apt install pkg-config
sudo apt-get install libudev-dev
sudo apt install make
sudo apt install git

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

AWS CLI

The AWS CLI provides a command line tool to directly access AWS services from your current environment. Full details on the CLI are available here:

Install Docker, AWS CLI, and the Lightsail Control plugin for containers

You can version check the tool after installation:

xbill@penguin:~/gemini-cli-aws/mcp-lightsai-rust-aws$ aws --version
aws-cli/2.34.43 Python/3.14.4 Linux/6.6.99-09128-g14e87a8a9b71 exe/x86_64.debian.12

Amazon Lightsail

Amazon Lightsail is an easy-to-use virtual private server (VPS) provider and cloud platform designed by AWS for simpler workloads, offering developers pre-configured compute, storage, and networking for a low, predictable monthly price. It is ideal for hosting small websites, simple web apps, or creating development environments.

More information is available on the official site here:

Amazon's Simple Cloud Server | Amazon Lightsail

And this is the direct URL to the console:

https://lightsail.aws.amazon.com/ls/webapp/home/containers

The Lightsail console will look similar to:

Where do I start?

The strategy for starting Rust A2A development is a incremental step by step approach.

First, the basic development environment is setup with the required system variables, and a working Antigravity CLI configuration.

Then, a Rust A2A agent is built, debugged, and tested locally.

Setup the Basic Environment

At this point you should have a working Python environment and a working Antigravity CLI installation. All of the relevant code examples and documentation is available in GitHub.

The next step is to clone the GitHub repository to your local environment:

cd ~
git clone https://github.com/xbill9/a2a-hello-world
cd poly-lightsail-rust-aws

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

source init.sh

If your session times out or you need to re-authenticate- you can run the set_env.sh script to reset your environment variables:

source set_env.sh

Variables like PROJECT_ID need to be setup for use in the various build scripts- so the set_env script can be used to reset the environment if you time-out.

Rust A2A Libraries

There are several crates that provide A2A support. This project uses the a2a-rs crate:

crates.io: Rust Package Registry

Here is a sample Cargo.TOML:

[dependencies]
tokio = { version = "^1.37.0", features = ["full"] }
anyhow = "1.0.86"
a2a-rs = { version = "0.2.0", features = ["full"] }
futures = "0.3"
async-trait = "0.1.80"

Minimal System Information Tool Build

The first step is to build the basic tool directly with Rust. This allows the tool to be debugged and tested locally before adding the MCP layer.

First build the tool locally:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make
Building the Rust project...
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.32s
xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$

then lint check the code:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make
Building the Rust project...
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$

and run local tests:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make test
Running tests...
   Compiling a2a-server-rust v0.2.0 (/home/xbill/a2a-hello-world/poly-lightsail-rust-aws)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 2.31s
     Running unittests src/main.rs (target/debug/deps/a2a_server_rust-5d07a7739a477350)

running 2 tests
test tests::test_task_creation ... ok
test tests::test_simple_agent_handler_creation ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

The last step is to build the production version:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make release
Building Release...
    Finished `release` profile [optimized] target(s) in 0.29s
xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$

The A2A server can be started locally:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make start
Starting the A2A Rust server on port 8080...
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.19s
     Running `target/debug/a2a-server-rust`
🚀 Starting A2A Rust Server
==============================
🌐 Starting HTTP server on 0.0.0.0:8080...
🔗 HTTP server listening on http://0.0.0.0:8080
2026-05-21T17:05:41.115301Z INFO main ThreadId(01) start{server.address=0.0.0.0:8080 server.has_auth=false}: a2a_rs::adapter::transport::http::server: Starting HTTP server
2026-05-21T17:05:41.115593Z INFO main ThreadId(01) start{server.address=0.0.0.0:8080 server.has_auth=false}: a2a_rs::adapter::transport::http::server: HTTP server listening on 0.0.0.0:8080

Check The Local Agent Status

The project has a target to verify that the A2A server started:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make status
Checking AWS Lightsail service status for a2a-lightsail-rust-aws...
--------------------------------------------
| GetContainerServices |
+-------------+--------+-----------+-------+
| Deployment | Power | State | URL |
+-------------+--------+-----------+-------+
| ACTIVE | nano | RUNNING | None |
+-------------+--------+-----------+-------+
--- Service Status ---
Local (8080): ONLINE (A2A Rust Agent)
Remote (Cloud): ONLINE (A2A Rust Agent) - https://a2a-lightsail-rust-aws.6wpv8vensby5c.us-east-1.cs.amazonlightsail.com

A2A Inspector

The A2A Inspector provides a tool to verify A2A operations.

Background information is here:

Announcing the A2A Inspector: A UI tool for A2A protocol development

GitHub Repo is here:

GitHub - a2aproject/a2a-inspector: Validation Tools for A2A Agents

Verify The Local A2A Installation

Start the A2A Inspector and use localhost:8080:

You should see the details of the Agent Card:

{
  "capabilities": {
    "pushNotifications": false,
    "stateTransitionHistory": false,
    "streaming": true
  },
  "defaultInputModes": [
    "text"
  ],
  "defaultOutputModes": [
    "text"
  ],
  "description": "An A2A agent using the a2a-rs crate",
  "documentationUrl": "https://example.org/docs",
  "name": "A2A Rust Agent",
  "preferredTransport": "JSONRPC",
  "protocolVersion": "0.3.0",
  "provider": {
    "organization": "Example Organization",
    "url": "https://example.org"
  },
  "skills": [
    {
      "description": "Echoes back the user's message",
      "examples": [
        "Echo: Hello World"
      ],
      "id": "echo",
      "inputModes": [
        "text"
      ],
      "name": "Echo Skill",
      "outputModes": [
        "text"
      ],
      "tags": [
        "echo",
        "respond"
      ]
    }
  ],
  "url": "http://0.0.0.0:8080",
  "version": "1.0.0"
}

Test the Local A2A Connection Locally

This step tests the A2A agent interactions with a test script:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make card
Fetching local agent card...
{
    "name": "A2A Rust Agent",
    "description": "An A2A agent using the a2a-rs crate",
    "url": "http://0.0.0.0:8080",
    "provider": {
        "organization": "Example Organization",
        "url": "https://example.org"
    },
    "version": "1.0.0",
    "protocolVersion": "0.3.0",
    "preferredTransport": "JSONRPC",
    "documentationUrl": "https://example.org/docs",
    "capabilities": {
        "streaming": false,
        "pushNotifications": false,
        "stateTransitionHistory": false
    },
    "defaultInputModes": [
        "text"
    ],
    "defaultOutputModes": [
        "text"
    ],
    "skills": [
        {
            "id": "echo",
            "name": "Echo Skill",
            "description": "Echoes back the user's message",
            "tags": [
                "echo",
                "respond"
            ],
            "examples": [
                "Echo: Hello World"
            ],
            "inputModes": [
                "text"
            ],
            "outputModes": [
                "text"
            ]
        }
    ]
}

and the A2A can be tested:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make a2a-local
Running local A2A echo test...
🚀 Testing A2A Echo Skill at http://localhost:8080
💬 Sending message: 'Hello from the test program!'
✅ Received echo: 'Echo: Hello from the test program!'
🌟 Success! The echo skill is working correctly.
xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$

So What Just Happened?

Deploy to AWS Lightsail

Once the Agent has been validated and tested locally- the solution can be deployed to AWS Lightsail. Run the deploy target in the Makefile:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make deploy

Once the build is done — you can check the status:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make status
Checking AWS Lightsail service status for a2a-lightsail-rust-aws...
--------------------------------------------
| GetContainerServices |
+-------------+--------+-----------+-------+
| Deployment | Power | State | URL |
+-------------+--------+-----------+-------+
| ACTIVE | nano | RUNNING | None |
+-------------+--------+-----------+-------+
--- Service Status ---
Local (8080): ONLINE (A2A Rust Agent)
Remote (Cloud): ONLINE (A2A Rust Agent) - [https://a2a-lightsail-rust-aws.6wpv8vensby5c.us-east-1.cs.amazonlightsail.com](https://a2a-lightsail-rust-aws.6wpv8vensby5c.us-east-1.cs.amazonlightsail.com)
xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$

and get the endpoint:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make endpoint
https://a2a-lightsail-rust-aws.6wpv8vensby5c.us-east-1.cs.amazonlightsail.com/

Verify the Cloud Run Service

The Rust A2A service will be visible from the Lightsail console:

Testing AWS Deployment

The Makefile has several tools for validating the remote A2A server.

First — you can get the remote Agent card:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make card-remote
Fetching remote agent card...
{
    "name": "A2A Rust Agent",
    "description": "An A2A agent using the a2a-rs crate",
    "url": "http://0.0.0.0:8080",
    "provider": {
        "organization": "Example Organization",
        "url": "https://example.org"
    },
    "version": "1.0.0",
    "protocolVersion": "0.3.0",
    "preferredTransport": "JSONRPC",
    "documentationUrl": "https://example.org/docs",
    "capabilities": {
        "streaming": false,
        "pushNotifications": false,
        "stateTransitionHistory": false
    },
    "defaultInputModes": [
        "text"
    ],
    "defaultOutputModes": [
        "text"
    ],
    "skills": [
        {
            "id": "echo",
            "name": "Echo Skill",
            "description": "Echoes back the user's message",
            "tags": [
                "echo",
                "respond"
            ],
            "examples": [
                "Echo: Hello World"
            ],
            "inputModes": [
                "text"
            ],
            "outputModes": [
                "text"
            ]
        }
    ]
}

and run a basic test:

xbill@penguin:~/a2a-hello-world/poly-lightsail-rust-aws$ make a2a-remote
Running remote A2A echo test...
🚀 Testing A2A Echo Skill at https://a2a-lightsail-rust-aws.6wpv8vensby5c.us-east-1.cs.amazonlightsail.com
💬 Sending message: 'Hello from the test program!'
✅ Received echo: 'Echo: Hello from the test program!'
🌟 Success! The echo skill is working correctly.

Crates.io

The full package is available on GitHub and crates.io:

crates.io: Rust Package Registry

The package details are here:

Summary

Local MCP Development with Python and Antigravity CLI

xbill — Thu, 21 May 2026 15:35:20 +0000

This article re-visits and updates an existing deployment originally built with Gemini CLI and re-tools the setup to leverage the new Antigravity CLI to build Model Context Protocol (MCP) AI applications in Python with a local development environment.

Is this Deja Vu?

There is no spoon.

This article updates an earlier deployment built out with Gemini CLI and re-tools the entire setup for Antigravity CLI.

The original article is here:

Local MCP Development with Python and Gemini CLI

Aren’t There a Quadrillion Python MCP Demos?

Yes there are.

Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a minimal viable basic working MCP stdio server that can be run locally without any unneeded extra code or extensions.

Antigravity CLI

Antigravity CLI is the follow-on successor to Gemini CLI- the terminal driven, agent assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place- you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

Python MCP Documentation

The official GitHub Repo provides samples and documentation for getting started:

GitHub - modelcontextprotocol/python-sdk: The official Python SDK for Model Context Protocol servers and clients

The most common MCP Python deployment path uses the FASTMCP library:

Welcome to FastMCP - FastMCP

Where do I start?

The strategy for starting MCP development is a incremental step by step approach.

First, the basic development environment is setup with the required system variables, and a working Antigravity CLI configuration.

Then, a minimal Hello World Style Python MCP Server is built with stdio transport. This server is validated with Antigravity CLI in the local environment.

This setup validates the connection from Antigravity CLI to the local process via MCP. The MCP client (Antigravity CLI) and the Python MCP server both run in the same local environment.

Next- the basic MCP server is extended with Antigravity CLI to add several new tools in standard Python code.

Setup the Basic Environment

At this point you should have a working Python interpreter and a working Antigravity CLI installation. The next step is to clone the GitHub samples repository with support scripts:

cd ~
git clone https://github.com/xbill9/gemini-cli-codeassist

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

cd gemini-cli-codeassist
source init.sh

If your session times out or you need to re-authenticate- you can run the set_env.sh script to reset your environment variables:

cd gemini-cli-codeassist
source set_env.sh

Variables like PROJECT_ID need to be setup for use in the various build scripts- so the set_env script can be used to reset the environment if you time-out.

Hello World with STDIO Transport

One of the key features that the standard MCP libraries provide is abstracting various transport methods.

The high level MCP tool implementation is the same no matter what low level transport channel/method that the MCP Client uses to connect to a MCP Server.

The connection over stdio will look similar to this:

if __name__ == " __main__":
    # Explicitly use stdio transport
    mcp.run(transport="stdio")

Running the Python Code

First- switch the directory with the Python MCP sample code:

cd ~/gemini-cli-codeassist/mcp-stdio-python-agy

Run the release version on the local system:

cd ~/gemini-cli-codeassist/mcp-stdio-python-agy
make release

You can validate the final result by checking the messages:

Requirement already satisfied: cryptography>=3.4.0 in /home/xbill/.local/lib/python3.13/site-packages (from pyjwt[crypto]>=2.10.1->mcp->-r requirements.txt (line 1)) (48.0.0)
Requirement already satisfied: cffi>=2.0.0 in /home/xbill/.local/lib/python3.13/site-packages (from cryptography>=3.4.0->pyjwt[crypto]>=2.10.1->mcp->-r requirements.txt (line 1)) (2.0.0)
Requirement already satisfied: pycparser in /home/xbill/.local/lib/python3.13/site-packages (from cffi>=2.0.0->cryptography>=3.4.0->pyjwt[crypto]>=2.10.1->mcp->-r requirements.txt (line 1)) (3.0)
Requirement already satisfied: click>=7.0 in /home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages (from uvicorn>=0.31.1->mcp->-r requirements.txt (line 1)) (8.1.8)
xbill@penguin:~/gemini-cli-codeassist/mcp-stdio-python-agy$

Antigravity CLI settings.json

Settings.json is where MCP servers are setup in Gemini CLI. Antigravity CLI moves the MCP configuration to a seperate file: mcp_config.json.

xbill@penguin:~/gemini-cli-codeassist/mcp-stdio-python-agy$ cd .agents
xbill@penguin:~/gemini-cli-codeassist/mcp-stdio-python-agy/.agents$ d
total 4
-rw-r--r-- 1 xbill xbill 184 May 21 10:03 mcp_config.json
drwxr-xr-x 1 xbill xbill 30 May 21 10:03 .
drwxr-xr-x 1 xbill xbill 242 May 21 10:03 ..

This file contains the configuration for connecting to external tools and services:

{
  "mcpServers": {
    "hello-stdio-python": {
      "command": "python",
      "args": [
        "/home/xbill/gemini-cli-codeassist/mcp-stdio-python-agy/main.py"
      ]
    }
  }
}

Validation with Antigravity CLI

Finally- agy CLI is restarted and the MCP connection over stdio to the Python Code is validated, The full session will start:

> /mcp list
────────────────────────────────────────────────────────────────────────────────────────────────────
MCP Servers

Plugins (~/.gemini/antigravity-cli/plugins)
> ✓ hello-stdio-python Tools: greet

And you can then connect to the MCP Server over stdio:

> greet Two Snakes!

● ListDir(/home/xbill/.gemini/antigravity-cli/mcp/hello-stdio-python)
● Read(/home/xbill/.gemini/antigrav...ello-stdio-python/greet.json)
● hello-stdio-python/greet(Calling greet tool via MCP) (ctrl+o to expand)

  Hello! I've invoked the greet tool from the hello-stdio-python server with the parameter   
  "Two Snakes!" , and it responded with:

  │ Two Snakes!

Extending the Python MCP Server with Antigravity CLI

Now that the basic MCP functionality has been tested and verified with Antigravity CLI- the Python server code can be extended with new MCP tools:

> with @main.py add 2 new MCP tools - one that returns the current system time and the other
  returns the current system info like OS,
    memory, etc

● ListDir(/home/xbill/gemini-cli-codeassist/mcp-stdio-python-agy)
● Read(/home/xbill/gemini-cli-codea...mcp-stdio-python-agy/main.py)
● Read(/home/xbill/gemini-cli-codea...-python-agy/requirements.txt) (ctrl+o to expand)
▸ Thought for 2s, 977 tokens
  Clarifying Instructions

● Read(/home/xbill/gemini-cli-codea...cp-stdio-python-agy/Makefile)
● Read(/home/xbill/gemini-cli-codea...dio-python-agy/test_greet.py)
● Read(/home/xbill/gemini-cli-codea...o-python-agy/test_logging.py) (ctrl+o to expand)

▸ Thought for 2s, 1.4k tokens
  Considering Time Return Formats
● ListPermissions(You have read and write access to the following workspace(s)...) (ctrl+o to
expand)

▸ Thought for 1s, 1.5k tokens
  Proposing the Execution

● Edit(/home/xbill/gemini-cli-codea...mcp-stdio-python-agy/main.py) (ctrl+o to expand)

▸ Thought for 2s, 589 tokens
  Initiating Unit Tests

● Edit(/home/xbill/gemini-cli-codea...dio-python-agy/test_greet.py)
● Bash(make test) (ctrl+o to expand)

Restart Antigravity CLI and check for the new MCP tools:

MCP Servers

Plugins (~/.gemini/antigravity-cli/plugins)
> ✓ hello-stdio-python Tools: greet, get_system_time, get_system_info

Get the current time:

> get_system_time

● hello-stdio-python/get_system_time(Executing get_system_time via MCP) (ctrl+o to expand)

  I have successfully invoked the get_system_time tool from the local MCP server, and it
  returned the following current system times:

    Local Time: 2026-05-21T10:15:38.715899 (2026-05-21 10:15:38)
    UTC Time: 2026-05-21T15:15:38.715912+00:00

And system Info:

> get_system_info

● hello-stdio-python/get_system_info(Executing get_system_info via MCP) (ctrl+o to expand)

  I have successfully invoked the get_system_info tool from the local MCP server. Here is the  
  current system information retrieved:

    Operating System: Linux
    Release/Kernel: 6.6.99-09128-g14e87a8a9b71
    Version: #1 SMP PREEMPT_DYNAMIC Fri, 6 Feb 2026 20:45:19 -0800
    Architecture: x86_64
    CPU Cores: 12
    Memory Info: Total: 6530980 kB, Free: 332012 kB, Available: 1558252 kB                   
    Python Version: 3.13.13

The information from the MCP tool execution comes back into the current context. You can use the information returned from the tools along with the underlying Gemini Models:

> can i play Cyberpunk 2077 on this machine

● WebSearch(Cyberpunk 2077 minimum system requirements) (ctrl+o to expand)

▸ Thought for 2s, 494 tokens
  Assessing Game Requirements
  Based on the system information retrieved, no, you cannot play Cyberpunk 2077 on this machine  
  in its current state.                                                                          

  Here is a detailed breakdown of why, comparing your system specs to the game's official minimum
  requirements (after Update 2.0):                                                               

  ### 1. RAM (Critical Bottleneck) ❌                                                            

  • Minimum Requirement: 12 GB RAM                                                               
  • Your Machine: ~6.2 GB RAM ( Total: 6530980 kB )                                              
  • Verdict: This is a severe bottleneck. With only 6.2 GB of RAM, the game will either fail to  
  launch, crash constantly, or suffer from unplayable stuttering as your system runs out of      
  memory.                                                                                        

  ### 2. Operating System ⚠️                                                                      

  • Minimum Requirement: 64-bit Windows 10/11                                                    
  • Your Machine: Linux (Kernel 6.6)                                                             
  • Verdict: While Cyberpunk 2077 does not have a native Linux version, this is actually not a   
  dealbreaker. The game has a "Platinum" rating on ProtonDB, meaning it runs beautifully on Linux
  via Steam Proton (compatibility layer). However, the hardware limits below still apply.        

  ### 3. CPU Cores

  • Minimum Requirement: Intel Core i7-6700 (4 cores) or AMD Ryzen 5 1600 (6 cores)
  • Your Machine: 12 Cores
  • Verdict: Excellent. Your CPU core count is more than capable of handling the game.           

  ### 4. Graphics Card & Storage (Unknown) ❓

  • Minimum Requirement: Dedicated GPU with at least 6 GB of VRAM (e.g., GeForce GTX 1060 6GB /  
  Radeon RX 580) and 70 GB SSD (an SSD is strictly required).
  • Our system info tools do not expose your GPU model or storage type, but you would need to    
  meet these specifications as well.
  ──────
  ### Conclusion

  Even though your CPU is powerful and Linux compatibility is excellent via Proton, the 6.2 GB   
  RAM limit will prevent you from playing the game. You would need to upgrade the system's RAM to
  at least 12 GB (ideally 16 GB) and ensure you have a gaming-grade dedicated graphics card.

But you can still play Doom!

> can I play Classic Doom on this machine?

● WebSearch(classic doom 1993 system requirements) (ctrl+o to expand)

  Absolutely, 100% yes! In fact, your machine could easily run thousands of instances of Classic 
  Doom (1993) at the exact same time.                                                            

  Here is how your current system compares to the original 1993 requirements:

   Component │ Classic Doom (│ Your Machine │ How Much Better Is It?
  ───────────────┼───────────────┼────────────────────┼──────────────────────────────────────────
   Operating Syst│ MS-DOS 5.0 │ Linux (Kernel 6.6) │ Native Linux source ports (like Chocolate
   Processor (CPU│ Intel 386DX (3│ 12 CPU Cores (Multi│ Thousands of times faster with massive mu
   Memory (RAM) │ 4 MB (Megabyte│ 6,530 MB (6.2 GB) │ Over 1,600 times more RAM than required!
   Storage │ 12 MB │ Gigabytes available│ Plenty of space to store every mod ever c

  ### How to play it on your Linux machine right now:

  Because Doom's engine code was open-sourced in 1997, Linux has excellent native support through
  modern "source ports" which run the game flawlessly with widescreen and high-resolution support.

  You can install a lightweight classic source port like Chocolate Doom directly via your package
  manager:

    # On Debian/Ubuntu-based systems
    sudo apt install chocolate-doom

  Then, you just need a game file (like the free shareware DOOM1.WAD or a retail DOOM.WAD     
  from Steam/GOG) to start slaying demons!

One Last Code Review

Lets see how good of a Python coder I am -

> code review this Python MCP project

● Read(/home/xbill/gemini-cli-codea...p-stdio-python-agy/README.md)
● Bash(git status)
● Bash(find . -maxdepth 2) (ctrl+o to expand)

▸ Thought for 4s, 732 tokens
  Investigating File Absence
  Here is a comprehensive code review of the Python MCP (Model Context Protocol) project.        

  Overall, the project is exceptionally well-structured, lightweight, and employs industry best  
  practices for MCP stdio-based servers.                                                         
  ──────                                                                                         
   📊 Project-Wide Code Review                                                                   

  ## 1. Architectural Highlights & Best Practices                                                

  ### A. Protocol Stream Safety (Logging to Stderr)                                              

  • The Design: MCP servers communicating via the stdio transport use stdout exclusively for 
  JSON-RPC protocol messages. Any standard print statements or unstructured logs written to      
  stdout will corrupt the protocol stream and cause client-side connection failures.            
  • The Implementation: In main.py, the root logger is configured to send all logs to sys.
  stderr using the structured JsonFormatter from pythonjsonlogger .                          
  • Verdict: Excellent. This guarantees that all logs are safe, structured (JSON format), and    
  completely isolated from the MCP communication channel.                                        

  ### B. Developer Experience (DX)                                                               

  • The Makefile provides full life-cycle automation for installation, manual execution,   
  testing, linting ( flake8 ), formatting ( black ), and type checking ( mypy ).                 
  • The codebase maintains 100% strict compliance with PEP8/flake8 (79-character line limits) and
  passes static type checks under mypy without any warning or exception suppression.           
  ──────                                                                                         
  ## 2. File-by-File Detailed Review                                                             

  ### 📁 main.py                                                                          

  • Initialization: FastMCP simplifies server setup beautifully:                               
    mcp = FastMCP("hello-world-server")                                                          

  • Exposing Tools: The @mcp.tool() decorator is perfectly utilized. Docstrings are parsed     
  automatically by FastMCP and transmitted to MCP clients as the official description of the   
  tool:                                                                                          
    @mcp.tool()                                                                                  
    def get_system_time() -> str:                                                                
        """                                                                                      
        Get the current system time in ISO-8601 format and human-readable format.                
        """                                                                                      

  • Modern Python Features: The get_system_time tool avoids the deprecated utcnow() method,  
  utilizing standard timezone-aware UTC retrieval:                                               
    now_utc = datetime.now(timezone.utc)                                                         

  • Platform Portability: get_system_info gracefully detects whether it's running on Linux     
  (reading /proc/meminfo directly) and falls back to POSIX sysconf APIs if running on          
  macOS/BSD/other UNIX variants, ensuring maximum reliability.                                   
  ──────                                                                                         
  ### 📁 test_logging.py and test_greet.py                                                       

  • Logging Verification: test_logging.py employs a custom caplog_stream Pytest fixture that 
  intercepts root logging handlers to assert that debug, info, and error outputs generate valid  
  JSON strings with correct levels. This is highly robust!                                       
  • Input Robustness: test_greet.py thoroughly verifies greet behavior on empty strings and  
  special characters ( !@#$%^&*() ).                                                             
  • New Tools Coverage: The new system tools are fully covered, verifying output structures and  
  ensuring the expected system/time labels are present.                                          
  ──────                                                                                         
  ## 3. Areas for Improvement & Recommendations                                                  

  While the project is in a production-ready state, addressing the following points will make the
  project even more robust:                                                                      

  ### Recommendation 1: Update README Virtual Environment Instructions                           

  • File: README.md                                                                         
  • Issue: The README still suggests setting up a local .venv under # Installation . However, 
  as documented in your GEMINI.md, direct virtual environment setups can conflict with the  
  way MCP client launchers handle stdio Python subprocesses.                                     
  • Fix: Add a note in README.md warning developers that if the server is executed via an MCP  
  client (such as Claude Desktop), they must ensure their client configuration points explicitly 
  to the python executable inside that virtual environment (e.g., /path/to/project/.            
  venv/bin/python ), rather than just a global python command.                                 

  ### Recommendation 2: Avoid Swallowing Key Failures in get_system_info                        

  • File: main.py                                                                         
  • Issue: In get_system_info , the else block (POSIX fallback) catches all exceptions using a
  bare except Exception: pass . If os.sysconf fails or is missing, mem_details remains      
  "Memory information not available." without any traceback or warning captured in the logs.    
  • Fix: Log the exception details at DEBUG or WARNING level so that issues on unsupported   
  systems can be diagnosed:                                                                      
    except Exception as e:
        logger.debug(f"Failed to query POSIX sysconf memory: {str(e)}")

  ### Recommendation 3: Explicitly Set Pytest Rootdir

  • File: Makefile
  • Issue: Running pytest without a dedicated pytest.ini config file means it defaults to    
  scanning the current working directory recursively, which might include .venv files or parent
  folders if executed from outside.
  • Fix: Create a minimal pytest.ini file in the root of the project to lock down test         
  detection parameters:
    [pytest]
    testpaths = .
    python_files = test_*.py

  ──────
  ### Final Verdict

  Score: 9.8 / 10
  The project is extremely clean, PEP8 compliant, well-tested, and demonstrates an outstanding   
  understanding of the Model Context Protocol's stdio transport constraints.

Summary

The strategy for using Python for MCP development with Antigravity CLI was validated with an incremental step by step approach.

A minimal stdio transport MCP Server was started from Python source code and validated with Antigravity CLI running as a MCP client in the same local environment.

Antigravity CLI was then used to extend the sample Python code with several MCP tools and use these tools inside the context for the underlying LLM.

This approach can be extended to more complex deployments using other MCP transports and Cloud based options.

DEV Community: xbill

Meet 'Devto-Blogger': The Hermes Agent Skill That Automatically Writes Your Technical Blog Posts

🚀 What is Hermes Agent?

🔧 The Entry: The devto-blogger Skill

💡 How It Works Under the Hood

🏆 Why This Matters for the Open Source Community

🛠️ Try It Yourself!

Gemma 4 Deployment with NVIDIA L4, MCP, Cloud Run, and Antigravity CLI

What is this project trying to Do?

Antigravity CLI

Testing the Antigravity CLI Environment

Full Installation Instructions

Python MCP Documentation

Where do I start?

Setup the Basic Environment

Model Management Tool with MCP Stdio Transport

Running the Python Code

MCP stdio Transport

Antigravity CLI mcp_config.json

Validation with Antigravity CLI

Getting Started with Gemma 4 on GPU

Model Lifecycle Management via MCP

Deploy The Model

Checking System status

Cross Check The Deployed Model

Check the Cloud Run Service

Review the Model

Benchmark the Local Model

Summary

Local Gemma 4 Deployment with MCP and Antigravity CLI

What is this project trying to Do?

Antigravity CLI

Testing the Antigravity CLI Environment

Full Installation Instructions

Python MCP Documentation

Where do I start?

Setup the Basic Environment

Model Management Tool with MCP Stdio Transport

Running the Python Code

Docker Interaction with MCP stdio Transport

Antigravity CLI mcp_config.json

Validation with Antigravity CLI

Getting Started with Gemma 4 Locally

Model Lifecycle Management via MCP

Checking System status

Deploy The Model Locally

Cross Check The Deployed Model

Review the Model

Benchmark the Local Model

Summary

Cross Language A2A Agent Benchmarking with Antigravity CLI

Building a Benchmarking Agent with A2A and MCP

Deja Vu — What is Old is New!

Why would I need Multi-Language Support? Can’t I just use Python?

What is this Approach actually Benchmarking?

What is the Agent Development Kit?

Google’s Agent Stack in Action: ADK, MCP, A2A on Google Cloud

What is the A2A protocol?

Language Support For the A2A Protocol

Tool Chain Setup for A2A

Antigravity CLI

Testing the Antigravity CLI Environment

Checking the Developer Environment

Debugging API Permission Errors

A2A Debugging Tools -A2A Inspector

Staring the A2A Inspector

Extending ADK Agents with A2A

I am confused- How Does this all Work?

Mersenne Primes

A2A Cross Language Agents

A2A Mersenne Prime Number Agent (GO)

A2A Mersenne Prime Number Agent (Python)

A2A Mersenne Prime Number Agent (Node)

A2A Mersenne Prime Number Agent (Rust)

MCP Master/Orchestrator Agent (Python)

Antigravity CLI with MCP Connection

ADK Validation of the Master Agent

So What is all this Doing?

Time to Starting Running some Numbers!

Start the Benchmarking Session

🔧 The Entry: The `devto-blogger` Skill