This is a submission for the Google Cloud NEXT Writing Challenge
Read enough keynote recaps and the shape of them becomes familiar: model names, benchmark numbers, and a CEO quote about whatever "era" we are in. You close the tab, write a Jira ticket, and wonder if any of it was actually about your job.
Today's Google Cloud NEXT '26 opening keynote had all of that. Thomas Kurian in Las Vegas. Sundar Pichai on video. Apple's logo unexpectedly behind the Google CEO's head. The big reveal of the Gemini Enterprise Agent Platform .
But amidst the applause for Gemini 3.1 Pro and its new "Deep Think" mode, I kept coming back to the announcements that weren't flashy.
The ones that will actually affect what we ship this year are the boring ones. The plumbing.
What I Mean By Plumbing
The boring infrastructure almost always decides whether a technology ships at scale. HTTP wasn't exciting. TCP/IP wasn't a keynote moment. Nobody clapped for DNS. But that's the layer where things either work reliably or don't work at all.
AI agents are at exactly that point right now. Everyone agrees on what they want agents to do. The part that has quietly killed hundreds of enterprise AI projects is different: getting agents to talk to each other across systems, hold context between sessions, and do it without becoming a security nightmare your team has to clean up later.
That's what Google actually shipped today. Dressed up in model demos and stage lighting, but the substance is infrastructure.
Thomas Kurian titled the keynote "The Agentic Cloud" and drew a deliberate contrast with competitors: other vendors, he said, are "handing you the pieces, not the platform," leaving teams to integrate components themselves .
Hardware for the AI Age: The TPU v8 Family
Agentic workflows are computationally expensive. When an agent enters a reasoning loop, it performs thousands of operations in the background. Standard hardware can't keep up with that demand without massive latency.
Enter the TPU v8 family. For the first time, Google has bifurcated its chips into two distinct versions :
TPU 8t (Training — codenamed "Sunfish")
- 216 GB HBM memory | 128 MB on-chip SRAM | 12.6 FP4 petaflops
- Scales to 9,600-chip superpods via Virgo fabric
- 3x the processing of seventh-generation Ironwood TPU
- 2x the performance per watt
- New SparseCore accelerator handles "irregular memory access patterns" like embedding lookups
- Built with Broadcom on TSMC's 2nm process
TPU 8i (Inference — codenamed "Zebrafish")
This is the more important chip for most developers because it reflects where cloud margins and customer retention will actually be decided: not in rare pretraining runs, but in the relentless economics of serving reasoning models and agents under latency SLAs .
- 288 GB HBM memory | 384 MB on-chip SRAM (3x more than previous generations!)
- 10.1 FP4 petaflops of computing capacity
- 80% better performance per dollar than Ironwood
- New Collectives Acceleration Engine (CAE) reduces latency by 5x for chain-of-thought reasoning
- Boardfly ICI topology cuts network hops from 16 to 7 (50% latency improvement for communication-intensive workloads)
For my project, CrowdCommand — a crowd-safety platform for stadiums — the TPU 8i is vital. If an agent takes seconds to process a camera feed during a potential crush, it's useless. The 8i's on-chip KV cache and reduced latency make Gemini 3.1 Flash responses feel instantaneous.
Virgo Network: The Fabric That Ties It Together
Google also announced Virgo, a new high-bandwidth, low-latency interconnect fabric for its AI Hypercomputer :
- Connects up to 134,000 TPU 8t processors
- 47 petabits per second of bi-directional bandwidth
- 1.6 million ExaFlops of capacity with "near-linear" scaling
- 40% lower unloaded fabric latency for TPUs
- Uses a flat, two-layer non-blocking topology with high-radix switches
This isn't just a network upgrade — it's a complete reimagining of how AI chips communicate. For multi-agent systems where agents need to stay in constant sync, Virgo prevents the lag that breaks distributed reasoning.
The A2A Protocol v1.0: Breaking the Vendor Lock-in
The Agent-to-Agent (A2A) protocol reaching v1.2 and moving to the Linux Foundation's Agentic AI Foundation is a landmark moment . It solves the "Multi-Agent Discovery" problem that has made distributed agent architectures genuinely painful.
How does an agent on Platform A discover, trust, and delegate to an agent on Platform B?
The answer is Signed Agent Cards.
What Agent Cards Actually Are
Every A2A-compliant agent publishes an Agent Card: a JSON manifest served at /.well-known/agent.json that declares what the agent can do, what inputs it accepts, what auth schemes it supports, and how to reach it :
{
"name": "Procurement Agent",
"version": "1.2.0",
"capabilities": ["create_purchase_order", "check_vendor_status", "approve_spend"],
"input_schema": {
"type": "object",
"properties": {
"vendor_id": { "type": "string" },
"amount_usd": { "type": "number" }
}
},
"auth": { "schemes": ["oauth2", "api_key"] },
"endpoint": "https://procurement-agent.acme.com/a2a"
}
This is discovery. This is what lets a general-purpose orchestrator find and call a specialist agent without prior integration work. It's DNS + OpenAPI, applied to agents.
The Production Signal That Matters
The number that should get your attention: 150 organizations in production, not pilot .
Google announced A2A at Google I/O 2025 with 50 partners on paper. Twelve months later:
- 150 organizations running real workloads between agents built on different vendors' stacks
- Microsoft, AWS, Salesforce, SAP, and ServiceNow are running A2A in production environments
- The Linux Foundation now governs it — removing the "what if Google gets bored?" question
Native A2A support now ships in ADK v1.0, LangGraph, CrewAI, LlamaIndex Agents, Semantic Kernel, and AutoGen . That's not a Google-curated list of close partners. That's where developers are actually building agent systems.
A2A vs. MCP: The Two Layers
A2A is designed to complement rather than compete with Anthropic's Model Context Protocol (MCP) :
| Protocol | Layer | What it does |
|---|---|---|
| MCP | Tool/Data Access | How an agent connects to tools and data sources |
| A2A | Agent Orchestration | How agents communicate with each other across platforms |
You need both. MCP connects agents to your databases and APIs. A2A lets agents talk to each other. Google now supports both natively.
Code Example: Building an A2A Agent
Here's a minimal A2A agent server using the new ADK v1.0 (Python, stable release) :
from google.adk.agents import LlmAgent
from google.adk.a2a import A2AServer, AgentCard, Capability
# Define what this agent can do
card = AgentCard(
name="inventory-checker",
version="1.0.0",
capabilities=[
Capability(
name="check_stock",
description="Returns current inventory level for a given SKU",
input_schema={"sku": "string"},
output_schema={"quantity": "integer", "warehouse": "string"}
)
]
)
# Create the agent
agent = LlmAgent(
model="gemini-3-flash",
system_prompt="You check inventory. Be precise and fast.",
tools=[check_inventory_db]
)
# Start the server
server = A2AServer(agent=agent, card=card, port=8080)
server.start()
# → GET /.well-known/agent.json (Agent Card discovery)
# → POST /a2a/tasks/send (Task endpoint)
And calling it from an orchestrator:
from google.adk.a2a import A2AClient
client = A2AClient()
# Auto-fetches the Agent Card
inventory_agent = await client.discover("https://inventory.acme.com")
# Send a task
task = await inventory_agent.send_task({
"capability": "check_stock",
"input": {"sku": "WIDGET-42"}
})
# Stream progress in real time
async for update in task.stream():
print(f"Status: {update.status} | {update.message}")
result = await task.result()
print(f"Stock: {result['quantity']} units at {result['warehouse']}")
This runs cross-platform. The inventory agent could be on Agent Engine. The orchestrator could be LangGraph, CrewAI, or AutoGen. A2A bridges them without custom serialization or SDK lock-in.
ADK v1.0: What "Stable" Actually Buys You
The Agent Development Kit (ADK) hit stable v1.0 releases today across Python, Go, and Java, with TypeScript also available .
The 0.x releases were experimentally useful — people shipped real things with them. But "production-ready" means something specific when your agents take autonomous actions: stable APIs you can actually depend on, predictable versioning, and a security model you can explain to your CISO.
Model Armor: Security at the Protocol Level
The standout security feature is Model Armor . It defends against:
| Attack Type | Description |
|---|---|
| Prompt Injection | Malicious commands hidden in user input |
| Jailbreak Attempts | Instructions to bypass safety restrictions |
| Session Poisoning | Injecting harmful content into conversation history |
| Tool Output Poisoning | External tools return malicious instructions |
| Sensitive Data Leakage | Unintended exposure of PII or secrets |
Model Armor works by applying pre-trained classifiers to every agent interaction :
User Input → Model Armor → Clean Input → Agent → Model Armor → Safe Output
↓ ↓
Block/Flag Block/Flag
Performance characteristics vs. LLM-as-a-Judge :
| Feature | Model Armor | LLM-as-a-Judge |
|---|---|---|
| Latency | 100-300ms | 500-1000ms |
| Cost | Lower (optimized classifiers) | Higher (LLM inference) |
| Setup | Requires Cloud config | Easy (SDK only) |
| Context Awareness | Good | Excellent |
Best practice: Use both — Model Armor for fast baseline filtering, LLM-as-a-Judge for context-aware validation on critical operations.
MCP Servers: The Announcement Nobody's Talking About
While everyone focused on Gemini and A2A, Google quietly launched managed MCP servers running natively inside Google Cloud .
Before this: every time you wanted an AI agent to talk to an external service — a database, security dashboard, calendar — you had to build the bridge yourself. Custom API calls. Auth tokens stored somewhere sketchy. Error handling that breaks at 2 AM.
MCP is like USB-C for AI agents. It's the standardized port that lets AI agents plug into data sources without custom wiring every time.
Google's managed MCP servers now cover :
| Service | What it enables |
|---|---|
| Google Security Operations | Agents query threat data without custom auth |
| Google Workspace | Agents read docs, calendar, email securely |
| BigQuery | Agents run analytics queries as natural conversation |
| Cloud Storage | Read/write/analyze data using MCP |
| Maps, Compute Engine, Kubernetes Engine | Fully managed remote MCP servers |
The old way — days of work:
# 1. Build OAuth flow for Google Workspace
# 2. Set up token refresh logic
# 3. Write endpoint wrappers for Docs, Calendar, Gmail
# 4. Handle errors, retries, rate limits
# 5. Deploy and monitor forever
# -- 3 days minimum. Ongoing maintenance. --
The new way — one line:
agent.connect(mcp_server="google-workspace")
agent.ask("Summarize all unread emails from the last 48 hours and add any deadlines to my calendar")
# Done. In production. Today.
That is the removal of an entire category of work.
Governance: Active Directory for the AI Era
The Gemini Enterprise Agent Platform is organized around four pillars: Build, Scale, Govern, Optimize . The most important for enterprises is Govern.
Agent Identity
Every agent gets a unique cryptographic ID with an auditable trail mapped to authorization policies. If an agent takes an action, you know which agent, under which policy, at what time .
Agent Registry
A central catalog of every agent and approved tool across your organization — the equivalent of a container registry, but for agents. Whether the agent was built internally on ADK or sourced from the partner marketplace (Atlassian, Box, Salesforce, ServiceNow, Workday all launched agents at Next), it has one identity and one index.
Agent Gateway
Described by Kurian as "air traffic control for your agent ecosystem" :
- Routes all agent traffic
- Speaks both MCP and A2A natively
- Applies Model Armor inline — prompt injection scanning happens at the network layer
- Surfaces Agent Anomaly Detection — monitoring for tool misuse, unauthorized data access, and reasoning drift in production
Memory Bank
Persistent state for up to seven days, allowing agents to maintain high-accuracy context across sessions mapped to internal CRM and database records via Custom Session IDs . Stateful agents are no longer an edge case — they're the runtime's default assumption.
The Agentic Data Cloud: Grounding Agents in Reality
Agents are only as good as the data they can access. Google announced new capabilities for the Agentic Data Cloud :
Knowledge Catalog (formerly Dataplex)
A unified map of your data landscape across AlloyDB, BigQuery, Bigtable, Cloud SQL, and Spanner. Provides a single, governed source of truth needed to build and scale reliable agents.
Reverse ETL for BigQuery (Preview)
One-click solution to push analytical insights from BigQuery back into AlloyDB, Bigtable, or Spanner, enabling agents to serve them with sub-millisecond latency .
Spanner Columnar Engine (GA)
Analytical queries run up to 200x faster with zero impact on production transactional workloads.
For CrowdCommand, this means my safety agents can query live stadium sensor data while also accessing historical crowd flow patterns — all at conversational speed.
Where I'm Skeptical
The word "open" appears a lot. A2A is an open protocol. ADK is open source. The Model Garden includes 200+ models from multiple vendors, including Anthropic Claude .
All true.
And also: the smoothest path through every one of these tools runs directly through Google Cloud — Agent Engine for managed hosting, Apigee as the API-to-agent gateway, Vertex AI as the deployment target.
The protocol is portable. The operational infrastructure is not.
This isn't necessarily a problem — Google's runtime is genuinely good. But developers should be clear with themselves about what "open" covers here. The code you write on ADK travels with you. The observability tooling, managed hosting, and audit trail — those are Google Cloud products. That's a real dependency. Know what you're choosing.
Also worth watching:
- MCP governance — Who controls access? Where are the logs? For regulated industries (healthcare, finance, legal), these aren't minor concerns .
- Pricing — "Managed" usually means "metered." Unknown yet if this becomes expensive at scale.
The Developer Keynote: Code, Not Slides (April 23)
Today's Developer Keynote (10:30 AM PT) took a different approach. No polished slides. No rehearsed demos. Live coding. Real terminals. Real bugs.
Who Spoke
Stephanie Wong hosted, joined by:
- Michele Catasta (President & Head of AI at Replit) — live-building agentic workflows
- Harrison Chase (LangChain) — discussing multi-agent orchestration
- Ankur Kotwal & Salman Ladha (Wiz) — security deep dive on agent isolation
- Kevin Moore & Ines Envid — ADK v1.0 live demo
- Sarah Kennedy & Ricky Robinett — "hot off the press" breakdown
The most valuable moment: watching them hit a production bug live, debug it, fix it, and redeploy. That's transparency documentation can't give you.
What They Covered
| Topic | Key takeaway |
|---|---|
| ADK v1.0 live | Building agents with Python, streaming responses |
| MCP integration | Connecting agents to BigQuery in 3 lines |
| Agent Gateway preview | Real-time traffic management |
| Security | Model Armor + Wiz integration |
| LangGraph + A2A | Cross-framework agent communication |
New Codelabs Released
55+ new codelabs. Start here: Codelab 9 — Developer Keynote: Building Agents with Skills
- Build Rich Agent Experiences (ADK + A2UI)
- Building a Multi-Agent System
- Building Secure Agents (Model Armor + IAM)
- Deploy and Scale Agents on Agent Engine
Google Workspace Studio: No-Code Agents
Also announced in Day 2: Google Workspace Studio lets business users build agents without code [citation:3].
Type: "Every Friday, ping me to update my tracker" → Gemini creates the automation.
Connects to Asana, Jira, Mailchimp, Salesforce via webhooks or Apps Script. Rolling out to Workspace business and enterprise customers.
Project Mariner: Web-Browsing Agents
Project Mariner scores 83.5% on WebVoyager — better than most human benchmarks [citation:3].
Handles 10 concurrent tasks on cloud VMs: shopping, research, form-filling. Available now to Google AI Ultra subscribers in the US.
Roadmap:
- Q2 2026: Mariner Studio (visual builder)
- Q3 2026: Cross-device sync
- Q4 2026: Agent marketplace
The "Open" Question (Addressed Day 2)
During the keynote panel, Harrison Chase asked directly: "How open is A2A really?"
Google's response: The protocol is governed by the Linux Foundation. Microsoft, AWS, and Salesforce are all running it in production. The spec is public. The code is portable.
But: The smoothest path — Agent Engine, Apigee, Vertex AI — runs through Google Cloud. That's not lock-in. That's differentiation. Know the difference.
What to Actually Do With This
If you're building agents right now:
- Read the A2A spec before the SDK docs. Understanding Agent Cards — what goes into them, how signing works, what a well-defined skill description looks like — shapes how you design agents from the start .
- Try MCP first — Pick one workflow that involves pulling data from somewhere and summarizing it. That's your first MCP experiment .
If you're choosing a multi-agent framework:
- A2A v1.0 in production at 150 organizations, across every major framework, is a meaningful signal about where multi-agent interoperability is actually converging.
If you're speccing an enterprise AI project:
- Look at Memory Bank and Agent Identity before you finalize the architecture. Persistent agent state and proper credential management are the two things that most demo architectures quietly skip.
The Part That's Easy to Miss
The keynote demo that got the biggest reaction showed a Gemini agent pulling data from thousands of PDFs, catching a buried allergen hidden in one of them, then calling research agents to build a full market projection — autonomously, while the presenter talked.
That's a real capability and it's impressive. But it works because of things that weren't in the demo :
- Agents that can find each other by capability (Agent Registry)
- Agents that verify each other's identity (Agent Identity)
- Agents that maintain context between calls (Memory Bank)
- Agents operating inside an auditable security boundary (Agent Gateway)
That's the plumbing. And it's what makes the magic possible.
Conclusion: From Prompter to Orchestrator
Thomas Kurian's framing was bold: "You have moved beyond the pilot. The experimental phase is behind us" .
For developers, this means our job is shifting. We aren't just "prompting" anymore — we are orchestrating systems of intelligence.
The "Agentic Cloud" is finally providing:
- A protocol standard (A2A) under neutral governance
- Power-efficient hardware (TPU v8i) for inference at scale
- Managed MCP servers that remove an entire category of integration work
- A security story (Model Armor, Agent Identity) you can defend to your CISO
Infrastructure doesn't announce itself. It just works — until the day you need it and it's not there. Thankfully, the plumbing for the future of AI is finally being laid.
What are you building with the new Agentic Stack? Drop a comment below — I'd love to hear your take on A2A vs. MCP and what integrations you're most excited about.
Resources to Dive Deeper
- A2A Protocol Spec
- ADK Documentation
- Linux Foundation Agentic AI Foundation
- Model Armor Security Guide
- Managed MCP Servers
Posted as part of the Google Cloud NEXT '26 Writing Challenge on DEV. The developer keynote is available on the DEV homepage — worth catching for how the ADK and A2A story gets told to a technical audience.
Top comments (0)