I'll be honest with you.
I almost didn't write this post about what I actually wanted to write about.
Everyone's keynote recap is going to lead with Gemini Enterprise Agent Platform. The rebrand is flashy. Thomas Kurian's line about "turning intelligence into a growth engine" will be quoted in a hundred LinkedIn posts by Thursday. And sure — the platform story is real and it matters.
But I've been building multi-agent systems for the past year. I know what the actual friction points are. And when I dug past the keynote slides and into the technical sessions and the updated GitHub repos, I found the announcement that genuinely made me put down my coffee and sit up straight.
It's the Agent-to-Agent (A2A) protocol hitting v1.0 in production, now natively integrated across LangGraph, CrewAI, LlamaIndex, AutoGen, and Semantic Kernel. Combine that with the Agent Development Kit (ADK) shipping stable releases in Python, Go, Java, and TypeScript, with a graph-based orchestration engine, native OpenTelemetry tracing, and human-in-the-loop controls baked directly into the tool definition layer.
That's the story. Let me tell you why it changes everything.
Table of Contents
- The Problem Nobody Wants to Admit
- What A2A Actually Is
- The ADK Breakdown
- The Angle Everyone's Missing
- My Honest Critique
- What I'm Building This Weekend
- Final Take
The Problem Nobody Wants to Admit
If you've shipped a multi-agent system into production — not a demo, an actual production system — you've felt this pain:
You write a LangChain orchestrator. It needs to delegate a subtask to a specialised agent built by another team in CrewAI. You spend two days writing glue code. Then one team updates their agent's API. Your glue code breaks. You spend another day fixing it. Multiply that by every agent-to-agent connection in your system.
There's no standard handshake. There's no shared task interface. Every boundary between agents is a custom integration you own forever.
Thomas Kurian, CEO of Google Cloud: "The first wave of AI changed how we find information; the next wave is changing how we get work done."
That quote is true. But it only becomes true in practice when agents can reliably hand work to each other without you writing the bridge. That's the gap A2A fills — and it's a bigger gap than most keynote coverage is acknowledging.
What A2A Actually Is (Without the Marketing)
Think of A2A as the HTTP of the agentic web.
HTTP didn't matter because it was technically elegant. It mattered because once everyone agreed to use it, any browser could talk to any server. You didn't need a custom protocol for every client-server pair.
A2A does the same thing for agents. Any A2A-compliant agent publishes an "Agent Card" — a structured JSON descriptor of its capabilities. Any other A2A-compliant agent can discover it, send it a task, receive streamed updates, and handle completion or failure — without knowing anything about the other agent's internals, framework, or hosting environment.
| Without A2A | With A2A |
|---|---|
| Every agent-to-agent connection is a custom integration you write and maintain permanently. | Agents publish capabilities via a structured "Agent Card" that any compliant agent can discover and call. |
| Breaks on every API update from either side. | The protocol handles streaming, partial responses, and error states automatically. |
| A Salesforce agent talking to a ServiceNow agent requires you to own that bridge indefinitely. | A Salesforce agent hands off to a Google agent which queries a ServiceNow agent — none need to know each other's internals. |
The adoption signal here is worth pausing on. A2A v1.0 is already in production at 150+ organisations and baked natively into LangGraph, CrewAI, LlamaIndex, AutoGen, and Semantic Kernel. That's not Google getting ahead of the curve — that's the ecosystem converging. When competitors adopt your protocol, it stops being your protocol and starts being infrastructure.
The ADK: Stable v1.0 Is Not the Same as "Available"
The Agent Development Kit existed before NEXT '26. But "stable v1.0" is a genuinely different thing from "available as a preview." Here are the three changes that matter most to me as someone who'll actually build with this:
1. Graph-Based Orchestration — Finally
Before this release, ADK coordination was essentially prompt chaining with extra steps. The new graph-based framework lets you define agent coordination as a directed graph where each node is a well-typed agent and each edge is an explicit conditional or sequential flow.
Why does this matter? Because prompt chains are untestable black boxes. A directed graph is something you can write unit tests for. It's something your team can review in a pull request. It's something you can trace when it fails.
```python
from google.adk.agents import LlmAgent, SequentialAgent, ParallelAgent
from google.adk.tools import google_search

# Agent 1: Specialised researcher
research_agent = LlmAgent(
    name="researcher",
    model="gemini-3-flash",
    tools=[google_search],
    instruction="Research the given topic and return structured key facts.",
)

# Agent 2: Specialised writer
writer_agent = LlmAgent(
    name="writer",
    model="gemini-3.1-pro",
    instruction="Write a concise, developer-focused brief from the research provided.",
)

# Explicit graph: research MUST complete before writing begins
pipeline = SequentialAgent(
    name="research_to_brief_pipeline",
    sub_agents=[research_agent, writer_agent],
)
```
This is clean. This is testable. This is the kind of code that survives a team handoff six months from now.
2. Human-in-the-Loop as a Structural Guarantee
This is the one that actually made me stop and re-read the docs.
Every agentic system has the same nightmare scenario: the agent decides the right next step is to delete something, send something, or charge something — and it's wrong. The traditional solution has been to write a very careful system prompt that says "always ask before doing irreversible things." That's not a safety mechanism. That's hoping.
ADK 1.0 ships require_confirmation as a first-class property on tool definitions:
```python
from google.adk.tools import FunctionTool

# This tool CANNOT execute without explicit human approval.
# The agent will pause, generate a confirmation event, and wait.
delete_tool = FunctionTool(
    name="delete_production_records",
    description="Permanently removes records from the production database.",
    require_confirmation=True,  # Structural safety — not a prompt instruction
    func=delete_records_func,
)
```
The agent pauses execution, surfaces the confirmation event to your application layer, and waits for an explicit human signal before proceeding. The safety is in the code, not in the prompt. That distinction is everything for production systems.
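What does the application layer actually do with that paused state? The event shape and function names below are my own guesses, not ADK's documented interface — a sketch of the pattern under those assumptions:

```python
from dataclasses import dataclass

# Hypothetical event shape — an assumption, not the documented ADK type.
@dataclass
class ConfirmationEvent:
    tool_name: str
    args: dict

def run_with_approval(events, approve) -> list[str]:
    """Drain agent events; run confirmed tools, silently skip rejected ones."""
    executed = []
    for event in events:
        if isinstance(event, ConfirmationEvent):
            # Surface to a human signal source: a UI button, Slack, a CLI prompt.
            if approve(event):
                executed.append(event.tool_name)
            # Rejected: the tool simply never runs. Safety lives in code,
            # not in the model's willingness to follow a prompt.
    return executed

events = [ConfirmationEvent("delete_production_records", {"table": "orders"})]
print(run_with_approval(events, approve=lambda e: False))  # → []
```

The design point holds regardless of the exact API: the gate sits between the agent's decision and the tool's execution, so a hallucinated "yes" from the model cannot cross it.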
3. Native OpenTelemetry — The Feature That Makes Debugging Sane
Multi-agent debugging has always been a nightmare. When something goes wrong three hops into an agent chain, you're reading unstructured log output trying to figure out which model call, which tool execution, or which agent handoff caused the failure.
ADK Go 1.0 ships with native OTel integration:
```go
// One initialisation call — every model call, tool execution,
// and A2A handoff now generates structured traces automatically
telemetryProviders, err := telemetry.New(ctx,
    telemetry.WithOtelToCloud(true),
)
if err != nil {
    log.Fatal(err)
}
defer telemetryProviders.Shutdown(ctx)
```
Every model call and tool execution generates structured spans. Your entire agent graph becomes observable in Cloud Trace, Grafana, Jaeger, or any OTel-compatible backend. This transforms debugging from archaeology into engineering.
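A toy illustration (my own model, not the OpenTelemetry SDK) of why structured spans beat flat logs: every hop records its parent, so a failure three hops deep can be walked back to the root mechanically instead of by grepping timestamps:

```python
# Toy span tree — not OTel itself. Shows why parent-linked spans make
# multi-hop agent failures mechanically attributable.
spans = [
    {"id": 1, "parent": None, "name": "orchestrator",       "ok": True},
    {"id": 2, "parent": 1,    "name": "a2a:researcher",     "ok": True},
    {"id": 3, "parent": 2,    "name": "tool:google_search", "ok": False},
]

def failure_path(spans: list[dict]) -> list[str]:
    """Walk from the failed span up to the root: the 'which hop broke' answer."""
    by_id = {s["id"]: s for s in spans}
    failed = next(s for s in spans if not s["ok"])
    path, cur = [], failed
    while cur is not None:
        path.append(cur["name"])
        cur = by_id.get(cur["parent"])
    return list(reversed(path))

print(failure_path(spans))
# → ['orchestrator', 'a2a:researcher', 'tool:google_search']
```

With flat logs you reconstruct this path by hand; with spans, any OTel backend draws it for you.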
The Angle Everyone's Missing: MCP + A2A = Complete Connectivity
Here's what I think the coverage is underweighting. Google didn't just announce A2A. They announced Managed MCP Servers with Apigee as an API-to-agent bridge at the same time.
MCP (Model Context Protocol) handles agent-to-tool communication — how your agent connects to databases, APIs, files, and external services. A2A handles agent-to-agent communication — how one autonomous system delegates work to another.
These two protocols together, built on ADK, form something genuinely new:
| Layer | Connects | Solves |
|---|---|---|
| MCP | Agent to Tool | Any agent, any data source — zero bespoke integration code |
| A2A | Agent to Agent | Any framework, any vendor — delegate tasks across system boundaries |
| ADK | You to All of it | Build, test, trace, and deploy with a code-first framework in 4 languages |
Before this week, you had pieces. Now you have a stack.
I keep coming back to this analogy: HTTP didn't just connect browsers to servers. TCP/IP didn't just connect two machines. The protocols that become foundational are the ones that quietly become infrastructure while everyone's watching the demos. That's what I think is happening here with MCP + A2A — and most of the keynote coverage is missing it entirely.
My Honest Critique — Because Hype Doesn't Help Anyone
I want to be clear: I'm genuinely excited about this stack. But I've been burned by Google's "open" announcements before, and there are real questions I'd want answered before going all-in.
⚠️ The lock-in risk nobody's asking about
A2A is published as an open protocol. ADK is open source on GitHub. Both of these things are true. But ADK's most valuable production features — Agent Engine for managed hosting, the tracing integration, the Agent Registry — are tightly coupled to Google Cloud infrastructure. The protocol is portable; the developer experience for running agents at production scale without Vertex AI is not clearly documented. "Open source" and "vendor-neutral in practice" are different things. Before you architect your next production system on ADK, ask where it would run if Vertex AI weren't an option. Right now, the honest answer is that managed production hosting without Vertex AI is an exercise for the reader. That may change — but watch it carefully before committing.
🔬 A note on the TPU 8i "Zebrafish" inference benchmarks
The 8th-gen Zebrafish chip is specifically designed for inference workloads — which matters enormously for agentic systems where you might have hundreds of model calls per user session. Google's numbers show significant cost reduction for inference-heavy workflows. But I want to see those numbers reproduced by independent researchers before I'd use them in an architecture decision; vendor benchmarks are marketing until third parties validate them. The architectural direction is correct — purpose-built inference silicon is the right answer for agent-scale workloads — but wait for independent analysis before committing to cost projections.
Who This Stack Is NOT For
This matters as much as the praise. Skip ADK + A2A if:
- You're in a regulated industry without a clear GCP data residency story. The managed hosting layer currently runs on Google infrastructure. If your compliance team needs sovereign data guarantees, this isn't ready for you yet.
- Your agents are simple, single-step tools. If your "agent" is really just a prompt wrapper around one API call, A2A is massive overkill. Use a plain function call and move on.
- Your team has no experience with distributed systems. Multi-agent orchestration introduces failure modes — network partitions, partial completions, cascading retries — that don't exist in single-model applications. If your team hasn't shipped a distributed system before, the complexity budget will be spent on debugging infrastructure instead of building product.
- You need stack portability today. If you're multi-cloud or planning to be, the honest answer is that running ADK agents outside of GCP's managed environment requires significant DIY work that isn't well documented yet.
What I'm Building This Weekend to Validate All of This
I don't fully trust any framework until I've hit its rough edges in practice. Here's the experiment I'm running over the next two days:
The system: Two ADK agents connected via A2A
- A research agent — uses `google_search` via a Managed MCP Server to gather structured information on a given topic
- A drafting agent — receives the research output via A2A and produces a structured technical brief
What I'm specifically testing:
- Is the A2A handoff genuinely seamless, or does it require manual serialisation between agents?
- Does `require_confirmation` block correctly in a multi-agent pipeline, or does it create blocking issues?
- How useful are the OTel traces when something actually goes wrong? I'll deliberately introduce a failure to find out.
The full code is already up and being updated as I build:
ADK A2A Validation — a weekend experiment validating Google's Agent Development Kit (ADK) and Agent-to-Agent (A2A) protocol in a real multi-agent pipeline, built as a companion to this post.
Architecture
```
┌─────────────────┐      A2A Protocol      ┌──────────────────┐
│  Research Agent │ ─────────────────────► │  Drafting Agent  │
│ (gemini-3-flash)│    structured task     │ (gemini-3.1-pro) │
│                 │    streamed updates    │                  │
│ Tools:          │                        │ Output:          │
│ - google_search │                        │ - Technical      │
│   (via MCP)     │                        │   Brief (MD)     │
└─────────────────┘                        └──────────────────┘
```
I'll post the full findings — including traces, failure cases, and honest friction points — as a follow-up here on DEV next week.
🚀 Get started yourself — 3 commands
Install ADK:

```shell
pip install google-adk
```

Scaffold your first agent project:

```shell
adk quickstart my_first_agent
```

Run it locally:

```shell
cd my_first_agent
adk run
```

The built-in web UI launches at localhost:8080 and gives you an interactive chat interface, a tool call log, and a trace visualiser — all working out of the box. For a v1.0 release, the local development experience is genuinely polished. The trace visualiser alone is worth the install even if you never deploy to GCP.
Final Take
The "Agentic Cloud" framing from Google Cloud NEXT '26 is accurate. But the announcement that actually earns that title isn't the Gemini Enterprise rebrand — it's the quiet stabilisation of the protocols and tools that let agents communicate, collaborate, and operate safely in production.
A2A v1.0 in production at 150+ organisations. ADK stable in four languages. MCP managed servers. Human-in-the-loop as a structural guarantee. Native OTel tracing.
These aren't features. They're the foundation.
The era of building AI "features" is over. We're in the era of building systems of agents now. And for the first time, the tooling to do that responsibly — with observability, human oversight, and genuine framework interoperability — is stable enough to build on.
That's the story from NEXT '26 I'll be thinking about for the next year.
What are you most excited — or most skeptical — about from this week's announcements? I'm especially curious from folks who've already shipped multi-agent systems: does A2A solve the integration pain you've actually felt? And has anyone looked closely at the security boundary around require_confirmation — specifically whether an LLM can hallucinate a confirmation signal and bypass it? Drop it in the comments. If you're running similar experiments this weekend, let's compare notes.

