This is a submission for the Google I/O Writing Challenge
Google I/O 2026’s Smartest Developer Release Wasn’t a Model. It Was the Runtime.
Every Google I/O has its headline magnet. A faster model, a shinier demo, a new capability that makes developers excited, nervous, or both. Google I/O 2026 had plenty of those moments. Gemini 3.5 Flash came with serious benchmark energy. WebMCP gave the open web crowd something ambitious to debate. AI Studio, Chrome, Search, and Gemini all moved deeper into agentic territory.
But the most important developer announcement was not the loudest one.
It was Managed Agents in the Gemini API.
That may sound less glamorous than a new model, and that is exactly why it matters. Models are the engine; Managed Agents is the chassis, gearbox, dashboard, pit crew, and emergency brake. It is the layer that turns “the model can reason and use tools” into “my application can ask an agent to do useful work, observe what it did, preserve state, collect artifacts, and continue from there.”
That is a very different product. And for developers, it may be the more important one.
The Real Bottleneck Was Never Intelligence
For the last couple of years, agent demos have followed a familiar script:
- A model receives a task.
- It calls the required tools.
- It plans and writes code.
- It runs the code and inspects the result.
- It fixes its own mistakes.
Everyone nods. But then, a developer tries to build the same thing in production and immediately runs into the actual problem.
The hard part is not only getting the model to think. The hard part is giving it a place to work. A serious agent needs a runtime: a sandbox, files, tool boundaries, and memory or state. It needs observable intermediate steps and controls for network access, credentials, cost, and cleanup, not to mention developer ergonomics that do not force every team to rebuild the same orchestration layer from scratch.
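To make that scaffolding concrete, here is a toy version of the loop a team would otherwise have to own. It is a minimal sketch, not how Managed Agents works internally: `run_agent_loop`, `LoopResult`, and the scripted policy are hypothetical names, and the "model" is a plain function so the example runs without any API.

```python
from dataclasses import dataclass, field


@dataclass
class LoopResult:
    output: str
    steps: list = field(default_factory=list)  # observable trace of each tool call


def run_agent_loop(task, decide, tools, max_steps=5):
    """Drive a decide -> act -> observe loop until the policy says it is done.

    `decide` stands in for the model. It receives the task and the trace so far
    and returns either ("call", tool_name, argument) or ("done", answer).
    """
    steps = []
    for _ in range(max_steps):
        action = decide(task, steps)
        if action[0] == "done":
            return LoopResult(output=action[1], steps=steps)
        _, tool_name, argument = action
        if tool_name not in tools:  # tool boundary: unknown tools are refused
            steps.append({"tool": tool_name, "error": "tool not allowed"})
            continue
        try:
            steps.append({"tool": tool_name, "result": tools[tool_name](argument)})
        except Exception as exc:  # failures are recorded so the policy can react
            steps.append({"tool": tool_name, "error": str(exc)})
    return LoopResult(output="stopped: step limit reached", steps=steps)


# Scripted stand-in for a model: call one tool, then report what it returned.
def scripted_policy(task, steps):
    if not steps:
        return ("call", "add", (2, 3))
    return ("done", f"result is {steps[-1]['result']}")


result = run_agent_loop(
    "add two numbers",
    scripted_policy,
    {"add": lambda pair: pair[0] + pair[1]},
)
print(result.output)
print(result.steps)
```

Even this toy needs a step limit, a tool boundary, error capture, and a trace. A real version adds sandboxing, files, credentials, and cleanup on top.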
This is the gap Managed Agents tries to close.
Google’s announcement is not simply “Gemini can use tools.” The more interesting claim is this: Google is packaging the agent loop itself as a managed developer primitive.
With Managed Agents, the Antigravity managed agent can run inside a Google-hosted Linux environment, execute code, manage files, use web access, preserve environment state, and return observable execution traces through the Interactions API. That shifts the developer’s job.
Instead of building the whole runtime yourself, you can start from a hosted agent environment and focus on the product boundary around it.
That boundary is where the real engineering begins.
What Google Shipped
At I/O 2026, Google introduced Managed Agents in the Gemini API, with the Antigravity agent available as a public preview. The agent is powered by Gemini 3.5 Flash and exposed through the Interactions API and Google AI Studio.
The product has a few key parts:
| Component | What it does |
|---|---|
| Gemini API | The developer API surface for Google’s models and agents |
| Interactions API | The API layer built for stateful, agentic, multi-turn workflows |
| Antigravity managed agent | Google’s hosted general-purpose agent harness |
| Remote Linux environment | A sandbox where the agent can execute code and manage files |
| Environment ID | A handle that lets later calls continue in the same workspace |
| Interaction ID | A handle for continuing conversational state |
| AI Studio Agents Playground | A visual way to prototype agent behavior |
| Custom agents | Reusable agent configurations with instructions, sources, and environment settings |
The important part is that this is not a single stateless prompt-response API. A stateless call is good for generation, classification, extraction, summarization, and one-shot reasoning. A managed agent is better suited for work that requires state, tools, files, and iteration.
Think data analysis, repository auditing, research synthesis, report generation, benchmark runs, documentation updates, or internal workflow automation. That is why Managed Agents is more than another AI feature. It is closer to an execution substrate.
The Architecture: Prompts Go In, Work Comes Out
A simplified Managed Agents flow works as follows.
The developer sends a task through the Interactions API. The managed agent receives it, reasons through the task, uses available tools, reads or writes files inside the remote environment, and returns both the final output and structured information about execution.
The key is state.
Google gives developers two major handles:
| Handle | Purpose |
|---|---|
| `previous_interaction_id` | Continue the conversation |
| `environment_id` | Continue working in the same sandbox |
That second handle is especially important. Without environment persistence, every agent task becomes a one-shot performance. With environment persistence, the agent can build on previous files and results.
Turn one can create an analysis. Turn two can improve the chart. Turn three can package the output. Turn four can audit the final files.
That feels less like prompting a chatbot and more like supervising a remote worker with a shell.
A Minimal API Pattern
The cleanest mental model is:
| Concept | Mental model |
|---|---|
| Interaction | The conversation and reasoning state |
| Environment | The working directory and execution state |
| Agent | The policy and tool-using worker |
| Artifact | The files created by the work |
| Step trace | The observable record of what happened |
A basic Python workflow could look like this:

    from google import genai

    # The client reads the API key from the environment.
    client = genai.Client()

    # Turn one: start in a fresh remote sandbox and write a report.
    first_run = client.interactions.create(
        agent="antigravity-preview-05-2026",
        input=(
            "Read revenue.csv, identify the top three trends, "
            "and save a short report as report.md."
        ),
        environment="remote",
    )
    print(first_run.output_text)
    print(first_run.environment_id)

    # Turn two: continue the conversation and reuse the same sandbox,
    # so report.md and any intermediate files are still available.
    second_run = client.interactions.create(
        agent="antigravity-preview-05-2026",
        previous_interaction_id=first_run.id,
        environment=first_run.environment_id,
        input=(
            "Now create a chart for the strongest trend "
            "and save it as chart.png."
        ),
    )
    print(second_run.output_text)

The developer did not manually create a container, pass files between steps, write a tool router, manage the execution loop, or build a step logger. The managed runtime absorbs much of that scaffolding.
That is the product insight. Google is not only offering model intelligence. It is offering the operating context around that intelligence.
Why the Interactions API Matters
The Interactions API is one of the most important parts of this launch because it signals how Google expects developers to build with Gemini going forward.
Older model APIs are shaped around a single call: send content, get content. That works for many use cases. But agentic workflows need more structure. They need server-side state, tool traces, intermediate events, resumability, and file continuity.
Consider a data workflow. A user uploads three CSV files and asks for a short analysis. The agent writes a script, runs it, creates a plot, and writes a markdown summary. Then the user says, “Actually, split this by region and add a table.”
In a stateless setup, you either replay everything into context or manually store and reload outputs.
With Managed Agents, you continue from the prior interaction and reuse the same environment. The files are already there. The agent can inspect them again. The workflow becomes less like prompt engineering and more like a remote analytical session.
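One way to keep that continuity out of application code is a small session wrapper that remembers both handles. This is a sketch under assumptions: `AgentSession` is a hypothetical helper, the keyword arguments mirror the ones used in this article's examples, and a fake `create` function stands in for `client.interactions.create` so the snippet runs offline.

```python
from types import SimpleNamespace


class AgentSession:
    """Remembers the interaction and environment handles between turns."""

    def __init__(self, create, agent):
        self._create = create  # in real use, pass client.interactions.create
        self._agent = agent
        self.interaction_id = None
        self.environment_id = None

    def send(self, text):
        kwargs = {"agent": self._agent, "input": text}
        if self.interaction_id is None:
            kwargs["environment"] = "remote"  # first turn: ask for a new sandbox
        else:
            kwargs["previous_interaction_id"] = self.interaction_id
            kwargs["environment"] = self.environment_id  # reuse the workspace
        response = self._create(**kwargs)
        self.interaction_id = response.id
        self.environment_id = response.environment_id
        return response


# Fake transport that records calls and returns objects shaped like a response.
calls = []


def fake_create(**kwargs):
    calls.append(kwargs)
    return SimpleNamespace(
        id=f"int-{len(calls)}", environment_id="env-1", output_text="ok"
    )


session = AgentSession(fake_create, agent="antigravity-preview-05-2026")
session.send("Analyze the three CSV files and write a short summary.")
session.send("Actually, split this by region and add a table.")
print(calls[1]["previous_interaction_id"], calls[1]["environment"])
```

The second call carries both handles without the caller passing them, which is the part that tends to get forgotten in hand-written follow-ups.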
Custom Agents: From Prompt to Reusable Worker
Managed Agents are useful as one-off calls, but the more production-relevant pattern is creating reusable agents with stable instructions, sources, and environment controls. A repo auditing agent, for example, should not need a giant prompt every time. It should have a defined role, defined workspace, and clear output expectations.
A simplified setup might look like this:

    from google import genai

    client = genai.Client()

    agent = client.agents.create(
        id="repo-auditor",
        base_agent="antigravity-preview-05-2026",
        system_instruction=(
            "Audit the repository for test failures, dependency issues, "
            "and risky code patterns. Write findings to "
            "/workspace/output/report.md."
        ),
        base_environment={
            "type": "remote",
            "sources": [
                {
                    "type": "repository",
                    "source": "https://github.com/your-org/your-repo",
                    "target": "/workspace/repo",
                }
            ],
            "network": {
                "allowlist": [
                    {"domain": "api.github.com"},
                    {"domain": "pypi.org"},
                ]
            },
        },
    )

This is where Managed Agents start to look less like “chat with tools” and more like infrastructure.
You can imagine teams defining internal agents such as:
| Agent | Purpose |
|---|---|
| `data-report-agent` | Turn CSVs into charts and summaries |
| `repo-auditor` | Review a codebase and write findings |
| `release-note-agent` | Compare commits and draft release notes |
| `benchmark-agent` | Run evaluation scripts and summarize metric changes |
| `doc-update-agent` | Propose documentation changes from source updates |
The important engineering move is repeatability.
A useful agent should not depend on a perfect prompt typed by a tired developer at 1:13 AM. It should have persistent instructions, restricted access, stable output paths, and behavior that can be reviewed.
That is the difference between a demo and a product.
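Repeatability is easier to enforce when the agent definition itself is checked before it ships. The sketch below lints a config dictionary shaped like the `repo-auditor` example above. The function name `review_agent_config`, the specific rules, and the default output root are illustrative assumptions, not part of the Gemini API.

```python
def review_agent_config(config, output_root="/workspace/output"):
    """Return a list of human-readable findings; an empty list means no issues."""
    findings = []

    instruction = config.get("system_instruction", "").strip()
    if not instruction:
        findings.append("missing system_instruction")
    elif output_root not in instruction:
        findings.append(f"instruction does not name an output path under {output_root}")

    environment = config.get("base_environment", {})
    if not environment.get("sources"):
        findings.append("no sources mounted")

    allowlist = environment.get("network", {}).get("allowlist", [])
    if not allowlist:
        findings.append("no network allowlist")
    for entry in allowlist:
        domain = entry.get("domain", "")
        if domain.startswith("*"):  # wildcards defeat the point of an allowlist
            findings.append(f"wildcard domain: {domain}")

    return findings


repo_auditor = {
    "system_instruction": "Audit the repository. Write findings to /workspace/output/report.md.",
    "base_environment": {
        "type": "remote",
        "sources": [
            {
                "type": "repository",
                "source": "https://github.com/your-org/your-repo",
                "target": "/workspace/repo",
            }
        ],
        "network": {"allowlist": [{"domain": "api.github.com"}, {"domain": "pypi.org"}]},
    },
}

print(review_agent_config(repo_auditor))
print(review_agent_config({"system_instruction": "Do whatever seems useful."}))
```

A check like this can run in CI next to the tests, so a loosened allowlist or a missing output path shows up in review instead of in production.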
Filesystem-Native Configuration Is a Bigger Deal Than It Sounds
One detail that I like is the support for instruction files such as AGENTS.md and skill files such as SKILL.md.
Why is that a big deal?
Developers already understand files. Repositories already have conventions. Teams already review documentation, configuration, and scripts in pull requests. Putting agent behavior into files makes that behavior easier to inspect, diff, review, and version.
A repository might look like this:

    repo/
      AGENTS.md
      .agents/
        skills/
          audit-tests/
            SKILL.md
          summarize-changes/
            SKILL.md
      src/
      tests/
      package.json

That is a smart direction because it makes agent behavior part of the software project, not an invisible prompt hidden inside a product dashboard.
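Because the configuration is just files, ordinary tooling can inventory it. Here is a small sketch that lists the instruction and skill files in a repository laid out as above. `find_agent_files` is a hypothetical helper, and treating each skill's folder name as its identifier is an assumption.

```python
import tempfile
from pathlib import Path


def find_agent_files(repo_root):
    """List AGENTS.md files and skill names found anywhere under repo_root."""
    root = Path(repo_root)
    instructions = sorted(
        path.relative_to(root).as_posix() for path in root.rglob("AGENTS.md")
    )
    # Assumption: each skill lives in its own folder, named after the skill.
    skills = sorted(path.parent.name for path in root.rglob("SKILL.md"))
    return {"instructions": instructions, "skills": skills}


# Build a throwaway repository with the layout shown above, then scan it.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "AGENTS.md").write_text("# Agent instructions\n")
    for skill in ("audit-tests", "summarize-changes"):
        skill_dir = Path(tmp, ".agents", "skills", skill)
        skill_dir.mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text("# Skill\n")
    found = find_agent_files(tmp)

print(found)
```

The same inventory could feed a pull request check that flags new or changed skills for a closer look.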
A team can review:
| Question | Why it matters |
|---|---|
| What is the agent allowed to do? | Defines operational boundaries |
| What files can it inspect? | Controls scope |
| What outputs should it produce? | Improves repeatability |
| What external domains can it access? | Reduces leakage risk |
| What skills does it use? | Makes behavior easier to audit |
This is the kind of technical detail that decides whether agents become production tools or remain conference magic tricks.
Observability: Because “Trust Me Bro” Is Not a Log Format
An agent that runs code, reads files, searches the web, and creates artifacts cannot be a black box. Developers need to know what happened. Not in a vague “the agent analyzed your data” way. They need step traces. They need to inspect tool calls. They need to see what files were touched, what commands ran, what sources were consulted, and where the process failed.
Agent observability has three jobs:
| Job | Why it matters |
|---|---|
| Debugging | You need to know where the process went wrong |
| Trust | Users are more likely to accept output when they can inspect the path |
| Governance | Teams need records for security, compliance, and review |
This is another reason the Interactions API matters. Agentic applications are not only about the final text. They are about the work behind the text, and a good platform needs to expose that work.
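As an illustration of what an application might do with a trace, the sketch below condenses a list of steps into something a dashboard or reviewer can scan. The step schema here (`type`, `status`, `files`) is invented for the example; the real trace format should be taken from the Interactions API documentation.

```python
from collections import Counter


def summarize_trace(steps):
    """Condense a step trace into counts, touched files, and the first failure."""
    counts = Counter(step["type"] for step in steps)
    files = sorted({path for step in steps for path in step.get("files", [])})
    first_failure = next(
        (index for index, step in enumerate(steps) if step.get("status") == "error"),
        None,
    )
    return {
        "counts": dict(counts),
        "files_touched": files,
        "first_failure": first_failure,
    }


# Hypothetical trace: one failed execution between two successful steps.
trace = [
    {"type": "code_execution", "status": "ok", "files": ["analysis.py"]},
    {"type": "code_execution", "status": "error", "files": []},
    {"type": "file_write", "status": "ok", "files": ["report.md"]},
]

summary = summarize_trace(trace)
print(summary)
```

The summary serves all three jobs in the table: the failure index helps debugging, the file list supports trust, and the counts give governance a record to keep.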
Pricing and Control
Managed Agents is useful, but agentic workflows can spend tokens quickly. A normal chat call is usually bounded by input and output. An agent run may include planning, tool calls, file inspection, code execution, error recovery, generated artifacts, and multiple rounds of iteration.
That means cost control is a product requirement, not an accounting afterthought.
A real integration should include:
| Control | Why it helps |
|---|---|
| Narrow task scopes | Prevents sprawling behavior |
| Budget limits | Stops runaway usage |
| Streaming visibility | Lets users cancel bad runs early |
| Clear stop conditions | Reduces unnecessary iteration |
| Human approval gates | Protects sensitive actions |
| Environment cleanup | Avoids stale or risky artifacts |
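Budget limits and stop conditions can live in a small client-side guard. This is a sketch, assuming the application can observe token usage per step, for example from streamed events. `RunBudget` and `BudgetExceeded` are hypothetical names, and nothing here is a built-in Gemini API feature.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run goes past its token or step allowance."""


class RunBudget:
    def __init__(self, max_tokens, max_steps):
        self.max_tokens = max_tokens
        self.max_steps = max_steps
        self.tokens_used = 0
        self.steps_taken = 0

    def record(self, tokens):
        """Account for one step and raise as soon as a limit is crossed."""
        self.tokens_used += tokens
        self.steps_taken += 1
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(
                f"token budget exceeded: {self.tokens_used} > {self.max_tokens}"
            )
        if self.steps_taken > self.max_steps:
            raise BudgetExceeded(
                f"step limit exceeded: {self.steps_taken} > {self.max_steps}"
            )


# Simulated per-step token usage; the third step pushes the run over budget.
budget = RunBudget(max_tokens=10_000, max_steps=5)
stopped_at = None
for step_number, tokens in enumerate([4_000, 3_000, 5_000, 1_000], start=1):
    try:
        budget.record(tokens)
    except BudgetExceeded as exc:
        stopped_at = step_number
        print(f"cancelled at step {step_number}: {exc}")
        break
```

In a real integration the `except` branch is where the application would cancel the run and tell the user why.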
Security: A Useful Power Still Needs a Fence
Managed Agents is exciting because it gives the model a place to act. That is also why it deserves caution. An agent that can read private files, process untrusted content, browse the web, and call tools has a real attack surface. The risky combination is:
- Access to private data
- Exposure to untrusted instructions or content
- Ability to communicate externally or take actions
That combination can create prompt injection, data exfiltration, and tool misuse risks. A safer architecture should wrap the managed agent in policy checks, scoped files, network allowlists, human review, and audit logs.
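A policy check for that risky combination can be very small. The sketch below routes a task to human review only when all three factors are present at once. The capability labels and the `assess_task` function are assumptions made for illustration.

```python
# The three factors from the list above, as hypothetical capability labels.
RISKY_TRIO = frozenset({"private_data", "untrusted_content", "external_actions"})


def assess_task(capabilities):
    """Return "review_required" when all three risk factors are present."""
    if RISKY_TRIO.issubset(capabilities):
        return "review_required"
    return "allowed"


# An agent that reads private files and browses the web, but cannot send data out.
print(assess_task({"private_data", "untrusted_content"}))

# The same agent with outbound network access added.
print(assess_task({"private_data", "untrusted_content", "external_actions"}))
```

Removing any one leg, for example by tightening the network allowlist, takes a task out of the riskiest category, which is often cheaper than reviewing every run.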
A practical rule: treat the agent like a junior engineer with shell access.
Useful? Absolutely. Unsupervised in production? Please do not make your incident report write itself.
What Needs To Improve
Managed Agents is still a preview product, and the limitations matter.
Current constraints include preview API stability, limited tool support in some areas, no structured outputs for the Antigravity agent, no MCP support for this agent yet, no background execution for Antigravity, and limited multimodal input coverage.
The biggest gap for many developers is structured output. If an agent produces artifacts for humans, markdown is fine. If it feeds another system, developers often need strict schemas.
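Until strict schemas are available, one workaround is to ask the agent for JSON in its instructions and validate the result before another system consumes it. The sketch below uses only the standard library. `parse_findings` and the field names are hypothetical, and it assumes the agent's reply is available as a plain string.

```python
import json


def parse_findings(text, required):
    """Parse agent output as JSON and check required fields and their types."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"agent output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, expected_type in required.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key} should be {expected_type.__name__}")
    return data


# Hypothetical contract for a repository audit report.
schema = {"summary": str, "failing_tests": list}

good = parse_findings('{"summary": "2 failures", "failing_tests": ["test_a"]}', schema)
print(good["failing_tests"])

try:
    parse_findings("Here is my report: everything looks fine!", schema)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Validation like this does not make the agent's output reliable, but it turns a silent format drift into an explicit error the caller can retry or escalate.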
A more mature production version should improve:
| Feature | Why it matters |
|---|---|
| Structured outputs | Safer system-to-system integration |
| Job controls | Better cancellation, retries, and background runs |
| Policy controls | Stronger file, tool, and network permissions |
| MCP support | Better tool ecosystem interoperability |
| Evaluation hooks | Easier testing before deployment |
That would move Managed Agents from promising preview to serious default runtime.
Final Verdict
Gemini 3.5 Flash gives Google a stronger engine, and WebMCP hints at a more agent-readable web. But Managed Agents gives developers the layer they actually need to turn model intelligence into product behavior: a runtime.
That runtime can execute code, handle files, preserve state, expose steps, and produce artifacts. It also forces serious questions about security, cost, observability, and control. That is exactly why it is interesting. The future of agentic software will not be won only by the smartest model; it will be won by the platform that makes smart models useful, inspectable, constrained, and economically sane.
So yes, enjoy the flashy demos. Watch the model benchmarks. Argue about whether the web needs WebMCP. But if you are a developer deciding what to build after I/O, pay close attention to the less sparkly runtime announcement.
That is usually where the future hides.
References
Google I/O 2026 announcements hub:
https://blog.google/innovation-and-ai/technology/ai/google-io-2026-all-our-announcements/
Managed Agents in the Gemini API:
https://blog.google/innovation-and-ai/technology/developers-tools/managed-agents-gemini-api/
Antigravity managed agent documentation:
https://ai.google.dev/gemini-api/docs/antigravity-agent
Managed Agents quickstart:
https://ai.google.dev/gemini-api/docs/managed-agents-quickstart
Interactions API documentation:
https://ai.google.dev/gemini-api/docs/interactions
AI Studio Agents documentation:
https://ai.google.dev/gemini-api/docs/aistudio-agents
Gemini API pricing:
https://ai.google.dev/gemini-api/docs/pricing
WebMCP early preview:
https://developer.chrome.com/blog/webmcp-epp
DEV Google I/O Writing Challenge:
https://dev.to/challenges/google-io-writing-2026-05-19






