Originally published at twarx.com - read the full interactive version there.
Last Updated: June 25, 2026
Every agentic framework you spent six months building on top of the old Gemini API is now a liability — Google just absorbed your orchestration layer into the API itself. The new Interactions API for Gemini models and agents hitting general availability isn't an incremental update. It's a declaration that stateless model calls were always the wrong abstraction, and the timing matters: every Gemini doc now defaults to it.
Here's the short version. The Interactions API Gemini models agents pattern is now Google's primary interface — one unified endpoint that gives you three things the old API never did: persistent server-side state, async background execution, and provisioned Managed Agents. It's becoming the industry baseline because Google is rewriting every doc and 3P SDK around it.
After reading this migration guide, you'll know exactly what it replaces, how it works, what it actually costs in dollars and engineer-hours, and whether to migrate off LangGraph, AutoGen, or the OpenAI Assistants API. If you want to skip ahead to reference architectures, our AI agent library maps cleanly onto Managed Agents.
Google's official Interactions API general availability announcement — one unified endpoint for Gemini models and agents with server-side state and background execution. Source
Coined Framework
The Stateless Tax — the hidden engineering cost every team building on the original Gemini API paid in custom session management, retry logic, and tool-chaining scaffolding that the Interactions API now eliminates at the infrastructure level, making prior agentic architectures a technical debt liability rather than a competitive advantage
The Stateless Tax is the recurring engineering overhead you pay when a model API forgets everything between calls, forcing you to rebuild context, state, and orchestration yourself. The Interactions API moves that burden server-side — so the scaffolding that used to be your moat becomes dead weight. I'll put a dollar figure on it later in this guide, because abstract 'technical debt' never moves a budget meeting; a number does.
What Google Announced: Interactions API Reaches General Availability
Official announcement date, source, and version details
Google announced that the Interactions API has reached general availability and is now its primary API for interacting with Gemini models and agents. The official blog.google post is co-authored by Ali Çevik (Group Product Manager, Google DeepMind) and Philipp Schmid (Senior Developer Relations Engineer, Google DeepMind). The API launched in public beta in December 2025 and, per the post, 'has quickly become developers' favorite way to build applications with Gemini.'
With this GA release, the API now ships a stable schema (the v1 stable surface, replacing the v1beta endpoint used during the December 2025 beta) alongside three capabilities developers had been asking for: Managed Agents, background execution, and Gemini Omni — the last of which Google flags as 'coming soon.'
What 'general availability' means for production teams
GA is the signal production teams wait for. A stable schema means breaking changes now follow a formal deprecation cycle — the single biggest blocker to building durable systems on a fast-moving API. Google also confirmed: 'All of our documentation now defaults to Interactions API and we are working with ecosystem partners to make it the default interface across 3P SDKs and Libraries.'
Key quote from Google's official blog.google post
'Today we're announcing that the Interactions API has reached general availability and is now our primary API for interacting with Gemini models and agents.' — Ali Çevik, Group Product Manager, Google DeepMind
The companion capability announced simultaneously is Managed Agents: a single API call provisions a remote Linux sandbox where an agent can reason, execute code, browse the web, and manage files. Google's own Antigravity agent ships as the default inside that secure sandbox.
Dec 2025
Interactions API public beta launch
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)
1
Unified endpoint for models AND agents
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)
Default
All Gemini docs now default to Interactions API
[Google AI for Developers, 2026](https://ai.google.dev/)
What the Interactions API Is and How It Works
The core abstraction: from stateless calls to stateful interactions
The original Gemini API was stateless. Every call was a fresh transaction — the model had no memory of the previous turn unless you re-submitted the entire conversation history on every request. That's the heart of The Stateless Tax. Your client became the database, the retry engine, and the orchestrator all at once. I burned the better part of two weeks chasing session-ID collisions in a Redis-backed transcript store before this API landed — the bug only ever surfaced under concurrent load, which is exactly the worst time to be debugging serialization logic.
The Interactions API flips the model. You create an interaction that the server retains. Whether you're calling a model or running an agent, Google's framing is blunt: 'the Interactions API gets you there in a few lines of code. Pass a model ID for inference, an agent ID for autonomous tasks, set background=True for anything long-running.'
Server-side history and session management explained
Server-side state means the API retains context across turns without the client re-submitting full history. This removes an entire class of bugs: truncated context windows, token-counting math, history serialization, and the latency hit of shipping a growing transcript on every call. For a 20-turn support conversation, you stop re-sending 19 turns of payload each time. That's not a minor optimization — it's a different architecture entirely.
Most teams underestimate the Stateless Tax because it never shows up as one line item. It hides inside retry middleware, a Redis session store, a context-truncation helper, and the on-call engineer who gets paged at 2am to figure out why turn 14 dropped the system prompt. The Interactions API deletes all four of those at once.
Background execution model and asynchronous processing
Set background=True on any call and the server runs the interaction asynchronously — the work continues after the HTTP connection closes. This is direct parity with OpenAI's Assistants API runs, and it's what makes long-running agentic tasks viable without holding open sockets or building your own job queue. Multi-step research, code generation, web browsing — none of that should be blocking a synchronous call, and now it doesn't have to be.
The single unified endpoint architecture
A RAG pipeline that previously required three orchestrated steps — vector database query, context injection, model call — can now be expressed as a single Interactions API call with tool declarations. The orchestration moves into the API. Less code you own, less code that breaks.
Before vs After: How a Multi-Turn Agentic Call Flows
1
**OLD: Client assembles full history**
Your app loads the entire transcript from Redis, counts tokens, truncates the oldest turns, and re-serializes the payload. Latency + error surface on every call.
↓
2
**OLD: Custom tool-chaining scaffolding**
You manually parse function-call outputs, dispatch to tools, loop responses back, and handle retries. This is LangGraph/AutoGen territory — and your maintenance burden.
↓
3
**NEW: Single Interactions API call**
Pass a model or agent ID + tool declarations. Server retains state, executes tools, and (with background=True) runs asynchronously. No client-side history, no scaffolding.
↓
4
**NEW: Poll or stream the result**
For background runs, poll the interaction ID for completion. The Stateless Tax — session store, retry logic, truncation math — is eliminated at the infrastructure level.
The sequence matters because every step in the OLD column was custom code your team owned, tested, and paged on — now absorbed server-side.
The architectural shift the Interactions API introduces — moving session state, tool-chaining, and retry logic from the client into Google's managed infrastructure, eliminating The Stateless Tax.
Full Capability Breakdown: Every Feature in the Interactions API
Server-side conversation history (persistent sessions)
The API retains context across turns server-side. Sessions persist without the client re-submitting full transcripts. This is the foundational capability — everything else builds on it, and it alone justifies a serious look at migration.
Background processing and async task execution
Per Google: 'Set background=True on any call. The server runs the interaction asynchronously.' This is the feature that makes autonomous, long-horizon agents production-viable on a managed API. I would not ship a multi-minute research agent on a synchronous call — that's how you get timeout cascades in prod, and I've cleaned up after exactly that mistake.
Tool combination and multimodal input support
Google highlighted 'Tool improvements: Mix built-in tools' at GA. Multimodal context spans text, image, audio, video, and code within a single interaction. Tool combination allows multiple tool calls in one turn — which cuts round-trips substantially in multi-tool agentic flows. That's not a minor throughput gain; for workflows with five or more tool calls per task, it compounds fast.
The counterintuitive win isn't the new features — it's the deletions. Every line of orchestration code you delete is a line you no longer have to keep compatible with the next Gemini model. The Interactions API turns your agent stack from an asset you maintain into a dependency Google maintains.
Managed Agents: what they are and how they differ from DIY agents
Managed Agents are the headline GA capability: 'A single API call provisions a remote Linux sandbox where an agent can reason, execute code, browse the web and manage files.' The Antigravity agent ships as the default, and you can define custom agents with instructions, skills, and data sources. This directly addresses the security surface problem that plagued self-hosted AutoGen and CrewAI deployments — no more managing your own code-execution sandboxes. That alone was a part-time job. For prebuilt patterns, browse our production agent blueprints.
MCP (Model Context Protocol) integration
The Model Context Protocol lets Interactions API agents consume MCP-compatible tool servers without custom adapter code. Google is aligning with the emerging open standard rather than locking developers into a proprietary tool schema. For teams already running MCP servers, integration is near-zero-effort — which is exactly how it should be.
Coined Framework
The Stateless Tax in the wild: what you were actually paying for
If your team maintained a session store, a retry wrapper, a token-truncation helper, and a function-call dispatch loop, that bundle was your Stateless Tax bill. The Interactions API doesn't optimize those components — it makes them obsolete.
Stable schema and developer-requested features at GA
The stable schema commitment is the most consequential reliability signal at this release. Breaking changes now follow a formal deprecation cycle. This was not a nice-to-have: according to the State of AI Engineering 2026 report published by the LangChain team (June 2026), 58% of teams shipping LLM applications cited API schema instability as a top-three reliability risk over the prior twelve months. Teams that got burned rebuilding integrations after silent schema shifts will understand exactly why a formal deprecation cycle changes the calculus.
Stateless model calls were never a feature. They were a default we all mistook for an architecture — and the bill came due the moment we shipped agents to production.
How to Access and Use the Interactions API: Step-by-Step Guide
Prerequisites: API keys, Google AI Studio, and Google Cloud setup
The Interactions API is available via two auth paths against the same endpoint: the Google AI for Developers portal (API key from Google AI Studio) and Google Cloud Vertex AI (IAM-based auth for enterprise). Apple developers can now call Interactions API-backed Gemini models via the Foundation Models framework, enabling Xcode build-pipeline agent invocation through the same surface — which is a genuinely unexpected distribution vector.
Making your first stateful Interactions API call
Python — first stateful call
Install: pip install google-genai
from google import genai
client = genai.Client(api_key='YOUR_API_KEY')
Create a stateful interaction — no client-side history needed
interaction = client.interactions.create(
model='gemini-2.5-pro', # recommended for production agentic flows
input='Summarize our Q2 support tickets and flag refund risks.'
)
print(interaction.output_text)
Continue the SAME interaction — server retains context
followup = client.interactions.create(
interaction_id=interaction.id, # state lives server-side
input='Now draft replies for the top 3 refund risks.'
)
print(followup.output_text)
Declaring tools and enabling background execution
Python — tools + background execution
Long-running agentic task with tool combination
run = client.interactions.create(
model='gemini-2.5-pro',
input='Research competitor pricing and produce a report.',
tools=[{'type': 'web_search'}, {'type': 'code_execution'}],
background=True # runs async, survives connection close
)
Poll for completion using the interaction ID
status = client.interactions.get(run.id)
print(status.state) # 'running' -> 'completed'
Setting up a Managed Agent with the Antigravity sandbox
Python — Managed Agent (Antigravity sandbox)
One call provisions a remote Linux sandbox
agent_run = client.interactions.create(
agent='antigravity', # default Managed Agent
input='Clone the repo, run the tests, and fix any failures.',
background=True
)
The agent reasons, executes code, browses, and manages files
inside an isolated Google Cloud sandbox — no infra to manage
Building production agents? You can pair these patterns with prebuilt blueprints — explore our reference architectures that map cleanly onto Managed Agents, and combine them with broader orchestration strategies.
Pricing model and free tier limits as of June 2026
Gemini 2.5 Pro is the recommended model for production agentic workflows at GA, with the Gemini 3 family also supported per the Gemini 3 Developer Guide. The free tier includes stateful sessions up to a documented context-window limit; enterprise background execution is billed per compute-second of agent runtime. The table below reflects published Gemini 2.5 Pro rates as of June 2026 — confirm exact figures on the official Gemini API pricing page before you budget, because these have changed before and will again.
Cost componentPublished rate (Jun 2026)Notes
Input tokens (Gemini 2.5 Pro)$1.25 / 1M tokensPrompts under 200K context
Output tokens (Gemini 2.5 Pro)$10.00 / 1M tokensStandard generation
Managed Agent background runtime$0.018 / compute-secondBilled only while the sandbox is active
Server-side session storageIncluded up to context-window limitNo separate per-session storage fee on the free tier
Rates compiled from the official Gemini API pricing page, June 2026. Always re-verify before committing a budget.
Availability: regions, platforms, and Apple developer access
Available globally through Google AI for Developers and Vertex AI, with Apple Foundation Models framework access enabling Gemini agent invocation inside Xcode build pipelines. For workflow-automation teams, see how this fits broader workflow automation strategies.
A worked implementation flow: stateful call, tool declaration, and Managed Agent provisioning — the practical mechanics of eliminating The Stateless Tax in production code.
[
▶
Watch on YouTube
Google Interactions API GA — building stateful Gemini agents
Google DeepMind • Gemini agents architecture
What The Stateless Tax Actually Costs: A Dollar-and-Hours Model
Abstractions don't move budgets — numbers do. So here is the model I use with clients. A mid-sized team running multi-turn agents on the old stateless Gemini API typically maintained four things: a managed session store, a retry/backoff wrapper, a token-truncation helper, and a function-call dispatch loop. Across the engagements I've reviewed, building and hardening that bundle ran 6–10 engineer-weeks up front, then 2–4 engineer-weeks per quarter in maintenance and on-call.
At a blended senior-engineer cost of roughly $4,800/week (a conservative figure derived from Levels.fyi 2026 US senior backend compensation data), an 8-week build is about $38,400, and 3 engineer-weeks of quarterly upkeep is about $14,400 per quarter — call it $57,600 annually in maintenance alone, on top of the managed session-store infrastructure (commonly $8,000–$12,000/month in compute). That is The Stateless Tax, itemized. The Interactions API doesn't trim that line — it deletes it.
6–10 wks
Typical up-front build for custom session + orchestration scaffolding
Twarx client engagement review, 2026
~$57.6K/yr
Estimated annual maintenance cost of The Stateless Tax (3 eng-weeks/quarter)
[Levels.fyi comp data, 2026](https://www.levels.fyi/)
58%
of LLM teams cited API schema instability as a top-three reliability risk
[State of AI Engineering 2026, LangChain](https://www.langchain.com/)
The most expensive code in your agent stack is the code you wrote to compensate for a stateless API — roughly $57,600 a year for a mid-sized team, before infrastructure. Google just made that code worthless, which is the best thing that could happen to your roadmap.
When to Use the Interactions API vs Alternatives
Use Interactions API when: production stateful agents, managed infra, Google ecosystem
The Interactions API eliminates The Stateless Tax for roughly 80% of standard agentic use cases: multi-turn assistants, RAG-backed chat, autonomous research tasks, and code-execution agents where you'd rather not run your own sandbox. If you're Google-native, this is your new default — there's no serious argument for keeping the old stateless plumbing.
Stick with LangGraph when: complex custom graph logic, multi-LLM routing
LangGraph retains a real edge for teams needing conditional branching across more than three agent roles, or routing across multiple non-Google models. If your workflow is genuinely a DAG with checkpointing, LangGraph stays more expressive — and the boilerplate cost is worth it. See our deeper LangGraph breakdown.
Stick with AutoGen or CrewAI when: deep role-based agent customisation
CrewAI's role-persona model has no native equivalent in the Interactions API yet. Teams building simulated expert panels or tightly role-scripted multi-agent systems should wait for Managed Agents v2 before migrating. Compare against our AutoGen guide.
Use n8n when: no-code workflow automation without agentic reasoning
n8n integrates with Gemini but doesn't natively surface Interactions API session state — webhook flows still require manual state passing. For visual, no-code pipelines, see our n8n walkthrough.
When OpenAI Assistants API still wins
The OpenAI Assistants API still leads on file search — the vector-store integration is more mature — but trails on multimodal background execution as of June 2026. Choose accordingly.
Interactions API vs Closest Competitors: Direct Comparison
vs OpenAI Assistants API: state, tools, and pricing
The OpenAI Assistants API launched in November 2023. The Interactions API reaching GA in June 2026 means Google took roughly 31 months to ship a comparable stateful interface — yet it launched with background execution and native MCP support that the Assistants API still lacks. So the lateness is real, but so is the head start it skipped.
vs Anthropic Claude API (tool use + extended thinking)
Anthropic's tool-use API remains stateless at the HTTP level — multi-turn state routes through the Messages API with client-managed history, identical to the old Gemini approach. Anthropic developers are still paying the Stateless Tax in full.
vs LangGraph Cloud: orchestration abstraction level
LangGraph Cloud offers graph-native orchestration with checkpointing — more expressive for DAG-style workflows, but it requires significantly more boilerplate and per-step cloud execution costs. There's a real tradeoff there, not a clear winner.
vs AutoGen Studio: managed vs self-hosted agents
AutoGen Studio requires self-hosted infrastructure or Azure. Interactions API Managed Agents is the first fully serverless, zero-infrastructure equivalent from a major model provider. That's not nothing — self-hosting agent sandboxes safely is genuinely hard.
CapabilityInteractions APIOpenAI Assistants APIAnthropic Claude APILangGraph Cloud
Server-side stateYes (GA Jun 2026)Yes (Nov 2023)No (client-managed)Yes (checkpointing)
Background executionYes (background=True)Yes (runs)NoYes (per-step)
Native MCP supportYesNoPartialVia adapters
Managed code sandboxYes (Antigravity)Code interpreterNoSelf-hosted
Multimodal contextText/image/audio/video/codeText/imageText/imageModel-dependent
Infra to manageNone (serverless)NoneNoneCloud + boilerplate
Industry Impact: What the Interactions API Changes for AI Development
The death of the orchestration middleware layer for Google-native stacks
Orchestration frameworks like LangChain, LlamaIndex, and n8n face real commoditisation pressure as managed API-layer orchestration absorbs their most common use cases. The boilerplate that drove their adoption is exactly what the Interactions API deletes. That's not a knock on those tools — it's just where the floor keeps rising.
What this means for AI platform teams at enterprise scale
Enterprise teams that built internal orchestration on the stateless Gemini API now face a build-vs-migrate decision with a clear ROI case. A team maintaining a session-management service running roughly $8,000–$12,000/month in compute plus two engineers' partial time can plausibly redirect that to product work. For enterprise AI platforms, the migration math is rarely close.
Impact on the RAG and vector database ecosystem
Vector database vendors — Pinecone, Weaviate, Chroma — are unaffected short-term. RAG retrieval remains external. But the tool-call abstraction reduces the integration boilerplate that drove their adoption velocity, and that will show up in their growth numbers eventually.
How MCP standardisation reshapes the tool integration market
MCP's inclusion at GA signals Google's intent to standardise the tool-protocol layer industry-wide, directly pressuring proprietary schemas like Anthropic's tool_use and OpenAI's function_calling. When two of the three major model providers align on a protocol, the third starts looking like a migration cost.
When two of the three major model providers (Google now, Anthropic originating it) align on MCP, proprietary tool schemas stop being a feature and start being a migration cost. Bet on MCP-native tool servers for anything you're building after Q3 2026.
Expert and Community Reactions to the Interactions API GA
Developer community response on launch day
Technical deep-dives appeared within hours. Co-author Philipp Schmid (Senior Developer Relations Engineer, Google DeepMind) framed the release directly in his developer write-up: 'This is the shift from generate-then-forget to a runtime that remembers — you stop being the orchestrator and start being the architect.' That framing landed because it names the actual job change for backend engineers, not just an API feature.
What OSS maintainers are saying about The Stateless Tax elimination
Reactions from the framework world were not uniformly defensive. Harrison Chase, co-founder and CEO of LangChain, has consistently argued that the durable value of orchestration frameworks moves up-stack as model APIs absorb plumbing — a position he reiterated around the GA, noting that 'the boilerplate layer was always going to get commoditized; the interesting work is in evaluation, observability, and complex multi-agent control flow.' That's the honest read: the Interactions API doesn't kill LangGraph, it narrows what LangGraph is genuinely for.
A named production reference: the public Antigravity ADK examples
For a verifiable production-shaped reference, Google's own Agent Development Kit (ADK) documentation and the public ADK sample repositories on GitHub demonstrate the Interactions API wiring an Antigravity Managed Agent against MCP tool servers — clonable code you can run rather than a marketing diagram. If you want to pressure-test the migration before committing, start there: it's the closest thing to a reproducible deployment recipe currently public.
Critical perspectives: what the API still does not solve
Developers noted Managed Agents sandbox customisation is limited at GA. Teams needing custom runtime environments — specific Python packages, GPU access — must still fall back to Vertex AI custom containers. There's no confirmed on-premise or air-gapped deployment option, which is a hard blocker for certain regulated industries. I wouldn't promise GA Managed Agents to a healthcare or finance client without checking those constraints first.
❌
Mistake: Re-sending full history to a stateful API
Teams migrating from the old Gemini API keep re-submitting the entire transcript out of habit, doubling token costs and defeating the purpose of server-side state.
✅
Fix: Pass only the new input plus the interaction_id. Let the server hold context.
❌
Mistake: Holding sockets open for long runs
Running a 4-minute research agent on a synchronous call leads to timeouts, dropped connections, and brittle retries.
✅
Fix: Set background=True and poll the interaction ID for completion.
❌
Mistake: Wrapping a proprietary tool schema
Building custom adapters for each tool when MCP-compatible servers work natively wastes weeks and locks you to one provider.
✅
Fix: Standardise on MCP-native tool servers — the Interactions API consumes them without adapter code.
❌
Mistake: Assuming Managed Agents replace all infra
Teams needing GPU access or custom Python packages hit the GA sandbox customisation ceiling and ship broken agents.
✅
Fix: Use Vertex AI custom containers for specialised runtimes; reserve Managed Agents for standard reasoning + code + browse tasks.
What Comes Next: Roadmap and Predictions for the Interactions API
Officially confirmed upcoming features from Google
Google confirmed Gemini Omni (soon) at GA, and the Gemini 3 Developer Guide already covers the Interactions API. Full Gemini 3 family support — expanded context windows and deeper native reasoning for agentic tasks — is the next confirmed milestone. 'Soon' from Google at a GA announcement tends to mean within a quarter, not a year.
Gemini 3 family integration timeline
With Gemini 2.5 Pro as the production recommendation today and Gemini 3 already documented, expect the family to become the default agentic model through the back half of 2026.
Predicted evolution: where the Interactions API goes by end of 2026
2026 Q3
**Gemini Omni ships; Gemini 3 becomes default agentic model**
Google explicitly flagged Gemini Omni as 'soon' and the Gemini 3 guide already references the API — both are confirmed-direction, not speculation.
2026 Q4
**LangChain, AutoGen, and CrewAI prioritise MCP-native tool servers**
Google's GA commitment to MCP, on top of Anthropic's origination, makes proprietary schemas a liability — frameworks will follow the standard.
2027 Q1
**Anthropic introduces server-side session state; OpenAI ships background execution in Assistants v3**
The Interactions API GA resets the baseline. Competitors that remain stateless inherit the Stateless Tax their developers will refuse to keep paying.
The Stateless Tax in reverse: what happens when competitors copy this model
Once managed, stateful, background-capable interfaces become table stakes across providers, the competitive advantage moves up the stack — to agent quality, tool ecosystems, and sandbox flexibility. The orchestration layer simply stops being a differentiator. We've seen this pattern before with hosting, then inference, now orchestration. The floor keeps rising.
The predicted trajectory: the Interactions API GA sets a new industry baseline every major model provider must match by Q1 2027.
Frequently Asked Questions
What is the Interactions API for Gemini models and agents and how does it differ from the original Gemini API?
The Interactions API is Google's primary unified endpoint for Gemini models and agents, announced at general availability in June 2026 after a December 2025 beta. The core difference is statefulness: the original Gemini API was stateless, requiring your client to re-submit the entire conversation history on every call and to build custom session management, retry logic, and tool-chaining scaffolding — what we call The Stateless Tax. The Interactions API retains context server-side, supports background execution via background=True, combines tools in a single turn, and provisions Managed Agents in remote Linux sandboxes. You pass a model ID for inference or an agent ID for autonomous tasks. It moves orchestration from your codebase into Google's managed infrastructure.
When did the Interactions API reach general availability and what does GA mean for developers?
The Interactions API reached general availability in June 2026, per Google's official blog.google announcement co-authored by Ali Çevik and Philipp Schmid, following its public beta launch in December 2025. GA means two critical things for production teams: a stable schema (breaking changes now follow a formal deprecation cycle) and that all Gemini documentation now defaults to the Interactions API. For developers, GA is the green light to build durable production systems — schema instability was cited by 58% of LLM teams as a top-three reliability risk in the LangChain State of AI Engineering 2026 report. Google also confirmed it's working with ecosystem partners to make the Interactions API the default across third-party SDKs and libraries, so the migration is industry-wide, not optional in the long run.
How do Managed Agents work in the Interactions API and what is the Antigravity sandbox?
Managed Agents are a GA capability where a single API call provisions a remote Linux sandbox in Google Cloud where an agent can reason, execute code, browse the web, and manage files. Google's Antigravity agent ships as the default — you invoke it by passing agent='antigravity'. You can also define custom agents with your own instructions, skills, and data sources. The key advantage over self-hosted AutoGen or CrewAI is security and zero infrastructure: you don't run or secure your own code-execution sandbox. The limitation at GA is customisation — teams needing specific Python packages or GPU access must still use Vertex AI custom containers. Pair Managed Agents with background=True for long-running autonomous tasks.
What does the Interactions API cost per token and per session, including background execution?
As of June 2026, published Gemini 2.5 Pro rates are approximately $1.25 per 1M input tokens and $10.00 per 1M output tokens, with Managed Agent background runtime billed around $0.018 per compute-second — meaning a long-running research or code task accrues cost only while the sandbox is active. Server-side session storage is included up to a documented context-window limit on the free tier, so there is no separate per-session storage fee for prototyping. Because these rates change, verify exact figures on Google's official Gemini API pricing page before budgeting. The total-cost-of-ownership argument favours migration: eliminating self-managed session stores and orchestration infrastructure — roughly $57,600 per year in maintenance for a mid-sized team — often more than offsets API spend for teams previously paying The Stateless Tax.
How does the Interactions API compare to the OpenAI Assistants API for production agentic workflows?
Both offer server-side state and background execution, but they diverge on key features. The OpenAI Assistants API launched in November 2023 and leads on file search with more mature vector-store integration. The Interactions API, GA in June 2026, ships native MCP (Model Context Protocol) support and richer multimodal background execution (text, image, audio, video, code) that the Assistants API lacks natively. For Google-native stacks needing managed code sandboxes via Antigravity and MCP-compatible tooling, the Interactions API is the stronger choice. For teams deeply invested in OpenAI's ecosystem and file-search-heavy retrieval, the Assistants API remains competitive. Expect OpenAI to close the multimodal background gap in Assistants v3 within 12 months.
How do I migrate from LangGraph, AutoGen, or CrewAI to the Interactions API — or use them together?
You can do either, but the decision hinges on workflow complexity. To migrate, replace your custom session store and tool-dispatch loop with direct Interactions API calls: pass an interaction_id for continuity and background=True for long runs, which typically removes the majority of orchestration boilerplate. To run them together, the Interactions API serves as the model/agent backend inside a LangGraph node or CrewAI agent definition. Keep a framework only when you genuinely need LangGraph's conditional branching across more than three roles, multi-LLM routing across non-Google models, or CrewAI's role-persona simulation. For standard multi-turn stateful agents, layering a framework on top reintroduces the boilerplate the Interactions API eliminates — re-incurring The Stateless Tax. Pragmatic pattern: Interactions API directly for Google-native flows, frameworks reserved for genuinely complex graph or role logic.
What is MCP integration in the Interactions API and why does it matter for tool use?
MCP (Model Context Protocol) is an open standard for connecting AI models to external tools and data sources. The Interactions API's native MCP integration means agents can consume any MCP-compatible tool server without writing custom adapter code. This matters because it ends the proprietary-schema lock-in that forced developers to rebuild tool integrations for each provider's format (OpenAI's function_calling, Anthropic's tool_use). If you already run MCP servers, integration is near-zero-effort. Strategically, Google's GA commitment to MCP — alongside Anthropic, which originated the protocol — signals industry standardisation of the tool-protocol layer. The practical advice: for anything built after Q3 2026, prioritise MCP-native tool servers over proprietary schemas to keep your tooling portable across providers and frameworks like LangChain, AutoGen, and CrewAI.
About the Author
Rushil Shah
AI Systems Builder & Founder, Twarx
Rushil Shah is the founder of Twarx and an AI systems builder who has spent years designing autonomous workflows, multi-agent architectures, and AI-powered business tools. He writes from real implementation experience — covering what actually works in production, what fails at scale, and where the industry is heading next. His work focuses on making agentic AI practical for builders and businesses.
LinkedIn · Full Profile
This article was originally published on Twarx. Follow for daily deep dives on AI agents and automation.



Top comments (0)