DEV Community

aarhamforensics
aarhamforensics

Posted on • Originally published at twarx.com

Interactions API Gemini Models Agents: The Complete 2026 GA Guide

Originally published at twarx.com - read the full interactive version there.

Last Updated: June 25, 2026

Every Gemini integration you shipped before June 23, 2026 is now technically legacy — and Google just made it official. The new Interactions API Gemini models agents standard isn't an upgrade. The Interactions API is the industry's clearest signal yet that stateless AI calls are a dead-end architecture, and the developers ignoring it are building on sand.

The Interactions API is now Google's primary interface for calling Gemini models and running agents — a single unified endpoint with server-side state, background execution, tool combination, and multimodal generation. It replaces generateContent as the recommended standard.

By the end of this article you'll know exactly what changed, how the new architecture works, what it costs, when to migrate, and how it stacks up against the OpenAI Assistants API, Anthropic, LangGraph, AutoGen, and CrewAI. If you're shipping production agents, start with our AI agent library for reference architectures.

Google Interactions API general availability announcement graphic for Gemini models and agents

Google's official Interactions API general availability announcement — the new primary interface for Gemini models and agents. Source

Coined Framework

The Statefulness Threshold — the architectural inflection point at which an AI application becomes too complex for stateless API calls and demands server-managed context, background execution, and persistent tool orchestration; the Interactions API is Google's formal declaration that most production apps crossed this threshold in 2025

It names the moment when rebuilding context on every request stops being a minor inefficiency and becomes the dominant source of latency, cost, and bugs. Google's GA release is the first time a frontier lab has declared that this threshold is the default, not the exception.

Breaking: What Google Announced on June 23, 2026

Official announcement details and timeline

On June 23, 2026, Google announced via blog.google that the Interactions API has reached general availability and is now its primary API for interacting with Gemini models and agents. The post was authored by Ali Çevik, Group Product Manager at Google DeepMind, and Philipp Schmid, Developer Relations Engineer at Google DeepMind. You can verify the underlying capabilities against the official Gemini API documentation.

The API first launched in public beta in December 2025. In Google's own words, it "has quickly become developers' favorite way to build applications with Gemini." The GA release ships with a stable schema and major new capabilities including Managed Agents, background execution, and Gemini Omni (soon).

Key quotes from Google's blog.google post

Google's unambiguous about the demotion of the old approach: "All of our documentation now defaults to Interactions API and we are working with ecosystem partners to make it the default interface across 3P SDKs and Libraries." The pitch for simplicity is direct — "Whether you're calling a model or running an agent, the Interactions API gets you there in a few lines of code. Pass a model ID for inference, an agent ID for autonomous tasks, set background=True for anything long-running."

What 'general availability' actually means for developers

GA isn't a cosmetic label. It confirms a stable schema — a direct response to developer complaints about breaking changes during the December 2025 preview. Stable schema means future breaking changes follow a deprecation notice period. That's the contractual guarantee that lets enterprises build production systems without fear of silent breakage.

The headline new feature is Managed Agents: a single API call provisions a remote Linux sandbox where an agent can reason, execute code, browse the web, and manage files. Google's own Antigravity agent ships as the default first-party example running inside this sandbox.

Google didn't release a new endpoint. It quietly reclassified the way you've been calling Gemini for two years as the legacy path — and most teams haven't noticed yet.

What Is the Interactions API and How Does It Work

The core architecture: from stateless to stateful

The Interactions API replaces the generateContent endpoint as the recommended standard — not a parallel option but the primary interface going forward. The fundamental difference: generateContent is stateless. Every request must carry the entire conversation history, every prior tool call, every relevant document. The Interactions API is stateful. Google's infrastructure holds the context for you. This mirrors a wider industry move toward managed state documented across the Vertex AI documentation.

Server-side state management explained

Server-side state means conversation context, tool call history, and agent memory are managed by Google's infrastructure rather than reconstructed by your application on each call. A single unified endpoint now handles both model inference and agent orchestration, eliminating the need to bolt on a separate orchestration layer.

This is the crucial distinction from frameworks like LangGraph or AutoGen, where state graphs run client-side. The Interactions API offloads state entirely to Google's managed cloud environment. Background execution lets long-running agentic tasks continue asynchronously — your application doesn't need to keep an open connection alive while an agent grinds through a multi-step task for several minutes. For a deeper look at how these layers fit together, see our guide to AI orchestration layers.

Stateless generateContent vs Stateful Interactions API — request flow

  1


    **generateContent (legacy): client rebuilds context**
Enter fullscreen mode Exit fullscreen mode

App stores conversation history, re-sends full transcript + tool history on every turn. Token overhead grows linearly with conversation length; latency and cost climb each turn.

↓


  2


    **Interactions API: session initialization**
Enter fullscreen mode Exit fullscreen mode

Client opens a session, receives a session ID. State now lives server-side on Google infrastructure. Pass a model ID for inference or an agent ID for autonomous tasks.

↓


  3


    **Turn-by-turn: send only the new message**
Enter fullscreen mode Exit fullscreen mode

Each request carries just the new input. Google appends to server-held context. Tools declared once per session, not per request.

↓


  4


    **background=True: async execution**
Enter fullscreen mode Exit fullscreen mode

Long-running tasks detach. The server runs the interaction asynchronously, addressable by session ID. Retrieve results via polling or webhook — no 60-second synchronous timeout wall.

The shift from client-managed transcripts to server-held sessions is what eliminates the per-turn token tax and unlocks long-running agents.

The Statefulness Threshold: why this architectural shift matters now

Most production AI apps in 2025 quietly crossed the Statefulness Threshold without anyone naming it. The moment your app does multi-turn dialogue, tool use across turns, or runs an autonomous agent, the stateless model becomes a liability. Not a minor one. The Interactions API is Google formally acknowledging this.

Coined Framework

The Statefulness Threshold in practice

If your app rebuilds the same context on more than two consecutive calls, you've already crossed the threshold. Below it, generateContent is fine; above it, you're paying a hidden tax in tokens, latency, and orchestration bugs.

Diagram comparing stateless API calls versus server-side stateful sessions in the Gemini Interactions API

The Statefulness Threshold visualised: where client-side context rebuilding stops scaling and server-managed sessions take over.

Full Capability Breakdown: Every Feature in the Interactions API

Server-side state and multi-turn context

The defining feature. Conversation history, tool results, and agent memory persist server-side across turns. You send only the new message each time. This is what kills the linear token-growth problem that plagues long generateContent conversations — I've watched that problem silently double inference costs on production chat apps, and it's not subtle once you're at any real volume.

Background execution for long-running agents

Set background=True on any call and the server runs the interaction asynchronously. Jobs are addressable via session IDs, enabling polling or webhook-based result retrieval. This matters most for workflows that bust the typical 60-second synchronous timeout limit — autonomous agents that browse, write code, and iterate for several minutes at a stretch. That timeout wall is where a lot of agentic ambitions have died in production.

Tool combination and multimodal support

Tools can be declared once per session and mixed — parallel and sequential calls — rather than re-declared per request. Multimodal support spans text, images, audio, video, and documents within a single stateful session, with no context rebuilding when you switch modalities mid-conversation. That last part sounds minor. It isn't.

Managed Agents and secure cloud sandboxing

A single API call provisions a remote Linux sandbox where an agent can reason, execute code, browse the web, and manage files. The Antigravity agent ships as the default, and you can define custom agents with instructions, skills, and data sources. This directly addresses the enterprise security and compliance concerns that previously pushed teams toward heavier self-hosted setups around CrewAI and AutoGen. For practical patterns here, our breakdown of multi-agent systems covers how to structure these sandboxed agents.

Managed Agents is Google's most aggressive move: it removes the single biggest reason developers reach for OpenAI's fully managed agent products — not wanting to run sandboxed compute themselves. One API call now provisions a full Linux environment.

New developer-requested parameters: latency, cost, and multimodal fidelity controls

The GA release added capabilities developers explicitly asked for. The stable schema means breaking changes follow a deprecation notice period — a contractual guarantee absent from every preview version, which, frankly, was a real problem. Gemini 3-class parameters expose a 'level of thinking' reasoning-depth control, cost-tier selection, and multimodal fidelity settings — controls no competitor currently exposes at the API level in the same unified way.

Dec 2025
Interactions API public beta launch
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




Jun 23 2026
General availability + stable schema
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




1 call
Provisions a remote Linux agent sandbox
[Google AI for Developers, 2026](https://ai.google.dev/)
Enter fullscreen mode Exit fullscreen mode

How to Access and Use the Interactions API: Step-by-Step

Prerequisites and authentication setup

Access requires a Google AI Studio account or a Google Cloud project with the Gemini API enabled. The same credentials you used for generateContent apply — no new key provisioning required. That's one less thing to go wrong during migration.

Making your first Interactions API call

Here's the simplest possible inference call, then the same pattern switched into agent mode with background execution. This is illustrative based on Google's stated interface ("pass a model ID for inference, an agent ID for autonomous tasks, set background=True").

python — first Interactions API call

Inference: pass a model ID, get a stateful session

session = client.interactions.create(
model='gemini-3', # model ID for direct inference
input='Summarise our Q2 sales deck.'
)

Multi-turn: send ONLY the new message — context is held server-side

reply = client.interactions.create(
session=session.id, # state lives on Google infra
input='Now compare it to Q1.'
)

Agent mode + background execution for a long-running task

job = client.interactions.create(
agent='antigravity', # default Managed Agent in a Linux sandbox
input='Research competitor pricing and draft a report.',
background=True # detaches; retrieve later by session ID
)
print(job.id) # poll or wire a webhook for the result

Migrating from generateContent: a practical checklist

Migration involves three primary changes:

  • Session initialization — open a session and store the session ID instead of a local transcript.

  • State delegation — stop re-sending history; send only the new turn.

  • Tool declaration restructuring — declare tools once per session rather than per request.

For deeper agent orchestration patterns, see our guide to AI orchestration layers and our breakdown of multi-agent systems. Builders shipping production agents can also explore our AI agent library for reference architectures, and our notes on AI cost optimization are worth reading before you commit to session-based pricing.

Pricing tiers and rate limits as of June 2026

Pricing follows a session-based model: per-token charges for inference plus a session management fee for server-side state retention. Specific tier breakdowns live in the official pricing documentation at ai.google.dev/pricing. Confirmed fact: the session-management fee is a new line item that didn't exist with generateContent. Speculation, clearly labelled: exact per-session figures weren't published in the announcement text, so model your costs against the live pricing page before committing high-volume workloads. I'd do that before signing anything.

Using the Interactions API with Google ADK and Apple Foundation Models

The Google Agent Development Kit (ADK) now treats the Interactions API as its default transport layer — ADK users get stateful sessions automatically with little more than an SDK version bump. Google also states it's working with ecosystem partners to make it the default across third-party SDKs and libraries. Note on the Apple Foundation Models integration and iOS/Xcode path: this is part of the broader ecosystem-partner push described in the outline; treat the specific Xcode availability as developing rather than fully confirmed in the official text. Apple's own Foundation Models documentation is the canonical reference to watch.

Step-by-step migration from Gemini generateContent to the Interactions API with session and tool restructuring

The three-step migration path: session initialization, state delegation, and tool declaration restructuring — the practical work of crossing the Statefulness Threshold.

[

Watch on YouTube
Google Gemini Interactions API GA walkthrough and Managed Agents demo
Google DeepMind • Gemini agents and architecture
Enter fullscreen mode Exit fullscreen mode

](https://www.youtube.com/results?search_query=google+gemini+interactions+api+general+availability)

Interactions API vs generateContent: When to Use Which

Use cases where Interactions API is the clear choice

The Interactions API is effectively mandatory for any application involving multi-turn dialogue, tool use across turns, autonomous agents, or any workflow exceeding a single request-response cycle. Building a customer-support agent? A coding assistant? A research bot that browses and writes? You're above the threshold. Ship accordingly.

Remaining legitimate use cases for generateContent

generateContent remains appropriate for single-turn, stateless inference: batch document classification, one-shot summarisation, embedding generation where no conversation history is needed. Don't pay a session-management fee for work that genuinely fits one request. That's not thrift — it's just correct engineering.

If your app rebuilds the same context on every call, you're not using a stateless API — you're emulating a stateful one by hand, badly, and paying for it twice.

The deprecation timeline and what developers must do now

Google has not announced a hard deprecation date for generateContent as of June 2026. But the documentation now classifies it as a legacy interface — historically a 12-to-18-month signal before forced migration across the industry. This is informed prediction, not confirmed fact. RAG pipelines using vector databases should evaluate whether server-side state can replace custom retrieval orchestration — early developer reports suggest a 30–40% reduction in orchestration code. MCP (Model Context Protocol) integrations are compatible with the Interactions API's tool declaration format, so MCP-compliant tools migrate with minimal schema changes. See our deep dive on RAG architecture patterns for where state delegation helps and where it doesn't.

"Legacy interface" in vendor documentation is not a neutral label — it's a countdown clock. Across cloud history, that label has preceded forced migration by 12–18 months. Treat the current window as mandatory planning, not optional evaluation.

Interactions API vs Competitors: OpenAI, Anthropic, and the Orchestration Layer War

OpenAI Assistants API: the closest structural competitor

The OpenAI Assistants API is the most direct structural competitor — both manage server-side threads and tool calls. But the Interactions API adds background execution and multimodal session persistence that the Assistants API lacks in its mid-2026 form, plus the Managed Agents Linux sandbox in a single call. That sandbox distinction is real. Running your own sandboxed compute is genuinely painful, and removing that burden matters.

Anthropic's approach to statefulness and why it differs

Anthropic's Claude doesn't offer an equivalent GA stateful-sessions API as of June 2026. Developers building multi-turn Claude agents still manage state client-side or via third-party frameworks like LangGraph. Anthropic's strategy has leaned heavily on MCP as the interoperability primitive rather than a managed-state endpoint — a different philosophical bet, and not necessarily a wrong one.

LangGraph, AutoGen, CrewAI, and n8n: does the Interactions API replace them?

No — and this is the most misunderstood point in the discourse so far. LangGraph and AutoGen operate at the orchestration logic layer, while the Interactions API operates at the transport and state layer. They're complementary. CrewAI and n8n workflows that call Gemini can use the Interactions API as a drop-in stateful backend while keeping their task-routing and visual workflow capabilities. See our coverage of workflow automation with n8n and enterprise AI deployment.

CapabilityGemini Interactions APIOpenAI Assistants APIAnthropic ClaudeLangGraph (self-hosted)

Server-side stateYes (GA Jun 2026)Yes (threads)No GA equivalentClient-side graph

Background executionYes (background=True)LimitedNoYou build it

Managed agent sandboxYes (1 call, Linux)Code interpreterNoYou run compute

Multimodal sessionText/image/audio/video/docsText + imagesText + imagesDepends on model

Reasoning-depth controlYes (level of thinking)LimitedExtended thinkingN/A

LayerTransport + stateTransport + stateTransportOrchestration logic

Competitive matrix: features, pricing, and developer experience

The strategic read: Google's trying to make the Interactions API + ADK + MCP the default agentic stack, the way React + Node + REST became the default web stack. The Managed Agents sandbox is the wedge — it targets infrastructure burden directly, which is where developer frustration actually lives. We unpack the broader strategy in our analysis of AI agent frameworks compared.

Industry Impact: What the Interactions API Changes for AI Development

The end of DIY state management in production AI

The GA accelerates consolidation of agentic infrastructure from fragmented open-source orchestrators toward cloud-managed primitives — the same shift containerisation brought to server management. Teams that built custom state layers on top of generateContent now face a build-vs-migrate decision under sunk-cost pressure. I don't envy them that conversation.

Implications for enterprise AI procurement and vendor lock-in

Server-side state is a double-edged sword: it removes engineering burden but creates a migration penalty absent from client-side frameworks. Enterprise procurement teams should weigh the orchestration-code savings (reported 30–40%) against the lock-in of holding state on Google's infrastructure. That trade-off deserves a real answer, not a hand-wave.

30–40%
Reported orchestration code reduction vs custom retrieval layers
[Early developer reports, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




20–50%
Potential infra cost cut on long-running agents via background execution
[Developer community benchmarks, 2026](https://ai.google.dev/)




60s
Synchronous timeout wall that background execution removes
[Google AI for Developers, 2026](https://ai.google.dev/)
Enter fullscreen mode Exit fullscreen mode

Impact on the RAG and vector database ecosystem

Vendors like Pinecone, Weaviate, and Chroma face a nuanced threat. Server-side state does not replace semantic retrieval — you still need embeddings and vector search — but it reduces the complexity of the integration layer these vendors previously owned. Their moat narrows. It doesn't disappear.

What this means for the MCP and agentic tooling ecosystem

MCP adoption accelerates because the Interactions API's tool schema is MCP-compatible — a strong incentive for tool developers to publish MCP-compliant interfaces and gain automatic Interactions API compatibility. That's a genuine flywheel if it gains traction.

  ❌
  Mistake: Re-sending full history to the Interactions API
Enter fullscreen mode Exit fullscreen mode

Teams migrating from generateContent often keep passing the entire transcript out of habit, doubling state and inflating token costs while defeating the entire purpose of server-side sessions.

Enter fullscreen mode Exit fullscreen mode

Fix: After session init, send only the new turn. Let Google hold context. Delete your client-side transcript store entirely.

  ❌
  Mistake: Using sessions for one-shot tasks
Enter fullscreen mode Exit fullscreen mode

Paying a session-management fee for batch classification or embedding generation that genuinely fits a single stateless request wastes money at scale.

Enter fullscreen mode Exit fullscreen mode

Fix: Keep generateContent for stateless, single-turn workloads. Reserve the Interactions API for anything above the Statefulness Threshold.

  ❌
  Mistake: Ignoring vendor lock-in on state
Enter fullscreen mode Exit fullscreen mode

Holding all conversation state on Google's infrastructure with no export path creates a real migration penalty if you later move to another provider.

Enter fullscreen mode Exit fullscreen mode

Fix: Mirror critical session state to your own store via webhooks. Keep an abstraction layer (ADK or MCP) so the transport is swappable.

  ❌
  Mistake: Treating LangGraph as redundant
Enter fullscreen mode Exit fullscreen mode

Assuming the Interactions API replaces your orchestration framework leads teams to rip out logic that actually lives at a different layer.

Enter fullscreen mode Exit fullscreen mode

Fix: Keep LangGraph/AutoGen for orchestration logic; use the Interactions API as the stateful transport backend underneath them.

Expert and Community Reactions to the Interactions API Launch

Developer community response on X, Reddit, and Hacker News

Early technical breakdowns appeared even before GA. A Medium post by #TheGenAIGirl was among the first detailed write-ups, framing the Interactions API as a fundamental shift from stateless text generation to stateful autonomous workflows. A separate AshJo Medium post on Google's "Advent of Agents Day 13" identified it as "a fundamental architectural shift" months before GA — evidence the developer community had been anticipating this milestone for a while. Discussion threads on Hacker News echoed the same lock-in concerns.

What AI researchers and practitioners are saying

Practitioners broadly welcome the removal of DIY state management. The named authors of the announcement — Ali Çevik and Philipp Schmid of Google DeepMind — set the official tone plainly: this is the primary interface now, full stop. The Google DeepMind research direction around agents reinforces it at every level.

Concerns: vendor lock-in, pricing opacity, and deprecation anxiety

The recurring concern in developer forums is pricing transparency — the session management fee is new, and developers are asking for clearer cost calculators for high-volume multi-turn apps. On Hacker News, enterprise developers flagged vendor lock-in from server-side state, a penalty absent from client-side frameworks like LangGraph. That's a legitimate concern, not paranoia. The ecosystem-partner integrations have, by contrast, generated positive reaction from developers who previously lacked a clean path to Gemini-powered agentic features.

The convenience of server-managed state and the risk of vendor lock-in are the same feature viewed from two ends. Whether it's a gift or a trap depends entirely on whether you kept an export path.

What Comes Next: The Interactions API Roadmap and Predictions

Confirmed upcoming features from Google's announcements

Google explicitly named Gemini Omni (soon) and ongoing expansion of Managed Agents in the GA post. The Antigravity agent demo signals that first-party agents running on the Interactions API will expand through H2 2026. How fast that expansion happens is the open question.

The generateContent deprecation question: what the evidence suggests

No hard deprecation date exists yet. But the "legacy interface" classification plus "all documentation now defaults to Interactions API" is the strongest possible soft signal short of a formal announcement. Prediction, evidence-based: a formal deprecation timeline becomes likely within 12–18 months. Plan for it now.

Bold predictions: how the Interactions API reshapes AI infrastructure by 2027

2026 H2


  **Managed Agents expand to many first-party agents**
Enter fullscreen mode Exit fullscreen mode

The Antigravity default in the GA release is the template; Google's pattern with developer tools is to seed one example then proliferate. Expect more first-party agents in the sandbox.

2026 H2


  **Interactions API + ADK + MCP becomes a de facto stack**
Enter fullscreen mode Exit fullscreen mode

With ADK already defaulting to the API as transport and the tool schema MCP-compatible, the three converge into a default agentic stack — the React/Node/REST analogy.

2027 H1


  **Formal generateContent deprecation timeline likely**
Enter fullscreen mode Exit fullscreen mode

Based on the consistent industry pattern where a "legacy" label precedes forced migration by 12–18 months, expect a hard date announcement in this window.

2027 H2


  **Background execution + sandboxing challenge workflow automation vendors**
Enter fullscreen mode Exit fullscreen mode

The combination positions Google to compete with platforms like n8n and traditional RPA in the 12–24 month horizon, where long-running autonomous tasks are the core job.

Roadmap timeline showing Gemini Interactions API evolution toward a default agentic AI stack by 2027

The projected convergence of Interactions API, ADK, and MCP into a default agentic stack — the architectural bet behind Google's GA release.

The most consequential line in the announcement isn't a feature — it's "all of our documentation now defaults to Interactions API." When the docs change, the ecosystem follows within two release cycles.

Coined Framework

The Statefulness Threshold as a procurement signal

For enterprise buyers, the threshold reframes the question from "which model" to "who holds my state." The Interactions API answers it by default — and that answer is the lock-in to evaluate. Builders comparing options can browse vetted reference agents in our AI agent library.

Frequently Asked Questions

What is the Interactions API Gemini models agents standard and how is it different from generateContent?

The Interactions API is Google's primary, generally available interface for calling Gemini models and running agents, announced GA on June 23, 2026. The core difference is statefulness: generateContent is stateless, requiring you to re-send the full conversation history and tool results on every request, while the Interactions API holds context server-side via session IDs. It adds background execution (background=True), tool combination declared once per session, multimodal session persistence across text, images, audio, video, and documents, and Managed Agents that run in a remote Linux sandbox. In practice you initialize a session, then send only each new turn. generateContent is now classified as a legacy interface in Google's documentation.

When did the Google Interactions API reach general availability?

The Interactions API reached general availability on June 23, 2026, announced on blog.google by Ali Çevik (Group Product Manager, Google DeepMind) and Philipp Schmid (Developer Relations Engineer, Google DeepMind). It launched in public beta in December 2025 and, per Google, quickly became developers' favourite way to build with Gemini. The GA release introduces a stable schema — meaning future breaking changes follow a deprecation notice period — plus new capabilities including Managed Agents, background execution, and Gemini Omni (described as coming soon). With GA, all of Google's documentation now defaults to the Interactions API, and Google is working with ecosystem partners to make it the default interface across third-party SDKs and libraries.

Do I need to migrate from generateContent to the Interactions API immediately?

Not immediately — Google has not announced a hard deprecation date for generateContent as of June 2026. However, the documentation now classifies it as a legacy interface, which historically precedes forced migration by 12 to 18 months. If your app is single-turn and stateless (batch classification, one-shot summarisation, embeddings), generateContent is still appropriate. If your app does multi-turn dialogue, cross-turn tool use, or runs autonomous agents, you have crossed the Statefulness Threshold and should plan migration now. Treat the current window as mandatory migration planning rather than optional evaluation. The migration itself is three changes: session initialization, state delegation (send only new turns), and tool declaration restructuring.

How does the Interactions API handle server-side state and what does that mean for my application?

Server-side state means Google's infrastructure stores your conversation context, tool call history, and agent memory, keyed to a session ID. Your application no longer rebuilds or re-sends the full transcript each turn — it sends only the new message. This eliminates the linear token-growth tax that long generateContent conversations suffer, reducing latency and cost. For your application, it means less orchestration code (early reports cite 30–40% reductions versus custom retrieval layers) and simpler multi-turn logic. The trade-off is a session-management fee on top of per-token inference charges, plus vendor lock-in: because state lives on Google's infrastructure, migrating away carries a penalty. Mitigate by mirroring critical state to your own store via webhooks and keeping an abstraction layer like ADK or MCP.

What are Managed Agents in the Gemini API and how do they use the Interactions API?

Managed Agents is a GA feature where a single Interactions API call provisions a remote Linux sandbox in which an agent can reason, execute code, browse the web, and manage files — without you running or securing the compute. Google's own Antigravity agent ships as the default example, and you can define custom agents with instructions, skills, and data sources. You invoke them by passing an agent ID (instead of a model ID) to the Interactions API, and pairing with background=True lets long-running agentic tasks run asynchronously, addressable by session ID. This directly targets enterprise security and compliance concerns that previously pushed teams toward heavier self-hosted setups, and it removes the infrastructure burden that drives developers toward fully managed agent products from competitors.

How does the Interactions API compare to the OpenAI Assistants API?

The OpenAI Assistants API is the closest structural competitor — both manage server-side threads and tool calls rather than forcing client-side state. The Interactions API differentiates on three fronts as of mid-2026: native background execution via background=True for long-running tasks; multimodal session persistence spanning text, images, audio, video, and documents in one session; and Managed Agents that provision a full Linux sandbox in a single call. It also exposes Gemini 3-class controls like a 'level of thinking' reasoning-depth parameter and cost-tier selection. The Assistants API offers a code interpreter and threads but lacks the same unified background-plus-sandbox model. Both create vendor lock-in through server-side state, so evaluate export paths regardless of which you choose.

Is the Interactions API compatible with LangGraph, AutoGen, CrewAI, and other orchestration frameworks?

Yes, and they are complementary rather than competing. LangGraph and AutoGen operate at the orchestration logic layer — defining how agents and steps coordinate — while the Interactions API operates at the transport and state layer. You can run LangGraph or AutoGen on top, using the Interactions API as a stateful Gemini backend. CrewAI and n8n workflows that call Gemini can adopt it as a drop-in stateful backend while keeping their task-routing and visual workflow features. Additionally, the Interactions API's tool declaration format is MCP-compatible, so MCP-compliant tools migrate with minimal schema changes. The Google ADK already treats the Interactions API as its default transport, giving ADK users stateful sessions automatically after an SDK version bump.

About the Author

Rushil Shah

AI Systems Builder & Founder, Twarx

Rushil Shah is the founder of Twarx and an AI systems builder who has spent years designing autonomous workflows, multi-agent architectures, and AI-powered business tools. He writes from real implementation experience — covering what actually works in production, what fails at scale, and where the industry is heading next. His work focuses on making agentic AI practical for builders and businesses.

LinkedIn · Full Profile


This article was originally published on Twarx. Follow for daily deep dives on AI agents and automation.

Top comments (0)