aarhamforensics

Posted on Jun 26 • Originally published at twarx.com

Interactions API Gemini Models Agents: Complete 2026 GA Guide

#ai #machinelearning #productivity #automation

Originally published at twarx.com - read the full interactive version there.

Last Updated: June 26, 2026

Google just made every third-party AI agent orchestration framework a liability — and the developers most at risk are the ones who built the most sophisticated LangGraph pipelines. The Interactions API Gemini models agents release isn't an incremental update to Gemini. It's the infrastructure layer Google needed to make agents a platform, not a product — a single unified endpoint that now sits between you and every Gemini capability.

As of today, the Interactions API has reached general availability and is officially Google's primary interface for interacting with Gemini models and agents — a single unified endpoint with server-side state, background execution, tool combination, and Managed Agents like Antigravity.

By the end of this article you'll know exactly what changed, how the architecture works, how to migrate from the OpenAI Responses API in three lines, and whether your LangGraph stack just became technical debt.

Google's official Interactions API GA announcement — a single unified endpoint for Gemini models and agents with server-side state, background execution, and Managed Agents. Source

Coined Framework

The Orchestration Absorption Problem

The phenomenon where cloud providers embed agentic orchestration logic — state management, tool routing, background execution, multi-turn memory — directly into their inference APIs, progressively rendering external frameworks redundant. It names the systemic risk where developers lock into platform-native agent architectures before they realise the migration cost.

What Google Announced: Interactions API Is Now Generally Available

Official announcement date, source, and exact GA status

On June 26, 2026, Google DeepMind announced via the official Keyword blog that the Interactions API has reached general availability and is now its primary API for interacting with Gemini models and agents. The announcement was authored by Ali Çevik, Group Product Manager at Google DeepMind, and Philipp Schmid, Developer Relations Engineer at Google DeepMind.

This isn't a preview. The blog states plainly: "the Interactions API has reached general availability." The public beta launched in December 2025, and per Google it "quickly became developers' favorite way to build applications with Gemini." I'll take that claim with the usual grain of salt, but the GA timing and the all-docs-default-to-it treatment do suggest real adoption pressure behind it.

What changed from the previous Gemini API developer preview

The GA release ships a stable schema plus major new capabilities developers explicitly asked for: Managed Agents, background execution, and Gemini Omni (coming soon). Critically, all Google documentation now defaults to the Interactions API. Google is also working with ecosystem partners to make it the default interface across third-party SDKs and libraries — including direct integration with the Agent Development Kit (ADK). If you are tracking how this fits the wider landscape, our overview of AI agent frameworks maps where each tool now sits.

The stable schema milestone and why it matters for production teams

A stable schema is the difference between a side project and a production dependency. Before GA, breaking changes were on the table — I've been burned by that before, shipping against a preview API only to spend a week re-integrating after a quiet schema update. Now, teams shipping enterprise AI agents can commit to the surface knowing the request/response contract won't shift under them. The flagship Managed Agent — Antigravity — runs in a secure cloud sandbox and ships as the default, the first first-party agent callable directly through the new interface.

Dec 2025
Interactions API public beta launch
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




1M
Token context window in Gemini 3 Pro stateful sessions
[Google AI, 2026](https://ai.google.dev/gemini-api/docs)




3 lines
Claimed migration cost from OpenAI Responses API
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)

What the Interactions API Is and How It Works

Core architecture: stateful, server-side, and multimodal from day one

The Interactions API is a single unified endpoint. Whether you're calling a raw model for inference or running an autonomous agent, you get there in a few lines: pass a model ID for inference, an agent ID for autonomous tasks, set background=True for anything long-running. That's the entire mental model. It took me embarrassingly long to accept that the simplicity is real and not hiding something.

The architectural shift that actually matters is server-side state. The legacy Gemini generateContent endpoint was stateless — every turn required you to re-send the entire conversation history. The Interactions API maintains conversation state on Google's servers, so you reference a session and append to it rather than reconstructing it from scratch on every call. If you're new to the broader pattern, our primer on how AI agents work sets the foundation.

How server-side state replaces client-side conversation history management

This is the part that deletes code. Today, most production agents rely on Redis or Postgres to persist conversation history, tool-call logs, and intermediate agent state. With server-side state, that boilerplate disappears for the common case. You stop being the database for your own conversations — and if you've ever debugged a context-reconstruction bug at 2am, you know exactly how much that's worth.

The single most valuable practical change for solo developers isn't a new model — it's the elimination of Redis/Postgres state-management boilerplate. Server-side state turns a 200-line context manager into a session ID.

The unified endpoint model: one surface for models and agents

Before, raw inference and agent execution lived behind different abstractions. The Interactions API collapses that split. A model field accepting either gemini-3-pro or antigravity means switching from "answer this prompt" to "go execute this multi-step task in a sandbox" is a one-field change. Background execution is first-class too: long-running agent tasks don't require the client to hold an open connection. That's a production requirement that's been conspicuously absent from most competitor APIs, and it matters the moment you try to run anything that takes longer than thirty seconds.

The Interactions API doesn't add a feature to Gemini. It changes what Gemini is — from a model you call to a platform you build agents inside.

Interactions API Request Flow: From Client Call to Stateful Agent Execution

  1


    **Client sends Interaction request**

A single POST to the unified endpoint. Specify model='gemini-3-pro' for inference or model='antigravity' for a Managed Agent. Optionally set background=True.

↓


  2


    **Server resolves session state**

Google retrieves the stored conversation history and agent state server-side. No client-side history payload required after turn one.

↓


  3


    **Tool routing & combination**

The server combines built-in tools (Google Search grounding, code execution) with your custom functions and MCP-registered tools — all in one agent turn, no manual orchestration.

↓


  4


    **Managed Agent sandbox (if agent ID)**

A remote Linux sandbox is provisioned where the agent reasons, executes code, browses the web, and manages files. Antigravity runs here by default.

↓


  5


    **Response or async handle returned**

Synchronous calls return the result. Background calls return a handle you poll — the client never holds the connection open during long tasks.

The sequence matters because state, tool routing, and sandbox provisioning are all absorbed server-side — exactly the work LangGraph and AutoGen exist to perform.

The Orchestration Absorption Problem visualized: capabilities that lived in your application layer (state, routing, memory) now live inside the inference API.

Full Capability Breakdown: Every Feature in the Interactions API

Managed Agents: what they are and how they differ from function calling

Per the GA announcement, Managed Agents are: "A single API call provisions a remote Linux sandbox where an agent can reason, execute code, browse the web and manage files." The Antigravity agent ships as the default. You can also define custom agents with instructions, skills, and data sources.

This is categorically different from function calling, and the distinction matters in practice. Function calling returns a request for you to execute a tool, then you feed the result back. A Managed Agent executes the entire loop — reasoning, code, browsing, file management — inside Google's sandbox and returns the outcome. The orchestration loop you used to own is gone. Whether that feels like relief or like handing over the keys depends entirely on how much you trust Google's sandbox.

Tool combination: native grounding, code execution, and external API calls

Tool combination lets grounding (live Google Search), code execution, and custom function calls combine in a single agent turn without manual orchestration logic. The API also consumes the Model Context Protocol (MCP) ecosystem — it can call MCP-registered tools, making it complementary to MCP rather than a replacement for it. We cover the protocol in depth in our MCP explainer.

Background execution and async task patterns

Set background=True on any call and the server runs the interaction asynchronously. This is the real unlock for production agents: deep research tasks, multi-file code refactors, long-running data analysis — none of these require a persistent client connection or a custom job queue anymore. I'd have killed for this two years ago when we were running Celery workers just to keep agent tasks alive past Cloudflare's timeout limits.

Background execution billed on compute-time rather than token count changes the economics of long-running agents. A 12-minute research task isn't a token bomb — it's a metered compute job, which is far more predictable for budgeting.

Multimodal session handling and context windows

The API supports true multimodal inputs — text, image, audio, and video — within the same stateful session object. Combined with up to 1 million tokens of context in Gemini 3 Pro stateful sessions, this opens up RAG-free long-document workflows that would otherwise demand vector database infrastructure like Pinecone or Weaviate. Not for every use case — but for more of them than you'd expect.

OpenAI library compatibility layer

The OpenAI compatibility layer means switching from OpenAI's Responses API to the Interactions API is a three-line change. This is a deliberate competitive move targeting OpenAI's installed developer base — the most aggressive distribution play in the GA release, full stop. For more on multi-provider design, see our breakdown of multi-agent systems.

Coined Framework

The Orchestration Absorption Problem — in practice

Managed Agents and tool combination absorb the exact responsibilities — tool routing, the reason-act loop, sandbox execution — that frameworks like CrewAI and AutoGen were built to provide. The absorption is partial today, but the GA schema makes it durable.

How to Access and Use the Interactions API: Step-by-Step

Prerequisites: API key, SDK version, and project setup

The Interactions API is accessible via Google AI Studio and the Gemini API — no Vertex AI account required for the standard tier, which genuinely lowers the barrier compared to previous enterprise-only agent features. You'll need the google-generativeai Python SDK version 1.0+ or the equivalent TypeScript package. Older SDK versions do not support the Interactions endpoint — they silently fall back to generateContent patterns, which will cost you a frustrating debugging session if you don't catch it early.

Making your first stateful interaction: code walkthrough

Python — first stateful interaction

Requires google-generativeai >= 1.0

from google import genai

client = genai.Client(api_key='YOUR_API_KEY')

Turn 1 — server stores the session state

interaction = client.interactions.create(
model='gemini-3-pro',
input='Summarise our Q2 churn drivers.'
)
print(interaction.output_text)

Turn 2 — no history payload needed; reference the session

follow_up = client.interactions.create(
model='gemini-3-pro',
session=interaction.session_id, # server-side state
input='Now rank them by revenue impact.'
)
print(follow_up.output_text)

Calling a Managed Agent via the Interactions API

Python — calling the Antigravity Managed Agent in background

Managed Agents are invoked by passing the agent ID in the model field

task = client.interactions.create(
model='antigravity', # agent ID, not a model name
input='Clone the repo, run the test suite, and fix any failing tests.',
background=True # async; client connection not held open
)

Poll the async handle

result = client.interactions.get(task.id)
print(result.status) # running | completed | failed

Need pre-built agents to test these patterns against? You can explore our AI agent library for reference implementations and starter workflows, then adapt them to your own Interactions API sessions.

Combining tools in a single session

Python — grounding + code execution + custom tool in one turn

interaction = client.interactions.create(
model='gemini-3-pro',
tools=[
{'type': 'google_search'}, # live grounding
{'type': 'code_execution'}, # sandboxed code
{'type': 'function', 'function': my_pricing_lookup} # custom
],
input='Find current competitor pricing, compute our delta, and chart it.'
)

Tool routing is handled server-side — no manual orchestration loop

Pricing tiers, rate limits, and free quota as of June 2026

Pricing follows a per-token input/output model consistent with existing Gemini pricing, with background execution tasks billed on compute-time rather than token count. Apple developers also gain access via the Foundation Models framework, letting iOS and macOS apps call cloud-hosted Gemini through the Interactions API without direct REST integration. Confirm exact per-token and compute-time figures on the official pricing page at implementation time, as GA pricing may be adjusted.

A worked Interactions API session: turn one creates server-side state, turn two references the session ID — eliminating client-side history management entirely.

When to Use the Interactions API vs Alternatives

Interactions API vs legacy Gemini generateContent endpoint

Use the Interactions API when you need server-managed state, background execution, or Managed Agents. Use generateContent only for stateless one-shot inference where latency is the sole priority and you genuinely don't want session overhead. That use case exists, but it's narrower than it used to be.

Interactions API vs LangGraph, AutoGen, and CrewAI orchestration

LangGraph and AutoGen remain relevant for multi-model, multi-provider workflows where you can't or won't commit to a Google-only agent stack. The Orchestration Absorption Problem is real but not yet total. If you're orchestrating Claude, GPT, and Gemini in one graph, frameworks still win. If you're Gemini-native, the API absorbs most of what you were using them for. See our deep dive on AI orchestration for the trade-offs in detail.

Interactions API vs building custom agents on MCP

MCP integration is complementary, not competitive. The Interactions API can call MCP-registered tools, making it a consumer of the MCP ecosystem rather than a replacement for it. Standardise your tools on MCP and consume them from whichever runtime wins the next eighteen months.

Interactions API vs Vertex AI Agent Builder

Vertex AI Agent Builder is the enterprise path — compliance controls, VPC isolation, SLA guarantees. The Interactions API via Google AI Studio is the developer and startup path, with faster iteration and lower setup friction. For teams already on n8n or CrewAI, treat the Interactions API as the underlying inference and state layer rather than a wholesale replacement of workflow automation logic.

If your agents only ever call Gemini, your orchestration framework is now a translation layer between you and an API that already does the orchestration.

Interactions API vs OpenAI Responses API: Direct Competitor Comparison

Feature-by-feature comparison table

CapabilityGoogle Interactions APIOpenAI Responses APIAnthropic Claude API

Server-side stateYes (GA, June 2026)Yes (since early 2025)No — client-side

Managed AgentsYes (Antigravity + custom)Limited (Assistants/Actions)No equivalent

Background executionYes (background=True)PartialNo native async

Native multimodal sessionText, image, audio, videoText-first; separate APIs for A/VText + image

Max context (flagship)1M tokens (Gemini 3 Pro)~128K (GPT-4o)200K (Claude)

Tool combination per turnGrounding + code + custom + MCPBroadest 3P plugin ecosystemTool use (client-orchestrated)

Migration friction3 lines from OpenAI SDK——

The three-line migration claim: what it actually costs to switch

The three-line claim exploits OpenAI SDK compatibility to reduce switching friction. But it understates the real cost significantly. Tool definitions, system prompts, and agent logic all need retesting against Gemini 3 Pro's different instruction-following characteristics. The endpoint swaps in three lines. Behavioural parity takes a sprint — and in my experience, probably a longer one than you've budgeted for.

Where OpenAI still leads and where Google now has the edge

OpenAI leads on third-party plugin and tool ecosystem breadth as of June 2026. Google's Managed Agents catalogue is nascent — Antigravity is the flagship, and the catalogue around it is thin. But Google has a structural advantage in native multimodal sessions and the 1M-token context window, and that advantage directly impacts RAG architecture decisions in ways that matter to real teams.

Anthropic's position and what Claude's API lacks by comparison

Anthropic has no equivalent to Managed Agents. Claude's tool use and multi-turn handling remain client-side orchestration responsibilities — making it the most exposed to migration pressure if Google's agent catalogue grows at the rate the roadmap implies.

The 1M-token stateful context isn't just a spec-sheet number. For document sets under ~750K tokens — most enterprise knowledge bases — it makes a vector database optional. That's a Pinecone line-item potentially deleted from the architecture.

Industry Impact: What the Interactions API Changes for AI Development

The Orchestration Absorption Problem: are LangGraph and CrewAI under threat?

Server-side state and Managed Agents directly absorb capabilities that LangGraph, AutoGen, and CrewAI were built to provide. This isn't a full replacement today. But it's a credible 18-month platform risk for framework-dependent teams, and I'd be actively thinking about it if I were running a Gemini-native stack right now. The frameworks survive on multi-provider neutrality — the moment a team standardises on Gemini, that moat thins fast.

What this means for enterprise AI platform teams in 2026

Teams standardised on Vertex AI pipelines will see the Interactions API as a lower-friction path to production agents, potentially reducing custom orchestration code by an estimated 40-60% for common agent patterns. For a five-engineer platform team, that's real headcount redirected from plumbing to product. Whether that saving justifies the lock-in is a question every team needs to answer for itself — our guide to enterprise AI agents walks through the governance trade-offs.

Coined Framework

The Orchestration Absorption Problem — the enterprise cost curve

Absorption feels like savings — 40-60% less orchestration code — right up until pricing or capability access changes. The cost isn't paid at migration; it's paid at the moment you've lost the optionality to leave.

Impact on the MCP ecosystem and tool standardisation

The MCP ecosystem benefits here. The Interactions API supporting MCP-registered tools creates a demand signal that accelerates MCP adoption across Anthropic and Microsoft. Tool standardisation is the one layer that absorption strengthens rather than threatens — which is a useful property to build toward regardless of which inference platform wins.

Apple integration: the iOS and macOS developer opportunity

Apple's Foundation Models framework integration gives roughly 34 million registered Apple developers a low-friction path to cloud Gemini capabilities — a distribution advantage OpenAI doesn't currently have in the Apple ecosystem. Meanwhile, n8n and similar platforms face a real strategic question: is the Interactions API a node in their graph or a competitor to their agent automation layer? That answer isn't obvious yet.

The deepest risk of platform-native agents isn't technical lock-in. It's strategic: you don't notice you've lost portability until Google changes the price.

Expert and Community Reactions to the Interactions API Launch

Developer community response on X, Hacker News, and Reddit

Independent analysis from #TheGenAIGirl on Medium identified the Interactions API and ADK combination as the most important architectural shift in Google's developer platform since the original Gemini API launch. Positive reception concentrated among solo developers and startups, who cited eliminating Redis/Postgres state boilerplate as the single most valuable practical change — which tracks with what I'd expect from teams who've actually fought that problem in production. Threads on Hacker News echoed the same split between convenience and lock-in concern.

What AI engineers are saying about the OpenAI migration path

Engineers welcomed the three-line migration but flagged that it understates the real cost — tool definitions and prompts need retesting against Gemini 3 Pro's instruction-following behaviour. The consensus: easy to try, non-trivial to ship. That's an honest read.

Critical perspectives: lock-in concerns and the Orchestration Absorption Problem

Critical voices centre on vendor lock-in. Server-side state means conversation history, agent state, and tool-call logs live in Google infrastructure — a genuine concern for HIPAA and GDPR-regulated industries, not a theoretical one. For those teams, the Vertex AI enterprise path with VPC controls is the answer, not Google AI Studio.

Why Managed Agents shipped now

Managed Agents were the top developer-requested feature post-beta, suggesting Google shipped this in direct response to competitive pressure from OpenAI's GPT Actions and Anthropic's tool-use patterns. The roadmap is being written by the leaderboard. That's not a criticism — it just means you can predict what's coming next by watching what the competition already has.

[
▶

Watch on YouTube
Google DeepMind on the Interactions API and Gemini agents
Google DeepMind • Gemini agent architecture

](https://www.youtube.com/results?search_query=google+gemini+interactions+api+agents+deepmind)

The community split: solo developers celebrate deleted state boilerplate while enterprise teams weigh the lock-in cost of the Orchestration Absorption Problem.

Common Mistakes When Adopting the Interactions API

  ❌
  Mistake: Treating the 3-line migration as the whole job

Swapping the OpenAI SDK base URL for the Interactions API endpoint compiles — but your GPT-tuned prompts and tool schemas behave differently under Gemini 3 Pro's instruction-following. I've seen teams ship this confidently and then spend a week chasing regressions they didn't expect.

✅

Fix: Budget a full eval sprint. Re-run your golden test set against Gemini 3 Pro and retune system prompts before cutting production traffic over.

  ❌
  Mistake: Storing regulated data in server-side state without review

Server-side state means conversation history and tool logs live in Google infrastructure. That's a compliance problem for HIPAA and GDPR workloads on the standard Google AI Studio tier — not a maybe, a definite.

✅

Fix: For regulated data, use the Vertex AI enterprise path with VPC controls and a signed DPA rather than the standard developer tier.

  ❌
  Mistake: Going all-in on Managed Agents with zero portability plan

Building production agents entirely on Antigravity and Managed Agents means zero portability. The Orchestration Absorption Problem fully manifests if pricing or capability access changes — and you won't see it coming until it's already expensive.

✅

Fix: Keep agent logic and tool definitions in a portable layer (ADK or MCP-registered tools) so the runtime is swappable even if you default to Interactions API today. Our agent library ships portable, MCP-friendly templates you can reuse.

  ❌
  Mistake: Using an outdated SDK

The Interactions endpoint requires google-generativeai 1.0+. Older versions silently fall back to generateContent patterns and you lose server-side state entirely — with no error to tell you that's what happened.

✅

Fix: Pin google-generativeai>=1.0 in requirements and verify the client.interactions namespace exists before building.

Average Expense to Use the Interactions API

Cost has three components. First, standard inference bills per input/output token consistent with existing Gemini pricing — confirm Gemini 3 Pro rates on the live page before you commit to a budget. Second, background execution bills on compute-time, not tokens, which makes long-running agent tasks far more predictable to forecast than token-based pricing. Third, and this is the one teams undercount: total cost of ownership drops because you delete the Redis/Postgres state infrastructure you'd otherwise run — a real line-item saving, especially for smaller teams.

For a startup running modest agent volume, the practical picture looks like this: a free quota tier to prototype on Google AI Studio, per-token inference that scales with usage, and metered compute for background tasks. The hidden saving is engineering time. The estimated 40-60% reduction in orchestration code for common patterns is the biggest TCO lever here, not the per-token rate. Don't build the business case on token pricing alone — our AI agent pricing breakdown shows how to model the full picture.

What Comes Next: Roadmap, Open Questions, and Predictions

Expected expansion of the Managed Agents catalogue beyond Antigravity

Google's roadmap signals additional Managed Agents — deep research, data analysis, and code review agents are the most likely near-term additions based on prior preview sessions. Gemini Omni is explicitly listed as "soon" in the GA announcement. How quickly that catalogue grows will determine how credible the Orchestration Absorption Problem becomes for teams sitting on the fence.

The vector database and RAG obsolescence question

The 1M-token stateful context creates a credible path toward RAG-free enterprise search for document sets under ~750K tokens — which covers the majority of enterprise knowledge bases. That's a direct threat to Pinecone and Weaviate for a meaningful slice of use cases, not a marginal one. Our RAG systems guide covers where vector DBs still win convincingly.

2026 H2


  **Managed Agents catalogue expands beyond Antigravity**

Deep research, data analysis, and code review agents ship as first-party Managed Agents — grounded in Google's stated roadmap and the "Gemini Omni soon" note in the GA post.

2027 H1


  **Interactions API becomes default for all new Gemini integrations**

Google deprecates the legacy generateContent surface for new projects; teams that haven't migrated face forced transitions. Evidence: all docs already default to Interactions API.

2027 H2


  **Orchestration frameworks reposition as multi-provider neutrality layers**

LangGraph and CrewAI lean harder into cross-provider portability as their single-provider value erodes — the predictable response to the Orchestration Absorption Problem.

2028


  **Vector DB providers pivot to hybrid + agentic memory**

As 1M-token sessions absorb mid-size RAG, Pinecone and Weaviate emphasise scale beyond context limits and persistent agent memory rather than basic retrieval.

The 18-month trajectory of the Orchestration Absorption Problem: from optional convenience to default platform, with framework repositioning as the predictable counter-move.

Frequently Asked Questions

What is the Google Interactions API and how is it different from the previous Gemini API?

The Interactions API is Google's unified, server-side endpoint for both Gemini model inference and agent execution, announced GA on June 26, 2026. Unlike the legacy generateContent endpoint — which was stateless and forced you to re-send full conversation history on every turn — the Interactions API maintains state server-side. You reference a session ID instead of rebuilding context. It also adds Managed Agents (like Antigravity), background execution via background=True, native multimodal sessions (text, image, audio, video), and tool combination in a single turn. Google now positions it as its primary interface, with all documentation defaulting to it. Practically, it deletes the Redis/Postgres state boilerplate most production agents previously required and collapses model calls and agent runs behind one endpoint controlled by a model field.

Is the Interactions API generally available or still in preview as of June 2026?

It is generally available. Google announced GA on June 26, 2026 via the official Keyword blog, stating the API "has reached general availability and is now our primary API for interacting with Gemini models and agents." The public beta launched in December 2025. The GA release ships a stable schema — meaning the request/response contract is committed and safe for production dependencies — alongside new capabilities including Managed Agents and background execution. This is materially different from the beta period: teams can now build against it without risk of breaking schema changes. All Google documentation defaults to the Interactions API, and Google is working with ecosystem partners to make it the default across third-party SDKs and libraries. Gemini Omni is listed as coming soon rather than GA.

How do I migrate from the OpenAI Responses API to the Google Interactions API?

Google ships an OpenAI library compatibility layer, so the literal endpoint swap is roughly three lines — point the OpenAI SDK at the Interactions API base URL and supply a Gemini API key. That gets you running. The honest migration, however, is larger: your system prompts, tool definitions, and agent logic were tuned to GPT's instruction-following and must be retested against Gemini 3 Pro, which responds differently. Plan an evaluation sprint: re-run your golden test set, retune prompts, and validate tool schemas before cutting production traffic. The upside is access to 1M-token context, native multimodal sessions, and Managed Agents that OpenAI's Responses API doesn't match. Keep your tool definitions portable (MCP-registered) so you preserve the option to switch back if behavioural parity proves costly.

What are Managed Agents in the Interactions API and how do I call them?

Managed Agents are Google-hosted agent runtimes. Per the GA announcement, a single API call provisions a remote Linux sandbox where the agent can reason, execute code, browse the web, and manage files. Antigravity ships as the default agent. You call one by passing the agent ID in the model field — for example model='antigravity' instead of model='gemini-3-pro'. This differs from function calling: a Managed Agent executes the entire reason-act loop server-side and returns the outcome, rather than handing you a tool request to run yourself. You can also define custom agents with instructions, skills, and data sources, and the recommended framework for building them is Google's Agent Development Kit (ADK). Pair Managed Agents with background=True for long-running tasks so your client doesn't hold an open connection.

Does the Interactions API support background execution for long-running agent tasks?

Yes. Background execution is a first-class feature: set background=True on any Interactions call and the server runs the interaction asynchronously. Your client receives a handle to poll rather than holding an open connection for the duration. This is the key production unlock — deep research, multi-file code refactors, and long-running data analysis no longer require a persistent connection or a custom job queue. Background tasks are billed on compute-time rather than token count, which makes the cost of a twelve-minute agent run predictable instead of a token bomb. This capability is notably absent or only partial in most competitor APIs as of June 2026, including Anthropic's Claude API, which has no native async pattern. For agent architectures that previously needed Celery or a queue worker, background execution can remove that infrastructure entirely.

How does the Interactions API compare to using LangGraph or AutoGen for agent orchestration?

The Interactions API absorbs much of what LangGraph and AutoGen do — state management, tool routing, background execution, and the reason-act loop — directly into the inference layer. This is the Orchestration Absorption Problem. If your agents only ever call Gemini, the framework increasingly becomes a translation layer over an API that already orchestrates. However, frameworks still win for multi-model, multi-provider workflows: if you orchestrate Claude, GPT, and Gemini together, or need provider-agnostic portability, LangGraph and AutoGen remain the right choice. The estimated impact for Gemini-native teams is a 40-60% reduction in custom orchestration code for common patterns. The pragmatic stance: use the Interactions API as your inference and state layer, keep agent logic portable via ADK or MCP-registered tools, and reserve frameworks for genuine cross-provider orchestration where neutrality has value.

What are the pricing and rate limits for the Interactions API in 2026?

Pricing follows a per-token input/output model consistent with existing Gemini pricing, with one important distinction: background execution tasks are billed on compute-time rather than token count. There's a free quota tier accessible via Google AI Studio for prototyping, and no Vertex AI account is required for the standard tier, which lowers the barrier versus previous enterprise-only agent features. Because exact per-token Gemini 3 Pro rates and compute-time figures can change at GA, verify them on Google's official pricing page at implementation time. Beyond direct API cost, factor in total cost of ownership: server-side state lets you delete Redis/Postgres state infrastructure, and the 40-60% orchestration-code reduction is often the larger saving. For regulated workloads needing VPC controls and SLAs, use the Vertex AI enterprise path, which carries different pricing and compliance terms.

About the Author

Rushil Shah

AI Systems Builder & Founder, Twarx

Rushil Shah is the founder of Twarx and an AI systems builder who has spent years designing autonomous workflows, multi-agent architectures, and AI-powered business tools. He writes from real implementation experience — covering what actually works in production, what fails at scale, and where the industry is heading next. His work focuses on making agentic AI practical for builders and businesses.

LinkedIn · Full Profile

This article was originally published on Twarx. Follow for daily deep dives on AI agents and automation.