DEV Community

aarhamforensics
aarhamforensics

Posted on • Originally published at twarx.com

Interactions API Gemini Models Agents: The 2026 Complete Guide

Originally published at twarx.com - read the full interactive version there.

Last Updated: June 26, 2026

Every LangGraph workflow, every CrewAI pipeline, and every custom stateful agent scaffold you built in 2024 and 2025 was a workaround — and Google just shipped the thing that makes workarounds unnecessary.

The Interactions API Gemini models agents release — Google's single unified endpoint reaching general availability — replaces the fragmented GenerateContent surface as the primary interface, with server-side state, background execution, tool combination and native multimodal generation across both Gemini models and agents.

After this article you'll know exactly what changed, how to call it, what it costs, how it stacks against OpenAI's Responses API, and whether to migrate your orchestration stack now.

Google Interactions API general availability announcement graphic showing unified Gemini models and agents endpoint

Google's official Interactions API GA announcement — a single unified endpoint for Gemini models and agents with server-side state and background execution. Source

Coined Framework

The Stateful Gravity Shift — the irreversible migration of agent memory, tool routing, and multi-turn context from client-side orchestration frameworks into cloud-native API infrastructure, making most bespoke orchestration middleware architecturally obsolete

For two years, developers built memory stores, tool routers and context managers in their own application layer because the model APIs were stateless. The Interactions API GA is the moment that gravity reverses — state and execution move into the provider's infrastructure, and the middleware you wrote to compensate becomes a liability rather than an asset.

What Google Announced: Interactions API Reaches General Availability

Official announcement date, source, and exact scope

On June 23, 2026, Google announced via The Keyword (blog.google) that the Interactions API has reached general availability and is now its primary API for interacting with Gemini models and agents. The post is authored by Ali Çevik, Group Product Manager at Google DeepMind, and Philipp Schmid, Developer Relations Engineer at Google DeepMind. The API launched its public beta in December 2025 and, per Google, “quickly become developers’ favorite way to build applications with Gemini.”

What 'primary interface' actually means for the GenerateContent API

This is the part that actually matters. Google states all of its documentation “now defaults to Interactions API” and that it is “working with ecosystem partners to make it the default interface across 3P SDKs and Libraries.” The legacy GenerateContent API isn't deleted — but it's explicitly no longer the recommended entry point. In Google product language, “primary interface” is the strongest signal short of a formal deprecation notice. I'd treat it accordingly.

The Gemini model family connection

The GA release ships with a stable schema and “major new capabilities that developers asked for, including Managed Agents, background execution, Gemini Omni (soon) and more,” per the official announcement. The unified endpoint is the documented path for current Gemini inference and agent execution alike.

Dec 2025
Interactions API public beta launch
[blog.google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




June 23, 2026
General availability date
[blog.google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




1 endpoint
Unified surface for models AND agents
[blog.google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)
Enter fullscreen mode Exit fullscreen mode

What the Interactions API Is: Architecture and Core Design Philosophy

The single unified endpoint model explained

Building with Gemini used to mean juggling distinct surfaces — GenerateContent for inference, EmbedContent for vectors, and your own bespoke wrappers for anything resembling an agent. I've written about that fragmentation before; it was genuinely painful. The Interactions API collapses all of it. As Google describes it: “Whether you’re calling a model or running an agent, the Interactions API gets you there in a few lines of code. Pass a model ID for inference, an agent ID for autonomous tasks, set background=True for anything long-running.” That's the whole mental model. Deceptively simple.

One endpoint, three modes: model ID for inference, agent ID for autonomy, background=True for anything that outlives an HTTP request. That is the entire mental model — and that simplicity is the weapon.

Server-side state: what it stores and for how long

This is the architectural pivot. With server-side state, conversation history, tool-call results and intermediate agent outputs are managed by Google's infrastructure rather than your database. In the stateless GenerateContent world, you replayed the full message history on every turn and persisted it yourself — I've seen teams burn non-trivial engineering time on that replay logic alone. Now the session lives on the server, referenced by an interaction handle. This is the single feature that triggers the Stateful Gravity Shift. It's also the single feature that introduces the lock-in risk discussed later, so hold both thoughts at once. If you want a primer on why state management is hard, our breakdown of AI agent memory architectures covers the patterns this release replaces.

How background execution changes the agent runtime model

Setting background=True tells the server to run the interaction asynchronously. The work outlives the HTTP request lifecycle — critical for agentic workflows that exceed the typical ~60-second serverless timeout limits. Before this, long-running agents demanded a job queue, a worker pool and a polling layer you built and operated yourself. Background execution moves that runtime concern server-side. One boolean replaces an entire subsystem.

If your agent takes 4 minutes to browse, reason and write a report, the old stack needed Celery + Redis + a worker fleet. With background=True, that infrastructure becomes a single boolean. That is not a convenience — it is the deletion of an entire layer of your architecture.

Multimodal input handling in a single request

The Interactions API accepts text, image, audio, video and document inputs within a single call — true native multimodality rather than separate pre-processing pipelines stitched together client-side. Combined with the forthcoming Gemini Omni (announced as “soon”), the endpoint is positioned for both multimodal understanding and generation through one schema. Whether “soon” means weeks or quarters, I honestly can't tell you.

Stateless GenerateContent vs. Stateful Interactions API: The Request Lifecycle Shift

  1


    **Client builds request**
Enter fullscreen mode Exit fullscreen mode

Old way: you replay the entire message history + tool results from YOUR database on every turn. New way: you send only the new turn plus an interaction handle.

↓


  2


    **Interactions API endpoint**
Enter fullscreen mode Exit fullscreen mode

Single unified endpoint resolves model ID (inference) or agent ID (autonomous task). Server-side state hydrates prior context automatically.

↓


  3


    **Execution mode branch**
Enter fullscreen mode Exit fullscreen mode

Synchronous → streamed response. background=True → asynchronous job runs server-side, survives request timeout, returns via poll or callback.

↓


  4


    **Tool combination + Managed Agent sandbox**
Enter fullscreen mode Exit fullscreen mode

Built-in tools, custom function calling, and remote Linux sandbox (for Managed Agents) execute. State and intermediate outputs persist on Google infrastructure.

↓


  5


    **Response + persisted session**
Enter fullscreen mode Exit fullscreen mode

Client receives output. The session remains addressable for the next turn — no client-side history replay needed.

The sequence matters because steps 1 and 5 are where you previously owned the database, the replay logic and the orchestration glue — all now offloaded.

Diagram comparing client-side orchestration middleware versus cloud-native server-side state in the Gemini Interactions API

The Stateful Gravity Shift visualized: orchestration logic migrating from the application layer into Google's API infrastructure.

Full Capability Breakdown: Every Feature in the Interactions API

Managed Agents: what they are and how they differ from custom agents

Per the announcement, Managed Agents mean “a single API call provisions a remote Linux sandbox where an agent can reason, execute code, browse the web and manage files.” This is the headline GA capability. Google manages the execution environment, the scaling and the state persistence — eliminating self-hosted agent infrastructure. You can either run the default agent or “define your own custom agents with instructions, skills and data sources.” That second option is where most production teams will end up, once the novelty of the default wears off.

The Antigravity agent: Google's reference implementation

The Antigravity agent ships as the default Managed Agent, per the official text. Runnable immediately via the API. It demonstrates tool chaining and stateful multi-turn interaction — a reference agent you can call without building one yourself, which is useful for evaluating the primitives before you commit to a custom implementation.

Tool combination and native function calling

GA includes tool improvements that let you “mix built-in tools” with custom function calling within a single interaction. This is the routing layer that frameworks like LangGraph and CrewAI previously owned — now native to the endpoint. If your framework's main job was tool routing, that value proposition just got a lot thinner. We compared these tradeoffs in our guide to AI agent frameworks.

Stable schema and developer-requested features added at GA

The single most important enterprise signal in the entire release: GA ships a stable schema. Breaking changes now follow a versioned deprecation policy. For teams that refused to build production systems on a beta surface — and there were many, for good reason — this is the green light. Developer-requested features confirmed at GA include Managed Agents, the stable schema, background execution and improved tool combination primitives.

OpenAI library compatibility layer

Google included an OpenAI compatibility path so existing OpenAI Python and TypeScript SDK users can route requests through Gemini via the Interactions API endpoint with a minimal change. For simple use cases, the switching cost is measured in minutes. It's a deliberate adoption play, and honestly a smart one — it lowers the activation energy for testing Gemini without requiring a full migration commitment.

Coined Framework

The Stateful Gravity Shift in practice

When Managed Agents provision a sandbox, persist state, and route tools server-side, the developer's job collapses from “build the runtime” to “declare the intent.” The gravity well of orchestration has moved from your VPC into Google's.

A stable schema is not a feature. It is a contract. It is Google telling enterprise architects: build on this, and we will not break you on a Tuesday.

How to Access and Use the Interactions API: Step-by-Step Implementation Guide

Prerequisites: API key, SDK versions, and project setup

Both Google AI Studio and Vertex AI expose the Interactions API. For prototyping, AI Studio gives you a key in seconds. For production workloads requiring IAM, VPC controls and audit logging, Vertex AI is the right path — don't try to retrofit AI Studio keys into an enterprise security model. The client class for the Interactions path differs from the legacy GenerateContent client, and the docs have been updated to reflect this, but confirm the exact class name in the current Gemini developer guide before wiring it up. The SDK moves fast enough that I'd always verify.

Making your first Interactions API call: minimal working example

Python — minimal inference call

Install the latest SDK first: pip install -U google-genai

from google import genai

client = genai.Client() # picks up GEMINI_API_KEY from env

Pass a MODEL ID for plain inference

resp = client.interactions.create(
model='gemini-3-pro', # model ID = inference mode
input='Summarize Q2 churn drivers in 3 bullets.'
)
print(resp.output_text)

Enabling server-side state and multi-turn sessions

Python — stateful multi-turn (no client-side history replay)

Turn 1 creates a server-side session

first = client.interactions.create(
model='gemini-3-pro',
input='My customer ID is 4471. What is my plan?',
store=True # persist state server-side
)

Turn 2 references the prior interaction — no replay needed

second = client.interactions.create(
model='gemini-3-pro',
previous_interaction_id=first.id, # server hydrates context
input='Upgrade me to the annual tier.'
)
print(second.output_text)

Registering and invoking tools within the Interactions API

Python — Managed Agent with the default Antigravity sandbox

Pass an AGENT ID instead of a model ID for autonomous tasks

agent_run = client.interactions.create(
agent='antigravity', # default Managed Agent
input='Research the top 3 competitors to our SKU and draft a brief.',
tools=['web_browse', 'code_execution', 'file_manager']
)
print(agent_run.output_text)

Triggering background execution for long-running agent tasks

Python — background execution + poll

job = client.interactions.create(
agent='antigravity',
input='Crawl our 200-page docs site and produce a content audit.',
background=True # runs async, survives request timeout
)

Poll until complete (or register a callback endpoint)

status = client.interactions.retrieve(job.id)
while status.state != 'completed':
status = client.interactions.retrieve(job.id)
print(status.output_text)

Need pre-built agents to test these Interactions API Gemini models agents flows against your own data? You can explore our AI agent library for reference patterns, browse ready-to-deploy agent templates, and review how teams structure production AI agents before committing your stack.

Pricing tiers, rate limits, and quota information as of June 2026

Pricing follows standard Gemini model token pricing for inference. The structural new line item: server-side state storage introduces a session duration billing dimension that didn't exist in stateless GenerateContent calls. Managed Agents that hold a Linux sandbox open and persist files will carry execution and storage costs beyond raw tokens — budget for it explicitly, because this is exactly the kind of thing that shows up as a surprise on your cloud bill at end of month. Always confirm current per-model rates on the official pricing page before forecasting.

Accessing via Apple Foundation Models framework for iOS and Xcode developers

Concurrent with this release, Apple developers can access cloud-hosted Gemini models — including the Interactions API surface — via the Apple Foundation Models framework, lowering the barrier for native iOS and Xcode integration. This is a notable distribution vector into OS-level developer tooling. Worth watching if your users are on Apple platforms.

Step-by-step Gemini Interactions API implementation flow in Python showing model agent and background execution modes

The three call modes of the Interactions API — model ID for inference, agent ID for autonomy, and background=True for long-running tasks — shown as a unified implementation surface.

[

Watch on YouTube
Google Interactions API: building Gemini agents with server-side state
Google DeepMind • Gemini API architecture walkthrough
Enter fullscreen mode Exit fullscreen mode

](https://www.youtube.com/results?search_query=Google+Interactions+API+Gemini+agents+developer+guide)

When to Use the Interactions API vs. Alternatives

Interactions API vs. GenerateContent API: migration decision tree

If you're building anything multi-turn, agentic or long-running, migrate. Full stop. If you have a high-volume, single-shot, stateless inference workload where you'd never want server-side state billing, GenerateContent still works — but you're now on a non-primary path that I wouldn't bet a new production system on. For most new builds: default to Interactions API. For existing high-throughput stateless pipelines, migrate deliberately and watch the cost model shift.

Interactions API vs. ADK (Agent Development Kit): complementary or competing?

Complementary. The recommended production stack is ADK + Interactions API: the Agent Development Kit handles agent logic definition while the Interactions API handles execution and state. ADK defines the “what,” Interactions API runs the “how.” They're not fighting for the same job.

When LangGraph, CrewAI, or AutoGen still make sense

LangGraph remains genuinely relevant for complex graph-based agent topologies, non-Gemini models, or multi-model routing that Managed Agents don't yet support. AutoGen and CrewAI retain real advantages in multi-agent communication and cross-model orchestration. If your architecture spans GPT, Claude and Gemini, you still need a framework layer above the provider APIs — the Interactions API doesn't solve that. Read more on choosing between multi-agent systems and single-provider stacks.

When to stay on OpenAI's Responses API instead of switching

If your team is deeply invested in OpenAI tooling and your workloads run well on GPT models, the marginal benefit of switching may not justify the migration cost. Don't move because of announcement energy. The compatibility layer means you can test Gemini through your existing OpenAI SDK in minutes without committing to anything — do that first.

The Interactions API does NOT abstract away retrieval infrastructure. RAG pipelines on Pinecone, Weaviate or pgvector still require custom integration — Managed Agents accept data sources, but your vector store is still yours to operate.

Interactions API vs. Closest Competitors: Direct Comparison

The state-management picture in June 2026

OpenAI's Responses API introduced server-side state in early 2025 — Google's Interactions API GA arrives roughly 12–18 months later, but ships with native Managed Agents and a Linux-sandbox isolation model that OpenAI's equivalent hasn't matched at the same quality. Anthropic's Messages API remains stateless by design as of June 2026, requiring client-side state management — a structural disadvantage for long-running sessions. Self-hosted frameworks like LangGraph and AutoGen offload nothing: you own the infrastructure but keep full portability.

CapabilityGoogle Interactions APIOpenAI Responses APIAnthropic Messages APILangGraph (self-hosted)

Server-side stateYes (GA Jun 2026)Yes (since early 2025)No — client-side onlyDeveloper-managed

Managed sandbox agentsYes (Antigravity default)Partial (Assistants)NoSelf-built

Background execution + callbackYes (background=True)LimitedNoSelf-built queue

Native multimodal in one callText/image/audio/video/docsMultimodalMultimodalDepends on model

Multi-model routingGemini-focusedOpenAI-focusedAnthropic-focusedAny model

OpenAI SDK compatibilityYes (3-line switch)NativeNoAdapter required

Vendor lock-in riskHigh (no state export at GA)HighLow (stateless)Low (portable)

Anthropic staying stateless is not a gap — it is a philosophy. But in a world racing toward long-running autonomous agents, philosophy without server-side state is a structural handicap.

Industry Impact: What the Interactions API GA Means for the AI Ecosystem

The threat to third-party orchestration framework adoption

The Stateful Gravity Shift accelerates as Google, OpenAI and eventually Anthropic centralize state and execution inside their APIs. The total addressable market for orchestration middleware — the glue code that exists only because the APIs were stateless — shrinks. Frameworks that pivot to multi-model routing and observability will survive. Those whose core value was state management get hollowed out. That's not speculation; it's already happening.

Impact on enterprise AI platform vendors

Microsoft Azure AI Foundry and AWS Bedrock Agents face renewed pressure to match server-side state and Managed Agent primitives or risk being positioned as legacy infrastructure. Vendors that built atop the GenerateContent API face non-trivial migration engineering to expose Managed Agents and background execution to their own customers. That's months of work, not weeks.

What this means for MCP, RAG, and vector database vendors

MCP (Model Context Protocol) adoption gets more complex when the primary interface already abstracts tool routing — the boundary between MCP-compatible tools and Interactions API native tools is a genuine unresolved tension. Nobody's fully sorted this yet. Vector database vendors like Pinecone and Weaviate aren't displaced, but they're pushed one abstraction layer deeper, away from direct developer touchpoints.

12–18 mo
Lag behind OpenAI's server-side state launch
[OpenAI Docs, 2025](https://platform.openai.com/docs/api-reference/responses)




3 lines
Code change to route OpenAI SDK to Gemini
[blog.google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




<30%
Projected new agentic apps using external orchestration as primary state layer by 2027 (Twarx estimate)
[Twarx Analysis, 2026](https://twarx.com/blog/orchestration)






  ❌
  Mistake: Treating server-side state as free
Enter fullscreen mode Exit fullscreen mode

Developers migrate to the Interactions API for convenience, then get surprised by session-duration and sandbox-execution billing that didn't exist in stateless GenerateContent calls.

Enter fullscreen mode Exit fullscreen mode

Fix: Model your cost on the Gemini pricing page including session duration AND sandbox time. Close idle sessions explicitly; do not let Managed Agents hold sandboxes open.

  ❌
  Mistake: Deleting your client-side state layer immediately
Enter fullscreen mode Exit fullscreen mode

Teams rip out LangGraph state stores on day one, then discover there's no documented export path for session state out of Google's infrastructure at GA — a lock-in trap. I'd wait.

Enter fullscreen mode Exit fullscreen mode

Fix: Run dual-write during migration. Keep a portable copy of critical conversation state in your own store until Google ships a documented export path.

  ❌
  Mistake: Assuming Managed Agents replace your RAG stack
Enter fullscreen mode Exit fullscreen mode

Engineers expect Managed Agents to handle retrieval, then find their vector database integration on Pinecone or Weaviate still needs custom wiring. The docs are optimistic about this.

Enter fullscreen mode Exit fullscreen mode

Fix: Treat Managed Agents as the execution and state layer; keep your RAG pipeline as a registered data source/tool, not an assumed feature.

  ❌
  Mistake: Building single-provider lock-in for a multi-model future
Enter fullscreen mode Exit fullscreen mode

Going all-in on Gemini-native tool routing makes future Claude or GPT routing expensive to add back.

Enter fullscreen mode Exit fullscreen mode

Fix: If multi-model is on your roadmap, keep a thin framework layer (LangGraph/AutoGen) above the Interactions API for routing flexibility.

Expert and Community Reactions to the Interactions API GA

Developer community response

Early sentiment across developer channels is cautiously optimistic. The OpenAI three-line migration path gets cited repeatedly — it's the most pragmatic immediate benefit for teams already invested in OpenAI tooling, and people know it. The unified endpoint simplicity drew the most positive reactions. The lock-in concern drew the most caution. Both reactions are correct.

Analysis from independent technical reviewers

Independent technical writers — including community analyses such as “Interactions API + ADK: A Closer Look” — highlighted stateful multi-turn interaction support as the most significant architectural shift in the Gemini API since launch. The framing resonates: ADK plus the Interactions API is the first time Google's agent story feels like a cohesive stack rather than assembled parts. Track ongoing coverage from Google DeepMind research.

Enterprise architect perspectives on production readiness

Enterprise architects flagged that a stable schema is table stakes, not a differentiator — the real signal is whether Google maintains the versioned deprecation policy under competitive pressure. That's the right read. Named DeepMind authors Ali Çevik (Group Product Manager) and Philipp Schmid (DevRel Engineer) are the public faces of the commitment, which means there's at least some organizational accountability attached to it.

Criticism: what the Interactions API still does not solve

The dominant criticism: server-side state creates vendor lock-in, and migrating session state out of Google's infrastructure has no documented export path at GA. That's a real gap, not a theoretical one. A second gap: independent benchmarking of Interactions API latency versus direct GenerateContent calls hadn't been published as of the GA date — a notable hole in the public evaluation picture that teams building latency-sensitive products should care about.

The feature everyone praises — server-side state — is the same feature everyone fears. Convenience and lock-in are the same coin; the Interactions API just flipped it.

What Comes Next: Roadmap, Open Questions, and Predictions

Google's signalled next steps

Google explicitly named Gemini Omni as “soon” in the GA post — extending the multimodal generation story through the same endpoint. The Antigravity Managed Agent running in cloud sandboxes also suggests a Google-curated agent marketplace is a plausible next move, conceptually similar to a GPT-Store model but with sandboxed execution. I wouldn't be surprised if that's a 2026 H2 announcement.

Will GenerateContent API be deprecated?

Google hasn't announced a deprecation date. But the “primary interface” language is the strongest deprecation signal Google typically issues before a formal sunset timeline. Treat GenerateContent as on a glide path, not a permanent home. Build new things on Interactions API.

The open question: will it support non-Gemini models?

The defining unresolved question: whether the Interactions API opens to third-party model routing or stays Gemini-exclusive. That answer determines whether it becomes infrastructure for the broader ecosystem or remains a Gemini feature. The difference is enormous.

2026 H2


  **Gemini Omni ships through the Interactions API**
Enter fullscreen mode Exit fullscreen mode

Google labeled it “soon” in the GA announcement — multimodal generation through the unified endpoint is the next confirmed milestone.

2026 H2


  **Bedrock Agents and Azure AI Foundry respond**
Enter fullscreen mode Exit fullscreen mode

Competitive pressure forces matching Managed Agent and background-execution primitives, mirroring how OpenAI's Responses API state set the prior precedent.

2027 H1


  **State-export pressure forces portability commitments**
Enter fullscreen mode Exit fullscreen mode

The lock-in criticism at GA is loud enough that providers begin shipping documented session-export paths to win enterprise trust.

2027


  **Fewer than 30% of new agentic apps use external orchestration as primary state layer**
Enter fullscreen mode Exit fullscreen mode

Grounded in current API adoption velocity and the Stateful Gravity Shift — middleware survives as routing/observability, not state management.

Coined Framework

The Stateful Gravity Shift — the endgame

The destination is an ecosystem where state and execution live with the model provider by default, and client-side orchestration exists only for cross-provider routing. The Interactions API GA is the clearest evidence yet that this is already underway.

Future roadmap timeline of the Gemini Interactions API showing Gemini Omni Managed Agents and competitor responses through 2027

The projected agentic API landscape through 2027 under the Stateful Gravity Shift — provider-owned state becomes the default, framework middleware narrows to routing.

Frequently Asked Questions

What is the Google Interactions API for Gemini models and agents and when did it reach general availability?

The Interactions API is Google's single unified endpoint for interacting with Gemini models and agents, offering server-side state, background execution, tool combination and native multimodal generation. It reached general availability on June 23, 2026, per the official blog.google announcement, after launching its public beta in December 2025. With GA, it became Google's primary API for Gemini, with all documentation now defaulting to it. You call it by passing a model ID for inference, an agent ID for autonomous tasks, or setting background=True for long-running work. The GA release ships a stable schema with a versioned deprecation policy, plus Managed Agents and the default Antigravity reference agent.

How does the Interactions API differ from the Gemini GenerateContent API?

GenerateContent is stateless — you replay the full conversation history on every call and persist it in your own database. The Interactions API adds server-side state, so Google manages conversation history, tool-call results and intermediate agent outputs; you reference a session by interaction handle instead of replaying it. It also unifies inference and agent execution under one endpoint, adds background execution for tasks exceeding HTTP timeouts, and introduces Managed Agents running in cloud Linux sandboxes. Google now designates the Interactions API as the primary interface and GenerateContent as a non-recommended legacy path. The trade-off: server-side state introduces a new session-duration billing dimension and a vendor lock-in risk, since there is no documented state-export path at GA.

What are Managed Agents in the Gemini Interactions API and how do you use them?

Managed Agents let a single API call provision a remote Linux sandbox where an agent can reason, execute code, browse the web and manage files — Google handles the execution environment, scaling and state persistence, eliminating self-hosted agent infrastructure. The Antigravity agent ships as the default, and you can define custom agents with your own instructions, skills and data sources. To use one, pass an agent ID (e.g. 'antigravity') instead of a model ID, optionally specifying tools like web_browse, code_execution or file_manager. For long-running agent work, add background=True so the job survives the request timeout and returns via polling or a callback. Pair Managed Agents with Google's ADK for agent logic definition; the Interactions API handles the runtime and state.

Can I use the Interactions API with OpenAI's Python or TypeScript SDK?

Yes. Google shipped an OpenAI compatibility layer that lets existing OpenAI Python and TypeScript SDK users route requests through Gemini via the Interactions API endpoint with a roughly three-line change — typically swapping the base URL and API key, then pointing to a Gemini model. For simple use cases, the switching cost is measured in minutes, which is a deliberate adoption strategy by Google to capture teams already invested in OpenAI tooling. The compatibility path is ideal for testing Gemini against your existing codebase before committing to a full migration. Note that advanced Interactions-API-native features like Managed Agents and background execution are best accessed through the native google-genai SDK rather than the OpenAI compatibility shim.

How does server-side state in the Interactions API work and what are the vendor lock-in risks?

Server-side state means Google's infrastructure stores conversation history, tool-call results and intermediate agent outputs, addressable by an interaction handle. You send only the new turn plus a reference to the prior interaction; the server hydrates context automatically, removing client-side history replay. The lock-in risk is real: as of GA there is no documented export path to move session state out of Google's infrastructure, so deeply stateful applications become hard to port to OpenAI, Anthropic or self-hosted frameworks. The recommended mitigation is dual-write during migration — keep a portable copy of critical conversation state in your own store (Postgres, Redis) until Google ships an export mechanism. This also adds a new session-duration billing dimension absent from stateless calls.

How does Google's Interactions API compare to OpenAI's Responses API in June 2026?

OpenAI's Responses API introduced server-side state in early 2025, roughly 12–18 months before Google's Interactions API GA. However, the Interactions API ships with native Managed Agents in isolated Linux sandboxes and first-class background execution with callback support — capabilities OpenAI's Assistants-style equivalent has not matched at the same sandbox-isolation quality. Both offer server-side state, multimodal input and SDK ecosystems; Google additionally provides an OpenAI compatibility layer that makes trialing Gemini a near-zero-cost switch. Both carry comparable vendor lock-in exposure. The practical decision: if you're GPT-native and satisfied, the Responses API is fine; if you need provider-managed sandboxed agents and long-running background jobs as first-class primitives, the Interactions API is currently more complete.

What is background execution in the Interactions API and when should I use it?

Background execution lets you set background=True on any Interactions API call so the server runs the interaction asynchronously, surviving the HTTP request lifecycle. You retrieve results by polling the interaction ID or registering a callback endpoint. Use it for any agentic workflow that exceeds typical serverless timeouts (~60 seconds) — multi-step web research, large document audits, code execution, or report generation where an agent reasons across many tool calls. Before this feature, you needed a job queue, worker pool and polling layer (Celery, Redis, custom workers) to handle long-running agents; background execution collapses that infrastructure into a single boolean. Do not use it for fast synchronous chat where streaming responses give a better UX. It is the runtime backbone of production Managed Agents.

The Interactions API Gemini models agents GA isn't a feature release — it's the opening move in a war for who owns the agentic infrastructure layer. The teams that recognize the Stateful Gravity Shift early, migrate deliberately, and hedge against lock-in will ship faster with less infrastructure to maintain. The teams that ignore it will keep paying engineers to maintain middleware the API has already made obsolete. For more on building production-grade systems, explore our guides on enterprise AI, workflow automation, and n8n integration patterns, or start deploying with our ready-made AI agents.

About the Author

Rushil Shah

AI Systems Builder & Founder, Twarx

Rushil Shah is the founder of Twarx and an AI systems builder who has spent years designing autonomous workflows, multi-agent architectures, and AI-powered business tools. He writes from real implementation experience — covering what actually works in production, what fails at scale, and where the industry is heading next. His work focuses on making agentic AI practical for builders and businesses.

LinkedIn · Full Profile


This article was originally published on Twarx. Follow for daily deep dives on AI agents and automation.

Top comments (0)