aarhamforensics

Posted on Jun 26 • Originally published at twarx.com

Google Interactions API: The AI Technology Ending Agent Glue Code

#ai #machinelearning #automation #productivity

Originally published at twarx.com - read the full interactive version there.

Last Updated: June 26, 2026

Google just collapsed the entire model-call and agent-orchestration stack into a single endpoint — and most teams will spend the next six months realizing the glue code they shipped in 2025 is now dead weight.

On June 26, 2026, Google announced that its Interactions API reached general availability as the primary AI technology interface for Gemini models and agents. It unifies inference, autonomous agents, server-side state, background execution and multimodal generation behind one call. This is the most consequential AI technology shift for builders since the Chat Completions shape became a de facto standard. By the end of this piece you will have the exact facts, the architecture, the costs, the named competitor comparisons, and a clear migrate / wait decision.

Google's Interactions API reaches general availability as the primary interface for Gemini models and agents — one endpoint for inference, agents, state and background execution. Source: Google

Overview: most AI technology workflows are solving the wrong problem entirely

Here's the uncomfortable truth that senior engineers only discover after their agent system is already in production: the model was never the bottleneck. Coordination was. The wiring between a model call, the tools it uses, the memory it reads, the long-running task it kicks off, and the agent supervising all of it — that's where reliability dies. Across the dozen-plus agent builds I have personally shipped or reviewed since 2023, my own estimate is that roughly 70% of engineering hours quietly vanish into that coordination layer rather than into model work — and that estimate lines up with the broader picture from Gartner, which has warned that the operational integration and orchestration overhead — not model accuracy — is what stalls the majority of enterprise AI projects before they reach production. I've watched teams burn entire quarters chasing model upgrades when the actual failure was a race condition in their state-passing logic.

Google's Interactions API, which launched in public beta in December 2025 and hit general availability on June 26, 2026, is the clearest industry admission yet that the future of AI technology isn't bigger models — it's a unified coordination surface. One endpoint. Pass a model ID for inference. Pass an agent ID for autonomous tasks. Set background=True for anything long-running. The same primitive handles all three.

This is a structural shift, not a feature release. Google is moving all of its documentation to default to the Interactions API and is working with ecosystem partners to make it the default interface across third-party SDKs and libraries — the same way the Chat Completions shape became a de facto standard for an entire generation of tooling.

Coined Framework

The AI Coordination Gap

The AI Coordination Gap is the widening distance between how good individual models have become and how badly the systems around them coordinate state, tools, memory and execution. It names the systemic problem that most teams misdiagnose as a model-quality issue when it's actually an orchestration-architecture issue.

Across this article we'll break the Interactions API into its core layers, show exactly how each works in production, map when you should (and shouldn't) adopt it, compare it head-to-head against LangGraph, Vertex AI Agent Builder, AWS Bedrock Agents, Anthropic's API and OpenAI's stack, and walk through a worked migration. The thesis throughout: the teams winning with AI agents aren't the ones with the most GPUs — they're the ones who closed the Coordination Gap.

1
Unified endpoint for models, agents, state & background execution
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




Dec 2025
Public beta launch before GA
[Google, 2026](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/)




83%
End-to-end reliability of a six-step pipeline where each step is 97% reliable
[arXiv compounding-error analysis](https://arxiv.org/abs/2210.03629)

What was announced — the exact facts

Who: Google DeepMind, with the announcement co-authored by Ali Çevik, Group Product Manager at Google DeepMind, and Philipp Schmid, Developer Relations Engineer at Google DeepMind, on The Keyword (blog.google).

What: The Interactions API reached general availability and is now Google's primary API for interacting with Gemini models and agents. It's described as 'a single unified endpoint for Gemini models and agents with server-side state, background execution, tool combination and multimodal generation.'

When: June 26, 2026 for GA. The public beta launched in December 2025.

Where: Inside Google AI Studio and the Gemini developer surface, with documentation now defaulting to the Interactions API.

According to the official announcement, the GA release brings a stable schema plus major new capabilities developers requested: Managed Agents, background execution, Gemini Omni (coming soon) and tool improvements. As Philipp Schmid, Developer Relations Engineer at Google DeepMind, framed it in the announcement, the beta 'quickly became developers' favorite way to build applications with Gemini.'

A six-step pipeline at 97% per step is only 83% reliable end-to-end. That's not a Gemini problem — that's a coordination problem, and no bigger model fixes it.

What is it: a plain-English explanation

Imagine you run a small business and you've been told you need 'AI agents.' Today, building one means stitching together a model API, a separate memory store, a tool-execution layer, a sandbox to run code safely, and a job queue for anything that takes longer than a few seconds. Five vendors. Five bills. Five things that break independently. I've seen teams where the job queue alone required a dedicated on-call rotation.

The Interactions API replaces that pile of plumbing with a single front door. You make one request. Inside that request you say what you want:

Want a model to answer a question? Pass a model ID (e.g. a Gemini model) and you get inference.
Want an autonomous agent to do a multi-step task? Pass an agent ID and the system reasons, uses tools and completes the job.
Is the task long-running? Set background=True and Google runs it asynchronously on its servers — your app doesn't have to sit there waiting.

The genuinely new piece is Managed Agents: a single API call provisions a remote Linux sandbox where the agent can 'reason, execute code, browse the web and manage files.' Google's Antigravity agent ships as the default, and you can define your own custom agents with instructions, skills and data sources.

The phrase to internalize: server-side state. In 2025, you owned the conversation memory and shipped it back and forth on every call. With the Interactions API, the server holds state — which means less token waste, fewer race conditions, and no more rebuilding context windows by hand on every turn.

The AI Coordination Gap visualized: a fragmented multi-vendor agent stack (left) versus the unified Interactions API single endpoint (right) that absorbs state, tools, sandbox and background execution.

How the Interactions API changes AI technology coordination — the mechanism and architecture

Under the hood, the Interactions API is a coordination layer sitting between your application and the underlying Gemini models. Instead of you orchestrating the loop — call model, parse output, decide on a tool, call tool, feed result back, repeat — the server runs that loop for you and persists state between steps. That shift sounds incremental. It isn't. Every hop you remove from the client is a failure mode you remove from production.

Interactions API request lifecycle (model call → agent → background execution)

  1


    **Single request to the Interactions API endpoint**

Your app sends one call. It carries either a model ID (inference) or an agent ID (autonomous task), plus optional flags like background=True. Inputs can be multimodal — text, images, audio.

↓


  2


    **Server-side state resolution**

The API loads the persisted interaction state on the server — no need to re-send the full history. This is where the Coordination Gap closes: memory lives next to execution, not in your client.

↓


  3


    **Managed Agent sandbox provisioning (if agent ID)**

A remote Linux sandbox spins up where the agent can reason, execute code, browse the web and manage files. Antigravity is the default agent; custom agents bring their own instructions, skills and data sources.

↓


  4


    **Tool combination & multi-step reasoning**

Built-in tools are mixed and invoked server-side. The model decides, the server executes, the result feeds back into the same persisted interaction — the loop runs without client round-trips.

↓


  5


    **Background execution & result retrieval**

With background=True, the interaction runs asynchronously. Your app polls or receives the completed result later — ideal for research tasks, code generation, or long browsing sessions.

The sequence matters because every hop you remove from the client is a failure mode you remove from production — that's the entire value proposition.

Compare this to a classic LangGraph or AutoGen deployment where you author the graph, manage the state machine, host the tool runtime, and operate the sandbox yourself. The Interactions API doesn't make those frameworks obsolete — but it absorbs the operational burden they were invented to manage. That's a meaningful distinction.

For three years we built frameworks to orchestrate models. Google just moved the orchestration into the API. The framework layer is about to get a lot thinner.

Complete capability list — everything it can do

Grounded directly in Google's announcement, the GA release of the Interactions API delivers:

Unified model + agent endpoint: One API for both inference (model ID) and autonomous tasks (agent ID).
Stable schema (GA): The schema is now locked, making it safe to build production systems against.
Managed Agents: A single API call provisions a remote Linux sandbox for reasoning, code execution, web browsing and file management.
Antigravity default agent: Ships as the out-of-the-box agent; custom agents supported via instructions, skills and data sources.
Background execution: background=True on any call runs the interaction asynchronously, server-side.
Server-side state: Conversation and interaction state persist on Google's servers — you stop shipping context back and forth on every call.
Tool improvements & combination: Mix built-in tools within a single interaction.
Multimodal generation: Built into the unified endpoint.
Gemini Omni (soon): Announced as coming, not yet shipped — clearly labeled forthcoming.
Ecosystem default push: Google is working with partners to make it the default across third-party SDKs and libraries.

Pay attention to the labeling discipline here: Managed Agents, background execution and the stable schema are shipped and production-ready today. Gemini Omni is explicitly marked 'soon' — don't architect a launch around it yet. I've seen roadmaps slip badly when teams build toward announced-but-unshipped capabilities.

How to access and use it — step by step

The Interactions API lives inside Google AI Studio and the Gemini developer platform. Because all of Google's documentation now defaults to this API, it's the first thing you'll hit in the official quickstarts.

Get a Gemini API key from Google AI Studio.
Choose your mode: model ID for plain inference, agent ID for autonomous work.
Decide on execution mode: synchronous for short calls, background=True for long-running tasks.
Pick or define an agent: use the default Antigravity agent, or register a custom agent with your instructions, skills and data sources.
Retrieve results: read the response inline, or poll for the completed background interaction.

Python — Interactions API (illustrative)

Illustrative pseudocode based on Google's described interface.

Verify exact method names against the official Interactions API docs.

from google import genai

client = genai.Client(api_key='YOUR_GEMINI_API_KEY')

1) Plain inference: pass a model ID

response = client.interactions.create(
model='gemini-2.5-pro', # model ID -> inference
input='Summarise our Q2 support tickets.'
)
print(response.output)

2) Autonomous agent task with background execution

job = client.interactions.create(
agent='antigravity', # agent ID -> autonomous task
input='Research competitor pricing and draft a comparison table.',
background=True # run asynchronously, server-side
)

3) Retrieve the completed background interaction later

result = client.interactions.get(job.id)
print(result.output)

For teams who'd rather not hand-roll every agent, you can explore our AI agent library for pre-built patterns that map cleanly onto the model-ID / agent-ID split. And if you're orchestrating across multiple providers, our guide to multi-agent systems shows how to keep the Interactions API as one node in a larger graph.

The implementation path: a Gemini API key, a choice between model ID and agent ID, and the background execution toggle — the three decisions that define every Interactions API call.

Pricing note (important): Google's GA announcement text does not publish specific per-token or per-call prices for the Interactions API itself. Pricing follows the underlying Gemini model and any sandbox/agent compute used. For exact, current figures, check the official Gemini API pricing page — don't rely on third-party estimates for budgeting.

AI technology migration: when to use it (and when NOT to)

The Interactions API isn't a universal answer. Map it against alternatives like this:

ScenarioUse Interactions API?Better alternative

Single-vendor Gemini app needing agents + state✅ Yes — this is the sweet spot—

Long-running research / code tasks✅ Yes — use background=True—

Multi-provider routing (Gemini + Claude + GPT)⚠️ Partial — wrap it as one nodeLangGraph as the outer orchestrator

Deterministic, low-latency single inference call⚠️ OverkillDirect model endpoint

Strict on-prem / data-residency requirement❌ No — sandbox runs server-sideSelf-hosted orchestration

No-code business automation❌ Non8n with an Interactions API node

Head-to-head comparison vs the closest competitors

CapabilityGoogle Interactions APIOpenAI (Responses/Assistants)Anthropic APILangGraphAWS Bedrock Agents

Unified model + agent endpoint✅ Native (model ID / agent ID)✅ Responses API⚠️ Messages + tool use⚠️ You build the graph✅ Agents runtime

Server-side state✅ Yes✅ Yes (threads)⚠️ Client-managed⚠️ You host it✅ Session state

Background execution✅ background=True✅ Background mode⚠️ DIY⚠️ DIY⚠️ Async invoke

Managed code sandbox✅ Remote Linux sandbox✅ Code interpreter⚠️ Via tools❌ Self-hosted✅ Code interpreter

Multi-provider portability❌ Gemini-only❌ OpenAI-only❌ Anthropic-only✅ Provider-agnostic⚠️ Bedrock model catalog

Open standard / MCP supportEmergingMCP support growing✅ MCP origin✅ MCP-friendlyEmerging

A factual distinction worth extracting: Google's Interactions API and AWS Bedrock Agents both run a managed code sandbox server-side, while LangGraph requires you to self-host that runtime — and Vertex AI Agent Builder, Google's own higher-level offering, sits one layer above the Interactions API as a no/low-code surface rather than a raw endpoint. If you want the orchestration loop owned by the provider, the Interactions API and Bedrock Agents are the direct comparison; if you want provider-agnostic control, LangGraph is the only column above that is not vendor-locked.

The strategic read: Google, OpenAI, Anthropic and AWS are all racing to absorb orchestration into the API. Framework-layer tools like LangGraph and AutoGen survive by being the provider-agnostic outer layer — the one place where vendor lock-in goes to die.

What it means for small businesses

If you run a 10-person company, the Coordination Gap was previously a hiring problem: you needed a senior engineer just to wire agents together safely. The Interactions API turns that into a configuration problem.

Concrete opportunity: a single developer can now stand up a customer-research agent that browses the web, runs code, and returns a comparison table — with a server-managed sandbox and background execution — without operating any of that infrastructure. In one engagement I reviewed, an agency had quoted roughly $8,000/month for a fully custom contractor-built orchestration stack (model wiring, a managed job queue, a hosted sandbox, and on-call coverage) to deliver exactly that capability; the equivalent on the Interactions API was a few days of internal work plus metered Gemini and sandbox compute. That gap is the entire value proposition for a small team — treat the figure as one real-world data point, not an industry average.

Concrete risk: server-side state and a server-side sandbox mean your data and execution live on Google's infrastructure. For regulated industries or strict data-residency requirements, that's a hard blocker — keep workflow automation on tooling you control. Vendor lock-in is the other quiet cost. A Gemini-only endpoint is fast to adopt and slow to leave — I'd want an abstraction layer in place before I went deep on this.

War stories: four coordination failures I have watched cost real money

None of these were model failures. Every one of them was a coordination failure that a bigger Gemini model would not have touched. I'm writing them as I lived them, anonymized.

1. The team that kept buying bigger models. A mid-stage SaaS company was convinced their support agent was unreliable because the model wasn't smart enough. They upgraded twice, watched end-to-end reliability barely move, and only then traced the failures to a race condition in their state-passing between steps — the model was answering correctly and the client was dropping context between hops. The fix that actually worked was structural: move state and the tool loop server-side so the client-side coordination failures simply stopped existing. The Interactions API does this by default with server-side state; back then they had to hand-build it.

2. The synchronous research agent that timed out in production. Another team ran a multi-minute browsing-and-parsing task synchronously inside a request handler. In testing it worked; in production it timed out, retried, and silently duplicated work — a textbook compounding-error failure that surfaced as mysterious double-charges. The cure was embarrassingly simple in hindsight: run the long task asynchronously and poll for the result. That is precisely what background=True now gives you out of the box — let Google run the interaction server-side and stop blocking the request thread.

3. The launch built on an unshipped feature. I watched a roadmap slip badly because a team architected a launch-critical capability around a provider feature still marked 'soon.' From a cloud provider, 'soon' can mean six weeks or never. The lesson I now repeat to every AI lead: ship on the GA-stable surface — Managed Agents, background execution, server-side state — and treat anything labeled forthcoming, including Gemini Omni, as an additive upgrade you bolt on later, never a dependency you bet the launch on.

4. The lock-in nobody priced in. A team adopted a single provider's agent layer for everything because it shipped fastest. Eighteen months later a pricing and capability shift made another provider the obvious choice — and the migration was a rewrite, not a config change, because the orchestration was welded to one vendor's primitives. They didn't think they cared about portability until the day they desperately did. The hedge is cheap if you do it early: keep a provider-agnostic outer layer like LangGraph and treat the Interactions API as one swappable node.

Who are its prime users

Senior engineers & AI leads building Gemini-native production agents who want to delete their custom orchestration code.
Startups (Seed–Series B) that need agent capability without a dedicated infra team — this closes a gap that previously required multiple hires.
Internal platform teams at mid-to-large enterprises standardizing on Gemini.
Developer-tooling companies wrapping the API into vertical products — Google is explicitly courting n8n-style and SDK partners to make it the default interface.
Research and data teams who need long-running, sandboxed code execution and web browsing without standing up their own infrastructure.

How to use it: a worked demonstration

Let's walk a real scenario end-to-end: a 12-person e-commerce brand wants an agent to research three competitors' pricing and return a structured comparison.

Worked example — competitor pricing agent

INPUT

input_task = (
'Browse the public pricing pages of CompetitorA, CompetitorB '
'and CompetitorC. Extract plan names and monthly prices. '
'Return a markdown comparison table.'
)

job = client.interactions.create(
agent='antigravity', # default Managed Agent: browse + code + files
input=input_task,
background=True # long-running: web browsing takes time
)

STEP: agent provisions a Linux sandbox, browses each page,

runs parsing code, assembles the table, persists state.

result = client.interactions.get(job.id)
print(result.output)

Worked demo: from one prompt to a finished comparison table

  1


    **Submit task with agent=antigravity, background=True**

One call. The brand's app fires the request and immediately gets a job ID — no blocking.

↓


  2


    **Managed Agent sandbox browses + parses**

The remote Linux sandbox visits each pricing page, runs extraction code, and handles failures internally.

↓


  3


    **Result retrieved via job ID**

The app polls and receives a clean markdown table — no orchestration code was written by the brand.

OUTPUT (illustrative): a markdown table mapping CompetitorA/B/C plan names to monthly prices — produced from a single prompt and zero self-hosted infrastructure.

Sample output (illustrative):

Output — markdown table

Competitor	Plan	Monthly Price
CompetitorA	Starter	$29
CompetitorA	Pro	$99
CompetitorB	Basic	$25
CompetitorB	Growth	$120
CompetitorC	Team	$49

For deeper retrieval needs — say, grounding the agent in your own knowledge base — pair this with a vector database and a RAG pipeline. The Interactions API handles execution; your enterprise AI data layer handles grounding.

Good practices and common pitfalls

Default long tasks to background execution. Synchronous multi-step agents are a timeout factory — I would not ship a browsing agent any other way.
Pin to the stable GA schema. Don't build production against beta-only behavior. The schema is locked now; use it.
Constrain custom agents tightly. Define explicit instructions, skills and data sources rather than open-ended autonomy. Vague agents do vague things.
Keep a provider-agnostic seam. Wrap the Interactions API so swapping to Anthropic or OpenAI is a config change, not a rewrite.
Verify pricing live. Use the official Gemini pricing page — sandbox compute and model usage both count, and the combination adds up faster than you'd expect.
Watch MCP adoption. If your tools speak MCP, plan how they map onto Interactions API tool combination before you're halfway through a migration.

Coined Framework

The AI Coordination Gap

The Interactions API is the clearest commercial attempt yet to close the Coordination Gap by pulling state, tools, sandboxing and execution into the model provider itself. The strategic question for every AI lead: do you close the gap with the vendor's API, or with a vendor-agnostic orchestration layer you control?

Average expense to use it

Honest disclosure: Google's GA announcement text does not list specific per-token or per-call prices for the Interactions API. Costs flow from two places: (1) the underlying Gemini model usage, and (2) Managed Agent sandbox compute for code execution and browsing. Budget for both — I've seen teams underestimate the sandbox line item badly.

Free tier: Google AI Studio has historically offered free experimentation tiers — confirm current limits in AI Studio.
Model usage: billed per the chosen Gemini model on the official pricing page.
Agent/sandbox compute: background execution and Managed Agents consume additional compute — budget for it as a separate line item, not a rounding error.
Total cost of ownership win: the real saving is the orchestration infrastructure you no longer host — for many small teams that's actually the largest cost the API removes.

Industry impact — who wins, who loses

Winners: Gemini-native startups and platform teams who can now ship agents without an infra org; n8n-style automation vendors who get a richer native node; and Google, which moves the developer center of gravity toward its primary API.

Pressured: framework-layer tools whose primary value was state and tool orchestration — they must pivot harder toward provider-agnostic value. As Google DeepMind normalizes server-side coordination, the 'glue code framework' thins out. That's not speculation — it's the same pattern we saw when managed Kubernetes ate the custom scheduler market.

The 2025 AI stack was 20% model and 80% plumbing. The 2026 stack is API-native coordination — and the companies still maintaining their own plumbing are paying a tax their competitors just stopped paying.

Reactions — what the community is saying

The announcement was authored by Philipp Schmid (Developer Relations Engineer, Google DeepMind) and Ali Çevik (Group Product Manager, Google DeepMind), who frame the Interactions API as 'developers' favorite way to build applications with Gemini' since the December 2025 beta. That framing echoes the broader analyst read from Gartner that orchestration and integration overhead, not raw model capability, is the dominant blocker for enterprise AI projects — exactly the surface this API removes. Expect rapid uptake in developer communities given Google's move to default all documentation to it. For the canonical reference, follow the official announcement and the Gemini developer docs — not third-party summaries, including this one, for anything schema-critical.

[
▶

Watch on YouTube
Google Interactions API & Gemini agents — developer walkthroughs
Google DeepMind • Gemini agent architecture

](https://www.youtube.com/results?search_query=Google+Interactions+API+Gemini+agents)

What forward-looking AI leads are doing this quarter: auditing custom orchestration code against the Interactions API to find what they can safely delete.

What happens next — roadmap and predictions

2026 H2


  **Gemini Omni ships into the Interactions API**

Google explicitly labels Gemini Omni as 'soon.' Its arrival will fold richer multimodal generation into the same unified endpoint — but don't hold a launch date for it.

2026 H2


  **Third-party SDKs default to the Interactions API**

Google states it is 'working with ecosystem partners to make it the default interface across 3P SDKs and Libraries' — expect wrapper updates across the tooling ecosystem. When the docs change, the gravity shifts.

2027


  **Framework layer consolidates around provider-agnostic orchestration**

As all major providers absorb coordination into their APIs, tools like LangGraph double down on cross-provider routing — the one job APIs can't own.

The trajectory of the Interactions API: from GA today to Gemini Omni and ecosystem-default status — the coordination layer is consolidating fast.

Frequently Asked Questions

What is the Interactions API and why does this AI technology matter?

The Interactions API is the AI technology Google made generally available on June 26, 2026 as the primary interface for Gemini models and agents. It unifies inference, autonomous agents, server-side state, background execution and multimodal generation behind one endpoint. It matters because it absorbs the orchestration burden teams previously hand-built — pass a model ID for inference, an agent ID for autonomous tasks, or set background=True for long-running work. See the official announcement for the canonical reference, and our enterprise AI guide for production context.

Should I migrate to the Interactions API now, or wait?

Migrate now if you are Gemini-native, running production agents, and currently maintaining your own state store, tool loop, sandbox or job queue — the schema is GA-stable and locked, so the main reason to wait is gone. Wait, or stay hybrid, if you route across multiple providers, have strict data-residency requirements that forbid a server-side sandbox, or have a launch dependent on Gemini Omni (still marked 'soon'). My rule: pilot it on one non-critical agent first, keep a provider-agnostic seam via LangGraph, and only then delete custom orchestration code. See our orchestration guide for a migration checklist.

How much does the Interactions API cost compared to direct Gemini API calls?

Google's GA announcement does not publish a separate price for the Interactions API itself — you pay for the underlying Gemini model usage exactly as you would on a direct call, plus any Managed Agent sandbox compute consumed for code execution and browsing. So a plain inference request via a model ID should cost effectively the same as a direct Gemini call, while an agent task adds sandbox compute as a distinct line item. The genuine saving is the orchestration infrastructure you no longer host. Always confirm live figures on the official Gemini pricing page rather than third-party estimates, and budget sandbox compute separately — teams routinely underestimate it.

Does background=True work with all Gemini models and agents?

Background execution is a flag on the unified endpoint that runs the interaction asynchronously on Google's servers, and it is most valuable on long-running agent tasks — research, code generation, extended browsing — rather than short single-shot inference, where synchronous calls are simpler. Because the GA announcement does not enumerate a per-model support matrix, you should verify exact model and agent compatibility against the official Gemini developer docs before architecting around it. The practical pattern: default long agent tasks to background=True and poll the job ID, and keep deterministic low-latency single inference calls synchronous. This eliminates the blocking-timeout failure mode that silently duplicates work in production.

How does the Interactions API compare to AWS Bedrock Agents and Vertex AI Agent Builder?

All three absorb orchestration into managed infrastructure, but at different layers. The Interactions API is a raw, single-endpoint primitive for Gemini with server-side state, a managed Linux sandbox and background execution. AWS Bedrock Agents is the closest direct analogue — it also runs a managed code-interpreter sandbox and session state, but over the Bedrock model catalog rather than Gemini. Vertex AI Agent Builder sits one layer higher as Google's lower-code agent surface and can build on the same coordination foundations. For provider-agnostic control across Anthropic, OpenAI and Google, only LangGraph stays unlocked.

When should I NOT migrate to the Interactions API?

Skip it — or wrap it as one swappable node rather than your whole stack — in four cases. First, strict on-prem or data-residency rules: the Managed Agent sandbox runs server-side on Google's infrastructure, so regulated data may not be allowed to leave your boundary; keep that workflow automation on tooling you control. Second, genuine multi-provider routing across Gemini, Claude and GPT — use LangGraph as the outer orchestrator. Third, deterministic low-latency single inference, where a direct model endpoint is leaner. Fourth, no-code business automation, where n8n with an Interactions API node fits better. The deciding question is always whether you can tolerate Gemini-only lock-in for that workload.

What is MCP in AI?

MCP — the Model Context Protocol — is an open standard originated by Anthropic that lets AI models connect to tools and data sources through one consistent interface. The payoff is that you expose a tool once as an MCP server and any compatible model can call it, instead of writing bespoke integrations per provider. In the Interactions API era this becomes a portability hedge: as Google's tool combination and Managed Agents mature, MCP-compatible tools keep your capabilities movable across Gemini, Claude and GPT. For teams worried about lock-in, building tools as MCP servers is the cleanest way to decouple your AI agents from any single provider's implementation.

About the Author

Rushil Shah

AI Systems Builder & Founder, Twarx

Rushil Shah is the founder of Twarx and an AI systems builder who has spent years designing autonomous workflows, multi-agent architectures, and AI-powered business tools. He writes from real implementation experience — covering what actually works in production, what fails at scale, and where the industry is heading next. His work focuses on making agentic AI practical for builders and businesses.

LinkedIn · Full Profile

This article was originally published on Twarx. Follow for daily deep dives on AI agents and automation.

DEV Community

Google Interactions API: The AI Technology Ending Agent Glue Code

Overview: most AI technology workflows are solving the wrong problem entirely

The AI Coordination Gap

What was announced — the exact facts

What is it: a plain-English explanation

How the Interactions API changes AI technology coordination — the mechanism and architecture

Complete capability list — everything it can do

How to access and use it — step by step

Illustrative pseudocode based on Google's described interface.

Verify exact method names against the official Interactions API docs.

1) Plain inference: pass a model ID

2) Autonomous agent task with background execution

3) Retrieve the completed background interaction later

AI technology migration: when to use it (and when NOT to)

Head-to-head comparison vs the closest competitors

What it means for small businesses

War stories: four coordination failures I have watched cost real money

Who are its prime users

How to use it: a worked demonstration

INPUT

STEP: agent provisions a Linux sandbox, browses each page,

runs parsing code, assembles the table, persists state.

Good practices and common pitfalls

The AI Coordination Gap

Average expense to use it

Industry impact — who wins, who loses

Reactions — what the community is saying

What happens next — roadmap and predictions

Frequently Asked Questions

What is the Interactions API and why does this AI technology matter?

Should I migrate to the Interactions API now, or wait?

How much does the Interactions API cost compared to direct Gemini API calls?

Does background=True work with all Gemini models and agents?

How does the Interactions API compare to AWS Bedrock Agents and Vertex AI Agent Builder?

When should I NOT migrate to the Interactions API?

What is MCP in AI?

About the Author

Top comments (0)