Abdelrahman Farag

Posted on May 22

Antigravity 2.0 CLI Through the Eyes of a DevOps Engineer — What Google I/O 2026 Means for Infrastructure Automation

#devchallenge #googleiochallenge #ai #devops

This is a submission for the Google I/O Writing Challenge

Google I/O 2026 was, by any measure, an AI-agents showcase. Gemini 3.5 Flash, Gemini Omni, intelligent eyewear — the keynote was packed. But the announcement that made me sit up straight wasn't a model release or a consumer gadget. It was a developer tool: Antigravity 2.0, and specifically its new CLI.

I'm a Cloud and DevOps engineer. I spend my days inside AWS — writing Ansible playbooks, rotating IAM keys, remediating CVEs across fleets of EC2 instances, debugging Lambda permissions, and wiring up WAF logging compliance. My terminal is my office. So when Google announced a CLI that can orchestrate autonomous coding agents from the command line, integrate into CI/CD pipelines, and share the same agent harness as the full desktop IDE — it landed differently for me than it might for a frontend developer watching the keynote for Flutter updates.

This article is my honest first-look assessment: what Antigravity CLI actually offers, why it matters for infrastructure work, where the gaps are, and whether it changes anything for engineers already deep in the AWS ecosystem.

What Google Actually Shipped

Let me break down what Antigravity 2.0 is, because the branding is doing a lot of heavy lifting.

The original Antigravity launched in late 2025 as an AI-first IDE — essentially a VS Code fork with Gemini baked in. Version 2.0, announced at I/O on May 19, is no longer a single product. It's now a platform with five surfaces:

Antigravity 2.0 Desktop — a standalone app (not the old IDE) built around agent orchestration, with dynamic subagents, scheduled tasks, and voice commands
Antigravity CLI — a terminal-native tool rewritten in Go, designed for automation and CI pipeline integration
Antigravity SDK — programmatic access to the agent harness for custom agent behaviors on your own infrastructure
Managed Agents in the Gemini API — server-side agent execution in isolated Linux environments, billed per run
Gemini Enterprise Agent Platform — the enterprise deployment path through Google Cloud

The CLI replaces the older Gemini CLI (Google is encouraging migration), and it preserves key capabilities: Agent Skills, Hooks, Subagents, and Extensions (now called plugins).

Underneath it all, the default model is Gemini 3.5 Flash, which scored 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA (a real-world agentic benchmark), and 83.6% on MCP Atlas for tool-use reliability. Google claims it is 4x faster than comparable frontier models, often at less than half the cost ($1.50 per million input tokens, $9.00 per million output tokens).

Why This Matters for DevOps — Not Just App Development

Most coverage of Antigravity 2.0 focuses on app development: scaffolding a React app, generating unit tests, vibe-coding a web UI. Fair enough — that's who Google was pitching to in the keynote. But the architectural decisions here have deeper implications for infrastructure engineers.

The CLI Is Built for Pipelines, Not Just Humans

The fact that the CLI shares the same agent harness as the desktop app is the key detail. This means agents running in your terminal have the same capabilities as agents in the GUI — subagent spawning, parallel execution, tool use — but wrapped in a surface designed for scripted automation.

For a DevOps engineer, this immediately suggests use cases:

CVE remediation workflows: Imagine an agent that reads a vulnerability scan report, identifies affected packages across your fleet, generates the patching commands per OS (AL2, AL2023, Ubuntu, and RHEL each need a different approach), and stages the Ansible playbook. Today I do this manually: cross-referencing CVE databases, checking package managers, testing in staging. An agentic CLI that can spawn subagents per OS family and work in parallel could compress a multi-day task into hours.
IAM policy auditing: Feed the agent your CloudTrail logs and current IAM policies, ask it to identify over-permissioned roles and generate least-privilege replacements. The parallel subagent architecture means it could analyze multiple roles simultaneously.
Incident investigation: When a production batch job fails — say, a missing file that should have been generated by an upstream process — the agent could trace the execution path across CloudWatch logs, check cron configurations, and identify the parameter mismatch. This is the kind of multi-step, multi-source investigation that eats up entire afternoons.
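To make the per-OS split in the CVE remediation case concrete, here is a minimal Python sketch of the planning step an agent (or a human) would perform before staging a playbook. It has nothing to do with the Antigravity CLI itself. The host inventory format, the `os_family` keys, and the function name are all assumptions made for illustration; only the package-manager commands are standard.

```python
from collections import defaultdict

# Package-manager command per OS family. The family keys are hypothetical
# labels for this sketch; map them to however your inventory tags hosts.
PATCH_COMMANDS = {
    "al2": "yum update -y {pkg}",
    "al2023": "dnf upgrade -y {pkg}",
    "ubuntu": "apt-get install --only-upgrade -y {pkg}",
    "rhel": "yum update -y {pkg}",
}


def plan_patches(hosts, package):
    """Group hosts by OS family and return one patch command per family."""
    by_family = defaultdict(list)
    for host in hosts:
        by_family[host["os_family"]].append(host["name"])

    plan = {}
    for family, names in by_family.items():
        template = PATCH_COMMANDS.get(family)
        # Unknown families get no command: they are surfaced for manual
        # review instead of being guessed at.
        command = template.format(pkg=package) if template else None
        plan[family] = {"hosts": sorted(names), "command": command}
    return plan
```

Each entry in the returned plan is an independent unit of work, which is the shape that would map naturally onto one subagent per OS family.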

Scheduled Tasks Turn Agents Into Persistent Automation

Antigravity 2.0 introduces scheduled tasks — agents that run on cron schedules or at fixed times, without manual prompting. This moves AI agents from "interactive assistant" to "background automation," which is exactly the abstraction DevOps engineers already think in.

Imagine scheduling an agent to run nightly that checks your ECR image scan results, compares against your vulnerability thresholds, and opens tickets (or even PRs) for anything that needs attention. That's not science fiction — that's a cron job with an LLM in the loop.
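The deterministic half of that nightly job is small. The sketch below shows only the threshold comparison, written as plain Python with no AWS calls. It assumes each image's counts arrive as a severity-to-count mapping, the shape of the `findingSeverityCounts` field that ECR's DescribeImageScanFindings returns. The threshold values and function names are hypothetical.

```python
# Maximum tolerated findings per severity. These numbers are hypothetical;
# take the real ones from your own vulnerability policy.
THRESHOLDS = {"CRITICAL": 0, "HIGH": 0, "MEDIUM": 10}


def breaches(severity_counts, thresholds=THRESHOLDS):
    """Return {severity: (count, limit)} for every severity over its limit."""
    over = {}
    for severity, limit in thresholds.items():
        count = severity_counts.get(severity, 0)
        if count > limit:
            over[severity] = (count, limit)
    return over


def nightly_report(images):
    """Map image name -> breaches, keeping only images that need attention."""
    report = {}
    for name, counts in images.items():
        found = breaches(counts)
        if found:
            report[name] = found
    return report
```

Keeping this check as ordinary code and handing the agent only the resulting report is one way to limit what the LLM is trusted to decide: the pass/fail logic stays reviewable, and the agent drafts the ticket or PR.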

The Honest Assessment: Where I'm Skeptical

Here's where I stop being excited and start being a DevOps engineer.

Trust and Auditability

In my current work, every change to production infrastructure goes through a documented, auditable process. When I remediate a penetration test finding — say, removing a webshell from a Tomcat server — I document exactly what I did, why, and what the before/after state looks like. The output is a formal Word document in a specific security template.

Agentic AI that "independently navigates complex tasks" is exactly the kind of thing that makes security auditors nervous. The fundamental question isn't "can the agent do it?" but "can I prove what the agent did, and why, and that nothing else was touched?" Google mentions "hardened Git policies" and "credential masking," but the audit trail story for agent-executed infrastructure changes is still immature across the industry, not just for Google.
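One way to approach the "nothing else was touched" half of that question is a before/after digest comparison around the agent's run. The following is a minimal sketch of that general technique, not a feature of Antigravity or any other tool. It works on an in-memory mapping of path to file content so it stays self-contained; the function names and the allow-list idea are assumptions for illustration.

```python
import hashlib


def snapshot(files):
    """Map each path to the SHA-256 hex digest of its content (bytes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_snapshots(before, after):
    """Classify paths as added, removed, or changed between two snapshots."""
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            p for p in before.keys() & after.keys() if before[p] != after[p]
        ),
    }


def unexpected_changes(diff, allowed):
    """Return every touched path that the change request did not authorize."""
    touched = diff["added"] + diff["removed"] + diff["changed"]
    return sorted(p for p in touched if p not in allowed)
```

A non-empty result from `unexpected_changes` is the evidence an auditor would want flagged. It answers what was touched, but not why, which still has to come from the agent's own logs.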

The AWS-Shaped Elephant in the Room

Antigravity is deeply integrated into Google's ecosystem: Google AI Studio, Firebase, Cloud Run, Android Studio. For someone already working in Google Cloud, the idea-to-production flow is cohesive.

But I work in AWS. My infrastructure is EC2, Lambda, S3, RDS, CloudWatch, WAF. My CI/CD runs through GitLab. My configuration management is Ansible via SSM.

The Antigravity CLI is model-agnostic to a degree (it supports Claude Sonnet 4.5 and GPT-OSS alongside Gemini), but the integrations are Google-first. There's no native connection to AWS CloudFormation, no SSM integration, and no AWS-specific tooling for IAM policies out of the box.

This isn't a dealbreaker. The SDK surface and plugin architecture could theoretically be extended to AWS tooling. But "could theoretically" and "works today in production" are separated by a lot of engineering effort. For now, the value proposition is strongest for teams already in or willing to move into Google's orbit.

Cost at Scale

The new AI Ultra plan is $100/month with 5x the Pro limits. Ultra Premium is $200/month with 20x limits. Managed Agents are billed per run, not per token, and Google warns that long-running agents can get expensive.

For a solo DevOps engineer or small team, the math works if the agent saves even a few hours per week. But for an enterprise running dozens of agents across multiple pipelines? The cost model needs careful evaluation, especially since Managed Agents abstract away the token-level granularity that lets you optimize spend.
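For workloads that are billed per token, the evaluation is simple arithmetic. Here is a small sketch using the Gemini 3.5 Flash prices quoted earlier ($1.50 input, $9.00 output per million tokens). The token counts and run frequency in the usage note are hypothetical, and none of this applies to per-run Managed Agents pricing or to the flat subscription plans.

```python
INPUT_USD_PER_MTOK = 1.50   # per million input tokens, as quoted earlier
OUTPUT_USD_PER_MTOK = 9.00  # per million output tokens, as quoted earlier


def run_cost(input_tokens, output_tokens):
    """Token cost in USD of a single agent run."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def monthly_cost(runs_per_day, input_tokens, output_tokens, days=30):
    """Token cost in USD of repeating the same run every day for `days` days."""
    return runs_per_day * days * run_cost(input_tokens, output_tokens)
```

As an illustration, a run that consumes 200,000 input tokens and produces 20,000 output tokens costs about $0.48. Forty such runs a day across pipelines comes to roughly $576 over 30 days, which is the kind of number worth knowing before the agents are scheduled rather than after.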

What I'd Actually Try First

If I were to integrate Antigravity CLI into my workflow today, I'd start narrow:

Documentation generation: I already spend significant time writing remediation documents — describing findings, listing steps taken, documenting before/after states. An agent that watches my terminal session, understands the context from the CVE or pentest finding, and drafts the compliance document would be immediately valuable and low-risk (I'd review everything before submission).
Policy analysis: Feed it an IAM policy JSON and ask it to identify violations of least-privilege principles, cross-referenced against actual CloudTrail usage. This is read-only analysis, so a mistake costs review time rather than an outage.
Runbook generation: Convert my mental models for incident response into structured runbooks. The agent can ask clarifying questions, identify gaps, and produce something that a junior engineer could actually follow.
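The policy analysis item can be partly checked with plain code, which also gives a baseline for judging an agent's output. The sketch below flags wildcard actions, wildcard resources, and granted actions that never appear in observed usage. It assumes the set of used actions has already been extracted from CloudTrail (that extraction is not shown), and the function names are hypothetical. It is a deliberately naive check: it ignores `NotAction`, conditions, and resource-level scoping.

```python
import json


def _as_list(value):
    """IAM allows a single value or a list in most fields; normalize to a list."""
    return value if isinstance(value, list) else [value]


def audit_policy(policy_json, used_actions):
    """Return (statement_index, item, reason) tuples for suspicious grants."""
    # IAM action names are case-insensitive, so compare in lowercase.
    used = {action.lower() for action in used_actions}
    policy = json.loads(policy_json)
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            if "*" in action:
                findings.append((index, action, "wildcard action"))
            elif action.lower() not in used:
                findings.append((index, action, "granted but not observed in use"))
        if "*" in _as_list(stmt.get("Resource", [])):
            findings.append((index, "*", "wildcard resource"))
    return findings
```

Anything the agent reports beyond these mechanical findings is where its judgment is actually being exercised, and where the human review should concentrate.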

Notice what's missing: I wouldn't let it execute infrastructure changes unsupervised. Not yet. Maybe not for a long time. The agent is most valuable to me as an accelerator for the cognitive work around infrastructure — the analysis, the documentation, the planning — rather than as an autonomous executor.

The Bigger Picture: Agents Are Coming to Infrastructure Whether We're Ready or Not

Google isn't the only one here. Anthropic has Claude Code. Amazon has its own agentic coding tools. The entire industry is converging on the idea that AI agents should be able to plan, code, test, and deploy software with minimal human intervention.

For DevOps engineers, this creates an interesting tension. Our entire discipline exists because "move fast and break things" doesn't work when the things you break are production databases and customer-facing services. We introduced CI/CD, infrastructure as code, and policy-as-code precisely to add guardrails to the deployment process.

Now the industry wants to put AI agents inside those guardrails. The question isn't whether this will happen — it will. The question is whether the tooling will mature fast enough to earn the trust of the people responsible for keeping systems running.

Antigravity 2.0 is a serious step. The CLI-first surface, the parallel agent architecture, the scheduled task capability, and the enterprise deployment path show that Google is thinking about production workflows, not just demos. But the integration depth outside Google's ecosystem, the audit story, and the cost model all need more time.

My Verdict

Antigravity 2.0 CLI is the most DevOps-relevant announcement from Google I/O 2026. Not because it solves my problems today, but because it is one of the first agent orchestration tools from a major platform that speaks the language of infrastructure automation: terminals, pipelines, scheduled execution, and programmatic SDKs.

If you're already in Google Cloud, this is worth trying immediately. If you're in AWS like me, keep a close eye on the SDK and plugin ecosystem — the architecture is right, even if the integrations aren't there yet.

The era of agentic DevOps is arriving. The engineers who learn to supervise AI agents effectively — knowing when to delegate and when to intervene — will have a significant advantage. Antigravity 2.0 is one of the first tools purpose-built for that transition.

Abdelrahman Farag is an AWS Cloud & DevOps Engineer at Sopra Steria and an MSc candidate in AI Research at UIMP. He works across cloud infrastructure, security remediation, and applied AI.
