Arthur

Posted on May 24

Antigravity 2.0 in one day: the four shells and what each is good for

#devchallenge #googleiochallenge #gemini #agents

This is a submission for the Google I/O Writing Challenge

A field guide to the four ways Antigravity 2.0 lets you drive an AI coding agent, with a 10-minute SDK recipe for writing your first skill.

The architectural news worth keeping

The most useful sentence Google I/O 2026 produced was not on the main stage. It was Kevin Howe, in the Google Cloud Live segment afterward, defining the term that had been circling the keynote all morning. Asked what a harness actually is, Howe gave the answer in two beats. The model first: "LLMs are really just tokens in, tokens out." Then the layer around the model: a harness wraps the tokens-in/tokens-out core and gives the agent senses (codebase state, filesystem, environment signals) and limbs (the tools the agent can call). The set of primitives a complex task decomposes into is essentially what defines a harness.

That single framing reorganizes the entire 2.0 announcement. On the surface, Google shipped a new AI IDE. Underneath, Google did something more interesting: it consolidated one agent execution layer and exposed it through four interchangeable shells. Same harness, same tools, same skill format. Different chrome.
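Howe's framing is easier to hold onto with a toy in hand. The sketch below is not Antigravity's harness; `toy_model`, `TOOLS`, and `run_harness` are hypothetical names, and the "model" is a hard-coded function. It only shows the shape: a tokens-in/tokens-out core, a tool table for limbs, and observations fed back in as senses.

```python
from typing import Callable

def toy_model(transcript: str) -> str:
    """Hypothetical stand-in for an LLM: text in, text out."""
    # Ask for a directory listing first; answer once an observation exists.
    if "OBSERVATION" not in transcript:
        return "CALL list_directory ."
    return "DONE saw " + transcript.rsplit("OBSERVATION ", 1)[1]

# "Limbs": the tools this toy harness is willing to execute.
TOOLS: dict[str, Callable[[str], str]] = {
    # Canned result so the sketch does no real I/O.
    "list_directory": lambda path: "main.py README.md",
}

def run_harness(task: str, model: Callable[[str], str], max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it names, feed the result back."""
    transcript = task
    for _ in range(max_steps):
        reply = model(transcript)
        if reply.startswith("DONE "):
            return reply[len("DONE "):]
        _, tool_name, arg = reply.split(" ", 2)
        observation = TOOLS[tool_name](arg)           # limb: act on the world
        transcript += f"\nOBSERVATION {observation}"  # sense: report back
    raise RuntimeError("step budget exhausted")

result = run_harness("Summarize the repo.", toy_model)
print(result)
```

Swap the chrome around `run_harness` and nothing inside the loop changes, which is the point of the four-shell design as I read it.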

This is a field guide to those four shells, with one shell taken end to end so you can run it tonight.

The four shells, briefly

One harness, four shapes. The choice between them is a workflow question, not a power question.

Antigravity editor. The standalone IDE familiar to existing users. Best when you want source code on screen most of the time and traditional file diffs in your review loop.
Antigravity 2.0 (the Manager). A new Electron desktop app, built around conversations and agent artifacts. Best when you are juggling three to five agents at once on separate git worktrees and want an agent-inbox view, not a code editor view.
Antigravity CLI (agy). Terminal-first, scriptable, lives nicely in SSH sessions to GPU boxes and CI runners. Authenticates via Google Cloud OAuth. The unified successor to Gemini CLI: available to all Gemini CLI users at I/O, with published migration guides.
Antigravity SDK (pip install google-antigravity). A Python package that drops the same harness into your own scripts and apps. Takes a plain GEMINI_API_KEY from Google AI Studio. The fastest path to programmatic control of the harness.

A compressed picker:

If your workflow is...	Reach for
One agent, one repo, see diffs	Editor
Many agents in parallel	Manager 2.0
SSH, GPU boxes, dotfiles sacred	CLI
Embed agents in your own product	SDK

The SDK is the shell I want to walk you through, because it is the one you can install and have producing real output in ten minutes, with no desktop session and no OAuth round-trip.

Why skills are the unlock

The single most important primitive in Antigravity 2.0 is not a tool, a model, or a UI. It is the skill: a Markdown file that tells the agent how to do a category of work. The Antigravity harness reads skills lazily, on demand, so you can stash dozens in a directory and the agent picks the relevant one when it needs it.
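To make "lazily, on demand" concrete, here is a minimal sketch of the idea in plain Python. I do not know how the Antigravity harness implements this internally; `SkillIndex` and its methods are hypothetical names. The sketch reads only the first heading of each Markdown file up front and defers reading the body until a skill is actually requested.

```python
import tempfile
from pathlib import Path

class SkillIndex:
    """Index skill files by first heading; read bodies only on request."""

    def __init__(self, skills_dir: Path) -> None:
        self._paths: dict[str, Path] = {}
        self.loaded: list[str] = []  # titles whose bodies were actually read
        for path in sorted(skills_dir.glob("*.md")):
            with path.open(encoding="utf-8") as handle:
                first_line = handle.readline()  # cheap: one line per file
            self._paths[first_line.lstrip("#").strip().lower()] = path

    def titles(self) -> list[str]:
        return list(self._paths)

    def load(self, title: str) -> str:
        """Read the full skill body the first time it is needed."""
        key = title.lower()
        self.loaded.append(key)
        return self._paths[key].read_text(encoding="utf-8")

# Demo with two throwaway skill files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "triage.md").write_text(
        "# Directory triage skill\n\nRead-only.\n", encoding="utf-8")
    (root / "release.md").write_text(
        "# Release notes skill\n\nSummarize commits.\n", encoding="utf-8")
    index = SkillIndex(root)
    available = index.titles()
    body = index.load("Directory triage skill")
    loaded = list(index.loaded)

print(available, loaded)
```

The cost of stashing dozens of skills is then one short read per file, and only the skill that matches the task pays for its full text.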

Here is the skill I used for the rest of this guide. Thirty lines of Markdown. Save it as AGENTS.md inside a skills directory (./skills/AGENTS.md in the recipe later in this guide), which is where the SDK snippet looks for it.

# Directory triage skill

You are a directory-triage assistant. When asked to summarize or improve
a directory, follow this procedure exactly.

## Inputs

- Working directory: assume CWD unless told otherwise.
- Skip: `node_modules/`, `.git/`, `dist/`, `build/`, `.venv/`, anything
  in `.gitignore`.

## Steps

1. List the directory tree to depth 2 with `list_directory`.
2. Read `README.md` if present; otherwise read the first `*.md` you find.
3. Read `package.json` / `pyproject.toml` / `go.mod` if present.
4. Produce a 3-paragraph summary covering, in order: purpose, stack,
   current state.
5. Propose exactly three concrete improvements, each as one sentence
   stating *why* and *which file(s) would change*.

## Constraints

- This skill is read-only: do not edit any files.
- Do not invoke `run_command` for anything other than `git status` or
  `git log -5`.
- If `AGENTS.md` exists in a subdirectory, prefer its instructions for
  that subtree.
- Cite this skill by name ("directory triage skill") at the top of
  your final summary so a reviewer can confirm you used it.

That is the whole skill. There is no framework. There is no DSL. The "language" is Markdown plus the names of the harness's built-in tools (list_directory, view_file, run_command, and friends), which the agent already knows.
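The Constraints section asks the agent to keep run_command to `git status` and `git log -5`. If you wrap the agent in your own tooling and want the same rule as a hard check, an allowlist is one way to express it. This is a sketch of that idea only: `ALLOWED_COMMANDS` and `is_allowed` are hypothetical names, and nothing here is part of the SDK.

```python
import shlex

# The two commands the skill's Constraints section permits.
ALLOWED_COMMANDS = (
    ["git", "status"],
    ["git", "log", "-5"],
)

def is_allowed(command: str) -> bool:
    """Return True only when the command tokenizes to an allowlisted argv."""
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar malformed input
        return False
    # Exact argv match: extra flags or chained commands do not pass.
    return argv in ALLOWED_COMMANDS

for candidate in ("git status", "git log -5", "git log -50", "git status; rm -rf ."):
    print(f"{candidate!r}: {is_allowed(candidate)}")
```

Comparing the tokenized argv, not the raw string, means odd spacing still passes while anything appended to the command does not.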

The SDK loads the skill via LocalAgentConfig.skills_paths. The minimum runnable Python is short:

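import asyncio
from google import antigravity as ag

async def main():
    cfg = ag.LocalAgentConfig(
        skills_paths=["./skills"],   # directory that holds the skill Markdown
        workspaces=["./tiny-rss"],   # project directory the agent triages
        system_instructions="You are a careful staff engineer.",
    )
    # The agent session is scoped to this async context manager.
    async with ag.Agent(cfg) as agent:
        # Naming the skill in the prompt points the agent at it directly.
        resp = await agent.chat(
            "Triage the workspace using the directory triage skill."
        )
        print(await resp.text())

asyncio.run(main())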

I pointed this at a small Mattermost RSS bridge of mine called tiny-rss (Python 3.10, feedparser plus httpx, an infinite loop with a 10-minute sleep). The agent read the three source files, ran git status, cited the directory triage skill by name at the top of its summary, and returned exactly three improvements with file anchors:

Persistent deduplication. Move the in-memory seen set to SQLite or a JSON file so restarts do not reprocess every feed item. Touches main.py.
Error handling and retries. Wrap the httpx.post calls in try/except with exponential backoff so a flaky Mattermost endpoint does not stall the loop. Touches main.py.
Configuration validation. Parse and validate env vars at startup, fail loudly on missing MATTERMOST_WEBHOOK_URL. Touches main.py and possibly pyproject.toml.

Useful, file-anchored, opinionated. Each improvement was something I would actually merge.

The cost ledger from two runs of the same prompt, against the same skill and target, captures the harness's character:

Run 1: 28 tool calls, 230,496 tokens, 119 seconds wall.
Run 2: 21 tool calls, 128,281 tokens, 99 seconds wall.

Two runs, same artifact shape, different exploration depth. That is the price of giving the agent latitude to discover the project on its own, and it is a price worth paying once you see what the agent does with what it learns. Think of those tokens as tuition: you are paying the agent to understand your code so its three suggestions are about your main.py, not a generic RSS bridge.
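For a sense of scale, the arithmetic on those two ledger lines is below. The numbers are the ones reported above; `Run` and `percent_drop` are just names for this sketch. Run 2 comes out at 25% fewer tool calls, about 44% fewer tokens, and about 17% less wall time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Run:
    tool_calls: int
    tokens: int
    wall_seconds: int

# Figures copied from the two runs reported in this post.
run_1 = Run(tool_calls=28, tokens=230_496, wall_seconds=119)
run_2 = Run(tool_calls=21, tokens=128_281, wall_seconds=99)

def percent_drop(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100

for name in ("tool_calls", "tokens", "wall_seconds"):
    before, after = getattr(run_1, name), getattr(run_2, name)
    print(f"{name}: {percent_drop(before, after):.1f}% lower in run 2")

# Average context cost of each tool call in the two runs.
per_call_1 = run_1.tokens / run_1.tool_calls
per_call_2 = run_2.tokens / run_2.tool_calls
print(f"tokens per tool call: {per_call_1:.0f} vs {per_call_2:.0f}")
```

Tokens fall faster than tool calls do, so the cheaper run was not only making fewer calls but spending less per call.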

Two tips before you sit down at the keyboard

The CLI and the SDK want different credentials, and that is the one piece of friction worth flagging up front. The CLI authenticates with Google Cloud OAuth (it calls Google's internal code-assist backend, the same one the editor and Manager use), so reach for it when you are already signed into Google Cloud at a workstation. The SDK takes a plain GEMINI_API_KEY from Google AI Studio, so reach for it when you want CI, headless servers, or self-hosted automation.

The second tip: skills are discovered at runtime, not preloaded into the system prompt. A single hint in your user prompt ("Use the directory triage skill in AGENTS.md") cuts the exploration overhead and keeps your token spend predictable; the gap between Run 1 and Run 2 above is the kind of difference at stake.

Try it tonight, a 10-minute recipe

If you have ten minutes and a small Python project lying around, you can have the harness producing real output before the kettle boils.

1. Install in a clean venv.

   python3 -m venv .venv
   .venv/bin/pip install google-antigravity

   This pulls the SDK, the transitive google-genai client, and the harness binary that does the actual tool execution.

2. Set your key. Grab a GEMINI_API_KEY from aistudio.google.com, then export GEMINI_API_KEY=... in the same shell.
3. Save the skill. Drop the 30-line AGENTS.md from the previous section into ./skills/AGENTS.md next to your project directory.
4. Run the snippet above (asyncio.run(main())), pointing workspaces= at any small project you know well. Watch the agent walk the tree, read the README and pyproject.toml, and produce its three improvements.
5. Read the trace, then tweak. The SDK streams every tool call as JSON; pipe stdout through tee run.jsonl if you want to keep it. Tighten the skill (add a ## Output format heading, ask for a fourth improvement, forbid the agent from suggesting tests), and run again. The artifact shape should change in exactly the way you asked.
6. Try Manager 2.0 next if you want the same skill, same artifact, agent-inbox chrome. The same AGENTS.md works there with no changes; the harness underneath is identical.
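The recipe quietly assumes three things are in place before the first run: the key is exported, the skill file exists, and the workspace path points at a real directory. A small preflight script can catch the boring failures before any tokens are spent. This is a stdlib-only sketch; `preflight` is a hypothetical helper of mine, not something the SDK ships, and the default paths are the ones used in this post.

```python
import os
from pathlib import Path

def preflight(skills_dir: Path, workspace: Path, env: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    if not (skills_dir / "AGENTS.md").is_file():
        problems.append(f"no AGENTS.md in {skills_dir}")
    if not workspace.is_dir():
        problems.append(f"workspace {workspace} is not a directory")
    return problems

if __name__ == "__main__":
    # Paths match the snippet earlier in this post; adjust to your layout.
    issues = preflight(Path("./skills"), Path("./tiny-rss"), dict(os.environ))
    for issue in issues:
        print("preflight:", issue)
    if not issues:
        print("preflight: ready")
```

Passing the environment in as a dict keeps the check easy to exercise without touching your real shell state.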

That is the whole loop: install, key, skill, script, read, iterate. You are now writing for the harness, the most interesting layer 2026 has produced.

Closing

The shells separate workflow from the harness underneath, and that is the genuinely good news from I/O 2026. You get to pick an interface that matches how you like to work without giving up portability of your skills, your tools, or your muscle memory. The Markdown skill you write tonight in the SDK will run unchanged in Manager 2.0 tomorrow, and on the CLI on a GPU box next week. That is the consolidation, and it is the part of the announcement that compounds.

The harness is the runtime now. Pick a shell, write a skill, and the rest of 2026 gets easier.

Pick your editor by what feels good. Pick your harness like it is a runtime, because it is.

Which shell are you reaching for first? I would like to compare notes on the skills you write.
