xbill for Google Developer Experts

Posted on Jul 1 • Originally published at Medium on Jul 1

Google Omni Flash Preview with MCP and Antigravity CLI

#apiinteraction #antigravitycli #agents #mcps

This article covers MCP setup and configuration for using the Google Omni Flash Preview model and the underlying Gemini LLM with Antigravity CLI, so that Omni operations can be driven and extended over MCP.

What is Google Omni?

Google Gemini Omni (often referred to simply as Google Omni) is a powerful, multimodal AI video model developed by Google. It allows users to generate, edit, and iterate on video content using any combination of text, images, video, and audio inputs through simple, conversational prompts. [1, 2, 3]

More details are here:

https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni

Gemini Omni

and here:

Start building with Nano Banana 2 Lite and Gemini Omni Flash

But wait, there is more!

Gemini Omni Flash Preview | Gemini Enterprise Agent Platform | Google Cloud Documentation

Antigravity CLI

Antigravity CLI is the successor to Gemini CLI, the terminal-driven, agent-assisted coding tool.

Full details on installing Antigravity CLI are here:

Getting Started with Antigravity CLI

Testing the Antigravity CLI Environment

Once you have all the tools in place, you can test the startup of Antigravity CLI.

You will need to authenticate with a Google Cloud Project or your Google Account:

agy

This will start the interface:

What is MCP?

Unless you have been living off grid without Internet access, you will have heard that MCP is the new universal connector and the next “Big Thing”.

More information is here:

What is Model Context Protocol (MCP)? A guide

Python MCP Documentation

The official GitHub Repo provides samples and documentation for getting started:

GitHub - modelcontextprotocol/python-sdk: The official Python SDK for Model Context Protocol servers and clients

Isn’t MCP overkill?

There are many ways to call the Omni models. Native MCP support in Google Antigravity opens up a new pathway to call the underlying Omni model.

Google Antigravity utilizes the Model Context Protocol (MCP) as a universal, open-standard “bridge” that allows AI agents to securely interact with external tools, APIs, and data sources. Combined with models like Gemini Omni, this integration allows developers to build rich, agentic workflows natively. [1, 2, 3, 4]

Model Context Protocol (MCP) in Antigravity

MCP acts as a standardized communication layer for AI. Instead of needing hard-coded integrations for every data source or platform, MCP allows Antigravity’s AI agents to securely query external environments, ranging from your local files to cloud infrastructure. [1, 2]

Universal Tool Access: Allows your Gemini Omni-powered agent to discover and trigger specific tools (like reading logs, querying databases, or executing scripts) on demand. [1]
Security & Control: Supports both local execution and managed remote MCP servers hosted on Google Cloud, letting you limit access with granular approval gates. [1, 2]
Resource Discovery: Features Agentic Resource Discovery (ARD) so agents know exactly what tools and data are available in your ecosystem. [1]
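
To make the "standardized communication layer" concrete: MCP messages are JSON-RPC 2.0 requests. The sketch below builds the two requests a client sends to discover and invoke tools, `tools/list` and `tools/call`. The tool and argument names come from the omni-video-agent help output later in this article; the request ids and the helper name `jsonrpc_request` are arbitrary illustrations, and a real client such as Antigravity CLI also performs an `initialize` handshake first.

```python
import json
from typing import Any, Optional


def jsonrpc_request(request_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope, the framing MCP uses."""
    message: dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Invoke one tool by name with its arguments.
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {
        "name": "generate_video",
        "arguments": {
            "prompt": "A futuristic data center at golden hour, slow tracking shot",
            "aspect_ratio": "16:9",
            "delivery": "uri",
        },
    },
)

# Over the stdio transport each message travels as a single line of JSON.
for message in (list_tools, call_tool):
    print(json.dumps(message))
```

Seeing the wire format makes it clearer why any MCP-aware client can drive the same server: the client only needs the tool name and its argument schema.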

Integrating Antigravity with Google Omni

Pairing Antigravity with Gemini Omni expands the agent’s core capabilities, allowing it to act on multimodal inputs and execute complex tasks on your behalf. [1, 2, 3]

Enhanced Reasoning & Speed: The Antigravity agent harness combined with Omni models enables asynchronous subagents, allowing the system to delegate parallel background tasks. [1]
Workspaces: Antigravity 2.0 organizes MCP usage around specific projects, meaning you can define a separate MCP configuration (via mcp_config.json) for each workspace directory. [1, 2, 3, 4, 5]
Multimodal Actions: Agents using Gemini Omni can generate visual artifacts, trigger browser testing, and interact seamlessly with developer tools. [1, 2, 3]

So what is the Secret Sauce?

The Interactions API allows for a clean, consistent call to the model.

The Interactions API is Google’s unified interface for building with Gemini AI models and autonomous agents. Rather than using multiple separate endpoints for text chat, tool use, image generation, or background agents, it provides a single, streamlined API to handle complex, stateful workflows. [1, 2, 3]

More information is here:

Interactions API | Gemini API | Google AI for Developers

and here:

Gemini Interactions API | Gemini API | Google AI for Developers

There is A Skill for That!

You can make the magic happen!

Interactions API: our primary interface for Gemini models and agents

npx skills add google-gemini/gemini-skills --skill gemini-interactions-api

Set Up the Basic Environment

At this point you should have a working Python environment and a working Antigravity CLI installation. The next step is to clone the GitHub samples repository with support scripts:

cd ~
git clone https://github.com/xbill9/omni-flash-video-agent

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

cd omni-flash-video-agent
source init.sh

If your session times out or you need to re-authenticate, you can run the set_env.sh script to reset your environment variables:

cd omni-flash-video-agent
source set_env.sh

Variables like PROJECT_ID need to be set for use in the various build scripts, so the set_env.sh script can be used to restore the environment after a timeout.
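
A quick way to confirm the environment before running any build script is to check for the variables the scripts expect. This is a minimal sketch: `PROJECT_ID` is the only variable named in this article, so any other names you pass in are your own assumption about what set_env.sh exports, and `missing_variables` is a hypothetical helper, not part of the repository.

```python
import os
from typing import Iterable, Mapping


def missing_variables(required: Iterable[str], env: Mapping[str, str] = os.environ) -> list:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


# PROJECT_ID comes from the article; extend the list to match your set_env.sh.
missing = missing_variables(["PROJECT_ID"])
if missing:
    print("Missing: " + ", ".join(missing) + " (try: source set_env.sh)")
else:
    print("Environment looks ready.")
```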

Running the Python Code

First, switch to the directory containing the Python version of the MCP sample code and install the dependencies:

xbill@penguin:~/omni-flash-video-agent$ make install
pip install -r requirements.txt

The project can also be linted:

xbill@penguin:~/omni-flash-video-agent$ make lint
ruff check .
All checks passed!
ruff format --check .
4 files already formatted
mypy .
Success: no issues found in 4 source files

The unit tests can be run the same way:

xbill@penguin:~/omni-flash-video-agent$ make test
python test_agent.py
🟢 Imports check passed successfully!
.🟢 Server tools definition check passed successfully!
.
----------------------------------------------------------------------
Ran 2 tests in 0.772s

OK

Antigravity CLI mcp_config.json

A sample MCP server configuration file is provided in the .agents directory:

{
  "mcpServers": {
    "omni-video-agent": {
      "command": "python3",
      "args": [
        "/home/xbill/omni-flash-video-agent/server.py"
      ]
    }
  }
}

Note: The generated config file will have the Gemini Keys in plain text in the file. This is a workaround for an Antigravity CLI limitation. If you are forking the project MAKE SURE TO CLEAN THE KEYS!!!
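
Because the sample config hard-codes an absolute path under /home/xbill, a fork needs its own copy. The sketch below regenerates the same structure for whatever directory the repository was cloned into. `build_mcp_config` is a hypothetical helper, and it deliberately writes no API keys; how your Antigravity CLI installation expects keys to be supplied is outside what this sketch assumes.

```python
import json
from pathlib import Path


def build_mcp_config(repo_dir: Path, server_name: str = "omni-video-agent") -> dict:
    """Return an mcp_config.json structure pointing at server.py in repo_dir."""
    server_path = (repo_dir / "server.py").resolve()
    return {
        "mcpServers": {
            server_name: {
                "command": "python3",
                "args": [str(server_path)],
            }
        }
    }


# Run from the cloned repository so the path resolves to your own checkout.
config = build_mcp_config(Path.cwd())
print(json.dumps(config, indent=2))
```

Printing rather than writing the file keeps the sketch from overwriting an existing configuration; redirect the output to mcp_config.json once it looks right.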

Validation with Antigravity CLI

The final connection test uses Antigravity CLI as an MCP client, with the Python code providing the MCP server:

/mcp list

MCP Servers

Plugins (~/.gemini/config/plugins)
> ✓ google-dev-knowledge Tools: search_documents, answer_query, get_documents
   ✓ omni-video-agent Tools: generate_video, edit_video, animate_image, interpolate_images, generate_with_subjects, +2
                       more

  ### 3. omni-video-agent 

  Provides endpoints for stateful video generation, editing, keyframe interpolation, and image animation using Gemini
  Omni Flash ( gemini-omni-flash-preview ).

  • Lazy-loaded tools:
      • generate_video : Text-to-Video generation.
      • edit_video : Stateful editing of an existing video session using an interaction ID.
      • animate_image : Bring a static image to life with a motion prompt.
      • interpolate_images : Create a transition/timelapse between a start and end image.
      • generate_with_subjects : Video generation maintaining identity consistency from subject reference images.
      • edit_user_video : Upload a standard local video and perform natural language edits or stylization.
      • get_help : Fetch help/instructions for video agent operations.

Omni Video Management via MCP

The MCP tools provide a complete suite of agent-oriented operations for managing the Omni Video calls.

Overview of MCP tools:

● omni-video-agent/get_help(Call get_help MCP tool) (ctrl+o to expand)
  Here is the summary, prompting best practices, and usage guide for the available MCP tools in the Gemini Omni Flash
  Video Agent:

  ### 🛠️ Available MCP Tools

  1. generate_video 
      • Description: Generates an initial video from a text prompt.
      • Parameters:
          • prompt (str): Text description of the video.
          • aspect_ratio (str, default: '16:9' ): '16:9' (landscape) or '9:16' (portrait).
          • delivery (str, default: 'inline' ): 'inline' (base64 bytes) or 'uri' (Google File API delivery).

  2. edit_video 
      • Description: Edits a previously generated video statefully using its interaction ID.
      • Parameters:
          • previous_interaction_id (str): Interaction ID of the video from the previous turn.
          • edit_prompt (str): Natural language description of what to change.
          • delivery (str, default: 'inline' ): 'inline' or 'uri' .

  3. animate_image 
      • Description: Animates a static local image using a motion description.
      • Parameters:
          • image_path (str): Path to the local image file.
          • motion_prompt (str): Instructions on how the image should animate.
          • delivery (str, default: 'inline' ): 'inline' or 'uri' .

  4. interpolate_images 
      • Description: Creates an interpolation transition video between two local keyframe images.
      • Parameters:
          • start_image_path (str): Path to the first image.
          • end_image_path (str): Path to the final image.
          • prompt (str): Instruction detailing the transition (e.g., sunset progression).
          • delivery (str, default: 'inline' ): 'inline' or 'uri' .

  5. generate_with_subjects 
      • Description: Generates a video incorporating specific subjects provided as reference image paths.
      • Parameters:
          • subject_image_paths (list[str]): List of local paths to subject images.
          • prompt (str): Description of the scene and subject actions.
          • delivery (str, default: 'inline' ): 'inline' or 'uri' .

  6. edit_user_video 
      • Description: Uploads a local video using the Gemini File API and edits it with Gemini Omni Flash.
      • Parameters:
          • video_path (str): Path to the local video file to upload and edit.
          • edit_prompt (str): Instruction of what to change in the video.
          • delivery (str, default: 'inline' ): 'inline' or 'uri' .

  7. get_help 
      • Description: Returns this summary, prompting best practices, and usage guide.

  ──────
  ### 📦 Delivery Modes

  • inline : Returns video data embedded as base64. Fast and convenient for small clips (< 4MB).
  • uri : Recommended for larger clips. Uploads and downloads via the Google File API to bypass payload limit issues.
  ──────
  ### 💡 Prompting Best Practices for Cinematic Control

  1. Scene Layout: Describe the environment, subjects, clothing, and spatial arrangement.
  2. Subject Action: Be specific about movement (e.g., "The cat slowly sips its tea, lifting the warm mug with both
  paws").
  3. Camera & Motion: Use cinematic vocabulary (e.g., panning, tracking shot, crane shot, slow zoom, cinematic close-
  up).
  4. Lighting & Mood: Specify lighting conditions (e.g., volumetric lighting, golden hour, cyberpunk neon glow, moody
  shadows).
  5. Style: State the style clearly (e.g., photorealistic 3D render, Pixar animation style, macro photography, flat
  design 2D vector).
  ──────
  ### 🔗 Project Files & Documentation

  • GEMINI.md - Reference and Cheat Sheet for Gemini Omni Flash API and endpoints.
  • README.md - Setup guide, installation instructions, and script running procedures.
  • server.py - Main service backend implementation for all MCP video tools.
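
The help output above translates fairly directly into client-side argument building. The following sketch composes a prompt from the five best-practice elements, validates `aspect_ratio` and `delivery` against the listed values, and picks a delivery mode from an expected clip size. All helper names are hypothetical, and reading the "< 4MB" guidance as 4 * 1024 * 1024 bytes is an assumption; the server may enforce different limits.

```python
from typing import Optional

ASPECT_RATIOS = {"16:9", "9:16"}
DELIVERY_MODES = {"inline", "uri"}
INLINE_LIMIT_BYTES = 4 * 1024 * 1024  # assumed reading of "< 4MB"


def compose_prompt(scene: str, action: str, camera: str = "", lighting: str = "", style: str = "") -> str:
    """Join the non-empty prompt elements, one sentence per element."""
    parts = [scene, action, camera, lighting, style]
    return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


def choose_delivery(expected_bytes: Optional[int]) -> str:
    """Use 'inline' only when the clip is known to be under the assumed limit."""
    if expected_bytes is not None and expected_bytes < INLINE_LIMIT_BYTES:
        return "inline"
    return "uri"


def generate_video_args(prompt: str, aspect_ratio: str = "16:9", delivery: str = "inline") -> dict:
    """Validate and package arguments for the generate_video tool."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if delivery not in DELIVERY_MODES:
        raise ValueError(f"delivery must be one of {sorted(DELIVERY_MODES)}")
    return {"prompt": prompt, "aspect_ratio": aspect_ratio, "delivery": delivery}


prompt = compose_prompt(
    scene="A cat in a knitted sweater sits at a small cafe table",
    action="The cat slowly sips its tea, lifting the warm mug with both paws",
    camera="Slow zoom into a cinematic close-up",
    lighting="Golden hour",
    style="Photorealistic 3D render",
)
# Size is unknown before generation, so this falls back to 'uri'.
args = generate_video_args(prompt, aspect_ratio="9:16", delivery=choose_delivery(None))
print(args)
```

Validating on the client side is optional, since the agent usually fills in these arguments itself, but it gives faster feedback when the tools are called from your own code.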

We have Lift Off!

Finally, the MCP tools are now available for working with the Omni models:

> Generate a video of the Omni Preview model in a futuristic data center

https://www.youtube.com/watch?v=lJutIwddan8

> Generate a video of the Gemma4 model deployed on the NVIDIA L4 GPU

https://www.youtube.com/watch?v=l-n4fHXsqaU

Results:

> Update the video to include the GEMMA12B model on GPU

https://www.youtube.com/watch?v=bDPbqE2g7cE

> use the mcp tool animate_image with omni.jpg

56.4k in | 209 out | 122 thinking
● Bash(find /home/xbill -name "omni.jpg" 2>/dev/null) (ctrl+o to expand)

  I am locating the omni.jpg file on your system to perform the animation. I'll proceed as soon as the search
  completes.

? What motion prompt or animation effect would you like to apply to [omni.jpg](file:///home/xbill/omni-flash-video-
agent/omni.jpg)?
> Bring the subject/character to life with natural movements (blinking, smiling, breathing)

● omni-video-agent/animate_image(Animate image) (ctrl+o to expand)

  The static image omni.jpg has been successfully animated!

  ### Animation Details:

  • Motion Prompt: "Bring the subject/character to life with natural movements (blinking, smiling, breathing), slow
  camera zoom, photorealistic details."
  • Saved Video: animated_1782861971.mp4
  • Interaction ID: v1_ChdrVkJFYXJ5SE8tYXUxTWtQMmM3aDBRNBIXa1ZCRWFyeUhPLWF1MU1rUDJjN2gwUTQ

https://www.youtube.com/watch?v=pTpUZOrWxpI

> interpolate_images omni.jpg edit_local_1782839741_1c2c5bd5.jpg

● Bash(find /home/xbill -name "edit_local_1782839741_1c2c5bd5.jpg" 2>/dev/null) (ctrl+o to expand)

? What transition prompt or effect would you like to use to interpolate between the two keyframes?

Question
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Question 1/1: What transition prompt or effect would you like to use to interpolate between the two keyframes?

> 1. (Recommended) A smooth, seamless morph and transition between the two images
  2. A gradual lighting shift and atmospheric transition
  3. A cinematic zoom out while transforming from the first subject into the second
  4. Write-in...

  The transition video has been successfully generated by interpolating between the two images!

  ### Interpolation Details:

  • Start Image: omni.jpg
  • End Image: edit_local_1782839741_1c2c5bd5.jpg
  • Prompt: "A smooth, seamless morph and transition between the two images."
  • Saved Video: interpolation_1782862638.mp4
  • Interaction ID: v1_ChdMVk5FYXZDLUY5ejBqckVQLXFDRWlBURIXTFZORWF2Qy1GOXowanJFUC1xQ0VpQVE 

https://www.youtube.com/watch?v=jSJARsM94sk

> generate_with_subjects v1_ChdMVk5FYXZDLUY5ejBqckVQLXFDRWlBURIXTFZORWF2Qy1GOXowanJFUC1xQ0VpQVE
  v1_ChdrVkJFYXJ5SE8tYXUxTWtQMmM3aDBRNBIXcUZGRWFxLTdJYy1NMzlJUDhKbXl3UUU

10.4k in | 1.6k out | 1.1k thinking
● omni-video-agent/generate_with_subjects(Generate video with subjects) (ctrl+o to expand)

? You provided two interaction IDs. Would you like to use [omni.jpg](file:///home/xbill/omni-flash-video-
agent/omni.jpg) and [edit_local_1782839741_1c2c5bd5.jpg](file:///home/xbill/omni-flash-video-
agent/edit_local_1782839741_1c2c5bd5.jpg) as the subject images, and what prompt should we use?
> (Recommended) Use omni.jpg and edit_local_1782839741_1c2c5bd5.jpg as subjects: 'A scene showing both subjects
interacting in a futuristic steampunk environment'

  The subject-based video has been successfully generated!

  ### Video Details:

  • Subject Images:
      1. omni.jpg
      2. edit_local_1782839741_1c2c5bd5.jpg
  • Prompt: "A scene showing both subjects interacting in a futuristic steampunk environment"
  • Saved Video: subject_1782862973.mp4
  • Interaction ID: v1_ChdmRlJFYXNXREI3S3pzT0lQeU5tVTBRRRIXZkZSRWFzV0RCN0t6c09JUHlObVUwUUU 

https://www.youtube.com/watch?v=M6Aur9j5wwM

> edit_user_video • Saved Video: animated_1782861971.mp4
    • Interaction ID: v1_ChdrVkJFYXJ5SE8tYXUxTWtQMmM3aDBRNBIXa1ZCRWFyeUhPLWF1MU1rUDJjN2gwUTQ - make it steam punk

6.9k in | 559 out | 334 thinking
● omni-video-agent/edit_video(Edit video) (ctrl+o to expand)

  The video has been successfully edited to apply a steampunk theme!

  ### Edited Video Details:

  • Style Edit: Steampunk (brass gears, copper pipes, Victorian sci-fi atmosphere)
  • Saved Video: edit_1782862249.mp4
  • Interaction ID: v1_ChdrVkJFYXJ5SE8tYXUxTWtQMmM3aDBRNBIXcUZGRWFxLTdJYy1NMzlJUDhKbXl3UUU

https://www.youtube.com/watch?v=K4PFQ2LGGik
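
Stateful edits depend on carrying the Interaction ID from one result into the next call. As a minimal sketch, the code below pulls the `Saved Video` and `Interaction ID` fields out of result text formatted like the transcripts above and builds the arguments for a follow-up `edit_video` call. The bullet format is simply what the agent printed in this session and may not be stable, and `parse_result` and `next_edit_args` are hypothetical helpers.

```python
import re
from typing import Optional

# Matches lines such as "  • Interaction ID: v1_..." in the agent's summary.
FIELD = re.compile(r"^\s*•\s*(Saved Video|Interaction ID):\s*(\S+)", re.MULTILINE)


def parse_result(text: str) -> dict:
    """Extract 'Saved Video' and 'Interaction ID' values from agent output."""
    return {label: value for label, value in FIELD.findall(text)}


def next_edit_args(result: dict, edit_prompt: str, delivery: str = "inline") -> Optional[dict]:
    """Build edit_video arguments from a parsed result, or None if no ID was found."""
    interaction_id = result.get("Interaction ID")
    if interaction_id is None:
        return None
    return {
        "previous_interaction_id": interaction_id,
        "edit_prompt": edit_prompt,
        "delivery": delivery,
    }


sample = """
  • Saved Video: animated_1782861971.mp4
  • Interaction ID: v1_ChdrVkJFYXJ5SE8tYXUxTWtQMmM3aDBRNBIXa1ZCRWFyeUhPLWF1MU1rUDJjN2gwUTQ
"""
parsed = parse_result(sample)
print(next_edit_args(parsed, "make it steampunk"))
```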

One Last Thought: Where Are the Agents?

The Antigravity CLI harness itself uses agents to manage the workflow.

An additional touch point is that all of the Omni operations are exposed as MCP tools. These tools can be called directly from agent code or used as a template for Interactions API calls embedded in an agent.
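
As one way to call the tools directly from agent code, the sketch below uses the stdio client from the official MCP Python SDK (the `mcp` package linked earlier) to launch server.py and invoke `get_help`. It assumes the SDK is installed and that server.py is reachable at the path given in the config; `launch_spec` and `call_tool` are hypothetical helper names, and the SDK import is deferred so the module still loads where the package is absent.

```python
import asyncio
from typing import Any, Optional


def launch_spec(config: dict, server_name: str) -> tuple:
    """Read the command and args for one server from an mcp_config.json structure."""
    entry = config["mcpServers"][server_name]
    return entry["command"], list(entry.get("args", []))


async def call_tool(command: str, args: list, tool: str, arguments: Optional[dict] = None) -> Any:
    """Start the MCP server over stdio, call one tool, and return its result."""
    # Deferred import: requires `pip install mcp` (an assumption about your setup).
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=command, args=args)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool(tool, arguments=arguments or {})


config = {"mcpServers": {"omni-video-agent": {"command": "python3", "args": ["server.py"]}}}
command, args = launch_spec(config, "omni-video-agent")

# To run against a real server (not executed here):
# result = asyncio.run(call_tool(command, args, "get_help"))
# print(result.content)
```

Reusing the same command and args as mcp_config.json means the agent code and Antigravity CLI start the server in exactly the same way.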

Summary

The strategy for using MCP with Omni and Antigravity CLI was validated with an incremental, step-by-step approach.

A minimal stdio transport MCP server was started from Python source code and validated with Antigravity CLI running as an MCP client in the same local environment. This Python server provided all of the management tools needed to interact with the Omni models.

DEV Community