
xbill for Google Developer Experts

Posted on • Originally published at xbill999.Medium on

Building a Multimodal Agent with the ADK, Azure ACI, and Gemini Flash Live 3.1

This article uses the Google Agent Development Kit (ADK) and the underlying Gemini LLM to build agentic apps with the Gemini Live API in Python, deployed to Azure Container Instances.

Aren’t There a Billion Python ADK Demos?

Yes there are.

Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a minimal working ADK streaming multimodal agent using the latest Gemini Live models.

In the Spirit of Mr. McConaughey’s “alright, alright, alright”

So what is different about this lab compared to all the others out there?

This is one of the first implementations of the latest Gemini 3.1 Flash Live model with the Agent Development Kit (ADK). The starting point for the demo was an existing Codelab, which was updated and re-engineered with Gemini CLI.

The original Codelab is here:

Way Back Home - Building an ADK Bi-Directional Streaming Agent | Google Codelabs

What Is Python?

Python is an interpreted language that allows for rapid development and testing and has deep libraries for working with ML and AI:

Welcome to Python.org

Python Version Management

One of the downsides of the wide deployment of Python has been managing the language versions across platforms and maintaining a supported version.

The pyenv tool enables deploying consistent versions of Python:

GitHub - pyenv/pyenv: Simple Python version management

As of writing, the mainstream Python version is 3.13. To validate your current Python version:

python --version
Python 3.13.12
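The same check can be made from inside Python, which is handy at the top of a setup script. This is a minimal sketch; the 3.13 floor mirrors the version shown above and is an assumption, not a documented requirement of the repo.

```python
import sys

# Assumed minimum, taken from the version used in this article.
MIN_PYTHON = (3, 13)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True when the interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```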

Azure Container Instances

Azure Container Instances (ACI) is a serverless, managed service that allows you to run Docker containers in the cloud without managing virtual machines. It is ideal for rapid deployment, bursting, and simple, isolated applications, offering per-second billing and quick startup times. ACI supports Linux and Windows containers, with options for volume mounting and GPU resources.

More details are available here:

https://azure.microsoft.com/en-us/products/container-instances

Why would I want Gemini CLI with Azure? Isn’t that a Google Thing?

Yes, Gemini CLI leverages the Google Cloud console and Gemini models, but it is also open source and platform agnostic. Many applications are already cross-cloud, so this lets familiar tools run natively on Microsoft Azure.

Azure Container Instance Configuration

To configure your Azure service with the base system tools, this article provides a reference:

MCP Development with Python, and the Azure Container Instance

Gemini Live Models

Gemini Live is a conversational AI feature from Google that enables free-flowing, real-time voice, video, and screen-sharing interactions, allowing you to brainstorm, learn, or problem-solve through natural dialogue. Powered by the Gemini 3.1 Flash Live model, it provides low-latency, human-like, and emotionally aware speech in over 200 countries.

More details are available here:

Gemini 3.1 Flash Live Preview | Gemini API | Google AI for Developers

The Gemini Live models bring unique real-time capabilities that can be used directly from an agent. A summary of the model is also available here:

https://deepmind.google/models/model-cards/gemini-3-1-flash-live/

Gemini CLI

If it is not pre-installed, you can install the Gemini CLI to interact with the source files and provide real-time assistance:

npm install -g @google/gemini-cli

Testing the Gemini CLI Environment

Once you have all the tools and the correct Node.js version in place, you can test the startup of Gemini CLI. You will need to authenticate with an API key or your Google Account:

▝▜▄ Gemini CLI v0.33.1
    ▝▜▄
   ▗▟▀ Logged in with Google /auth
  ▝▀ Gemini Code Assist Standard /upgrade no sandbox (see /docs) /model Auto (Gemini 3) | 239.8 MB

Node Version Management

Gemini CLI needs a consistent, up-to-date version of Node. The nvm command can be used to get a standard Node environment:

GitHub - nvm-sh/nvm: Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions
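A quick way to confirm that the Node on your PATH is recent enough is sketched below. The minimum major version of 20 is an assumption; check the Gemini CLI README for the current requirement.

```python
import re
import shutil
import subprocess

# Assumption: Gemini CLI needs Node.js 20 or newer.
MIN_NODE_MAJOR = 20


def parse_node_version(text):
    """Parse output such as 'v22.14.0' into a tuple of ints, or None."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def node_is_recent(minimum_major=MIN_NODE_MAJOR):
    """Return True if `node` is on PATH and its major version is high enough."""
    node = shutil.which("node")
    if node is None:
        return False
    result = subprocess.run(
        [node, "--version"], capture_output=True, text=True, check=False
    )
    version = parse_node_version(result.stdout)
    return version is not None and version[0] >= minimum_major


if __name__ == "__main__":
    print("Node OK" if node_is_recent() else "Install or upgrade Node (for example with nvm)")
```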

Agent Development Kit

The Google Agent Development Kit (ADK) is an open-source, Python-based framework designed to streamline the creation, deployment, and orchestration of sophisticated, multi-agent AI systems. It treats agent development like software engineering, offering modularity, state management, and built-in tools (like Google Search) to build autonomous agents.

The ADK can be installed from here:

Agent Development Kit (ADK)
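To give a feel for the framework, here is a minimal sketch of an ADK agent definition. It assumes the google-adk package is installed and uses its `Agent` class; the model ID and instruction text are placeholders, not the values used in the repo's agent.py.

```python
# Placeholder values: the real model ID and instruction live in the repo.
AGENT_SETTINGS = {
    "name": "biometric_agent",
    "model": "YOUR_GEMINI_LIVE_MODEL_ID",
    "instruction": "Greet the user and describe what you see and hear.",
}


def build_agent(settings=AGENT_SETTINGS):
    """Create an ADK Agent from `settings`, or return None if ADK is absent."""
    try:
        # Imported lazily so this file still loads without google-adk.
        from google.adk.agents import Agent
    except ImportError:
        return None
    return Agent(**settings)


if __name__ == "__main__":
    agent = build_agent()
    print("ADK not installed" if agent is None else f"Built agent: {agent.name}")
```

Keeping the settings in a plain dictionary makes it easy to swap the model ID when a new Live model is released.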

Where do I start?

The strategy for starting multimodal real-time agent development is an incremental, step-by-step approach.

First, the basic development environment is set up with the required system variables and a working Gemini CLI configuration.

Then, a minimal ADK agent is built and tested locally. Next, the entire solution is deployed to Azure ACI.

Setup the Basic Environment

At this point you should have a working Python environment and a working Gemini CLI installation. All of the relevant code examples and documentation are available on GitHub. The repo has a wide variety of samples, but this lab will focus on the ‘level_3-gemini’ setup.

The next step is to clone the GitHub repository to your local environment:

cd ~
git clone https://github.com/xbill9/gemini-cli-azure
cd gemini-cli-azure/gemini31-aci

Then run init.sh from the cloned directory.

The script will attempt to determine your shell environment and set the correct variables:

source init.sh

If your session times out or you need to re-authenticate, you can run the set_env.sh script to reset your environment variables:

source set_env.sh

Variables like PROJECT_ID need to be set up for the various build scripts, so rerun set_env.sh whenever a session times out.
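A small pre-flight check can catch a stale shell before a build script fails halfway through. The sketch below only knows about PROJECT_ID, which is the one variable named in this article; any other names you add to the list are your own assumptions about what the scripts export.

```python
import os

# PROJECT_ID comes from the article; extend this list to match your scripts.
REQUIRED_VARS = ["PROJECT_ID"]


def missing_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in `env` (default: os.environ)."""
    if env is None:
        env = os.environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing:", ", ".join(missing), "(try: source set_env.sh)")
    else:
        print("Environment looks complete.")
```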

Build the User Interface

The front end files provide the user interface:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make frontend
cd frontend && npm install && npm run build

added 218 packages, and audited 219 packages in 2s

49 packages are looking for funding
  run `npm fund` for details

1 high severity vulnerability

To address all issues, run:
  npm audit fix

Run `npm audit` for details.

> frontend@0.0.0 build
> vite build

vite v7.3.1 building client environment for production...
✓ 33 modules transformed.
dist/index.html 0.46 kB │ gzip: 0.29 kB
dist/assets/index-xOQlTZZB.css 21.60 kB │ gzip: 4.54 kB
dist/assets/index-0hbet2qm.js 214.56 kB │ gzip: 67.44 kB
✓ built in 1.02s
xbill@penguin:~/gemini-cli-azure/gemini31-aca$

Test The User Interface

The mock server test script lets you exercise the interface and grant the browser's multimedia permissions without making any external model calls or consuming tokens:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make mock
. ./mock.sh
http://127.0.0.1:8080/
Serving static files from: /home/xbill/way-back-home/level_3_gemini/frontend/dist
INFO: Started server process [24098]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
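If you want to confirm from a script that the mock server is answering before opening the browser, a probe along these lines works. It is a sketch using only the standard library; the default URL is the local address printed in the log above.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:8080/", timeout=2.0):
    """Return True if `url` answers an HTTP GET with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


if __name__ == "__main__":
    print("mock server reachable" if server_is_up() else "mock server not reachable")
```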

The deployed mock front end will look similar to:

Verify The ADK Installation

To verify the setup, run the ADK CLI locally with the biometric_agent:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make testadk
. ./testadk.sh
connect to local ADK CLI 

Log setup complete: /tmp/agents_log/agent.20260406_160553.log
To access latest log: tail -F /tmp/agents_log/agent.latest.log
/home/xbill/.local/lib/python3.13/site-packages/google/adk/cli/cli.py:204: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/xbill/.local/lib/python3.13/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super().__init__()
Running agent biometric_agent, type exit to exit.

[biometric_agent]: Scanner Online.



Test The ADK Web Interface

This tests the Audio / Video ADK agent interactions:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make adk
. ./runadk.sh
connect on http://127.0.0.1:8000/

2026-04-06 16:06:25,026 - INFO - service_factory.py:266 - Using in-memory memory service
2026-04-06 16:06:25,026 - INFO - local_storage.py:84 - Using per-agent session storage rooted at /home/xbill/way-back-home/level_3_gemini/backend/app
2026-04-06 16:06:25,026 - INFO - local_storage.py:110 - Using file artifact service at /home/xbill/way-back-home/level_3_gemini/backend/app/.adk/artifacts
/home/xbill/.local/lib/python3.13/site-packages/google/adk/cli/fast_api.py:193: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/xbill/.local/lib/python3.13/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super().__init__()
INFO: Started server process [24350]
INFO: Waiting for application startup.

+-----------------------------------------------------------------------------+
| ADK Web Server started |
| |
| For local testing, access at http://0.0.0.0:8000. |
+-----------------------------------------------------------------------------+

INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

Then open the web interface, either on the local address 127.0.0.1 or on the catch-all address 0.0.0.0, depending on your environment:

Special note for Google Cloud Shell deployments: add a CORS allow_origins exemption so the ADK agent can run:

adk web --host 0.0.0.0 --allow_origins 'regex:.*'
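The pattern `.*` accepts every origin, which is convenient for a lab but wide open. The sketch below illustrates the difference between that pattern and a tighter one using plain regular-expression matching. It is not ADK's internal implementation, and the Cloud Shell host pattern is a hypothetical example.

```python
import re

ALLOW_ALL = r".*"
# Hypothetical tighter pattern: only HTTPS origins under cloudshell.dev.
CLOUD_SHELL_ONLY = r"https://[a-z0-9.-]+\.cloudshell\.dev"


def origin_allowed(origin, pattern):
    """Return True when the whole origin string matches the pattern."""
    return re.fullmatch(pattern, origin) is not None


if __name__ == "__main__":
    for origin in ("https://8000-cs-123.cloudshell.dev", "https://evil.example"):
        print(origin, origin_allowed(origin, ALLOW_ALL), origin_allowed(origin, CLOUD_SHELL_ONLY))
```

For anything beyond a short-lived test, a narrower pattern than `.*` is the safer choice.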

Lint and Test the Main Python Code

The final step is to build, lint, and test the main Python code.

To Lint:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make lint
ruff check .
All checks passed!
ruff format --check .
10 files already formatted
cd frontend && npm run lint

> frontend@0.0.0 lint
> eslint .

xbill@penguin:~/gemini-cli-azure/gemini31-aca$

To Test:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make test
python -m pytest
============================================================ test session starts ============================================================
platform linux -- Python 3.13.12, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/xbill
configfile: pyproject.toml
plugins: anyio-4.11.0
collected 9 items / 1 skipped                                                                                                               

backend/app/biometric_agent/test_agent.py ..... [55%]
test_ws_backend.py .. [77%]
test_ws_backend_v2.py ..

Running Locally

The main Python Code can then be run locally:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make run
. ./biosync.sh
Local URL
http://127.0.0.1:8080/
2026-04-06 16:09:42,868 - INFO - System Config: 2.0 FPS, 10.0s Heartbeat
Serving static files from: /home/xbill/way-back-home/level_3_gemini/frontend/dist
INFO: Started server process [25860]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
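The startup log reports "2.0 FPS, 10.0s Heartbeat". Assuming the FPS value is the rate at which video frames are sent and the heartbeat is a fixed interval (an interpretation of the log line, not something confirmed by the source), the arithmetic works out as follows:

```python
def frame_interval_seconds(fps):
    """Seconds between consecutive frames at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps


def frames_per_heartbeat(fps, heartbeat_seconds):
    """Number of frames sent during one heartbeat window."""
    return int(fps * heartbeat_seconds)


if __name__ == "__main__":
    print(frame_interval_seconds(2.0))      # 0.5
    print(frames_per_heartbeat(2.0, 10.0))  # 20
```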

Then connect to the local front end:

Deploying to Azure ACI

A utility script runs the deployment to Azure ACI. Run the deploy target from the local system:

xbill@penguin:~/gemini-cli-azure/gemini31-aci$ make deploy
./deploy.sh
                                                                    0.0s 0.0s

You can validate the final result by checking the messages:

Azure Deployment complete.
URL: https://biometric-scout-app.wonderfuldune-ec8eec50.eastus.azurecontainerapps.io
xbill@penguin:~/gemini-cli-azure/gemini31-aci$

Once the container is deployed, you can then get the endpoint:

xbill@penguin:~/gemini-cli-azure/gemini31-aca$ make status
Name State URL
------------------- --------- -----------------------------------------------------------------------
biometric-scout-app Succeeded biometric-scout-app.wonderfuldune-ec8eec50.eastus.azurecontainerapps.io
xbill@penguin:~/gemini-cli-azure/gemini31-aca$ make endpoint
biometric-scout-app.wonderfuldune-ec8eec50.eastus.azurecontainerapps.io
xbill@penguin:~/gemini-cli-azure/gemini31-aca$
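The endpoint is printed as a bare hostname, so it needs a scheme before it can be opened in a browser or passed to an HTTP client. A minimal sketch of that normalization, assuming the deployed app is served over HTTPS:

```python
from urllib.parse import urlsplit


def endpoint_to_url(endpoint, scheme="https"):
    """Turn a bare hostname into a URL; leave full http(s) URLs unchanged."""
    endpoint = endpoint.strip()
    if urlsplit(endpoint).scheme in ("http", "https"):
        return endpoint
    return f"{scheme}://{endpoint}"


if __name__ == "__main__":
    host = "biometric-scout-app.wonderfuldune-ec8eec50.eastus.azurecontainerapps.io"
    print(endpoint_to_url(host))
```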

The service will be visible in the Azure console:

biometric-scout-ssl-pengu.eastus.azurecontainer.io

Running the Web Interface

Start a connection to the Azure deployed app:

biometric-scout-ssl-pengu.eastus.azurecontainer.io

Then connect to the app:

Then use the Live model to process audio and video:

Finally, complete the sequence:

Project Code Review

Gemini CLI was used for a final project review:

✦ The code is in great shape. All 8 tests passed, and the entire project is compliant with the linter rules.

  There is one warning related to an experimental feature (PLUGGABLE_AUTH) in the Google ADK, but this is informational and doesn't indicate an
  error.

  Since the automated checks are clean, what specific part of the codebase would you like me to review? For example, we could look at:

   * The agent's logic in backend/app/biometric_agent/agent.py
   * The frontend WebSocket and component logic in frontend/src/BiometricLock.jsx
   * The Azure deployment scripts
   * The overall architecture

Summary

The Agent Development Kit was used to build a multimodal agent on the Gemini Live model. The agent was tested locally with the CLI and then deployed to Azure ACI. Along the way, several key takeaways and lessons learned emerged from the transition to the new Gemini Live model. Finally, Gemini CLI was used for a complete project code review.
