This project was built as an entry for the GeminiLiveAgentChallenge.
The idea:
We wanted a voice agent that actually joins meetings: it listens, answers when you call it by
name, and can search workspace tools, create meeting minutes, and create Jira tickets, all by
voice. The goal was to use Gemini Live for real-time, low-latency voice.
Used Zoom instead of Meet
We initially focused on Google Meet, but Meet doesn’t provide a straightforward way to inject a
bot as a participant with its own mic/speaker and access to meeting audio and chat. So we built
it on Zoom instead, keeping the design general enough that it could work with Meet if that support is added.
How it’s wired
The app has three main parts.
First, a meeting bot that joins the Zoom call, captures
meeting audio, plays back the assistant’s voice, and can post to the meeting chat.
Second, a WebSocket server that sits between the bot and Gemini Live. It streams audio
from the meeting to the Live API and streams the model’s voice back to the bot.
Third, a Next.js app that handles auth, Drive/Jira API calls, and the UI for launching the
bot and managing settings.
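As a rough illustration of the second part, the relay in the middle, here is a minimal TypeScript sketch of forwarding audio in both directions. The `Endpoint` interface, the `relay` function, and the `FakeEndpoint` class are hypothetical names made up for this example. They are not the project's real code or any library's API, and a real server would wrap actual WebSocket connections and the Live API session instead of in-memory fakes.

```typescript
// Hypothetical minimal interface for one end of a connection.
// In a real server this would wrap a WebSocket or a Live API session.
interface Endpoint {
  send(chunk: Uint8Array): void;
  onChunk(handler: (chunk: Uint8Array) => void): void;
}

// Forward meeting audio bot -> live, and model audio live -> bot.
function relay(bot: Endpoint, live: Endpoint): void {
  bot.onChunk((chunk) => live.send(chunk));
  live.onChunk((chunk) => bot.send(chunk));
}

// In-memory stand-in for a socket, for demonstration only.
class FakeEndpoint implements Endpoint {
  sent: Uint8Array[] = [];
  private handlers: Array<(chunk: Uint8Array) => void> = [];

  send(chunk: Uint8Array): void {
    this.sent.push(chunk);
  }

  onChunk(handler: (chunk: Uint8Array) => void): void {
    this.handlers.push(handler);
  }

  // Simulate data arriving from the remote side.
  receive(chunk: Uint8Array): void {
    for (const handler of this.handlers) handler(chunk);
  }
}

const botEnd = new FakeEndpoint();
const liveEnd = new FakeEndpoint();
relay(botEnd, liveEnd);

botEnd.receive(new Uint8Array([1, 2, 3])); // meeting audio captured by the bot
liveEnd.receive(new Uint8Array([9, 8]));   // model voice coming back
```

Keeping the relay this thin means the server only moves bytes. Turn-taking and content stay with the model.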
The voice agent is driven entirely by Gemini Live: it decides when to reply, when to call tools, and what to say.
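When the model decides to call a tool, something on the server side has to map that request to real work. A minimal sketch of such a dispatcher is below. The `ToolCall` shape, the tool names, and the placeholder handlers are all assumptions for illustration. The actual Live API payloads and the project's real Drive/Jira calls will look different.

```typescript
// Hypothetical shape of a tool call emitted by the model.
// Real Live API payloads differ; this is only for illustration.
type ToolArgs = Record<string, string>;
type ToolCall = { name: string; args: ToolArgs };
type ToolHandler = (args: ToolArgs) => Promise<string>;

// Placeholder handlers. A real app would call the Drive/Jira APIs here.
const toolHandlers: Record<string, ToolHandler> = {
  create_jira_ticket: async (args) => `created ticket: ${args.summary}`,
  create_meeting_minutes: async (args) => `minutes saved for: ${args.title}`,
};

// Look up the handler by name and return its result as text,
// which would then be sent back to the model as the tool response.
async function dispatchToolCall(call: ToolCall): Promise<string> {
  const handler = toolHandlers[call.name];
  if (!handler) return `unknown tool: ${call.name}`;
  return handler(call.args);
}
```

Returning a message for unknown tools, rather than throwing, lets the model hear about the failure and recover in conversation.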
We deployed it on Cloud Run (a service plus a Job).
