This post is my submission for DEV Education Track: Build Multi-Agent Systems with ADK.
What I Built
I built Gemini Tales, an interactive storytelling experience that blends real-time AI conversation with physical activity. It responds to a haunting statistic: 80% of children today don't move enough. While technology is often seen as the cause of sedentary behavior, I wanted to turn the screen into a catalyst for movement.
Watch the Vision: See how we turn sedentary screen time into an active adventure.
Gemini Tales doesn't just tell a story: it sees your child, hears their voice, and asks them to ACT. Every physical movement becomes part of the magic.
Cloud Run Embed
The project is currently running on Google Cloud Run.
The Experience: Live Multimodal Storytelling
The frontend is a direct bridge to the Gemini Live API, enabling unified Voice + Vision interaction in real time.
Features That Create Magic
| Feature | What It Does | Tech Stack |
|---|---|---|
| Stable Voice Live | Interruption-aware, low-latency conversation. The child can speak and change the story path at any time. | Gemini Live API |
| Visual Awareness | Real-time video stream (1 FPS) lets the AI "see" costumes, toys, and movements. | Gemini 2.5 Flash Native Vision |
| Dynamic Illustrations | Watercolor-style art that evolves with the plot. | Gemini 3.1 Flash Preview |
| Agent-Driven Context | Before the story starts, Storysmith agents research and craft a unique plot. | Google ADK Backend |
| Interactive Challenges | The AI pauses the story for "Hero's Challenges": physical actions detected via video. | Computer Vision |
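To make the 1 FPS figure from the Visual Awareness row concrete, here is a minimal sketch of rate-limiting camera frames before they are forwarded to the model. The real frontend is React/TypeScript, so this Python version only illustrates the logic; `FrameThrottle` and `should_send` are hypothetical names, not part of the project or the Live API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameThrottle:
    """Decides which camera frames to forward, capping the rate at `fps`."""

    fps: float = 1.0  # the post states a 1 FPS video stream
    _last_sent: Optional[float] = None  # timestamp (seconds) of the last forwarded frame

    def should_send(self, now: float) -> bool:
        """Return True if a frame captured at time `now` should be forwarded."""
        interval = 1.0 / self.fps
        if self._last_sent is None or now - self._last_sent >= interval:
            self._last_sent = now
            return True
        return False


# Simulate a camera producing 4 frames per second for 3 seconds.
throttle = FrameThrottle(fps=1.0)
timestamps = [i * 0.25 for i in range(12)]
forwarded = [t for t in timestamps if throttle.should_send(t)]
print(forwarded)  # [0.0, 1.0, 2.0]
```

Dropping frames on the client, rather than sending everything and letting the server discard, keeps upload bandwidth low, which matters on a home network.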
The Brain: Multi-Agent Story Engine
The backend is a distributed multi-agent system built with the Google Agent Development Kit (ADK) and the A2A (Agent-to-Agent) protocol. This ensures specialization, reliability, and scalability.
Meet the Agents
| Agent | Role | Specialty | Model |
|---|---|---|---|
| Adventure Seeker | Physical activity planning | Multi-step reasoning for movement-based quests | Gemini 2.5 Flash + Google Search |
| Guardian of Balance | Safety gatekeeper | Strict age-appropriate & movement-focused validation | Gemini 2.5 Flash (Temp 0.1) |
| Storysmith | Narrative weaver | Advanced storytelling & character depth | Gemini 2.5 Pro |
| Orchestrator | Pipeline coordinator | Manages state & flow between agents | Gemini 2.5 Flash |
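The division of labor in the table can be sketched in plain Python. This is not the ADK API and not the project's actual code: the three agent functions are stand-ins for LLM calls, `StoryState` is a hypothetical shape for the shared state, and the retry-with-feedback loop is an assumption about how a safety gatekeeper might sit between planning and writing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StoryState:
    """Shared state the orchestrator passes between agents (hypothetical shape)."""
    theme: str
    quest: str = ""
    approved: bool = False
    feedback: List[str] = field(default_factory=list)
    story: str = ""


# Stand-ins for the real agents; each would be an LLM call in practice.
def adventure_seeker(state: StoryState) -> str:
    # Proposes a calmer quest once the Guardian has pushed back.
    if state.feedback:
        return "hop like a frog three times"
    return "climb onto the furniture"


def guardian_of_balance(quest: str) -> str:
    # Returns an empty string to approve, or a reason to reject.
    return "unsafe: climbing furniture" if "climb" in quest else ""


def storysmith(state: StoryState) -> str:
    return f"Deep in the {state.theme}, the hero must {state.quest}!"


def orchestrate(theme: str, max_attempts: int = 3) -> StoryState:
    """Run Seeker -> Guardian until a quest passes, then hand off to Storysmith."""
    state = StoryState(theme=theme)
    for _ in range(max_attempts):
        state.quest = adventure_seeker(state)
        reason = guardian_of_balance(state.quest)
        if not reason:
            state.approved = True
            break
        state.feedback.append(reason)  # the next proposal can react to this
    if state.approved:
        state.story = storysmith(state)
    return state


result = orchestrate("enchanted forest")
print(result.feedback)  # ['unsafe: climbing furniture']
print(result.story)     # Deep in the enchanted forest, the hero must hop like a frog three times!
```

Note that the story is only written after a quest has been approved, so an unsafe activity never reaches the narrative stage.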
Architecture Highlights
┌─────────────────────────────────────┐
│     Frontend (React + Live API)     │
│ Voice • Vision • Real-time Feedback │
└──────────────────┬──────────────────┘
                   │ WebSocket
┌──────────────────▼──────────────────┐
│        API Gateway (FastAPI)        │
│   Authentication & Rate Limiting    │
└──────────────────┬──────────────────┘
                   │ A2A Protocol
    ┌──────────┬───┴────────┬───────────┐
    │          │            │           │
┌───▼──┐  ┌────▼───┐  ┌─────▼────┐  ┌───▼──┐
│Seeker│  │Guardian│  │Storysmith│  │Orch. │
└──────┘  └────────┘  └──────────┘  └──────┘
For a detailed deep-dive into the system design, see ARCHITECTURE.md.
From Tutorial to Hackathon Reality
This project started as a journey through the Build Multi-Agent Systems with ADK track. The tutorial taught me Agent-to-Agent (A2A) communication and state-based orchestration, the exact foundation I needed for my entry in the Gemini Live Agent Challenge.
I took those core architectural patterns and pivoted toward something bigger: an AI Nanny that inspires children to move. The same ADK logic that powers educational tools now powers interactive, movement-based storytelling.
Tech Stack
| Layer | Technology |
|---|---|
| Frontend | React 19, Vite, TypeScript, Tailwind CSS |
| AI Intelligence | Gemini Live API, Gemini 2.5 Flash/Pro, Gemini 3.1 Flash Preview |
| Backend Framework | Google ADK, Agent-to-Agent (A2A) Protocol |
| Infrastructure | FastAPI (Python 3.12), WebSockets, Google Cloud Run |
| Observability | OpenTelemetry, Google Cloud Trace |
| Dev Tools | Antigravity IDE for agentic orchestration |
Getting Started
Prerequisites
- Python 3.10+ & Node.js 20+
- uv for Python dependency management
- Google Cloud Project with Vertex AI enabled
1. Backend Launch
Easy Mode (Windows):
# Starts all 5 services with automatic cleanup
.\run_local.ps1
Manual Launcher:
# Start distributed agents (Leaf Services)
uv run shared/adk_app.py agents/researcher --port 8001 --a2a
uv run shared/adk_app.py agents/judge --port 8002 --a2a
uv run shared/adk_app.py agents/content_builder --port 8003 --a2a
# Start Orchestrator & API Gateway
uv run shared/adk_app.py agents/orchestrator --port 8004
uv run app/main.py
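After starting the services by hand, it can help to confirm that each one is actually listening before opening the frontend. The following is a small standard-library sketch, not part of the repository: the port numbers are taken from the commands above, while `is_listening` and `report` are hypothetical helper names. The API gateway is left out because the post does not state its port.

```python
import socket
from typing import Dict

# Ports copied from the manual launcher commands; the gateway's port is not
# stated in the post, so it is left out rather than guessed.
AGENT_PORTS: Dict[str, int] = {
    "researcher": 8001,
    "judge": 8002,
    "content_builder": 8003,
    "orchestrator": 8004,
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def report(ports: Dict[str, int]) -> Dict[str, bool]:
    """Map each service name to whether its port is currently reachable."""
    return {name: is_listening(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in report(AGENT_PORTS).items():
        print(f"{name:16} {'up' if up else 'DOWN'}")
```

A TCP check only shows that a process has bound the port; it does not prove the agent behind it is healthy or correctly authenticated.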
2. Frontend Launch
cd frontend
npm install
npm run dev
Visit http://localhost:5173 and start creating stories!
Key Learnings
Specialization > Monoliths
I was surprised at how much more reliable the system became when I stopped relying on one giant prompt and started treating agents like a specialized team with distinct responsibilities.
The Power of A2A Protocol
Implementing Agent-to-Agent communication was challenging, especially handling Google Application Default Credentials (ADC) on Windows. But once it clicked, the elegance of distributed agents became clear.
Movement Changes Everything
The most rewarding part? Seeing a child leap off the couch when the AI asked them to "show me how you jump." Screen time transformed from sedentary consumption into active play.
Open Source & Reproducibility
The full source code, including ADK orchestration logic, deployment scripts, and frontend, is available on GitHub:
GitHub: vero-code/gemini-tales
Features:
- Full Docker & Cloud Run deployment
- Multi-agent architecture with A2A protocol
- Live API integration with fallbacks
- Comprehensive ARCHITECTURE.md for deep-dives
Current Status & What's Next
Gemini Tales is in active development for the Gemini Live Agent Challenge. The multi-agent backend is fully operational, and I'm continuously refining:
- Multimodal streaming: Synchronizing text, generated images, and voice.
- Real-time movement detection: Analyzing video feed for physical actions.
- Gamification System: Badges and rewards to motivate physical activity.
New updates coming soon:
- Strict Interaction Logic: Implementing "anti-cheat" verification so the agent visually confirms actions instead of relying on voice.
- Performance: Optimizing agent loading times for instant storytelling.
- Live demo video showcasing the full experience.
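The "anti-cheat" idea in the list above comes down to a simple rule: a spoken claim alone never completes a challenge. Here is one way that rule could look, as a sketch under stated assumptions. `Evidence`, `Verdict`, `verify_challenge`, and the 0.7 threshold are all hypothetical, and how the vision confidence score is actually produced is outside the scope of this example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    NEEDS_RETRY = "needs_retry"


@dataclass(frozen=True)
class Evidence:
    """What the agent observed during a challenge window (hypothetical shape)."""
    voice_claims_done: bool    # the child said "I did it!"
    vision_confidence: float   # 0.0-1.0 score that the action was seen on video


def verify_challenge(evidence: Evidence, threshold: float = 0.7) -> Verdict:
    """Accept a challenge only when the video evidence clears the threshold.

    The voice claim is deliberately ignored for the decision: saying
    "I jumped" without a visible jump does not pass.
    """
    if evidence.vision_confidence >= threshold:
        return Verdict.CONFIRMED
    return Verdict.NEEDS_RETRY


# A spoken claim with weak video evidence is not enough...
print(verify_challenge(Evidence(voice_claims_done=True, vision_confidence=0.2)).value)   # needs_retry
# ...while a clearly visible action passes even if the child stays silent.
print(verify_challenge(Evidence(voice_claims_done=False, vision_confidence=0.9)).value)  # confirmed
```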
Stay tuned as we turn screen time into move time!
Why This Matters
Technology is often the villain in this story. But what if it could be the hero?
Gemini Tales proves that with the right architecture and intention, we can build AI experiences that:
- Entertain (magical storytelling)
- Engage (real-time interaction)
- Activate (physical movement required)
- Educate (safe, age-appropriate learning)
This is technology in service of human health.
License
MIT. See LICENSE
Created with ❤️ for the next generation of active explorers.
Tags: #GeminiLiveAgentChallenge #GoogleAI #MultiAgentSystems #ADK #ChildHealth #InteractiveTech

