Travel as fast as you speak with Gemini! Where live agents meet immersive storytelling & 3D navigation
This project was created to enter the Gemini Live Agent Challenge.
#GeminiLiveAgentChallenge
Travigo is a next-generation AI agent that works with multimodal inputs and outputs, moving far beyond simple text-in/text-out interactions. The project combines Google's Gen AI SDK, the Gemini Live API, Gemini 3, and Google Maps API cloud services with the creative power of generative AI and spatial context to solve complex problems and create new, immersive user experiences in 3D navigation and storytelling.
Features & Functionality
- Multimodal Interactions: Communicate via voice and text while the AI processes real-time visual context from the interactive Street View and 3D map spatial data.
- Dynamic Personas: Choose between Concierge Mode (realistic local guides) and Game Mode (mystical, run-time personas), each adapting tone and narrative focus on the fly. Gemini 3 also spins up local personas on the fly based on the user's AR location and Street View context.
- Real-time Context Processing: Uses a Live Agent Orchestrator to stream dialogue & voice directly tied to user actions and spatial events.
- Immersive Storytelling: Generates contextual narratives overlaid seamlessly onto the UI and 3D environment.
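As a rough illustration of how the two persona modes could shape the agent's behavior, here is a minimal Python sketch that turns a mode and a place name into a system instruction. The `Persona` class, the `PERSONAS` table, the `build_system_instruction` helper, and the tone and focus strings are all hypothetical; they are not taken from Travigo's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona settings; the field values below are illustrative only."""
    tone: str
    focus: str


# The two modes named in the feature list, with assumed tone and focus text.
PERSONAS = {
    "concierge": Persona(tone="realistic local guide", focus="practical local tips"),
    "game": Persona(tone="mystical storyteller", focus="quests and legends"),
}


def build_system_instruction(mode: str, place: str) -> str:
    """Compose a system instruction string for the chosen mode and location."""
    if mode not in PERSONAS:
        raise ValueError(f"unknown mode: {mode!r}")
    persona = PERSONAS[mode]
    return f"You are a {persona.tone} in {place}. Focus on {persona.focus}."


print(build_system_instruction("concierge", "Lisbon"))
```

A string like this could then be passed as the system instruction when a session starts, so switching modes only means rebuilding the string.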
Models Used
The project uses a multi-model architecture, calling different Gemini models through the Google Gen AI SDK depending on the task:
- gemini-2.5-flash-native-audio-preview: Used by the Live Agent Orchestrator to power real-time, multimodal conversations via voice and audio streaming.
- gemini-2.5-flash: Used for rapid "Scout" queries, specifically grounding location searches using the Google Maps tool.
- gemini-3.1-pro-preview: Used for complex deep-reasoning tasks at high thinking levels, such as planning logistics and tours, generating fictional personas from spatial context, and performing deep "Strategic Analysis" (e.g., visa planning, historic deep dives) grounded by Google Search.
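The task-to-model split above can be sketched as a small routing table. This is only an illustration under assumptions: the `Task` enum and `pick_model` function are hypothetical names, and only the model identifiers come from the list above. No SDK call is made here; the returned string is what would be passed as the model name.

```python
from enum import Enum


class Task(Enum):
    """Hypothetical task categories mirroring the three roles described above."""
    LIVE_CONVERSATION = "live_conversation"    # real-time voice and audio streaming
    SCOUT = "scout"                            # quick location searches grounded in Maps
    STRATEGIC_ANALYSIS = "strategic_analysis"  # deep reasoning grounded by Search


# Model identifiers as listed in this post.
MODEL_FOR_TASK = {
    Task.LIVE_CONVERSATION: "gemini-2.5-flash-native-audio-preview",
    Task.SCOUT: "gemini-2.5-flash",
    Task.STRATEGIC_ANALYSIS: "gemini-3.1-pro-preview",
}


def pick_model(task: Task) -> str:
    """Return the model name assigned to the given task."""
    return MODEL_FOR_TASK[task]


print(pick_model(Task.SCOUT))
```

Keeping the mapping in one place means a preview model can be swapped out without touching the code that handles each task.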