We Built a Multiplayer Voice AI That Actually Runs the Party
Live demo: thecruisegod.vercel.app
Six friends. One city. Zero agreement on what to do next.
Here's the thing nobody talks about: every AI voice assistant ever built assumes you're alone. You whisper into it. It whispers back. Clean, personal, useless for a group.
Real life is messier. Three problems hit at once:
Where are we going? Someone's Googling. Someone's scrolling TikTok for vibes. A third texted "idk, you pick." Nobody picks.
Who do we call? The group needs a DJ for Saturday or a barber open right now. Someone texts their cousin. The cousin doesn't respond. The moment dies.
What are we doing right now? You're already together but the energy's flat. The night either catches fire or it doesn't, and it usually doesn't.
No AI handles any of this. And none of them can do it for a whole room at once.
So we built TCG: The Cruise God.
What Is TCG?
TCG is the world's first multiplayer voice AI concierge, game master, and local guide. It's a live, conversational AI that drives its own React UI while it talks, serving an entire group simultaneously.
You tap the character. TCG wakes up. You tell it where you are and what the vibe is. It runs the night: out loud, with personality, without anyone touching a single button.
Three modes. Each purpose-built for a real group scenario.
Locator Mode
"Find us somewhere to go."
Tell TCG the vibe (rooftop bar, late-night spot, live music) and it:
Dynamically builds a contextual search query from your location and energy
Live-scrapes the web via Firecrawl
Reads back real venue recommendations while the UI snaps into a card layout
It's not a Google search. It knows your city, your crew size, and what kind of night you're trying to have.
Plug Mode
"Find me someone who can handle this."
Need a DJ for next weekend? A same-day mechanic? A barber open right now?
Plug Mode rewrites queries with urgency context ("available now open today same-day"), runs a live search, and voices the results. It's the "who do you know?" answer, surfaced by AI.
Game Master Mode
"Let's play something."
This is where TCG earns its name.
Say "teach us a drinking game for 6 people, chaotic energy" and TCG doesn't just name a game. It scrapes the full rule set in Markdown via Firecrawl and reads the rules aloud, step by step, to the room.
The UI opens a live Game Session dashboard (scoreboard, current turn, player list, rules summary) that TCG updates as the game progresses.
Plus 9 built-in party tools available on command:
| Tool | Details |
| --- | --- |
| Spin the Bottle | Synced with the room's live guest list |
| Truth or Dare | 3 intensity levels |
| Charades | 4 categories |
| Coin Flip / Dice Roll | Voice-triggered |
| Randomizer | Splits guests into named groups |
| Timer | Countdown, voice-controlled |
| Scoreboard | Live tracked |
| Bill Splitter | Camera → Gemini Vision → split result read aloud |
The Tech Stack
ElevenLabs: The Heartbeat
We didn't use ElevenLabs just for TTS. We architected 10 custom Client Tools that let the agent autonomously control the React frontend in real time:
switchMode: Snaps the UI between Locator / Plug / Game Master / Tools
openTool: Opens any party tool on command
showQR: Displays the CruiseHQ join QR code
randomizeGroups: Splits guests into named groups
setGroupLeader: Elects and announces a group captain
updateGameState: Syncs the live scoreboard
displayResults: Populates venue/service result cards
analyzeImage: Opens camera → Gemini Vision
createMemory: Screenshot → Supabase Storage → Trophy + viral caption
stopListening: Cuts the mic at hardware level
TCG says "let me pull that up" and the UI snaps. The conversation never breaks.
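To make the idea concrete, here is a minimal sketch of three of those tools as plain functions keyed by name. The `UiState` shape, the parameter names (`mode`, `tool`), and the `callTool` helper are assumptions for illustration; the post does not show the real handlers or how they are registered with the ElevenLabs SDK.

```typescript
// Hypothetical UI state; the real app's state shape is not shown in the post.
type Mode = "locator" | "plug" | "game" | "tools";

interface UiState {
  mode: Mode;
  openTool: string | null;
  qrVisible: boolean;
}

// Each tool takes JSON-like parameters from the agent and returns a short
// result string the agent can use in its next spoken turn.
type ClientTool = (params: Record<string, unknown>) => string;

const MODES: Mode[] = ["locator", "plug", "game", "tools"];

function createClientTools(state: UiState): Record<string, ClientTool> {
  return {
    switchMode: (params) => {
      const mode = params.mode;
      // Reject anything that is not one of the four known modes.
      if (typeof mode !== "string" || MODES.indexOf(mode as Mode) === -1) {
        return `Unknown mode: ${String(mode)}`;
      }
      state.mode = mode as Mode;
      return `Switched to ${mode}`;
    },
    openTool: (params) => {
      state.openTool = String(params.tool);
      return `Opened ${state.openTool}`;
    },
    showQR: () => {
      state.qrVisible = true;
      return "QR code displayed";
    },
  };
}

// Look up a tool by name and run it, the way a voice SDK would when the
// agent decides to call one. Unknown names produce an error string.
function callTool(
  tools: Record<string, ClientTool>,
  name: string,
  params: Record<string, unknown> = {}
): string {
  const tool = tools[name];
  return tool ? tool(params) : `No such tool: ${name}`;
}
```

Returning a string from every tool gives the agent something to say once the UI has changed, which is one way to keep the conversation from stalling on a tool call.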
Firecrawl: The Search Pipeline
A production-grade 3-tier pipeline:
Supabase (7-day cache) → Upstash Redis (15-min hot cache) → Firecrawl live scrape
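A rough sketch of that lookup order follows. In-memory maps stand in for Supabase and Upstash Redis, and an injected `scrape` function stands in for Firecrawl. The names `makeTier`, `readTier`, and `tieredSearch` are hypothetical, and the real pipeline may handle write-back and expiry differently.

```typescript
interface CacheEntry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

interface CacheTier {
  name: string;
  ttlMs: number;
  store: Map<string, CacheEntry>;
}

function makeTier(name: string, ttlMs: number): CacheTier {
  return { name, ttlMs, store: new Map<string, CacheEntry>() };
}

// Return a cached value if present and not expired; evict it otherwise.
function readTier(tier: CacheTier, key: string, now: number): string | undefined {
  const entry = tier.store.get(key);
  if (!entry) return undefined;
  if (entry.expiresAt <= now) {
    tier.store.delete(key);
    return undefined;
  }
  return entry.value;
}

// Check each tier in the order given; on a full miss, run the live scrape
// and write the result to every tier with that tier's own TTL.
async function tieredSearch(
  query: string,
  tiers: CacheTier[],
  scrape: (q: string) => Promise<string>,
  now: () => number = Date.now
): Promise<{ value: string; source: string }> {
  for (const tier of tiers) {
    const hit = readTier(tier, query, now());
    if (hit !== undefined) return { value: hit, source: tier.name };
  }
  const value = await scrape(query);
  const writtenAt = now();
  for (const tier of tiers) {
    tier.store.set(query, { value, expiresAt: writtenAt + tier.ttlMs });
  }
  return { value, source: "live" };
}
```

Passing the clock in as `now` keeps expiry testable without waiting seven days.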
Queries are dynamically rewritten for context. "Chaotic game for 6" becomes:
"for 6 players wild hilarious high-energy party game rules how to play"
Game searches extract full Markdown rule sets. TCG teaches the room, rule by rule, out loud.
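The rewriting step might look something like the sketch below. Only the "chaotic" expansion and the urgency phrase come from the examples in this post; the function names, the `VIBE_KEYWORDS` map, and the `city` parameter are assumptions.

```typescript
// Hypothetical vibe-to-keyword map. Only "chaotic" is taken from the post's
// own example; a real map would cover more vibes.
const VIBE_KEYWORDS: Record<string, string | undefined> = {
  chaotic: "wild hilarious high-energy",
};

// Game Master Mode: expand a player count and a vibe into a rules search.
function rewriteGameQuery(players: number, vibe: string): string {
  const keywords = VIBE_KEYWORDS[vibe.toLowerCase()] ?? vibe;
  return `for ${players} players ${keywords} party game rules how to play`;
}

// Plug Mode: append the urgency phrase when the request is time-sensitive.
function rewritePlugQuery(service: string, city: string, urgent: boolean): string {
  const parts = [service, city];
  if (urgent) parts.push("available now open today same-day");
  // Drop empty pieces so a missing city does not leave a double space.
  return parts.filter((part) => part.trim() !== "").join(" ");
}
```

With these definitions, `rewriteGameQuery(6, "chaotic")` produces the rewritten query quoted above.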
Supabase Realtime: The Multiplayer Backbone
Friends scan a QR code from their phones and land in a full CruiseHQ interface. They're not just watching; they're in the room.
Supabase Realtime handles:
Presence tracking: live roster of who's in the room
Broadcast channels: dares, group chats, poll votes, co-host voice transcripts
Sub-second latency across every connected device
When a guest sends TCG a dare from their phone, it's injected as a live user message into the active ElevenLabs voice session. TCG hears it. TCG acts on it. In front of the whole room.
Gemini 2.5 Flash Vision: Eyes for the Room
Vision isn't just about reading text; it's how TCG "sees" the party. What the camera captures is passed into the live ElevenLabs voice session as context, so TCG can talk about what it sees.
Act as a Referee: Show TCG a chaotic game board, a Charades drawing, or a physical challenge, and its Game Vision acts as an impartial, live judge.
Split the Check: Point the camera at a receipt → TCG reads the total, splits it per person, detects the currency from your GPS location, and reads the result aloud to the group.
Verify Items: Scan a barcode → instantly validated against the Open Food Facts API to tell you exactly what you're holding.
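The arithmetic behind Split the Check is simple but easy to get wrong by a cent. The sketch below assumes the total has already been read from the receipt and converted to integer minor units (cents); `splitBill` is a hypothetical name, and the app's actual rounding rule is not described in the post.

```typescript
// Split a total (in integer minor units, e.g. cents) across a group so the
// shares always add back up to the total. Leftover cents go to the first
// few people, one cent each.
function splitBill(totalCents: number, people: number): number[] {
  if (!Number.isInteger(totalCents) || totalCents < 0) {
    throw new RangeError("totalCents must be a non-negative integer");
  }
  if (!Number.isInteger(people) || people < 1) {
    throw new RangeError("people must be a positive integer");
  }
  const base = Math.floor(totalCents / people);
  const remainder = totalCents % people;
  const shares: number[] = [];
  for (let i = 0; i < people; i++) {
    shares.push(base + (i < remainder ? 1 : 0));
  }
  return shares;
}
```

Working in integer cents avoids floating-point drift: a 100.00 bill split three ways comes out as 33.34, 33.33, and 33.33 rather than three shares of 33.33 that leave a cent unaccounted for.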
What Makes It Actually Different
We didn't add "multiplayer" as a feature. We rebuilt the entire architecture around the assumption that the AI is serving a room, not a person.
CruiseHQ submissions: A word submitted from a guest's phone can be silently inserted into the agent's context: invisible in the host's transcript, visible only to TCG. The AI knows the word. The host doesn't. The game works. Submissions can also be public, which leaves the choice in your hands. Dares, suggestions, anything that requires a submission: CruiseHQ is equipped to process it.
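One way to picture the public/private split is a small router in front of the voice session. Everything named here is an assumption: the `AgentSession` interface is a hypothetical stand-in for whatever the voice SDK exposes, and `routeSubmission` is not the app's real function.

```typescript
interface Submission {
  guest: string;
  text: string;
  visibility: "public" | "private";
}

// Hypothetical stand-in for the live voice session.
interface AgentSession {
  // Delivered as a visible turn that the agent responds to.
  sendUserMessage(text: string): void;
  // Delivered as background context with no visible turn.
  sendContextualUpdate(text: string): void;
}

// Private submissions reach only the agent's context. Public ones are also
// appended to the transcript the host can see.
function routeSubmission(
  submission: Submission,
  session: AgentSession,
  hostTranscript: string[]
): void {
  if (submission.visibility === "private") {
    session.sendContextualUpdate(
      `Secret submission from ${submission.guest}: ${submission.text}`
    );
    return;
  }
  hostTranscript.push(`${submission.guest}: ${submission.text}`);
  session.sendUserMessage(`${submission.guest} says: ${submission.text}`);
}
```

The key design point is that the private branch never touches `hostTranscript`, so a secret word cannot leak to the host's screen through the normal message path.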
The Groups trick: When the host says "randomize the groups," the randomizeGroups Client Tool runs, CruiseHQ auto-creates group tabs, every guest's interface updates, and TCG announces the result out loud with personality, all in one voice command.
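The grouping step itself can be sketched as a shuffle followed by a round-robin deal. This is an illustrative version, not the app's code: the `shuffle` helper, the function signature, and the injectable `random` parameter are assumptions.

```typescript
// Fisher-Yates shuffle on a copy of the input. `random` is injectable so
// the result can be made deterministic in tests.
function shuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
  }
  return result;
}

// Shuffle the guest list, then deal guests to the named groups in turn so
// group sizes differ by at most one.
function randomizeGroups(
  guests: readonly string[],
  groupNames: readonly string[],
  random: () => number = Math.random
): Record<string, string[]> {
  if (groupNames.length === 0) {
    throw new RangeError("at least one group name is required");
  }
  const groups: Record<string, string[]> = {};
  for (const name of groupNames) groups[name] = [];
  shuffle(guests, random).forEach((guest, index) => {
    groups[groupNames[index % groupNames.length]].push(guest);
  });
  return groups;
}
```

The returned object maps each group name to its members, which is the kind of payload a UI could use to create one tab per group.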
Wingman Protocols: Users can set secret instructions ("always suggest dive bars", "roast me constantly") that are injected into the ElevenLabs system prompt on every session start. TCG already knows you before you say a word.
Trophy Room: When a moment lands, TCG proactively captures it: a screenshot via html2canvas, uploaded to Supabase Storage, saved as a Trophy with a generated viral caption, and shared from /trophy-room.
Try It
thecruisegod.vercel.app
Open it on one device. Share the QR code with your friends. Tell TCG where you are.
Let it run the night.


