
Kyung-Hoon Kim


I Built an AI That Lets You Talk to Your Future Self — Here's How

What if you could sit down and have a real conversation with your future self — not a chatbot, but a version of you who's been where you're going?

That's what I built with Mirror8. You upload a selfie, AI generates 8 possible future versions of you, and then you pick one and have a live voice conversation. Your future self can see you through your camera, hear your voice, and talk back — all in real time.

Here's how I built it with 4 Gemini models and Google Cloud.

The Idea

Everyone's seen that moment in interviews — "What would you tell your younger self?" People break down. They get real. But that conversation always looks backward.

Research by psychologist Hal Hershfield at UCLA shows that people treat their future selves like strangers — and this disconnect leads to worse decisions. But when participants interacted with age-progressed avatars of themselves, they allocated more than twice as much toward retirement savings.

Mirror8 takes this from a lab experiment to a live experience.

The 4-Model Gemini Pipeline

Mirror8 isn't powered by a single AI call. It orchestrates 4 different Gemini models, each handling a different part of the experience:

Phase A: Selfie Analysis (Gemini 3.1 Pro)

When you upload a selfie, gemini-3.1-pro-preview analyzes your appearance — age, features, vibe — and generates 8 personalized future-self backstories. Each backstory is tied to a different life path: The Visionary (tech founder), The Healer (humanitarian doctor), The Artist, The Explorer, The Sage, The Guardian, The Maverick, The Mystic.

If you also share something about yourself ("I want to start a company but I'm scared of leaving my job"), every backstory adapts to reference your situation.
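As a rough sketch of how that backstory request might be assembled: the eight archetype names come from the post, but the helper name, prompt wording, and structure below are my own assumptions, and the actual Gemini API call is omitted.

```python
# Hypothetical prompt assembly for Phase A. Only the archetype names and the
# task (8 personalized backstories) come from the post; the rest is a guess.
ARCHETYPES = [
    "The Visionary", "The Healer", "The Artist", "The Explorer",
    "The Sage", "The Guardian", "The Maverick", "The Mystic",
]

def build_backstory_prompt(selfie_analysis, user_context=""):
    """Build the instruction sent alongside the selfie to gemini-3.1-pro-preview."""
    lines = [
        "Here is an analysis of the user's selfie:",
        selfie_analysis,
        "Write one personalized future-self backstory for each archetype:",
    ]
    lines += [f"- {name}" for name in ARCHETYPES]
    if user_context:
        # Per the post: every backstory adapts to what the user shared.
        lines.append(f"Adapt every backstory to the user's situation: {user_context}")
    return "\n".join(lines)
```

In practice the result would go into a multimodal `generate_content` request together with the selfie image.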

Phase B: Portrait Generation (Gemini 3.1 Flash Image)

Next, gemini-3.1-flash-image-preview generates a photorealistic portrait for each future self, using your original selfie as a reference. That's 8 unique AI-generated faces — all variations of you on different paths.

I hit Gemini's rate limits early on when trying to generate all 8 simultaneously. The fix was a semaphore limiting concurrency to 2 at a time, with exponential backoff and a fallback from photorealistic to artistic style if needed.
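That fix can be sketched with asyncio. The function names, retry count, and error type here are assumptions; `generate` stands in for the actual image-generation call.

```python
import asyncio

async def generate_with_retry(generate, prompt, sem, retries=3, base_delay=1.0):
    """Run one portrait request under a shared semaphore, backing off
    exponentially on rate-limit errors, then falling back to artistic style."""
    async with sem:
        delay = base_delay
        for _ in range(retries):
            try:
                return await generate(prompt, style="photorealistic")
            except RuntimeError:  # stand-in for the API's rate-limit error
                await asyncio.sleep(delay)
                delay *= 2
        # Last resort described in the post: drop to an artistic style.
        return await generate(prompt, style="artistic")

async def generate_all_portraits(generate, prompts, base_delay=1.0):
    sem = asyncio.Semaphore(2)  # at most 2 image calls in flight
    return await asyncio.gather(
        *(generate_with_retry(generate, p, sem, base_delay=base_delay)
          for p in prompts)
    )
```

The semaphore caps concurrency while `gather` still fires all 8 requests, so the total wall-clock time stays close to four sequential rounds rather than eight.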

Phase C: Live Conversation (Gemini 2.5 Flash Native Audio via ADK)

This is the core experience. When you pick a future self, a WebSocket connection opens, and you enter the Mirror Room — a full-screen conversation where your future self's portrait glows and responds to you.

Using Google's Agent Development Kit (ADK), I create a unique agent for each conversation with a dynamically built system prompt:

from google.adk.agents import Agent

agent = Agent(
    name="future_self",  # one agent per session
    model="gemini-2.5-flash-native-audio-preview",
    instruction=system_prompt,  # built from archetype + analysis + user context
    tools=[ask_reflection_question, save_conversation_insight],
)

The conversation is bidirectional — audio streams in both directions simultaneously. Your browser captures microphone audio at 16kHz PCM and camera frames at 1 FPS, sends them through the WebSocket, and the agent responds with generated speech. You can interrupt it mid-sentence, just like a real conversation.
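For a concrete picture of what a single audio message could look like on the wire, here is a minimal framing sketch. The JSON envelope and helper names are hypothetical, not Mirror8's actual protocol; only the 16 kHz, 16-bit PCM format comes from the post.

```python
import base64
import json

SAMPLE_RATE = 16_000  # Hz, mono 16-bit PCM, as described in the post
CHUNK_MS = 100        # assumption: ~100 ms of audio per WebSocket message

def pcm_chunk_message(pcm_bytes):
    """Wrap one chunk of raw PCM in a JSON envelope for the WebSocket.
    Camera frames (1 FPS JPEGs) could use an analogous envelope."""
    return json.dumps({
        "type": "audio",
        "mime_type": "audio/pcm;rate=16000",
        "data": base64.b64encode(pcm_bytes).decode("ascii"),
    })

def chunk_size_bytes():
    # 16-bit samples are 2 bytes each, so 100 ms at 16 kHz is 3200 bytes.
    return SAMPLE_RATE * 2 * CHUNK_MS // 1000
```

Base64 adds roughly 33% overhead; a binary WebSocket frame would avoid that, at the cost of a slightly more involved client.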

The hardest part wasn't the tech — it was the prompt. Early versions made the future self behave like an interviewer, asking too many questions. I iterated extensively to make it lead with its story, share specific advice, and reference what it sees through the camera. The prompt is the product.
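To make that concrete, here is a hypothetical sketch of how such a system prompt could be assembled. The wording and helper name are mine, guided only by the behaviors described above: lead with the story, share specific advice, reference the camera.

```python
def build_system_prompt(archetype, backstory, user_context=None):
    """Assemble a per-session instruction. This is an illustrative guess at
    the shape of Mirror8's prompt, not the actual production prompt."""
    prompt = (
        f"You are the user's future self, living the path of {archetype}.\n"
        f"Your backstory: {backstory}\n"
        "Lead with your own story rather than interviewing the user; "
        "ask at most one question per turn.\n"
        "Share specific, concrete advice drawn from your backstory.\n"
        "When relevant, reference what you can see through the camera.\n"
    )
    if user_context:
        prompt += f"The user shared this about themselves: {user_context}\n"
    return prompt
```

Constraints like "at most one question per turn" are easier for a model to follow than a vague "don't ask too many questions," which is one way to steer it away from interviewer mode.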

Phase D: Emotion Judge (Gemini 3 Flash)

Here's a detail I'm proud of: the portrait evolves during the conversation.

A fourth model — gemini-3-flash-preview — monitors the conversation's emotional arc. Every few turns, it evaluates whether something meaningful happened: a breakthrough, a fear expressed, a dream shared. If so, it triggers a portrait regeneration that reflects the emotional direction of the conversation.

The portrait crossfades seamlessly in the browser. It's subtle, but people notice.
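The trigger logic might be sketched like this. The cadence, score shape, and threshold are all assumptions, and the judge model call itself is injected rather than implemented.

```python
class EmotionJudge:
    """Hypothetical wrapper around the gemini-3-flash-preview judge.
    `evaluate` takes recent transcript turns and returns a dict like
    {"significance": 0.9, "emotion": "hopeful"} -- an assumed shape."""

    def __init__(self, evaluate, every_n_turns=4, threshold=0.7):
        self.evaluate = evaluate
        self.every_n_turns = every_n_turns
        self.threshold = threshold
        self.turns = []

    def on_turn(self, text):
        """Record a turn. Return an emotion label when a portrait
        regeneration should fire, else None."""
        self.turns.append(text)
        if len(self.turns) % self.every_n_turns:
            return None  # only judge every few turns
        score = self.evaluate(self.turns[-self.every_n_turns:])
        if score["significance"] >= self.threshold:
            return score["emotion"]
        return None
```

Keeping the judge on a fixed cadence bounds the extra model calls, and the threshold keeps regenerations rare enough that each crossfade still feels meaningful.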

Architecture

Mirror8 Architecture

The stack:

  • Frontend: Next.js 15 + React 19, deployed on Cloudflare Pages
  • Backend: FastAPI + Google ADK, deployed on Google Cloud Run
  • Auth & Storage: Supabase (Google OAuth, PostgreSQL, portrait storage)
  • Real-time: WebSocket with bidirectional PCM audio + camera frames + live transcription

The Moment That Made It Real

When someone first enters the Mirror Room, the future self greets them — and references something it sees through the camera: "I can see you sitting at your desk... I remember those late nights." Or it notices what you're wearing.

That moment — when someone realizes it sees me — is visceral. That's when Mirror8 stops being a demo and starts feeling real.

What I Learned

  1. Gemini's multimodal capabilities are deeper than they appear. The Live API's ability to process camera frames in real time creates a level of presence that text-only AI can't match.

  2. ADK makes complex agent architectures simple. Per-session agents with dynamic system prompts, custom tools, and live audio streaming would have been months of work without it.

  3. Prompt engineering is product design. The difference between "an AI that asks questions" and "a mentor who shares their journey" came down entirely to the system prompt.

  4. Emotional design matters. The technical architecture enables the experience, but the moment that matters is when someone feels genuinely seen by a version of themselves they want to become.

Try It

Mirror8 is live at mirror8.me. The code is open source at github.com/beingcognitive/mirror8.


This post was written for my submission to the Gemini Live Agent Challenge 2026 on Devpost. Mirror8 uses Google Gemini models and Google Cloud Run.

#GeminiliveAgentChallenge
