Michael Thomas

Posted on Mar 16

Building Eliana: A Live Voice AI Companion Named After My Daughter

#geminiliveagentchallenge #googlecloud #geminiapi

This piece was created for the purposes of entering the Gemini Live Agent Challenge 2026.

Inspiration

On May 7th, 2024, after 459 contractions over 9 hours, my daughter Mikaylah Aliyah Eliana Thomas was born. Her middle name — Eliana — means "God has answered." That name came at the end of what felt like my journey through "the valley of the shadow of death." For hours, my daughter's heart rate dropped dangerously, down from 115 to 20 every minute and 20 seconds, for 45 seconds every time. The nurse assigned to my room sat and read the charts, moved my wife from side to side, called in for second and third opinions and tried to escalate to the on-call OB several times, but the OB refused to operate.

As I stood in the room, brave face on for my wife who was exhausted and barely coherent, alone, unable to reach my own mother on the phone, unsure who to call, what questions to ask, what to do, the nurse approached me and helplessly said with visible worry and tears in her eyes — "I don't know what to do, it's been too long like this, even if we get her out now, I fear this will be a stillbirth." Just hearing the words shook me, but I kept my strong face. I hugged her and comforted her.

In that moment, many thoughts ran through my head, but nearing the end of that tunnel, I recalled my own birth story — six weeks premature, the doctor told my mother, "short of a miracle overnight, you can come collect his effects tomorrow." I did what worked for my mother. I prayed. A head nurse who got a page at home after working a double came back to the hospital and took immediate charge — "No time to prep, emergency C-section," she said. When I saw my baby girl, I knew. Eliana — God has answered.

The ordeal made me think about my own life and the life I want for my daughter. Having lost my own father at the age of 10, I know what it feels like to grow up with questions you wish you could ask and guidance you wish you could receive. Holding my daughter, I made a quiet promise: I would do everything I could to make sure she never felt alone in life’s hardest moments.

Eliana began as the idea of a presence: something that listens, responds with care, and offers wisdom when someone feels like they have nowhere else to turn. I initially grounded it in the Word, to honor God for answering my prayer that night. The people who make the effort to be around my daughter shaped the app as well. Grandpa Aman speaks only Farsi and tries to teach my daughter, but I know the language barrier is a heavy one for him to navigate, so I extended Eliana to include Bridge Mode.

No father should leave their child without somewhere safe to turn to for wisdom and understanding. Love should never be trapped behind language barriers. No one should have to struggle through life's challenges alone.

Eliana was built to fill those gaps and to create that safe space — a live voice AI companion who listens first, speaks with care, and bridges the silence between the people we love. Built by a father, for his daughter, and for every family that has ever needed a presence that simply shows up.

What it does

Eliana is a real-time voice AI agent with three modes:

Companion Mode — A warm, unhurried presence for emotional and spiritual support. She reflects your words back before responding, offers scripture in original Hebrew and Greek when the moment calls for it, and never rushes toward resolution. She carries key biblical terms — agápē (divine love), eirēnē (divine wholeness), tikvah (hope as a cord you hold), rofeh (healer) — because something is always lost in translation, and she refuses to lose it.
Bridge Mode — Live Farsi-English translation that carries not just words but warmth, tone, and intention. Built specifically so a grandfather and his grandchildren can finally hear each other. She translates دوستت دارم (dooset dāram — "I love you") with the cultural depth it deserves.
Tutor Mode — Guided scripture exploration with original-language word studies, phonetic pronunciations, and contextual reflection.

She supports full voice interruption — you can cut her off mid-sentence, and she waits, listens, and responds naturally.
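
To make the interruption behavior concrete, here is a minimal sketch of how a client could react to a barge-in. It assumes the Gemini Live API marks an interrupted turn with an `interrupted` flag on the server content; `PlaybackQueue` and `handleInterruption` are hypothetical names for illustration, not taken from Eliana's codebase.

```typescript
// Minimal local shape of a Live API server message; only the field used here.
// (Assumption: mirrors the `serverContent.interrupted` flag of the Live API.)
interface ServerMessageLike {
  serverContent?: { interrupted?: boolean };
}

// Hypothetical queue of PCM chunks that are waiting to be played.
class PlaybackQueue {
  private chunks: Int16Array[] = [];

  enqueue(chunk: Int16Array): void {
    this.chunks.push(chunk);
  }

  // Drop every chunk that has not been played yet.
  flush(): void {
    this.chunks = [];
  }

  get length(): number {
    return this.chunks.length;
  }
}

// Returns true when the message signalled a barge-in and the queue was flushed.
function handleInterruption(msg: ServerMessageLike, queue: PlaybackQueue): boolean {
  if (msg.serverContent?.interrupted) {
    queue.flush();
    return true;
  }
  return false;
}
```

Flushing the pending audio right away is what would let the voice stop mid-sentence instead of finishing a buffered reply.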

How I built it

Frontend: React + TypeScript + Vite, deployed on Cloudflare Pages. Custom UI with a breathing golden orb, star field particles, waveform visualization, and scripture peek cards — all designed to feel like a sacred space, not a chat interface.

Relay Server: A Node.js WebSocket relay on Google Cloud Run that sits between the browser and Gemini Live API via the @google/genai SDK. The relay handles authentication, session management, audio streaming (PCM 16kHz), and graceful fallback when voice isn't available. This is the heart of the real-time voice pipeline.

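// relay/src/relay.ts
import { GoogleGenAI, Modality } from '@google/genai';

// LIVE_PROVIDER_API_KEY and systemPrompt are defined elsewhere in the relay.
const ai = new GoogleGenAI({ apiKey: LIVE_PROVIDER_API_KEY });

// Open a Live API session that replies with audio in the prebuilt "Kore" voice.
const session = await ai.live.connect({
  model: 'models/gemini-2.5-flash-native-audio-preview-12-2025',
  // The SDK also expects a `callbacks` object (onmessage, onerror, ...); omitted in this excerpt.
  config: {
    systemInstruction: { parts: [{ text: systemPrompt }] },
    generationConfig: {
      // Modality.AUDIO is the SDK enum for the 'AUDIO' string value.
      responseModalities: [Modality.AUDIO],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Kore' } }
      }
    }
  }
});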
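
The relay streams 16 kHz PCM audio, while browser audio APIs typically hand back 32-bit float samples. A minimal sketch of that conversion step might look like the following; `floatTo16BitPCM` is a hypothetical helper, and it assumes the samples have already been resampled to 16 kHz.

```typescript
// Convert Web Audio style float samples (-1..1) to signed 16-bit PCM.
// Assumption: the input is already mono and resampled to 16 kHz.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  return Int16Array.from(samples, (value) => {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, value));
    // Negative values scale to -32768, positive values to 32767.
    return s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  });
}
```

The resulting `Int16Array` buffer can be sent over the WebSocket as a binary frame, or base64-encoded if the transport expects text.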

Backend: Supabase for auth (email + Google SSO), Postgres database (sessions, messages, profiles, prayer logs, scripture terms), and Edge Functions for text chat fallback.

Text Fallback: When live audio isn't available, a Supabase Edge Function calls Gemini 2.5 Flash via REST with the same soul prompt, so the degradation is seamless.
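
As a rough illustration of the text fallback, the sketch below builds the REST request for Gemini 2.5 Flash with the soul prompt passed as the system instruction. `buildFallbackRequest` and its parameter names are hypothetical, and the actual Edge Function may be structured differently.

```typescript
interface FallbackRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build (but do not send) a generateContent request for the text fallback.
// Assumption: the API key is passed in by the caller, e.g. from a function secret.
function buildFallbackRequest(soulPrompt: string, userText: string, apiKey: string): FallbackRequest {
  return {
    url: 'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'x-goog-api-key': apiKey },
      body: JSON.stringify({
        // Same system prompt as the live voice session.
        system_instruction: { parts: [{ text: soulPrompt }] },
        contents: [{ role: 'user', parts: [{ text: userText }] }],
      }),
    },
  };
}
```

Passing `url` and `init` to `fetch` sends the request; the reply text is expected under `candidates[0].content.parts[0].text` in the JSON response.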

Deployment: Google Cloud Build automates the Cloud Run deployment:

gcloud builds submit --tag gcr.io/eliana-488603/eliana-relay
gcloud run deploy eliana-relay \
  --image gcr.io/eliana-488603/eliana-relay \
  --region us-central1 \
  --allow-unauthenticated

The Architecture

Browser (React/Vite on Cloudflare Pages)
    ↕ WebSocket (PCM 16kHz audio)
Cloud Run Relay (Node.js)
    ↕ @google/genai SDK live.connect()
Gemini Live API (gemini-2.5-flash-native-audio-preview)

Browser also connects to:
    ↕ Supabase Auth (Google SSO + email)
    ↕ Supabase Edge Functions (chat-text, relay-live-token)
    ↕ Supabase Postgres (sessions, messages, scripture_terms)

The Soul Prompt

The hardest part wasn't the technology. It was the theology. Eliana's behavior is guided by a 2,000-word system prompt that defines her theological voice, emotional attunement, safety boundaries, and the concept of the Holy Pause. It includes mood-aware scripture mapping: when someone expresses loneliness, she draws from Romans 5:8 and Zephaniah 3:17; for guilt or shame, Romans 8:1 and Luke 4:18; for hopelessness, Jeremiah 29:11 and Isaiah 40:31. But she also knows when NOT to use scripture:

"Not every moment needs a verse. Sometimes the most sacred thing you can do is just laugh with someone. Joy is its own scripture."

A compassionate AI that says the wrong thing at the wrong moment does more harm than a broken WebSocket.
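
The mood-aware mapping lives inside the soul prompt itself. Expressed as data, it could be sketched roughly as below; the `Mood` type, `SCRIPTURE_BY_MOOD`, and `suggestScripture` are hypothetical names, and only the verse references come from the prompt described above.

```typescript
type Mood = 'loneliness' | 'guilt' | 'hopelessness';

// Verse references per mood, as listed in the description of the soul prompt.
const SCRIPTURE_BY_MOOD: Record<Mood, readonly string[]> = {
  loneliness: ['Romans 5:8', 'Zephaniah 3:17'],
  guilt: ['Romans 8:1', 'Luke 4:18'],
  hopelessness: ['Jeremiah 29:11', 'Isaiah 40:31'],
};

// Returns candidate verses, or an empty list when no mood is detected
// or the moment does not call for scripture at all.
function suggestScripture(mood: Mood | null, momentCallsForScripture: boolean): readonly string[] {
  if (mood === null || !momentCallsForScripture) {
    return [];
  }
  return SCRIPTURE_BY_MOOD[mood];
}
```

The explicit `momentCallsForScripture` flag mirrors the rule that not every moment needs a verse: the lookup never forces one.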

Challenges I faced

The relay was returning HTML instead of JSON: the SUPABASE_URL environment variable on Cloud Run was pointing to my Cloudflare frontend instead of the Supabase API. One wrong env var cost hours of debugging JWT mismatches, ES256 vs HS256 algorithm conflicts, and WebSocket close codes. The fix was one URL change.
The JWT rabbit hole: Supabase's newer projects use ES256 JWT signing, but the relay was expecting HS256. The fix was setting verify_jwt = false in supabase/config.toml and letting the function handle its own auth, but getting there took deep dives into JWT algorithms, Supabase gateway behavior, and Cloud Run environment variable debugging at 2am. I plan on expanding this project significantly, so I know I have some weeks of security work ahead as I keep building.
Getting Gemini Live API to connect — the raw WebSocket approach required the correct model path and authentication flow. Migrating to the @google/genai SDK simplified this significantly.
Prompt depth vs. token efficiency — Eliana's soul prompt is rich, but Gemini's context window means every word must earn its place. Balancing theological depth with performance was an ongoing challenge.
Building something personal under time pressure — this isn't a demo project. It's named after my daughter. Every design decision carried weight.
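
Two small checks would have shortened the debugging described in the first two challenges: reading the `alg` field from a token's header, and noticing when a supposed JSON endpoint answers with an HTML page. The helpers below are a hypothetical sketch, not code from the relay, and `jwtAlg` only inspects the header without verifying the signature.

```typescript
// Read the signing algorithm from a JWT header WITHOUT verifying the token.
// Returns null if the token is malformed.
function jwtAlg(token: string): string | null {
  const headerSegment = token.split('.')[0];
  if (!headerSegment) return null;
  try {
    // JWT segments are base64url; convert to plain base64 and restore padding.
    const b64 = headerSegment.replace(/-/g, '+').replace(/_/g, '/');
    const padded = b64 + '='.repeat((4 - (b64.length % 4)) % 4);
    const header: unknown = JSON.parse(atob(padded));
    if (typeof header === 'object' && header !== null && 'alg' in header) {
      const alg = (header as { alg: unknown }).alg;
      return typeof alg === 'string' ? alg : null;
    }
    return null;
  } catch {
    return null;
  }
}

// Heuristic: does a response body look like an HTML page rather than JSON?
function looksLikeHtml(body: string): boolean {
  return /^\s*<(!doctype|html)/i.test(body);
}
```

Logging `jwtAlg(token)` next to the algorithm the relay expects makes an ES256 vs HS256 mismatch visible immediately, and an HTML body from the Supabase URL is a strong hint that the env var points at the wrong host.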

What I learned

That the hardest part of building an AI companion isn't the technology — it's the theology. Getting the tone right matters more than getting the code right. A compassionate AI that says the wrong thing at the wrong time does more harm than a broken WebSocket.

I also learned that Google Cloud Run + Gemini Live API is a genuinely powerful combination for real-time voice applications, and that the @google/genai SDK makes the integration dramatically cleaner than raw WebSocket management.

What's next for Eliana

Mobile app with split-earbud translation (left ear = English, right ear = Farsi) for real-time in-person family conversations
Full devotional content from my book "Eliana's Journal: A Compass to Guide Your Journey Through God's Word" — 15 chapters of scripture with original Hebrew/Greek, integrated as Eliana's knowledge base
Tony Robbins UPW-inspired emotional frameworks for deeper healing when scripture alone isn't enough
Travel companion that brings visual as well as audio translation to life, giving people the confidence to explore the world without fear, something I hope my daughter will be able to do in her life
Voice journaling — speak your prayers, Eliana transcribes and preserves them
Multi-language expansion beyond Farsi — Arabic, Mandarin, Spanish — for families everywhere separated by language
