<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Talha Tahir</title>
    <description>The latest articles on DEV Community by Talha Tahir (@talha666tahir).</description>
    <link>https://dev.to/talha666tahir</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2393645%2Fca86f2f9-c178-4bc3-b8df-40d247d533aa.png</url>
      <title>DEV Community: Talha Tahir</title>
      <link>https://dev.to/talha666tahir</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/talha666tahir"/>
    <language>en</language>
    <item>
      <title>Building Momentum — AI Study Sprint with MeDo 🚀</title>
      <dc:creator>Talha Tahir</dc:creator>
      <pubDate>Wed, 20 May 2026 10:02:45 +0000</pubDate>
      <link>https://dev.to/talha666tahir/building-momentum-ai-study-sprint-with-medo-4k2l</link>
      <guid>https://dev.to/talha666tahir/building-momentum-ai-study-sprint-with-medo-4k2l</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Learning has never been more accessible.&lt;/p&gt;

&lt;p&gt;But staying consistent?&lt;/p&gt;

&lt;p&gt;That’s still one of the hardest challenges for learners.&lt;/p&gt;

&lt;p&gt;Most people struggle with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Planning what to study next
&lt;/li&gt;
&lt;li&gt;Staying consistent over time
&lt;/li&gt;
&lt;li&gt;Maintaining focus during study sessions
&lt;/li&gt;
&lt;li&gt;Avoiding overwhelm from large goals
&lt;/li&gt;
&lt;li&gt;Turning knowledge into daily action
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional productivity apps usually stop at task management.&lt;br&gt;&lt;br&gt;
Traditional learning platforms usually stop at content delivery.&lt;/p&gt;

&lt;p&gt;We wanted to build something that bridges the gap between:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I want to learn this”&lt;br&gt;&lt;br&gt;
and&lt;br&gt;&lt;br&gt;
“I am consistently making progress every day”&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Idea: Momentum — AI Study Sprint
&lt;/h2&gt;

&lt;p&gt;Momentum is an AI-powered learning accelerator that transforms learning goals into structured daily study systems.&lt;/p&gt;

&lt;p&gt;Users simply enter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What they want to learn&lt;/li&gt;
&lt;li&gt;Their deadline&lt;/li&gt;
&lt;li&gt;Their current skill level&lt;/li&gt;
&lt;li&gt;Their available study hours per day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The platform then generates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI-powered learning roadmaps
&lt;/li&gt;
&lt;li&gt;Daily study sprints
&lt;/li&gt;
&lt;li&gt;Focus sessions (Pomodoro-based)
&lt;/li&gt;
&lt;li&gt;Progress tracking and analytics
&lt;/li&gt;
&lt;li&gt;AI-generated quizzes and revision workflows
&lt;/li&gt;
&lt;li&gt;Adaptive learning recommendations
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was to make the app feel less like a planner and more like an &lt;strong&gt;AI study copilot&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why We Built It
&lt;/h2&gt;

&lt;p&gt;As developers and learners, we constantly face the same problem:&lt;/p&gt;

&lt;p&gt;Learning is not the hard part — consistency is.&lt;/p&gt;

&lt;p&gt;Whether it’s:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Preparing for interviews
&lt;/li&gt;
&lt;li&gt;Learning new frameworks like Angular or .NET
&lt;/li&gt;
&lt;li&gt;Studying AI/ML concepts
&lt;/li&gt;
&lt;li&gt;Completing certifications
&lt;/li&gt;
&lt;li&gt;Building side projects
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real struggle is maintaining structure and momentum over time.&lt;/p&gt;

&lt;p&gt;That became the core idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Reduce the friction between learning goals and daily execution.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Building with MeDo
&lt;/h2&gt;

&lt;p&gt;This project was built using &lt;strong&gt;MeDo’s AI-powered no-code/low-code platform&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of manually coding every feature, we used conversational prompts to iteratively build the application.&lt;/p&gt;

&lt;p&gt;We used MeDo to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate dashboard layouts
&lt;/li&gt;
&lt;li&gt;Design onboarding flows
&lt;/li&gt;
&lt;li&gt;Build study workflows
&lt;/li&gt;
&lt;li&gt;Create analytics dashboards
&lt;/li&gt;
&lt;li&gt;Implement AI-generated learning systems
&lt;/li&gt;
&lt;li&gt;Refine UI/UX and responsiveness
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The development process became highly iterative:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate initial app structure
&lt;/li&gt;
&lt;li&gt;Refine UI and layout
&lt;/li&gt;
&lt;li&gt;Improve AI roadmap generation
&lt;/li&gt;
&lt;li&gt;Add focus and analytics systems
&lt;/li&gt;
&lt;li&gt;Polish UX and responsiveness
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One key insight:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The quality of prompts directly affects the quality of the application.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Treating MeDo like an AI development partner (not just a generator) made a huge difference.&lt;/p&gt;




&lt;h2&gt;
  
  
  Features We Focused On
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🧠 AI Roadmap Generation
&lt;/h3&gt;

&lt;p&gt;Users can input goals like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Prepare for a .NET interview in 14 days”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The system generates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured learning roadmap
&lt;/li&gt;
&lt;li&gt;Daily study tasks
&lt;/li&gt;
&lt;li&gt;Milestones and checkpoints
&lt;/li&gt;
&lt;li&gt;Revision cycles
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🎯 Daily Study Sprints
&lt;/h3&gt;

&lt;p&gt;A focused dashboard showing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Today’s tasks
&lt;/li&gt;
&lt;li&gt;Priority items
&lt;/li&gt;
&lt;li&gt;Estimated study time
&lt;/li&gt;
&lt;li&gt;Completion tracking
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  ⏳ Focus Mode
&lt;/h3&gt;

&lt;p&gt;A distraction-free Pomodoro-based focus system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Study sessions
&lt;/li&gt;
&lt;li&gt;Break timers
&lt;/li&gt;
&lt;li&gt;Session tracking
&lt;/li&gt;
&lt;li&gt;Focus streaks
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📊 Analytics &amp;amp; Gamification
&lt;/h3&gt;

&lt;p&gt;To keep users motivated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Study streaks
&lt;/li&gt;
&lt;li&gt;Progress charts
&lt;/li&gt;
&lt;li&gt;Completion stats
&lt;/li&gt;
&lt;li&gt;XP-style gamification system
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was to make learning feel engaging and rewarding.&lt;/p&gt;




&lt;h2&gt;
  
  
  Challenges We Faced
&lt;/h2&gt;

&lt;p&gt;One of the biggest challenges was balancing &lt;strong&gt;feature richness with simplicity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We had many ideas, but we needed to keep the experience clean and intuitive.&lt;/p&gt;

&lt;p&gt;Another challenge was improving AI-generated study plans so they felt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;realistic
&lt;/li&gt;
&lt;li&gt;actionable
&lt;/li&gt;
&lt;li&gt;structured
&lt;/li&gt;
&lt;li&gt;not generic
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We iterated heavily on prompts and workflow design to improve quality.&lt;/p&gt;

&lt;p&gt;UI/UX polish was also important — we wanted the app to feel like a real SaaS product, not a hackathon prototype.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Learned
&lt;/h2&gt;

&lt;p&gt;This project changed how we think about building software.&lt;/p&gt;

&lt;p&gt;Instead of focusing only on writing code, we spent more time on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product thinking
&lt;/li&gt;
&lt;li&gt;UX design
&lt;/li&gt;
&lt;li&gt;Workflow design
&lt;/li&gt;
&lt;li&gt;Prompt engineering
&lt;/li&gt;
&lt;li&gt;Rapid iteration
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We learned that AI-assisted development shifts the developer’s role from “builder of everything” to “designer of systems.”&lt;/p&gt;




&lt;h2&gt;
  
  
  What’s Next
&lt;/h2&gt;

&lt;p&gt;We plan to evolve Momentum into a more advanced AI learning ecosystem with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversational AI tutors
&lt;/li&gt;
&lt;li&gt;Smarter adaptive learning systems
&lt;/li&gt;
&lt;li&gt;Spaced repetition memory features
&lt;/li&gt;
&lt;li&gt;Voice-based planning
&lt;/li&gt;
&lt;li&gt;Collaborative study rooms
&lt;/li&gt;
&lt;li&gt;Deeper analytics and insights
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The long-term vision is an &lt;strong&gt;AI study copilot&lt;/strong&gt; that actively helps users stay focused, motivated, and consistent throughout their entire learning journey.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The most exciting part of this hackathon wasn’t just building the app.&lt;/p&gt;

&lt;p&gt;It was realizing how powerful AI-driven development has become.&lt;/p&gt;

&lt;p&gt;We were able to go from idea → working product extremely quickly, focusing more on experience and product thinking rather than boilerplate development.&lt;/p&gt;

&lt;p&gt;That shift feels like the future of software development.&lt;/p&gt;

&lt;h1&gt;
  
  
  BuiltWithMeDo 🚀
&lt;/h1&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>learning</category>
      <category>showdev</category>
    </item>
    <item>
      <title>An AI That Turns a Parent's Voice Into a Personalised Illustrated Storybook</title>
      <dc:creator>Talha Tahir</dc:creator>
      <pubDate>Mon, 16 Mar 2026 20:22:20 +0000</pubDate>
      <link>https://dev.to/talha666tahir/an-ai-that-turns-a-parents-voice-into-a-personalised-illustrated-storybook-4ik2</link>
      <guid>https://dev.to/talha666tahir/an-ai-that-turns-a-parents-voice-into-a-personalised-illustrated-storybook-4ik2</guid>
      <description>&lt;h1&gt;
  
  
  I Built an AI That Turns a Parent's Voice Into a Personalized Illustrated Storybook — Here's How
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;From Gemini Live API to real-time SSE streaming on Cloud Run — the full technical story of DreamBook&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;My daughter went through a dinosaur phase. Then a space phase. Then superheroes. No bookstore in the world can keep up with a six-year-old's rotating obsessions — and even if one could, it still wouldn't know her name, her fears, or the lesson I was trying to teach her that week.&lt;/p&gt;

&lt;p&gt;That's the problem DreamBook solves.&lt;/p&gt;

&lt;p&gt;You speak. You say something like &lt;em&gt;"Emma, age five, loves dinosaurs and painting, scared of thunder, I want her to learn that being brave doesn't mean not being scared."&lt;/em&gt; Ninety seconds later, you have a fully illustrated, narrated, personalized storybook — text streaming in live, illustrations fading in as they generate, audio narration ready to play per page, and a PDF you can print and keep forever.&lt;/p&gt;

&lt;p&gt;I built this for the Gemini Live Agent Hackathon. This post is the full technical story — what I built, every bug I hit, and what I learned along the way.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;p&gt;Before we dive in, here's the full picture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend:&lt;/strong&gt; NestJS + TypeScript on Google Cloud Run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Next.js 15 (App Router) on Vercel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI:&lt;/strong&gt; &lt;code&gt;@google/genai&lt;/code&gt; SDK throughout

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;gemini-3.1-pro-preview&lt;/code&gt; for story generation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gemini-3.1-flash-image-preview&lt;/code&gt; (Nano Banana) for illustrations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gemini-2.5-flash-preview-tts&lt;/code&gt; for audio narration&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gemini-2.5-flash-native-audio-preview&lt;/code&gt; via Live API for voice input&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Infrastructure:&lt;/strong&gt; Firestore, Cloud Storage, Cloud Build, Firebase Auth&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;The system has three layers of real-time communication happening simultaneously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser
  ├── Socket.io WebSocket → NestJS /voice gateway → Gemini Live API
  │   (PCM audio chunks → real-time transcript → StoryRequest JSON)
  │
  └── fetch() SSE stream ← NestJS StoryController
      (page:text, page:image, page:audio events as they generate)

NestJS (Cloud Run)
  ├── GeminiService     → gemini-3.1-pro-preview (text streaming)
  ├── ImagenService     → Nano Banana (illustrations, concurrent)
  ├── TtsService        → gemini-2.5-flash-preview-tts (per page)
  └── PdfService        → pdf-lib → Cloud Storage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The interesting part is that text, illustrations, and audio all generate concurrently. Gemini streams the story text; each time an &lt;code&gt;[IMAGE: ...]&lt;/code&gt; directive appears in the stream, an illustration job fires immediately without waiting for the story to finish. TTS runs the same way. By the time Gemini finishes generating the last page of text, most of the illustrations and narrations are already done.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Voice Input Pipeline
&lt;/h2&gt;

&lt;p&gt;This is where the Gemini Live API comes in — and it's genuinely impressive.&lt;/p&gt;

&lt;p&gt;The browser captures raw PCM audio from the microphone using an &lt;code&gt;AudioWorklet&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Inline AudioWorklet — converts Float32 mic samples to Int16 PCM&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PCM_PROCESSOR_CODE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
class PcmProcessor extends AudioWorkletProcessor {
  process(inputs) {
    const input = inputs[0];
    if (!input || !input[0]) return true;
    const float32 = input[0];
    const int16 = new Int16Array(float32.length);
    for (let i = 0; i &amp;lt; float32.length; i++) {
      const s = Math.max(-1, Math.min(1, float32[i]));
      int16[i] = s &amp;lt; 0 ? s * 0x8000 : s * 0x7FFF;
    }
    this.port.postMessage(int16.buffer, [int16.buffer]);
    return true;
  }
}
registerProcessor('pcm-processor', PcmProcessor);
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Those PCM chunks stream over Socket.io to NestJS, which opens a Gemini Live session per user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;liveSession&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash-native-audio-preview-12-2025&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;responseModalities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;Modality&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AUDIO&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;inputAudioTranscription&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;  &lt;span class="c1"&gt;// enables real-time transcription&lt;/span&gt;
    &lt;span class="na"&gt;systemInstruction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You are a transcription assistant. Just transcribe.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;callbacks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;onmessage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MessageEvent&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;LiveServerMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;serverContent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;inputTranscription&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inputTranscription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;voice:transcript&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the user stops speaking, Gemini Flash extracts a structured &lt;code&gt;StoryRequest&lt;/code&gt; from the accumulated transcript:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Extract story params from: "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;". 
    Return raw JSON only: { childName, childAge, interests, pageCount, 
    illustrationStyle, language } — optionally lesson and fears.`&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
  &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing I noticed: the &lt;code&gt;inputTranscription.finished&lt;/code&gt; flag doesn't always fire for the native audio model. The fix is to accumulate all transcript chunks continuously rather than waiting for a "final" marker.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Story Generation Pipeline
&lt;/h2&gt;

&lt;p&gt;This is the core of the app. &lt;code&gt;gemini-3.1-pro-preview&lt;/code&gt; generates the story with a specific format requirement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
Write exactly &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;pageCount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; pages for a storybook about &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;childName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.
After each page, add an image directive:
[IMAGE: &amp;lt;detailed illustration prompt in &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;style&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; style&amp;gt;]
Only narrative text and [IMAGE:] directives. Nothing else.
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;streamResult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContentStream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-3.1-pro-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As chunks stream in, I scan for the &lt;code&gt;[IMAGE:]&lt;/code&gt; directive pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;IMAGE_DIRECTIVE_RE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\[&lt;/span&gt;&lt;span class="sr"&gt;IMAGE:&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*&lt;/span&gt;&lt;span class="se"&gt;([^\]]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;)\]&lt;/span&gt;&lt;span class="sr"&gt;/gi&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;streamResult&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;candidate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunkText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;candidate&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt;
    &lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nx"&gt;pageTextBuffer&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imageMatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;IMAGE_DIRECTIVE_RE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pageTextBuffer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageMatch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;pageNumber&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imagePrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;imageMatch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;narrativeText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pageTextBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;imageMatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Emit text immediately over SSE&lt;/span&gt;
    &lt;span class="nx"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;page:text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pageNumber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;narrativeText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// Fire illustration job concurrently (don't await)&lt;/span&gt;
    &lt;span class="nx"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;page:image&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pageNumber&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;imagePrompt&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;pageTextBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pageTextBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageMatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;imageMatch&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;IMAGE_DIRECTIVE_RE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lastIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; In &lt;code&gt;@google/genai&lt;/code&gt;, streaming chunks don't have a &lt;code&gt;.text&lt;/code&gt; property — you have to extract text from &lt;code&gt;candidates[0].content.parts[].text&lt;/code&gt;. This is different from the old &lt;code&gt;@google/generative-ai&lt;/code&gt; SDK where &lt;code&gt;chunk.text()&lt;/code&gt; was a method. This bug cost me several hours.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Illustrations with Nano Banana
&lt;/h2&gt;

&lt;p&gt;Nano Banana (&lt;code&gt;gemini-3.1-flash-image-preview&lt;/code&gt;) is a huge improvement over the old Imagen 4 Vertex AI REST approach. No access token fetching, no endpoint construction, just the SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-3.1-flash-image-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fullPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;responseModalities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IMAGE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imagePart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt;
  &lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inlineData&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;mimeType&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;image/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imageBase64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;imagePart&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inlineData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The image comes back as base64 inline data. I upload it to Cloud Storage and return a signed URL.&lt;/p&gt;




&lt;h2&gt;
  
  
  Audio Narration
&lt;/h2&gt;

&lt;p&gt;TTS uses &lt;code&gt;generateContent&lt;/code&gt; with audio response modality — no Live API needed here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.5-flash-preview-tts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;narratePrompt&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;responseModalities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AUDIO&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;speechConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;voiceConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;prebuiltVoiceConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;voiceName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Kore&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gemini TTS returns raw &lt;code&gt;audio/L16;codec=pcm;rate=24000&lt;/code&gt; — raw PCM bytes. Browsers can't play raw PCM directly, so I wrap it in a WAV header:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nf"&gt;pcmToWav&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pcm&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sampleRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;24000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;header&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;alloc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;44&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RIFF&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt32LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;36&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;pcm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;WAVE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fmt &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt32LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// PCM chunk size&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt16LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// PCM format&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt16LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;22&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// mono&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt32LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sampleRate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt32LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sampleRate&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// byte rate&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt16LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// block align&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt16LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// 16-bit&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;36&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeUInt32LE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pcm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pcm&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The SSE Streaming Problem
&lt;/h2&gt;

&lt;p&gt;This was the most painful bug of the entire project.&lt;/p&gt;

&lt;p&gt;Everything worked perfectly locally. On Cloud Run, the frontend received pings every 15 seconds but zero page events — even though the server logs showed pages being emitted correctly.&lt;/p&gt;

&lt;p&gt;The cause: &lt;strong&gt;Cloud Run's Google load balancer buffers HTTP/2 responses&lt;/strong&gt; for compression efficiency. SSE events were being held in a buffer and only released when the buffer filled or the connection closed — which meant the entire story arrived in one batch after generation finished, not page by page.&lt;/p&gt;

&lt;p&gt;The fix required three things together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Disable compression entirely&lt;/span&gt;
&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Encoding&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;identity&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Disable nginx/proxy buffering&lt;/span&gt;
&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Accel-Buffering&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;no&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;no-cache, no-store, no-transform&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// 3. Explicitly flush after EVERY write&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flush&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;flush&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;write&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`event: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\ndata: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;\n\n`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// ← this is the critical one&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any one of these alone wasn't sufficient. All three together fixed it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The GCS Signed URL Problem
&lt;/h2&gt;

&lt;p&gt;On Cloud Run, &lt;code&gt;getSignedUrl()&lt;/code&gt; from &lt;code&gt;@google-cloud/storage&lt;/code&gt; throws:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Permission 'iam.serviceAccounts.signBlob' denied
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This doesn't happen locally because your service account JSON file handles signing. On Cloud Run, the compute service account needs explicit permission:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud projects add-iam-policy-binding YOUR_PROJECT &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--member&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"serviceAccount:YOUR_PROJECT_NUMBER-compute@developer.gserviceaccount.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"roles/iam.serviceAccountTokenCreator"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You also need to tell the GCS client which service account to use when signing. I auto-fetch this from the GCP metadata server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Metadata-Flavor&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Google&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;serviceAccountEmail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then pass it to &lt;code&gt;getSignedUrl()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsPath&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;getSignedUrl&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;read&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;expires&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;serviceAccountEmail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;serviceAccountEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Firebase Auth Race Condition
&lt;/h2&gt;

&lt;p&gt;On Vercel production, every API call on first page load returned &lt;code&gt;401 Not authenticated&lt;/code&gt; — even for logged-in users.&lt;/p&gt;

&lt;p&gt;Firebase's &lt;code&gt;auth.currentUser&lt;/code&gt; is &lt;code&gt;null&lt;/code&gt; on first render, even when a session exists in localStorage. Firebase needs a tick to rehydrate. The &lt;code&gt;onAuthStateChanged&lt;/code&gt; listener is the right way to wait for it, but there's a subtlety: on first load, Firebase emits &lt;code&gt;null&lt;/code&gt; immediately before emitting the real user. If you reject on the first emission, you'll always get 401.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;waitForAuth&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;User&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentUser&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentUser&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;settled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Wait for definitive state — ignore initial null&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;unsub&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;onAuthStateChanged&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;settled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;settled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nf"&gt;unsub&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="c1"&gt;// If null: Firebase still initialising — keep waiting&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// 8-second timeout&lt;/span&gt;
    &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;settled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;settled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nf"&gt;unsub&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentUser&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currentUser&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Not authenticated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Secret With a Newline
&lt;/h2&gt;

&lt;p&gt;After fixing auth, every token verification still returned 401 with this message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Firebase ID token has incorrect "aud" claim.
Expected "live-agent-challenge-489310\n" but got "live-agent-challenge-489310"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;\n&lt;/code&gt; at the end of &lt;code&gt;Expected&lt;/code&gt; is a literal newline character embedded in the secret. When I created the secret in Secret Manager, I used &lt;code&gt;echo&lt;/code&gt; without the &lt;code&gt;-n&lt;/code&gt; flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# WRONG — adds a trailing newline&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"live-agent-challenge-489310"&lt;/span&gt; | gcloud secrets create FIREBASE_PROJECT_ID &lt;span class="nt"&gt;--data-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;-

&lt;span class="c"&gt;# CORRECT — no newline&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"live-agent-challenge-489310"&lt;/span&gt; | gcloud secrets create FIREBASE_PROJECT_ID &lt;span class="nt"&gt;--data-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;-
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fix: add a new version of the secret with the correct value and redeploy.&lt;/p&gt;




&lt;h2&gt;
  
  
  React StrictMode Double-Firing
&lt;/h2&gt;

&lt;p&gt;In development, React 18 StrictMode mounts components twice to help catch side effects. This caused the story generation pipeline to fire twice — two competing Gemini calls, two sets of concurrent Imagen calls hitting rate limits, random pages missing illustrations.&lt;/p&gt;

&lt;p&gt;The fix was using a &lt;code&gt;ref&lt;/code&gt; instead of &lt;code&gt;state&lt;/code&gt; to guard the pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ State resets on StrictMode remount&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;streamStarted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setStreamStarted&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// ✅ Ref survives StrictMode remount&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasStarted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;storyId&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;hasStarted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;hasStarted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;startStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;storyId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;storyId&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I also added a server-side guard using a &lt;code&gt;Set&lt;/code&gt; of active pipeline IDs — if a pipeline is already running for a storyId, subsequent requests complete immediately without starting another generation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deploying to Cloud Run
&lt;/h2&gt;

&lt;p&gt;The setup that works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory:&lt;/strong&gt; 2 GiB — concurrent TTS + Imagen + PDF generation needs headroom&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPU:&lt;/strong&gt; 2 — parallel processing per story&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request timeout:&lt;/strong&gt; 3600 seconds — SSE stream stays open during full generation (default 300s kills it mid-story)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session affinity:&lt;/strong&gt; enabled — required for stateful WebSocket voice sessions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution environment:&lt;/strong&gt; 2nd gen — better network performance for streaming&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD:&lt;/strong&gt; GitHub repo → Cloud Build → Cloud Run on every push to main&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The most important setting people miss is the request timeout. 300 seconds sounds like a lot until you have an 8-page story with TTS and illustrations running concurrently.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start with &lt;code&gt;@google/genai&lt;/code&gt; from day one.&lt;/strong&gt; Migrating from the deprecated &lt;code&gt;@google/generative-ai&lt;/code&gt; mid-project cost real time. The streaming API is different, the response shape is different, and assuming parity between the two is a mistake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test SSE streaming through a reverse proxy early.&lt;/strong&gt; I spent hours debugging a problem that only existed in production because I didn't simulate the Cloud Run load balancer locally. &lt;code&gt;ngrok&lt;/code&gt; with compression enabled would have caught the buffering issue much earlier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use structured logging from the start.&lt;/strong&gt; The Cloud Logging queries I ran to diagnose production issues (&lt;code&gt;resource.type="cloud_run_revision" AND resource.labels.service_name="dream-book-api"&lt;/code&gt;) only worked because I had consistent log formatting throughout. Good logging is not optional for cloud deployments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;🌐 &lt;strong&gt;Live:&lt;/strong&gt; &lt;a href="https://dream-book-web.vercel.app" rel="noopener noreferrer"&gt;https://dream-book-web.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📦 &lt;strong&gt;Backend repo:&lt;/strong&gt; &lt;a href="https://github.com/Talha-Tahir2001/dream-book-api" rel="noopener noreferrer"&gt;https://github.com/Talha-Tahir2001/dream-book-api&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📦 &lt;strong&gt;Frontend repo:&lt;/strong&gt; &lt;a href="https://github.com/Talha-Tahir2001/dream-book-web" rel="noopener noreferrer"&gt;https://github.com/Talha-Tahir2001/dream-book-web&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Tags
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;#googlecloud&lt;/code&gt; &lt;code&gt;#gemini&lt;/code&gt; &lt;code&gt;#nestjs&lt;/code&gt; &lt;code&gt;#nextjs&lt;/code&gt; &lt;code&gt;#typescript&lt;/code&gt; &lt;code&gt;#ai&lt;/code&gt; &lt;code&gt;#hackathon&lt;/code&gt; &lt;code&gt;#webdev&lt;/code&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built for the Gemini Live Agent Hackathon 2026. If you have questions about any part of the implementation, drop them in the comments — happy to dig into the details.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>geminiliveagentchallenge</category>
      <category>vertexai</category>
    </item>
    <item>
      <title>Compiling 2025: My Developer Roadmap 🛠️🚀</title>
      <dc:creator>Talha Tahir</dc:creator>
      <pubDate>Sat, 25 Jan 2025 19:09:11 +0000</pubDate>
      <link>https://dev.to/talha666tahir/compiling-2025-my-developer-roadmap-50nf</link>
      <guid>https://dev.to/talha666tahir/compiling-2025-my-developer-roadmap-50nf</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/newyear"&gt;2025 New Year Writing challenge&lt;/a&gt;: Compiling 2025.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Compiling 2025
&lt;/h2&gt;

&lt;p&gt;What does your personal roadmap for 2025 look like? Here's me highlighting my goals, aspirations, and plans for the year ahead. From mastering advanced Angular concepts to building AI-powered projects, I’m embracing growth opportunities and aiming for a successful 2025!  &lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;As a software developer passionate about creating impactful solutions, I’m looking forward to making 2025 a year of growth, learning, and creativity. This roadmap outlines the skills I plan to master, the projects I aim to tackle, and the career aspirations I’m working toward.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Career Aspirations 🎯
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Land a full-time role as a &lt;strong&gt;Full-stack Developer&lt;/strong&gt; specializing in &lt;strong&gt;Angular&lt;/strong&gt; and modern web technologies.
&lt;/li&gt;
&lt;li&gt;Collaborate on impactful &lt;strong&gt;AI/ML-integrated web applications&lt;/strong&gt;, leveraging &lt;strong&gt;OpenAI&lt;/strong&gt; and &lt;strong&gt;Google Gemini APIs&lt;/strong&gt; for next-gen user experiences.
&lt;/li&gt;
&lt;li&gt;Expand my expertise into &lt;strong&gt;backend development&lt;/strong&gt; with &lt;strong&gt;NestJS&lt;/strong&gt; and explore microservices architecture.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Skills to Learn 📚
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Advanced Angular Concepts&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Optimize performance in large-scale applications.
&lt;/li&gt;
&lt;li&gt;Dive deeper into &lt;strong&gt;SSR (Server-Side Rendering)&lt;/strong&gt; and &lt;strong&gt;SSG (Static Site Generation)&lt;/strong&gt;.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generative AI and ML&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Master &lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt; for AI-driven applications.
&lt;/li&gt;
&lt;li&gt;Build &lt;strong&gt;AI Agents&lt;/strong&gt; for automation and personalized experiences.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Backend Development&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Strengthen my understanding of &lt;strong&gt;NestJS&lt;/strong&gt; and explore &lt;strong&gt;GraphQL&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Improve database management skills with &lt;strong&gt;PostgreSQL&lt;/strong&gt; and &lt;strong&gt;Supabase&lt;/strong&gt;.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cloud &amp;amp; DevOps&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learn how to deploy scalable apps on &lt;strong&gt;Microsoft Azure&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Get hands-on with &lt;strong&gt;CI/CD pipelines&lt;/strong&gt; for seamless deployments.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Data Structures and Algorithms (DSA)&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enhancing my problem-solving abilities through DSA is high on my priority list. &lt;/li&gt;
&lt;li&gt;Mastery in this area will serve as a significant asset for placement preparation.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  💻 Side Projects to Pursue
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;AI-Powered Habit Tracker App&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build a feature-rich &lt;strong&gt;Angular app&lt;/strong&gt; with reminders, progress tracking, and seamless local storage integration.
&lt;/li&gt;
&lt;li&gt;Incorporate &lt;strong&gt;AI Agents&lt;/strong&gt; for personalized habit suggestions and daily goals.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;RAG&lt;/strong&gt; to deliver motivational content or actionable tips tailored to user habits.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;AI-Driven Learning Platform&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an Angular platform that provides personalized course recommendations based on user interests.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;RAG&lt;/strong&gt; for generating curated learning paths from multiple sources.
&lt;/li&gt;
&lt;li&gt;Integrate &lt;strong&gt;AI Agents&lt;/strong&gt; to answer user queries, schedule study sessions, and manage assignments.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GenAI-Powered Content Creation App&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build an Angular app that generates engaging blog posts, social media captions, or marketing content.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;OpenAI APIs&lt;/strong&gt; and &lt;strong&gt;Google Gemini&lt;/strong&gt; to provide real-time editing suggestions and creative brainstorming.
&lt;/li&gt;
&lt;li&gt;Incorporate &lt;strong&gt;AI Agents&lt;/strong&gt; to automate publishing workflows and track content performance.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;AI-Powered NBA Analytics Dashboard&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Develop a dynamic dashboard for basketball fans to explore player stats, match predictions, and game insights.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;RAG&lt;/strong&gt; to allow users to query historical stats and simulate matchups.
&lt;/li&gt;
&lt;li&gt;Integrate &lt;strong&gt;AI Agents&lt;/strong&gt; to suggest fantasy lineup recommendations or provide key game strategies.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;AI-Powered Productivity Booster App&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an Angular app that helps users optimize their daily tasks, using AI to prioritize and schedule work.
&lt;/li&gt;
&lt;li&gt;Incorporate &lt;strong&gt;RAG&lt;/strong&gt; to provide actionable insights from user task history.
&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;AI Agents&lt;/strong&gt; for reminders, deadline management, and collaboration.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🌟 Personal Goals
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Publish at least &lt;strong&gt;one technical blog per month&lt;/strong&gt; to share insights and give back to the developer community.
&lt;/li&gt;
&lt;li&gt;Contribute to &lt;strong&gt;open-source projects&lt;/strong&gt;, especially those involving AI or web development.
&lt;/li&gt;
&lt;li&gt;Stay consistent with &lt;strong&gt;self-learning&lt;/strong&gt; by completing a minimum of two professional certifications this year.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚀 The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;By the end of 2025, I aspire to not only grow as a developer but also contribute to meaningful projects that blend technology and creativity. Whether it’s building innovative apps, contributing to open source, or mentoring budding developers, my goal is to make an impact and keep evolving in this ever-changing industry.  &lt;/p&gt;




&lt;p&gt;What’s your roadmap for 2025? Let me know in the comments or connect with me—let’s grow together! 🌟  &lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>newyearchallenge</category>
      <category>career</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
