
HIROTAKA SAITO


How Gemma 4 Helps My Son with ASD Understand His Day


This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Yurito App — an AI-powered visual schedule PWA for my 10-year-old son, who has Autism Spectrum Disorder (ASD).

Many children with ASD have difficulty with verbal communication — my son is one of them. His teachers carry thick bundles of picture cards wherever they go: a card showing the classroom, one for the gym, one for the lunchroom. Every time they need to tell him "we're going here next," they flip through the stack to find the right card and hold it up. That picture card is the conversation.

Watching this gave me the idea for the app. If we could put those cards — and a full visual schedule — on a single device, teachers wouldn't need to carry stacks of paper, and my son could see his whole day laid out in pictures at any moment.

For children like him, not knowing what comes next is itself a source of anxiety. A sudden change — gym cancelled because of rain, a substitute teacher, a field trip rescheduled — can trigger real distress. They need to see what's happening, explained in pictures, not just told in words.

Yurito App gives parents and teachers two AI superpowers, both powered by Gemma 4 (gemma-4-31b-it):


Feature 1: Timetable Import — One Photo, Full Week Schedule

School timetables in Japan are printed sheets — and at my son's school, a new one is handed out every single week. That means a fresh set of periods, different days for PE or music, changing event schedules. Manually re-entering all of it every week into a digital visual schedule would be unsustainable.

With Yurito App, you snap a photo of the timetable and upload it. Gemma 4 reads the image, extracts every period across the week, and writes a complete set of daily schedules into Firestore — automatically linked to Yurito's personal picture card library (custom photos and icons for familiar activities like "Math", "PE", "Morning Assembly").

Result: A full week of visual schedules from a single photo in about 1 minute.


Feature 2: Change Comics — A Visual Story That Explains "Why"

When something changes, Yurito doesn't just need to know what changed. He needs to understand why — in a visual, step-by-step way, without walls of text.

The parent or teacher types a short sentence like:

"It's raining, so gym class is cancelled. We're going to the library instead."

Gemma 4 turns that into a 2–4 panel visual comic strip using the app's picture card library. Each panel combines a card image (e.g., the PE card, the rain card, the library card) with a symbol overlay (❌ cross = cancelled, 🌧️ rain, ✅ check = OK, → arrow = next).
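
To make the panel idea concrete, here is a minimal sketch of how a panel could be represented as data. The field names (cardId, symbol) and the helper describePanel are assumptions for illustration; the article names a VisualPanel type but does not show its fields.

```typescript
// Hypothetical shapes: the real VisualPanel fields are not shown in the article.
type PanelSymbol = 'cross' | 'rain' | 'check' | 'arrow'

interface VisualPanel {
  cardId: string        // ID of a card in the picture card library
  symbol?: PanelSymbol  // optional overlay drawn on top of the card image
}

// Overlay glyphs taken from the description above.
const SYMBOL_GLYPH: Record<PanelSymbol, string> = {
  cross: '❌',
  rain: '🌧️',
  check: '✅',
  arrow: '→',
}

// Describe a panel as plain text, e.g. for logging or an accessibility label.
function describePanel(panel: VisualPanel, labels: Map<string, string>): string {
  const label = labels.get(panel.cardId) ?? '(unknown card)'
  return panel.symbol ? `${label} ${SYMBOL_GLYPH[panel.symbol]}` : label
}

// The rain example from above as three panels: before, reason, after.
const labels = new Map([['pe', 'PE'], ['rain', 'Rain'], ['library', 'Library']])
const story: VisualPanel[] = [
  { cardId: 'pe', symbol: 'cross' },
  { cardId: 'rain', symbol: 'rain' },
  { cardId: 'library', symbol: 'check' },
]
const summary = story.map((p) => describePanel(p, labels)).join(' | ')
console.log(summary)
```

Keeping the symbol as a small closed set, rather than free text, means the UI only ever has to render overlays it already knows how to draw.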

The panels are displayed full-screen, one at a time, on a calm blue background. Yurito taps through at his own pace and taps "I understand!" when he's ready. If he wants to review it again from the activity detail screen, it's always one tap away.

Result: Fewer surprises. Fewer meltdowns. A child who feels respected and informed.


The app is a React 19 + TypeScript PWA backed by Firebase (Firestore, Storage, Auth, Functions v2). Schedule and progress data are available offline via Firestore's persistentLocalCache. Full offline AI support, meaning Gemma running on-device so that visual stories can be generated without any internet connection, is on the roadmap. That future partly motivated the choice of Gemma: the model family is designed to run on-device, which keeps that path open.

The app follows ASD-friendly UX principles:

  • Minimum 120×120dp tap targets
  • Warm off-white background (#F5F0E8) — never pure white
  • No per-second countdown timers (prevents behavioral fixation)
  • Change alerts always show Before → After in full-screen sequence

Demo


Code

Repository: github.com/pirotaka0204/public_yurito_app

The AI features run as Firebase Functions v2 (asia-northeast1), triggered by new Firestore documents. Both functions use @google/genai with gemma-4-31b-it.


Timetable Import (yurito/functions/src/index.ts)

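import { onDocumentCreated } from 'firebase-functions/v2/firestore'
import { getStorage } from 'firebase-admin/storage'
import { GoogleGenAI } from '@google/genai'
// buildPrompt and extractJson are local helpers; extractJson is shown further down

export const parseScheduleJob = onDocumentCreated(
  {
    document: 'users/{uid}/importJobs/{jobId}',
    timeoutSeconds: 300,
    memory: '512MiB',
    secrets: ['GEMMA_API_KEY'],
  },
  async (event) => {
    // The excerpt omits this setup; reading imagePath and year from the
    // job document is an assumption about the job's fields
    const snap = event.data
    if (!snap) return
    const jobRef = snap.ref
    const { imagePath, year } = snap.data() as { imagePath: string; year: number }

    // Download timetable image from Firebase Storage
    const bucket = getStorage().bucket()
    const [buffer] = await bucket.file(imagePath).download()
    const base64 = buffer.toString('base64')

    // Send image + structured prompt to Gemma 4
    const ai = new GoogleGenAI({ apiKey: process.env.GEMMA_API_KEY })
    const result = await ai.models.generateContent({
      model: 'gemma-4-31b-it',
      contents: [{
        role: 'user',
        parts: [
          { inlineData: { mimeType: 'image/jpeg', data: base64 } },
          { text: buildPrompt(year) },
        ],
      }],
    })

    // extractJson() strips markdown fences, then parse
    const parsed = JSON.parse(extractJson(result.text ?? '')) as { schedules: unknown }
    await jobRef.update({ status: 'done', schedules: parsed.schedules })
  }
)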

The prompt instructs Gemma 4 to extract each day as a ParsedSchedule, group all school activities under a single "授業" (class) place, merge consecutive identical periods, and return strict JSON matching the app's type schema.
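
The prompt asks the model to merge consecutive identical periods, and the same rule could also be enforced in code as a defensive post-processing step. The sketch below is only an illustration: the Period shape and the mergeConsecutive helper are hypothetical (the real ParsedSchedule type is not shown here), and "identical" is assumed to mean "same label as the previous period".

```typescript
// Hypothetical shape; the app's real ParsedSchedule type is not shown in the article.
interface Period {
  label: string  // e.g. "Math"
  start: string  // "HH:MM"
  end: string    // "HH:MM"
}

// Merge neighbouring periods that share a label into one longer block.
// Assumption: the input is already sorted by start time.
function mergeConsecutive(periods: Period[]): Period[] {
  const merged: Period[] = []
  for (const p of periods) {
    const last = merged[merged.length - 1]
    if (last && last.label === p.label) {
      last.end = p.end        // extend the previous block to cover this period
    } else {
      merged.push({ ...p })   // copy so the caller's objects are not mutated
    }
  }
  return merged
}

const morning: Period[] = [
  { label: 'Math', start: '08:50', end: '09:35' },
  { label: 'Math', start: '09:45', end: '10:30' },
  { label: 'PE', start: '10:50', end: '11:35' },
]
const mergedMorning = mergeConsecutive(morning)
console.log(mergedMorning.map((p) => `${p.label} ${p.start}-${p.end}`).join(', '))
```

Running the merge after parsing means a response that forgets to merge still produces the same visual schedule.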


Change Comics (yurito/functions/src/generateStory.ts)

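import { onDocumentCreated } from 'firebase-functions/v2/firestore'
import { GoogleGenAI } from '@google/genai'
// StoryJob and VisualPanel are app types; buildStoryPrompt and extractJson are local helpers

export const generateStoryJob = onDocumentCreated(
  {
    document: 'users/{uid}/storyJobs/{jobId}',
    timeoutSeconds: 120,
    memory: '256MiB',
    secrets: ['GEMMA_API_KEY'],
  },
  async (event) => {
    // The excerpt omits this setup; taking job and jobRef from the
    // triggering snapshot is an assumption
    const snap = event.data
    if (!snap) return
    const jobRef = snap.ref
    const job = snap.data() as StoryJob

    const ai = new GoogleGenAI({ apiKey: process.env.GEMMA_API_KEY })
    // One "id: label" line per card so the model can only pick known IDs
    const cardListText = job.cardList.map((c) => `${c.id}: ${c.label}`).join('\n')

    const result = await ai.models.generateContent({
      model: 'gemma-4-31b-it',
      contents: [{
        role: 'user',
        parts: [{ text: buildStoryPrompt(job.sourceText, job.blockLabel, cardListText) }],
      }],
    })

    const parsed = JSON.parse(extractJson(result.text ?? '')) as { panels: VisualPanel[] }
    await jobRef.update({ status: 'done', panels: parsed.panels })
  }
)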

The story prompt passes the full card library (ID + label pairs) and instructs Gemma 4 to select appropriate cards and symbols, producing a coherent 2–4 panel narrative following the pattern: [before + cross] → [reason (weather, etc.)] → [after + check].
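
Because the model chooses card IDs from a provided list, its output can be checked against that list before it is written back. The following is a minimal sketch of such a check, not the app's actual code: the Panel fields and the validatePanels helper are assumptions, while the Card shape mirrors the id and label pairs used above.

```typescript
interface Card { id: string; label: string }
interface Panel { cardId: string; symbol?: string }  // hypothetical fields

// Return a list of problems; an empty list means the panels look usable.
function validatePanels(panels: Panel[], cards: Card[]): string[] {
  const errors: string[] = []
  if (panels.length < 2 || panels.length > 4) {
    errors.push(`expected 2-4 panels, got ${panels.length}`)
  }
  const known = new Set(cards.map((c) => c.id))
  panels.forEach((p, i) => {
    if (!known.has(p.cardId)) {
      errors.push(`panel ${i + 1}: unknown card "${p.cardId}"`)
    }
  })
  return errors
}

const cardLibrary: Card[] = [
  { id: 'pe', label: 'PE' },
  { id: 'rain', label: 'Rain' },
  { id: 'library', label: 'Library' },
]
const goodStory: Panel[] = [
  { cardId: 'pe', symbol: 'cross' },
  { cardId: 'rain', symbol: 'rain' },
  { cardId: 'library', symbol: 'check' },
]
const badStory: Panel[] = [
  { cardId: 'pe', symbol: 'cross' },
  { cardId: 'swimming-pool', symbol: 'check' },  // not in the library
]
console.log(validatePanels(goodStory, cardLibrary))
console.log(validatePanels(badStory, cardLibrary))
```

A non-empty result could be used to reject the response or to re-run the job instead of showing a panel with a missing picture.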


Async Job Pattern (client side)

Both features use the same Firestore async job pattern: the client creates a job document, the Firebase Function picks it up, runs Gemma 4, and writes the result back. The client watches for completion via onSnapshot.

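import { setDoc, onSnapshot } from 'firebase/firestore'

// Create job (jobRef points at a new document under users/{uid}/storyJobs)
await setDoc(jobRef, { sourceText, cardList, status: 'pending', createdAt: Date.now() })

// Watch for completion
const unsubscribe = onSnapshot(jobRef, (snap) => {
  // data() is undefined if the document does not exist
  const job = snap.data() as StoryJob | undefined
  if (job?.status === 'done' && job.panels) {
    setStory({ panels: job.panels })
    unsubscribe()
  }
})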
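
The listener above only reacts to 'done'. One way to also cover failures and timeouts is to wrap the subscription in a promise. This is a sketch under stated assumptions: the 'error' status, the message field, and the waitForJob helper are hypothetical, and subscribe is a stand-in for onSnapshot so the example runs without Firebase.

```typescript
// Hypothetical job shape; an 'error' status and a message field are assumptions.
type JobStatus = 'pending' | 'done' | 'error'
interface Job<T> { status: JobStatus; result?: T; message?: string }

// Stand-in for onSnapshot: registers a listener and returns an unsubscribe function.
type Subscribe<T> = (listener: (job: Job<T> | undefined) => void) => () => void

function waitForJob<T>(subscribe: Subscribe<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    let unsubscribe: () => void = () => {}
    let settled = false

    // Settle at most once, and always stop the timer and the listener.
    const finish = (settle: () => void) => {
      if (settled) return
      settled = true
      clearTimeout(timer)
      unsubscribe()
      settle()
    }

    const timer = setTimeout(
      () => finish(() => reject(new Error('job timed out'))),
      timeoutMs,
    )

    unsubscribe = subscribe((job) => {
      if (!job) return  // document not readable yet
      if (job.status === 'done' && job.result !== undefined) {
        const result = job.result
        finish(() => resolve(result))
      } else if (job.status === 'error') {
        const message = job.message ?? 'job failed'
        finish(() => reject(new Error(message)))
      }
    })

    // If the listener fired synchronously, the real unsubscribe was not
    // assigned yet when finish() ran, so clean up here.
    if (settled) unsubscribe()
  })
}
```

With a wrapper like this, the UI can show a calm "please try again" state on rejection instead of waiting indefinitely.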

extractJson utility

LLMs sometimes wrap JSON in markdown code fences even when instructed not to. This small utility handles both cases reliably:

export function extractJson(text: string): string {
  const fenced = text.match(/```(?:json)?\s*([\s\S]*?)\s*```/)
  if (fenced) return fenced[1]
  const obj = text.match(/\{[\s\S]*\}/)
  if (obj) return obj[0]
  return text
}
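
Note that JSON.parse still throws if the extracted text is not valid JSON, and a type assertion does not check the parsed shape. A small wrapper can turn both cases into a null result. This is a sketch, not the app's code: parseModelJson and hasPanels are hypothetical helpers, and extractJson is repeated (with the fence marker built at runtime) only so the example runs on its own.

```typescript
// Same behavior as extractJson above, repeated so this sketch is self-contained.
// The fence marker is built with repeat() to avoid literal backtick runs here.
const FENCE = '`'.repeat(3)
const FENCED_JSON = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE)

function extractJson(text: string): string {
  const inner = text.match(FENCED_JSON)?.[1]
  if (inner !== undefined) return inner
  const obj = text.match(/\{[\s\S]*\}/)?.[0]
  if (obj !== undefined) return obj
  return text
}

// Parse model output; return null on malformed JSON or an unexpected shape.
function parseModelJson<T>(raw: string, isValid: (value: unknown) => value is T): T | null {
  try {
    const value: unknown = JSON.parse(extractJson(raw))
    return isValid(value) ? value : null
  } catch {
    return null
  }
}

// Hypothetical guard for a { panels: [...] } payload.
interface PanelsPayload { panels: unknown[] }
const hasPanels = (v: unknown): v is PanelsPayload =>
  typeof v === 'object' && v !== null && Array.isArray((v as { panels?: unknown }).panels)

const fencedReply = FENCE + 'json\n{"panels":[{"cardId":"pe"}]}\n' + FENCE
const chattyReply = 'Here is the story: {"panels": []} Hope this helps!'
console.log(parseModelJson(fencedReply, hasPanels))
console.log(parseModelJson(chattyReply, hasPanels))
console.log(parseModelJson('Sorry, I cannot help with that.', hasPanels))
```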

How I Used Gemma 4

Model choice: gemma-4-31b-it (31B Dense)

I chose the 31B Dense model for both features. Here's why each feature specifically benefits from it:

Timetable Import relies on Gemma 4's multimodal (vision) capability. Japanese school timetables are dense grids with small kanji, subject abbreviations, time columns, and layouts that vary across schools. The 31B model handles this reliably: it reads the visual structure of the grid rather than just a stream of OCR text. A smaller model would likely struggle with the density and with the structured reasoning needed to output a clean JSON schedule.

Change Comics is a constrained structured generation task. The model must:

  1. Understand a short Japanese sentence describing a situation
  2. Pick the most appropriate cards from a provided list of card IDs and labels
  3. Assign meaningful symbols (cross/check/rain/arrow/etc.) to each panel
  4. Tell a coherent 2–4 step visual story that a child can follow

The 31B model consistently selects contextually appropriate cards (e.g., rain card + cross over the PE card for "gym cancelled because of rain") and outputs clean JSON. Smaller models would be more likely to hallucinate card IDs or produce incoherent panel sequences.

Why Firebase Functions async pattern

Timetable import takes about 1 minute end-to-end. That is fine for an async background job, but wrong for a blocking request from a PWA that Yurito might be using mid-morning. The Firestore job pattern decouples inference from the UI: the app stays responsive while the Function runs, and the result appears as soon as it's ready. The status field also gives the client one place to see failures and a natural hook for retries.
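
For the status field to carry failures, the function has to write one when something goes wrong. Here is a minimal sketch of that idea, with the Firestore and Gemma calls injected as plain functions so it runs standalone. The 'error' status, the message and result fields, and the runJob helper are all assumptions, not the app's actual code.

```typescript
// Hypothetical patches written back to the job document.
type JobPatch =
  | { status: 'done'; result: unknown }
  | { status: 'error'; message: string }

// `update` stands in for jobRef.update; `work` stands in for the Gemma call plus parsing.
async function runJob(
  update: (patch: JobPatch) => Promise<void>,
  work: () => Promise<unknown>,
): Promise<void> {
  try {
    const result = await work()
    await update({ status: 'done', result })
  } catch (err) {
    // Record the failure so the client is not left watching a 'pending' job.
    const message = err instanceof Error ? err.message : String(err)
    await update({ status: 'error', message })
  }
}

// Usage with in-memory stand-ins.
const written: JobPatch[] = []
const fakeUpdate = async (patch: JobPatch): Promise<void> => { written.push(patch) }

const demo = (async () => {
  await runJob(fakeUpdate, async () => ({ panels: [] }))
  await runJob(fakeUpdate, async () => { throw new Error('model returned no JSON') })
  console.log(written.map((p) => p.status).join(', '))
})()
```

A client that sees the error status can offer a manual retry by creating a new job document, which reuses the same trigger path.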


What's next: preference inference

The feature I'm most excited to build next is activity preference inference. Yurito sometimes resists a scheduled activity — for example, he may not want to do a worksheet during a class period. Today, a parent has to manually find an alternative.

The vision: Gemma reads my son's history of past choices and reactions, reasons about what he's likely to engage with right now, and suggests a replacement activity — "How about coloring instead?" — that he'll actually say yes to. For a child who can't negotiate in words, this kind of picture-card-mediated, AI-reasoned negotiation could make a real difference. And the on-device inference path (no internet required at the moment of the meltdown) is what makes it trustworthy for real-world use.


Building this app taught me that AI doesn't have to be flashy to be meaningful. For Yurito, a weekly timetable scan and a four-panel comic strip are the difference between a calm morning and a difficult one. That's the kind of impact I hope Gemma 4 keeps enabling.
