This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Traditional first aid apps often fail during high-stress crises because they bury instructions in dense text or rely on a stable internet connection. Standard cloud AI chatbots, for their part, are prone to hallucinations, which poses a serious safety risk if they invent incorrect medical dosages or dangerous treatments when a user is vulnerable.
AI-dy bridges this gap. It is an offline-first mobile application that combines proactive first aid education with instant, on-device AI assistance. By using spaced repetition (the SM-2 algorithm) in its "Drill Mode," it helps users build lasting knowledge before a crisis hits.
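For readers unfamiliar with SM-2, here is a minimal sketch of one review step, following the published SM-2 description rather than AI-dy's actual source. The `CardState` shape, the `sm2Review` name, and the choice to compute the next interval with the updated ease factor are assumptions made for illustration; how a quiz answer maps to the 0-5 quality grade is app-specific and not shown.

```typescript
// Hypothetical per-question review state; field names are illustrative only.
interface CardState {
  repetitions: number;  // consecutive successful reviews so far
  intervalDays: number; // days until the question is due again
  easeFactor: number;   // SM-2 "E-Factor"; never allowed below 1.3
}

const MIN_EASE = 1.3;

// Apply one SM-2 review. `quality` is the 0-5 recall grade from SM-2.
function sm2Review(card: CardState, quality: number): CardState {
  if (!Number.isInteger(quality) || quality < 0 || quality > 5) {
    throw new RangeError("quality must be an integer from 0 to 5");
  }

  if (quality < 3) {
    // Failed recall: restart the repetition sequence and review again soon.
    // The original SM-2 description leaves the ease factor unchanged here.
    return { repetitions: 0, intervalDays: 1, easeFactor: card.easeFactor };
  }

  // Ease factor update from the SM-2 description, clamped at 1.3.
  const miss = 5 - quality;
  const easeFactor = Math.max(
    MIN_EASE,
    card.easeFactor + (0.1 - miss * (0.08 + miss * 0.02)),
  );

  // Fixed intervals for the first two successes, then multiplicative growth.
  let intervalDays: number;
  if (card.repetitions === 0) {
    intervalDays = 1;
  } else if (card.repetitions === 1) {
    intervalDays = 6;
  } else {
    intervalDays = Math.round(card.intervalDays * easeFactor);
  }

  return { repetitions: card.repetitions + 1, intervalDays, easeFactor };
}
```

Starting from an ease factor of 2.5, two perfect answers in a row schedule the question 1 day and then 6 days out, while any grade below 3 resets it to a 1-day interval. That reset is what keeps weak questions coming back sooner than well-known ones.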
Built as a Turborepo monorepo, AI-dy combines an Expo SDK 54 (React Native) mobile frontend with a NestJS/PostgreSQL backend for cross-device sync. Crucially, the app bundles 58 lessons across 24 topics as static JSON, so the core application and its underlying AI features remain fully operational with no network connection at all.
Demo
Code
AI-dy — On-Device Emergency First Aid Assistant
AI-dy is a mobile-first first aid companion that teaches you when you're calm and guides you when you're not. It runs Gemma 4 entirely on your device — no cloud, no connectivity required — so it works in the moments that matter most.
What AI-dy Does
Learn Mode
Structured, searchable first aid education across 58 lessons, 24 topics, and 3 age groups (Adult / Child / Infant).
- Browse and filter lessons by topic and age group
- Step-by-step illustrated instructions with Cloudinary-hosted images
- Watch video demonstrations for key lessons (YouTube, in-app player)
- Take per-lesson quizzes to test comprehension
- All content is bundled offline — no network required
Drill Mode
Spaced-repetition practice across 400+ quiz questions using the SM-2 algorithm.
- Configurable sessions: Quick (5), Standard (10), or Full (20) questions
- Filter by topic and age group before starting
- SM-2 scheduling surfaces weak spots first and…
How I Used Gemma 4
AI-dy runs entirely on-device, prioritizing user privacy, speed, and reliability when connectivity is poor or absent, such as during natural disasters or in rural areas.
Model Selection
I chose Gemma 4 E4B IT with Q4_K_M quantization (~5 GB), run through llama.rn. The E4B variant was the best trade-off I found: it fits within the memory constraints of modern flagship smartphones while retaining the instruction-following precision that safety-critical interactions need and that smaller models lack. Users download the multimodal vision projector (mmproj, ~990 MB) during initial setup to enable offline image analysis.
How Gemma 4 Powers the Experience
- Context-Aware Lesson Assistant: When a user asks a question while browsing a lesson, the app dynamically injects the current lesson title and step data into Gemma 4's context window. This ensures the model gives immediate, hyper-relevant answers tailored exactly to the procedure the user is looking at (e.g., adult burn treatments).
- Crisis Chat with Multimodal Vision: In an emergency, users can take a photo of an injury. Gemma 4’s vision capabilities analyze the image to identify the type of trauma and guide the user through immediate care steps. If the AI detects that visual context would help, it proactively prompts the user to snap a picture.
- Voice-First Emergency Navigation: For hands-free operation, a hybrid intent-extraction layer processes voice input. Local keyword matching handles instant navigation for clear emergencies (e.g., shouting "choking!"), while on-device Gemma 4 handles ambiguous natural language inputs—all without a single network round-trip.
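The hybrid intent layer in the last bullet can be sketched as a synchronous keyword stage that either navigates immediately or hands the transcript to the on-device model. This is a minimal illustration, not AI-dy's actual code: the protocol IDs, the keyword patterns, and the `routeTranscript` name are all hypothetical.

```typescript
// Hypothetical protocol identifiers; the real app's topic IDs are not shown in this post.
type ProtocolId = "choking" | "cpr" | "severe-bleeding" | "burns";

type RouteResult =
  | { kind: "navigate"; protocolId: ProtocolId }   // unambiguous keyword hit
  | { kind: "needs-model"; transcript: string };   // defer to on-device Gemma 4

// Illustrative keyword table. Patterns have no "g" flag, so test() is stateless.
const KEYWORDS: ReadonlyArray<readonly [RegExp, ProtocolId]> = [
  [/\bchok(e|es|ed|ing)\b/, "choking"],
  [/\b(cpr|not breathing|no pulse)\b/, "cpr"],
  [/\b(bleeding|blood)\b/, "severe-bleeding"],
  [/\b(burn|burns|burned|burnt)\b/, "burns"],
];

// Stage 1: local keyword matching. Only a single, unambiguous match navigates
// directly; zero or multiple matches are passed on to the model stage.
function routeTranscript(transcript: string): RouteResult {
  const text = transcript.toLowerCase();
  const matches: ProtocolId[] = [];
  for (const [pattern, protocolId] of KEYWORDS) {
    if (pattern.test(text) && !matches.includes(protocolId)) {
      matches.push(protocolId);
    }
  }
  if (matches.length === 1) {
    return { kind: "navigate", protocolId: matches[0] };
  }
  return { kind: "needs-model", transcript };
}
```

Keeping the keyword stage synchronous means a shouted "choking!" never waits on model inference, and treating multiple hits as ambiguous avoids guessing between two emergencies.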
Overcoming Technical Challenges
- Bypassing Latency: Gemma 4's native reasoning chain can cause a delay. To minimize response latency when seconds count, I utilized the <|channel|>answer\n marker to skip the "thinking" phase, forcing the model to start streaming the final, actionable response instantly.
- Stable Vision Inference: To prevent standard context shifting from corrupting critical image data during an interaction, I locked the image embeddings in place by setting ctx_shift: false.
- Guaranteed Medical Safety: To eliminate the risk of AI hallucinating medical protocols, I enforced a strict architectural boundary. Structured decision trees for medical procedures remain hardcoded and verified; Gemma 4 is strictly limited to intent extraction, navigation, and clarifying context. It can guide you to the right protocol, but it can never "invent" a new one.
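One way to enforce the boundary between hardcoded protocols and model output is to treat whatever the model returns as a lookup key and nothing more. The sketch below assumes the model is prompted to reply with a single protocol ID; the `Protocol` shape, the table contents, and `resolveProtocol` are hypothetical, and the step text is placeholder rather than medical content.

```typescript
// Hypothetical shape of a verified, hardcoded protocol.
interface Protocol {
  id: string;
  title: string;
  steps: readonly string[]; // authored and reviewed by humans, never by the model
}

// Illustrative table; in a real app this would come from the bundled, verified content.
const PROTOCOLS: ReadonlyMap<string, Protocol> = new Map<string, Protocol>([
  ["choking", { id: "choking", title: "Choking", steps: ["(verified step 1)", "(verified step 2)"] }],
  ["burns", { id: "burns", title: "Burns", steps: ["(verified step 1)"] }],
]);

// Treat model output purely as a candidate key. Anything that is not an exact
// match for a known protocol ID is rejected, so free-form generated text can
// never be displayed as a procedure.
function resolveProtocol(modelOutput: string): Protocol | null {
  const candidate = modelOutput.trim().toLowerCase();
  return PROTOCOLS.get(candidate) ?? null;
}
```

With this shape, a hallucinated answer degrades to "no match" (for example, prompting the user to rephrase) instead of surfacing as an unverified instruction.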
What's Next
- Medical & Organizational Partnerships: I plan to partner with medical professionals and organizations like the American Red Cross and AHA to continuously validate the app's content and offer certifiable course tracks.
- Emergency Services Integration: I aim to build direct dispatch integration into the app. This will allow a user to call emergency services natively while automatically transmitting location data, incident types, and a structured summary of the first aid steps already taken.
- A Purpose-Built First Aid Model: While Gemma 4 is a strong general-purpose base, my ultimate goal is to fine-tune a specialized, multilingual first aid model on peer-reviewed protocols to improve safety-critical performance across languages.
Summary
AI-dy demonstrates that powerful, multimodal AI can be a responsible tool in high-stakes domains. By running Gemma 4 E4B entirely on-device, bypassing the thinking chain to cut latency, and constraining the AI to intent extraction and navigation rather than generating medical protocols, I've built a first aid companion that teaches better in calm moments and responds faster in a crisis, without ever needing a signal bar to do either.
Team
- Aaliyah Junaid (@aaliyahjunaid) - Software Engineer