naveen g

Aasa: The Phone That Finally Notices

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Aasa is a voice-first, local-first safety companion for elders living alone. It runs entirely on a Pixel-class Android phone, with Gemma 4 doing the conversational reasoning on-device.

The product is grounded in a 2022 peer-reviewed study (Kwan & Tam, IJERPH) of 47 older adults living alone in poverty. The dominant fear they reported was not death; it was "What if I die and no one notices?" Aasa is built to be the thing that finally notices, without stripping the elder's autonomy.

Concretely, Aasa handles:

  • Daily Heartbeat & Wellness check : watches for a single tap, conversation, or spoken word each day. If the morning passes in silence, it prepares a check-in to a trusted contact. Never auto-sends.
  • Document Reader (multimodal) : point the camera at a hospital bill, government letter, or scam mail. On-device OCR feeds the image's text to Gemma 4, which returns: what it is, what it's asking, what to worry about, what to ignore. The page never leaves the table.
  • Medicine Lens (multimodal) : photograph a pill strip; OCR + Gemma 4 produce a conservative safety receipt with duplicate-dose warnings against the local medication log.
  • Grounded medication & memory : "Did I take my medicine today?" reads from a Room database row, not from model hallucination. Family facts ("Ananya's birthday is May 12") persist locally.
  • Scam & Fraud Shield : flags urgency, gift-card asks, and "don't tell family" patterns in respectful, non-shaming language.
  • Safety triage with deferred confirmation : "I cannot breathe" prepares an ACTION_DIAL card. The elder still taps call. Aasa never holds CALL_PHONE.
  • Mobility Shield & Fall Triage : short on-device sensor checks. Confidence, not diagnosis.
  • Morning Briefing : uses Health Connect when available, and shows an honest "Demo data - no wearable connected" pill when not.

The core design rule across every feature: AI prepares help. The elder confirms.
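
The Daily Heartbeat rule can be sketched in plain Kotlin. This is a minimal illustration, not code from the Aasa repository: `HeartbeatState`, `CheckInDraft`, `prepareCheckIn`, and the noon cutoff are all assumptions made for the example. The point it shows is that the function only returns a draft; sending stays a separate step that the elder confirms.

```kotlin
import java.time.LocalDate
import java.time.LocalDateTime
import java.time.LocalTime

// Hypothetical types for illustration; not taken from the Aasa repository.
data class HeartbeatState(val lastInteraction: LocalDateTime?)

data class CheckInDraft(
    val contactName: String,
    val message: String,
    val confirmedByElder: Boolean = false, // nothing is sent while this is false
)

// Assumed cutoff: the "morning" ends at noon.
val MORNING_CUTOFF: LocalTime = LocalTime.NOON

/**
 * Returns a draft check-in when the morning has passed with no tap,
 * conversation, or spoken word today; returns null otherwise.
 * It never sends anything itself.
 */
fun prepareCheckIn(
    state: HeartbeatState,
    now: LocalDateTime,
    contactName: String,
): CheckInDraft? {
    val today: LocalDate = now.toLocalDate()
    val seenToday = state.lastInteraction?.toLocalDate() == today
    val morningOver = !now.toLocalTime().isBefore(MORNING_CUTOFF)
    if (seenToday || !morningOver) return null
    return CheckInDraft(
        contactName = contactName,
        message = "Aasa has not seen any activity this morning. Could you check in?",
    )
}
```

Calling `prepareCheckIn` with a state whose last interaction was yesterday and a `now` after noon yields a draft with `confirmedByElder == false`; any interaction earlier the same day yields `null`.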

Demo

📺 3-minute submission video (canonical cut): https://www.youtube.com/watch?v=_5nvdQms7d4

Code

🔗 GitHub: https://github.com/navng0405/aasa

Repository layout:

aasa/
├── Aasa/                    # Active Android app (Kotlin, Compose, Room, LiteRT-LM)
├── aasa-gemma-server/       # Optional Mac FastAPI→Ollama dev fallback bridge
├── AasaGemmaBridgePoc/      # Legacy PoC, not active
└── AASA_PROJECT_OVERVIEW.md # Full architecture + demo doc

Key entry points for reviewers:

  • agent/AgentOrchestrator.kt : single entry for every turn; runs deterministic safety overrides before trusting the model.
  • model/GemmaRouter.kt + model/OnDeviceGemmaRunner.kt : LiteRT-LM integration; on-device by default.
  • model/OnDevicePromptBuilder.kt : strict JSON contract the model must return (tool, intent, riskLevel, arguments, assistantResponse).
  • tools/ : 13 local tools. None of them can dial, text, or alert directly; they return ToolResult payloads the UI renders as confirmable action cards.
  • document/DocumentReaderAnalyzer.kt & medicine/MedicineLensAnalyzer.kt : multimodal OCR → Gemma reasoning pipelines.
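
The "tools return payloads, the UI confirms" rule might look roughly like the sketch below. `ToolResult` is named in the repository, but its shape here is a guess, and `ActionCard`, `emergencyDialTool`, and the placeholder number are hypothetical. The string `"android.intent.action.DIAL"` is the value of Android's `Intent.ACTION_DIAL`, which opens the dialer without placing the call.

```kotlin
// Hypothetical shapes; the real ToolResult in tools/ may differ.
sealed interface ToolResult {
    data class Message(val text: String) : ToolResult
    data class ProposedAction(
        val label: String,        // text shown on the action card
        val intentAction: String, // e.g. the value of Intent.ACTION_DIAL
        val data: String,         // e.g. a "tel:" URI
    ) : ToolResult
}

// A card the UI renders; the action stays pending until the elder taps.
class ActionCard(val proposal: ToolResult.ProposedAction) {
    var confirmed: Boolean = false
        private set

    fun onElderTap() {
        confirmed = true
    }

    // Returns the action to launch only after an explicit tap; null before.
    fun actionToLaunch(): ToolResult.ProposedAction? =
        if (confirmed) proposal else null
}

// Example tool: prepares a dial card but cannot place a call itself.
fun emergencyDialTool(number: String): ToolResult =
    ToolResult.ProposedAction(
        label = "Call $number",
        intentAction = "android.intent.action.DIAL",
        data = "tel:$number",
    )
```

Because the tool only returns data, nothing in it needs the CALL_PHONE permission; the UI layer decides when, and whether, to launch the proposed action.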

To run:

cd Aasa
./gradlew :app:installDebug
# Side-load the model (not bundled in APK):
adb push gemma-4-E2B-it.litertlm \
  /sdcard/Android/data/com.aasa.eldercare/files/models/

Model: gemma-4-E2B-it.litertlm from litert-community/gemma-4-E2B-it-litert-lm.

How I Used Gemma 4

Model chosen: Gemma 4 E2B (Small Sizes, ~2B effective parameters)

It runs via LiteRT-LM 0.11.0 on Android, side-loaded as a .litertlm artifact, with an optional FastAPI→Ollama Mac bridge as a developer-only fallback.

Why E2B was the right tool for this job

The challenge brief asks for intentional model selection. For an elder-care safety companion, the decision wasn't close: E2B was the only honest choice. Here's the reasoning:

| Requirement of the product | What it forces in the model | Why E2B fits |
| --- | --- | --- |
| The phone of an elder living alone is the worst possible place to depend on a network. A scam call at 9pm, a fall at 3am, a hospital bill on a Sunday: none of these can wait for a cloud round-trip or a working Wi-Fi router. | Must run fully offline on a Pixel-class device, with no degraded fallback path. | E2B is purpose-built for "ultra-mobile, edge, and browser deployment (e.g., Pixel)." A 31B dense or 26B MoE model would have required a server, which would have broken the entire trust model. |
| Personal medical, family, and scam-message data must never leave the device. This isn't a marketing line; it's why the elder's daughter can recommend the app. | Inference must happen on the same device that holds Room storage. | E2B fits comfortably in the memory and thermal envelope of a Pixel 4a/6a. The user's bill photo, the granddaughter's birthday, the scam SMS: none of it touches a server. |
| Latency must feel like a conversation, not a query. Elders abandon apps that pause. | Sub-2-second first-token latency on commodity hardware. | E2B delivers this on-device. A larger Gemma 4 variant would have meant either a server (breaks rule #1) or unusably slow local inference. |
| The model is the conversational shape; deterministic local rules own safety. Aasa's AgentOrchestrator runs keyword-based overrides for fall, scam, medication, and emergency phrases before the model's action is dispatched. | The model needs to be good enough at intent + tool selection + tone, not a doctor or a lawyer. | E2B is more than capable of: (a) emitting a structured JSON action, (b) writing warm, plain-language replies, and (c) summarizing OCR'd documents. We don't need 31B-grade world knowledge because the safety facts live in deterministic code and Room. |
| Multimodal document/medicine understanding without sending images to a cloud. | A model small enough to pair with on-device ML Kit OCR and still respond in seconds. | E2B handles the OCR-extracted text reasoning step locally. The pipeline is: camera → ML Kit OCR on-device → E2B reasoning on-device → safety receipt. The image never leaves the phone. |
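
The camera → OCR → reasoning pipeline can be sketched without any Android dependencies by passing the OCR and model steps in as plain functions. Everything here is an assumption made for illustration: `DocumentReader`, its two function parameters (stand-ins for ML Kit OCR and the on-device E2B runner), and the prompt wording are not taken from `DocumentReaderAnalyzer.kt`.

```kotlin
// Hypothetical wiring; ML Kit and LiteRT-LM are replaced by plain function
// types so the sketch runs without Android.
class DocumentReader(
    private val ocr: (ByteArray) -> String,  // stand-in for on-device OCR
    private val reason: (String) -> String,  // stand-in for on-device E2B
) {
    // Builds a request for the four-section summary described in the post.
    fun buildPrompt(ocrText: String): String = buildString {
        appendLine("Explain this document to an older adult in plain language.")
        appendLine("Answer in four sections:")
        appendLine("1. What it is")
        appendLine("2. What it is asking")
        appendLine("3. What to worry about")
        appendLine("4. What you can ignore")
        appendLine("Document text:")
        append(ocrText.trim())
    }

    // Only the extracted text is passed to the model; the image bytes are
    // used by the OCR step and go no further.
    fun read(image: ByteArray): String {
        val text = ocr(image)
        if (text.isBlank()) {
            return "I could not read any text. Please try another photo."
        }
        return reason(buildPrompt(text))
    }
}
```

Injecting both steps as functions keeps the sketch testable with fakes, and makes the data flow explicit: the model step sees a string, never the photo.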

What Gemma 4 actually does in the codebase

Gemma 4 E2B is doing real, load-bearing work on every turn:

  1. Intent + tool routing. Given the elder's raw utterance ("I feel weak and missed my medicine"), Gemma emits a JSON object with intent, riskLevel, tool, arguments, and assistantResponse. This drives which of the 13 tools runs.
  2. Tone shaping. Aasa's voice (calm, non-shaming, never diagnostic) comes from Gemma's reply text, constrained by the prompt contract in OnDevicePromptBuilder.kt.
  3. Document & medicine reasoning. After ML Kit OCRs a bill or pill strip, Gemma 4 produces the four-section summary (what it is / what it's asking / what to worry about / what you can ignore) and the medicine safety receipt.
  4. Scam pattern explanation. Deterministic keyword scanning catches gift-card/urgency phrases; Gemma 4 explains them in respectful language so the elder is informed without being shamed.
  5. Companion replies. When the elder says "I feel lonely today," Gemma 4 generates the gentle reply and offers the trusted-contact action card, but never auto-sends it.
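
The five-field JSON contract from step 1 could be checked along the following lines. This is a sketch under stated assumptions: the field names come from the post, but `ModelAction`, `parseModelAction`, the three risk values, and the example tool name are hypothetical, and the JSON is assumed to have already been parsed into a `Map` by whatever parser the app uses.

```kotlin
// Hypothetical contract check; field names come from the post, everything
// else (risk values, null-on-failure behaviour) is assumed for illustration.
enum class RiskLevel { LOW, MEDIUM, HIGH }

data class ModelAction(
    val tool: String,
    val intent: String,
    val riskLevel: RiskLevel,
    val arguments: Map<String, Any?>,
    val assistantResponse: String,
)

/**
 * Validates an already-parsed JSON object against the five-field contract.
 * Returns null when a field is missing, has the wrong type, or names an
 * unknown tool, so the caller can fall back to a safe default reply.
 */
fun parseModelAction(json: Map<String, Any?>, knownTools: Set<String>): ModelAction? {
    val tool = json["tool"] as? String ?: return null
    if (tool !in knownTools) return null
    val intent = json["intent"] as? String ?: return null
    val riskText = json["riskLevel"] as? String ?: return null
    val risk = RiskLevel.values()
        .firstOrNull { it.name.equals(riskText, ignoreCase = true) }
        ?: return null
    val rawArgs = json["arguments"] as? Map<*, *> ?: return null
    val args = rawArgs.entries.associate { it.key.toString() to it.value }
    val reply = (json["assistantResponse"] as? String)
        ?.takeIf { it.isNotBlank() }
        ?: return null
    return ModelAction(tool, intent, risk, args, reply)
}
```

Because the JSON is prompt-engineered rather than grammar-constrained, treating any malformed or unknown-tool output as `null` gives the caller one clear place to substitute a safe reply.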

What E2B unlocked

Picking the smallest Gemma 4 variant wasn't a compromise; it was the enabling constraint that let Aasa make three promises it couldn't have made with a bigger model:

  1. It works on the elder's existing phone. No second device, no cloud account, no monthly subscription.
  2. The data stays home. Medical context, family memories, and scam messages never leave the device. That is the product.
  3. It works at 3am with no Wi-Fi. Which, for an elder living alone, is the only deployment scenario that actually matters.

Honest limits

  • The model is side-loaded via adb, not downloaded in-app (hackathon scope).
  • I use prompt-engineered JSON rather than native constrained tool calling, a deliberate trade-off so the same prompt contract works for both the on-device runner and the optional Mac bridge.
  • Deterministic local rules outrank the model on safety-critical phrases. If Gemma says "low risk" but the user said "I cannot breathe," the local rule wins. This is by design, not a workaround.
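
The "local rule wins" behaviour can be sketched as a small override function. The keyword lists, tool names, and the `Decision` type below are assumptions made for illustration; the actual rules live in `AgentOrchestrator.kt` and may be structured differently.

```kotlin
// Hypothetical override table; the categories echo the post, the concrete
// phrases and tool names are assumptions.
enum class Risk { LOW, MEDIUM, HIGH }

data class Decision(val tool: String, val risk: Risk, val overridden: Boolean)

val EMERGENCY_PHRASES = listOf("cannot breathe", "can't breathe", "chest pain")
val FALL_PHRASES = listOf("i fell", "i have fallen")

/**
 * Applies keyword rules to the utterance before the model's action is
 * dispatched. A matching rule replaces the model's tool and risk level;
 * otherwise the model's choice stands.
 */
fun applySafetyOverrides(utterance: String, modelTool: String, modelRisk: Risk): Decision {
    val text = utterance.lowercase()
    return when {
        EMERGENCY_PHRASES.any { it in text } ->
            Decision(tool = "prepare_emergency_dial", risk = Risk.HIGH, overridden = true)
        FALL_PHRASES.any { it in text } ->
            Decision(tool = "fall_triage", risk = Risk.HIGH, overridden = true)
        else -> Decision(modelTool, modelRisk, overridden = false)
    }
}
```

With these assumed rules, `applySafetyOverrides("I cannot breathe", "companion_reply", Risk.LOW)` returns a high-risk decision with `overridden = true`, regardless of what the model suggested.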

Grounding study: Kwan, C. & Tam, H. C. (2022). "What If I Die and No One Notices?" A Qualitative Study Exploring How Living Alone and in Poverty Impacts the Health and Well-Being of Older People in Hong Kong. Int J Environ Res Public Health, 19(23), 15856. PMID 36497930.

Reach Out
Built by Naveen Gnanavel. If you have any feedback, questions, or are interested in collaborating on Aasa or similar AI-driven developer tooling, I'd love to hear from you!
