<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: John Paul Curada</title>
    <description>The latest articles on DEV Community by John Paul Curada (@jpcurada).</description>
    <link>https://dev.to/jpcurada</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3948424%2F02103542-0bbc-4b9c-a9ff-d42e35aeca84.jpeg</url>
      <title>DEV Community: John Paul Curada</title>
      <link>https://dev.to/jpcurada</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jpcurada"/>
    <language>en</language>
    <item>
      <title>LIKAS: An offline disaster companion for the Philippines, powered by on-device Gemma 4 E2B</title>
      <dc:creator>John Paul Curada</dc:creator>
      <pubDate>Sun, 24 May 2026 02:17:53 +0000</pubDate>
      <link>https://dev.to/jpcurada/likas-an-offline-disaster-companion-for-the-philippines-powered-by-on-device-gemma-4-e2b-ao1</link>
      <guid>https://dev.to/jpcurada/likas-an-offline-disaster-companion-for-the-philippines-powered-by-on-device-gemma-4-e2b-ao1</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LIKAS&lt;/strong&gt; — &lt;em&gt;your companion when calamity strikes the nation.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Likas&lt;/em&gt;, in Filipino, means &lt;strong&gt;nature&lt;/strong&gt; — and also &lt;strong&gt;to evacuate&lt;/strong&gt;. LIKAS is an offline, AI-powered disaster companion that helps Filipinos get to safety when nature turns against them.&lt;/p&gt;

&lt;p&gt;In the Philippines, the disaster &lt;em&gt;is&lt;/em&gt; the connectivity outage. The 2024 typhoon season alone brought six major storms in 30 days; the country averages about 20 typhoons and 100–150 felt earthquakes a year, plus recurring volcanic activity, and 74% of the population lives in disaster-prone areas. When a typhoon makes landfall or a fault ruptures, cell towers fail and the internet drops &lt;strong&gt;at the exact moment a family needs to know where the nearest evacuation center is and how to reach it.&lt;/strong&gt; Every "disaster app" I tried assumed the one thing you cannot count on in a disaster: a working network.&lt;/p&gt;

&lt;p&gt;LIKAS is a React Native app (Android/iOS) that turns a phone into a self-contained survival tool. Maps, evacuation centers, a pre-computed OSM pedestrian routing graph, NDRRMC/PAGASA/PHIVOLCS protocols, and a fine-tuned &lt;strong&gt;Gemma 4 E2B&lt;/strong&gt; model are bundled at install time. &lt;strong&gt;Zero network calls at runtime — by mandate, not as a fallback.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The centerpiece is an AI assistant that deliberately does &lt;strong&gt;not&lt;/strong&gt; behave like a chatbot. A 2B model that free-forms safety advice is a liability; one that &lt;em&gt;routes intent to grounded data&lt;/em&gt; is an asset. So LIKAS uses Gemma 4 as a &lt;strong&gt;task-specific tool-dispatcher&lt;/strong&gt;: every turn it emits exactly one JSON envelope — a tool call or a spoken reply, never prose around it.&lt;/p&gt;

&lt;p&gt;One real query, end to end: the user types, in Taglish, &lt;em&gt;"Saan kami pwedeng lumikas? May aso at lola ako"&lt;/em&gt; ("Where can we evacuate? I have a dog and a grandmother"). Turn 1: the model emits &lt;code&gt;{"action":"tool","name":"route_to_nearest_evacuation","args":{}}&lt;/code&gt;. The app resolves this &lt;strong&gt;entirely on-device&lt;/strong&gt;: Dijkstra over the pedestrian graph finds walkable routes, then a weighted scorer ranks centers (&lt;code&gt;distance·0.4 + pwd·0.3 + pet·0.2 + capacity·0.1&lt;/code&gt;). Turn 2: fed the tool result, the model produces the final answer, &lt;strong&gt;personalized from the on-device profile&lt;/strong&gt;. It surfaces the pet-friendly, PWD-accessible center &lt;em&gt;because&lt;/em&gt; the question named a dog and a lola, and it replies in Taglish, the language of the question. No byte left the phone.&lt;/p&gt;
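
&lt;p&gt;The weighted scorer in the walkthrough above can be sketched as follows. This is a minimal illustration, not the app's code: the field names, the normalization of every term to the range 0 to 1, and the convention that a higher score is better are all assumptions. Only the four weights come from the formula above.&lt;/p&gt;

```python
from dataclasses import dataclass

# Weights taken from the formula in the post; everything else is assumed.
WEIGHTS = {"distance": 0.4, "pwd": 0.3, "pet": 0.2, "capacity": 0.1}


@dataclass
class Center:
    name: str
    walk_m: float          # walking distance from the routing step (hypothetical field)
    pwd_accessible: bool   # hypothetical field
    pet_friendly: bool     # hypothetical field
    free_ratio: float      # share of capacity still free, 0.0 to 1.0 (hypothetical)


def score(center, max_walk_m):
    """Return a score in [0, 1]; higher means a better match (assumed convention)."""
    # Closer centers score higher: 1.0 at the user's position, 0.0 at max_walk_m.
    closeness = 1.0 - min(center.walk_m / max_walk_m, 1.0)
    return (
        WEIGHTS["distance"] * closeness
        + WEIGHTS["pwd"] * float(center.pwd_accessible)
        + WEIGHTS["pet"] * float(center.pet_friendly)
        + WEIGHTS["capacity"] * center.free_ratio
    )


def rank(centers, max_walk_m=5000.0):
    """Sort centers from best to worst score."""
    return sorted(centers, key=lambda c: score(c, max_walk_m), reverse=True)


centers = [
    Center("Covered Court A", walk_m=500, pwd_accessible=False, pet_friendly=False, free_ratio=0.5),
    Center("Elementary School B", walk_m=1500, pwd_accessible=True, pet_friendly=True, free_ratio=0.5),
]
best = rank(centers)[0]
print(best.name)
```

&lt;p&gt;With these made-up centers, the farther school outranks the closer court because the accessibility and pet terms together outweigh the distance term, which is the behavior the walkthrough describes.&lt;/p&gt;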

&lt;p&gt;Four tools ground every safety-critical answer in authority data, not the model's parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;get_protocol&lt;/code&gt;&lt;/strong&gt; quotes NDRRMC/PHIVOLCS/PAGASA steps &lt;em&gt;verbatim&lt;/em&gt; (inventing safety steps is forbidden)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;route_to_nearest_evacuation&lt;/code&gt;&lt;/strong&gt; does offline, profile-aware routing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;find_nearby&lt;/code&gt;&lt;/strong&gt; is offline POI search (hospital, school, gym, multi-purpose hall, covered court)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;get_user_profile&lt;/code&gt;&lt;/strong&gt; supplies on-device personalization (conditions, companions, meeting points)&lt;/li&gt;
&lt;/ul&gt;
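
&lt;p&gt;A rough sketch of how one envelope per turn might be dispatched to these four tools is shown below. The handlers are stubs with made-up return values, and the &lt;code&gt;speak&lt;/code&gt; envelope's &lt;code&gt;text&lt;/code&gt; field is an assumed name; only the tool names and the tool-call envelope shape come from the post.&lt;/p&gt;

```python
import json


# Stub handlers standing in for the on-device data layer (hypothetical return values).
def get_protocol(args):
    return {"steps": ["(verbatim protocol text would be returned here)"]}


def route_to_nearest_evacuation(args):
    return {"center": "Example Evacuation Center", "walk_m": 850}


def find_nearby(args):
    return {"pois": []}


def get_user_profile(args):
    return {"companions": ["dog", "grandmother"]}


TOOLS = {
    "get_protocol": get_protocol,
    "route_to_nearest_evacuation": route_to_nearest_evacuation,
    "find_nearby": find_nearby,
    "get_user_profile": get_user_profile,
}


def handle_envelope(raw):
    """Parse one model turn and return (kind, payload)."""
    envelope = json.loads(raw)
    if envelope["action"] == "tool":
        handler = TOOLS[envelope["name"]]   # KeyError on an unknown tool name
        return "tool_result", handler(envelope.get("args", {}))
    if envelope["action"] == "speak":
        return "reply", envelope["text"]    # "text" is an assumed field name
    raise ValueError("unknown action: " + str(envelope["action"]))


kind, payload = handle_envelope(
    '{"action":"tool","name":"route_to_nearest_evacuation","args":{}}'
)
print(kind, payload)
```

&lt;p&gt;In a two-turn exchange, a &lt;code&gt;tool_result&lt;/code&gt; would be fed back to the model, and a &lt;code&gt;reply&lt;/code&gt; would end the turn.&lt;/p&gt;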

&lt;h2&gt;Demo&lt;/h2&gt;

&lt;p&gt;📺 &lt;strong&gt;Video walkthrough:&lt;/strong&gt; &lt;a href="https://www.youtube.com/watch?v=kHHcDSyip-Q" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=kHHcDSyip-Q&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Code&lt;/h2&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/JpCurada/likas" rel="noopener noreferrer"&gt;https://github.com/JpCurada/likas&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Supporting artifacts (all public):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Training Dataset&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.kaggle.com/datasets/jeypiic/likas-ai-datasets/" rel="noopener noreferrer"&gt;https://www.kaggle.com/datasets/jeypiic/likas-ai-datasets/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fine-Tuning Notebook (Unsloth)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.kaggle.com/code/jeypiic/likas-fine-tuning-gemma-4-e2b-with-unsloth" rel="noopener noreferrer"&gt;https://www.kaggle.com/code/jeypiic/likas-fine-tuning-gemma-4-e2b-with-unsloth&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GGUF Export Notebook&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.kaggle.com/code/jeypiic/likas-with-llama-ccp-llama-rn-gguf-export" rel="noopener noreferrer"&gt;https://www.kaggle.com/code/jeypiic/likas-with-llama-ccp-llama-rn-gguf-export&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sample Prompts Notebook&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.kaggle.com/code/jeypiic/likas-sample-prompts-for-the-fine-tuned-model" rel="noopener noreferrer"&gt;https://www.kaggle.com/code/jeypiic/likas-sample-prompts-for-the-fine-tuned-model&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fine-Tuned Model (GGUF)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://huggingface.co/jpcurada/likas-gemma4-e2b-gguf" rel="noopener noreferrer"&gt;https://huggingface.co/jpcurada/likas-gemma4-e2b-gguf&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LoRA Adapter&lt;/td&gt;
&lt;td&gt;&lt;a href="https://huggingface.co/jpcurada/likas-gemma4-e2b-lora" rel="noopener noreferrer"&gt;https://huggingface.co/jpcurada/likas-gemma4-e2b-lora&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;How I Used Gemma 4&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;I chose Gemma 4 E2B — the 2B effective-parameter edge variant — because the entire premise of LIKAS is that the model has to live inside a phone that just lost signal.&lt;/strong&gt; The 4B and 31B variants would have been more capable; they would also have been impossible. E2B was the only flavor where a Q4_K_M quantization (~1.8 GB) fits comfortably in the RAM budget of a mid-range Android phone someone already owns, alongside MapLibre, the pedestrian graph, and the OSM POI data. The judges' question — "show us why your model was the right tool for the job" — has a sharp answer here: &lt;strong&gt;the disaster is the connectivity outage, so the model must run where the disaster is, and only E2B can.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But putting a 2B model in the safety-critical path required two design moves that I think are the most interesting things about this submission:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Fine-tuning Gemma 4 E2B for one specific, life-or-death task, not as a better chatbot.&lt;/strong&gt; Using &lt;strong&gt;Unsloth's LoRA pipeline&lt;/strong&gt; I trained Gemma 4 E2B against the &lt;em&gt;same&lt;/em&gt; 9-rule system prompt the app ships: verbatim protocol quoting, mandatory tool use for safety/evacuation queries, profile personalization, off-topic refusal, same-language response across English / Filipino / Taglish. Because training and inference share one prompt verbatim, the prompt introduces no distribution shift between the evaluation notebook and the phone. The dataset was the hard part: v3 generated 692 conversations, but only ~30% of assistant turns were unique, because the verbatim-quote rule mapped many paraphrased questions to the &lt;em&gt;same&lt;/em&gt; protocol texts. The overfitting signature was unmistakable: train loss collapsed 3.47→0.31 in 24 steps while validation loss barely moved. The fix was a &lt;strong&gt;rule-preserving paraphrase generator&lt;/strong&gt; whose &lt;code&gt;extract_rules()&lt;/code&gt; pass parses every NDRRMC/PHIVOLCS step out of the source and asserts its presence in every variant, so surface form varies (terse / numbered / urgent / reassuring) without ever dropping a safety instruction.&lt;/p&gt;
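
&lt;p&gt;The rule-preserving check described above might look roughly like this. It is a sketch under assumptions: the source protocol is taken to be a numbered list, "presence" is taken to mean a case-insensitive substring match, and the helper names other than &lt;code&gt;extract_rules&lt;/code&gt; are hypothetical. The sample protocol text is a placeholder, not an official advisory.&lt;/p&gt;

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so surface formatting does not matter."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def extract_rules(source):
    """Split a numbered protocol into its individual steps (assumed input format)."""
    steps = re.split(r"\s*\d+[.)]\s+", source)
    return [normalize(s).rstrip(".") for s in steps if s.strip()]


def missing_rules(source, variant):
    """Return every source step that does not appear in the variant."""
    body = normalize(variant)
    return [rule for rule in extract_rules(source) if rule not in body]


def unique_ratio(assistant_turns):
    """Share of distinct assistant turns; a low value signals duplicated targets."""
    return len(set(map(normalize, assistant_turns))) / len(assistant_turns)


# Placeholder protocol text for illustration only.
source = "1. Drop to the ground. 2. Take cover under a sturdy table. 3. Hold on until the shaking stops."
urgent = "NOW: drop to the ground! Take cover under a sturdy table, and hold on until the shaking stops."
lossy = "Drop to the ground and hold on until the shaking stops."

print(missing_rules(source, urgent))   # empty list: every step survived the rewrite
print(missing_rules(source, lossy))    # lists the step the paraphrase dropped
```

&lt;p&gt;A generator could reject any variant for which &lt;code&gt;missing_rules&lt;/code&gt; is non-empty, and &lt;code&gt;unique_ratio&lt;/code&gt; gives a quick way to spot the duplicated-target problem before training.&lt;/p&gt;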

&lt;p&gt;&lt;strong&gt;2. Running Gemma 4 on-device via &lt;code&gt;llama.cpp&lt;/code&gt; with a GBNF grammar around it.&lt;/strong&gt; The fine-tuned model is quantized to &lt;strong&gt;Q4_K_M GGUF&lt;/strong&gt; and runs entirely on-device through &lt;strong&gt;&lt;code&gt;llama.rn&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;^0.12.0&lt;/code&gt;) — the full &lt;code&gt;llama.cpp&lt;/code&gt; engine in-process on the phone (&lt;code&gt;n_ctx: 4096&lt;/code&gt;, &lt;code&gt;n_threads: 4&lt;/code&gt;, full GPU-layer offload, no server). Sampling is locked low (&lt;code&gt;temperature 0.4&lt;/code&gt;): a disaster dispatcher should not be creative. Every turn must be exactly one parseable envelope, on a phone, with no server to retry against. Prompting alone can't guarantee that — so the decoder is constrained by a &lt;strong&gt;GBNF grammar&lt;/strong&gt; built from the live tool registry: one &lt;code&gt;speak&lt;/code&gt; production or one of four tool productions, with tool names and argument enums baked in as literals. The decoder physically cannot emit an invalid tool name or a half-formed object.&lt;/p&gt;
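
&lt;p&gt;To make the grammar idea concrete, here is a hedged sketch of generating GBNF text from a tool registry. The tool names come from the post; the argument names, the enum values, and the &lt;code&gt;speak&lt;/code&gt; envelope's &lt;code&gt;text&lt;/code&gt; field are assumptions, and the app's actual &lt;code&gt;aiGrammar.ts&lt;/code&gt; may be structured differently.&lt;/p&gt;

```python
import json

# Hypothetical registry: argument names and enum values are assumed for this sketch.
REGISTRY = {
    "get_protocol": {"hazard": ["typhoon", "earthquake", "volcano"]},
    "route_to_nearest_evacuation": {},
    "find_nearby": {"category": ["hospital", "school", "gym"]},
    "get_user_profile": {},
}

# A JSON string rule in GBNF notation (simplified: no \u escapes).
STRING_RULE = r'''string ::= "\"" ([^"\\] | "\\" ["\\/bfnrt])* "\""'''


def lit(text):
    """Quote text as a GBNF string literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def rule_name(tool):
    """Use dashes in rule names, following the dashed-name style of GBNF grammars."""
    return tool.replace("_", "-")


def tool_rule(tool, enums):
    """Build one production that fixes the tool name and its enum arguments."""
    parts = [lit('{"action":"tool","name":' + json.dumps(tool) + ',"args":{')]
    for i, (arg, values) in enumerate(enums.items()):
        prefix = "," if i else ""
        parts.append(lit(prefix + json.dumps(arg) + ":"))
        parts.append("(" + " | ".join(lit(json.dumps(v)) for v in values) + ")")
    parts.append(lit("}}"))
    return rule_name(tool) + " ::= " + " ".join(parts)


def build_grammar(registry):
    """Return the grammar: one speak production or one production per tool."""
    lines = [
        "root ::= speak | tool",
        "speak ::= " + lit('{"action":"speak","text":') + " string " + lit("}"),
        "tool ::= " + " | ".join(rule_name(t) for t in registry),
    ]
    lines += [tool_rule(t, enums) for t, enums in registry.items()]
    lines.append(STRING_RULE)
    return "\n".join(lines)


grammar = build_grammar(REGISTRY)
print(grammar)
```

&lt;p&gt;Because tool names and enum values appear only as literals, a decoder constrained by such a grammar has no path to a tool name outside the registry. The resulting string would be passed to the runtime's grammar option; that wiring is not shown here.&lt;/p&gt;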

&lt;p&gt;A grammar guarantees &lt;em&gt;shape&lt;/em&gt;, not &lt;em&gt;choice&lt;/em&gt;. The fine-tuned model is capable but inconsistent: for an evacuation question it sometimes emits the clean tool call and sometimes &lt;em&gt;narrates&lt;/em&gt; the tool inside a &lt;code&gt;speak&lt;/code&gt; reply, leaving nothing on the map at the exact moment it matters most. So the dispatch loop is wrapped in a &lt;strong&gt;deterministic rescue&lt;/strong&gt;: if the user's message unambiguously asks for evacuation or a nearby POI and the matching tool never ran, the app invokes that tool itself and pushes the route/pins to the map regardless of which shape the model chose. There is also a full no-LLM fallback: if the model fails to load or the battery is below 15%, the same keyword router still resolves evacuation and POI queries against the on-device data. &lt;strong&gt;The grounding survives even a dead model.&lt;/strong&gt;&lt;/p&gt;
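
&lt;p&gt;The rescue and fallback logic can be illustrated with a small keyword router. Treat this as a sketch: the keyword patterns, function names, and stub executor are invented for the example, and only the two intents (evacuation, nearby POI) and the 15% battery threshold come from the description above.&lt;/p&gt;

```python
import re

# Illustrative keyword patterns only; the app's real word lists are not shown in the post.
INTENT_PATTERNS = {
    "route_to_nearest_evacuation": re.compile(
        r"\b(evacuat\w*|lumikas|likas)\b", re.IGNORECASE
    ),
    "find_nearby": re.compile(
        r"\b(nearest|nearby|malapit\w*)\b.*\b(hospital|ospital|school|gym)\b",
        re.IGNORECASE,
    ),
}


def keyword_route(message):
    """Return the tool an unambiguous message asks for, or None."""
    hits = [tool for tool, pattern in INTENT_PATTERNS.items() if pattern.search(message)]
    # Zero hits or several hits both count as ambiguous, so nothing is forced.
    return hits[0] if len(hits) == 1 else None


def rescue(message, tools_already_run, run_tool):
    """Invoke the matching tool if the model skipped it; return its result or None."""
    wanted = keyword_route(message)
    if wanted is None or wanted in tools_already_run:
        return None
    return run_tool(wanted, {})


def should_use_llm(model_loaded, battery_pct):
    """No-LLM fallback applies when the model is missing or the battery is below 15%."""
    return model_loaded and battery_pct >= 15


calls = []


def run_tool(name, args):
    # Stub executor: records the call instead of touching real map data.
    calls.append(name)
    return {"tool": name}


# The model narrated the route inside a speak reply, so no tool has run yet.
result = rescue("Saan kami pwedeng lumikas? May aso at lola ako", set(), run_tool)
print(result)
```

&lt;p&gt;The same &lt;code&gt;keyword_route&lt;/code&gt; function can serve both paths: as a safety net after the model's turn, and as the only router when &lt;code&gt;should_use_llm&lt;/code&gt; returns false.&lt;/p&gt;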

&lt;p&gt;This separation is the heart of my use of Gemma 4: &lt;strong&gt;native function calling on the edge keeps a small model honest.&lt;/strong&gt; Gemma owns what it is good at — parsing messy, code-switched Filipino intent and producing fluent, personalized replies. The device owns what must never be hallucinated — routes, distances, protocol text, personal data. The model never &lt;em&gt;recalls&lt;/em&gt; a safety step; it &lt;em&gt;fetches&lt;/em&gt; one. That is what makes a 2B model trustworthy enough to put in front of someone mid-evacuation.&lt;/p&gt;

&lt;p&gt;Everything is verifiable: &lt;code&gt;Likas/src/services/aiAssistantService.ts&lt;/code&gt; runs the dispatch loop, &lt;code&gt;aiGrammar.ts&lt;/code&gt; builds the GBNF from the tool registry, &lt;code&gt;Likas/scripts/build_dataset_v4.py&lt;/code&gt; contains the overfit numbers, and &lt;code&gt;notebooks/Likas_Sample_Prompts.ipynb&lt;/code&gt; reproduces the exact production prompt and grammar at the same &lt;code&gt;n_ctx=4096&lt;/code&gt; — so on-device behavior is inspectable without building the app.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
