<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rosius Ndimofor</title>
    <description>The latest articles on DEV Community by Rosius Ndimofor (@rosius_ndimofor_a6fdb4ebb).</description>
    <link>https://dev.to/rosius_ndimofor_a6fdb4ebb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3918353%2Fc1727906-c4c1-4cce-881c-0caa977bb154.jpg</url>
      <title>DEV Community: Rosius Ndimofor</title>
      <link>https://dev.to/rosius_ndimofor_a6fdb4ebb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rosius_ndimofor_a6fdb4ebb"/>
    <language>en</language>
    <item>
      <title>What 10 University Visits in Cameroon Taught Me About Building AI for the Real World, and Why Gemma 4 Was the Answer</title>
      <dc:creator>Rosius Ndimofor</dc:creator>
      <pubDate>Thu, 21 May 2026 19:59:35 +0000</pubDate>
      <link>https://dev.to/rosius_ndimofor_a6fdb4ebb/educloud-a-fully-offline-ai-study-assistant-powered-by-gemma-4-e2b-mg1</link>
      <guid>https://dev.to/rosius_ndimofor_a6fdb4ebb/educloud-a-fully-offline-ai-study-assistant-powered-by-gemma-4-e2b-mg1</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why I built it
&lt;/h3&gt;

&lt;p&gt;I run &lt;a href="https://educloud.academy" rel="noopener noreferrer"&gt;&lt;strong&gt;EduCloud Academy&lt;/strong&gt;&lt;/a&gt;, a learning platform focused on cloud and AI skills for African students. Over the last year my team and I have done university outreach across &lt;strong&gt;10 Cameroonian universities&lt;/strong&gt;, including my own alma mater, the &lt;strong&gt;University of Buea&lt;/strong&gt;, running on-the-ground sessions on how to break into cloud and AI careers.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Outreach evidence (LinkedIn):&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/posts/rosius_continuing-our-university-outreach-this-activity-7418985734780473345-gc7r" rel="noopener noreferrer"&gt;Continuing our university outreach&lt;/a&gt; — University of Buea session&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/posts/educloud-academy_educloudacademy-ugcPost-7345085251271991296-W11K" rel="noopener noreferrer"&gt;EduCloud Academy: outreach announcement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/posts/kenn-brayan-nya-527846336_ndimoforatehrosius-cabreldomfang-ugcPost-7344652264915046401--ErQ" rel="noopener noreferrer"&gt;Student testimonial post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/feed/update/urn:li:activity:7351200292027129856/" rel="noopener noreferrer"&gt;Earlier outreach activity&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two problems kept showing up at every single campus, no matter the topic:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The internet is unreliable.&lt;/strong&gt; Students who'd love to use ChatGPT, NotebookLM, Claude, or Cohere Coral simply can't, because their connection drops mid-prompt, the data is expensive, or campus Wi-Fi can't sustain a real session. AI tools that require a cloud round-trip are &lt;em&gt;unusable&lt;/em&gt; for the majority of the students I've met.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Most of Cameroon is French-speaking.&lt;/strong&gt; A huge chunk of every audience I've sat in front of doesn't have English as a first language. But almost every AI study tool ships English-first, with French either missing or relegated to lossy machine translation that strips technical nuance.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;EduCloud (the app)&lt;/strong&gt; is my answer to both. It's a fully offline, on-device AI study assistant that turns a learner's own PDFs, textbooks, and lecture screenshots into interactive study materials (multiple-choice quizzes, flashcards, multi-lesson workshops, mind maps, and summaries) &lt;strong&gt;without a single byte ever leaving the device&lt;/strong&gt;. Because Gemma 4 is genuinely multilingual, the &lt;em&gt;same&lt;/em&gt; binary that serves an English-speaking student in Buea serves a French-speaking student in Douala without an extra translation hop.&lt;/p&gt;

&lt;h3&gt;
  
  
  What it does
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Local RAG&lt;/strong&gt;: PDFs are chunked, embedded via Gecko 512, and indexed in ObjectBox with an HNSW vector index. Searches are millisecond-fast over thousands of chunks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision ingestion&lt;/strong&gt;: Snap a photo of a textbook page or whiteboard; Gemma 4's vision modality extracts the text, and the same RAG pipeline takes it from there. (Critical for students working from photocopied notes.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured generation via tool calling&lt;/strong&gt;: Quizzes, flashcards, and workshop outlines are emitted as constrained function calls so the model literally cannot produce malformed JSON. The runtime rejects invalid tokens at generation time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming Markdown lessons&lt;/strong&gt; with visible chain-of-thought: workshop lessons stream sentence-by-sentence, and Gemma 4's reasoning channel renders in a collapsible "thinking" panel so students can &lt;em&gt;see how the model arrived at an answer&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive GenUI tutor&lt;/strong&gt;: a chat companion that can call action tools mid-conversation — award a badge, start a Pomodoro timer, narrate an explanation aloud, launch a subject-specific mini-game.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adaptive memory tiers&lt;/strong&gt;: the app detects the device's total RAM at startup, picks one of four tiers (&lt;code&gt;ultraLean / lean / balanced / full&lt;/code&gt;), and adjusts token budgets, RAG context size, and max question count to match. That lets it run cleanly on the iPhone 13 Pro Max I tested on as well as on the lower-end and mid-range Android phones that students actually own.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background task system&lt;/strong&gt;: generation and ingestion run on non-blocking queues with hard timeouts, per-chunk progress counters, and a Details dialog that surfaces the exact reason on failure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Defense-in-depth JSON repair&lt;/strong&gt;: a 9-pass state-aware repair pipeline still runs as a fallback for the rare cases when tool calling isn't honored — so the app degrades gracefully instead of crashing.&lt;/li&gt;
&lt;/ul&gt;
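&lt;p&gt;The adaptive memory tiers in the list above can be pictured as a lookup from total RAM to a budget profile. The sketch below is illustrative only: the tier names come from the app, but the RAM cut-offs, the budget numbers, and the helper names &lt;code&gt;pick_tier&lt;/code&gt; and &lt;code&gt;budget_for&lt;/code&gt; are assumptions, not values from the repository. It is written in Python to stay short; the app itself is Dart.&lt;/p&gt;

```python
from bisect import bisect_right

# Tier names are the app's; every number below is an assumed placeholder.
TIER_NAMES = ["ultraLean", "lean", "balanced", "full"]
RAM_CUTOFFS_GB = [4, 6, 8]  # hypothetical boundaries between tiers

BUDGETS = {
    "ultraLean": {"max_tokens": 1024, "rag_chunks": 2, "max_questions": 5},
    "lean": {"max_tokens": 2048, "rag_chunks": 3, "max_questions": 8},
    "balanced": {"max_tokens": 3072, "rag_chunks": 4, "max_questions": 10},
    "full": {"max_tokens": 4096, "rag_chunks": 6, "max_questions": 10},
}


def pick_tier(total_ram_gb):
    """Map total device RAM (in GB) to a tier name."""
    # bisect_right counts how many cut-offs are at or below the RAM value.
    return TIER_NAMES[bisect_right(RAM_CUTOFFS_GB, total_ram_gb)]


def budget_for(total_ram_gb):
    """Return the tier name together with its budget settings."""
    tier = pick_tier(total_ram_gb)
    return tier, BUDGETS[tier]
```

&lt;p&gt;With these assumed cut-offs, a 6 GB phone lands in the &lt;code&gt;balanced&lt;/code&gt; tier and a 3 GB phone in &lt;code&gt;ultraLean&lt;/code&gt;. Keeping the budgets in one table means a new device class only needs a new row, not new branching logic.&lt;/p&gt;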

&lt;p&gt;Everything is built in Flutter + Dart, with native Gemma 4 inference via the &lt;code&gt;flutter_gemma&lt;/code&gt; plugin (LiteRT-LM under the hood), and persistence via ObjectBox. iOS, Android, macOS, Linux, and Windows are all supported from one codebase — important because campus device fleets aren't uniform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;Watch the walkthrough on YouTube: &lt;strong&gt;&lt;a href="https://youtu.be/dVYz8xq2L_8" rel="noopener noreferrer"&gt;https://youtu.be/dVYz8xq2L_8&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/dVYz8xq2L_8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;Full source: &lt;strong&gt;&lt;a href="https://github.com/trey-rosius/Local-Educational-App" rel="noopener noreferrer"&gt;https://github.com/trey-rosius/Local-Educational-App&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Key files to explore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/trey-rosius/Local-Educational-App/blob/master/lib/services/educational_tool_service.dart" rel="noopener noreferrer"&gt;&lt;code&gt;lib/services/educational_tool_service.dart&lt;/code&gt;&lt;/a&gt; — all 8 function-call schemas (quiz, flashcards, workshop, plus 5 interactive tutor tools).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/trey-rosius/Local-Educational-App/blob/master/lib/services/study_material_service.dart" rel="noopener noreferrer"&gt;&lt;code&gt;lib/services/study_material_service.dart&lt;/code&gt;&lt;/a&gt; — generation entry points + the JSON repair pipeline + semantic validators.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/trey-rosius/Local-Educational-App/blob/master/lib/services/rag_service.dart" rel="noopener noreferrer"&gt;&lt;code&gt;lib/services/rag_service.dart&lt;/code&gt;&lt;/a&gt; — PDF/image ingestion, batched embedding, HNSW search.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/trey-rosius/Local-Educational-App/blob/master/lib/services/background_generation_service.dart" rel="noopener noreferrer"&gt;&lt;code&gt;lib/services/background_generation_service.dart&lt;/code&gt;&lt;/a&gt; — the non-blocking task queue that branches between tool-call and text-mode paths.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/trey-rosius/Local-Educational-App/blob/master/architecture.excalidraw" rel="noopener noreferrer"&gt;&lt;code&gt;architecture.excalidraw&lt;/code&gt;&lt;/a&gt; — full visual architecture diagram (open at &lt;a href="https://excalidraw.com" rel="noopener noreferrer"&gt;https://excalidraw.com&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/trey-rosius/Local-Educational-App/blob/master/README.md" rel="noopener noreferrer"&gt;&lt;code&gt;README.md&lt;/code&gt;&lt;/a&gt; — complete feature + architecture + failure-mode documentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How I Used Gemma 4
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model chosen: Gemma 4 E2B (&lt;code&gt;.task&lt;/code&gt;).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;E2B was the only viable choice for a &lt;em&gt;truly&lt;/em&gt; on-device application of this scope:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Fits on phone?&lt;/th&gt;
&lt;th&gt;Why I didn't pick it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 E2B&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~4 GB quantized&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Chosen — see below&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E4B&lt;/td&gt;
&lt;td&gt;~8 GB&lt;/td&gt;
&lt;td&gt;Borderline&lt;/td&gt;
&lt;td&gt;Too large for mid-range Android devices; iOS memory pressure during inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 31B Dense&lt;/td&gt;
&lt;td&gt;~62 GB&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Cloud-scale only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;E2B hits the sweet spot for what EduCloud needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fits in mobile RAM and storage&lt;/strong&gt;: ~4 GB on disk, comfortable inference on a 6-8 GB device. Development and live testing were done on an &lt;strong&gt;iPhone 13 Pro Max&lt;/strong&gt; (6 GB RAM, A15 Bionic, 2021 hardware). Generation, vision OCR, and HNSW vector search all run smoothly there, which means the app targets phones that have been on the market for several years rather than only the latest flagship.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision modality&lt;/strong&gt;: ingest pages from photos via &lt;code&gt;Message.withImage(...)&lt;/code&gt; — critical for the "snap a textbook page" feature.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Function calling&lt;/strong&gt;: this is the linchpin of the architecture. With &lt;code&gt;ToolChoice.required&lt;/code&gt; + a &lt;code&gt;Tool&lt;/code&gt; schema, structured generation is &lt;em&gt;guaranteed valid&lt;/em&gt; whenever the runtime honors the tool call, so JSON parsing failures disappear from the main path. Before migrating to tool calling I had to maintain a ~700-line state-aware JSON repair pipeline to handle every quirk the model produced (asymmetric quotes, missing braces, Python-style single quotes, &lt;code&gt;\X&lt;/code&gt; escapes outside strings, &lt;code&gt;0.0&lt;/code&gt; as &lt;code&gt;answerIndex&lt;/code&gt; instead of &lt;code&gt;0&lt;/code&gt;, citation paste-throughs…). After the migration, that pipeline is a fallback that almost never fires.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;4096-token context&lt;/strong&gt;: large enough to inject 4-6 RAG chunks plus the tool schema plus a 10-question quiz response — but small enough to keep latency reasonable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multilingual quality out of the box&lt;/strong&gt;: this matters more than any feature on this list for my actual user base. Cameroon has 8 French-speaking and 2 English-speaking regions. Gemma 4 handles French content (and reasoning &lt;em&gt;in&lt;/em&gt; French) at quality that's roughly on par with English, so the same app, with no extra translation layer or per-language fork, serves a francophone student in Yaoundé as well as an anglophone student in Buea. Cloud APIs charge per token; switching languages on-device is free.&lt;/li&gt;
&lt;/ul&gt;
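&lt;p&gt;The 4096-token budget in the list above is a small packing problem: the tool schema and the expected response are reserved first, and retrieved chunks fill whatever is left. A rough sketch, assuming a crude 4-characters-per-token estimate and hypothetical reserve sizes (the app's real tokenizer and numbers will differ):&lt;/p&gt;

```python
from bisect import bisect_right
from itertools import accumulate

CONTEXT_TOKENS = 4096  # context size stated in the post


def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def pack_chunks(chunks, schema_tokens, response_tokens, max_chunks=6):
    """Keep the best-ranked chunks, in order, while they fit the budget.

    `chunks` is assumed to be sorted by retrieval score, best first.
    """
    budget = CONTEXT_TOKENS - schema_tokens - response_tokens
    candidates = chunks[:max_chunks]
    # Running token total after each candidate chunk.
    running = list(accumulate(estimate_tokens(c) for c in candidates))
    # Number of leading chunks whose running total stays within the budget.
    keep = bisect_right(running, budget)
    return candidates[:keep]
```

&lt;p&gt;With a 400-token schema and 1200 tokens reserved for the quiz, six chunks of about 400 tokens each fit, while chunks of about 1000 tokens each are cut to two. That is why chunk size and question count have to be tuned together.&lt;/p&gt;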

&lt;p&gt;How Gemma 4 is wired into the app, end to end:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Embedding (Gecko 512)&lt;/strong&gt;: 512-dim sentence embeddings via &lt;code&gt;flutter_gemma&lt;/code&gt;'s &lt;code&gt;Embedder.generateEmbeddings(List&amp;lt;String&amp;gt;)&lt;/code&gt; batch API. Batching makes ingestion roughly 10× faster than embedding one chunk per call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG retrieval&lt;/strong&gt;: HNSW nearest-neighbor search over &lt;code&gt;Float32List(512)&lt;/code&gt; embeddings in ObjectBox.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation with tool calling&lt;/strong&gt;: &lt;code&gt;model.createChat(supportsFunctionCalls: true, tools: [...], toolChoice: ToolChoice.required)&lt;/code&gt; returns a &lt;code&gt;FunctionCallResponse&lt;/code&gt; with already-parsed &lt;code&gt;Map&amp;lt;String, dynamic&amp;gt;&lt;/code&gt; args. The runtime constrains generation at the token level.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming text generation&lt;/strong&gt;: lesson bodies use &lt;code&gt;chat.generateChatResponseAsync()&lt;/code&gt; so the user sees Markdown materialize in real time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision OCR&lt;/strong&gt;: image ingestion uses &lt;code&gt;Message.withImage&lt;/code&gt; with the same Gemma 4 model — no separate OCR engine needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource lifecycle&lt;/strong&gt;: &lt;code&gt;chat.close()&lt;/code&gt; after each generation releases the KV cache. The model itself is kept as a singleton and reused across tasks, because closing it triggers a native double-free at the LiteRT layer.&lt;/li&gt;
&lt;/ol&gt;
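&lt;p&gt;The embedding and retrieval steps above boil down to two ideas: send chunks to the embedder in batches, then rank stored vectors by similarity to the query. The sketch below shows both in plain Python. It uses an exact brute-force cosine scan rather than HNSW, which returns an approximation of the same ranking without visiting every vector, and the helper names are hypothetical.&lt;/p&gt;

```python
import math


def batched(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=4):
    """Exact nearest neighbours by cosine similarity, best first.

    `index` is a list of (chunk_id, vector) pairs.
    """
    scored = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk_id for chunk_id, _ in scored[:k]]
```

&lt;p&gt;The brute-force scan costs one similarity computation per stored chunk, which is fine for a sketch but grows linearly with the library. An HNSW index trades a little recall for a search that touches only a small part of the graph.&lt;/p&gt;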

&lt;p&gt;The most valuable Gemma 4 feature for a project like this was &lt;strong&gt;tool calling&lt;/strong&gt;. Pre-Gemma-4 on-device LLMs would emit free-form JSON that I'd have to regex-and-state-machine my way through. With function calling, structured output is &lt;em&gt;guaranteed&lt;/em&gt; — which is what made the offline study-material generation feel as reliable as a cloud product.&lt;/p&gt;
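&lt;p&gt;Well-formed arguments still need a semantic check before they reach the UI: an index can be in range or not, a question can be empty. A minimal sketch of such a validator follows. Only &lt;code&gt;answerIndex&lt;/code&gt; is a field name from this post; &lt;code&gt;question&lt;/code&gt;, &lt;code&gt;options&lt;/code&gt;, the 2-to-6 option limit, and the function name are assumptions made for illustration.&lt;/p&gt;

```python
def normalize_question(args):
    """Validate one parsed quiz question and return a cleaned copy.

    Raises ValueError when the arguments cannot be used.
    """
    question = str(args.get("question", "")).strip()
    raw_options = args.get("options")
    raw_index = args.get("answerIndex")

    if not question:
        raise ValueError("question text is empty")
    if not isinstance(raw_options, list):
        raise ValueError("options must be a list")
    options = [str(option).strip() for option in raw_options]
    if len(options) not in range(2, 7):
        raise ValueError("expected between 2 and 6 options")
    # Accept 0.0 as 0, but reject 1.5, booleans, strings, and None.
    if isinstance(raw_index, bool) or not isinstance(raw_index, (int, float)):
        raise ValueError("answerIndex must be a number")
    if not float(raw_index).is_integer():
        raise ValueError("answerIndex must be a whole number")
    index = int(raw_index)
    if index not in range(len(options)):
        raise ValueError("answerIndex is out of range")
    return {"question": question, "options": options, "answerIndex": index}
```

&lt;p&gt;Running the same check on both the tool-call path and the repaired-text fallback keeps the two paths interchangeable from the UI's point of view.&lt;/p&gt;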

&lt;p&gt;Sampling tuned for this app: &lt;code&gt;temperature: 0.3, topK: 40, topP: 0.95&lt;/code&gt; for tool calling. This follows Gemma's published guidance for structured outputs; greedy decoding (&lt;code&gt;temp~0, topK=1&lt;/code&gt;) is prone to short repetition loops on long structured responses.&lt;/p&gt;
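&lt;p&gt;To make those three knobs concrete, here is a small sketch of what they do to a next-token distribution. The order of operations (top-k, then temperature softmax, then top-p) is an assumption; runtimes differ in the details, and this is not the LiteRT-LM implementation.&lt;/p&gt;

```python
import math
from bisect import bisect_left
from itertools import accumulate


def filtered_distribution(logits, temperature=0.3, top_k=40, top_p=0.95):
    """Apply top-k, temperature, then top-p; return renormalized probabilities.

    `logits` is a non-empty dict mapping token to raw score.
    """
    # Keep the top_k highest-scoring tokens, best first.
    ranked = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Temperature-scaled softmax, shifted by the best score for stability.
    peak = ranked[0][1]
    weights = [math.exp((score - peak) / temperature) for _, score in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Smallest prefix whose cumulative probability reaches top_p.
    cumulative = list(accumulate(probs))
    keep = min(len(probs), bisect_left(cumulative, top_p) + 1)
    kept_total = sum(probs[:keep])
    return {token: p / kept_total for (token, _), p in zip(ranked, probs[:keep])}
```

&lt;p&gt;A low temperature sharpens the distribution, so with scores of 2.0, 1.0, and 0.0 the defaults leave only the top token. At temperature 1.0 the second token keeps roughly a quarter of the probability mass, which shows how much of the structure comes from the temperature setting alone.&lt;/p&gt;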




&lt;h2&gt;
  
  
  What this unlocks for the students I've actually met
&lt;/h2&gt;

&lt;p&gt;The next time I walk onto a Cameroonian campus to talk about cloud and AI careers, I won't have to caveat the AI portion of the talk with "…of course you'll need stable internet and an English-fluent prompt." I can hand a student a Gemma-4-powered app, watch them point it at their own French-language course PDF, and watch them get a quiz in French, generated on-device, with no data plan involved.&lt;/p&gt;

&lt;p&gt;That's the bar for AI tools that actually work for the next billion learners — and that's the bar Gemma 4 cleared.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Flutter, Dart, &lt;code&gt;flutter_gemma&lt;/code&gt;, ObjectBox, LiteRT-LM, and a healthy disregard for cloud dependencies.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow me on &lt;a href="https://www.linkedin.com/in/rosius/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or check out &lt;a href="https://educloud.academy" rel="noopener noreferrer"&gt;EduCloud Academy&lt;/a&gt; for what comes next.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
  </channel>
</rss>
