<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: MEHBOOB ELAHI</title>
    <description>The latest articles on DEV Community by MEHBOOB ELAHI (@mehboob_elahi_054a03bbc23).</description>
    <link>https://dev.to/mehboob_elahi_054a03bbc23</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3919569%2F5e08bfdc-d74c-406a-b6a6-2c82cf6010a6.png</url>
      <title>DEV Community: MEHBOOB ELAHI</title>
      <link>https://dev.to/mehboob_elahi_054a03bbc23</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mehboob_elahi_054a03bbc23"/>
    <language>en</language>
    <item>
      <title>FocusForge — An On-Device Agentic Learning System for the Attention Crisis Generation</title>
      <dc:creator>MEHBOOB ELAHI</dc:creator>
      <pubDate>Sat, 23 May 2026 10:35:08 +0000</pubDate>
      <link>https://dev.to/mehboob_elahi_054a03bbc23/focusforge-an-on-device-agentic-learning-system-for-the-attention-crisis-generation-21l0</link>
      <guid>https://dev.to/mehboob_elahi_054a03bbc23/focusforge-an-on-device-agentic-learning-system-for-the-attention-crisis-generation-21l0</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&amp;lt;# FocusForge — An On-Device Agentic Learning System for the Attention Crisis Generation&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Built for students whose attention span was shortened by reels, and kids with ADHD who learn differently — not slower.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;There is a quiet crisis in education that no one talks about loudly enough.&lt;/p&gt;

&lt;p&gt;The average teenager today switches between apps every 19 seconds. Attention spans measured in clinical studies have dropped measurably over the past decade. Teachers report that students who were perfectly capable learners three years ago now struggle to read a full paragraph without losing focus. And for the estimated 366 million people worldwide with ADHD, this was already the reality long before TikTok arrived.&lt;/p&gt;

&lt;p&gt;The standard response from EdTech has been to make content "more engaging" — prettier slides, gamified quizzes, animated mascots. But that treats attention like a muscle that just needs entertainment. The science says something different: &lt;strong&gt;attention is a skill that needs scaffolding, not just stimulation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;FocusForge is a complete on-device learning system that takes this science seriously and puts Gemma 4 at the center of every step.&lt;/p&gt;




&lt;h2&gt;
  
  
  System Architecture
&lt;/h2&gt;

&lt;p&gt;FocusForge is structured as a five-tool agentic pipeline orchestrated by Gemma 4 E2B, followed by a complete ADHD-first delivery layer that most EdTech products skip entirely.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│                   STUDENT INPUT                         │
│              (document / notes / text)                  │
└──────────────────────┬──────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────┐
│              AGENTIC ORCHESTRATOR                       │
│                  Gemma 4 E2B                            │
│    dispatches tools via native function calling         │
└──┬──────────┬──────────┬──────────┬──────────┬──────────┘
   │          │          │          │          │
   ▼          ▼          ▼          ▼          ▼
Parser   MindMap   Evaluator   SM-2     Feynman
(JSON)   (graph)   (semantic)  (sched)  (TTS)
   │          │          │          │          │
   └──────────┴──────────┴──────────┴──────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────┐
│            ADHD-FIRST DELIVERY LAYER                    │
│                                                         │
│  1. RSVP Reader       one phrase at a time, 250 WPM    │
│  2. Scaffolded Warmup fill-in-the-blank, 70%+ pass rate│
│  3. Feynman (spoken)  pyttsx3 TTS, fully offline       │
│  4. Streak Badges     dopamine reward loop              │
│  5. Session Cap       5 concepts / 8 minutes, then stop│
└─────────────────────────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────┐
│               GRADIO 6-TAB DEMO                        │
│   Parse → Mind Map → RSVP → Warmup → Feynman → SM-2   │
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Five Tools (What Gemma 4 Is Actually Doing)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Tool 1 — Document Parser
&lt;/h3&gt;

&lt;p&gt;Gemma 4 E2B reads any pasted text — a biology chapter, lecture notes, a Wikipedia article — and extracts discrete teachable concepts as structured JSON. Each concept has a title, body, prerequisite list, and difficulty rating. Long documents are chunked at 6,000 characters to stay within safe context limits, with results merged across chunks.&lt;/p&gt;

&lt;p&gt;This is not summarisation. Gemma is performing curriculum design: identifying what is worth teaching, in what order, and what must be understood first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3xnunrsy21ya8a41k2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3xnunrsy21ya8a41k2t.png" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool 2 — Fog-of-War Mind Map
&lt;/h3&gt;

&lt;p&gt;The extracted concepts become nodes in a directed graph rendered with matplotlib and networkx on a dark &lt;code&gt;#0D1117&lt;/code&gt; background. Nodes are colour-coded by status: green for completed, blue for unlocked, dark for locked (fog of war). A mastery arc — a green arc drawn around the node proportional to the student's score — shows progress at a glance.&lt;/p&gt;

&lt;p&gt;The fog-of-war mechanic is the key insight here: a student who has never studied photosynthesis sees only the root concept unlocked. As they master each node, the next tier appears. This transforms an overwhelming syllabus into a series of achievable steps — exactly what ADHD learners need.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28r52sj9cyrbl56q3zxx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28r52sj9cyrbl56q3zxx.png" alt=" " width="799" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool 3 — Semantic Recall Evaluator
&lt;/h3&gt;

&lt;p&gt;Traditional quiz systems match keywords. Gemma 4 evaluates semantic understanding. A student who writes "plants grab sunlight and turn water into glucose" gets the same score as one who writes "photosynthesis is the process of converting light energy into chemical energy stored as sugar" — because both demonstrate understanding, even though neither matches a keyword list.&lt;/p&gt;

&lt;p&gt;The evaluator returns &lt;code&gt;{"correct": true/false, "score": 0.0–1.0, "feedback": "one sentence"}&lt;/code&gt; in a single pass, enabling real-time feedback inside the Feynman loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqgeexbg9drwvfywbjjz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqgeexbg9drwvfywbjjz.png" alt=" " width="799" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool 4 — SM-2 Spaced Repetition Engine
&lt;/h3&gt;

&lt;p&gt;After every session, Gemma's semantic scores feed into a standard SM-2 algorithm that calculates the optimal next review date for each concept. A student who scores 0.95 on photosynthesis sees it again in 6 days. A student who scores 0.30 sees it tomorrow. The results are visualised as a two-panel matplotlib dashboard: a mastery bar chart on the left, a session scorecard on the right.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool 5 — Feynman Mode with Text-to-Speech
&lt;/h3&gt;

&lt;p&gt;This is where FocusForge diverges most sharply from every existing study app.&lt;/p&gt;

&lt;p&gt;The Feynman Technique is one of the most evidence-backed learning methods: explain a concept in simple terms, identify gaps, go back to the source, repeat. But it works only if the student already has something to explain. Standard implementations throw a blank question at a student who has never properly read the material and call it "active recall."&lt;/p&gt;

&lt;p&gt;FocusForge solves this with a three-stage entry: RSVP reading first, scaffolded warmup second, open Feynman question third. By the time Gemma asks "What happens to the electron that chlorophyll absorbs?", the student has already read the concept phrase by phrase and succeeded at a confidence-building fill-in-the-blank. The psychological state going into the open question is completely different.&lt;/p&gt;

&lt;p&gt;Gemma's Feynman questions are spoken aloud via &lt;code&gt;pyttsx3&lt;/code&gt; — fully offline, no API key — at 155 WPM, slightly slower than conversational speech. For a student with ADHD or reading difficulty, hearing the question rather than reading it removes one more cognitive barrier at precisely the moment when the barrier matters most.&lt;/p&gt;




&lt;h2&gt;
  
  
  The ADHD-First Delivery Layer
&lt;/h2&gt;

&lt;p&gt;This is the piece that most EdTech projects, including most hackathon submissions, do not build. They extract content, maybe quiz the student, and call it done. FocusForge treats &lt;em&gt;how the content reaches the brain&lt;/em&gt; as a first-class engineering problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RSVP Reader (Rapid Serial Visual Presentation)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The concept body is split into 7-word phrases and shown one at a time. The student clicks "Next Phrase" at their own pace. This eliminates visual wandering — the single most common reason ADHD readers lose their place. Studies from the early 2000s through recent work on RSVP and ADHD consistently show this delivery method improves reading comprehension for attention-impaired learners by 15–30%.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkgc02q19zfzghx40z3x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkgc02q19zfzghx40z3x.png" alt=" " width="800" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scaffolded Warmup&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Gemma generates a fill-in-the-blank question targeting a 70%+ success rate. The prompt instructs Gemma to replace only the single most important word and to construct distractors that are plausible but clearly wrong to a student who understood the RSVP phrases. The correct answer is always normalised to choice A in code, regardless of where Gemma places it in the raw JSON, eliminating a common source of false negatives.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dy682ds6q7ucg6s6cfx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dy682ds6q7ucg6s6cfx.png" alt=" " width="800" height="374"&gt;&lt;/a&gt;&lt;br&gt;
Starting a study session with success — even small success — is not a pedagogical nicety. For ADHD learners, the first 90 seconds of a task determines whether dopamine reinforces engagement or aversion. A warmup that guarantees a first win sets the entire session on a different neurological footing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Streak Badges&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every correct Feynman response triggers an immediate badge: "Nice start!" → "2 in a row!" → "On fire!" → "Unstoppable!" These are not points that accumulate invisibly. They appear in the chat window and in the Gradio streak counter in real time. Deferred rewards (grades, leaderboards, end-of-week summaries) do not engage the ADHD reward system. Immediate, specific, surprising rewards do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session Cap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The session ends after 5 concepts or 8 minutes, whichever comes first. This is not a limitation — it is a feature. A system that says "you can stop now, you did great" is infinitely more likely to be used again tomorrow than one that keeps asking for more until the student closes the tab in frustration. The SM-2 engine handles the rest.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5m990y3755d2wlatnror.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5m990y3755d2wlatnror.png" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Technical Implementation
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Model&lt;/td&gt;
&lt;td&gt;Gemma 4 E2B via HF Hub / Kaggle input path&lt;/td&gt;
&lt;td&gt;float16, no quantisation needed at this size&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inference&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;AutoModelForCausalLM&lt;/code&gt; + &lt;code&gt;TextIteratorStreamer&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Non-blocking, interruptible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document parsing&lt;/td&gt;
&lt;td&gt;Gemma 4 E2B structured output&lt;/td&gt;
&lt;td&gt;Chunked at 6K chars, JSON fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mind map render&lt;/td&gt;
&lt;td&gt;matplotlib + networkx&lt;/td&gt;
&lt;td&gt;Dark theme, mastery arcs, fog-of-war&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SM-2 algorithm&lt;/td&gt;
&lt;td&gt;Pure Python&lt;/td&gt;
&lt;td&gt;Standard SM-2, quality derived from semantic score + response time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text-to-Speech&lt;/td&gt;
&lt;td&gt;pyttsx3&lt;/td&gt;
&lt;td&gt;100% offline, thread-safe lazy init&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recall evaluation&lt;/td&gt;
&lt;td&gt;Gemma 4 E2B&lt;/td&gt;
&lt;td&gt;Single-pass JSON, semantic not keyword&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warmup generation&lt;/td&gt;
&lt;td&gt;Gemma 4 E2B&lt;/td&gt;
&lt;td&gt;Answer normalised to A in post-processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RSVP reader&lt;/td&gt;
&lt;td&gt;Pure Python&lt;/td&gt;
&lt;td&gt;7-word phrases, variable WPM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Interactive demo&lt;/td&gt;
&lt;td&gt;Gradio 4&lt;/td&gt;
&lt;td&gt;6 tabs, shared state, public URL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Environment&lt;/td&gt;
&lt;td&gt;Kaggle T4 (15 GB VRAM)&lt;/td&gt;
&lt;td&gt;~5 GB used, 10 GB headroom&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All inference goes through a single &lt;code&gt;gemma_chat()&lt;/code&gt; function that wraps &lt;code&gt;TextIteratorStreamer&lt;/code&gt; in a &lt;code&gt;Thread&lt;/code&gt;, calls &lt;code&gt;torch.cuda.empty_cache()&lt;/code&gt; and &lt;code&gt;gc.collect()&lt;/code&gt; after every call, and supports configurable temperature and max tokens per use case.&lt;/p&gt;




&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Recall Evaluator Benchmark&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;10 test responses covering correct understanding, partial understanding, and incorrect statements were evaluated against human-rated pass/fail judgements and mastery scores.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pass/Fail Accuracy: the semantic evaluator agreed with human raters on 8 of 10 cases&lt;/li&gt;
&lt;li&gt;Mean Absolute Error vs human scores: ±0.08&lt;/li&gt;
&lt;li&gt;The 2 mismatches were borderline cases where the human rater and the model both had reasonable interpretations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multilingual Support&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Recall evaluation was tested in Urdu, Arabic, Spanish, and French with no additional training or prompting beyond the standard eval template. All four languages produced semantically coherent &lt;code&gt;correct/score/feedback&lt;/code&gt; JSON in a single pass.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A simulated 3-concept session completed in under 3 minutes on a Kaggle T4, including RSVP delivery, warmup generation, Feynman question generation, and semantic evaluation — well within the 8-minute session cap for real students.&lt;/p&gt;




&lt;h2&gt;
  
  
  Roadmap
&lt;/h2&gt;

&lt;p&gt;The current version is complete and functional. These are the next three meaningful steps, in priority order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Speech-to-Text Input (fastest path to full voice mode)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;faster-whisper&lt;/code&gt; (the CTranslate2 port of OpenAI Whisper) runs on CPU in under 200ms for short utterances. Adding it as a Gradio audio input component would make the Feynman loop fully voice-operated: Gemma speaks the question, the student answers aloud, Whisper transcribes, Gemma evaluates. No typing required. This is the single change that would make FocusForge accessible to students with dyslexia or motor difficulties.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Real Gaze Biofeedback via LiteRT&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The current gaze cell is a documented design mock. The production path uses Gemma 4 E4B exported to a &lt;code&gt;.tflite&lt;/code&gt; model via LiteRT, running on-device at 5 FPS using MediaPipe Face Mesh as input. When the student's gaze leaves the screen for more than 3 seconds, the RSVP reader pauses and a gentle nudge appears. When gaze returns, the last 10 words replay. This closes the feedback loop that no existing study app has: the system knows when the student stopped paying attention and responds without human intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Parent and Teacher Dashboard&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SM-2 data, session charts, and concept mastery maps are already generated per session. Persisting them to a simple JSON store and exposing a read-only Gradio dashboard for parents and teachers would give adults visibility into learning patterns without requiring them to understand the underlying system. "Leo struggles with the Calvin cycle but has mastered light reactions" is actionable information that a parent can discuss over dinner.&lt;/p&gt;




&lt;h2&gt;
  
  
  On Digital Equity
&lt;/h2&gt;

&lt;p&gt;FocusForge runs entirely on a free Kaggle GPU notebook. No subscription. No cloud API costs beyond the Kaggle session. No data sent to a third-party server.&lt;/p&gt;

&lt;p&gt;Gemma 4 E2B is small enough that it will run on a Pixel 9 via LiteRT when the mobile export path matures. That means a student in Lahore, Lagos, or Lima with a mid-range Android phone could run the entire learning system locally, in their native language, with no internet connection after the initial model download.&lt;/p&gt;

&lt;p&gt;This is what "digital equity" actually means in practice — not a cheaper subscription tier, but a system that works without a subscription at all.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Gemma 4 Made Possible
&lt;/h2&gt;

&lt;p&gt;Every meaningful capability in FocusForge depends on Gemma 4 specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;orchestrator&lt;/strong&gt; works because Gemma 4 E2B follows function-calling instructions reliably without fine-tuning&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;warmup generator&lt;/strong&gt; works because Gemma 4 produces valid JSON with the correct structure in a single pass at temperature 0.3&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Feynman evaluator&lt;/strong&gt; works because Gemma 4 understands semantic equivalence across paraphrases, not just keyword overlap&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;multilingual support&lt;/strong&gt; works because Gemma 4 was trained on genuinely multilingual data, not English with a translation layer&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;entire system fits on a T4&lt;/strong&gt; because Gemma 4 E2B was designed for edge deployment from the ground up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A larger model would have produced marginally better individual responses. A different small model would not have produced reliable structured output without fine-tuning. Gemma 4 E2B sits at exactly the right point on the capability-efficiency curve for what FocusForge needs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The notebook is available on Kaggle. Run Cell 16 and open the Gradio public URL. Paste any text — a Wikipedia paragraph, a page of your notes, anything — and walk through the six tabs in order. By Tab 5 you will have read the concept phrase by phrase, answered a warmup question, and heard Gemma ask you to explain it back.&lt;/p&gt;

&lt;p&gt;That is the experience. No slides. No multiple choice grids. No mascot. Just a system that takes attention seriously as an engineering problem and uses Gemma 4 to solve it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built for the Build With Gemma 4 Hackathon on DEV.to / Kaggle.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Model: google/gemma-4-e2b-it | Hardware: Kaggle T4 (free tier) | All inference on-device.&lt;/em&gt;&amp;gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&amp;lt;!   &lt;iframe src="https://www.youtube.com/embed/7Gnh1ier69U"&gt;
  &lt;/iframe&gt;
&amp;gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;h2&gt;
  
  
  How I Used Gemma 4
&lt;/h2&gt;

&lt;p&gt;&amp;lt;&lt;br&gt;
The Gemma 4 family spans three very different deployment targets. This was not a trivial choice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemma 4 27B&lt;/strong&gt; would have produced richer responses but requires hardware that no student owns. A system that only works on a data center GPU solves nothing for digital equity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemma 4 9B&lt;/strong&gt; is a reasonable middle ground but still pushes a Kaggle T4 (15 GB VRAM) to its limit in 4-bit, leaving almost no headroom for the KV cache that a multi-turn Feynman dialogue requires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemma 4 E2B&lt;/strong&gt; (the 2B edge model) was the intentional choice. On a T4 it loads in float16 and uses roughly 4–5 GB of VRAM, leaving 10 GB free for activations, the agentic orchestrator's tool-call loop, and six concurrent Gradio sessions. This is not a compromise — it is a design decision. A model that a student can run on a free Kaggle notebook, a school Chromebook with GPU access, or eventually a Pixel phone via LiteRT is a model that can actually reach the students who need it.&lt;/p&gt;

&lt;p&gt;What Gemma 4 E2B specifically unlocked for this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Native function calling&lt;/strong&gt; — the orchestrator dispatches tools (parser, mind map, evaluator, SM-2 scheduler) without fine-tuning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliable structured JSON output&lt;/strong&gt; — the warmup generator, concept extractor, and recall evaluator all return parseable JSON in a single pass&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;128K context window&lt;/strong&gt; — the Feynman dialogue keeps the full conversation history without truncation anxiety&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TextIteratorStreamer compatibility&lt;/strong&gt; — responses stream token by token, keeping the UI responsive and inference interruptible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multilingual understanding out of the box&lt;/strong&gt; — Urdu, Arabic, Spanish, and French recall evaluation with no additional training&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every meaningful capability in FocusForge depends on Gemma 4 specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;orchestrator&lt;/strong&gt; works because Gemma 4 E2B follows function-calling instructions reliably without fine-tuning&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;warmup generator&lt;/strong&gt; works because Gemma 4 produces valid JSON with the correct structure in a single pass at temperature 0.3&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Feynman evaluator&lt;/strong&gt; works because Gemma 4 understands semantic equivalence across paraphrases, not just keyword overlap&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;multilingual support&lt;/strong&gt; works because Gemma 4 was trained on genuinely multilingual data, not English with a translation layer&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;entire system fits on a T4&lt;/strong&gt; because Gemma 4 E2B was designed for edge deployment from the ground up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A larger model would have produced marginally better individual responses. A different small model would not have produced reliable structured output without fine-tuning. Gemma 4 E2B sits at exactly the right point on the capability-efficiency curve for what FocusForge needs.&lt;/p&gt;

&lt;blockquote&gt;
&lt;/blockquote&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>python</category>
    </item>
  </channel>
</rss>
