Temiloluwa Valentine

Posted on May 24

Every Time She Got Confused Online, She Called Me. I Got Tired of Answering. So I Built This.

#devchallenge #gemmachallenge #gemma

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

My cousin has a learning disability.

Not the kind people notice immediately. She holds a conversation fine. She laughs at the right moments. She is sharp in ways that matter.

But put her in front of a dense webpage, a medical article, a GitHub README, or a LinkedIn thread, and something shifts. The words blur. The structure overwhelms. She closes the tab and calls me.

For two years, I was her human filter for the internet.

The Problem Nobody Talks About

The internet assumes you can:

Read fast
Parse dense structure
Context-switch without losing the thread
Understand jargon on sight

A lot of people cannot. And nobody is building for them.

My cousin is not alone. People with dyslexia, ADHD, processing disorders, low digital literacy, and non-native English speakers all hit the same wall every day. They just hit it quietly.

I got tired of being the workaround. So I built Aura.

What Aura Is

Aura is a Chrome extension that puts Gemma 4 directly on every webpage.

No tab switching. No copy-pasting into ChatGPT. No context lost.

You click the floating orb. A panel slides in. You pick what you need:

Summarize Page: get the key points in seconds
Explain Code: understand what it does and why
Draft Reply: draft replies to LinkedIn messages that match the tone of the conversation
Create Post: turn any article into LinkedIn post ideas
Highlight & Ask: select any text on the page and ask Aura anything about it

The AI lives on the page with you. You never leave.
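
One way the panel actions above could map to prompts is sketched here. Only the five action names come from the list; the `ACTION_PROMPTS` table, the `buildPrompt` helper, and the prompt wording are illustrative assumptions, not Aura's actual source.

```javascript
// Hypothetical mapping from panel actions to prompt builders (not Aura's real code).
const ACTION_PROMPTS = {
  summarize: () => 'Summarize the key points of this page as short bullets.',
  explainCode: () => 'Explain what the code on this page does and why it is written that way.',
  draftReply: () => 'Draft a reply that matches the tone of this conversation.',
  createPost: () => 'Suggest LinkedIn post ideas based on this article.',
  // Highlight & Ask needs the selected text and the user's question.
  highlightAsk: ({ selection, question }) =>
    `Selected text:\n"""${selection.trim()}"""\n\nQuestion: ${question.trim()}`,
};

function buildPrompt(action, input = {}) {
  // Only accept actions defined in the table itself, not inherited object keys.
  if (!Object.prototype.hasOwnProperty.call(ACTION_PROMPTS, action)) {
    throw new Error(`Unknown action: ${action}`);
  }
  return ACTION_PROMPTS[action](input);
}

// In a content script, the selection would come from the page:
// const selection = window.getSelection().toString();
```

Keeping the prompt text in one table means a new panel action is a one-line addition rather than a new code path.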

The Demo

Why I Migrated from Llama to Gemma 4

I originally built Aura with Llama 3.1 8B via Cloudflare Workers AI.

It worked. Responses came back. Features ran.

But when I swapped to Gemma 4 31B, I felt the difference in the first response.

Llama told me what the code did. Gemma 4 told me why it was written that way.

Llama drafted a generic professional reply. Gemma 4 read the tone of the conversation and matched it.

For a general tool, that gap is a nice-to-have. For a tool built for people who struggle with comprehension, that gap is everything.

Why Gemma 4 31B Specifically

Gemma 4 comes in three variants. I did not pick 31B by default. I picked it deliberately.

Model	Why I did or did not pick it
2B / 4B	Too shallow for the reasoning depth Aura needs across wildly different content types
26B MoE	Great for edge inference but Aura needs consistent quality across all content types, not specialized routing
31B Dense	✅ Full parameter activation. Maximum reasoning quality. Consistent across every content type.

Here is why dense architecture matters for Aura specifically:

MoE models route tokens through specialized subnetworks: they activate only some parameters depending on the input. That is efficient. But Aura handles a GitHub README, a LinkedIn thread, a medical article, and a Stack Overflow answer, sometimes in the same session.

Dense models activate all parameters for every token. Gemma 4 31B does not guess which expert to wake up. It brings everything it knows to every single interaction.

For a tool where the content changes every tab and the user cannot afford an inconsistent experience, that consistency is not optional.
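
To make the dense versus MoE distinction concrete, here is a toy sketch of top-k routing. It is only an illustration under made-up numbers: the gate scores, the expert count, and the `activeExperts` helper are assumptions, and this is not how Gemma 4 is actually implemented.

```javascript
// Toy illustration of dense vs. MoE activation (not Gemma 4's actual routing).
// gateScores[i] says how relevant expert i is judged to be for the current token.
function activeExperts(gateScores, topK) {
  return gateScores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, topK)                    // keep only the top K experts
    .map((entry) => entry.index)
    .sort((a, b) => a - b);            // report indices in ascending order
}

const gateScores = [0.05, 0.6, 0.1, 0.25]; // made-up scores for one token

// MoE-style: only the two best-scoring experts run for this token.
const moeActive = activeExperts(gateScores, 2); // [1, 3]

// Dense-style, in this toy picture: every "expert" runs for every token.
const denseActive = activeExperts(gateScores, gateScores.length); // [0, 1, 2, 3]
```

In the MoE case, which experts run depends on the gate scores, so different kinds of content exercise different parts of the model. In the dense case there is no selection step at all.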

The Technical Implementation

Aura is plain HTML, CSS, and JavaScript. No framework. No backend. No server.

The Gemma 4 API call lives directly in the content script:

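const GEMMA_API_URL = 'https://generativelanguage.googleapis.com/v1beta/models/gemma-4-31b-it:generateContent';

// GEMMA_API_KEY, currentPageContent, and conversationHistory are defined elsewhere in the content script.
async function callNova(prompt) {
  // Simulated system prompt: a user turn carrying the instructions and page context.
  const systemTurn = {
    role: 'user',
    parts: [{ text: `You are Aura, a helpful AI assistant in a browser extension.
    The user is on: ${window.location.href}.
    Page context:\n\n${currentPageContent}\n\n
    Respond concisely and directly. Never introduce yourself.
    Never mention you are an AI. Output only the final answer.` }]
  };

  // Model acknowledgment that closes the simulated system exchange.
  const systemAck = {
    role: 'model',
    parts: [{ text: 'Understood.' }]
  };

  // The new user message for this request.
  const userTurn = {
    role: 'user',
    parts: [{ text: prompt }]
  };

  const response = await fetch(`${GEMMA_API_URL}?key=${GEMMA_API_KEY}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      // Use conversationHistory here: a bare `history` would resolve to window.history in a browser.
      contents: [systemTurn, systemAck, ...conversationHistory, userTurn],
      generationConfig: {
        temperature: 0.7,
        // thinkingConfig is nested under generationConfig in the Gemini API request schema.
        thinkingConfig: { thinkingBudget: 0 }
      }
    }),
  });

  // Surface HTTP errors instead of silently returning 'No response.'
  if (!response.ok) {
    throw new Error(`Gemma API request failed with status ${response.status}`);
  }

  const data = await response.json();
  return data.candidates?.[0]?.content?.parts?.[0]?.text || 'No response.';
}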

A few things worth noting:

The system turn pattern — Gemma 4's API does not have a native system role. I simulate it by injecting a user turn with the system context, followed by a model acknowledgment. This grounds the model before the actual conversation starts.

thinkingBudget: 0 — Gemma 4 31B is a reasoning model. Left unconstrained, it outputs its full reasoning trace — tasks, constraints, drafts, self-checks — before the final answer. Setting thinkingBudget: 0 suppresses that and returns only the final output to the user.

Page content extraction — Aura reads the page using a priority selector chain before falling back to document.body.innerText, capped at 4000 characters to stay within context limits.
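
A minimal sketch of that extraction step follows. The 4000-character cap and the `document.body.innerText` fallback come from the description above; the specific selectors in `CONTENT_SELECTORS` and the `extractPageContent` name are assumptions, since the post does not list Aura's actual chain.

```javascript
// Hypothetical selector priority list; Aura's real chain may differ.
const CONTENT_SELECTORS = ['article', 'main', '[role="main"]', '#content'];
const MAX_CHARS = 4000; // cap stated in the post

function extractPageContent(doc) {
  // Try each preferred container in order and use the first one with text.
  for (const selector of CONTENT_SELECTORS) {
    const el = doc.querySelector(selector);
    const text = el?.innerText?.trim();
    if (text) return text.slice(0, MAX_CHARS);
  }
  // Fall back to the whole body when no preferred container has text.
  return (doc.body?.innerText ?? '').trim().slice(0, MAX_CHARS);
}

// In the content script: const currentPageContent = extractPageContent(document);
```

Passing the document in as a parameter keeps the function easy to exercise with a stub object outside the browser.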

Conversation history — Follow-up chat is supported. Every user and model turn is stored in conversationHistory and injected into the next request, giving Aura memory within a session.
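
The session memory described above might look roughly like this. The `conversationHistory` name and the user/model turn shape come from the post; the `rememberExchange` helper and the `MAX_TURNS` cap are assumptions added for illustration.

```javascript
// Session memory as Gemini-style turns: { role: 'user' | 'model', parts: [{ text }] }.
const conversationHistory = [];
const MAX_TURNS = 20; // hypothetical cap so long sessions do not grow without bound

function rememberExchange(userText, modelText) {
  conversationHistory.push(
    { role: 'user', parts: [{ text: userText }] },
    { role: 'model', parts: [{ text: modelText }] }
  );
  // Drop the oldest user/model pair once the cap is exceeded.
  while (conversationHistory.length > MAX_TURNS) {
    conversationHistory.splice(0, 2);
  }
}
```

Trimming in pairs keeps the history alternating between user and model turns, which is the order the request body expects.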

What Changed When I Migrated

Feature	Llama 3.1 8B	Gemma 4 31B
Page summarization	✅ Decent bullets	✅ Structured, context-aware
Code explanation	✅ Describes what code does	✅ Explains why it was written that way
Reply drafting	⚠️ Generic professional tone	✅ Matches the tone of the actual conversation
LinkedIn post creation	⚠️ Template-like output	✅ Distinct voice per post
Highlight & Ask	✅ Works	✅ Deeper reasoning on complex selections

The migration took under 10 minutes. One endpoint. One model string. The quality difference was not subtle.

Who This Is Actually For

People with dyslexia who need content restructured instantly
People with ADHD who lose the thread switching tabs
Non-native English speakers navigating professional content
Elderly users overwhelmed by dense web pages
Anyone the internet was not designed for

My cousin does not need a faster browser. She needs the information to come to her in a form she can hold.

Aura does that. Gemma 4 31B makes it good enough to actually help.

What Is Next

The next version adds multimodal support: sending a page screenshot alongside the text so Gemma 4 can reason about charts, diagrams, and images, not just words.

My cousin once sent me a screenshot of a medical form she could not understand. I read it to her over the phone.

Aura will eventually do that too.
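
As a rough sketch of the screenshot idea, a multimodal user turn could pair the text prompt with an inline image part. The `inlineData` part with `mimeType` and `data` follows the Gemini API's inline image format; the `buildScreenshotTurn` helper is hypothetical, and whether the Gemma 4 endpoint accepts it in exactly this shape is an assumption to verify.

```javascript
// Sketch of a multimodal user turn: text plus a base64-encoded PNG screenshot.
// buildScreenshotTurn is a hypothetical helper, not part of Aura today.
function buildScreenshotTurn(promptText, screenshotDataUrl) {
  const prefix = 'data:image/png;base64,';
  if (!screenshotDataUrl.startsWith(prefix)) {
    throw new Error('Expected a PNG data URL');
  }
  return {
    role: 'user',
    parts: [
      { text: promptText },
      // The API expects raw base64, so strip the data URL prefix.
      { inlineData: { mimeType: 'image/png', data: screenshotDataUrl.slice(prefix.length) } },
    ],
  };
}
```

In an extension, the data URL could come from `chrome.tabs.captureVisibleTab(null, { format: 'png' })`, which has to be called from the background service worker rather than the content script.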

Links

🔗 GitHub: https://github.com/Valentinetemi/Aura

Aura was originally built for the Airia AI Agents Hackathon. The Gemma 4 migration was done for this challenge and, honestly, it should have been Gemma from the start.
