<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Manglesh </title>
    <description>The latest articles on DEV Community by Manglesh  (@a1_funmotivationcreati).</description>
    <link>https://dev.to/a1_funmotivationcreati</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3931778%2Fd809040d-0dd8-47d9-9ad9-e6f2d70a2c4c.png</url>
      <title>DEV Community: Manglesh </title>
      <link>https://dev.to/a1_funmotivationcreati</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/a1_funmotivationcreati"/>
    <language>en</language>
    <item>
      <title>I Spent 3 Days Fighting Gemma 4's API So You Don't Have To: The Honest Developer Guide</title>
      <dc:creator>Manglesh </dc:creator>
      <pubDate>Sun, 24 May 2026 18:17:55 +0000</pubDate>
      <link>https://dev.to/a1_funmotivationcreati/i-spent-3-days-fighting-gemma-4s-api-so-you-dont-have-to-the-honest-developer-guide-27k4</link>
      <guid>https://dev.to/a1_funmotivationcreati/i-spent-3-days-fighting-gemma-4s-api-so-you-dont-have-to-the-honest-developer-guide-27k4</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: &lt;br&gt;
Write About Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The Honest Truth Nobody Tells You About Building With Gemma 4
&lt;/h2&gt;

&lt;p&gt;I just spent 3 days building a full-stack app &lt;br&gt;
(Bondmap — a relationship network mapper) with &lt;br&gt;
Gemma 4 as its AI brain.&lt;/p&gt;

&lt;p&gt;I hit every wall possible.&lt;/p&gt;

&lt;p&gt;Wrong model names. Thinking mode leaking 500 words &lt;br&gt;
of internal reasoning into my UI. The &lt;code&gt;systemInstruction&lt;/code&gt; &lt;br&gt;
field being ignored. 404s, 400s, and a lot of confusion.&lt;/p&gt;

&lt;p&gt;This post is everything I wish I'd known on Day 1.&lt;/p&gt;


&lt;h2&gt;
  
  
  First — Which Gemma 4 Model Do You Actually Need?
&lt;/h2&gt;

&lt;p&gt;The official docs list these variants:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model ID&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gemma-4-e2b-it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;2B&lt;/td&gt;
&lt;td&gt;Edge, mobile, Raspberry Pi&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gemma-4-e4b-it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4B&lt;/td&gt;
&lt;td&gt;Browser, lightweight apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gemma-4-31b-it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;31B Dense&lt;/td&gt;
&lt;td&gt;Server, complex reasoning ✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;26B MoE&lt;/td&gt;
&lt;td&gt;High throughput + thinking mode&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The mistake I made:&lt;/strong&gt; I used &lt;code&gt;gemma-4-9b-it&lt;/code&gt; — a model &lt;br&gt;
that doesn't exist. The API returned a 404 and I spent &lt;br&gt;
an hour debugging the wrong thing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 1: Copy model names from the official docs exactly.&lt;/strong&gt;&lt;br&gt;
There is no 9B. There is no 7B. The four above are it.&lt;/p&gt;
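
&lt;p&gt;A minimal sketch of Rule 1 in JavaScript: check the model ID against the four names from the table before making any network call, so a typo fails loudly in your own code instead of as a 404. &lt;code&gt;GEMMA_4_MODELS&lt;/code&gt; and &lt;code&gt;assertKnownModel&lt;/code&gt; are hypothetical names for illustration, not part of any SDK.&lt;/p&gt;

```javascript
// Model IDs copied from the table above; anything else is rejected early.
const GEMMA_4_MODELS = new Set([
  "gemma-4-e2b-it",
  "gemma-4-e4b-it",
  "gemma-4-31b-it",
  "gemma-4-26b-a4b-it",
]);

// Hypothetical helper: throws before any request is sent.
function assertKnownModel(modelId) {
  if (!GEMMA_4_MODELS.has(modelId)) {
    throw new Error(`Unknown Gemma 4 model ID: ${modelId}`);
  }
  return modelId;
}
```

&lt;p&gt;If the list of models changes, the set is the only thing that needs updating.&lt;/p&gt;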


&lt;h2&gt;
  
  
  Setting Up The API (The Right Way)
&lt;/h2&gt;

&lt;p&gt;Get your free API key at &lt;strong&gt;aistudio.google.com&lt;/strong&gt;. &lt;br&gt;
No credit card. No setup. Just a Google account.&lt;/p&gt;

&lt;p&gt;The base endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;https://generativelanguage.googleapis.com/v1beta/models/{MODEL_ID}:generateContent?key={YOUR_KEY}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Minimal working request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`https://generativelanguage.googleapis.com/v1beta/models/gemma-4-31b-it:generateContent?key=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello Gemma 4!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt;
      &lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's literally all you need to get started.&lt;/p&gt;
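
&lt;p&gt;Once the first call works, it may help to split the request into two small pure functions so that a 404 or 400 shows up as a readable error instead of a crash on &lt;code&gt;data.candidates[0]&lt;/code&gt;. This is a sketch, not an official client: &lt;code&gt;buildRequest&lt;/code&gt; and &lt;code&gt;extractText&lt;/code&gt; are hypothetical helper names, and the &lt;code&gt;error.code&lt;/code&gt; / &lt;code&gt;error.message&lt;/code&gt; fields are an assumption about the shape of the error body.&lt;/p&gt;

```javascript
const BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models";

// Hypothetical helper: builds the URL and fetch options for a single-turn prompt.
function buildRequest(modelId, apiKey, userText) {
  return {
    url: `${BASE_URL}/${modelId}:generateContent?key=${encodeURIComponent(apiKey)}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        contents: [{ role: "user", parts: [{ text: userText }] }],
      }),
    },
  };
}

// Hypothetical helper: returns the first text part, or throws a descriptive error.
// Assumption: failed calls return a JSON body shaped like { error: { code, message } }.
function extractText(data) {
  if (data.error) {
    throw new Error(`API error ${data.error.code}: ${data.error.message}`);
  }
  const parts = data.candidates?.[0]?.content?.parts ?? [];
  const first = parts.find((part) => typeof part.text === "string");
  if (!first) throw new Error("Response contained no text part");
  return first.text;
}

// Usage (inside an async function):
//   const { url, options } = buildRequest("gemma-4-31b-it", API_KEY, "Hello Gemma 4!");
//   const data = await (await fetch(url, options)).json();
//   console.log(extractText(data));
```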




&lt;h2&gt;
  
  
  The Biggest Gotcha: Thinking Mode
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt; model (the MoE variant) has &lt;br&gt;
&lt;strong&gt;thinking mode enabled by default.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This means instead of responding with:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Rahul is your 1st-degree connection — he's your &lt;br&gt;
brother!"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It responds with hundreds of words of internal reasoning like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"* Persona: Bondmap AI. * Core Task: Explain connection.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Draft 1: Rahul is... * Tone Check: Warm? Yes.&lt;/li&gt;
&lt;li&gt;Word Count: 68. Perfect. * Final answer: Rahul is..."&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;All of that showed up in my app's UI. Users would have &lt;br&gt;
seen the model's entire thought process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two ways to fix this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix 1 — Use &lt;code&gt;gemma-4-31b-it&lt;/code&gt; instead&lt;/strong&gt;&lt;br&gt;
This model doesn't have thinking mode. Clean output &lt;br&gt;
every time. This is what I ultimately switched to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix 2 — Disable thinking budget (only for 26b-a4b)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In your generation config&lt;/span&gt;
&lt;span class="n"&gt;thinkingConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"thinkingBudget"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: This only works on &lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt;. &lt;br&gt;
Using it on &lt;code&gt;gemma-4-31b-it&lt;/code&gt; throws a 400 error.&lt;/p&gt;
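
&lt;p&gt;Because the setting is accepted by one model and rejected by the other, one option is to add it conditionally when building the request body. The sketch below does that for the &lt;code&gt;fetch&lt;/code&gt; style used earlier. &lt;code&gt;buildBody&lt;/code&gt; is a hypothetical helper, and placing &lt;code&gt;thinkingConfig&lt;/code&gt; inside &lt;code&gt;generationConfig&lt;/code&gt; is an assumption; check the current API reference before relying on it.&lt;/p&gt;

```javascript
// Per the note above, only this model accepts a thinking budget.
const THINKING_MODEL = "gemma-4-26b-a4b-it";

// Hypothetical helper: sets thinkingBudget to 0 only for the MoE model.
// Assumption: thinkingConfig is nested inside generationConfig.
function buildBody(modelId, userText) {
  const body = {
    contents: [{ role: "user", parts: [{ text: userText }] }],
  };
  if (modelId === THINKING_MODEL) {
    body.generationConfig = { thinkingConfig: { thinkingBudget: 0 } };
  }
  return body;
}
```

&lt;p&gt;Keeping the check in one place means switching models later does not reintroduce the 400 error.&lt;/p&gt;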


&lt;h2&gt;
  
  
  systemInstruction: Separate Field, Not Combined Text
&lt;/h2&gt;

&lt;p&gt;This one took me a while. I was combining my system &lt;br&gt;
prompt and user query into one message like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ WRONG — model treats instructions as conversation&lt;/span&gt;
&lt;span class="n"&gt;userPart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"\n\n"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;userQuery&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model would then &lt;strong&gt;analyze&lt;/strong&gt; the instructions &lt;br&gt;
instead of following them. It would respond to &lt;br&gt;
"Keep responses under 100 words" as if it were a &lt;br&gt;
question to answer.&lt;/p&gt;

&lt;p&gt;The fix is to use &lt;code&gt;systemInstruction&lt;/code&gt; as a &lt;br&gt;
&lt;strong&gt;completely separate field&lt;/strong&gt; in the request body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ CORRECT — model treats this as binding rules&lt;/span&gt;
&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;systemInstruction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;systemPart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class="n"&gt;systemPart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Your rules here..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;systemInstruction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"parts"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;systemPart&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"systemInstruction"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;systemInstruction&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// User message is ONLY the question&lt;/span&gt;
&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;userContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class="n"&gt;userContent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"role"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"user"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
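&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;userPart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
&lt;span class="n"&gt;userPart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userQuery&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Just the question&lt;/span&gt;
&lt;span class="n"&gt;userContent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"parts"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userPart&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"contents"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userContent&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;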
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When structured this way, Gemma 4 follows the system &lt;br&gt;
instructions reliably and only responds to the &lt;br&gt;
actual user question.&lt;/p&gt;
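
&lt;p&gt;For anyone calling the API from JavaScript rather than Java, the same separation can be sketched as a plain object. &lt;code&gt;buildBodyWithSystem&lt;/code&gt; is a hypothetical helper name; the field names mirror the Java maps in this section.&lt;/p&gt;

```javascript
// Hypothetical helper: keeps the rules and the question in separate fields.
function buildBodyWithSystem(systemPrompt, userQuery) {
  return {
    // Rules go here, never concatenated into the user message.
    systemInstruction: { parts: [{ text: systemPrompt }] },
    // The user turn carries only the question.
    contents: [{ role: "user", parts: [{ text: userQuery }] }],
  };
}
```

&lt;p&gt;Pass the result through &lt;code&gt;JSON.stringify&lt;/code&gt; and use it as the &lt;code&gt;body&lt;/code&gt; of the request.&lt;/p&gt;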


&lt;h2&gt;
  
  
  The 128K Context Window Is The Real Superpower
&lt;/h2&gt;

&lt;p&gt;Everyone talks about multimodal. The feature that &lt;br&gt;
actually changed how I architect apps is the &lt;br&gt;
&lt;strong&gt;128,000 token context window.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For my relationship network app, I load the user's &lt;br&gt;
entire social graph — every person, every &lt;br&gt;
relationship, every label — directly into the &lt;br&gt;
context window. Then Gemma 4 reasons across the &lt;br&gt;
whole graph in one shot.&lt;/p&gt;

&lt;p&gt;No RAG. No vector database. No chunking.&lt;/p&gt;

&lt;p&gt;Just:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;systemPrompt = rules + ENTIRE network graph as text
userMessage = "How am I connected to Rahul?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gemma 4 traces multi-hop paths (A knows B who knows C) &lt;br&gt;
and explains them in warm natural language. &lt;/p&gt;

&lt;p&gt;For reference — 128K tokens fits roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An entire novel (90,000 words)&lt;/li&gt;
&lt;li&gt;A full codebase (hundreds of files)&lt;/li&gt;
&lt;li&gt;Months of conversation history&lt;/li&gt;
&lt;li&gt;Your entire relationship network&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This changes what's architecturally possible. You don't &lt;br&gt;
need a search layer for many use cases — just load &lt;br&gt;
the data and let the model reason.&lt;/p&gt;
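
&lt;p&gt;Loading everything into the prompt only works while the graph actually fits, so a quick size check before sending is worth having. The sketch below serializes a graph to text and estimates its token count. The graph shape, the line format, the helper names, and the figure of roughly 4 characters per token are all assumptions for illustration; a real tokenizer will give different numbers.&lt;/p&gt;

```javascript
const CONTEXT_LIMIT_TOKENS = 128000;

// Hypothetical graph shape:
//   { people: [{ id, name }], relationships: [{ from, to, label }] }
function graphToText(graph) {
  const names = new Map(graph.people.map((p) => [p.id, p.name]));
  return graph.relationships
    .map((r) => `${names.get(r.from)} -- ${r.label} -- ${names.get(r.to)}`)
    .join("\n");
}

// Rough heuristic (assumption): about 4 characters per token for English text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Leaves headroom for the rules, the question, and the model's answer.
function fitsInContext(text, reservedTokens = 2000) {
  return CONTEXT_LIMIT_TOKENS >= estimateTokens(text) + reservedTokens;
}
```

&lt;p&gt;If &lt;code&gt;fitsInContext&lt;/code&gt; returns false, that is the point where a search layer starts to earn its place again.&lt;/p&gt;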


&lt;h2&gt;
  
  
  Multimodal: It Just Works
&lt;/h2&gt;

&lt;p&gt;Sending an image to Gemma 4 is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What's in this image?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;inline_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;mime_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image/jpeg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;base64ImageString&lt;/span&gt; &lt;span class="c1"&gt;// remove the data:image/jpeg;base64, prefix&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used this for an "upload a group photo" &lt;br&gt;
feature. Gemma 4 reads body language, setting, and &lt;br&gt;
context to suggest what relationships the people &lt;br&gt;
in the photo might have. No extra model needed — &lt;br&gt;
same API, same endpoint.&lt;/p&gt;
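
&lt;p&gt;The prefix mentioned in the code comment is easy to forget when the image comes from a browser &lt;code&gt;FileReader&lt;/code&gt;, which returns a full data URL. A small sketch that splits such a URL into the two fields used above; &lt;code&gt;toInlineData&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```javascript
// Hypothetical helper: turns "data:image/jpeg;base64,AAAA..." into
// the { mime_type, data } pair used by inline_data in the request above.
function toInlineData(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error("Expected a base64 data URL");
  return { mime_type: match[1], data: match[2] };
}
```

&lt;p&gt;Reading the MIME type from the URL also avoids hard-coding &lt;code&gt;image/jpeg&lt;/code&gt; when a user uploads a PNG.&lt;/p&gt;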




&lt;h2&gt;
  
  
  Which Model Should YOU Use?
&lt;/h2&gt;

&lt;p&gt;After building with all of them, here's my honest take:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;gemma-4-e2b-it&lt;/code&gt; (2B) if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're building for Raspberry Pi or mobile edge&lt;/li&gt;
&lt;li&gt;Latency matters more than response quality&lt;/li&gt;
&lt;li&gt;Simple Q&amp;amp;A or classification tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;gemma-4-e4b-it&lt;/code&gt; (4B) if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browser-based deployment&lt;/li&gt;
&lt;li&gt;Moderate reasoning tasks&lt;/li&gt;
&lt;li&gt;You want fast responses on a laptop (via Ollama)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;gemma-4-31b-it&lt;/code&gt; (31B) if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server-side application ← This is probably you&lt;/li&gt;
&lt;li&gt;Complex reasoning, multi-hop logic&lt;/li&gt;
&lt;li&gt;You need clean output without thinking mode&lt;/li&gt;
&lt;li&gt;Best balance of quality and reliability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt; (26B MoE) if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You specifically want thinking/reasoning mode&lt;/li&gt;
&lt;li&gt;High-throughput use cases&lt;/li&gt;
&lt;li&gt;You don't mind managing the thinkingBudget setting&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Open-Source At This Level Actually Means
&lt;/h2&gt;

&lt;p&gt;Gemma 4 31B runs on a single high-end GPU. &lt;br&gt;
The 4B model runs on a laptop.&lt;/p&gt;

&lt;p&gt;When you host the model yourself, that means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your users' data never leaves their device&lt;/li&gt;
&lt;li&gt;No per-token cost at scale&lt;/li&gt;
&lt;li&gt;No vendor lock-in&lt;/li&gt;
&lt;li&gt;Full control over the model behavior&lt;/li&gt;
&lt;li&gt;Deploy in countries with data residency requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For my relationship network app, the privacy angle &lt;br&gt;
is real — people's family and social connections are &lt;br&gt;
sensitive data. Running Gemma 4 locally means that &lt;br&gt;
data stays local. That's a genuine competitive &lt;br&gt;
advantage over apps that send everything to OpenAI.&lt;/p&gt;

&lt;p&gt;We're at a point where open-source models are &lt;br&gt;
genuinely competitive with proprietary ones for &lt;br&gt;
real production use cases. In my testing, Gemma 4 31B &lt;br&gt;
wasn't just "almost as good as GPT-4." For focused tasks &lt;br&gt;
with good prompting, I couldn't tell the difference.&lt;/p&gt;

&lt;p&gt;That changes the calculus for every developer &lt;br&gt;
building AI-powered products.&lt;/p&gt;




</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
    <item>
      <title>Bondmap: AI-Powered Relationship Network That Maps How You're Connected to Everyone Using Gemma 4</title>
      <dc:creator>Manglesh </dc:creator>
      <pubDate>Sun, 24 May 2026 18:14:29 +0000</pubDate>
      <link>https://dev.to/a1_funmotivationcreati/bondmap-ai-powered-relationship-network-that-maps-how-youre-connected-to-everyone-using-gemma-4-18b5</link>
      <guid>https://dev.to/a1_funmotivationcreati/bondmap-ai-powered-relationship-network-that-maps-how-youre-connected-to-everyone-using-gemma-4-18b5</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: &lt;br&gt;
Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Bondmap&lt;/strong&gt; is a social relationship network where anyone &lt;br&gt;
can map their real-world connections — family, friends, &lt;br&gt;
colleagues, romantic partners, long-distance relationships &lt;br&gt;
— as a beautiful interactive visual graph.&lt;/p&gt;

&lt;p&gt;The core idea: you add people and define how you know them. &lt;br&gt;
Bondmap's AI (powered by Gemma 4) then reasons across your &lt;br&gt;
entire network to explain hidden connections, trace &lt;br&gt;
relationship paths, and suggest people you may know through &lt;br&gt;
others.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem it solves:&lt;/strong&gt; Most people have no clear picture &lt;br&gt;
of how everyone in their life is actually connected. &lt;br&gt;
LinkedIn shows work connections. WhatsApp shows contacts. &lt;br&gt;
But nobody shows you that your college friend's brother &lt;br&gt;
works with your current colleague — until now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Interactive D3.js network map with color-coded connections
(purple = family, green = friends, blue = work, pink = romantic)&lt;/li&gt;
&lt;li&gt;Add people and define relationship types and labels
(brother, mentor, childhood friend, long-distance, etc.)&lt;/li&gt;
&lt;li&gt;Ask Gemma 4 in plain English: "How am I connected to Rahul?"&lt;/li&gt;
&lt;li&gt;Upload a group photo — Gemma 4 vision analyzes who might 
be in it and suggests relationship types from context&lt;/li&gt;
&lt;li&gt;1st, 2nd, and 3rd degree connection discovery&lt;/li&gt;
&lt;li&gt;Spring Boot backend + React frontend&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;[🔗 Live App — bondmap.web.app]&lt;/p&gt;

&lt;p&gt;Key demo moments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Adding people with different relationship types&lt;/li&gt;
&lt;li&gt;Asking "How am I connected to [person]?" 
and getting a warm natural language answer from Gemma 4&lt;/li&gt;
&lt;li&gt;Uploading a group photo and watching Gemma 4 
analyze the context&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;[🐙 GitHub Repository — github.com/MangleshKumar1/bondmap]&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend: React + Vite + D3.js&lt;/li&gt;
&lt;li&gt;Backend: Java Spring Boot&lt;/li&gt;
&lt;li&gt;Database: Firebase Firestore&lt;/li&gt;
&lt;li&gt;AI: Gemma 4 via Google AI Studio API&lt;/li&gt;
&lt;li&gt;Hosting: Firebase Hosting + Railway (backend)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How I Used Gemma 4
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model chosen: &lt;code&gt;gemma-4-31b-it&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I specifically chose the &lt;strong&gt;Gemma 4 31B Dense&lt;/strong&gt; model for &lt;br&gt;
three reasons:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Relationship Path Reasoning&lt;/strong&gt;&lt;br&gt;
When a user asks "How am I connected to Anjali?", Gemma 4 &lt;br&gt;
receives the entire relationship network as context and &lt;br&gt;
traces the path across multiple hops — A knows B who knows C &lt;br&gt;
— in a single inference call. This multi-hop graph reasoning &lt;br&gt;
in natural language is exactly where the 31B model's &lt;br&gt;
reasoning depth matters. A smaller model gave vague answers; &lt;br&gt;
31B gave precise, warm, human-readable explanations every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The 128K Context Window&lt;/strong&gt;&lt;br&gt;
The entire relationship network — every person, every &lt;br&gt;
connection, every label — is loaded into Gemma 4's context &lt;br&gt;
window in one shot. No RAG, no chunking, no vector database. &lt;br&gt;
Gemma 4 holds the whole graph in memory and reasons across &lt;br&gt;
it holistically. For a social graph with dozens of people &lt;br&gt;
and relationships, this is only possible with a large &lt;br&gt;
context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Multimodal Vision for Photo Analysis&lt;/strong&gt;&lt;br&gt;
When a user uploads a group photo, Gemma 4 analyzes the &lt;br&gt;
image — reads the occasion, body language, and positioning &lt;br&gt;
— and suggests who these people might be and what &lt;br&gt;
relationship types they have. This multimodal capability &lt;br&gt;
is native to Gemma 4 and required no additional models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not a smaller Gemma 4 model?&lt;/strong&gt;&lt;br&gt;
I tested &lt;code&gt;gemma-4-e4b-it&lt;/code&gt; (4B) for the connection reasoning &lt;br&gt;
task. The responses were correct but shallow — it couldn't &lt;br&gt;
reliably trace 2nd and 3rd degree connections across a &lt;br&gt;
larger network. The 31B model handled complex multi-hop &lt;br&gt;
paths accurately every time, which is the core AI feature &lt;br&gt;
of the app.&lt;/p&gt;
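
&lt;p&gt;One way to check claims like "accurately every time" is to compute the degree of connection deterministically and compare it with what the model says. The sketch below uses a breadth-first search over the same graph. It is not part of Bondmap as described here; the edge format and the name &lt;code&gt;degreeOfConnection&lt;/code&gt; are assumptions for illustration.&lt;/p&gt;

```javascript
// Hypothetical input: an array of [personA, personB] pairs, treated as undirected.
// Returns the number of hops between start and target, or null if unconnected.
function degreeOfConnection(edges, start, target) {
  // Build an adjacency list.
  const neighbors = new Map();
  for (const [a, b] of edges) {
    if (!neighbors.has(a)) neighbors.set(a, []);
    if (!neighbors.has(b)) neighbors.set(b, []);
    neighbors.get(a).push(b);
    neighbors.get(b).push(a);
  }

  // Breadth-first search visits people in order of increasing distance,
  // so the first time we reach the target we have the shortest path.
  const distance = new Map([[start, 0]]);
  const queue = [start];
  while (queue.length !== 0) {
    const current = queue.shift();
    if (current === target) return distance.get(current);
    for (const next of neighbors.get(current) ?? []) {
      if (!distance.has(next)) {
        distance.set(next, distance.get(current) + 1);
        queue.push(next);
      }
    }
  }
  return null; // no path between the two people
}
```

&lt;p&gt;Running this over a handful of test networks gives a ground truth to score each model's answers against.&lt;/p&gt;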

&lt;p&gt;&lt;strong&gt;Architecture:&lt;/strong&gt;&lt;br&gt;
User asks question&lt;br&gt;
↓&lt;br&gt;
React frontend → Spring Boot API&lt;br&gt;
↓&lt;br&gt;
AIService.java builds prompt:&lt;br&gt;
systemInstruction = rules + full network graph&lt;br&gt;
user message = the question only&lt;br&gt;
↓&lt;br&gt;
Gemma 4 31B reasons across entire graph&lt;br&gt;
↓&lt;br&gt;
Clean natural language response&lt;br&gt;
↓&lt;br&gt;
Displayed in UI&lt;/p&gt;

&lt;p&gt;Gemma 4 is not a wrapper here — it IS the relationship &lt;br&gt;
intelligence engine. Every insight, every path explanation, &lt;br&gt;
every photo analysis runs through Gemma 4.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
  </channel>
</rss>
