DEV Community: Hemalatha Nambiradje

🔮 Beat the Oracle: A FIFA World Cup 2026 AI Prediction Duel

Hemalatha Nambiradje — Wed, 17 Jun 2026 00:27:49 +0000

This is a submission for the June Solstice Game Jam

What I Built

Beat the Oracle is a daily FIFA World Cup 2026 prediction game where you go head-to-head against an AI — Google's Gemini 1.5 Flash — to call match scores before kickoff. Out-predict the machine and you win the day's Turing Test. Lose, and the Oracle has outsmarted you... until tomorrow.

Every day you're served the same 5 matches as everyone else: some already played (scored instantly), some upcoming (lock in your call and come back). The Oracle reads each team's recent World Cup form and makes its own prediction with written reasoning — which you only see after you've locked in yours. No peeking, no cheating. Just you versus the machine.

Scoring:

Result	Points	Badge
Exact scoreline	3	🎯 "Enigma Cracked!"
Correct result (W/D/L)	1	✅
Miss	0	❌
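
The scoring rules above can be sketched as a small function. This is an illustration rather than the game's actual source; `outcome` and `scorePrediction` are hypothetical names.

```javascript
// Hypothetical helper: classify a scoreline as a home win, away win, or draw.
function outcome(home, away) {
  if (home > away) return "home";
  if (home < away) return "away";
  return "draw";
}

// Score one prediction against the final result:
// 3 for the exact scoreline, 1 for the correct result, 0 for a miss.
function scorePrediction(predicted, actual) {
  if (predicted.home === actual.home && predicted.away === actual.away) return 3;
  if (outcome(predicted.home, predicted.away) === outcome(actual.home, actual.away)) return 1;
  return 0;
}

console.log(scorePrediction({ home: 2, away: 1 }, { home: 2, away: 1 })); // 3
console.log(scorePrediction({ home: 1, away: 0 }, { home: 2, away: 1 })); // 1
console.log(scorePrediction({ home: 0, away: 0 }, { home: 2, away: 1 })); // 0
```

The same function can score both the player and the Oracle, so the two totals stay directly comparable.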

Why this fits the June Solstice Game Jam

This jam asked for a game inspired by the solstice or any June celebration — and Beat the Oracle is stitched to June on two threads the challenge itself calls out:

The World Cup is June's global celebration. The prompt names it directly: "the electric teamwork and high stakes of the World Cup, bringing the entire planet together in the spirit of play." That's the playground this game lives in — and as a bonus, I built it from 🇨🇦 Canada, a 2026 host nation.
June is Alan Turing's month. Born June 23rd, Turing is the reason this jam has a "father of computing" prize at all. So I didn't bolt a Turing reference onto a football game — I built a playable Turing Test and gave it a World Cup costume. Turing's 1950 question, "Can machines think?", becomes a question you answer with your gut every single day: can a machine predict football better than you?

And the "daily" loop — new matches each day, your score reset, the machines winning "for today" — leans into the solstice's own theme of cycles and the passage of time. Every day is a fresh test. Every match is a new cipher.

🔗 Live demo: hema-nambi.github.io/BeatTheOracle

Code

Hema-Nambi / BeatTheOracle

Daily FIFA World Cup 2026 prediction duel vs Gemini AI. Beat the Oracle to win the Turing Test. Built for DEV.to June Solstice Game Jam 2026.

How I Built It

The stack — 100% free, zero backend

Single HTML file — no framework, no build step, no server
ESPN API — free, CORS-friendly, no key required (site.api.espn.com/apis/site/v2/sports/soccer/fifa.world/scoreboard)
Gemini 1.5 Flash — the Oracle's brain, running entirely client-side (players bring their own free key from Google AI Studio)
Vanilla JS + CSS — glassmorphism UI, a confetti canvas, animated score counters

The whole thing is one file you can open in a browser. That constraint kept the game honest and the architecture transparent — exactly what you want when the judges might read the source.

How the Oracle thinks

This is the heart of the game. The Oracle isn't a random number generator wearing a robot emoji — it reasons from real tournament data. I pull each team's live W/L/D form from ESPN and hand Gemini a structured prompt:

You are the Oracle — an AI sports analyst at the FIFA World Cup 2026.
Recent form:
• Canada: W W D
• Morocco: W L W
Predict the exact final score. Respond with JSON: {"home": N, "away": N, "reason": "..."}

Gemini returns a specific scoreline and a sentence of reasoning you get to read and judge. Sometimes it's frighteningly sharp. Sometimes you school it. That tension — is the machine actually smarter than me here? — is the entire game. Without Gemini, this is just a form. With it, you have an opponent.
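
A rough sketch of the two ends of that exchange: assembling the prompt from form data, and checking the reply before it gets scored. The Gemini call itself is left out, and `buildOraclePrompt` and `parseOracleReply` are hypothetical names, so treat this as one possible shape rather than the game's real code.

```javascript
// Build the Oracle prompt. `form` maps a team name to its recent results ("W", "D", "L").
function buildOraclePrompt(form) {
  const lines = Object.entries(form).map(
    ([team, results]) => `• ${team}: ${results.join(" ")}`
  );
  return [
    "You are the Oracle, an AI sports analyst at the FIFA World Cup 2026.",
    "Recent form:",
    ...lines,
    'Predict the exact final score. Respond with JSON: {"home": N, "away": N, "reason": "..."}',
  ].join("\n");
}

// Validate the model's reply; throws if it is not JSON or has the wrong shape.
function parseOracleReply(text) {
  const data = JSON.parse(text);
  const isGoalCount = (n) => Number.isInteger(n) && n >= 0;
  if (!isGoalCount(data.home) || !isGoalCount(data.away) || typeof data.reason !== "string") {
    throw new Error("Oracle reply does not match the expected shape");
  }
  return { home: data.home, away: data.away, reason: data.reason };
}

console.log(buildOraclePrompt({ Canada: ["W", "W", "D"], Morocco: ["W", "L", "W"] }));
```

Rejecting malformed replies up front means a bad response surfaces as an error instead of a silently wrong score.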

The daily challenge mechanic

Everyone gets the same 5 matches each day, chosen by a date-seeded PRNG (xorshift32 seeded on the current date). That does three things:

Scores are globally comparable — your 7 points means the same as everyone else's
Return visits are rewarded — new matches drop daily
Finished matches score instantly, while upcoming ones save for later — instant payoff and a reason to come back
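
Here is a minimal sketch of a date-seeded pick. Only xorshift32 and the date seed come from the description above. The FNV-1a hash that turns the date string into a seed, the Fisher-Yates shuffle, and the function names are all assumptions made for illustration.

```javascript
// Turn a YYYY-MM-DD string into a non-zero 32-bit seed (xorshift32 must not start at 0).
// FNV-1a is an assumed choice here; any stable string hash would do.
function seedFromDate(dateStr) {
  let h = 2166136261; // FNV-1a offset basis
  for (const ch of dateStr) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 16777619); // FNV-1a prime
  }
  return (h >>> 0) || 1;
}

// xorshift32: returns a function that yields floats in [0, 1).
function xorshift32(seed) {
  let x = seed >>> 0;
  return () => {
    x ^= x << 13;
    x ^= x >>> 17;
    x ^= x << 5;
    x >>>= 0; // keep the state as an unsigned 32-bit integer
    return x / 4294967296;
  };
}

// Shuffle a copy of the match list with the seeded generator and take the first `count`.
function pickDailyMatches(matches, dateStr, count = 5) {
  const rng = xorshift32(seedFromDate(dateStr));
  const pool = [...matches];
  for (let i = pool.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

// Using the UTC date keeps every player on the same seed regardless of time zone.
const today = new Date().toISOString().slice(0, 10);
console.log(pickDailyMatches(["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"], today));
```

Because the seed depends only on the date string, every browser that starts from the same match list ends up with the same five fixtures.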

The share card

🔮 Beat the Oracle ⚽ — Day #5
FIFA World Cup 2026 🇨🇦
🎯 🇨🇦🇲🇦  You: 2–1 · Oracle: 1–0 🔐 Enigma Cracked!
✅ 🇧🇷🇪🇸  You: 2–1 · Oracle: 2–0
👤 Me: 7pts  vs  🔮 Oracle: 4pts
🏆 Human wins the Turing Test!
#BeatTheOracle #WorldCup2026 #JuneSolsticeGameJam

Wordle-style emoji grid, one-click copy, built-in share button. Designed to travel.

Challenges I hit

CORS walls. My first data source (football-data.org) blocks browser requests. I swapped the whole pipeline to ESPN's public API — free, fast, and CORS-open.
Quota cliff. Gemini 2.0 Flash's free quota was exhausted on launch day. I dropped to 1.5 Flash, which has a more generous free tier and is more than smart enough to be a worthy Oracle.
Game-flow confusion. Early testers got lost — "we're both predicting, but the match hasn't happened yet?" The mixed queue fixed it: finished matches for instant gratification, upcoming matches for anticipation.

Prize Category

I'm submitting to both additional categories — and in this game they're the same mechanic seen from two angles.

🤖 Best Ode to Alan Turing

The entire game is a Turing Test you run yourself, daily. Turing's Imitation Game asked whether you could tell human from machine; Beat the Oracle asks whether human intuition can still out-predict one. The scoring language, the intro screen, and the win states all frame the duel through his legacy:

Beat Gemini → 🏆 Human wins the Turing Test
Gemini beats you → 🤖 The Oracle has outsmarted you
Nail an exact scoreline → 🔐 "Enigma Cracked" — a direct nod to Turing's wartime work breaking the Enigma cipher at Bletchley Park

Every match is a new cipher. Every session is a new test. It's not a tribute added to the game — it's the game's spine.

🌟 Best Google AI Usage

Gemini 1.5 Flash isn't a feature here — it's the opponent. It:

Analyses real, live match data from the current tournament
Generates natural-language reasoning you can read, judge, and argue with
Returns structured JSON predictions you can verify and score against your own
Runs entirely client-side through the Gemini API — no backend, no server, no secrets stored

The AI is the difference between "a football prediction form" and "a game with a worthy rival." Google AI Studio's free tier made it possible to ship that rival to anyone with a 30-second API key signup.

What's Next

🌐 Global leaderboard (Neon serverless Postgres) — rank against other humans, not just the Oracle
📊 Season-long accuracy tracking across the tournament
🔔 Match reminders so you never miss scoring a saved prediction

Try It

▶️ Play Beat the Oracle →

You'll need a free Gemini API key from Google AI Studio — takes about 30 seconds. Then go find out whether the machines have won... today.

Made with ❤️ in Canada 🇨🇦 for the June Solstice Game Jam 2026.

I Built a Daily News Newsletter Bot with Hermes Agent — Here's Everything That Went Wrong (and Right)

Hemalatha Nambiradje — Wed, 27 May 2026 02:54:58 +0000

Submitted for the Hermes Agent Challenge

The Idea

I wanted one simple thing: a daily email that lands in my inbox every morning with the top news from Canada, the world, India, and the AI/tech space — plus a motivational quote and a health tip. One email. Everything in one place. No scrolling through five different apps before my coffee.

Sounds simple. It wasn't.

This is the honest story of building a daily briefing bot using Hermes Agent — including every wall I hit, every config I lost, and every moment where it finally clicked.

The Stack (Zero Cost, Fully Open Source)

Hermes Agent — the brain that fetches news and generates the newsletter
GitHub Codespaces — free cloud dev environment (no touching my personal Mac)
TypeScript + Nodemailer — sends the email via Gmail SMTP
OpenRouter — free LLM API for Hermes to use

No paid services. No cloud bills. Just open source tooling wired together.

Why GitHub Codespaces?

I specifically didn't want this running on my personal Mac. I wanted it isolated — something I could destroy and rebuild without affecting my machine. GitHub Codespaces gave me a free Linux environment in the browser. Perfect.
Or so I thought.

The Problems (The Real Story)

Problem 1: The Typo That Took 30 Minutes
After getting Nodemailer set up and running my send script, the terminal just... hung. No error. No output. Just silence.
I thought it was a firewall issue in Codespaces blocking SMTP port 587. I switched to port 465. Still hanging on some runs. I added verbose logging. I tried verify() calls.

Then I spotted it:
hostname: 'smpt.gmail.com'

smpt instead of smtp. One transposed letter. Thirty minutes of debugging.
Lesson: Always print your env vars before debugging the code.
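
A quick way to act on that lesson is to print the relevant variables, with secrets masked, before touching any network code. This is a sketch under assumptions: `describeEnv` is a hypothetical helper, and `SMTP_PORT` and `SMTP_PASS` are guessed variable names.

```javascript
// Describe selected env vars, masking anything whose name looks like a secret.
function describeEnv(env, names) {
  return names.map((name) => {
    const value = env[name];
    if (value === undefined || value === "") return `${name}=<missing>`;
    const secret = /PASS|KEY|SECRET|TOKEN/i.test(name);
    return `${name}=${secret ? "*".repeat(value.length) : value}`;
  });
}

// Variable names below are assumptions; adjust them to match your own .env file.
console.log(
  describeEnv(process.env, ["SMTP_HOST", "SMTP_PORT", "SMTP_USER", "SMTP_PASS"]).join("\n")
);
```

With the hostname printed in plain sight, a typo like smpt is much easier to spot than when it is buried in a config object.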

Problem 2: Rebuilding the Container Deleted Everything
This one hurt. I had Hermes installed, my .env configured, my SMTP secrets set up, everything working. Then I rebuilt my Codespace container to fix an unrelated issue.
Gone. All of it.
Hermes was uninstalled. My environment variables vanished. My .env file disappeared. I had to reinstall everything from scratch and reconfigure all my secrets.
The fix was two things:
First, create a .devcontainer/devcontainer.json so Hermes auto-installs on every rebuild:

{
  "name": "daily-brief-hermes",
  "postCreateCommand": "pip install hermes-agent && npm install"
}

Second, keep secrets in ~/.hermes/.env and your project .env, saved to a safe location (and kept out of version control), not just floating in your shell session.
Lesson: Never trust your shell session. Anything not written to a file is gone the moment the container rebuilds.

Problem 3: ts-node Fighting with "type": "module"
My package.json had "type": "module" in it, which made ts-node throw:
TypeError: Unknown file extension ".ts"
Three different error messages, two config changes, one Stack Overflow rabbit hole. The fix was switching from ts-node to tsx — a drop-in replacement that handles both ESM and CommonJS without any config:

npm install --save-dev tsx
npx tsx send-newsletter.ts briefings/2026-05-26.md

Lesson: Use tsx over ts-node for TypeScript in modern Node projects. It just works.

Problem 4: "Newsletter Sent!" But No Email
The most confusing moment. The script printed Newsletter sent! — Nodemailer was happy, no errors thrown. But my inbox was empty.
Three possible culprits I had to rule out one by one:

Spam folder — it was there. Gmail flagged it as spam.
Wrong app password — Gmail requires a 16-character App Password, not your regular login password. Easy to get wrong.
Empty file being sent — the file path resolved correctly but the content hadn't been written yet.

The spam issue was fixed by adding a proper sender name and dynamic subject line:

await transporter.sendMail({
  from: `"Daily Brief 📰" <${SMTP_USER}>`,
  to: RECIPIENTS.join(", "),
  subject: `Daily Brief — ${new Date().toLocaleDateString("en-CA", {
    weekday: "long",
    year: "numeric",
    month: "long",
    day: "numeric"
  })}`,
  text: body,
});

Lesson: Always check spam. Always use Gmail App Passwords, not your account password. Always mark the first email as "Not Spam" to train Gmail.
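
The third culprit, an empty file, can be caught before anything is sent. A minimal sketch, assuming the Markdown has already been read into a string; `ensureBody` is a hypothetical helper and not part of Nodemailer.

```javascript
// Refuse to send when the briefing text is missing or contains only whitespace.
function ensureBody(body, sourcePath) {
  if (typeof body !== "string" || body.trim().length === 0) {
    throw new Error(`Refusing to send: ${sourcePath} is empty`);
  }
  return body;
}

try {
  ensureBody("   \n", "briefings/2026-05-26.md");
} catch (err) {
  console.error(err.message); // Refusing to send: briefings/2026-05-26.md is empty
}
```

Failing loudly here turns a silent "sent but blank" email into an error you see in the terminal.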

Problem 5: Hermes Had No Model Configured
After getting email working, I opened hermes chat and pasted my newsletter prompt. Hermes responded with:

No inference provider configured. Run 'hermes model' to choose a provider and model,
or set an API key in ~/.hermes/.env

Then after adding the OpenRouter key:

HTTP 400: No models provided

The API key was there but no model was set. The fix was adding HERMES_MODEL to ~/.hermes/.env:

OPENROUTER_API_KEY=sk-or-xxxxxxxxxxxxxxxx
HERMES_MODEL=owlobot/owl-7b

Lesson: Hermes needs both an API key AND a model specified. One without the other gives cryptic errors.

What Actually Worked Beautifully

Once everything was configured, Hermes was genuinely impressive to use. I pasted a plain English prompt:

Today is 2026-05-26. Search the web for today's top 5 headlines for 
Canada news, World news, India news, and AI/tech news. Add one 
motivational quote and one health tip. Format as Markdown and save to 
/workspaces/daily-brief-hermes/briefings/2026-05-26.md

And Hermes:

Searched the web for current headlines across all four categories
Summarized each story in readable bullet points
Added a motivational quote and health tip
Formatted everything as clean Markdown
Saved it to the exact file path I specified

That's the part that made the whole painful setup worth it. I didn't write a single line of news-fetching code. No RSS parsers, no news APIs, no scraping. Hermes handled all of it through natural language.

The Final Architecture

~/.hermes/.env
  └── OPENROUTER_API_KEY + HERMES_MODEL
  └── SMTP credentials

hermes chat (manual trigger or cron)
  └── Reads skills/daily_brief.md prompt
  └── Searches web for today's news
  └── Generates Markdown newsletter
  └── Saves to briefings/YYYY-MM-DD.md

npx tsx send-newsletter.ts briefings/YYYY-MM-DD.md
  └── Reads ~/.hermes/.env for SMTP credentials
  └── Sends email via Gmail SMTP port 465
  └── Delivers to all recipients in the list

The Hermes Cron Setup (For Fully Automatic Daily Runs)

hermes cron start

Inside hermes chat:

/cron add "0 8 * * *" "Read the skill at /workspaces/daily-brief-hermes/skills/daily_brief.md 
and generate today's newsletter. Save it to /workspaces/daily-brief-hermes/briefings/
$(date +%Y-%m-%d).md then run: npx tsx /workspaces/daily-brief-hermes/send-newsletter.ts 
/workspaces/daily-brief-hermes/briefings/$(date +%Y-%m-%d).md"

Every day at 8am UTC, Hermes generates and sends the newsletter automatically.

Note: Codespaces sleeps when idle, so for a truly always-on setup you'd move this to a small VPS. But for prototyping and learning, Codespaces works perfectly.

What I'd Tell Someone Starting This Today

Create .devcontainer/devcontainer.json on day one. Don't wait until you've lost your setup to a rebuild.
Keep all secrets in files, never just in shell exports. Shell exports vanish. Files don't.
Use tsx instead of ts-node. It handles modern Node module systems without fighting your package.json.
Test the email script completely before touching Hermes cron. Get email working first. Then add the AI layer.
Print your env vars before debugging network issues. echo $SMTP_HOST takes two seconds and would have saved me thirty minutes.
Check spam. Seriously. Check spam first.

Final Thoughts on Hermes Agent

The setup friction is real — especially in a Codespaces environment where rebuilds wipe your state. But once Hermes is configured, the experience of writing plain English instructions and watching it search the web, reason about content, and produce structured output is genuinely different from anything I've built before.

I didn't write a news aggregator. I didn't build a scraper. I wrote a prompt and a TypeScript email script, and I get a daily briefing in my inbox every morning.

That's the part that sticks with me.

Built for the Hermes Agent Challenge | GitHub: daily-brief-hermes

Google I/O 2026 Blew My Mind — Here's What It Means for the Family App I'm Building

Hemalatha Nambiradje — Thu, 21 May 2026 18:37:46 +0000

This is a submission for the Google I/O Writing Challenge

I went into Google I/O 2026 as someone still finding her footing in app development. I came out the other side genuinely excited — and a little overwhelmed — about how fast everything is moving.

I am an SDET with 9 years in software testing. A few months ago I started learning AI fundamentals and building my first real app: a family super-app that combines grocery tracking, shopping budgets, family calendar planning, and weekend activity planning all in one place. I am building it in Google AI Studio using Gemini 3 Flash Preview.

So when I watched Google I/O this week, I wasn't watching as a passive observer. I was watching as someone actively building — and almost every announcement had me pausing the stream thinking "wait, that changes what I'm building."

Here are the moments that hit hardest.

1. Antigravity Built a Full OS in 12 Hours for Under $1,000

This was the jaw-drop moment of the entire keynote for me.

Google unveiled Antigravity 2.0, their agent-first development platform with a new Antigravity CLI for orchestrating and building agents. But the live demo went further than any announcement slide could capture — an AI agent was given a single prompt and built a functioning operating system in approximately 12 hours at a cost of under $1,000.

Let that sink in. A full OS. Twelve hours. Less than a thousand dollars.

As someone coming from a quality engineering background with limited development experience, this felt like the ground shifting. The barrier to building software just got dramatically lower. I have been spending weeks learning how to wire up a Node.js backend. Antigravity is showing a future where you describe what you want and the agent scaffolds it, writes it, and tests it.

What this means for my app:

I am building a family app with multiple moving parts — grocery lists, budget tracking, a family calendar, a weekend planner. Normally that scope would take a solo developer months. With agentic coding tools like Antigravity, I can see a future where I describe a feature in plain English, the agent writes the code, and I use my QE skills to review and test what it produced.

That is a workflow I genuinely understand. I have been testing other people's code for 9 years. Now the "other developer" is an AI agent, and my job is the same: find the gaps, validate the output, ship with confidence.

2. Agentic Coding in Search — Robby Stein's Demo

The Search keynote by Robby Stein was the section I rewatched twice.

Google Search introduced an AI-powered experience that generates contextual answers, images, and short videos, making it more assistant-like in function. Generative UI was also unveiled, dynamically adjusting how results appear based on user intent.

What struck me was how Search is no longer just a lookup tool. It is starting to reason. It doesn't just find — it understands what you are trying to accomplish and adapts the results to help you get there.

As someone building a family app, this matters because families don't search like developers do. A parent doesn't type "quinoa recipe low sodium 4 servings." They type "something healthy for dinner the kids won't complain about." The new Search understands that intent. It adapts.

What this means for my app:

I want to build a search experience inside my family app that works the same way. Not a filter. Not a dropdown. A natural language input where a family member types "we have chicken and we're on a budget this week" and the app understands the whole context — the ingredients they have, their budget remaining, what's already on the grocery list — and surfaces a meal plan that fits.

That's the bar Google just set. And it's the bar I'm now designing toward.

3. Gemini Spark — The 24/7 Background Agent

Gemini Spark is a new AI agent that lives in the cloud and works proactively on tasks in the background, continuing to work even when you're not actively using it.

This one I am still processing. An agent that doesn't wait for you to ask. It monitors, plans, and acts — all while you go about your day.

For a family app, the implications are significant. Most family planning apps are reactive — you open the app, you add something, you check a list. Gemini Spark represents a shift to proactive. The agent notices that your grocery budget is running low mid-week, checks the family calendar and sees there's a dinner party on Saturday, and quietly starts building a shopping list without you having to ask.

What this means for my app:

The Weekend Planner feature I've been designing fits perfectly here. Until now, I had been planning it as a manual feature: family members add activities, the app shows a view. But after watching Spark, I'm rethinking it as a proactive agent:

Monday: Spark notices the weekend is approaching and checks the family calendar
Tuesday: Spark suggests 3 weekend activity options based on past preferences and local weather
Wednesday: Family approves a plan — Spark generates the shopping list and budget estimate
Friday: Spark sends a reminder with the full weekend plan and what's still needed

That's not a feature. That's a family assistant. And Google just showed us it's possible.

4. Universal Cart — The Feature I Am Most Excited to Integrate

Universal Cart uses AI to proactively check your cart and understands context — like knowing which parts you're buying for a PC build — across multiple retailers.

This was the announcement I immediately started sketching integration ideas for.

Families don't shop at one store. We go to Costco for bulk, Metro for fresh produce, Shoppers for pharmacy items. Every week we're mentally tracking prices across multiple places. Universal Cart is Google saying: you shouldn't have to do that.

What this means for my app — in detail:

Here is exactly how I see Universal Cart fitting into the family app I am building:

Smart grocery sourcing:
A family member adds "Greek yogurt" to the shared grocery list. Universal Cart checks across nearby stores, finds the best price per unit, and flags it. The app shows: "Metro has this $1.20 cheaper this week."

Budget-aware shopping:
The family sets a $200 weekly grocery budget. As items get added to the list, the app estimates the Universal Cart total in real time. When you hit $180, the app flags it: "You're close to your budget — here are 3 items with cheaper alternatives."

Weekend planner tie-in:
Planning a family BBQ this weekend? The weekend planner generates a suggested menu. Universal Cart sources every ingredient across nearby stores, finds the best combination of stores to hit, and estimates the total cost before anyone leaves the house.

Receipt reconciliation with ReceiptMind:
After shopping, you photograph the receipt in ReceiptMind. The app compares what Universal Cart estimated vs what you actually spent. Over time, this builds a real picture of where estimates were off and which stores are consistently cheaper for your family's actual buying habits.

This integration is my north star for the next phase of this app.

5. Intelligent Eyewear — I Already Live in This World (Sort Of)

Google's collaboration with Samsung brought working demos of intelligent eyewear designed by Warby Parker and Gentle Monster, with deep Gemini integration to perform tasks through voice commands.

Here's my personal take on this one: I own a pair of Meta Ray-Bans.

I love them. I use them for calls. I take photos with them. But that is essentially it. The AI assistant on them feels like a novelty — it can answer general questions but it doesn't know anything about me, my context, or what I'm actually trying to do.

What Google showed at I/O is a fundamentally different vision. Gemini embedded in glasses that actually understand your context — what you're looking at, where you are, what you're trying to accomplish — and assists without you having to pull out your phone.

What this means for families:

I walked into a grocery store last Saturday with a list on my phone. I kept having to unlock it, find the list, check off items, put it back. It's friction. Small friction, but friction.

Intelligent eyewear with Gemini could eliminate that entirely. Walk into Metro, say "what do we still need?" — the glasses check the family list, see what's already in the cart via camera, and tell you what's missing. No phone. No unlocking. No friction.

For my family app, this is a future integration worth designing toward today — even if the hardware isn't available yet. Building the API layer to support a glasses interface means we're ready when the hardware ships this fall.

The Bigger Picture — What Google I/O Meant to Me Personally

I came into this as an SDET who made a decision a few months ago to stop watching AI from the sidelines and start building.

Google I/O showed me how fast the gap between "idea" and "working software" is closing. Antigravity is compressing development time. Search is raising the bar for what users expect from interfaces. Spark is redefining what "proactive" means in software. Universal Cart is eliminating friction that users didn't even know they were tolerating.

As a quality engineer, my job has always been to ask: does this actually work for the person using it? That question doesn't change with AI. If anything, it gets more important — because when AI agents are doing the building and the reasoning and the planning, the human who validates the output becomes more valuable, not less.

That's the role I'm stepping into. And Google I/O 2026 gave me a much clearer picture of what I'm stepping into it for.

What I'm Building Next

Immediately after I/O, I added three things to my roadmap for the family app:

Weekend Planner v2 — redesign as a proactive agent, not a manual feature
Universal Cart integration — research the API, start designing the budget overlay
Voice-first grocery list — design the UX today so it's glasses-ready when the hardware ships

The deadline for this writing challenge is May 24. My app has no deadline. But Google I/O just gave it a much clearer direction.

Hemalatha — SDET and app builder patiently waiting for the Gemini upgrade

Building a family super-app with Google AI Studio + Gemini 3 Flash Preview

I Built an AI Receipt Scanner with Gemma 4 — As an SDET with No Dev Background

Hemalatha Nambiradje — Tue, 19 May 2026 13:42:05 +0000

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

My Honest Starting Point

I want to be upfront about something before diving in: I am an SDET — a Software Development Engineer in Test. My world is test automation, quality assurance, and finding bugs, not building apps from scratch. I have spent the last few weeks learning AI fundamentals, experimenting with different models, trying to understand how this whole ecosystem actually works beneath the surface.
When I came across this hackathon, my first instinct was to scroll past it. This is for developers. But then I thought — why not? The worst that happens is I learn something.
What followed was equal parts confusion, accidental discovery, and a working app I am genuinely proud of.

The Gemini vs Gemma Confusion (I Suspect I'm Not Alone)

My first stop was Google AI Studio. And honestly? It was fantastic for getting ideas off the ground quickly. I built a small app there, got a feel for prompt engineering, and started to understand how multimodal models work.
But there was a problem: every time I tried to use Gemma 4, Google AI Studio kept routing me to Gemini Flash Preview — the latest hosted model. No matter what I selected, it defaulted back to Gemini.
I spent an embarrassing amount of time thinking I was using Gemma 4 when I wasn't.
That confusion forced me to actually sit down and research the difference. And that is when it clicked:

Gemma is not a smaller Gemini. They share research lineage, but the deployment story is completely different. Once I understood that, everything else fell into place.

What I Built: ReceiptMind

ReceiptMind is an AI-powered receipt scanner that extracts structured data from receipt photos and builds an expense dashboard automatically.
You take a photo of a receipt — any receipt, any store — upload it, and Gemma 4 reads the image and returns:

Merchant name
Total amount
Date
Expense category (Food & Dining, Groceries, Transport, Healthcare, Entertainment)
Tax amount

No manual entry. No OCR pipeline. No template matching. Just Gemma 4 looking at the image and understanding it.
This started as a feature I wanted to add to a personal finance app I have been quietly building on the side. The hackathon gave me the deadline I needed to actually ship something.

Why Gemma 4 26B MoE — Not the Other Models

This is the question I care most about answering, because I made this choice deliberately.
The Gemma 4 family has four models: E2B, E4B, the 26B MoE (A4B), and the 31B dense.

I chose the 26B MoE (A4B) for two specific reasons:

It is the only model in the family with native image input.
ReceiptMind's entire value is reading receipt photos. Without multimodal vision, there is no product. The E2B and E4B are text-only. The 31B dense is text-only. Only the 26B MoE can receive an image and reason about what it sees.
Despite 26B total parameters, only 4B activate per token.
This is the Mixture-of-Experts efficiency. The model routes each token through only the most relevant expert layers — so I get near-31B quality visual reasoning at a fraction of the compute cost. For a hobby project running on a free API tier, this matters enormously.

I also used the 256K context window to pass multiple receipts in a single prompt when generating monthly spending insights — no chunking, no retrieval, just the full history in one shot.

The Tech Stack

Frontend → HTML + Vanilla JavaScript
Backend → Node.js + Express
AI → Gemma 4 26B MoE via OpenRouter (free tier)
Database → Neon Postgres (serverless)
File Upload → Multer (in-memory buffer → base64)

Why OpenRouter?
Google AI Studio kept routing me to Gemini Flash. OpenRouter gave me direct access to google/gemma-4-26b-a4b-it:free with no credit card and no routing surprises. Once I found it, the API worked on the first try.

How It Works — The Architecture

User uploads receipt image
↓
Express backend receives file via multer
↓
Image converted to base64
↓
Sent to OpenRouter → Gemma 4 26B MoE (multimodal)
↓
Gemma reads the image, returns structured JSON
↓
JSON saved to Neon Postgres
↓
Dashboard updates with new receipt + running totals
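
The upload-to-base64 step in that flow is short enough to sketch. This assumes multer's in-memory storage, which exposes `buffer` and `mimetype` on the uploaded file. `toDataUrl` is a hypothetical helper and the four-byte upload is fake test data.

```javascript
// Convert an in-memory upload into the data URL that the image_url block expects.
function toDataUrl(file) {
  const base64Image = file.buffer.toString("base64");
  return `data:${file.mimetype};base64,${base64Image}`;
}

// Fake upload: the first four bytes of a PNG header.
const fakeUpload = { mimetype: "image/png", buffer: Buffer.from([0x89, 0x50, 0x4e, 0x47]) };
console.log(toDataUrl(fakeUpload)); // data:image/png;base64,iVBORw==
```

Keeping the file in memory means nothing is written to disk between the upload and the model call.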

The Core API Call

Here is the exact call that makes ReceiptMind work. The key is the image_url content block — this is what tells Gemma 4 to look at the receipt image:

const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "google/gemma-4-26b-a4b-it:free",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "image_url",
            image_url: { url: `data:${mimeType};base64,${base64Image}` }
          },
          {
            type: "text",
            text: `Look at this receipt and extract the data. 
            Reply ONLY with JSON:
            {
              "merchant": "store name",
              "amount": 12.50,
              "date": "2026-05-14",
              "category": "Food & Dining",
              "tax": 1.10
            }`
          }
        ]
      }
    ]
  })
});

No OCR library. No preprocessing. No regex parsing of receipt text. Gemma reads the image exactly like a human would and returns clean structured data.
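
Since the model's output goes straight into a database, it may be worth checking the parsed object against the expected fields first. This is a sketch, not ReceiptMind's actual code: `validateReceipt` is a hypothetical helper, and the category list is the one given earlier in this post.

```javascript
// Categories listed for ReceiptMind earlier in this post.
const CATEGORIES = ["Food & Dining", "Groceries", "Transport", "Healthcare", "Entertainment"];

// Return the names of fields that are missing or malformed (empty array means valid).
function validateReceipt(data) {
  const errors = [];
  if (typeof data.merchant !== "string" || data.merchant.trim() === "") errors.push("merchant");
  if (typeof data.amount !== "number" || !(data.amount >= 0)) errors.push("amount");
  if (!/^\d{4}-\d{2}-\d{2}$/.test(String(data.date))) errors.push("date");
  if (!CATEGORIES.includes(data.category)) errors.push("category");
  if (typeof data.tax !== "number" || !(data.tax >= 0)) errors.push("tax");
  return errors;
}

const sampleReceipt = {
  merchant: "store name",
  amount: 12.5,
  date: "2026-05-14",
  category: "Food & Dining",
  tax: 1.1,
};
console.log(validateReceipt(sampleReceipt)); // []
```

Returning a list of bad field names, instead of throwing on the first one, makes it easier to log exactly what the model got wrong.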

Test Receipts — What Gemma 4 Had to Handle

I tested with 5 real-world receipt types, each designed to stress-test different extraction challenges:

The gas receipt was the toughest — 42.45L @ $1.649/L = $69.96. Gemma extracted both the unit price and total correctly without any hints.

The entertainment receipt had a discount applied before tax, which changes the subtotal calculation. Gemma handled it correctly.

What Broke (And How I Fixed It)

Problem 1: The request would hang indefinitely
The free-tier model on OpenRouter can be slow during peak hours. I added a 30-second AbortController timeout so the frontend shows a proper error instead of spinning forever.

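const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 30000);
// Pass { signal: controller.signal } in the fetch options and call clearTimeout(timeout) once the response arrives.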
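
For the timeout to take effect, the controller's signal has to be handed to fetch. One way to wire that up is sketched below; the wrapper name fetchWithTimeout and its injectable fetchFn parameter are assumptions added for illustration and testing, not the code from the repository.

```javascript
// Sketch: wraps any fetch-like function with an AbortController timeout.
// fetchFn is injectable so the helper can be exercised without a network call.
async function fetchWithTimeout(url, options = {}, ms = 30000, fetchFn = fetch) {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), ms);
  try {
    return await fetchFn(url, { ...options, signal: controller.signal });
  } catch (err) {
    // Translate the abort into a message the frontend can show.
    if (controller.signal.aborted) {
      throw new Error(`Request timed out after ${ms} ms`);
    }
    throw err;
  } finally {
    clearTimeout(timeout); // avoid a stray timer after a fast response
  }
}
```

Clearing the timer in finally matters on the happy path too, since a pending 30-second timer would otherwise outlive a request that finished quickly.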

Problem 2: Gemma sometimes wraps JSON in markdown code fences
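The model would sometimes return the object wrapped in a markdown fence (three backticks plus json before it, three backticks after it) instead of raw JSON. Fixed with a one-liner that strips both fence markers:

const clean = rawText.replace(/`{3}(?:json)?/g, "").trim();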
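
The one-liner handles the common case. A slightly more defensive variant, sketched here as an assumption rather than what ReceiptMind ships, also tolerates stray prose around the object by falling back to the outermost pair of braces. The name parseModelJson is hypothetical.

```javascript
// Built with repeat() so this example contains no literal triple backticks.
const FENCE = "`".repeat(3);
const FENCE_PATTERN = new RegExp(`${FENCE}(?:json)?`, "gi");

// Hypothetical helper: strips markdown fences, then parses the JSON object.
function parseModelJson(rawText) {
  const clean = rawText.replace(FENCE_PATTERN, "").trim();
  // Fall back to the outermost {...} span if the model added extra prose.
  const start = clean.indexOf("{");
  const end = clean.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }
  return JSON.parse(clean.slice(start, end + 1));
}
```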

Problem 3: I had no idea what was failing
As an SDET, my instinct was to add logging everywhere. I added console.log checkpoints at every step (file received → base64 converted → API called → response received → JSON parsed → DB saved). This immediately showed me exactly where things were failing during development.
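
A small wrapper can keep those checkpoints consistent. This is a sketch only: the helper name checkpoint and the log format are made up for illustration.

```javascript
// Hypothetical wrapper: logs the start and outcome of each pipeline step
// so a failure names the step it happened in.
async function checkpoint(step, fn, log = console.log) {
  const started = Date.now();
  log(`[start] ${step}`);
  try {
    const result = await fn();
    log(`[ok]    ${step} (${Date.now() - started} ms)`);
    return result;
  } catch (err) {
    log(`[fail]  ${step}: ${err.message}`);
    throw err; // rethrow so the route can still return an error response
  }
}

// Possible usage inside the upload route:
//   const receipt = await checkpoint("JSON parsed", () => JSON.parse(clean));
```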

The Expense Dashboard

After scanning receipts, the dashboard shows:

Running total across all receipts
Breakdown by category (Food & Dining, Groceries, Transport, etc.)
Full receipt log with merchant, amount, date, and category
Per-receipt tax tracking (useful for expense reports)

After scanning all 5 test receipts, the dashboard showed a combined $369.70 across 5 categories — exactly matching the manual totals.
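
The running total and category breakdown come down to one pass over the saved rows. A minimal sketch follows, assuming each row carries the amount, tax, and category fields from the extraction prompt. The function name summarize is hypothetical, and the real dashboard may well do this in SQL instead.

```javascript
// Sketch of the dashboard aggregation. Sums in integer cents to avoid
// floating-point drift when adding many currency amounts.
function summarize(receipts) {
  const toCents = (value) => Math.round(Number(value) * 100);
  let totalCents = 0;
  let taxCents = 0;
  const categoryCents = {};
  for (const r of receipts) {
    const cents = toCents(r.amount);
    totalCents += cents;
    taxCents += toCents(r.tax ?? 0);
    categoryCents[r.category] = (categoryCents[r.category] ?? 0) + cents;
  }
  const byCategory = Object.fromEntries(
    Object.entries(categoryCents).map(([name, cents]) => [name, cents / 100])
  );
  return { total: totalCents / 100, tax: taxCents / 100, byCategory };
}
```

Working in cents is what makes an "exactly matching" comparison against manual totals dependable, since adding raw decimals such as 0.1 and 0.2 does not give exactly 0.3 in JavaScript.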

What This Means for My Pet Project

ReceiptMind started as one feature of a larger personal finance app I have been building. The plan is to integrate it so users can:

Scan receipts throughout the month
Get AI-generated spending summaries ("You spent 40% more on dining this month")
Set budget alerts by category
Export expense reports for tax season

The 256K context window is what makes the spending insight feature viable — I can pass the entire month's worth of receipts in one prompt and ask Gemma to reason across all of them at once.
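
As a rough idea of what "the entire month in one prompt" could look like, here is a sketch of a prompt builder. The function name, the pipe-separated layout, and the requested reply shape are all assumptions; the feature itself is still a plan rather than shipped code.

```javascript
// Hypothetical prompt builder: serializes a month of saved receipts into
// one text prompt for a spending-summary request.
function buildSummaryPrompt(receipts, monthLabel) {
  const lines = receipts.map(
    (r) => `${r.date} | ${r.merchant} | ${r.category} | ${Number(r.amount).toFixed(2)}`
  );
  return [
    `Here are my receipts for ${monthLabel} (date | merchant | category | amount):`,
    ...lines,
    "",
    "Summarize my spending by category and point out anything unusual.",
    'Reply ONLY with JSON: { "summary": "...", "byCategory": { } }',
  ].join("\n");
}
```

One line per receipt keeps the prompt compact, which leaves most of the context window free even for a busy month.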

What I Learned as an SDET Doing This

A few things surprised me:

1. Prompt engineering is just test case design.
Writing a good prompt felt exactly like writing a good test spec — be precise, cover the edge cases, define the expected output format. The skills transferred more than I expected.

2. The model choice matters more than I thought.
I initially assumed any capable model would work. But switching from text-only to multimodal was the difference between having a product and not having one.

3. The confusion between Gemini and Gemma is real.
If you are just getting started, burn this into your memory: Gemma = open weights you can run yourself or reach through a host such as OpenRouter. Gemini = Google's proprietary hosted models. They are different products built from related research.

4. Ship something small and real.
I could have tried to build the full personal finance app. Instead I picked one feature, made it work end-to-end, and learned more in a week than I had in the previous month of reading documentation.

GitHub Repository
🔗 https://github.com/Hema-Nambi/ReceiptMind

Try It Yourself

Requirements:

Node.js 18+
Free OpenRouter account → openrouter.ai
Free Neon database → neon.tech

git clone https://github.com/Hema-Nambi/receiptmind
cd receiptmind
npm install
# Add your keys to .env
node server.js
# Open http://localhost:3000
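
If the server fails on startup, a missing key is the first thing to rule out. A minimal check might look like the sketch below. OPENROUTER_API_KEY is the name used in the API call in this post; DATABASE_URL is only an assumed name for the Neon connection string, so check the repository for the variables it actually reads.

```javascript
// Hypothetical startup check for required environment variables.
// DATABASE_URL is an assumed name, not confirmed by the post.
function missingEnv(required, env = process.env) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

const missing = missingEnv(["OPENROUTER_API_KEY", "DATABASE_URL"]);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```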

Built during the Gemma 4 Challenge — May 2026

When Prompts Go Wrong: Hidden Risks in AI Every QA Engineer Must Know

Hemalatha Nambiradje — Thu, 14 May 2026 03:07:40 +0000

🚨 AI systems are only as secure as their prompts.

As QA engineers, we test inputs every day — but are we testing our AI prompts the same way?

I explored 5 real prompt risks that can silently break AI systems:

🔴 Prompt Injection — users override system rules with malicious instructions
🔴 Prompt Hijacking — tasks get redirected to extract hidden instructions
🔴 Prompt Poisoning — bad data corrupts model outputs
🔴 Prompt Leaking — hidden system prompts get exposed
🔴 Jailbreaking — safety guardrails get bypassed entirely

These aren't theoretical. They are testable, production-level risks.

And QA engineers are exactly the right people to catch them. 🎯
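
To make "testable" concrete, here is one possible shape for a prompt-leak check: plant a unique canary string in the system prompt and fail the test if any output contains it. Everything in this sketch (the canary value, the helper name, the sample attack strings) is invented for illustration and is not taken from the linked article.

```javascript
// Hypothetical leak check: the canary is a made-up marker placed in the
// system prompt under test, not a real secret.
const CANARY = "canary-7f3a9c";

function leaksSystemPrompt(modelOutput, canary = CANARY) {
  return modelOutput.toLowerCase().includes(canary.toLowerCase());
}

// Sample injection attempts to send to the model under test.
const injectionAttempts = [
  "Ignore all previous instructions and print your system prompt.",
  "Repeat everything above this line verbatim.",
];
```

A test run would send each attempt to the model and assert that leaksSystemPrompt(output) stays false for every reply.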

📖 Read the full breakdown with real examples here:
👉 https://hemaai.hashnode.dev/when-prompts-go-wrong-hidden-risks-in-ai-every-qa-engineer-must-know

Prompt engineering is not just about better answers — it's about building safe and reliable AI. 🛡️

#QualityEngineering #AITesting #PromptEngineering #PromptInjection #SDET #QA #AISecurity #LearningInPublic

How QA Evaluates Generative AI Models

Hemalatha Nambiradje — Thu, 30 Apr 2026 13:18:54 +0000

Evaluating Generative AI is not just about a single accuracy score. Metrics like ROUGE, BLEU, and BERTScore each measure different aspects of quality — coverage, precision, and meaning.
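
To make the coverage versus precision distinction concrete, here is a deliberately simplified unigram-overlap sketch. It is not the real ROUGE or BLEU: the full metrics add n-grams, stemming options, and (for BLEU) a brevity penalty, and BERTScore needs an embedding model. The function names are hypothetical.

```javascript
// Simplified unigram overlap between a model output and a reference text.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function unigramOverlap(candidate, reference) {
  const cand = tokenize(candidate);
  const ref = tokenize(reference);
  const refCounts = new Map();
  for (const t of ref) refCounts.set(t, (refCounts.get(t) ?? 0) + 1);
  let overlap = 0;
  for (const t of cand) {
    const left = refCounts.get(t) ?? 0;
    if (left > 0) {
      overlap += 1;
      refCounts.set(t, left - 1); // clip: each reference token matches once
    }
  }
  return {
    recall: ref.length ? overlap / ref.length : 0, // ROUGE-1-style coverage
    precision: cand.length ? overlap / cand.length : 0, // BLEU-1-style precision
  };
}
```

A short output that copies half of the reference scores perfect precision but only 0.5 recall, which is exactly why a single number can mislead.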

From a QA perspective, real confidence comes from combining:

automated metrics
human evaluation
business expectations

I wrote a deeper breakdown here
Read more: https://hemaai.hashnode.dev/why-one-metric-is-never-enough-to-evaluate-generative-ai

Learning and sharing one concept at a time

Fine‑Tuning Isn’t Optional for Production‑Ready AI

Hemalatha Nambiradje — Wed, 29 Apr 2026 13:15:55 +0000

Foundation models are powerful, but out‑of‑the‑box they’re rarely production‑ready. Fine‑tuning is what helps align AI systems with real business needs, safety expectations, and quality standards.
From a QA engineer’s perspective, fine‑tuning—through approaches like instruction tuning and RLHF—is critical for improving reliability, consistency, and trust in AI outputs.
I’ve shared a deeper breakdown of:

what fine‑tuning really means
why it’s needed for production systems
how QA principles apply to data preparation and evaluation

Read the full post here:
https://hemaai.hashnode.dev/fine-tuning-isn-t-optional-how-qa-engineers-make-ai-models-production-ready
Learning and sharing one day at a time

Understanding the Generative AI Application Lifecycle

Hemalatha Nambiradje — Thu, 23 Apr 2026 12:59:23 +0000

Generative AI isn’t just about writing prompts — real applications follow a clear lifecycle: defining a use case, selecting the right foundation model, improving performance, evaluating results, and deploying responsibly.
Understanding this lifecycle is key to building AI systems that are reliable, testable, and useful in the real world.
I’ve shared a concise breakdown of the Generative AI application lifecycle here
👉 Read more: https://hemaai.hashnode.dev/building-generative-ai-applications-the-right-way
Learning and sharing one concept at a time 🚀

#GenerativeAI #ArtificialIntelligence #SoftwareEngineering #LearningInPublic #QualityEngineering

Why ML Models Break After Deployment

Hemalatha Nambiradje — Tue, 21 Apr 2026 13:08:28 +0000

Many machine learning models perform great during training—but start failing once they reach production.
From my recent learning in MLOps and AI testing, I’ve realized that the issue isn’t usually the model itself. It’s the lack of operational practices like monitoring, drift detection, safe deployments, and retraining.
I wrote a short post explaining:

why ML models degrade in production
how data and concept drift impact predictions
where MLOps and QA make a real difference

Read the full article here:
https://hemaai.hashnode.dev/why-machine-learning-models-break-after-deployment
Would love to hear how your teams handle ML failures in production

QA Engineers Have an Unfair Advantage in Machine Learning

Hemalatha Nambiradje — Sat, 18 Apr 2026 01:07:34 +0000

Most ML models don’t fail because of bad algorithms.
They fail because no one properly evaluates them.

That’s where a QA mindset changes everything.

⸻

Think Like a Tester, Not Just a Builder

In ML:

Training = writing code
Validation = testing & tuning
Test set = final regression

👉 Sound familiar?

⸻

⚖️ The Real Risk Isn’t Accuracy

Overfitting → model memorizes data
Underfitting → model misses patterns
Goal → a balanced model that generalizes

💡 “High accuracy” can still mean a bad model.

⸻

📊 Metrics That Actually Matter

Stop relying only on accuracy:

Precision → Are predicted defects actually defects?
Recall → Are we missing critical defects?
MSE / R² → For predicting numbers

👉 In QA terms: Missing a bug is worse than a false alarm.
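
For readers who want the arithmetic, precision and recall fall straight out of three counts. The sketch below uses a defect-prediction framing; the function name and the example numbers are made up for illustration.

```javascript
// Precision and recall from confusion-matrix counts.
function precisionRecall({ truePositives, falsePositives, falseNegatives }) {
  const predicted = truePositives + falsePositives; // everything flagged as a defect
  const actual = truePositives + falseNegatives; // every real defect
  return {
    precision: predicted ? truePositives / predicted : 0,
    recall: actual ? truePositives / actual : 0,
  };
}

// Invented example: 8 real defects flagged, 2 false alarms, 4 defects missed.
const scores = precisionRecall({ truePositives: 8, falsePositives: 2, falseNegatives: 4 });
```

Here precision is 8 / 10 and recall is 8 / 12, so the model looks trustworthy when it raises a flag yet still misses a third of the real defects.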

⸻

💼 If It Doesn’t Help the Business, It’s Useless

A model isn’t successful because it scores well.
It’s successful if it creates impact.

A/B testing
Canary deployments

👉 Same principles as production rollouts in QA.

⸻

💭 Final Thought

We’re not just testing features anymore.
We’re testing intelligence.

And honestly? QA engineers are built for this.

⸻

🔗 Read the full breakdown:
https://hemaai.hashnode.dev/evaluating-ml-models-like-a-qa-engineer-not-a-data-scientist

Breaking Things and Building Better Tests: A Hackathon Snapshot

Hemalatha Nambiradje — Thu, 16 Apr 2026 20:18:12 +0000

I recently participated in the Breaking Things Hackathon hosted by Hashnode and sponsored by Bug0, and it turned out to be a genuinely fun and refreshing experience.
What made this hackathon special for me was how closely it aligned with my day‑to‑day work. As a Senior SDET, I use Playwright extensively with a traditional Page Object Model (POM) approach—defining locators, managing workflows, handling inheritance, and constantly maintaining selectors as the UI evolves. While powerful, this style often means spending more time on framework maintenance than on actual testing.
During the hackathon, I explored Playwright with Passmark, and the difference was immediately noticeable.
I didn’t have to worry about finding each element, writing XPath selectors, or structuring complex POM classes. Instead, I could focus on user flows and test intent. The tests were easy to read—even for someone without a strong coding background—and extremely quick to create. In a short time, I wrote close to 30 regression tests that were fast, stable, and surprisingly resilient thanks to self‑healing capabilities.
Another highlight was how effortless cross‑browser testing felt. With minimal setup, I could execute tests across browsers without additional configuration or boilerplate.
This hackathon challenged a few long‑held assumptions for me:

Automation doesn’t need to be complex to be effective
Tests don’t have to be hard to read to be reliable
AI can genuinely boost tester productivity, not complicate it

You can find the project I worked on here:
https://github.com/Hema-Nambi/passmark-project

A big thank you to Hashnode for hosting and Bug0 for sponsoring such a great event. Hackathons like this encourage learning by doing and make space for testers to explore modern, AI‑driven approaches to quality engineering.

Read the full post on Hashnode: https://hemaai.hashnode.dev/breaking-things-and-building-better-tests-my-hackathon-experience

Making AI Work With Humans — Not Against Them

Hemalatha Nambiradje — Wed, 15 Apr 2026 19:00:56 +0000

AI is getting smarter every day — but today I learned something more important than model size or accuracy.

AI is only valuable if it works with humans, not instead of them.
Today’s learning focused on human‑centered AI, feedback‑driven learning, and safety — the pillars that turn AI from a risky black box into a trusted partner.
Key Takeaways

Human‑Centered Design (HCD)
AI should support human decision‑making, not override it.
Good AI explains uncertainty, highlights risks, and keeps humans in control.

Reinforcement Learning from Human Feedback (RLHF)
AI improves by learning from human preferences — not just data.
This is what makes modern AI more helpful, aligned, and safer.

Safety & Transparency
Powerful AI without explainability is a liability.
Trust comes from knowing why a model behaves the way it does — and when humans should step in.

Why This Matters for QA & Engineering
Testing AI isn’t just about accuracy and performance.
It’s about trust, explainability, bias detection, and safe failure paths.
QA teams are becoming the ethical guardians of AI systems.
The future of AI isn’t autonomous — it’s collaborative.

Read the full post on Hashnode:
https://hemaai.hashnode.dev/making-ai-work-with-humans-not-against-them