DEV Community

Tariq Davis

I Built an IoT Forensic Investigation Simulator Powered by Gemma 4 — Paste Any Incident, Get a Full Case with Evidence, Decisions, and a Forensic Report

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4


What I Built

I built Threat Trace — an IoT forensic investigation simulator that takes any incident scenario and turns it into a playable 6-stage investigation, powered by Gemma 4 31B.

You describe an incident. Gemma reads it and generates a complete forensic case — real evidence, real decision points, real consequences. Every stage puts you in front of a choice. Wrong calls contaminate your evidence or break chain of custody. At the end you get a downloadable forensic report you can actually use.

This isn't a quiz. It's a training tool built for people who don't have access to expensive forensic labs — specifically designed around the Caribbean institutional context: small IT teams, limited hardware, JCF Cybercrime Unit reporting requirements, and the kinds of incidents that actually happen here.

Threat Trace landing screen showing case title, score tracker, integrity meter, and the Mandeville Maternity Pressure Leak incident summary


Demo

🎮 Play Threat Trace →

Try the pre-loaded scenarios or paste your own IoT incident. Every input generates a unique case.


Code

🔗 github.com/FlowArchitect895/Threat-Trace


How I Used Gemma 4

I chose Gemma 4 31B Dense. Here's why that was the only real option for this project.

The Architecture: Front-load everything

Gemma does all the heavy work once — at generation. When you submit a scenario, one API call produces the entire investigation: all 6 stages, every choice, every consequence, every narrative, and the final report. During actual gameplay there are zero API calls. Every response is instant because it's already been computed.

This makes free 24/7 deployment viable. Cached scenarios cost nothing to replay.
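The front-loaded pattern described above can be sketched as a small cache wrapped around a single generation call. This is a minimal illustration, not the code in the repository: `CaseCache`, `scenario_key`, and `fake_generate` are hypothetical names, and `fake_generate` stands in for the one real call to the model.

```python
import hashlib
from typing import Callable


def scenario_key(scenario: str) -> str:
    """Stable cache key: SHA-256 of the whitespace-normalised, lower-cased text."""
    normalised = " ".join(scenario.split()).lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class CaseCache:
    """Generates a full case once per scenario and replays it from memory."""

    def __init__(self, generate: Callable[[str], dict]) -> None:
        self._generate = generate          # the single expensive model call
        self._cases: dict[str, dict] = {}
        self.generation_calls = 0          # counts how often the model was hit

    def get_case(self, scenario: str) -> dict:
        key = scenario_key(scenario)
        if key not in self._cases:
            self.generation_calls += 1
            self._cases[key] = self._generate(scenario)
        return self._cases[key]


def fake_generate(scenario: str) -> dict:
    # Assumption: placeholder for the real model request, which would return
    # all six stages, choices, consequences, and the report in one response.
    return {"title": scenario[:40], "stages": [{"stage": n} for n in range(1, 7)]}
```

Once `get_case` has run for a scenario, gameplay only reads from the returned dictionary, which is why replaying a cached scenario needs no further model calls.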

Stage 1 of 6 — Identification. The hospital IT team is alerted by nursing staff. Evidence: NetFlow logs showing bursts of 512KB packets to a non-NWC IP address in Eastern Europe. Three choices presented to the investigator.

Why 31B Dense specifically

256K context window. IoT incidents don't happen in isolation: they involve logs from multiple devices, network captures, firmware dumps, and infrastructure context. I needed a model that could hold an entire incident in one prompt and reason across all of it. Few open models I could realistically deploy offered a window that large.

Structured output reliability. Gemma 4's thinking mode produces clean, parseable JSON when you force the output format correctly. Without it, roughly half of the responses failed to parse; with it, nearly all of them parsed. The game state depends on predictably structured output, so this wasn't optional.

Open model. The whole point of this build is accessibility. Running on a closed API defeats that for the communities this tool is meant to serve.
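To make the structured-output point concrete, here is a hedged sketch of defensive parsing: it pulls the first valid JSON object out of a response that may also contain reasoning text, then checks the shape the game needs. The field names `stages` and `choices` are assumptions for illustration, not the project's actual schema.

```python
import json


def extract_json(raw: str) -> dict:
    """Return the first JSON object that parses, scanning from each '{'."""
    decoder = json.JSONDecoder()
    idx = raw.find("{")
    while idx != -1:
        try:
            obj, _end = decoder.raw_decode(raw, idx)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            idx = raw.find("{", idx + 1)
    raise ValueError("no parseable JSON object in model output")


def validate_case(case: dict, expected_stages: int = 6) -> dict:
    """Reject cases the game could not play (assumed schema)."""
    stages = case.get("stages")
    if not isinstance(stages, list) or len(stages) != expected_stages:
        raise ValueError(f"expected {expected_stages} stages")
    for number, stage in enumerate(stages, start=1):
        if not isinstance(stage, dict) or not stage.get("choices"):
            raise ValueError(f"stage {number} has no choices")
    return case
```

A failed validation is a natural place to retry generation rather than hand a broken case to the player.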

What Gemma actually does

The depth surprised me. I threw this at it:

"A smart water pressure sensor at a children's hospital in Mandeville began sending encrypted packets during maintenance windows — always 1AM-3AM, never consecutive nights. A nurse in the maternity ward noticed hot water pressure dropped every time the anomaly occurred."

Gemma mapped it to T1041 (Exfiltration Over C2 Channel) and built the case around an Industrial IoT water pressure sensor. But it didn't stop at the attack technique. In the generated case, the pressure drop traces to a function called trigger_valve_bleed(), executed immediately before send_encrypted_payload(). The physical action preceded the data transmission. Gemma's reading: the attacker was using valve actuation as an out-of-band heartbeat to confirm successful exfiltration to a local observer.

It also caught that the non-consecutive timing wasn't random — it was deliberate evasion of basic threshold monitors. And it flagged the exfiltrated payload as internal VLAN mapping data, identifying the sensor as a pivot point for lateral movement, not the final target.

A nurse found the breach. Not IT. Gemma put that in the analysis without being told to.

That's adversarial reasoning. Not pattern matching.
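The timing observation is easy to illustrate. The sketch below compares a naive monitor that only alerts on consecutive nights with one that looks for recurrence inside the 1AM-3AM window. The function names, thresholds, and timestamps are all hypothetical; none of them come from the generated case.

```python
from datetime import date, datetime, timedelta


def anomaly_nights(events: list[datetime], start_hour: int = 1, end_hour: int = 3) -> list[date]:
    """Distinct dates with at least one event inside the window."""
    return sorted({ts.date() for ts in events if start_hour <= ts.hour < end_hour})


def consecutive_night_alert(events: list[datetime], run_length: int = 2) -> bool:
    """Naive threshold monitor: fires only on run_length nights in a row."""
    nights = anomaly_nights(events)
    run = 1
    for prev, cur in zip(nights, nights[1:]):
        run = run + 1 if cur - prev == timedelta(days=1) else 1
        if run >= run_length:
            return True
    return False


def recurring_window_alert(events: list[datetime], min_nights: int = 3, span_days: int = 14) -> bool:
    """Fires when the window sees anomalies on min_nights nights within span_days."""
    nights = anomaly_nights(events)
    for i in range(len(nights) - min_nights + 1):
        if (nights[i + min_nights - 1] - nights[i]).days < span_days:
            return True
    return False


# Hypothetical timestamps: always 1AM-3AM, never on consecutive nights.
events = [
    datetime(2025, 3, 2, 1, 30),
    datetime(2025, 3, 4, 2, 10),
    datetime(2025, 3, 7, 1, 5),
    datetime(2025, 3, 9, 2, 45),
]
```

With these sample events the consecutive-night monitor stays silent while the recurrence check fires, which is the evasion pattern the analysis describes.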

Correct answer feedback — Stage 1. Implementing a mirrored SPAN port highlighted in green. Feedback reads:

Incorrect answer feedback — Stage 2.

The Caribbean context layer

This is what makes it different from a generic forensic simulator. Every generated case includes:

  • Resource constraints realistic to Jamaican institutions — the Mandeville case used university tools to compensate for limited forensic hardware budgets
  • JCF Cybercrime Unit and Jamaica Cybercrime Act reporting requirements
  • Third-party utility dependencies — NWC managing critical hospital infrastructure with no on-site visits in 8 months
  • The human observation chain — a nurse, not a security system, was the first line of detection

That last point is the most Caribbean thing in the report. And Gemma put it there without being told to.

The downloadable report reflects all of this — not a game score, an actual forensic document.

Forensic Investigation Report — top section. Shows incident summary, attack technique T1041, and device type: Industrial IoT Water Pressure Sensor.

Forensic Investigation Report — findings and Caribbean context. Documents the physical heartbeat mechanism, lateral movement campaign, and regional infrastructure challenges including NWC dependency and university tool partnerships. Score: 40/60 | Evidence Integrity: 70%


Judging criteria mapped

Intentional model selection — 256K context for multi-artifact IoT scenarios, structured JSON for game state reliability, open model for the communities this serves.

Technical implementation — front-loaded generation, zero-cost runtime, structured output parsing, deterministic scoring with pre-computed consequences.

Creativity — IoT forensics as an interactive investigation with real physical consequence. A water pressure drop in a maternity ward as the first indicator of compromise. Caribbean context as a first-class feature, not an afterthought.

Usability — three preloaded scenarios, open input for any incident, downloadable forensic report with real methodology output that investigators can actually use.
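For readers curious what "deterministic scoring with pre-computed consequences" can look like, here is a minimal sketch. It assumes each generated choice already carries its points, integrity penalty, and feedback, so replaying a run is pure bookkeeping. The `Choice` fields, the point values, and the "Reboot the sensor" option are assumptions chosen for illustration, not the project's real schema or content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    label: str
    points: int              # assumed: awarded when this choice is picked
    integrity_penalty: int   # assumed: percentage points of evidence integrity lost
    feedback: str            # generated up front, shown after the pick


def replay(stages: list[list[Choice]], picks: list[int]) -> dict:
    """Score a full run from pre-computed choices; no model calls involved."""
    if len(picks) != len(stages):
        raise ValueError("one pick per stage is required")
    score, integrity, feedback = 0, 100, []
    for options, pick in zip(stages, picks):
        choice = options[pick]
        score += choice.points
        integrity = max(0, integrity - choice.integrity_penalty)
        feedback.append(choice.feedback)
    max_score = sum(max(c.points for c in options) for options in stages)
    return {"score": score, "max_score": max_score,
            "integrity": integrity, "feedback": feedback}


# Hypothetical values: 10 points per correct call, 15% integrity lost per bad one.
GOOD = Choice("Implement a mirrored SPAN port", 10, 0, "Evidence preserved.")
BAD = Choice("Reboot the sensor", 0, 15, "Volatile evidence lost.")
stages = [[GOOD, BAD] for _ in range(6)]
result = replay(stages, [0, 0, 1, 0, 1, 0])  # four good calls, two bad ones
```

Because every consequence is fixed at generation time, the same picks always produce the same score and integrity figure.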
