Nick Switzer

Posted on May 24

Vessel Ops

#devchallenge #gemmachallenge #gemma

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What We Built

"When a crew member is injured 200 miles offshore, there is no internet, no doctor, and no second opinion. This application is the second opinion."

We built Vessel Ops AI, an offline-first desktop application that serves as a lifeline for the Medical Person in Charge (MPIC) and Chief Engineer on vessels operating hundreds of miles from shore.

When a crew member is injured mid-ocean, the MPIC usually has to flip through a 400-page medical manual under immense pressure. When a critical component fails, the Chief Engineer troubleshoots alone without an OEM support hotline or even a simple Google search.

Vessel Ops AI puts Gemma 4 on the laptop in the wheelhouse. It's grounded in the WHO International Medical Guide for Ships, cites specific pages for every piece of advice, and requires zero connectivity at sea.

Key Features

🏥 RAG-grounded medical triage — every response cites [WHO IMGS, p. XX] from an embedded 938-chunk knowledge base
⚙️ Engineering fault diagnosis — component tracking, maintenance logs, and AI-assisted troubleshooting
📸 Multimodal analysis — upload a photo of an injury or a failing part and Gemma classifies it on-device
🎓 MPIC Study Mode — an interactive medical examiner that scores crew responses 1–10 against established protocols
🧠 Captain Sparky Trivia — morale-boosting maritime trivia to keep crew engaged during long watches
🔄 Offline sync queue — every write accumulates locally and pushes to Firebase Firestore when the vessel reaches port

Demo

🌐 Live hosted demo (no install required):
https://vessel-ops.web.app
(Cloud preview mode — Gemma 4 26B MoE via Google AI Studio. A yellow banner indicates cloud mode.)

(The 3-minute pitch is embedded at the top of this post! Looking for a deeper dive? Check out the Extended Technical Demo)

💻 Download the desktop app:
GitHub Releases — Windows installer (75 MB, no admin required) or portable .zip

Code

switzloco / sail_pal

A Gemma small language model tool to help sailors with medical, maintenance and other needs when they don't have reliable internet access in the middle of the ocean

Vessel Ops AI

"When a crew member is injured 200 miles offshore, there is no internet, no doctor and no second opinion. This application is the second opinion."

An offline-first AI assistant for the Medical Person in Charge (MPIC) and Chief Engineer on vessels operating in deep-water environments. Runs entirely on a laptop — no cloud, no connectivity required at sea. Powered by Gemma 4 via Ollama.

Hackathon Context

Built for the Gemma 4 Good Hackathon (Kaggle × Google DeepMind, due May 18, 2026).

Prize targets:

Ollama Prize ($10k) — best project using Gemma 4 via Ollama
Global Resilience Prize ($10k) — offline disaster/emergency response
Health & Sciences Prize ($10k) — medical decision support
Cactus Prize ($10k) — local-first app with intelligent model routing

Deployment

Laptop (MacBook, Windows, Linux)
  └── Ollama  →  gemma4:e2b (general / engine / maintenance / trivia)
              +  nswitzer/gemma4-maritime-medical-GGUF (Unsloth fine-tune
                 used automatically for medical routes)

…

View on GitHub

How We Used Gemma 4

Which Model — and Why

Vessel Ops AI doesn't just "use Gemma" — Gemma is the core intelligence, and the model selection is driven entirely by the deployment constraint: a sailor's laptop with no internet.

The Gemma 4 family gave us three options:

Model	Architecture	Total Params	Active Params	Our Verdict
E2B	Dense	~2B	2B	✅ Default edge model — fits 8–16 GB RAM laptops
E4B	Dense	~4B	4B	✅ Scale-up model — stronger reasoning on 32 GB+ hardware
26B MoE	Mixture-of-Experts	~26B	~4B	✅ Cloud preview — used for the hosted demo via AI Studio
31B	Dense	~31B	31B	❌ Too heavy for shipboard hardware

Why we chose the E2B/E4B small models for edge deployment:

Maritime laptops are typically mid-range machines such as a 2–3 year old ThinkPad with 8–16 GB RAM and no discrete GPU. The 26B MoE requires ~16–24 GB VRAM even at Q4 quantization; the 31B dense needs 24 GB+. Neither fits the hardware reality at sea. The E2B and E4B models were purpose-built for exactly this constraint — ultra-mobile, edge deployment on devices like mid-range laptops.

Our system implements intelligent model routing based on clinical severity:

# backend/ai/ollama_client.py
def _pick_model(self, severity: str) -> str:
    if severity in ("critical", "serious"):
        return self.model_scale    # gemma4:e4b — deeper reasoning
    return self.model_primary      # gemma4:e2b — fast, fits any laptop

Minor queries (crew vitals logging, routine maintenance) hit the fast E2B model. Critical medical events and serious engineering faults escalate to E4B for deeper reasoning — if the hardware supports it. This intelligent routing perfectly fits a local-first architecture pattern where the routing decision is quite literally life-or-death.

Why the 26B MoE for the cloud demo:

The hosted preview at vessel-ops.web.app uses gemma-4-26b-a4b-it via Google AI Studio. The MoE architecture is ideal here — it delivers the intelligence of a much larger model while only computing ~4B parameters per token, keeping latency low for the demo. Crucially, we disabled Google Search grounding so the cloud preview doesn't look artificially smarter than what a crew member would experience offline with the E2B/E4B models.

How Gemma Does Real Work

Gemma isn't a chat wrapper in this project. It's doing three distinct kinds of real work:

1. RAG-grounded medical triage
Every query — regardless of which screen the user is on — automatically retrieves the top BM25-ranked passages from a 938-chunk SQLite FTS5 index of the WHO International Medical Guide for Ships (3rd Edition). These passages are injected into the system prompt with explicit citation markers, and Gemma is instructed to cite [Source, p. XX] on every claim. Chris loves this page / source feature so he can dig deeper on his own.

This dramatically reduces hallucination on dosages and protocols — exactly the kind of mistake that could kill/maim someone at sea.

2. Multimodal injury & component analysis
When a crew member uploads a photo of a wound or a failing engine part, Gemma's vision capability classifies the observation. The classification feeds back into the RAG step as additional query context, so the retrieved protocols are relevant to what the camera actually sees.

3. Domain-specific personas
Different code paths apply different system prompts — medical chat, engine chat, MPIC study mode, trivia — all running on the same Gemma model. The MPIC study mode is particularly compelling: it acts as an interactive medical examiner that presents emergency scenarios and scores the crew member's response 1–10 against WHO protocols.

The FTS5 Decision (Killing the Embedding Model)

This deserves its own section because it was the single most impactful engineering decision in the project.

We started with ChromaDB + sentence-transformers for RAG. The embedding model alone added ~1 GB to the installer. For a vessel deployment where every megabyte matters and the installer ships on a USB stick, this was a dealbreaker.

We replaced the entire RAG layer with SQLite FTS5 — which is built into stock Python — using BM25 ranking and a porter stemmer. The 938 WHO IMGS chunks ship as a 1.3 MB JSON that bootstraps into the FTS5 virtual table on first launch. Bundle size dropped ~99% with no measurable loss in retrieval quality on this corpus.

Is BM25 perfect? No. Lexical search struggles with synonyms — if a sailor types "cut," BM25 might miss the WHO manual's section on "lacerations." Dense retrieval (embeddings) normally bridges this semantic gap, but we refused to pay the 1 GB size penalty.

Instead, we implemented LLM-Based Query Expansion. When a sailor submits a medical query, we do a rapid, invisible pass through the fast gemma4:e2b model: "Rewrite this medical symptom into formal clinical terminology for a manual search." Gemma translates "cut" to "laceration, hemorrhage," and we pass the expanded query to the FTS5 index. We get the intelligence of semantic search with the footprint of lexical search.

Architecture

Laptop at Sea (no internet)
  ├── Ollama  →  gemma4:e2b (default) / gemma4:e4b (32GB+ RAM)
  ├── FastAPI backend  →  SQLite WAL + FTS5 RAG index
  ├── Next.js 14 static export  →  Tauri 2 desktop wrapper
  └── Any device on ship LAN  →  http://<laptop-ip>:3000

Cloud Preview (for judges / demos)
  ├── Google AI Studio  →  gemma-4-26b-a4b-it (MoE)
  ├── Cloud Run  →  FastAPI backend (stateless, auto-seeds demo data)
  └── Firebase Hosting  →  https://vessel-ops.web.app

Challenges we Overcame

Running frontier AI on "potato" hardware: Many vessels run 3+ year old laptops. Gemma 4 E2B was chosen specifically for its capability-to-size ratio — responses generate fast enough to be useful in an emergency on hardware that wouldn't fit a 12B+ model in RAM alongside the application.

Deployment for non-technical crews: Installing Python, Node, and Ollama is beyond a typical crew member. We wrap everything — Python interpreter, dependencies, FTS5 index, WHO PDF — into a single NSIS installer that runs without admin rights.

PyInstaller orphan processes: The PyInstaller --onefile bootloader on Windows doesn't propagate termination to its child process. We reap orphan vessel-ops-backend.exe processes on every launch with a 500ms settle — without this, the second launch silently fails because port 8000 is still bound.

Team

This project started with Capt. Chris Oprzadek — eight years as a Navy Nuclear Submarine Officer out of Hawaii and Guam, two Atlantic crossings, and now heading out with Seamester to teach offshore sailing. The brief came directly from him: "we need a second opinion that works when there's no satellite."

Our medical advisor Dr. Michael Switzer — a physician who refits and maintains his own boat out of Port Townsend, WA — documents the practitioner-meets-boat-owner reality in his YouTube channel, where every onboard repair is also a reminder that out here you fix it yourself.

Built by Nick Switzer. Between those three perspectives — deep-water passages, self-reliant boat life, and software — Vessel Ops AI's design brief wrote itself.

Vessel Ops AI is open source under GPL-3.0. The WHO IMGS content is used under fair-use for safety-critical, non-commercial maritime applications.

DEV Community