This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
Claim review is not a chatbot problem. It is an evidence problem.
ClaimSetu was built to answer one question safely: does the submitted hospital claim packet contain enough evidence for a human reviewer to move forward?
ClaimSetu is a local-first, evidence-backed claim-review assistant for health insurance workflows. It reads messy claim packets — scanned PDFs, photos, discharge notes, bills, lab reports, procedure records, and clinical notes — and turns them into a reviewer-ready evidence pack.
It produces:
- document classification and page triage
- extracted claim fields with provenance
- admission, diagnosis, treatment, and discharge timeline checks
- missing-document and weak-evidence flags
- package-rule findings
- a PASS / CONDITIONAL / REVIEW recommendation for human review
In health insurance workflows, a delayed or unclear claim decision is not just an operational issue. It can create back-and-forth between hospitals, payers, and beneficiaries. ClaimSetu focuses on making review faster, more consistent, and more explainable without removing the human decision-maker.
The core principle is simple:
OCR reads. Gemma reasons. Humans decide.
ClaimSetu is not an autonomous adjudicator. If evidence is missing, weak, or contradictory, it does not guess. It escalates the claim to CONDITIONAL or REVIEW with source-backed reasons.
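The escalation behavior can be pictured with a small sketch. This is an illustration only, not the code in the repository: the `Finding` type, its `kind` labels, and the rule that contradictions go to REVIEW while missing or weak evidence goes to CONDITIONAL are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical finding kinds; the real pipeline may use different labels.
MISSING, WEAK, CONTRADICTION = "missing", "weak", "contradiction"


@dataclass(frozen=True)
class Finding:
    kind: str     # one of MISSING, WEAK, CONTRADICTION
    reason: str   # reviewer-facing explanation
    source: str   # where the evidence (or the gap) was found, e.g. "page 3"


def recommend(findings: list[Finding]) -> tuple[str, list[str]]:
    """Return a recommendation plus source-backed reasons; never guess past a gap."""
    reasons = [f"{f.reason} ({f.source})" for f in findings]
    kinds = {f.kind for f in findings}
    if CONTRADICTION in kinds:
        return "REVIEW", reasons        # conflicting evidence needs a full human look
    if kinds & {MISSING, WEAK}:
        return "CONDITIONAL", reasons   # usable packet, but with named gaps
    return "PASS", reasons              # still only a recommendation for the reviewer
```

The point of the shape is that every non-PASS outcome carries its reasons with it, so the reviewer sees why the claim was escalated rather than just a label.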
Demo
Video walkthrough: https://www.youtube.com/watch?v=pygwfJl8b5M
The demo shows ClaimSetu reviewing a severe anemia claim packet. The system identifies useful admission evidence, diagnostic evidence, and clinical notes, but flags missing treatment details, post-treatment evidence, and discharge summary evidence.
Instead of forcing an approval, ClaimSetu returns a CONDITIONAL recommendation with reviewer-facing reasons and evidence gaps.
That is the behavior I wanted: not a confident black box, but a cautious co-pilot for human reviewers.
Code
GitHub repository: https://github.com/ai-suraksha/claimsetu
The repository includes:
- the local claim-review pipeline in claimsAssistant.py
- a FastAPI demo app in app.py
- an interactive browser demo in demo/index.html
- architecture and design assets
- setup instructions for running Gemma 4 locally through Ollama
The public repository does not redistribute real claim packets, patient identifiers, hospital names, doctor names, or private annotations. Raw claim data is expected to stay local and private.
How I Used Gemma 4
ClaimSetu uses Gemma 4 as the understanding and reasoning layer inside a hybrid evidence pipeline.
I intentionally did not use Gemma as a black-box OCR engine. Healthcare claim review needs traceability: source page, extracted text, confidence, and evidence links. So ClaimSetu first uses PaddleOCR and PyTesseract to read documents, then sends the OCR evidence to Gemma 4.
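To make the traceability idea concrete, here is a minimal sketch of how OCR output might be carried into the model prompt with its provenance attached. The `OcrLine` type, the `build_evidence_prompt` helper, and the prompt wording are hypothetical and are not taken from the repository. The OCR engines themselves are not called here; the sketch assumes their output has already been normalized into page, text, and confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrLine:
    page: int          # 1-based page number in the claim packet
    text: str          # text as read by the OCR engine
    confidence: float  # engine confidence, normalized to [0, 1]
    engine: str        # e.g. "paddleocr" or "pytesseract"


def build_evidence_prompt(lines: list[OcrLine], question: str) -> str:
    """Number each OCR line so the model can cite evidence IDs instead of free text."""
    rows = [
        f"[E{i}] page={ln.page} conf={ln.confidence:.2f} engine={ln.engine} :: {ln.text}"
        for i, ln in enumerate(lines, start=1)
    ]
    return (
        "Use only the evidence lines below. Cite evidence IDs for every field.\n"
        + "\n".join(rows)
        + f"\n\nTask: {question}"
    )
```

Because each line has a stable ID, any field the model returns can be traced back to a page and an OCR confidence, and downstream code can confirm that the cited text really exists.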
I used a two-model strategy.
Gemma 4 E4B handles the edge layer:
- cleanup of noisy OCR text
- page triage
- document classification fallback
- structured extraction from messy claim pages
E4B was the right fit because this stage needs to be fast, local, and repeatable across many pages.
Gemma 4 26B MoE handles the reasoning layer:
- claim-level timeline interpretation
- package-rule reasoning
- contradiction detection
- reviewer-facing explanation
- PASS / CONDITIONAL / REVIEW recommendation
26B MoE was the right fit because final claim review needs broader context and stronger reasoning, while still supporting local or local-network inference.
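A rough sketch of routing the two stages to two local models through Ollama's HTTP API follows. The endpoint is Ollama's default local address, and `/api/generate` with `"stream": False` returns a single JSON object whose `response` field holds the text. The model tags are placeholders, since the exact tag names depend on what is installed locally, and the `build_request` and `call_ollama` helpers are hypothetical names for this example.

```python
import json
import urllib.request

# Placeholder model tags: check `ollama list` for the names on your machine.
MODEL_BY_STAGE = {
    "edge": "gemma4-e4b-placeholder",
    "reasoning": "gemma4-26b-moe-placeholder",
}
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(stage: str, prompt: str) -> dict:
    """Pick the model for a pipeline stage and build a non-streaming request body."""
    return {"model": MODEL_BY_STAGE[stage], "prompt": prompt, "stream": False}


def call_ollama(stage: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a locally running Ollama server and return the model's text."""
    body = json.dumps(build_request(stage, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Per-page work would call `call_ollama("edge", ...)` many times, while the claim-level summary would make a single `call_ollama("reasoning", ...)` call over the extracted facts.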
The most important design choice was separating model reasoning from safety enforcement. Gemma interprets messy evidence and explains what the reviewer should verify next. Deterministic code enforces date validation, source-text checks, confidence thresholds, missing-document rules, and timeline consistency.
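The deterministic side of that split can be sketched in a few lines. This is an assumed shape rather than the repository's implementation: the `ExtractedField` type, the 0.80 confidence threshold, the DD/MM/YYYY date format, and the convention that date fields end in `_date` are all hypothetical choices for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

MIN_CONFIDENCE = 0.80  # assumed threshold, not taken from the repository


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    source_text: str   # OCR text of the page the model cited
    confidence: float  # OCR confidence for that source, in [0, 1]


def parse_date(value: str) -> Optional[date]:
    """Accept only DD/MM/YYYY here; anything else counts as invalid."""
    try:
        return datetime.strptime(value, "%d/%m/%Y").date()
    except ValueError:
        return None


def check_field(field: ExtractedField) -> list[str]:
    """Return the reasons a field cannot be trusted; an empty list means it passed."""
    problems = []
    if field.value not in field.source_text:
        problems.append(f"{field.name}: value not found in cited source text")
    if field.confidence < MIN_CONFIDENCE:
        problems.append(f"{field.name}: OCR confidence below {MIN_CONFIDENCE:.2f}")
    if field.name.endswith("_date") and parse_date(field.value) is None:
        problems.append(f"{field.name}: not a valid DD/MM/YYYY date")
    return problems


def check_timeline(admission: date, discharge: date) -> list[str]:
    """Discharge must not precede admission."""
    if discharge < admission:
        return ["timeline: discharge date is before admission date"]
    return []
```

None of these checks involve the model, so a value the model invented fails the source-text check no matter how plausible it looks.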
That separation made ClaimSetu more useful, more auditable, and safer for healthcare workflows.
What I Learned
The biggest lesson from building ClaimSetu was that LLMs are most useful in regulated workflows when they are constrained by evidence.
Gemma 4 was strongest when it had a focused job: structure messy OCR, reason over extracted facts, and explain what the reviewer should verify next.
The system became safer when I stopped asking the model to “decide the claim” and instead designed it to support a human reviewer.
OCR reads. Gemma reasons. Humans decide.
