OCR Reads, Gemma Reasons: ClaimSetu for Evidence-Backed Health Claim Review

Virat Chourasia — Sun, 24 May 2026 14:29:36 +0000

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Claim review is not a chatbot problem. It is an evidence problem.

ClaimSetu was built to answer one question safely: does the submitted hospital claim packet contain enough evidence for a human reviewer to move forward?

ClaimSetu is a local-first, evidence-backed claim-review assistant for health insurance workflows. It reads messy claim packets — scanned PDFs, photos, discharge notes, bills, lab reports, procedure records, and clinical notes — and turns them into a reviewer-ready evidence pack.

It produces:

document classification and page triage
extracted claim fields with provenance
admission, diagnosis, treatment, and discharge timeline checks
missing-document and weak-evidence flags
package-rule findings
a PASS / CONDITIONAL / REVIEW recommendation for human review

In health insurance workflows, a delayed or unclear claim decision is not just an operational issue. It can create back-and-forth between hospitals, payers, and beneficiaries. ClaimSetu focuses on making review faster, more consistent, and more explainable without removing the human decision-maker.

The core principle is simple:

OCR reads. Gemma reasons. Humans decide.

ClaimSetu is not an autonomous adjudicator. If evidence is missing, weak, or contradictory, it does not guess. It escalates the claim to CONDITIONAL or REVIEW with source-backed reasons.
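The escalation rule above can be sketched as a small deterministic function. This is a minimal illustration, not the code in the repository: the `Finding` type, the `kind` labels, and the exact mapping (gaps lead to CONDITIONAL, anything else such as a contradiction leads to REVIEW) are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Recommendation(Enum):
    PASS = "PASS"
    CONDITIONAL = "CONDITIONAL"
    REVIEW = "REVIEW"


@dataclass(frozen=True)
class Finding:
    # Hypothetical labels for this sketch: "missing", "weak", "contradictory".
    kind: str
    reason: str                 # reviewer-facing explanation
    page: Optional[int] = None  # source page backing the finding, if known


# Assumption: evidence gaps are recoverable, so they map to CONDITIONAL.
GAP_KINDS = {"missing", "weak"}


def recommend(findings: List[Finding]) -> Tuple[Recommendation, List[str]]:
    """Escalate instead of guessing: any finding moves the claim off PASS."""
    reasons = []
    for f in findings:
        suffix = f" (page {f.page})" if f.page is not None else ""
        reasons.append(f"{f.kind}: {f.reason}{suffix}")

    if not findings:
        return Recommendation.PASS, reasons
    if all(f.kind in GAP_KINDS for f in findings):
        return Recommendation.CONDITIONAL, reasons
    # Contradictions and unrecognized finding kinds go to full human review.
    return Recommendation.REVIEW, reasons
```

Unrecognized finding kinds fall through to REVIEW on purpose, so a new or misspelled label can never silently produce a PASS.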

Demo

Video walkthrough: https://www.youtube.com/watch?v=pygwfJl8b5M

The demo shows ClaimSetu reviewing a severe anemia claim packet. The system identifies useful admission evidence, diagnostic evidence, and clinical notes, but flags three gaps: treatment details, post-treatment evidence, and a discharge summary are all missing.

Instead of forcing an approval, ClaimSetu returns a CONDITIONAL recommendation with reviewer-facing reasons and evidence gaps.

That is the behavior I wanted: not a confident black box, but a cautious co-pilot for human reviewers.

Code

GitHub repository: https://github.com/ai-suraksha/claimsetu

The repository includes:

the local claim-review pipeline in claimsAssistant.py
a FastAPI demo app in app.py
an interactive browser demo in demo/index.html
architecture and design assets
setup instructions for running Gemma 4 locally through Ollama

The public repository does not redistribute real claim packets, patient identifiers, hospital names, doctor names, or private annotations. Raw claim data is expected to stay local and private.

How I Used Gemma 4

ClaimSetu uses Gemma 4 as the understanding and reasoning layer inside a hybrid evidence pipeline.

I intentionally did not use Gemma as a black-box OCR engine. Healthcare claim review needs traceability: source page, extracted text, confidence, and evidence links. So ClaimSetu first uses PaddleOCR and PyTesseract to read documents, then sends the OCR evidence to Gemma 4.
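One way to carry that traceability into the model call is to give every OCR line an evidence ID before it reaches Gemma. The sketch below assumes a hypothetical `OcrLine` record and prompt layout; it does not call PaddleOCR or PyTesseract, and it is not taken from the repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OcrLine:
    page: int          # 1-based source page
    text: str          # text as read by the OCR engine
    confidence: float  # engine-reported score, assumed to be in 0.0-1.0
    engine: str        # which OCR engine produced the line


def build_evidence_prompt(lines: List[OcrLine], question: str) -> str:
    """Render OCR lines with stable evidence IDs so answers can cite them."""
    rows = [
        f"[E{i}] page={ln.page} conf={ln.confidence:.2f} "
        f"engine={ln.engine} | {ln.text}"
        for i, ln in enumerate(lines, start=1)
    ]
    return (
        "Use only the evidence lines below. Cite evidence IDs like [E1].\n"
        "If the evidence is insufficient, say so instead of guessing.\n\n"
        + "\n".join(rows)
        + f"\n\nQuestion: {question}"
    )
```

Because each line keeps its page and confidence, a cited ID such as `[E2]` can be traced back to the exact page and OCR engine that produced it.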

I used a two-model strategy.

Gemma 4 E4B handles the edge layer:

cleanup of noisy OCR text
page triage
document classification fallback
structured extraction from messy claim pages

E4B was the right fit because this stage needs to be fast, local, and repeatable across many pages.

Gemma 4 26B MoE handles the reasoning layer:

claim-level timeline interpretation
package-rule reasoning
contradiction detection
reviewer-facing explanation
PASS / CONDITIONAL / REVIEW recommendation

26B MoE was the right fit because final claim review needs broader context and stronger reasoning, while still supporting local or local-network inference.
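The two-model split can be sketched as a routing table in front of a local Ollama server. The task names and both model tags below are placeholders invented for this example (check `ollama list` for the tags actually installed); only the `/api/generate` endpoint and its `model`, `prompt`, `stream`, and `options` fields come from Ollama's REST API.

```python
import json
import urllib.request

# Placeholder tags, NOT verified model names: replace with the local tags.
EDGE_MODEL = "gemma4-e4b-placeholder"
REASONING_MODEL = "gemma4-26b-moe-placeholder"

# Hypothetical task names mirroring the two lists above.
EDGE_TASKS = {"ocr_cleanup", "page_triage", "doc_classification", "field_extraction"}
REASONING_TASKS = {"timeline", "package_rules", "contradictions", "explanation", "recommendation"}


def pick_model(task: str) -> str:
    """Route per-page work to the small model, claim-level work to the large one."""
    if task in EDGE_TASKS:
        return EDGE_MODEL
    if task in REASONING_TASKS:
        return REASONING_MODEL
    raise ValueError(f"unknown task: {task}")


def build_generate_request(task: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's POST /api/generate."""
    return {
        "model": pick_model(task),
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0},  # favor repeatable output
    }


def send(body: dict, host: str = "http://localhost:11434") -> str:
    """Send the request to a running Ollama server and return the generated text."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Keeping routing separate from the network call means the task-to-model mapping can be unit-tested without a model server running.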

The most important design choice was separating model reasoning from safety enforcement. Gemma interprets messy evidence and explains what the reviewer should verify next. Deterministic code enforces date validation, source-text checks, confidence thresholds, missing-document rules, and timeline consistency.
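A few of those deterministic checks might look like the following. This is a sketch under stated assumptions rather than the repository's code: the `ExtractedField` type, the 0.8 confidence threshold, and the YYYY-MM-DD date format are all chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str         # value proposed by the model
    source_text: str   # OCR text of the page the model cited
    confidence: float


def parse_date(value: str) -> Optional[date]:
    """Parse a YYYY-MM-DD date; return None if it does not parse."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return None


def check_field(field: ExtractedField, min_confidence: float = 0.8) -> List[str]:
    """Source-text and confidence checks that the model cannot override."""
    problems = []
    # Source-text check: the extracted value must literally appear on the page.
    if field.value.lower() not in field.source_text.lower():
        problems.append(f"{field.name}: value not found in source text")
    if field.confidence < min_confidence:
        problems.append(
            f"{field.name}: confidence {field.confidence:.2f} "
            f"below {min_confidence:.2f}"
        )
    return problems


def check_timeline(admission: str, discharge: str) -> List[str]:
    """Date validation plus a basic ordering check."""
    a, d = parse_date(admission), parse_date(discharge)
    if a is None or d is None:
        return ["timeline: admission or discharge date is not a valid date"]
    if d < a:
        return ["timeline: discharge precedes admission"]
    return []
```

Any non-empty result would become a reviewer-facing flag, so a fluent model explanation cannot mask an extracted value that never appears in the source page.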

That separation made ClaimSetu more useful, more auditable, and safer for healthcare workflows.

What I Learned

The biggest lesson from building ClaimSetu was that LLMs are most useful in regulated workflows when they are constrained by evidence.

Gemma 4 was strongest when it had a focused job: structure messy OCR, reason over extracted facts, and explain what the reviewer should verify next.

The system became safer when I stopped asking the model to “decide the claim” and instead designed it to support a human reviewer.