DEV Community

Bharat Bhandari

InsurIQ - Policy Intelligence Engine

GitHub “Finish-Up-A-Thon” Challenge Submission

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

InsurIQ is an agentic platform that acts as an independent insurance expert
for Indian health insurance consumers — helping people actually understand the
policy they own, without depending on a commission-driven salesperson.

India's health insurance documents are brutal. A real Policybazaar kit (I tested
with a Niva Bupa ReAssure 2.0 PDF) is 58 pages of dense legal language: policy
wordings, a personalised schedule, an IRDAI-mandated Customer Info Sheet,
marketing filler, and a hospital blacklist — all concatenated into one file,
internally inconsistent, and designed by lawyers, not people.

The question "Is my mother's knee replacement covered, and when?" requires
assembling:

  • The specific-disease 24-month waiting clause (joint replacement is explicitly listed)
  • The pre-existing disease (PED) waiting clause
  • The meta-rule that the longer of the two waiting periods shall apply
  • AND the room-rent proportionate-deduction interaction

...across four different pages, with one clause being a rule about which other
clause wins. This is not a retrieval problem. It's a reasoning problem over
interdependent legal clauses, and getting it wrong causes real financial harm.
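
To make the "clause about clauses" point concrete, here is a minimal sketch of the waiting-period part of that question. It is an illustration only, not the InsurIQ code: `WaitingClause` and `effective_wait` are hypothetical names, the page numbers are made up, and the month values are the ones quoted in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WaitingClause:
    name: str
    months: int
    page: int  # hypothetical page number, kept only so the answer can be cited


def effective_wait(clauses: list[WaitingClause]) -> Optional[WaitingClause]:
    """Apply the meta-rule: when several waiting periods apply, the longest wins."""
    if not clauses:
        return None
    return max(clauses, key=lambda c: c.months)


specific = WaitingClause("specific-disease (joint replacement)", 24, page=12)
ped = WaitingClause("pre-existing disease", 36, page=13)

# Knee replacement for a condition that is also a declared PED: both clauses apply.
both = effective_wait([specific, ped])
# No PED declared: only the specific-disease clause applies.
only_specific = effective_wait([specific])

print(both.name, both.months)
print(only_specific.name, only_specific.months)
```

The function never looks at a single clause in isolation. It needs every applicable clause in hand before it can answer, which is exactly what top-k retrieval does not guarantee.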

The product has three phases deliberately sequenced:

  1. Post-purchase understanding — "What did I actually buy?" ← building this first
  2. Pre-purchase advisory — "Which policy is right for me?"
  3. Claim assistance — "Help me actually get paid."

Phase 1 is self-contained, has no regulatory exposure (explaining a document the
user already owns), and produces the structured-extraction engine that phases 2
and 3 both depend on.

Tech stack:

  • Backend: FastAPI + LangGraph 1.1 + LangChain-Groq (uv-managed monorepo)
  • Frontend: Next.js + TypeScript
  • DB: PostgreSQL (relational + JSONB)
  • Deployed: insuriq.himalayandev.tech

Demo

🔗 Live: insuriq.himalayandev.tech

Upload a health insurance PDF → InsurIQ segments it by document type, runs a
structured extraction pass with full clause-level citations, flags anything
unverifiable, and presents a queryable, grounded policy object — no hallucinated
facts, every answer traceable to the exact source clause.


The Comeback Story

Before — the placeholder that should never have existed

The project started as an exploration into agentic AI with a real use case. The
first node I wrote — src/nodes/insurance_node.py — took a policy_name string
and asked the LLM to summarise the policy from its training knowledge.

Let that sink in. A health insurance assistant confidently generating policy
facts from LLM memory, with zero grounding in the actual document. Wrong waiting
periods. Hallucinated exclusions. Fabricated sub-limits. This is the exact
failure mode that could cause real harm — someone skips a hospital network check
because the AI said they were covered.

It was a prototype placeholder, and it was wrong by design.

After — grounded, structured, citation-first

The entire extraction architecture was rebuilt from scratch during this
challenge. Key shifts:

1. Structured extraction over RAG

RAG (retrieval-augmented generation) is the obvious first instinct — chunk the
PDF, embed it, retrieve relevant sections. I deliberately rejected it.

Decision-relevant insurance questions are non-local. They require assembling
clauses from different pages where one clause controls the interpretation of
another. Top-k chunk retrieval cannot do that reliably. It will confidently
give wrong answers.

Instead: at upload time, run a schema-driven extraction pass once that
produces a complete, fully-cited policy object. All user questions are answered
by reasoning over that structured object.
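
A rough sketch of what "answer from the structured object" can look like on the query side. Everything here is an assumption for illustration: the `Provision` shape, the `answer` helper, the key names, and the sample clause text and page number are hypothetical, not the real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provision:
    value: str
    source_page: int
    source_text: str  # verbatim clause span captured at extraction time


def answer(policy: dict[str, Provision], key: str) -> str:
    """Answer only from the extracted object; never fall back to model memory."""
    provision = policy.get(key)
    if provision is None:
        return "unknown: confirm with insurer"
    return f'{provision.value} (p. {provision.source_page}: "{provision.source_text}")'


# Hypothetical extracted policy object with a single provision.
policy = {
    "specific_disease_wait": Provision(
        value="24 months",
        source_page=14,
        source_text="Joint replacement surgery: 24 months",
    ),
}

print(answer(policy, "specific_disease_wait"))
print(answer(policy, "maternity_cover"))
```

The important design choice is the missing-key branch: a question the extraction pass never covered produces an explicit unknown instead of a generated guess.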

2. Three-layer document model

After analysing a real 58-page full policy kit, I identified that a single
uploaded "policy" is actually multiple document types with different trust levels:

Layer     | Source                                   | Trust
----------|------------------------------------------|-----------------------------------------------
Wording   | Policy Wordings (legal contract)         | Authoritative for rules
Schedule  | Insurance Certificate                    | Authoritative for this user's values
CIS       | IRDAI-mandated Customer Info Sheet       | Pre-structured scaffold with clause cross-refs
Marketing | Policybazaar cover, simplified sidebars  | Ignore for facts

This matters because the wording alone misleads. The wording says the PED wait
is "36 months (48 for Bronze/Silver/Gold)", but this user's schedule says the
variant is Platinum+ and PED is "None declared". Only Wording + Schedule
together give the real answer. Likewise, the schedule says "Co-payment: Not
Opted": without the schedule, the co-pay percentage simply cannot be known.
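
The wording-plus-schedule join from that example can be sketched as below. This is a simplified illustration under stated assumptions: `WordingRule`, `Schedule`, and `ped_wait_months` are hypothetical names, and treating "no PED declared" as "no PED wait to serve" is my simplification, not a statement of how any insurer applies the clause.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordingRule:
    """Rule from the policy wording: a default plus per-variant overrides."""
    default_months: int
    variant_overrides: dict[str, int]


@dataclass
class Schedule:
    """Values specific to this policyholder, taken from the insurance certificate."""
    variant: str
    declared_ped: tuple[str, ...]
    copay_percent: Optional[int]  # None stands for "Not Opted"


def ped_wait_months(rule: WordingRule, schedule: Schedule) -> Optional[int]:
    """Resolve the PED wait for this user; None when no PED is declared (assumption)."""
    if not schedule.declared_ped:
        return None
    return rule.variant_overrides.get(schedule.variant, rule.default_months)


# Wording: 36 months, 48 for Bronze/Silver/Gold (figures quoted in this post).
rule = WordingRule(36, {"Bronze": 48, "Silver": 48, "Gold": 48})
# Schedule: Platinum+, no PED declared, co-payment not opted.
user = Schedule(variant="Platinum+", declared_ped=(), copay_percent=None)

print(ped_wait_months(rule, user))
```

Neither input can answer the question alone: the rule has no variant, and the schedule has no months.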

3. Grounding is non-negotiable

Every extracted fact carries: source page, verbatim clause span, confidence
score, and verification status. Anything the system cannot verify against a
source clause is flagged as unknown — confirm with insurer. Never fabricated.
Never guessed.
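
One way to enforce that rule is a check that the cited span really appears on the cited page. The sketch below is a minimal version of that idea, not the production verifier: `ExtractedFact`, `verify`, the status strings, and the sample page text are all hypothetical.

```python
import re
from dataclasses import dataclass


def normalise(text: str) -> str:
    """Collapse whitespace and case so PDF line breaks don't defeat matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


@dataclass
class ExtractedFact:
    name: str
    value: str
    source_page: int
    source_text: str  # verbatim clause span claimed by the extractor
    confidence: float
    verification_status: str = "unverified"


def verify(fact: ExtractedFact, pages: dict[int, str]) -> ExtractedFact:
    """Mark a fact verified only if its cited span is found on its cited page."""
    page_text = normalise(pages.get(fact.source_page, ""))
    span = normalise(fact.source_text)
    if span and span in page_text:
        fact.verification_status = "verified"
    else:
        # Never keep a value that cannot be tied back to the document.
        fact.value = "unknown: confirm with insurer"
        fact.verification_status = "unknown"
    return fact


# Hypothetical page text, including a line break as PDF extraction would produce.
pages = {14: "Joint replacement surgery:\n  24 months of continuous coverage"}

good = verify(ExtractedFact("joint_wait", "24 months", 14,
                            "Joint replacement surgery: 24 months", 0.93), pages)
bad = verify(ExtractedFact("joint_wait", "12 months", 14,
                           "Joint replacement surgery: 12 months", 0.88), pages)

print(good.verification_status, good.value)
print(bad.verification_status, bad.value)
```

Note the limit of this check: it proves the quoted span exists in the source, not that the extracted value follows from it. That second step still needs a separate pass or a human.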

4. The LangGraph graph now has real conditional branching

The old code was a chain. The new extraction graph has genuine branches:

  • Scanned PDF vs clean text layer → OCR path vs direct text extraction
  • Verification pass: fail → human review loop
  • User correction → re-verify → update structured record

This is the pattern I first used in reelwright (human-in-the-loop regenerate
loop for AI video generation) — now applied to policy document understanding.
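
The branch logic itself is small. The sketch below is deliberately dependency-free plain Python rather than LangGraph, so it runs anywhere; the node names, state keys, and `run` driver are hypothetical. In a real graph, routing functions of this shape are what the conditional edges would call.

```python
from typing import TypedDict


class ExtractionState(TypedDict, total=False):
    has_text_layer: bool
    verified: bool
    user_correction: str


def route_ingest(state: ExtractionState) -> str:
    """Clean text layer goes straight to text extraction; scans go through OCR."""
    return "extract_text" if state.get("has_text_layer") else "ocr"


def route_after_verify(state: ExtractionState) -> str:
    """A failed verification pass sends the record to a human."""
    return "store" if state.get("verified") else "human_review"


def run(initial: ExtractionState, max_reviews: int = 3) -> list[str]:
    """Walk the graph and return the sequence of nodes visited."""
    state = dict(initial)  # work on a copy so the caller's state is untouched
    path = [route_ingest(state), "extract", "verify"]
    reviews = 0
    while route_after_verify(state) == "human_review" and reviews < max_reviews:
        path.append("human_review")
        # Stand-in for re-verification: a user correction makes the check pass.
        state["verified"] = bool(state.get("user_correction"))
        path.append("verify")
        reviews += 1
    path.append("store" if state.get("verified") else "needs_review")
    return path


clean = run({"has_text_layer": True, "verified": True})
scanned = run({"has_text_layer": False, "verified": False,
               "user_correction": "PED wait is 36 months"})
print(clean)
print(scanned)
```

The `max_reviews` cap is my addition: a review loop without an exit condition can spin forever on a document nobody corrects.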

5. PostgreSQL schema designed for auditability

The policy_provisions table holds one row per extracted fact, with source_page,
source_text, confidence, verification_status, and user_corrected columns. Every
update is auditable. Cross-policy comparison is possible. JSONB holds the
genuinely variable benefit tables that don't fit a fixed schema.
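
Here is a rough sketch of that table shape. To keep it runnable without a database server it uses SQLite from the Python standard library as a stand-in for PostgreSQL, with a JSON string where Postgres would use JSONB. The five audit columns come from the description above; `policy_id`, `provision_key`, `value`, `extras`, the status strings, and the sample row are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE policy_provisions (
        id                  INTEGER PRIMARY KEY,
        policy_id           TEXT NOT NULL,     -- hypothetical column
        provision_key       TEXT NOT NULL,     -- hypothetical column
        value               TEXT,
        source_page         INTEGER NOT NULL,
        source_text         TEXT NOT NULL,
        confidence          REAL NOT NULL,
        verification_status TEXT NOT NULL,
        user_corrected      INTEGER NOT NULL DEFAULT 0,
        extras              TEXT               -- JSON string here; JSONB in PostgreSQL
    )
    """
)

# An extracted fact the verifier could not confirm (the value was misread).
conn.execute(
    "INSERT INTO policy_provisions "
    "(policy_id, provision_key, value, source_page, source_text, confidence, "
    "verification_status, extras) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("demo-policy", "specific_disease_wait_months", "42", 14,
     "Joint replacement surgery: 24 months", 0.61, "unknown",
     json.dumps({"applies_to": ["joint replacement"]})),
)

# A user correction updates the value and leaves a visible marker on the row.
conn.execute(
    "UPDATE policy_provisions SET value = ?, user_corrected = 1, "
    "verification_status = 'pending_reverify' WHERE provision_key = ?",
    ("24", "specific_disease_wait_months"),
)

row = conn.execute(
    "SELECT value, user_corrected, verification_status, extras "
    "FROM policy_provisions"
).fetchone()
print(row[0], row[1], row[2], json.loads(row[3])["applies_to"])
```

Because the source span travels with every row, anyone reviewing a corrected value can compare it against the original clause text without reopening the PDF.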


My Experience with GitHub Copilot

The two places where Copilot had the biggest impact:

Schema design velocity. The Pydantic models for a three-layer policy object
are dense — nested types, optional fields for schedule-derived values, IRDAI
standard exclusion codes (Excl01–Excl18), linked wording↔schedule joins. Copilot
kept up with the domain context within a session and suggested field names,
Optional wrapping, and validator patterns that matched the structure I was
building toward. What would have been 2–3 hours of boilerplate was maybe 40
minutes.

LangGraph node stubs. Once I described the graph shape — segmentation node,
section-targeted extraction node, verification node, HITL correction node —
Copilot generated solid first-draft stubs with the right StateGraph wiring and
TypedDict state shapes. I rewrote the logic inside each node (that's where the
real reasoning work lives), but the scaffolding was accurate enough to build on
immediately.

Where Copilot didn't help — and I stopped leaning on it: the actual extraction
prompt engineering. The prompts that tell the LLM exactly how to extract a
waiting_period object with a cited clause span from adversarial legal text are
things I had to derive by reading real policy documents, testing, and iterating.
No autocomplete shortcut for that. That was the good, slow, deliberate work.


One honest note: I'm building this to learn agentic AI properly, with a real
use-case that has real stakes. The Finish-Up-A-Thon deadline was the push I
needed to stop designing and start shipping. The placeholder is gone. The
grounding architecture is in. Next: get one full policy end-to-end through the
pipeline into queryable Postgres tables.

If you're navigating Indian health insurance and want to try it:
insuriq.himalayandev.tech
