TL;DR: FIRs contain sensitive witness data. Sending them to OpenAI violates judicial confidentiality. I built a 100% offline legal assistant that runs on a judge's laptop with zero internet. Open source, MIT license, fully functional.
GitHub: github.com/gangadharv444/smart-court-assistant-
Demo Video: Watch on YouTube (3 min)
The Problem: Why Judges Can't Use ChatGPT
Last year, I was volunteering with a legal aid organization. A judge asked a simple question:
"Can I use AI to analyze FIRs faster?"
My answer should have been "yes." Instead, I said "not safely."
Here's why:
FIRs are confidential. They contain:
- Witness names, addresses, phone numbers
- Minor victim details (sexual assault cases)
- Undercover officer identities
- Confidential informant data
- Case strategy from prosecutors
If you send an FIR to OpenAI or Google:
- Data goes to US servers
- Violates Indian judicial confidentiality laws
- Risks witness safety
- Could invalidate proceedings
The judge had a real problem. Standard AI solutions couldn't solve it.
The Real Opportunity: 2024 Judicial Reform
India's judiciary underwent a historic overhaul on July 1, 2024.
Three new codes replaced colonial-era legislation:
- IPC (Indian Penal Code, 1860) → BNS (Bharatiya Nyaya Sanhita, 2023)
- CrPC (Code of Criminal Procedure, 1973) → BNSS (Bharatiya Nagarik Suraksha Sanhita, 2023)
- Indian Evidence Act (1872) → BSA (Bharatiya Sakshya Adhiniyam, 2023)
Overnight, every legal professional in India needed to cross-reference old sections to new ones:
- "IPC 302?" → "Now it's BNS 103"
- "IPC 420?" → "Now it's BNS 318"
- "IPC 498A?" → "Now it's BNS 85"
Judges were manually searching through gazette notifications. Lawyers were printing reference sheets. Court staff were overwhelmed.
This was an urgent, real problem affecting millions of judicial proceedings.
My Solution: A Smart Court Assistant
What is it?
An offline, air-gapped legal assistant for Indian courts. No internet, no APIs, no cloud dependency. Everything runs locally on a judge's laptop.
What does it do?
| Feature | Description |
|---|---|
| Case Analysis | Upload FIR → Ask questions → AI answers using only that document (RAG pipeline) |
| Timeline Extraction | Automatically extract chronological timeline of events from evidence |
| Conflict Detection | Cross-examine 2+ FIRs to find contradictions in facts, timelines, witness accounts |
| IPC to BNS Mapping | Instant conversion of old law sections to new ones (300+ deterministic mappings) |
| Regional OCR | Extract text from Kannada/Hindi/Marathi PDFs with English translation |
| Bulk Analysis | Auto-scan documents for all IPC sections, generate BNS transition reports |
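To give a feel for the Bulk Analysis feature, scanning free text for IPC section citations is largely a pattern-matching job. Here's a minimal sketch; the regex and function name are mine for illustration, not the repo's actual code, and real FIR citation styles vary more than this handles:

```python
import re

# Matches citations like "IPC 302", "Section 420 IPC", "u/s 498A IPC".
IPC_PATTERN = re.compile(
    r"(?:IPC\s+(\d+[A-Z]?)|(?:Section|u/s)\s+(\d+[A-Z]?)\s+IPC)",
    re.IGNORECASE,
)

def find_ipc_sections(text: str) -> list[str]:
    """Return unique IPC section numbers cited in the text, in order found."""
    found = []
    for m in IPC_PATTERN.finditer(text):
        sec = (m.group(1) or m.group(2)).upper()
        if sec not in found:
            found.append(sec)
    return found

fir_text = "Accused charged under IPC 302 and u/s 498A IPC; also IPC 302 again."
print(find_ipc_sections(fir_text))   # ['302', '498A']
```

Each extracted section can then be passed through the deterministic IPC-to-BNS table to build a transition report.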
How is it 100% offline?
- Uses Llama-3 8B (quantized to 4.7 GB) running locally via Ollama
- All embeddings stored in ChromaDB (local vector database)
- No APIs called after setup
- Works on consumer laptops (16 GB RAM, CPU-only)
- Tested on AMD Ryzen 5 5500U (6 cores, 12 threads)
Why This Matters More Than You Think
For Judges and Lawyers
- Analyze FIRs in seconds, not hours
- No confidentiality concerns (zero data leaves the machine)
- Works in air-gapped court networks
- No subscription fees or API keys
For India's Judiciary
- Judicial efficiency without sacrificing privacy
- Easier IPC to BNS transition (urgent need in 2024)
- Regional language support (Kannada, Hindi, Marathi)
- No vendor lock-in or dependency on cloud providers
For Technologists
- Shows how to build production AI under extreme constraints
- Quantized LLMs on CPU (not everyone has GPUs)
- RAG pipeline for domain-specific legal tasks
- Modular, testable, offline-first architecture
The Technical Approach
Why Llama-3 8B?
Most people think "AI = big, expensive, cloud-based."
I proved you can build sophisticated AI offline:
- Model: Llama-3 8B (Meta, open license)
- Size: 4.7 GB (Q4 quantization)
- Inference: CPU-only (no GPU needed)
- Latency: 15-25 seconds per query
- Context Window: 8K tokens
- Accuracy: competitive with GPT-3.5 on reasoning benchmarks
- License: commercial use allowed
Why not larger models?
- 13B+ models don't fit in 16 GB RAM (with OS + embeddings + vector DB)
- GPU acceleration ruled out (target = consumer laptop)
- 8B is the optimal size for offline judicial work
Architecture: RAG Pipeline
Instead of relying on the LLM's training data (which can hallucinate), I use:
- Document Loading — Upload FIR PDFs
- Chunking — Break into 500-char segments with overlap
- Embedding — Convert to vectors using all-MiniLM-L6-v2 (384-dim)
- Storage — Index in ChromaDB (local vector DB)
- Retrieval — Find top-3 relevant chunks for query
- Generation — Feed to Llama-3 with context
Result: LLM answers grounded in actual document content, not hallucinations.
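The real pipeline uses LangChain and ChromaDB, but the core chunk-and-retrieve idea is simple enough to sketch without either dependency. Here's a minimal, library-free illustration: the chunk size mirrors the 500-char/overlap setting above, and the word-overlap scorer is a stand-in for the actual embedding similarity:

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Step 2: split text into fixed-size segments with overlap,
    so sentences cut at a boundary still appear whole in one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def score(query: str, passage: str) -> int:
    """Toy relevance score: shared-word count. The real app uses
    all-MiniLM-L6-v2 embeddings + cosine similarity in ChromaDB."""
    q = set(query.lower().split())
    return sum(1 for w in passage.lower().split() if w in q)

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Step 5: return the top-k most relevant chunks for the query.
    These get pasted into the Llama-3 prompt as grounding context."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

fir = "On 12 March the complainant reported a theft at the market. " * 30
top = retrieve("theft at the market", chunk(fir))
print(len(top))   # at most 3 grounding chunks go into the prompt
```

Because the prompt only ever contains retrieved chunks from the uploaded document, the model's answer is anchored to that FIR rather than to whatever it memorized during training.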
IPC to BNS Mapping: Deterministic, Not AI
Here's the key insight: Don't use AI for things that need to be 100% accurate.
Instead of asking Llama-3 to map "IPC 302 → BNS ?":
- LLM might say "BNS 302" (wrong)
- Or hallucinate a completely different section
I use a hardcoded dictionary of ~300 verified mappings:
- IPC 302 → BNS 103 (always correct)
- IPC 420 → BNS 318 (always correct)
- Zero hallucination risk
The AI is only used for interpretation after the mapping is confirmed.
What I Learned Building This
1. Constraints Enable Innovation
Building for a 16 GB RAM laptop forced me to:
- Use quantized models instead of bloated ones
- Cache aggressively (`@st.cache_resource`)
- Choose CPU-friendly algorithms
- Think about deterministic vs AI-based decisions
Most cloud-first AI projects never think about these things.
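The caching point generalizes beyond Streamlit. `@st.cache_resource` keeps one model instance alive across UI reruns; the same load-once pattern can be sketched with the standard library (the names here are illustrative, not the app's actual code):

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def load_model() -> dict:
    """Stand-in for an expensive resource load (e.g. wiring up the local
    Ollama client). With caching, the cost is paid once, not on every rerun."""
    print("loading model...")   # side effect fires only on the first call
    return {"name": "llama3", "quant": "Q4"}

m1 = load_model()   # performs the load
m2 = load_model()   # cache hit: no reload
assert m1 is m2     # the exact same object is reused
```

On a CPU-only laptop where a cold model load takes tens of seconds, this is the difference between a usable dashboard and an unusable one.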
2. Domain-Specific AI Beats General AI
GPT-4 is amazing at general tasks. But for IPC to BNS mapping?
- A deterministic dictionary beats any LLM
- Cost: Free vs $30/month API
- Speed: less than 1ms vs 10-30 seconds
- Reliability: 100% vs 95%
Know when to use AI, when to use databases.
3. Offline-First is a Feature, Not a Limitation
The constraint of "no cloud" forced better design:
- Privacy by default
- No API key management
- Works in any environment
- No data exfiltration risks
The best products often solve hard constraints.
Technical Stack
| Component | Choice | Why |
|---|---|---|
| LLM | Llama-3 8B via Ollama | Fits in RAM, good reasoning, open license |
| RAG | LangChain + ChromaDB | No server setup, persists to disk, simple |
| Embeddings | all-MiniLM-L6-v2 | ~80 MB, fast on CPU, strong general-purpose sentence embeddings |
| Frontend | Streamlit | No frontend code needed, hot reload, built-in UI |
| OCR | Tesseract + Poppler | Free, supports Indian scripts, offline |
| Language | Python 3.10+ | Rich ML ecosystem, fast iteration |
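On the OCR side, Tesseract handles multi-script pages through its `-l` flag, which joins language codes with `+` (`kan` = Kannada, `hin` = Hindi, `mar` = Marathi). A small helper for building that invocation — the file name is hypothetical, and this only constructs the command rather than running it:

```python
# Tesseract language codes for the scripts this project targets.
LANG_CODES = {"kannada": "kan", "hindi": "hin", "marathi": "mar", "english": "eng"}

def tesseract_cmd(image_path: str, languages: list[str]) -> list[str]:
    """Build a tesseract CLI invocation like:
        tesseract page.png stdout -l kan+eng
    Unknown language names raise instead of silently OCR-ing with the wrong model."""
    try:
        codes = "+".join(LANG_CODES[lang.lower()] for lang in languages)
    except KeyError as exc:
        raise ValueError(f"no Tesseract model mapped for {exc}") from None
    return ["tesseract", image_path, "stdout", "-l", codes]

print(tesseract_cmd("page.png", ["Kannada", "English"]))
```

Always including `eng` alongside the regional script helps on FIRs that mix English boilerplate with Kannada or Hindi body text.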
The Numbers
- Codebase: 18 modular Python files (~3,000 lines)
- Test Coverage: 43 unit tests covering core logic
- BNS Sections: complete database of all 358 sections
- IPC-to-BNS Mappings: ~300 verified entries
- Dashboard Tabs: 6 functional modules
- Setup Time: ~10 minutes (clone, install, run)
- Cloud API Calls: zero
- GPU Required: no
- Internet Required: no (after initial setup)
How to Try It
Quick Start (10 minutes)
```bash
# 1. Clone the repo
git clone https://github.com/gangadharv444/smart-court-assistant-
cd smart-court-assistant-

# 2. Install dependencies
python -m venv venv
venv\Scripts\activate      # Windows
source venv/bin/activate   # Linux/macOS
pip install -r requirements.txt

# 3. Download the model (first run pulls ~4.7 GB)
ollama run llama3

# 4. Run the app
streamlit run app.py
```
The dashboard opens at http://localhost:8501.
Try With Sample Documents
The repo includes sample PDFs to test:
- `dummy_fir.pdf` — Sample FIR
- `witness_statement.pdf` — Witness statement
In the Case Analysis tab:
- Upload `dummy_fir.pdf`
- Ask: "Based on the FIR, detail the timeline of events"
- The AI extracts the timeline from the document
In the Conflict Detection tab:
- Upload both `dummy_fir.pdf` and `witness_statement.pdf`
- Click "Analyze"
- AI finds contradictions between the documents
Why Open Source?
Two reasons:
- Trust: Judges need to verify the code themselves. No black boxes.
- Contribution: Legal professionals can improve the mappings, add BNSS sections, support more languages.
MIT License. Commercial use allowed. Use it. Modify it. Deploy it in courts.
The Real Vision
This isn't about "AI for AI's sake."
The vision is simple: Judges should have access to cutting-edge AI without sacrificing confidentiality.
No vendor lock-in. No monthly subscriptions. No data leaving the courtroom.
Just a tool that works offline, for the work judges actually do.
What's Next?
- BNSS mapping (Criminal Procedure Code)
- BSA mapping (Evidence Act)
- Multi-language support (Tamil, Telugu, Bengali)
- Deploy in actual courts (pilot program)
- Train judges on using the tool
Join the Effort
If you work in legal tech, judicial systems, or Indian law:
GitHub: github.com/gangadharv444/smart-court-assistant-
What we need:
- Lawyers to validate IPC to BNS mappings
- Judges to test with real FIRs
- Developers to add BNSS/BSA mappings
- Regional language experts for OCR improvement
Contact: gangadharv.444@gmail.com
Conclusion
Building this taught me that the best AI projects often solve real constraints, not imaginary problems.
A judge can't use OpenAI's API. That's not a limitation of the judge. That's a design problem in the AI ecosystem.
Smart Court Assistant solves it.
If you're building AI for India's institutions — courts, hospitals, government offices — you don't need cutting-edge GPUs or expensive cloud APIs. You need offline-first thinking.
Sometimes the most powerful AI runs on your laptop.
Have questions? Found a bug? Want to contribute?
Star the repo on GitHub. Drop an issue. Or reach out directly.
The code is open. The data is yours. The vision is yours to improve.
Built for security. Built for India. Built offline.