Law firms are hemorrhaging money on AI tools they barely control. Harvey AI? ~$1,200/user/month. CoCounsel? Hundreds more. And every one of them sends your most sensitive client documents to someone else's cloud.
I got tired of watching that happen. So I built LegalLens — a fully open-source, self-hosted AI document intelligence platform that does everything those tools do, runs on your own machine, and costs exactly $0.
The Problem Nobody Wants to Talk About
Picture this: A junior associate uploads a confidential M&A agreement to an AI tool to get a quick summary. The document vanishes into a third-party server farm. The firm pays $1,200 per user per month for that privilege. And if the AI hallucinates a clause that doesn't exist? There's no citation to verify it.
This isn't hypothetical. It's the current state of legal AI.
Attorney-client privilege. Trade secrets. Sealed settlement terms. These aren't files you casually hand to a SaaS product. But the alternative — doing it all manually — means associates spending 6-hour stretches reading every line of a 200-page supply chain contract looking for indemnification clauses buried on page 147.
There had to be a better way.
What LegalLens Actually Does
LegalLens is an open-source AI-powered document intelligence platform. Upload a contract, pleading, court order, or NDA. It reads it, understands it, and lets you interrogate it like you would a senior partner who's already read the whole thing.
Here's what it packs in:
9 AI-Powered Analysis Features
| Feature | What You Get |
|---|---|
| Summary | Key points, parties, document type — instantly |
| Risk Analysis | Risk score 0–100, individual risks ranked by severity |
| Compliance Checklist | 15+ standard provisions, scored pass/fail/needs review |
| Obligations | Every duty, right, and restriction — with deadlines |
| Timeline | Chronological events, milestones, and expirations |
| Document Comparison | Side-by-side diff of two contracts |
| Legal Memo | AI-generated brief from your saved research |
| Query Expansion | Suggests legal synonyms to widen your search |
| RAG Q&A | Ask anything, get answers with numbered citations [1][2] |
Clause Finder with 12 Pre-Built Templates
Stop ctrl+F-ing through 300 pages. Pre-built clause search templates cover:
- Indemnification
- Force Majeure
- Limitation of Liability
- Termination
- Confidentiality / NDA
- Governing Law
- Intellectual Property
- Non-Compete
- Payment Terms
- Assignment
- Representations & Warranties
- Notices
Document Intelligence — Auto-Extracted
The moment a document is uploaded, LegalLens automatically pulls:
- Party names from agreement headers
- Key dates — effective dates, deadlines, expirations
- Dollar amounts and currencies
- Defined terms (capitalized terms in quotes)
- Governing law and jurisdiction
- Legal references — statutes and case citations
- Document type — Contract, Pleading, Court Order, Memorandum, etc.
No prompting. No manual setup. It just works.
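To make the idea concrete, here's a minimal sketch of how rule-based extraction like this can work. This is illustrative only, not the actual LegalLens code: the function name and the exact patterns are my own, and a production extractor would need far more robust patterns (date formats, currency symbols, citation grammars).

```python
import re

def extract_entities(text: str) -> dict:
    """Toy sketch of rule-based entity extraction for a legal document."""
    # Dollar amounts like $500,000 or $1.2M
    amounts = re.findall(r"\$[\d,]+(?:\.\d+)?(?:\s?(?:million|M|K))?", text)
    # Defined terms: capitalized phrases in double quotes, e.g. ("Effective Date")
    defined_terms = re.findall(r'"([A-Z][A-Za-z ]+)"', text)
    # Simple long-form dates: January 1, 2025
    dates = re.findall(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b",
        text,
    )
    return {"amounts": amounts, "defined_terms": defined_terms, "dates": dates}
```

Patterns like these handle the easy 80%; the remaining ambiguity (relative dates, foreign currencies, nested definitions) is where the LLM layer earns its keep.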
The Real Differentiator: Your Data Stays With You
Every commercial legal AI tool sends your documents somewhere. LegalLens does not.
With Ollama (free, local LLM), the entire pipeline runs on your machine:
Document upload -> text extraction -> vector embeddings -> ChromaDB
All on your hardware. Nothing leaves.
If you prefer cloud LLMs (Claude, GPT-4, Azure OpenAI), you can configure those too. The system supports a fallback chain — it tries providers in order until one responds. But the choice is always yours.
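The fallback-chain idea is simple enough to sketch in a few lines. This is a hand-rolled illustration, not the repo's actual provider code, and the function names are hypothetical:

```python
from typing import Callable

def ask_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each configured LLM provider in order; return the first answer.

    `providers` is an ordered list of (name, call) pairs, where `call`
    takes a prompt and returns a completion, or raises on failure.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # network error, rate limit, model missing...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All LLM providers failed: " + "; ".join(errors))
```

Wire Ollama first and a cloud provider second, and you get local-first privacy with a cloud safety net, or drop the cloud entries entirely to stay fully offline.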
Setup Is One Command
Seriously. That's it:
```bash
git clone https://github.com/LakshmiSravyaVedantham/legal-lens.git
cd legal-lens
cp .env.example .env
docker compose up --build
# Open http://localhost
```
Want to run it fully offline with a local LLM?
```bash
docker compose --profile ollama up --build
```
Then pull a model:
```bash
ollama pull llama3.1:8b
```
Everything — FastAPI backend, React frontend, MongoDB, ChromaDB, Nginx — comes up in one shot. The semantic search engine works immediately. AI features activate the moment you point to any LLM.
How It's Built (For the Devs in the Room)
No LangChain. Direct LLM integration — simpler, more debuggable, easier to extend.
```
Frontend (React 19 + TypeScript + Tailwind CSS 4)
                  |
               REST API
                  |
      FastAPI Backend (async Python)
      |           |            |
  ChromaDB     MongoDB     LLM Layer
  (vectors)   (metadata)   (Ollama / Claude / GPT / Azure)
      |
sentence-transformers
(384-dim embeddings, 80MB model, no GPU required)
```
| Layer | Tech |
|---|---|
| Frontend | React 19 + TypeScript + Vite + Tailwind CSS 4 |
| Backend | Python FastAPI (async) |
| Database | MongoDB with Motor async driver |
| Vector DB | ChromaDB |
| Embeddings | all-MiniLM-L6-v2 — 384-dim, runs on CPU |
| LLM | Ollama / Anthropic / OpenAI / Azure (pluggable) |
| Auth | JWT + bcrypt + Fernet encryption |
| Deploy | Docker Compose |
| CI/CD | GitHub Actions — lint, test, Docker build |
The embedding model runs on CPU. No GPU required to get started.
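The core retrieval loop — embed the query, score it against stored chunk vectors, return the closest matches — can be sketched without any heavy dependencies. The toy bag-of-words "embedding" below is a stand-in I'm using purely for illustration; the real pipeline uses all-MiniLM-L6-v2 dense vectors via sentence-transformers and stores them in ChromaDB:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words vector. Stand-in for all-MiniLM-L6-v2's
    # 384-dim dense embeddings in the real system.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank document chunks by similarity to the query; return top k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

Swap `embed` for a real sentence-transformers model and `search` for a ChromaDB collection query, and this is the shape of the semantic search that works the moment the containers come up.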
LegalLens vs. The Tools You're Paying For
| Feature | LegalLens | Harvey AI | CoCounsel | Luminance |
|---|---|---|---|---|
| Monthly Cost | $0 | ~$1,200/user | ~$250/bundle | Enterprise |
| Semantic Search | YES | YES | YES | YES |
| RAG Q&A with Citations | YES | YES | YES | — |
| Risk Analysis | YES | YES | — | YES |
| Clause Finder | YES | — | — | YES |
| Document Comparison | YES | YES | — | YES |
| Legal Memo Generation | YES | YES | YES | — |
| 100% Data Privacy | YES | NO | NO | Partial |
| Works Fully Offline | YES | NO | NO | NO |
| Multi-LLM Support | YES | NO | NO | NO |
| Self-Hosted | YES | NO | NO | NO |
| Open Source | YES | NO | NO | NO |
It's Not Just for Lawyers
Here's who can actually use this right now:
Startup founders reviewing term sheets and investor agreements before signing anything.
Freelancers and consultants who deal with client contracts and NDAs regularly but can't afford legal review for every document.
Small law firms that need AI tooling but are priced out of enterprise solutions.
Legal ops teams at companies that need document analysis at scale without a six-figure SaaS bill.
Law students and researchers studying contract patterns, clause variations, or compliance across jurisdictions.
Developers who want to build legal AI features into their own products using an open, well-structured codebase.
What Makes the Q&A Actually Trustworthy
One of the biggest criticisms of legal AI is hallucination — the model confidently states something that isn't in the document. LegalLens addresses this directly.
Every answer from the Q&A engine includes numbered citations that link back to the exact document and page number. If the answer says "the indemnification clause limits liability to $500,000 [1]", citation [1] takes you directly to the paragraph in the original file.
You can verify. You can audit. The AI is a tool, not an oracle.
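The mechanism behind numbered citations is worth seeing. One common approach — sketched here with field names I've made up for illustration, not the actual LegalLens schema — is to number the retrieved chunks, keep a citation map, and instruct the model to cite only those numbers:

```python
def build_cited_prompt(question: str, chunks: list[dict]) -> tuple[str, dict]:
    """Number retrieved chunks and build a grounded prompt.

    Each chunk dict is assumed to carry `text`, `doc`, and `page` keys
    (illustrative field names, not the real schema).
    """
    citations = {}
    context_lines = []
    for i, chunk in enumerate(chunks, start=1):
        citations[i] = {"doc": chunk["doc"], "page": chunk["page"]}
        context_lines.append(f"[{i}] {chunk['text']}")
    prompt = (
        "Answer using ONLY the numbered excerpts below, and cite them "
        "as [n] after each claim.\n\n"
        + "\n".join(context_lines)
        + f"\n\nQuestion: {question}"
    )
    return prompt, citations
```

When the model answers with "[1]", the citation map resolves it back to a specific document and page — which is exactly what makes the answer auditable.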
Enterprise-Grade Platform Features
This isn't a weekend script. It's built to run in production:
- Multi-tenant — organization-based isolation with role-based access control
- JWT auth — access + refresh tokens, bcrypt password hashing
- Rate limiting — per-endpoint limits on auth, AI calls, uploads, and search
- Audit logging — every upload, deletion, and analysis is logged with IP tracking
- Command palette (Cmd+K) — spotlight-style navigation across the entire app
- Analytics dashboard — search trends, top queries, activity timelines
- Cache-first AI — analysis results cached in MongoDB so repeated queries don't re-hit the LLM
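The cache-first pattern is straightforward: key the cache on a hash of the document and analysis type, and only hit the LLM on a miss. A minimal sketch — with a plain dict standing in for the MongoDB collection, and all names my own:

```python
import hashlib

class AnalysisCache:
    """Cache-first wrapper: hash (analysis type, doc text) and only
    call the LLM on a cache miss. A dict stands in for MongoDB here."""

    def __init__(self, llm_call):
        self.store = {}          # stand-in for a MongoDB collection
        self.llm_call = llm_call
        self.misses = 0

    def analyze(self, doc_text: str, analysis_type: str) -> str:
        key = hashlib.sha256(f"{analysis_type}:{doc_text}".encode()).hexdigest()
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.llm_call(doc_text, analysis_type)
        return self.store[key]
```

Because the key covers the full document text, any edit to the document naturally invalidates the cached analysis.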
Run It, Break It, Build On It
The repo is live, the CI is green, and the code is MIT licensed:
github.com/LakshmiSravyaVedantham/legal-lens
If you want to contribute, here's what would make the biggest impact right now:
- OCR for scanned PDFs — Tesseract or PaddleOCR integration
- Hybrid search — BM25 + semantic for better recall
- Jurisdiction-specific clause templates — GDPR, CCPA, HIPAA
- Accessibility improvements — keyboard navigation, screen reader support
- More document types — patent filings, regulatory submissions
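For anyone eyeing the hybrid-search item: a common way to combine BM25 and semantic rankings is Reciprocal Rank Fusion, which needs only the two ranked lists. This is a generic sketch of the technique, not code from the repo:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked lists (e.g. BM25 and semantic) with RRF:
    score(d) = sum over rankings of 1 / (k + rank(d))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization between the two retrievers, which is what makes it an easy first step toward better recall.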
The development setup is straightforward:
```bash
# Backend
python3 -m venv .venv && source .venv/bin/activate
pip install -r backend/requirements.txt
uvicorn backend.main:app --reload --port 8000

# Frontend (new terminal)
cd frontend && npm install && npm run dev
```
Full Swagger docs at http://localhost:8000/docs once you're running.
The Bottom Line
Legal AI shouldn't be a $1,200/month luxury. Document intelligence shouldn't mean surrendering privileged client files to a third-party server. And "AI-powered" shouldn't be a black box with no citations and no accountability.
LegalLens is the answer I built because I wanted it to exist. It's free, it's open, and it works.
Star the repo. Try it. Tell me what breaks.