
Lakshmi Sravya Vedantham

Originally published at github.com

I Built a Free AI Legal Assistant That Replaces $1,200/Month Software — and Open-Sourced It

Law firms are hemorrhaging money on AI tools they barely control. Harvey AI? ~$1,200/user/month. CoCounsel? Around $250 per bundle. And every one of them sends your most sensitive client documents to someone else's cloud.

I got tired of watching that happen. So I built LegalLens, a fully open-source, self-hosted AI document intelligence platform that covers the core features of those tools, runs on your own machine, and costs exactly $0.


The Problem Nobody Wants to Talk About

Picture this: A junior associate uploads a confidential M&A agreement to an AI tool to get a quick summary. The document vanishes into a third-party server farm. The firm pays $1,200 per user per month for that privilege. And if the AI hallucinates a clause that doesn't exist? There's no citation to verify it.

This isn't hypothetical. It's the current state of legal AI.

Attorney-client privilege. Trade secrets. Sealed settlement terms. These aren't files you casually hand to a SaaS product. But the alternative — doing it all manually — means associates spending 6-hour stretches reading every line of a 200-page supply chain contract looking for indemnification clauses buried on page 147.

There had to be a better way.


What LegalLens Actually Does

LegalLens is an open-source AI-powered document intelligence platform. Upload a contract, pleading, court order, or NDA. It reads it, understands it, and lets you interrogate it like you would a senior partner who's already read the whole thing.

Here's what it packs in:

9 AI-Powered Analysis Features

Feature | What You Get
Summary | Key points, parties, and document type, instantly
Risk Analysis | Risk score 0–100, individual risks ranked by severity
Compliance Checklist | 15+ standard provisions, scored pass/fail/needs review
Obligations | Every duty, right, and restriction, with deadlines
Timeline | Chronological events, milestones, and expirations
Document Comparison | Side-by-side diff of two contracts
Legal Memo | AI-generated brief from your saved research
Query Expansion | Suggests legal synonyms to widen your search
RAG Q&A | Ask anything, get answers with numbered citations [1][2]

Clause Finder with 12 Pre-Built Templates

Stop ctrl+F-ing through 300 pages. Pre-built clause search templates cover:

  • Indemnification
  • Force Majeure
  • Limitation of Liability
  • Termination
  • Confidentiality / NDA
  • Governing Law
  • Intellectual Property
  • Non-Compete
  • Payment Terms
  • Assignment
  • Representations & Warranties
  • Notices
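
To make the template idea concrete, here is a minimal Python sketch of how a clause template could drive a search. It is not LegalLens's code: the `CLAUSE_TEMPLATES` table, its phrases, and the `find_clause` function are all hypothetical, and plain keyword counting stands in for the semantic search the real tool uses.

```python
# Hypothetical template table: clause name -> search phrases.
# The phrases are illustrative, not the ones shipped with LegalLens.
CLAUSE_TEMPLATES = {
    "Indemnification": ["indemnify", "hold harmless", "defend"],
    "Force Majeure": ["force majeure", "act of god", "beyond reasonable control"],
    "Governing Law": ["governed by", "laws of", "jurisdiction"],
}


def find_clause(template_name, paragraphs, top_k=3):
    """Rank paragraphs by how many template phrases they contain.

    Keyword counting is a simple stand-in for semantic search; it only
    shows how a named template turns into a query over the document.
    """
    phrases = CLAUSE_TEMPLATES[template_name]
    scored = []
    for index, text in enumerate(paragraphs):
        lowered = text.lower()
        hits = sum(1 for phrase in phrases if phrase in lowered)
        if hits:
            scored.append((hits, index, text))
    # Most hits first; earlier paragraphs win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(index, text) for _, index, text in scored[:top_k]]
```

Calling `find_clause("Indemnification", paragraphs)` returns `(paragraph_index, text)` pairs, so a UI could jump straight to the matching paragraph.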

Document Intelligence — Auto-Extracted

The moment a document is uploaded, LegalLens automatically pulls:

  • Party names from agreement headers
  • Key dates — effective dates, deadlines, expirations
  • Dollar amounts and currencies
  • Defined terms (capitalized terms in quotes)
  • Governing law and jurisdiction
  • Legal references — statutes and case citations
  • Document type — Contract, Pleading, Court Order, Memorandum, etc.

No prompting. No manual setup. It just works.
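
For a feel of what this kind of extraction involves, here is a rough regex-based sketch in Python. The patterns and the `extract_metadata` function are assumptions made for illustration; the post does not say how LegalLens extracts these fields, and real contracts need many more cases than these three patterns cover.

```python
import re

# Illustrative patterns only; real contracts need far more cases.
DATE_RE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)
AMOUNT_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")
# Defined terms: a capitalized phrase inside straight or curly quotes.
DEFINED_TERM_RE = re.compile(r'["“]([A-Z][A-Za-z ]*)["”]')


def extract_metadata(text):
    """Pull dates, dollar amounts, and defined terms out of raw text."""
    return {
        "dates": DATE_RE.findall(text),
        "amounts": AMOUNT_RE.findall(text),
        "defined_terms": DEFINED_TERM_RE.findall(text),
    }
```

Run against a sentence like `Acme Corp (the "Supplier") shall pay $500,000 by January 5, 2024.`, this returns one date, one amount, and one defined term.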


The Real Differentiator: Your Data Stays With You

Every commercial legal AI tool sends your documents somewhere. LegalLens does not.

With Ollama (free, local LLM), the entire pipeline runs on your machine:

Document upload -> text extraction -> vector embeddings -> ChromaDB
       All on your hardware. Nothing leaves.
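
Between text extraction and embedding, pipelines like this typically split the text into chunks. The post does not describe how LegalLens does it, so treat the following as a sketch under assumptions: the `chunk_pages` name, the word-based windows, and the default sizes are all made up for illustration.

```python
def chunk_pages(pages, chunk_words=200, overlap_words=40):
    """Split each page into overlapping word windows.

    `pages` is a list of page texts (index 0 is page 1). Each chunk keeps
    its page number so a later answer can point back to the source.
    """
    if not 0 <= overlap_words < chunk_words:
        raise ValueError("overlap_words must be smaller than chunk_words")
    step = chunk_words - overlap_words
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for start in range(0, len(words), step):
            window = words[start:start + chunk_words]
            chunks.append({"page": page_number, "text": " ".join(window)})
            if start + chunk_words >= len(words):
                break  # the last window already reached the end of the page
    return chunks
```

The overlap means a clause that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.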

If you prefer cloud LLMs (Claude, GPT-4, Azure OpenAI), you can configure those too; in that case, whatever text goes into a prompt is sent to that provider. The system supports a fallback chain: it tries providers in order until one responds. But the choice is always yours.
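
A fallback chain of this kind can be sketched in a few lines of Python. This is an illustration, not the project's implementation: `complete_with_fallback`, `AllProvidersFailed`, and the two stub providers are hypothetical names, and each provider is modeled as a plain function that either returns text or raises.

```python
class AllProvidersFailed(RuntimeError):
    """Raised when no provider in the chain returns a response."""


def complete_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first success.

    Each callable takes a prompt string and returns the model's text, or
    raises an exception if the provider is down or misconfigured.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure means "try next"
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers for demonstration; real ones would call an LLM API.
def offline_ollama(prompt):
    raise ConnectionError("connection refused")


def stub_cloud(prompt):
    return f"summary of: {prompt}"
```

With `providers = [("ollama", offline_ollama), ("cloud", stub_cloud)]`, the call falls through to the second entry and reports which provider answered. Putting the local model first keeps documents on your machine whenever it is reachable.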


Setup Is One Command

Seriously. Clone the repo, copy the example env file, and a single docker compose command brings up the whole stack:

git clone https://github.com/LakshmiSravyaVedantham/legal-lens.git
cd legal-lens
cp .env.example .env
docker compose up --build
# Open http://localhost

Want to run it fully offline with a local LLM?

docker compose --profile ollama up --build

Then pull a model:

ollama pull llama3.1:8b

Everything — FastAPI backend, React frontend, MongoDB, ChromaDB, Nginx — comes up in one shot. The semantic search engine works immediately. AI features activate the moment you point to any LLM.


How It's Built (For the Devs in the Room)

No LangChain. Direct LLM integration — simpler, more debuggable, easier to extend.

Frontend (React 19 + TypeScript + Tailwind CSS 4)
         |
    REST API
         |
FastAPI Backend (async Python)
    |          |          |
ChromaDB   MongoDB    LLM Layer
(vectors)  (metadata) (Ollama / Claude / GPT / Azure)
         |
sentence-transformers
(384-dim embeddings, 80MB model, no GPU required)

Layer | Tech
Frontend | React 19 + TypeScript + Vite + Tailwind CSS 4
Backend | Python FastAPI (async)
Database | MongoDB with Motor async driver
Vector DB | ChromaDB
Embeddings | all-MiniLM-L6-v2 (384-dim, runs on CPU)
LLM | Ollama / Anthropic / OpenAI / Azure (pluggable)
Auth | JWT + bcrypt + Fernet encryption
Deploy | Docker Compose
CI/CD | GitHub Actions (lint, test, Docker build)

The embedding model runs on CPU. No GPU required to get started.
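
To show what "semantic search over embeddings" boils down to, here is a brute-force Python sketch using tiny toy vectors. In the real stack the vectors are 384-dimensional and ChromaDB does the lookup; the `top_k` function and the choice of cosine similarity are assumptions, since the post does not state which distance metric is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k=3):
    """Return the k chunks whose vectors are most similar to the query."""
    ranked = sorted(
        indexed_chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk["vector"]),
        reverse=True,
    )
    return ranked[:k]
```

A vector database replaces the full sort with an approximate index, but the ranking idea is the same: the query and every chunk live in one vector space, and nearby vectors mean similar text.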


LegalLens vs. The Tools You're Paying For

Feature LegalLens Harvey AI CoCounsel Luminance
Monthly Cost $0 ~$1,200/user ~$250/bundle Enterprise
Semantic Search YES YES YES YES
RAG Q&A with Citations YES YES YES
Risk Analysis YES YES YES
Clause Finder YES YES
Document Comparison YES YES YES
Legal Memo Generation YES YES YES
100% Data Privacy YES NO NO Partial
Works Fully Offline YES NO NO NO
Multi-LLM Support YES NO NO NO
Self-Hosted YES NO NO NO
Open Source YES NO NO NO

It's Not Just for Lawyers

Here's who can actually use this right now:

Startup founders reviewing term sheets and investor agreements before signing anything.

Freelancers and consultants who deal with client contracts and NDAs regularly but can't afford legal review for every document.

Small law firms that need AI tooling but are priced out of enterprise solutions.

Legal ops teams at companies that need document analysis at scale without a six-figure SaaS bill.

Law students and researchers studying contract patterns, clause variations, or compliance across jurisdictions.

Developers who want to build legal AI features into their own products using an open, well-structured codebase.


What Makes the Q&A Actually Trustworthy

One of the biggest criticisms of legal AI is hallucination — the model confidently states something that isn't in the document. LegalLens addresses this directly.

Every answer from the Q&A engine includes numbered citations that link back to the exact document and page number. If the answer says "the indemnification clause limits liability to $500,000 [1]", citation [1] takes you directly to the paragraph in the original file.

You can verify. You can audit. The AI is a tool, not an oracle.
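
One common way to get verifiable citations is to number the retrieved chunks in the prompt and then map the markers in the answer back to their sources. The Python sketch below assumes that approach; `build_prompt`, `resolve_citations`, and the chunk fields (`document`, `page`, `text`) are hypothetical and not taken from the LegalLens codebase.

```python
import re


def build_prompt(question, chunks):
    """Number each retrieved chunk so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{n}] ({c['document']}, p. {c['page']}) {c['text']}"
        for n, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def resolve_citations(answer, chunks):
    """Map each valid [n] marker in the answer back to its source chunk."""
    cited = {}
    for marker in re.findall(r"\[(\d+)\]", answer):
        n = int(marker)
        if 1 <= n <= len(chunks):  # ignore markers that match no source
            cited[n] = chunks[n - 1]
    return cited
```

Because every marker resolves to a stored document and page, a citation that points at nothing is detectable in code rather than left for the reader to catch.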


Enterprise-Grade Platform Features

This isn't a weekend script. It's built to run in production:

  • Multi-tenant — organization-based isolation with role-based access control
  • JWT auth — access + refresh tokens, bcrypt password hashing
  • Rate limiting — per-endpoint limits on auth, AI calls, uploads, and search
  • Audit logging — every upload, deletion, and analysis is logged with IP tracking
  • Command palette (Cmd+K) — spotlight-style navigation across the entire app
  • Analytics dashboard — search trends, top queries, activity timelines
  • Cache-first AI — analysis results cached in MongoDB so repeated queries don't re-hit the LLM
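
The cache-first idea from the last bullet can be sketched as follows. This is a Python illustration under assumptions: an in-memory dict stands in for the MongoDB collection, and the `AnalysisCache` class and its key scheme (analysis type plus a SHA-256 hash of the document text) are hypothetical.

```python
import hashlib


class AnalysisCache:
    """In-memory stand-in for the MongoDB collection that stores results."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def make_key(document_text, analysis_type):
        # Hash the content so an edited document gets a fresh analysis.
        digest = hashlib.sha256(document_text.encode("utf-8")).hexdigest()
        return f"{analysis_type}:{digest}"

    def get_or_compute(self, document_text, analysis_type, compute):
        """Return the cached result, calling `compute` only on a miss."""
        key = self.make_key(document_text, analysis_type)
        if key not in self._store:
            self._store[key] = compute(document_text)  # the only LLM call
        return self._store[key]
```

Here `compute` is whatever function calls the LLM. Asking for the same analysis of the same document twice triggers it once; a different analysis type or a changed document produces a new key.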

Run It, Break It, Build On It

The repo is live, the CI is green, and the code is MIT licensed:

github.com/LakshmiSravyaVedantham/legal-lens

If you want to contribute, here's what would make the biggest impact right now:

  • OCR for scanned PDFs — Tesseract or PaddleOCR integration
  • Hybrid search — BM25 + semantic for better recall
  • Jurisdiction-specific clause templates — GDPR, CCPA, HIPAA
  • Accessibility improvements — keyboard navigation, screen reader support
  • More document types — patent filings, regulatory submissions

The development setup is straightforward:

# Backend
python3 -m venv .venv && source .venv/bin/activate
pip install -r backend/requirements.txt
uvicorn backend.main:app --reload --port 8000

# Frontend (new terminal)
cd frontend && npm install && npm run dev

Full Swagger docs at http://localhost:8000/docs once you're running.


The Bottom Line

Legal AI shouldn't be a $1,200/month luxury. Document intelligence shouldn't mean surrendering privileged client files to a third-party server. And "AI-powered" shouldn't be a black box with no citations and no accountability.

LegalLens is the answer I built because I wanted it to exist. It's free, it's open, and it works.

Star the repo. Try it. Tell me what breaks.

github.com/LakshmiSravyaVedantham/legal-lens
