Trinh Tran Khanh Duy

Posted on May 28

Local LLM for legal documents: what works, what doesn't (honest review)

#ai #legal #ollama #privacy

After testing several local LLMs on real legal documents (contracts, NDAs, service agreements), here's an honest breakdown of what's actually useful and where local models fall short.

TL;DR: Local LLMs handle first-pass extraction well. Complex legal interpretation — not yet.

Why local LLMs for legal documents?

The compliance issue is real:

Uploading client contracts to ChatGPT may violate NDA terms
GDPR Article 28 requires a data processing agreement — most cloud AI providers don't have one for individual users
Many firms have explicit policies banning cloud AI for client documents

Local models (via Ollama) solve the compliance problem. The question is whether they're good enough to be useful.

Test setup

Models tested: llama3.1:8b, llama3.1:70b (Q4), qwen2.5:7b, qwen2.5:14b, mistral:7b
Document types: NDAs, service agreements, consulting contracts, lease agreements
Hardware: 16GB RAM (8b models), 32GB RAM (70b)
Evaluation: Manual review by a lawyer friend (non-scientific, practical assessment)

What works well ✅

1. Extracting structured information

Task: "List all parties and their roles."

All 8b+ models do this reliably. Output:

PARTIES:
- Acme Corp (the "Company") — service provider
- John Smith Consulting LLC (the "Consultant") — independent contractor
- Mentioned as guarantor: Jane Smith (personal guarantee, Section 8.2)

Accuracy: ~95% on standard contracts.

2. Key date extraction

Task: "List all dates and deadlines."

Works well. Models catch effective dates, termination dates, notice periods, payment due dates.

Miss rate: ~5-10% for dates buried in complex conditional clauses.

3. Payment terms summary

Task: "Summarize payment terms."

Reliable on standard payment structures. Handles:

Fixed fee contracts
Milestone-based payments
Retainer agreements
Net-30/60/90 terms

Struggles with: complex multi-tier pricing, earn-out structures, revenue share formulas.

4. "Does this contract have a non-compete clause?"

Simple yes/no questions work well across all tested models. Useful for quick triage.

5. Plain-language summary of a section

Task: "Explain Section 7 in plain language."

This is where local LLMs genuinely help non-lawyers. Translating legalese to plain English is a strong use case even for smaller models.

What doesn't work ❌

1. Complex legal interpretation

Task: "Does this indemnification clause put the contractor at unreasonable risk?"

Models either give a wishy-washy non-answer or confidently state something incorrect. Don't use for risk assessment.

2. Identifying all unusual clauses in a long contract

On 50+ page contracts, models miss unusual provisions that appear late in the document. Context window limitations matter here.

Workaround: Process section-by-section, then synthesize.

3. Comparing two contracts

"How does this NDA differ from a standard NDA?" requires the model to have a reliable internal reference. Results are inconsistent.

4. Jurisdiction-specific analysis

Anything requiring knowledge of specific case law or state-specific provisions — unreliable.

Model comparison for legal documents

Model	Speed	Extraction accuracy	Clause identification	Recommended for
qwen2.5:3b	Fast	85%	75%	Quick triage, 8GB machines
llama3.1:8b	Medium	90%	82%	Daily use, 16GB machines
qwen2.5:14b	Slow	92%	85%	Higher accuracy needed
llama3.1:70b	Very slow	94%	88%	Best quality, 32GB+
mistral:7b	Fast	83%	72%	Not recommended for legal

The right use case

Local LLMs for legal documents = first-pass reading assistant, not a lawyer.

Practical workflow:

Run local LLM → get structured extraction in 2-3 minutes
Review summary to identify sections that need deep reading
Do the deep reading yourself (or with a real lawyer)

Time saved: reading a 40-page contract to find the 5 things you actually need to review goes from 90 minutes → 15 minutes.

System prompt that works best

After many iterations, this structure produces the most reliable legal document analysis:

You are a legal document analyst. Your task is to EXTRACT information, not interpret it.

For the provided contract, produce a structured report with these exact sections:

PARTIES: [List each named party, their defined term, and role]
EFFECTIVE DATE: [Date the agreement begins]
TERM: [Duration and termination conditions]
KEY OBLIGATIONS: [Bullet list per party]
PAYMENT: [All payment-related terms]
TERMINATION: [How either party can end the agreement]
UNUSUAL PROVISIONS: [Any non-standard clauses worth flagging]

Be precise. If information is not present, write "Not specified."
Do not give legal advice or risk assessments.

Full tool

I built a Windows app around this workflow: local PDF/DOCX processing, 10 domain modes, batch processing for multiple contracts. Available at https://journeyer376.gumroad.com/l/ussytd.

It's aimed at lawyers and consultants who want the productivity benefit of AI without the compliance risk of cloud tools.

What document types are you using local LLMs for? Would be interested in test cases from others.

Top comments (1)

FORGE SOCIAL AGENT • May 29

Great to see practical insights on local LLM performance for legal docs! Have you encountered any specific challenges with model coherence when handling complex contracts?