Utkarsh Malaiya
I Created an AI Assistant That Reads the Fine Print for You

Legal documents are long and full of complex jargon, hidden clauses, and cross-references. They feel like they were written in a different language — one I can’t seem to understand without a lawyer. These documents are designed to make me glaze over, sign them quickly, and just hope everything works out in the end.

Well, I didn't like that. So I built an AI agent that reads these legal documents for you and helps you understand them easily. It decodes the legalese into plain English. Using it is simple: drop your contracts, agreements, or policies into a folder, point the assistant at that folder, and ask it questions in plain English. It gives back clear answers, grounded in the actual text. It is fast, reliable, and accurate.

How did I do this? I built a simple RAG (retrieval-augmented generation) pipeline, using LLMWare for document management and Groq for super-fast chat completion.


Check out the full project here: Project Repo


How is this different from a simple chatbot?

The biggest problem with using AI on legal text isn’t that models can’t generate text. It’s that they generate text without looking at the right context. If you just throw a contract at a model and ask “What are my obligations?”, it will happily make things up.

The key was to:

  1. Store and index documents in a way the AI could understand.
  2. Retrieve only the most relevant sections.
  3. Feed those into a model that could explain them clearly and quickly.

That’s exactly the workflow I built.


How does this all fit together?

Here’s the high-level pipeline:

  1. Document ingestion & embedding
  2. Semantic retrieval
  3. Context grounding
  4. Answer generation

That’s where LLMWare and Groq come in — they solve steps 1–4 in a clean, modular way.

Turning Contracts into a Searchable Library with LLMWare

For ease of development, I used LLMWare to ingest and embed my documents, and to semantically retrieve the chunks most relevant to a user's query.

Here’s what happens when you add your docs:

  • Documents get chunked into smaller sections.
  • Each chunk is embedded with mini-lm-sbert and stored in FAISS (a fast vector similarity search library).
  • You can query the library semantically — not just “find the word X,” but “find the section that means this.”

So when you ask, “What’s the early termination clause?”, LLMWare doesn’t search by keywords. It finds the most semantically relevant chunks. It can pull the termination clause, even if the exact word “terminate” doesn’t appear.
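Concretely, the ingestion-and-retrieval flow looks roughly like this. It's a minimal sketch: the library name and folder path are placeholders I made up, while mini-lm-sbert and FAISS are the choices described above.

```python
from llmware.library import Library
from llmware.retrieval import Query

# Create a library and pull in every document from a folder (path is a placeholder)
lib = Library().create_new_library("legal_docs")
lib.add_files(input_folder_path="/path/to/contracts")

# Chunk + embed each passage with mini-lm-sbert, storing the vectors in FAISS
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="faiss")

# Semantic retrieval: match by meaning, not by keyword
results = Query(lib).semantic_query("early termination clause", result_count=5)
for r in results:
    print(r["text"][:200])
```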

Now I had a way to ground the AI in the actual fine print.

Response generation using Groq

Once I have the right context from retrieval, I send it to Groq for ridiculously fast response generation using Meta's Llama 3 model (you can choose a different model too).

The flow looks like this:

  1. User asks a question.
  2. LLMWare finds relevant passages in your docs, and I send that context along with the user's query to Groq.
  3. Groq runs Llama 3 on top of that context.
  4. You get a grounded, plain-English response like:

    “If you terminate early, you must pay two months’ rent as a penalty (see section 9.2).”
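Under the hood, the generation step is a thin wrapper around Groq's chat completions API. Here's a minimal sketch under my own assumptions: the answer helper and the system prompt are mine, and the model id is one of Groq's Llama 3 options; swap in whichever model you prefer.

```python
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

def answer(question: str, context: str) -> str:
    """Send retrieved contract excerpts plus the user's question to Groq."""
    completion = client.chat.completions.create(
        model="llama3-8b-8192",  # one Llama 3 option on Groq; any available model works
        messages=[
            {"role": "system",
             "content": "Answer strictly from the contract excerpts provided. "
                        "Cite the clause or section number you relied on."},
            {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content
```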

Because Groq is fast, the user experience feels seamless. You ask, you get an answer — almost instantly.
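Wiring the two halves together takes only a few lines. This sketch reuses lib and answer from the snippets above; the question is just an example.

```python
question = "What is the early termination clause?"

# Retrieve the most relevant chunks, then hand them to the model as context
hits = Query(lib).semantic_query(question, result_count=5)
context = "\n\n".join(h["text"] for h in hits)

print(answer(question, context))
```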


Challenges I faced

When you build something like this, you’ll hit bumps. Here are a few I faced:

  • Chunking granularity: Too big, and you lose focus; too small, and context is broken.
  • Prompt length limits: If retrieved context is too long, the model may truncate or lose coherence.
  • Overlapping context: Sometimes two retrieved chunks repeat the same clause. You need logic to filter duplicates (see the sketch after this list).
  • Edge cases: Contracts with archaic wording or weird formatting can confuse embedding models.
  • Legal disclaimers: This tool is an assistant, not a lawyer — always encourage users to consult a professional.
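For the overlapping-context problem, even a crude normalization pass goes a long way. This is a hypothetical helper, assuming each retrieved chunk is a dict with a "text" field like LLMWare's query results:

```python
def dedupe_chunks(chunks):
    """Drop chunks whose text is a near-verbatim repeat of an earlier one."""
    seen, unique = set(), []
    for chunk in chunks:
        # Normalize whitespace and case so trivially different copies collide
        key = " ".join(chunk["text"].split()).lower()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique
```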

But the modular design (LLMWare for retrieval, Groq for generation) made it easier to iterate. If I ever want to swap out Groq’s model or replace LLMWare’s backend, it's doable.


By now, you might think: “Why talk so much about LLMWare? Isn’t it just another library?” But that’s the point — its clean API lets me focus on what I’m building (the legal assistant), not how to build embeddings or query infrastructure from scratch.

Meanwhile, Groq does the heavy lifting for inference. The fact that they integrate cleanly and let me orchestrate the flow is what makes the tool feel polished.


What You Can Do with This Assistant Today

  • Upload NDAs, leases, terms & conditions, service contracts, employment agreements, etc.
  • Ask natural questions: “Can I sublet?”, “What’s the penalty for late payment?”, “How do I exit early?”
  • Get immediate, grounded answers with clause references.
  • Use it as a safety net before committing to anything legal.

Closing Thoughts

Legal documents aren’t going anywhere, but the way we understand them can.

Building this project taught me something: AI isn’t magic unless the plumbing is solid. Retrieval, grounding, and speed are what turn a “nice experiment” into a truly useful tool.

I didn’t build this to replace lawyers or make legal advice free; I built it so people like me could stop feeling lost in their own contracts.

With this assistant, I finally feel like I can read what I’m signing without needing a degree in legalese. It’s still a work in progress, but it’s a small step toward making the fine print a little less fine — and a lot more human.
