peter choe

Posted on • Originally published at lawmadi.com

I Built an AI Legal OS with 60 Specialized Agents and Real-time Statute Verification

When people in Korea face legal issues, they have three bad options: expensive lawyers ($75+ per session), unreliable internet searches, or AI chatbots that hallucinate laws that don't exist.

I built Lawmadi OS to fix this — an AI legal operating system with 60 domain-specialized agents that verify every answer against live government databases.

Live: lawmadi.com


The Problem with Legal AI

Ask ChatGPT about Korean labor law, and it will confidently cite "Article 27 of the Labor Standards Act" — except that article might not say what it claims, or might not exist at all. In the legal domain, hallucination isn't just annoying — it's dangerous. People make life-changing decisions based on legal information.

How Lawmadi OS Works

3-Layer NLU Routing

Instead of sending every query through an expensive LLM classification step, we use cascading routing:

This gives us low latency for most queries, high accuracy (264/264 test cases passing), and cost efficiency.
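A minimal sketch of what cascading routing can look like. The three layers shown here (an unambiguous keyword match, a cheap keyword-count score, and an LLM fallback) are assumptions for illustration, as are the `KEYWORDS` table, the `min_score` value, and the `llm_classify` callable; they are not the production rules.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical keyword table; the real routing rules are not shown in this post.
KEYWORDS = {
    "L09": ("dismissal", "unpaid wages", "해고", "임금"),
    "L08": ("lease", "deposit", "전세", "임대차"),
    "L10": ("traffic accident", "교통사고"),
}


@dataclass
class Route:
    agent_id: str
    layer: int  # which layer resolved the query (1, 2, or 3)


def keyword_layer(query: str) -> Optional[str]:
    """Layer 1: accept only when exactly one agent's keywords match."""
    q = query.lower()
    hits = [aid for aid, words in KEYWORDS.items() if any(w in q for w in words)]
    return hits[0] if len(hits) == 1 else None


def scoring_layer(query: str, min_score: int = 2) -> Optional[str]:
    """Layer 2: count keyword hits per agent and accept a clear winner."""
    q = query.lower()
    scores = {aid: sum(w in q for w in words) for aid, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    runner_up = sorted(scores.values(), reverse=True)[1]
    if scores[best] >= min_score and scores[best] > runner_up:
        return best
    return None


def route(query: str, llm_classify: Callable[[str], str]) -> Route:
    """Try the cheap layers first; call the LLM classifier only as a last resort."""
    for layer, fn in enumerate((keyword_layer, scoring_layer), start=1):
        agent_id = fn(query)
        if agent_id is not None:
            return Route(agent_id, layer)
    return Route(llm_classify(query), 3)
```

With this shape, a query such as "unfair dismissal" resolves in layer 1 without any model call, and only queries that both cheap layers decline reach the LLM.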

60 Specialized Agents

Each of the 60 agents specializes in a specific area of Korean law:

  • L09 담우 — Labor Law (unfair dismissal, unpaid wages)
  • L08 온유 — Lease/Rent Law (전세 deposits, tenant rights)
  • L03 담슬 — Divorce & Family Law
  • L10 결휘 — Traffic Accidents
  • L01 휘율 — Criminal Law
  • And 55 more covering tax, IP, immigration, inheritance, medical, military, environment, data privacy, startups, etc.

Why 60 agents instead of one generalist? Each agent has its own domain-tuned prompts, knowledge of the relevant statutes, and optimized response patterns, much like a law firm staffed with 60 specialists.
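One way to picture an agent is as a small record in a registry. The sketch below is illustrative only: the `Agent` fields, the placeholder prompts, and the statutes attached to each agent are assumptions, not the actual agent definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    agent_id: str
    name: str
    domain: str
    system_prompt: str         # domain-tuned prompt (placeholder text here)
    statutes: tuple[str, ...]  # statutes the agent is expected to know


# Two entries shown; the statute assignments are illustrative assumptions.
REGISTRY = {
    agent.agent_id: agent
    for agent in (
        Agent("L09", "담우", "Labor Law",
              "Placeholder prompt for labor law questions.", ("근로기준법",)),
        Agent("L08", "온유", "Lease/Rent Law",
              "Placeholder prompt for lease questions.", ("주택임대차보호법",)),
    )
}


def get_agent(agent_id: str) -> Agent:
    """Look up an agent by ID, failing loudly on unknown IDs."""
    try:
        return REGISTRY[agent_id]
    except KeyError:
        raise LookupError(f"no agent registered for {agent_id!r}") from None
```

Keeping each agent as data rather than code means adding agent 61 would be a new registry entry, not a new code path.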

4-Stage Verification Pipeline

This is the core architecture:

Stage 4 is what makes Lawmadi OS different. After Gemini generates a response, we:

  1. Extract all statute citations from the response
  2. Query Korea's official legislative database (법제처, law.go.kr) via the DRF API
  3. Verify — Does the law exist? Does the article number exist? Is the content accurate?
  4. Score — Generate a 0-100 verification score
  5. Decide — If score is below threshold, reject the entire response
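The five steps above can be sketched roughly as follows. This is a simplified illustration, not the production pipeline: the regex only handles Korean citations of the form "근로기준법 제23조", the `article_exists` callable stands in for the real DRF API lookup (whose request format is not shown here), the score is simply the share of citations that exist, and the threshold of 70 is an assumed value. Checking whether the cited content is accurate is left out.

```python
import re
from typing import Callable, NamedTuple


class Citation(NamedTuple):
    law: str
    article: int


# Matches citations such as "근로기준법 제23조": a law name ending in 법,
# followed by 제<number>조.
CITATION_RE = re.compile(r"([가-힣]+법)\s*제\s*(\d+)\s*조")


def extract_citations(text: str) -> list[Citation]:
    """Step 1: pull unique (law, article) pairs out of the response text."""
    found: list[Citation] = []
    for law, article in CITATION_RE.findall(text):
        citation = Citation(law, int(article))
        if citation not in found:
            found.append(citation)
    return found


def verification_score(citations: list[Citation],
                       article_exists: Callable[[Citation], bool]) -> int:
    """Steps 2-4: check each citation and return a 0-100 score."""
    if not citations:
        return 0  # assumption: a response with nothing to verify scores zero
    verified = sum(1 for c in citations if article_exists(c))
    return round(100 * verified / len(citations))


def accept(text: str, article_exists: Callable[[Citation], bool],
           threshold: int = 70) -> tuple[bool, int]:
    """Step 5: reject the whole response when the score is below threshold."""
    score = verification_score(extract_citations(text), article_exists)
    return score >= threshold, score
```

In a test, `article_exists` can be a lookup against a small in-memory set, which keeps the scoring logic checkable without network access.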

We cross-reference against 10 government data sources:

  • Statutes (법령)
  • Enforcement Decrees (시행령)
  • Enforcement Rules (시행규칙)
  • Court Precedents (판례)
  • Administrative Rules (행정규칙)
  • And 5 more

Fail-Closed Design

If the verification API is down, the system doesn't fall back to unverified responses. Instead:

  • Circuit breaker trips after consecutive failures
  • System enters fail-closed mode
  • All responses are held until verification is available
  • We'd rather give no answer than an unverified one
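A minimal sketch of a fail-closed circuit breaker along these lines. The class name, the `max_failures` and `cooldown_s` defaults, and the half-open retry after the cooldown are assumptions for illustration; the post does not specify the actual thresholds or recovery policy.

```python
import time
from typing import Any, Callable


class VerificationUnavailable(Exception):
    """Raised instead of ever returning an unverified answer."""


class FailClosedBreaker:
    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time the breaker tripped, or None when closed

    def call(self, verify: Callable[..., Any], *args: Any) -> Any:
        # While open and still cooling down, refuse without calling the API.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise VerificationUnavailable("circuit open, response held")
            # Cooldown elapsed: fall through and allow one trial call.
        try:
            result = verify(*args)
        except Exception as exc:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            # Fail closed: the caller never receives an unverified result.
            raise VerificationUnavailable("verification failed") from exc
        # Success closes the breaker and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The key property is that every failure path raises, so there is no branch in which a response leaves the system without having passed verification.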

The 5-Stage Empathy Framework

Legal issues are stressful. Every response follows this structure:

  1. Emotional acknowledgment — "This situation must be frustrating..."
  2. Situation diagnosis — Clear analysis of the legal issue
  3. Action roadmap — Specific steps with deadlines
  4. Safety net — Legal aid resources, hotlines, government services
  5. Supportive closing — Encouragement and next steps

Results After 1 Week

Metric                      Value
Unique Visitors             114
Queries Processed           481
Success Rate                99.6%
Avg Verification Score      84.7/100
Korean Citation Accuracy    82.5%
English Citation Accuracy   25.6% (improving)
Tests Passing               282/282
Avg Response Time           ~40s

Most popular domains: Labor law (90 queries), Housing/Lease (83), Divorce (50), Traffic accidents (48)

Tech Stack

Component      Technology
Backend        FastAPI 0.128.0 + Python 3.10+
LLM            Google Gemini 2.5 Flash
RAG            Vertex AI Search (14,601 docs)
Verification   법제처 DRF API (10 SSOT sources)
Database       Cloud SQL PostgreSQL 17
Hosting        GCP Cloud Run + Firebase
Billing        Paddle (credit-based)
CI/CD          GitHub Actions (5-stage pipeline)
Auth           JWT RBAC + Email OTP
Anti-abuse     IP + Canvas Fingerprint + Device Token

Pricing

  • Free: 2 queries/day (no account needed)
  • Starter: 20 queries — .50
  • Standard: 100 queries — .99
  • Pro: 300 queries — .99

Credit-based, no subscription. Powered by Paddle.

Challenges & Next Steps

  1. Latency: the ~40s average is too slow. Gemini generation (~30s) is the bottleneck. Exploring parallel RAG + prefetch.
  2. English citations: 25.6% accuracy vs 82.5% for Korean. English translations of Korean law names are not standardized.
  3. Scale: 60 system prompts to maintain. Considering automated prompt generation.
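For the latency item, one possible shape for "parallel RAG + prefetch" is to run retrieval and statute prefetching concurrently with `asyncio.gather`. The sketch below is an assumption about how that could look, not the current implementation: both coroutines are stand-ins that simulate I/O with `asyncio.sleep`, and the `EVENTS` list exists only to make the overlap visible. It shortens the steps before generation and does not touch the ~30s generation time itself.

```python
import asyncio

EVENTS: list[str] = []  # records start/end order so the overlap is visible


async def retrieve_docs(query: str) -> list[str]:
    """Stand-in for a RAG search call (hypothetical, simulated with sleep)."""
    EVENTS.append("start:rag")
    await asyncio.sleep(0.05)
    EVENTS.append("end:rag")
    return [f"doc about {query}"]


async def prefetch_statutes(query: str) -> dict[str, str]:
    """Stand-in for fetching likely statutes ahead of verification."""
    EVENTS.append("start:statutes")
    await asyncio.sleep(0.05)
    EVENTS.append("end:statutes")
    return {"근로기준법": "prefetched text placeholder"}


async def gather_context(query: str) -> dict:
    """Run both lookups concurrently instead of one after the other."""
    docs, statutes = await asyncio.gather(
        retrieve_docs(query), prefetch_statutes(query)
    )
    return {"docs": docs, "statutes": statutes}
```

Calling `asyncio.run(gather_context("unpaid wages"))` and then inspecting `EVENTS` shows both lookups starting before either one finishes, so the wait is roughly the slower of the two rather than their sum.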

Try It

I'd love to hear your thoughts, especially on:

  • Multi-agent specialization vs. single generalist approaches
  • Fail-closed verification in AI systems
  • Ideas for reducing response latency

Built by Jainam Choe — choepeter@outlook.kr

