TL;DR: LLMs are great at extracting data but unreliable for production documents. I combined structured JSON schemas, domain-specific validation rules, and human-in-the-loop approval into a pipeline designed to catch errors before they reach a customer.
The Three-Layer Architecture
Instead of treating AI as a black box that generates finished documents, I built a pipeline with three independent layers. Each layer solves one problem and expects the previous layer to fail sometimes.
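As a rough illustration of "independent layers that expect failure", here is a minimal sketch of the control flow. The function names, the stage signature, and the two stand-in stages are hypothetical and exist only for this example; they are not taken from the actual pipeline.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stage signature: take the current state, return (new_state, error).
# An error of None means the layer passed.
Stage = Callable[[dict], Tuple[dict, Optional[str]]]


def run_pipeline(raw: str, stages: list) -> dict:
    """Run each layer in order and stop at the first one that reports an error."""
    state = {"raw": raw}
    for name, stage in stages:
        state, error = stage(state)
        if error is not None:
            return {"status": "rejected", "failed_layer": name, "reason": error}
    return {"status": "approved", "document": state}


def extract_stub(state: dict) -> Tuple[dict, Optional[str]]:
    # Stand-in for Layer 1; a real version would call the LLM and check the schema.
    if not state["raw"].strip():
        return state, "empty input"
    return {**state, "fields": {"total": state["raw"].strip()}}, None


def rules_stub(state: dict) -> Tuple[dict, Optional[str]]:
    # Stand-in for Layer 2; a real version would run the domain rules.
    if not state["fields"]["total"].isdigit():
        return state, "total is not a number"
    return state, None


STAGES = [("extract", extract_stub), ("rules", rules_stub)]
```

Because every stage has the same shape, a layer can be swapped out or tested on its own without touching the others.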
Layer 1: Structured Extraction
The first insight: abandon free-text generation entirely. Instead, define exactly what data you need using a strict JSON schema. The LLM fills this schema from the raw input. If it produces something that does not match, the pipeline rejects it immediately.
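A minimal sketch of that reject-on-mismatch step, using only the standard library. The invoice fields (invoice_number, vendor_tax_id, line_items) and the names parse_invoice and SchemaError are assumptions made for illustration; in practice a JSON Schema validator or a typed-model library would likely do this job.

```python
import json
from dataclasses import dataclass


class SchemaError(ValueError):
    """Raised when the LLM output does not match the expected structure."""


@dataclass(frozen=True)
class LineItem:
    description: str
    quantity: int
    unit_price: float


@dataclass(frozen=True)
class Invoice:
    invoice_number: str
    vendor_tax_id: str
    line_items: list


def _require(obj, key, expected_type):
    """Return obj[key] if it is present and has the expected type, else raise."""
    if not isinstance(obj, dict) or key not in obj:
        raise SchemaError(f"missing field: {key}")
    value = obj[key]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected_type):
        raise SchemaError(f"wrong type for {key}: {type(value).__name__}")
    return value


def parse_invoice(raw: str) -> Invoice:
    """Parse raw LLM output and reject anything that does not fit the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaError(f"not valid JSON: {exc}") from exc
    items = [
        LineItem(
            description=_require(item, "description", str),
            quantity=_require(item, "quantity", int),
            unit_price=float(_require(item, "unit_price", (int, float))),
        )
        for item in _require(data, "line_items", list)
    ]
    return Invoice(
        invoice_number=_require(data, "invoice_number", str),
        vendor_tax_id=_require(data, "vendor_tax_id", str),
        line_items=items,
    )
```

The point of the sketch is the failure mode: anything that is not valid JSON, or is missing a field, or has the wrong type raises SchemaError instead of flowing further down the pipeline.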
Layer 2: Domain Rule Validation
Valid JSON is not enough. A line item with quantity 0 is nonsense. A tax ID from the wrong country is a compliance issue. Layer 2 runs deterministic business rules: plain Python functions written once per document type.
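The two examples above could look roughly like this as plain Python functions. The field names, the vendor_country field, and the tax ID patterns are assumptions for illustration; the patterns are simplified format checks only and do not verify that a tax ID is actually registered.

```python
import re

# Assumed, simplified format checks: a German VAT ID is "DE" plus nine digits,
# a French one is "FR" plus two characters and nine digits.
TAX_ID_PATTERNS = {
    "DE": re.compile(r"DE\d{9}"),
    "FR": re.compile(r"FR[0-9A-Z]{2}\d{9}"),
}


def rule_positive_quantities(invoice: dict) -> list:
    """Every line item must have a quantity greater than zero."""
    return [
        f"line {i}: quantity must be > 0, got {item['quantity']}"
        for i, item in enumerate(invoice["line_items"], start=1)
        if item["quantity"] <= 0
    ]


def rule_tax_id_matches_country(invoice: dict) -> list:
    """The tax ID must follow the format of the vendor's country."""
    country = invoice["vendor_country"]
    pattern = TAX_ID_PATTERNS.get(country)
    if pattern is None:
        return [f"no tax ID rule defined for country {country}"]
    if not pattern.fullmatch(invoice["vendor_tax_id"]):
        return [f"tax ID {invoice['vendor_tax_id']} does not match country {country}"]
    return []


# One list of rules per document type.
INVOICE_RULES = [rule_positive_quantities, rule_tax_id_matches_country]


def validate(invoice: dict, rules=INVOICE_RULES) -> list:
    """Run every rule and collect all violations instead of stopping at the first."""
    errors = []
    for rule in rules:
        errors.extend(rule(invoice))
    return errors
```

Collecting all violations at once, rather than failing on the first, gives whoever handles the rejection a complete picture of what is wrong with the document.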
Layer 3: Human Review
Even with schemas and rules, some errors are semantic: a value can be well-formed and pass every rule yet still be wrong. I use Facio, a human-in-the-loop (HITL) agent runtime, to pause the pipeline and present low-confidence extractions to a human. The reviewer sees the original scan alongside the extracted data, with low-confidence fields highlighted.
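The routing decision itself can be sketched independently of any runtime. The snippet below does not use Facio's API; the ExtractedField type, the route function, and the 0.9 threshold are all assumptions chosen for illustration, and it presumes the extraction step supplies a per-field confidence score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to be a score between 0.0 and 1.0


# Assumed cutoff; a real value would be tuned per document type.
REVIEW_THRESHOLD = 0.9


def fields_needing_review(fields, threshold=REVIEW_THRESHOLD):
    """Return the fields a reviewer should see highlighted."""
    return [f for f in fields if f.confidence < threshold]


def route(fields, threshold=REVIEW_THRESHOLD):
    """Decide whether the document pauses for a human or continues automatically."""
    flagged = fields_needing_review(fields, threshold)
    if flagged:
        return "human_review", flagged
    return "auto_approve", []
```

For example, a document whose total was read with confidence 0.62 would be routed to "human_review" with only that field flagged, while the remaining fields stay unhighlighted.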
Why Three Layers Beat One Smart Model
The core insight: none of these layers trusts the AI to be perfect. They expect mistakes and catch them systematically. The alternative, writing one highly accurate prompt or fine-tuning a model, is brittle: when that single component is wrong, nothing downstream catches it.
The Bottom Line
AI agents work best when they know their limits. If you are building document automation, stop optimizing your prompt. Start defining your schema. Then add validation rules. Then plug in a human review step for the uncertain cases.
I build AI automation tools at centerbit. If document automation or HITL workflows interest you, more at centerbit.co.