DEV Community

andrii oliinyk

Structured Outputs vs Free-Form Summaries: Notes from an AI Regulatory Monitoring Build

Saw a case study from BN Digital on building an AI regulatory monitoring system and wanted to share the architectural takeaways, because they generalize beyond compliance to basically any LLM-in-production system.

The core problem

LLMs are great at producing fluent text. Fluent text is terrible as a programmatic interface. If your downstream system is a human reviewer with a checklist, a database, or another service, free-form summaries are the wrong output shape.

Three design choices worth stealing

1. Structured output schema instead of summaries

Typed fields with constrained values. Same input → same output shape, every time. Diff-able across runs. Validates with a normal schema validator. Doesn't require a second LLM call to "parse" the first one.

{
  "regulation_id": "...",
  "jurisdiction": "...",
  "change_type": "amendment | new | repeal",
  "affected_entities": [...],
  "effective_date": "...",
  "source_citation": "..."
}

Compare to "Here's a 4-paragraph summary of what changed" — same information, useless downstream.
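A shape like the one above is cheap to validate without any extra LLM call. Here's a minimal sketch of that check in plain Python — the field names come from the example schema, but the validator itself is a hypothetical stand-in for a real schema library like jsonschema or Pydantic:

```python
# Minimal structural check for the example schema above.
# Field names mirror the post's schema; the rest is illustrative.

REQUIRED_FIELDS = {
    "regulation_id": str,
    "jurisdiction": str,
    "change_type": str,
    "affected_entities": list,
    "effective_date": str,
    "source_citation": str,
}
CHANGE_TYPES = {"amendment", "new", "repeal"}  # the constrained enum

def validate(record: dict) -> list[str]:
    """Return a list of validation errors; empty list means valid."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for {name}")
    if record.get("change_type") not in CHANGE_TYPES:
        errors.append("change_type must be amendment | new | repeal")
    return errors
```

The point is the interface, not the fifteen lines: a downstream service can reject a malformed record deterministically, and two runs over the same input can be diffed field by field.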

2. Source filtering before the LLM step

Most "hallucination" in domain-specific work is a garbage-in problem. The model isn't inventing — it's pattern-matching to irrelevant context you gave it. Classical retrieval/filtering before the generation step cuts the surface area dramatically.
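Even a dumb keyword-overlap filter illustrates the idea — only documents that score above a threshold against your jurisdiction/topic terms ever reach the model. The keywords, threshold, and scoring here are all illustrative; in production you'd use BM25 or embedding retrieval, but the pipeline position is the same:

```python
# Sketch: classical relevance filtering BEFORE the generation step,
# so the LLM never sees off-topic context to pattern-match against.
# Scoring and threshold are illustrative stand-ins for real retrieval.

def relevance_score(doc: str, keywords: set[str]) -> float:
    """Fraction of the keyword set that appears in the document."""
    tokens = set(doc.lower().split())
    return len(tokens & keywords) / max(len(keywords), 1)

def filter_sources(docs: list[str], keywords: set[str],
                   threshold: float = 0.3) -> list[str]:
    """Keep only documents relevant enough to feed to the LLM."""
    return [d for d in docs if relevance_score(d, keywords) >= threshold]
```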

3. Human-in-the-loop as part of the type system

High-stakes outputs get a requires_review: true flag set by the model itself, with rules on what triggers it. Reviewer queue is a first-class part of the pipeline, not a thing bolted on after a compliance officer complains.
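In code, that means the router is typed and the trigger rules live in the pipeline, not in a reviewer's head. A sketch, assuming the `requires_review` flag from the post plus two made-up local trigger rules (the specific rules here are mine, not the case study's):

```python
# Sketch: review routing as a first-class pipeline stage.
# The model sets requires_review, but local rules it can't skip
# also force a record into the reviewer queue. Rules are illustrative.

from dataclasses import dataclass, field

@dataclass
class Queues:
    auto: list = field(default_factory=list)    # straight to storage
    review: list = field(default_factory=list)  # reviewer queue

def needs_review(record: dict) -> bool:
    return (
        record.get("requires_review", True)       # missing flag -> caution
        or record.get("change_type") == "repeal"  # example trigger rule
        or not record.get("source_citation")      # no citation -> review
    )

def route(record: dict, queues: Queues) -> None:
    (queues.review if needs_review(record) else queues.auto).append(record)
```

Defaulting a missing flag to "needs review" is the important design choice: the safe path is the one the system falls into when the model misbehaves.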

Why this matters beyond RegTech

The same pattern applies to any LLM system where outputs feed into other systems or workflows: medical decision support, financial analysis, legal drafting, infra automation. If your LLM output isn't typed, you're building a demo, not a system.

Full case study: https://bndigital.co/en-gb/cases/ai-regulatory-monitoring-system?utm_source=devto&utm_medium=backlink&utm_campaign=ai-reg

Curious if anyone here is using JSON schema enforcement (OpenAI structured outputs, Anthropic tool use, Outlines, etc.) in production and what's been brittle.
