If you've integrated an LLM by parsing its output with regex, you've likely experienced the moment when everything breaks.
The model updates, changes a single phrase, and suddenly your carefully crafted parser fails, routing urgent customer issues to the wrong department or missing critical data entirely.
This is not a theoretical problem. Consider this real scenario:
Day 1: Your customer support classifier works perfectly. A message like "You charged me twice! I want a refund NOW" produces: "The user is very upset about a duplicate charge. This is a billing issue. Sentiment is negative, and it seems urgent." Your regex const department = output.match(/billing issue/i) ? 'billing' : 'general';
routes it correctly.
Day 8: The same input now returns: "This is a payment problem. The customer was double-billed and is demanding a refund." Your parser fails. The ticket is misrouted. Your metrics tank.
The root cause is not the model. It is the approach. Unstructured text is designed for humans. Production applications need structured data. The durable solution is to make the model speak JSON and enforce the shape at generation time.
TL;DR: Enforce a JSON Schema at generation time, then validate with Zod at runtime. No regex. No drift. Production predictable. OpenAI and Azure document this schema enforcement explicitly for structured outputs. OpenAI Platform
Understanding Schema-Enforced Outputs
Most modern LLM APIs now support structured outputs. You provide a JSON Schema, and the model is constrained to produce output that matches that schema. This is not a prompt trick. It is an API level contract for production reliability. OpenAI calls this Structured Outputs and distinguishes it from older JSON mode. JSON mode guarantees valid JSON syntax. Structured Outputs enforces your schema, including required properties and enums. OpenAI Platform
Azure’s documentation mirrors this approach and lists the supported subset of JSON Schema. For example, you cannot use an anyOf
clause at the very top level of an Azure Policy rule. Design within the published subset for optimal adherence. Microsoft Learn
If you know GraphQL, the mental model is familiar. Define the shape you need up front, then receive exactly that shape. No parsing. No guesswork.
The Four Compounding Benefits
1. Zero parsing logic
Stop scraping paragraphs. The API constrains the model to your keys and types. The goal of structured outputs is to generate content that matches your schema. OpenAI Platform
2. End-to-end type safety
Define once with Zod, derive TypeScript types, and validate at runtime with .parse()
. TypeScript ends at compile time. Zod closes the runtime gap with precise errors. Zod
3. Operational reliability
Schema adherence reduces retries and format drift. Azure’s subset guidance exists to help you design shapes that adhere reliably in production. Microsoft Learn
4. Maintainable codebase
Every response has the same keys. Dashboards stabilize. Alerts are deterministic. Your AI integration becomes as predictable as any typed API.
Implementation: A Provider-Agnostic Pattern
The pattern below works with any API that accepts a response schema. We will use TypeScript and Zod as a single source of truth.
Step 1: Define your schema
Use Zod to define schema, types, and guidance in one place. The .describe()
annotations are not only documentation. They steer the model toward the right values.
typescript
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
export const SupportTicketSchema = z.object({
schemaVersion: z.literal("v1")
.describe("Schema version for analytics and migrations"),
language: z.string()
.describe("BCP-47 language code of the input, for example en-US"),
sentiment: z.enum(["positive", "neutral", "negative"])
.describe("Overall sentiment of the customer's message"),
department: z.enum([
"customer_support",
"online_ordering",
"product_quality",
"shipping_and_delivery",
"other_off_topic"
]).describe("Primary routing category for this ticket"),
priority: z.enum(["low", "medium", "high"])
.describe("Urgency level based on content and tone"),
confidence: z.number().min(0).max(1)
.describe("Model confidence in this classification from 0 to 1"),
suggestedReply: z.string().min(1).max(500).describe(
"Friendly, professional tone. 60 to 120 words. Acknowledge the issue. " +
"Give the next step and a contact. No emojis. No internal policy text. " +
"Write in the same language as the user."
)
});
export type SupportTicket = z.infer<typeof SupportTicketSchema>;
export const SupportJSONSchema = zodToJsonSchema(SupportTicketSchema);
Zod is TypeScript-first. .parse()
validates and returns a typed value. If the shape or values are wrong, it throws with clear messages. Zod
Step 2: Configure your LLM call
Pass the JSON Schema to the API using its structured output parameter. The names vary by provider, but the idea is the same.
typescript
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
export async function classifyMessage(message: string): Promise<SupportTicket> {
const completion = await openai.chat.completions.create({
model: "gpt-4o-2024-08-06",
messages: [
{
role: "system",
content: "Classify customer support messages and suggest appropriate replies. Return structured JSON only."
},
{ role: "user", content: message }
],
response_format: {
type: "json_schema",
json_schema: {
name: "support_ticket_classification",
schema: SupportJSONSchema
}
}
});
const content = completion.choices[0]?.message?.content ?? "";
let parsed: unknown;
try {
parsed = JSON.parse(content);
} catch (err) {
throw new Error("Model returned non JSON. Verify response_format and schema.", { cause: err });
}
return SupportTicketSchema.parse(parsed);
}
This is defense in depth. The API enforces your structure at generation. Zod enforces your business rules at runtime. OpenAI is explicit about schema enforcement in structured outputs, and Azure publishes the supported subset so you can design shapes that adhere well. OpenAI Platform
Step 3: Use typed data throughout your app
typescript
const result = await classifyMessage("Order #8841 is two weeks late!");
console.log(result.sentiment); // "negative"
console.log(result.department); // "shipping_and_delivery"
console.log(result.priority); // "high"
console.log(result.confidence); // 0.94
console.log(result.suggestedReply); // Ready to use
if (result.confidence >= 0.8) {
await sendAutomatedReply(result.suggestedReply);
} else {
await flagForHumanReview(result);
}
Reliability and Resilience
Add a minimal retry with backoff and a request timeout to harden your integration.
typescript
async function withRetry<T>(op: () => Promise<T>, tries = 3, baseMs = 300): Promise<T> {
let last: unknown;
for (let i = 1; i <= tries; i++) {
try { return await op(); }
catch (e) {
last = e;
if (i < tries) await new Promise(r => setTimeout(r, baseMs * i));
}
}
throw last;
}
// Usage
const safeResult = await withRetry(() => classifyMessage(msg));
If your HTTP client supports it, add an idempotency key and a request timeout.
Integrating With Modern Data Architectures
For GraphQL teams, treat the LLM as a typed upstream source.
graphql
type Mutation {
enrichLead(rawInquiry: String!): EnrichedLead!
}
type EnrichedLead {
companyName: String!
inquiryType: InquiryType!
urgency: Urgency!
potentialValue: DealSize!
}
typescript
const resolvers = {
Mutation: {
enrichLead: async (_: unknown, { rawInquiry }: { rawInquiry: string }) => {
const structured = await llmService.enrichWithSchema(rawInquiry);
return {
companyName: structured.companyName,
inquiryType: structured.inquiryType,
urgency: structured.urgency,
potentialValue: structured.potentialValue
};
}
}
};
The LLM is no longer a black box. It is a reliable transformer that converts free text into typed graph data.
Considering a Framework for Scale
While the direct pattern of using Zod with a provider's native schema is robust and very flexible, teams managing numerous LLM integrations or requiring provider independence may benefit from a higher level of abstraction.
Frameworks like BAML (Boundary AI Markup Language) are designed for this. BAML uses a dedicated language (.baml
files) to define, version, and test your LLM functions separately from your application code. This approach allows you to swap underlying LLM providers (e.g., from OpenAI to Anthropic) with a configuration change and provides a structured workflow for collaboration between engineers and prompt designers. Adopting a framework adds a toolchain to your project but can be invaluable for scaling and maintaining complex AI systems.
Streaming UX without breaking JSON
Streaming feels great in UI, but there is a common trap. Accumulate the stream in a buffer and parse once when complete. Do not parse partial JSON. If you use a framework, helpers like LangChain’s .withStructuredOutput()
abstract provider specifics and return typed objects while still allowing streaming. js.langchain.com
Real World Business Applications
Customer support triage
Classify sentiment, department, priority, and confidence. Preload a safe reply. Auto route high confidence results. Queue the rest for human review.
Lead enrichment
Transform free text into company name, intent, urgency, and potential value. Feed clean fields into your CRM.
Document processing
Take OCR output from PDFs or emails and transform it into database-ready records using a schema that matches your tables.
Production Best Practices
- Constrain any field you branch on with enums. Do not accept free text for routing or classification. Design within the published schema subset for better adherence. Microsoft Learn
- Add short, specific
.describe()
texts to fields. Providers call out descriptions as useful guidance for generation. OpenAI Platform - Keep schemas relatively flat. Deep nesting tends to hurt adherence. Start simple and add nesting only when it clearly helps. js.langchain.com
- Validate at runtime with Zod even when the API enforces your schema.
.parse()
yields safe objects or actionable errors. Zod - Include a confidence score. Gate automation by threshold.
- Version your schemas. Add
schemaVersion
and migrate intentionally. - For streaming, accumulate then parse. Never parse partial JSON.
- Consider a framework abstraction when you want provider-agnostic code. LangChain’s
.withStructuredOutput()
binds schemas and handles quirks for you.
Security and Safety Notes
- Delimit user input with clear markers, for example, triple quotes to reduce prompt injection risk inside your templates.
- Repeat key guard rails after dynamic text so the latest instructions reassert constraints.
- If tickets may contain PII, add
containsPii: boolean
and redact before logging or analytics. - Use confidence gating. Automate only above a threshold. Queue the rest.
A Minimal, Repeatable Micro-Benchmark
Invite your team to measure rather than guess.
- Task: classify 100 real tickets into your schema.
- Metrics: schema valid rate using Zod, average latency, tokens, and manual spot check accuracy.
- Settings: temperature 0, same prompt, same schema.
- Goal: prove that schema enforcement gives you a higher valid rate and fewer retries than prose parsing.
Complete Working Example
typescript
// npm install zod zod-to-json-schema openai
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
import OpenAI from "openai";
// 1. Define schema and types
const ProductReviewSchema = z.object({
rating: z.number().min(1).max(5)
.describe("Star rating from 1 to 5"),
pros: z.array(z.string())
.describe("List of positive aspects mentioned"),
cons: z.array(z.string())
.describe("List of negative aspects mentioned"),
wouldRecommend: z.boolean()
.describe("Whether the reviewer would recommend this product"),
summary: z.string().max(200)
.describe("Brief summary suitable for display in UI")
});
type ProductReview = z.infer<typeof ProductReviewSchema>;
const ReviewJSONSchema = zodToJsonSchema(ProductReviewSchema);
// 2. Create API function with robust parsing and validation
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function analyzeReview(reviewText: string): Promise<ProductReview> {
const completion = await openai.chat.completions.create({
model: "gpt-4o-2024-08-06",
messages: [
{
role: "system",
content: "Extract structured insights from product reviews. Return JSON only."
},
{ role: "user", content: reviewText }
],
response_format: {
type: "json_schema",
json_schema: {
name: "product_review_analysis",
schema: ReviewJSONSchema,
strict: true
}
}
});
const content = completion.choices[0]?.message?.content ?? "";
let data: unknown;
try {
data = JSON.parse(content);
} catch (err) {
throw new Error("Model returned non JSON. Verify response_format and schema.", { cause: err });
}
return ProductReviewSchema.parse(data);
}
// 3. Use it
const result = await analyzeReview(
"Battery life is incredible, easily lasts two days. The screen is crisp and bright. " +
"The camera sometimes hunts for focus in low light. Overall, very happy. 4/5 stars."
);
console.log(result);
// {
// rating: 4,
// pros: ["Long battery life", "High-quality display"],
// cons: ["Camera autofocus issues in low light"],
// wouldRecommend: true,
// summary: "Excellent battery and screen, minor camera issues"
// }
Moving Forward
Structured JSON output turns LLM integration from fragile text parsing into reliable, typed data processing. It reduces incidents, improves maintainability, and makes AI features feel native to your application.
Start with your most brittle endpoint. Replace prose parsing with schema enforcement plus Zod validation. Measure error rates and time to resolution for two weeks. Let the numbers guide your rollout.
References
OpenAI structured outputs guide and schema enforcement details.
open
Azure structured outputs guide and supported JSON Schema subset.
Microsoft Learn
Zod documentation on .parse()
and runtime validation for TypeScript.
Zod
LangChain JS concepts and how to return structured data with .withStructuredOutput()
.
js.langchain.com
Top comments (2)
This is an amazing method to follow. The decision to abandon regex for the purpose of structured outputs in production systems is a very wise one. going to experiment this with Zod for my coming project
Thank you! 🫡