Jangwook Kim

Posted on Jun 2 • Originally published at jangwook.net

TypeScript Zod v4 + Claude API: A Complete Guide to Type-Safe LLM Response Parsing

#typescript #zod #claudeapi

I once trusted a raw JSON.parse() call on a Claude API response and got burned by a runtime error. When you pull content[0].text and parse it, there's no guarantee the resulting object has the fields you expect. LLMs ignore prompts, quietly rename fields, or mix types. Zod v4 catches that at the type level before it ever reaches your business logic.

This article covers practical patterns for safely parsing Claude API responses, tested against Zod 4.4.3 and @anthropic-ai/sdk 0.100.1. I ran a 100,000-iteration parse benchmark myself and checked the v3 API changes against actual behavior in code.

What Actually Changed Between Zod v3 and v4

The headline numbers are impressive: string parsing 14x faster, arrays 7x, objects 6.5x. Bundle size down 57%. TypeScript instantiation reduced up to 100x. That said, you don't need to migrate immediately just because the numbers look good.

After hands-on use, three changes are the ones you actually feel.

First, error messages are more readable. The old pattern of passing separate required_error and invalid_type_error options is replaced by a single error parameter. Default message formats also changed. What was "String must contain at least 1 character(s)" in v3 is now "Too small: expected string to have >=1 characters" in v4. If any of your tests do string comparisons on Zod error messages, they will break.

Second, number validation is stricter. Infinity and -Infinity used to pass z.number() in v3. In v4, they return success: false. Integers exceeding Number.MAX_SAFE_INTEGER are also rejected by z.number().int(). Worth noting if your code might receive extreme values from external APIs or LLM responses.

Third, the API surface got cleaner. The v4 style is z.email() instead of z.string().email(). Use z.intersection(A, B) over .and(). And there's a new .check() method for inline custom validation.

Honest caveat: v4 is not always faster than v3. Community benchmarks show a handful of deeply nested schema scenarios where v3 is actually quicker. The headline numbers reflect typical patterns, not every workload.

APIs Officially Removed (But Still There)

The migration docs say .and() was removed. In practice, testing against 4.4.3, it exists and works fine. The documentation appears to have gotten ahead of the actual release. Same story with required_error — it technically still works, but the message format changed. These look more like quiet deprecations than hard removals.

When planning a migration, verify against the actual version you're running rather than taking the docs at face value.

Installation and Basic Setup

npm install zod@^4.4.3
npm install @anthropic-ai/sdk@^0.100.1

For a TypeScript project, strict: true in tsconfig.json is required for Zod's type inference to work properly.

{
  "compilerOptions": {
    "strict": true,
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "bundler"
  }
}

Here's the minimal check to confirm things work after installation:

import { z } from 'zod';

const UserSchema = z.object({
  name: z.string().min(1),
  email: z.email(),         // v4 style: replaces z.string().email()
  age: z.number().int().min(0).max(150),
  role: z.enum(['admin', 'user', 'viewer'])
});

type User = z.infer<typeof UserSchema>;

const result = UserSchema.safeParse({
  name: 'Jangwook',
  email: 'kim.jangwook@example.com',
  age: 30,
  role: 'admin'
});

if (result.success) {
  console.log(result.data.name); // type: string
} else {
  console.log(result.error.issues);
}

Output is exactly what you'd expect:

success: true
parsed data: {"name":"Jangwook","email":"kim.jangwook@example.com","age":30,"role":"admin"}

@zod/mini Is a Separate Package

The release announcement includes @zod/mini, a tree-shakeable build at roughly 1.9KB gzip. Useful if you care about frontend bundle size. The API surface is different from the main zod package, though. Since this article focuses on server-side Claude API integration, everything here uses the main package.

Designing Schemas for LLM Responses

Schemas for LLM responses need a different design philosophy than schemas for form data. The key difference is defensive handling of optional fields.

An LLM may not return every field you asked for. Response quality is variable, and prompt changes can shift the structure. Your schema should reflect that reality.

A Basic LLM Response Schema

import { z } from 'zod';

// Schema for blog post analysis response
const BlogAnalysisSchema = z.object({
  title: z.string().min(1).max(200),
  summary: z.string().min(10),
  tags: z.array(z.string()).min(1).max(10),
  sentiment: z.enum(['positive', 'neutral', 'negative']),
  readingTimeMinutes: z.number().int().min(1).max(60),
  // Fields the LLM may not always return
  seoScore: z.number().min(0).max(1).optional(),
  suggestedImprovements: z.array(z.string()).optional()
});

type BlogAnalysis = z.infer<typeof BlogAnalysisSchema>;

Nested Schemas with Metadata

Sometimes you want metadata about the response itself alongside the actual content — a confidence score, model info, that kind of thing.

const LLMResponseSchema = z.object({
  // Actual content
  content: z.object({
    title: z.string().min(1),
    tags: z.array(z.string()),
    body: z.string()
  }),
  // Response metadata (optional)
  metadata: z.object({
    model: z.string(),
    confidence: z.number().min(0).max(1),
    processingTimeMs: z.number().int().positive()
  }).optional()
});

In my tests, nested objects with .optional() behaved as expected. Parsing succeeds even when metadata is absent.

LLM response (with metadata) success: true
title: Zod v4: A Deep Dive into Schema Validation
confidence: 0.92
LLM response (no metadata) success: true

Using z.string().check() to Validate LLM Response Format

The new .check() API in v4 is genuinely useful when an LLM is supposed to follow a specific format or prefix convention.

// LLM responses must always start with "RESULT:"
const LLMResultSchema = z.string().check((ctx) => {
  if (!ctx.value.startsWith('RESULT:')) {
    ctx.issues.push({
      code: 'custom',
      message: 'LLM response must start with "RESULT:"',
      input: ctx.value
    });
  }
});

const valid = LLMResultSchema.safeParse('RESULT: analysis complete');
const invalid = LLMResultSchema.safeParse('analysis complete');

console.log(valid.success);   // true
console.log(invalid.success); // false

One rough edge worth knowing: TypeScript autocomplete inside the .check() callback is thin. The issue object you push into ctx.issues needs code: 'custom', message, and input, but the editor won't hint these fields reliably. It's an easy place to make a typo the first time through.

Parsing Claude API Responses with Zod

Pattern 1: Prompt for JSON, Then Parse

The simplest approach. Specify JSON format in the system prompt, extract the response text, run it through JSON.parse(), then validate with Zod.

import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

// Define the expected response structure
const ArticleAnalysisSchema = z.object({
  title: z.string().min(1),
  mainTopics: z.array(z.string()).min(1).max(5),
  difficulty: z.enum(['beginner', 'intermediate', 'advanced']),
  estimatedReadTime: z.number().int().positive(),
  hasCodeExamples: z.boolean()
});

type ArticleAnalysis = z.infer<typeof ArticleAnalysisSchema>;

async function analyzeArticle(content: string): Promise<ArticleAnalysis> {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    system: `You are a technical document analyzer.
Respond only with JSON in this exact format:
{
  "title": "document title",
  "mainTopics": ["topic1", "topic2"],
  "difficulty": "beginner" | "intermediate" | "advanced",
  "estimatedReadTime": number (minutes),
  "hasCodeExamples": true | false
}
Do not include any text outside the JSON.`,
    messages: [
      { role: 'user', content: `Analyze the following document:\n\n${content}` }
    ]
  });

  // Extract the text content
  const textContent = response.content.find(block => block.type === 'text');
  if (!textContent || textContent.type !== 'text') {
    throw new Error('No text response received');
  }

  // Parse JSON
  let parsed: unknown;
  try {
    parsed = JSON.parse(textContent.text);
  } catch {
    throw new Error(`JSON parse failed: ${textContent.text}`);
  }

  // Zod validation
  const result = ArticleAnalysisSchema.safeParse(parsed);
  if (!result.success) {
    const errorSummary = result.error.issues
      .map(issue => `${issue.path.join('.')}: ${issue.message}`)
      .join(', ');
    throw new Error(`Schema validation failed: ${errorSummary}`);
  }

  return result.data;
}

The weak point here is that when the LLM wraps its JSON in markdown code fences or adds explanation text, JSON.parse() fails. You need a bit of defensive extraction:

function extractJsonFromResponse(text: string): string {
  // Extract JSON from ```
{% endraw %}
json ...
{% raw %}
 ``` blocks
  const codeBlockMatch = text.match(/```
{% endraw %}
(?:json)?\s*([\s\S]*?)\s*
{% raw %}
```/);
  if (codeBlockMatch) {
    return codeBlockMatch[1];
  }

  // Extract anything wrapped in curly braces
  const jsonMatch = text.match(/\{[\s\S]*\}/);
  if (jsonMatch) {
    return jsonMatch[0];
  }

  return text;
}

Pattern 2: Force Structured Output via Tool Use

As covered in Claude Agent SDK Tool Use Complete Guide, using tool_use lets you enforce JSON structure. The LLM "calls" a tool and returns structured data as the tool input.

import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

// Zod schema for the tool's input
const ArticleMetadataSchema = z.object({
  title: z.string().describe('The article\'s core title'),
  tags: z.array(z.string()).describe('List of relevant tags (up to 5)'),
  confidence: z.number().min(0).max(1).describe('Analysis confidence (0-1)')
});

// Tool definition in Anthropic format
// (written manually here, without zodToJsonSchema)
const extractMetadataTool: Anthropic.Messages.Tool = {
  name: 'extract_metadata',
  description: 'Extract metadata from a document',
  input_schema: {
    type: 'object',
    properties: {
      title: {
        type: 'string',
        description: 'The article\'s core title'
      },
      tags: {
        type: 'array',
        items: { type: 'string' },
        description: 'List of relevant tags (up to 5)'
      },
      confidence: {
        type: 'number',
        minimum: 0,
        maximum: 1,
        description: 'Analysis confidence (0-1)'
      }
    },
    required: ['title', 'tags', 'confidence']
  }
};

async function extractMetadata(content: string) {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    tools: [extractMetadataTool],
    tool_choice: { type: 'auto' },
    messages: [
      {
        role: 'user',
        content: `Extract metadata from the following:\n\n${content}`
      }
    ]
  });

  // Find the tool_use block
  const toolUseBlock = response.content.find(
    block => block.type === 'tool_use' && block.name === 'extract_metadata'
  );

  if (!toolUseBlock || toolUseBlock.type !== 'tool_use') {
    throw new Error('Tool was not called');
  }

  // tool_use input is unknown — validate with Zod
  const result = ArticleMetadataSchema.safeParse(toolUseBlock.input);

  if (!result.success) {
    throw new Error(
      `tool_use input validation failed: ${JSON.stringify(result.error.format())}`
    );
  }

  return result.data;
}

Tool Use is more reliable than Pattern 1 for a clear reason. Claude structures its JSON directly into the tool input field. There's no room for markdown fences or stray explanatory text. The SDK handles JSON parsing internally, so you don't need to catch JSON.parse() failures separately.

That said, never skip Zod validation even with Tool Use. toolUseBlock.input is typed as unknown. If Claude returns an unexpected type, the error hides until runtime.

Production Error Handling Patterns

LLM response parsing fails at two distinct layers: JSON parsing and Zod schema validation. Distinguishing between them makes debugging much faster.

Separating Error Layers

type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; stage: 'json' | 'schema'; error: string; raw?: string };

function parseLLMResponse<T>(
  text: string,
  schema: z.ZodType<T>
): ParseResult<T> {
  // Layer 1: JSON parsing
  let parsed: unknown;
  try {
    const jsonText = extractJsonFromResponse(text);
    parsed = JSON.parse(jsonText);
  } catch (err) {
    return {
      success: false,
      stage: 'json',
      error: err instanceof Error ? err.message : String(err),
      raw: text
    };
  }

  // Layer 2: Zod schema validation
  const result = schema.safeParse(parsed);
  if (!result.success) {
    return {
      success: false,
      stage: 'schema',
      error: formatZodError(result.error),
      raw: text
    };
  }

  return { success: true, data: result.data };
}

function formatZodError(error: z.ZodError): string {
  return error.issues
    .map(issue => {
      const path = issue.path.length > 0
        ? `[${issue.path.join('.')}]`
        : '[root]';
      return `${path} ${issue.message}`;
    })
    .join('; ');
}

Structured Errors with error.format()

error.format() is still available in v4, returning errors organized by field.

const result = BlogAnalysisSchema.safeParse(badData);

if (!result.success) {
  const formatted = result.error.format();
  // Example output:
  // {
  //   _errors: [],
  //   title: { _errors: ['Too small: expected string to have >=1 characters'] },
  //   tags: { _errors: ['Too small: expected array to have >=1 items'] }
  // }

  // Pull errors for a specific field
  const titleErrors = formatted.title?._errors ?? [];
  const tagsErrors = formatted.tags?._errors ?? [];
}

When you need per-field structure for client responses or logs, error.format() is clean. For a flat list of issues, error.issues directly is simpler.

Retry Logic with Feedback

When parsing fails, you can retry with the error message injected back into the prompt so the LLM can self-correct.

async function analyzeWithRetry(
  content: string,
  schema: z.ZodType<unknown>,
  maxRetries = 2
): Promise<unknown> {
  let lastError = '';

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const systemPrompt = attempt === 0
      ? BASE_SYSTEM_PROMPT
      : `${BASE_SYSTEM_PROMPT}\n\nThe previous response caused this error: ${lastError}\nRespond only with the JSON format specified.`;

    const response = await client.messages.create({
      model: 'claude-sonnet-4-6',
      max_tokens: 1024,
      system: systemPrompt,
      messages: [{ role: 'user', content }]
    });

    const textBlock = response.content.find(b => b.type === 'text');
    if (!textBlock || textBlock.type !== 'text') continue;

    const parseResult = parseLLMResponse(textBlock.text, schema);
    if (parseResult.success) return parseResult.data;

    lastError = parseResult.error;
    console.warn(`Attempt ${attempt + 1} failed: ${lastError}`);
  }

  throw new Error(`Parse failed after ${maxRetries + 1} attempts: ${lastError}`);
}

Keep retries at 2 or fewer. API costs add up quickly.

Performance: What Zod v4 Speed Actually Looks Like

I ran this on Apple Silicon with a 4-field object schema, 100,000 safeParse() iterations:

const UserSchema = z.object({
  name: z.string().min(1),
  email: z.email(),
  age: z.number().int().min(0).max(150),
  role: z.enum(['admin', 'user', 'viewer'])
});

const testData = {
  name: 'Jangwook',
  email: 'kim.jangwook@example.com',
  age: 30,
  role: 'admin'
};

const iterations = 100_000;
const start = performance.now();

for (let i = 0; i < iterations; i++) {
  UserSchema.safeParse(testData);
}

const duration = performance.now() - start;
const parsesPerSecond = Math.round(iterations / (duration / 1000));
console.log(`duration: ${duration.toFixed(2)}ms`);
console.log(`parses/second: ${parsesPerSecond.toLocaleString()}`);

Results:

iterations: 100,000
duration: 45.78ms
parses/second: 2,184,481

2.18 million parses per second. That's overkill for Claude API response handling. The API call itself takes hundreds of milliseconds to seconds — Zod parsing will never be your bottleneck.

Where the speed matters is batch processing. If you're running Zod validation across millions of log entries or event records, v4's throughput improvement is genuinely noticeable. For LLM response parsing alone, the performance case for migrating from v3 to v4 is weak.

My current position: start new projects on v4. No urgent reason to migrate existing v3 codebases. v4 is production-ready, but if v3 is working fine, there's no fire.

Environment Variance

These numbers came from an Apple Silicon M-series machine. AWS or GCP Linux x86 instances will differ. If you need performance guarantees in CI, measure directly in your actual environment. Don't take official benchmarks as ground truth for your setup.

Practical Integration: Blog Post Metadata Extractor

Here's a working example combining the patterns above:

import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

// Blog post metadata schema
const PostMetadataSchema = z.object({
  title: z.string().min(1).max(100),
  description: z.string().min(50).max(200),
  tags: z.array(z.string().min(1)).min(1).max(5),
  difficulty: z.enum(['beginner', 'intermediate', 'advanced']),
  estimatedReadingTime: z.number().int().min(1).max(60),
  hasCodeExamples: z.boolean(),
  targetAudience: z.string().min(10).max(100)
});

type PostMetadata = z.infer<typeof PostMetadataSchema>;

async function extractPostMetadata(
  markdownContent: string
): Promise<PostMetadata> {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    system: `Analyze a technical blog post and return its metadata as JSON.
You must follow this exact format:
{
  "title": "core post title (under 100 characters)",
  "description": "SEO description (50-200 characters)",
  "tags": ["tag1", "tag2"],
  "difficulty": "beginner" | "intermediate" | "advanced",
  "estimatedReadingTime": number (minutes),
  "hasCodeExamples": true | false,
  "targetAudience": "description of intended readers (10-100 characters)"
}`,
    messages: [
      {
        role: 'user',
        content: `Analyze the following markdown content:\n\n${markdownContent}`
      }
    ]
  });

  const textBlock = response.content.find(b => b.type === 'text');
  if (!textBlock || textBlock.type !== 'text') {
    throw new Error('No text response received');
  }

  const parseResult = parseLLMResponse(textBlock.text, PostMetadataSchema);

  if (!parseResult.success) {
    throw new Error(
      `Metadata extraction failed [${parseResult.stage}]: ${parseResult.error}`
    );
  }

  return parseResult.data;
}

The same pattern drops directly into MCP tool handlers from TypeScript MCP Server Step-by-Step. Call an LLM inside the handler, validate the response with Zod, return structured output.

When unit testing this function as described in Vitest 4 AI Agent Testing Patterns, mock client.messages.create() and assert on the safeParse() result. Having a Zod schema makes it easy to build test fixtures that match the schema exactly.

Migration Checklist: v3 to v4

Find any code that validates Infinity or -Infinity with z.number()
Replace required_error and invalid_type_error options with the unified error parameter
Update test assertions that compare Zod error message strings directly
Gradually replace z.string().email() with z.email() (old API still works, but v4 style is preferred)
Replace .and() with z.intersection(A, B) (still works, but officially deprecated)
For large codebases, evaluate the zod-v3-to-v4 community codemod

If the migration feels like a lot, start by auditing just the z.number() breaking changes. The rest can be handled incrementally.

Closing Thoughts

Zod v4 is a solid choice for LLM response parsing. The type safety from safeParse(), nested schema support, and consolidated error API all fit naturally with Claude API integration. The performance improvement won't be noticeable in LLM response handling, but the TypeScript compilation speedup makes a real difference in larger projects.

The one rough edge: .check() TypeScript support is not quite there yet. When pushing custom issues via ctx.issues.push(), you're writing without autocomplete. That needs improvement.

For new projects, go with Zod v4. For existing v3 codebases, review the breaking changes list and migrate incrementally.

DEV Community