Most developers treat JSON as an afterthought when building LLM-powered apps. They dump raw API responses into prompts and wonder why the model hallucinates, misreads fields, or burns through tokens.
JSON structure is a first-class concern in AI engineering. Here's how to get it right.
The problem: LLMs don't read JSON like humans do
When you paste this into a prompt:
{"user":{"id":1,"name":"Alice","preferences":{"theme":"dark","notifications":true,"language":"en"}}}
The model doesn't see a tree — it sees a flat stream of tokens. Every {, ", :, and , consumes tokens, and deeply nested structures force the model to track more structural state just to understand the shape of the data, before it even processes the values.
The result: more tokens consumed, more room for misinterpretation, higher API costs.
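To see the overhead, compare the serialized size of a nested payload against a flattened version of the same data. Character count is only a rough proxy for token count — the exact numbers depend on the model's tokenizer — but the direction is the same:

```python
import json

nested = {"user": {"id": 1, "name": "Alice",
                   "preferences": {"theme": "dark", "notifications": True, "language": "en"}}}
flat = {"user_id": 1, "user_name": "Alice", "theme": "dark",
        "notifications": True, "language": "en"}

# Serialized length as a rough proxy for token count.
print(len(json.dumps(nested)), ">", len(json.dumps(flat)))
```

The flattened payload carries the same information with fewer structural characters for the model to wade through.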
Rule 1: Flatten before you prompt
Nested JSON is great for APIs. It's bad for prompts.
Before:
{
  "user": {
    "profile": {
      "name": "Alice",
      "age": 28
    }
  }
}
After (flattened):
{
  "user_name": "Alice",
  "user_age": 28
}
One level deep is almost always enough for LLM context. If the model needs to reason about relationships, describe them in natural language alongside the data — don't encode them in nesting.
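A minimal recursive flattener sketch. Note that it joins every level into the key, so user.profile.name becomes user_profile_name — if you want the shorter user_name shown above, drop intermediate keys like profile before flattening:

```python
import json

def flatten(obj: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Collapse nested dicts into a single level of sep-joined keys."""
    flat = {}
    for key, value in obj.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

nested = {"user": {"profile": {"name": "Alice", "age": 28}}}
print(json.dumps(flatten(nested)))
# {"user_profile_name": "Alice", "user_profile_age": 28}
```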
Rule 2: Strip fields the model doesn't need
Every field in your JSON costs tokens. If the model doesn't need created_at, updated_at, internal_id, or _metadata — remove them before building the prompt.
const { created_at, updated_at, _metadata, ...relevant } = apiResponse;
const prompt = `Here is the user data: ${JSON.stringify(relevant)}`;
This alone can cut token usage by 20–40% on typical API responses.
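The same idea in Python, using a deny-list and a dict comprehension. The field names here mirror the ones above — adjust the set to match your own payload:

```python
import json

api_response = {
    "name": "Alice",
    "role": "admin",
    "created_at": "2024-01-01T00:00:00Z",
    "updated_at": "2024-02-01T00:00:00Z",
    "_metadata": {"shard": 3},
}

# Fields the model doesn't need for the task at hand.
NOISE = {"created_at", "updated_at", "internal_id", "_metadata"}
relevant = {k: v for k, v in api_response.items() if k not in NOISE}

prompt = f"Here is the user data: {json.dumps(relevant)}"
print(prompt)  # Here is the user data: {"name": "Alice", "role": "admin"}
```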
Rule 3: Use TOON for large payloads
If you're passing payloads larger than ~500 tokens, consider TOON (Token-Oriented Object Notation). It's a compact alternative to JSON that strips redundant syntax while preserving structure.
JSON:
[
  {"name": "Alice", "role": "admin"},
  {"name": "Bob", "role": "editor"}
]
TOON:
name|role
Alice|admin
Bob|editor
Token reduction: 30–60% on typical datasets. The model reads it correctly because the structure is still unambiguous — just more compact.
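Here's a quick sketch of serializing a uniform list of records into the pipe-delimited tabular form shown above. This is hand-rolled for illustration, not an official TOON library, and it assumes every record shares the same keys:

```python
def to_table(rows: list[dict]) -> str:
    """One header row, then one pipe-delimited line per record."""
    headers = list(rows[0])
    lines = ["|".join(headers)]
    for row in rows:
        lines.append("|".join(str(row[h]) for h in headers))
    return "\n".join(lines)

users = [{"name": "Alice", "role": "admin"},
         {"name": "Bob", "role": "editor"}]
print(to_table(users))
# name|role
# Alice|admin
# Bob|editor
```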
Try it on your own payloads with the JSON to TOON converter. There's also a TOON to JSON converter for decoding the model's response back.
Rule 4: Use JSON Schema to enforce structured outputs
LLMs can return JSON — but without constraints, they hallucinate keys, change types, and add fields you didn't ask for.
The fix: define a schema and include it in your system prompt.
{
  "type": "object",
  "properties": {
    "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
    "confidence": { "type": "number", "minimum": 0, "maximum": 1 },
    "summary": { "type": "string", "maxLength": 200 }
  },
  "required": ["sentiment", "confidence", "summary"]
}
Tell the model: "Respond only with a JSON object matching this schema. No explanation, no markdown."
Then validate the output with a JSON Schema validator before trusting it. This is especially critical in agentic workflows where one bad output poisons downstream steps. You can generate a schema automatically from any sample payload using the JSON Schema generator — useful as a starting point you can then tighten up.
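In Python, a library like jsonschema can enforce this generically. As a dependency-free illustration, here is a hand-rolled check that mirrors the schema above — reject or retry on failure, never let an unvalidated payload flow downstream:

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def check_sentiment(raw: str) -> dict:
    """Parse LLM output and reject anything that violates the schema above."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if data.get("sentiment") not in ALLOWED_SENTIMENTS:
        raise ValueError("sentiment outside enum")
    conf = data.get("confidence")
    if not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        raise ValueError("confidence out of [0, 1]")
    summary = data.get("summary")
    if not isinstance(summary, str) or len(summary) > 200:
        raise ValueError("summary missing or too long")
    return data

result = check_sentiment('{"sentiment": "positive", "confidence": 0.93, "summary": "Great."}')
```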
Rule 5: Use Pydantic or Zod to validate at the boundary
Never trust raw LLM JSON output in production. Parse and validate it immediately.
Python (FastAPI / AI agents):
from pydantic import BaseModel

class SentimentResult(BaseModel):
    sentiment: str
    confidence: float
    summary: str

result = SentimentResult.model_validate_json(llm_output)
TypeScript (Next.js / tRPC):
import { z } from 'zod';
const SentimentResult = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  summary: z.string().max(200),
});

const result = SentimentResult.parse(JSON.parse(llmOutput));
Writing these by hand from a large JSON payload is tedious. The JSON to Pydantic and JSON to Zod generators handle it instantly — paste your payload, get the model.
Rule 6: Use TypeScript interfaces when working with typed LLM outputs
If you're building in TypeScript, generating interfaces from your JSON response shapes saves time and prevents drift between what the LLM returns and what your code expects.
// Generated from your actual LLM response shape
interface SentimentResponse {
  sentiment: 'positive' | 'negative' | 'neutral';
  confidence: number;
  summary: string;
}
The JSON to TypeScript converter generates these from any payload — useful when you're iterating quickly on prompt outputs and want the type system to catch regressions.
The full checklist
Before passing JSON to an LLM:
- Flatten nested structures to one level where possible
- Strip fields irrelevant to the task
- Use TOON for payloads > 500 tokens
- Define a JSON Schema for expected output
- Validate output with Pydantic or Zod before use
- Use TypeScript interfaces to catch output shape regressions
These aren't micro-optimisations. On high-volume AI apps, they compound into significant cost and reliability improvements.
What's your current approach to JSON in LLM workflows? Drop it in the comments — I'm curious how others handle this.
Free tools used in this post:
- JSON to TOON — compress JSON for LLM context
- TOON to JSON — decode TOON back to JSON
- JSON Schema Generator — generate schemas from sample payloads
- JSON to Pydantic — instant Python models
- JSON to Zod — instant TypeScript validation schemas
- JSON to TypeScript — generate interfaces from any payload
- All tools — client-side, no sign-up, nothing leaves your browser