DEV Community

Tahmid

Posted on

How to structure JSON for LLMs (and stop wasting tokens)

Most developers treat JSON as an afterthought when building LLM-powered apps. They dump raw API responses into prompts and wonder why the model hallucinates, misreads fields, or burns through tokens.

JSON structure is a first-class concern in AI engineering. Here's how to get it right.


The problem: LLMs don't read JSON like humans do

When you paste this into a prompt:

{"user":{"id":1,"name":"Alice","preferences":{"theme":"dark","notifications":true,"language":"en"}}}

The model doesn't see your neat structure — it sees a stream of tokens, and every {, ", :, and , typically costs a token of its own. Deeply nested structures force the model to maintain more working context just to understand the shape of the data, before it even processes the values.

The result: more tokens consumed, more room for misinterpretation, higher API costs.


Rule 1: Flatten before you prompt

Nested JSON is great for APIs. It's bad for prompts.

Before:

{
  "user": {
    "profile": {
      "name": "Alice",
      "age": 28
    }
  }
}

After (flattened):

{
  "user_name": "Alice",
  "user_age": 28
}

One level deep is almost always enough for LLM context. If the model needs to reason about relationships, describe them in natural language alongside the data — don't encode them in nesting.
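One way to sketch this flattening is a small recursive helper. This is a hypothetical utility, not from any library — note that it joins the full key path, so you may want to trim prefixes (e.g. user_profile_name → user_name) before prompting:

```python
def flatten(obj, prefix=""):
    """Collapse nested dicts into a single level, joining keys with '_'."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))  # recurse into nested objects
        else:
            flat[name] = value
    return flat

nested = {"user": {"profile": {"name": "Alice", "age": 28}}}
print(flatten(nested))
# {'user_profile_name': 'Alice', 'user_profile_age': 28}
```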


Rule 2: Strip fields the model doesn't need

Every field in your JSON costs tokens. If the model doesn't need created_at, updated_at, internal_id, or _metadata — remove them before building the prompt.

const { created_at, updated_at, _metadata, ...relevant } = apiResponse;
const prompt = `Here is the user data: ${JSON.stringify(relevant)}`;

This alone can cut token usage by 20–40% on typical API responses.


Rule 3: Use TOON for large payloads

If you're passing payloads larger than ~500 tokens, consider TOON (Token-Oriented Object Notation). It's a compact alternative to JSON that strips redundant syntax while preserving structure.

JSON:

[
  {"name": "Alice", "role": "admin"},
  {"name": "Bob", "role": "editor"}
]

TOON:

name|role
Alice|admin
Bob|editor

Token reduction: 30–60% on typical datasets. The model reads it correctly because the structure is still unambiguous — just more compact.

Try it on your own payloads with the JSON to TOON converter. There's also a TOON to JSON converter for decoding the model's response back.
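For uniform arrays like the one above, the tabular encoding is simple enough to sketch yourself. Here's a minimal, assumed implementation (it only handles flat objects that all share the same keys — a real TOON encoder covers more cases):

```python
def to_toon(rows):
    """Encode a uniform list of flat dicts as a pipe-delimited table:
    one header line, then one line per row."""
    headers = list(rows[0].keys())
    lines = ["|".join(headers)]
    for row in rows:
        lines.append("|".join(str(row[h]) for h in headers))
    return "\n".join(lines)

data = [
    {"name": "Alice", "role": "admin"},
    {"name": "Bob", "role": "editor"},
]
print(to_toon(data))
# name|role
# Alice|admin
# Bob|editor
```

The savings come from stating the keys once in the header instead of repeating them in every row.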


Rule 4: Use JSON Schema to enforce structured outputs

LLMs can return JSON — but without constraints, they hallucinate keys, change types, and add fields you didn't ask for.

The fix: define a schema and include it in your system prompt.

{
  "type": "object",
  "properties": {
    "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
    "confidence": { "type": "number", "minimum": 0, "maximum": 1 },
    "summary": { "type": "string", "maxLength": 200 }
  },
  "required": ["sentiment", "confidence", "summary"]
}

Tell the model: "Respond only with a JSON object matching this schema. No explanation, no markdown."

Then validate the output with a JSON Schema validator before trusting it. This is especially critical in agentic workflows where one bad output poisons downstream steps. You can generate a schema automatically from any sample payload using the JSON Schema generator — useful as a starting point you can then tighten up.
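The checks that schema encodes can be sketched in plain Python. This hand-rolled validator is illustrative only — in production you'd reach for a real JSON Schema validator (e.g. the jsonschema package) or the Pydantic/Zod approach in the next rule:

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def check_sentiment_payload(raw: str) -> dict:
    """Parse LLM output and enforce the schema's constraints by hand.
    Raises on malformed JSON or out-of-range values."""
    data = json.loads(raw)  # raises json.JSONDecodeError if not valid JSON
    assert data["sentiment"] in ALLOWED_SENTIMENTS
    assert 0 <= data["confidence"] <= 1
    assert len(data["summary"]) <= 200
    return data

result = check_sentiment_payload(
    '{"sentiment": "positive", "confidence": 0.92, "summary": "Great UX."}'
)
```

The point is the boundary: nothing downstream should ever touch LLM output that hasn't passed through a check like this.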


Rule 5: Use Pydantic or Zod to validate at the boundary

Never trust raw LLM JSON output in production. Parse and validate it immediately.

Python (FastAPI / AI agents):

from pydantic import BaseModel, Field
from typing import Literal

class SentimentResult(BaseModel):
    # Constraints mirror the JSON Schema from Rule 4
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float = Field(ge=0, le=1)
    summary: str = Field(max_length=200)

result = SentimentResult.model_validate_json(llm_output)

TypeScript (Next.js / tRPC):

import { z } from 'zod';

const SentimentResult = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  summary: z.string().max(200),
});

const result = SentimentResult.parse(JSON.parse(llmOutput));

Writing these by hand from a large JSON payload is tedious. The JSON to Pydantic and JSON to Zod generators handle it instantly — paste your payload, get the model.


Rule 6: Use TypeScript interfaces when working with typed LLM outputs

If you're building in TypeScript, generating interfaces from your JSON response shapes saves time and prevents drift between what the LLM returns and what your code expects.

// Generated from your actual LLM response shape
interface SentimentResponse {
  sentiment: 'positive' | 'negative' | 'neutral';
  confidence: number;
  summary: string;
}

The JSON to TypeScript converter generates these from any payload — useful when you're iterating quickly on prompt outputs and want the type system to catch regressions.


The full checklist

Before passing JSON to an LLM:

  • Flatten nested structures to one level where possible
  • Strip fields irrelevant to the task
  • Use TOON for payloads > 500 tokens
  • Define a JSON Schema for expected output
  • Validate output with Pydantic or Zod before use
  • Use TypeScript interfaces to catch output shape regressions

These aren't micro-optimisations. On high-volume AI apps, they compound into significant cost and reliability improvements.


What's your current approach to JSON in LLM workflows? Drop it in the comments — I'm curious how others handle this.


