Hey! I'm Andrey, a frontend developer at Cloud.ru, and I write about frontend and AI on my blog and Telegram channel.
I work with LLM APIs every day, and every day I send structured data into context: product lists, logs, users, metrics. All of it is JSON. All of it costs money.
At some point I calculated how many tokens in my prompts go to curly braces, quotes, and repeated keys. It turned out to be a lot. Way too much.
Then I tried TOON. Here's what happened.
The Problem: JSON Is a Generous Format
Take a typical case. You're building a RAG system or an AI assistant that analyzes data. Your prompt pulls in a list of 50 records. Here's one record in JSON:
{"id": 2001, "timestamp": "2025-11-18T08:14:23Z", "level": "error", "service": "auth-api", "ip": "172.16.4.21", "message": "Auth failed for user", "code": "AUTH_401"}
Now multiply by 50. Each record repeats 7 keys: "id", "timestamp", "level", "service", "ip", "message", "code". Plus quotes around every key and string value. Plus curly braces. Plus commas.
Across 50 records that's ~350 redundant key repetitions and hundreds of characters of syntactic overhead. The model tokenizes all of it. You pay for all of it.
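You can estimate that overhead yourself. Here's a rough, character-level sketch (tokenizers don't map 1:1 to characters, but the proportions hold):

```python
import json

# One sample record; the field names repeat in every JSON object.
record = {
    "id": 2001,
    "timestamp": "2025-11-18T08:14:23Z",
    "level": "error",
    "service": "auth-api",
    "ip": "172.16.4.21",
    "message": "Auth failed for user",
    "code": "AUTH_401",
}

n_records = 50

# Characters spent on structure per record:
# each key appears as "key": -> len(key) + 3 extra chars (quotes + colon),
# plus the braces and the commas between fields.
key_overhead = sum(len(k) + 3 for k in record)
syntax_overhead = 2 + (len(record) - 1)
per_record = key_overhead + syntax_overhead

print(f"Key repetitions across {n_records} records: {len(record) * n_records}")
print(f"Structural characters per record: {per_record}")
print(f"Structural characters total: {per_record * n_records}")
```

For this record shape that's 350 key repetitions and over 3,000 structural characters before a single value is counted.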
The Fix: TOON in 30 Seconds
TOON (Token-Oriented Object Notation) encodes the same data with the same structure, but without repetition. Keys are declared once in a header, then it's values only:
logs[3]{id,timestamp,level,service,ip,message,code}:
2001,2025-11-18T08:14:23Z,error,auth-api,172.16.4.21,Auth failed for user,AUTH_401
2002,2025-11-18T08:14:24Z,warn,payment,172.16.4.22,Timeout on payment gateway,PAY_TIMEOUT
2003,2025-11-18T08:14:25Z,info,user-svc,172.16.4.23,User profile updated,USR_200
The header logs[3]{id,timestamp,level,service,ip,message,code}: declares an array of 3 elements and lists its fields once. That's it. Then come rows of comma-separated values: no quotes around keys, no {} per object, no duplication.
JSON -> TOON -> JSON conversion is lossless, 1:1. It's not a different data model - it's a different encoding of the same model.
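To make the idea concrete, here's a toy encoder for the simplest case, a uniform array of flat objects. This is an illustration only, not the real library: the toon-format package additionally handles quoting, nesting, and types.

```python
def encode_uniform(name, rows):
    """Encode a uniform list of dicts into TOON-style text.

    Toy illustration of the header-then-rows idea only -- use the
    real toon-format package in production.
    """
    fields = list(rows[0].keys())
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = [",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)

logs = [
    {"id": 2001, "level": "error", "service": "auth-api"},
    {"id": 2002, "level": "warn", "service": "payment"},
]
print(encode_uniform("logs", logs))
# logs[2]{id,level,service}:
# 2001,error,auth-api
# 2002,warn,payment
```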
Counting Tokens: A Real Test
I took a dataset of 50 log entries (7 fields each) and ran it through a tokenizer:
Python:

```python
import json
import tiktoken
import toon_format  # pip install toon-format

enc = tiktoken.encoding_for_model("gpt-4o")

with open("logs.json") as f:
    data = json.load(f)

json_str = json.dumps(data, indent=2)
json_compact = json.dumps(data)
toon_str = toon_format.encode(data)

print(f"JSON (formatted): {len(enc.encode(json_str))} tokens")
print(f"JSON (compact): {len(enc.encode(json_compact))} tokens")
print(f"TOON: {len(enc.encode(toon_str))} tokens")
```
TypeScript:

```typescript
import fs from "fs";
import { encode as toToon } from "@toon-format/toon";
import { encode as tokenize } from "gpt-3-encoder";

const data = JSON.parse(fs.readFileSync("./logs.json", "utf8"));

const jsonFormatted = JSON.stringify(data, null, 2);
const jsonCompact = JSON.stringify(data);
const toonStr = toToon(data);

console.log(`JSON (formatted): ${tokenize(jsonFormatted).length} tokens`);
console.log(`JSON (compact): ${tokenize(jsonCompact).length} tokens`);
console.log(`TOON: ${tokenize(toonStr).length} tokens`);
```
Results on real data (from TOON benchmarks):
| Format | Tokens | Savings vs JSON |
|---|---|---|
| JSON (formatted) | 379 | - |
| JSON (compact) | 236 | -37.7% |
| TOON | 150 | -60.4% |
60% savings on a single prompt. Not a hypothetical: measured by the tokenizer.
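The percentages in the table follow directly from the token counts:

```python
json_formatted, json_compact, toon = 379, 236, 150

def savings(base, other):
    """Percentage saved relative to a baseline token count."""
    return round((base - other) / base * 100, 1)

print(f"JSON compact vs formatted: -{savings(json_formatted, json_compact)}%")
print(f"TOON vs formatted: -{savings(json_formatted, toon)}%")
```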
Counting Money: How Much You're Overpaying
Now the fun part. Current API prices (April 2026):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-4o | $2.50 | $10.00 |
| GPT-4.1 | $2.00 | $8.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Opus 4.6 | $5.00 | $25.00 |
Say you make 10,000 requests per day, each containing an array of 100 objects (typical RAG/analytics). Let's calculate for GPT-4o:
| | JSON | TOON | Difference |
|---|---|---|---|
| Tokens per request | ~3,200 | ~1,850 | -42% |
| Tokens per day | 32M | 18.5M | -13.5M |
| Cost per day | $80 | $46.25 | -$33.75 |
| Per month | $2,400 | $1,387 | -$1,013 |
| Per year | $28,800 | $16,650 | -$12,150 |
On Claude Opus 4.6 (input $5/1M) the savings are even bigger:
| | JSON | TOON | Difference |
|---|---|---|---|
| Per month | $4,800 | $2,775 | -$2,025 |
| Per year | $57,600 | $33,300 | -$24,300 |
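The numbers in both tables reduce to one formula; swap in the input price per million tokens ($2.50 for GPT-4o, $5.00 for Claude Opus) to reproduce either one:

```python
def monthly_cost(tokens_per_request, requests_per_day, price_per_1m, days=30):
    """Input-token cost per month for a fixed daily request volume."""
    daily_tokens = tokens_per_request * requests_per_day
    return daily_tokens / 1_000_000 * price_per_1m * days

# GPT-4o input at $2.50 / 1M tokens, per the pricing table above.
json_month = monthly_cost(3_200, 10_000, 2.50)
toon_month = monthly_cost(1_850, 10_000, 2.50)

print(f"JSON: ${json_month:,.2f}/month, TOON: ${toon_month:,.2f}/month")
print(f"Saved per year: ${(json_month - toon_month) * 12:,.2f}")
```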
That's $12-24K per year on input tokens alone, on a single endpoint. If you have multiple pipelines, multiply accordingly.
Integration: 5 Minutes, 4 Lines of Code
You don't need to rewrite your architecture. TOON plugs in as a layer before sending to the API:
Python + OpenAI:

```python
import openai
import toon_format

def analyze_with_llm(data: list[dict]) -> str:
    toon_str = toon_format.encode({"records": data})  # JSON -> TOON
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Analyze this data and find anomalies:\n\n{toon_str}",
        }],
    )
    return response.choices[0].message.content
```
TypeScript + Anthropic:

```typescript
import Anthropic from "@anthropic-ai/sdk";
import { encode as toToon } from "@toon-format/toon";

const anthropic = new Anthropic();

async function analyzeData(records: any[]) {
  const toonData = toToon({ records });
  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-6-20250514",
    max_tokens: 1024,
    messages: [{
      role: "user",
      content: `Analyze this data and find anomalies:\n\n${toonData}`,
    }],
  });
  return response.content[0].text;
}
```
One call, toon_format.encode(), and you save 40-60% of tokens. The model responds in its usual format; nothing changes on the output side.
The Big Comparison: TOON vs Everything Else
No format is perfect for every case. Here's an honest breakdown:
| Criteria | JSON | JSON compact | YAML | CSV | TOON | TRON |
|---|---|---|---|---|---|---|
| Tokens (tabular data) | 100% | ~63% | ~72% | ~38% | ~40% | ~55% |
| Tokens (nested data) | 100% | ~78% | ~85% | n/a | ~67% | ~75% |
| LLM accuracy | 75.0% | 73.7% | 74.5% | ~72% | 76.4% | - |
| Nested structures | excellent | excellent | good | none | medium | good |
| Pipeline compatibility | everywhere | everywhere | wide | wide | needs SDK | JSON-compatible |
| LLM familiarity (training data) | huge | huge | large | large | minimal | minimal |
| Lossless round-trip with JSON | yes | yes | caveats | no | yes | yes |
Key takeaways:
- TOON vs CSV: CSV is ~5-6% more compact for flat tables but doesn't support nesting or types. TOON adds minimal overhead but the model parses data more accurately.
- TOON vs YAML: TOON saves 48% tokens on tabular data. YAML is better for deeply nested configs.
- TOON vs JSON compact: Even minified JSON loses to TOON by 35% on tables. On nested data the gap is smaller (~15%).
- TOON vs TRON: TRON is JSON-compatible (parseable with any JSON parser). TOON is more compact but requires a dedicated parser. Choose TRON if you don't want to change your toolchain.
When to Use TOON (and When Not To)
Use TOON when:
- Uniform arrays of objects - user lists, products, logs, metrics. 40-60% savings.
- RAG pipelines - dozens of same-structure documents pulled into context.
- Batch processing - thousands of requests per day, every percentage of savings = real money.
- Long contexts - when data doesn't fit the context window and shrinking it is critical.
Don't use TOON when:
- Deeply nested structures (4+ levels). LLM accuracy drops to 43% on nested data. JSON is more reliable.
- Data goes to a regular service, not an LLM. TOON is a prompt format, not for REST APIs or databases.
- Flat tables with no nesting. CSV is 5-6% more compact and needs no SDK.
- You need JSON Schema validation. TOON is a different syntax - existing validators won't work.
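These rules can be wired into the serialization layer as a simple gate. A sketch under assumptions: the helper names are hypothetical, and the depth threshold is something you'd tune against your own accuracy measurements.

```python
import json

def max_depth(value, depth=1):
    """Nesting depth of a JSON-like value (scalars count as depth 1)."""
    if isinstance(value, dict):
        return max((max_depth(v, depth + 1) for v in value.values()), default=depth)
    if isinstance(value, list):
        return max((max_depth(v, depth + 1) for v in value), default=depth)
    return depth

def is_uniform_array(value):
    """True for a non-empty list of dicts that all share the same keys."""
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(v, dict) for v in value)
        and len({frozenset(v) for v in value}) == 1
    )

def serialize_for_prompt(data):
    """Hypothetical gate: TOON for shallow uniform arrays, JSON otherwise."""
    if is_uniform_array(data) and max_depth(data) <= 3:
        import toon_format  # pip install toon-format
        return toon_format.encode({"records": data})
    return json.dumps(data)
```

Uniform flat arrays (a list of flat dicts has depth 3 here) take the TOON path; deeply nested or irregular data falls back to JSON, where LLM accuracy is more reliable.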
Ecosystem: What Already Works
| Language | Package | Status |
|---|---|---|
| TypeScript | @toon-format/toon | Reference implementation |
| Python | toon-format / python-toon | Stable |
| Go | toon-format/go-toon | In development |
| Rust | toon-format/toon-rs | In development |
| .NET | toon-format/toon-dotnet | In development |
| CLI | npx @toon-format/cli | Works |
Quick start:
```shell
# TypeScript
npm install @toon-format/toon

# Python
pip install toon-format

# Convert a file via CLI
npx @toon-format/cli data.json -o data.toon
npx @toon-format/cli data.toon -o data.json  # and back
```
The spec is open, the ABNF grammar is documented, and test fixtures are available: toon-format/spec.
Bottom Line
TOON is not a JSON replacement. JSON will remain the standard for APIs, configs, and storage. But if you're sending structured data to an LLM, you're literally paying for syntactic overhead.
Four lines of code. Five minutes to integrate. Minus 40-60% tokens. Minus $12-24K per year at moderate load.
Try it on one endpoint. Measure. Calculate. Your API budget will thank you. Or at least stop quietly sobbing at night.
If this was useful - I write about frontend, AI, and practical dev stuff on my blog and Telegram channel. Come say hi!