TOON vs JSON: When 60% Token Savings Becomes 1.8% - A Reality Check

Tejas Page

The data format everyone's talking about - and the caveats most people skip

TOON promises 40-60% token reductions. After testing it on my Azure DevOps code review MCP server, the real improvement was 1.8%.

Here's what happened: TOON delivers massive gains for uniform, tabular data - but my nested API responses and code diffs saw only 6-19% reduction. Since I'd already optimized my JSON (83.3% token reduction), TOON's additional benefit was marginal.

The lesson: TOON works - but only for specific data structures. Here's when it shines, when it doesn't, and why the marketing doesn't tell the full story.


✅ Where TOON Absolutely Shines

1. Uniform Tabular Data (Database Results, Logs, Analytics)

(Image: JSON-to-TOON converter on Scalevise)

Why it works: Identical fields. TOON declares the schema once and streams rows.
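
To make that concrete, here's a minimal before/after with made-up rows; the field names are arbitrary, and the TOON side follows the same pattern as the logs example later in this post.

JSON repeats every key on every row:

[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "editor" },
  { "id": 3, "name": "Carol", "role": "viewer" }
]

TOON declares the header once and streams the values:

users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,editor
  3,Carol,viewer

Every repeated quote, brace, and key name in the JSON version is a token the model has to read; TOON pays that cost once per array.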

2. High-Volume RAG Systems

When you're embedding 1,000 rows of product catalog or customer data into every LLM prompt (sanity-checked in the sketch after this list):

  • Before: 4,500 tokens per query × 10,000 queries/day = 45M tokens/day
  • After: 1,900 tokens per query × 10,000 queries/day = 19M tokens/day
  • Token reduction: 26M tokens/day
  • Savings: $32.50/day at current GPT-5.1 rates ($1.25/1M input tokens) = ~$975/month
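
A quick back-of-the-envelope check of those figures - the token counts and pricing are the example numbers above, not measurements:

// Example-only figures from the list above
const tokensPerQueryJson = 4_500;
const tokensPerQueryToon = 1_900;
const queriesPerDay = 10_000;
const pricePerMillionTokens = 1.25; // GPT-5.1 input pricing used throughout this post

const tokensSavedPerDay = (tokensPerQueryJson - tokensPerQueryToon) * queriesPerDay; // 26,000,000
const dollarsSavedPerDay = (tokensSavedPerDay / 1_000_000) * pricePerMillionTokens;  // 32.5
console.log(`~$${(dollarsSavedPerDay * 30).toFixed(0)}/month`); // ~$975/month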

3. Time-Series & Monitoring Data

Server logs, metrics, events - anything with repeated structure across hundreds of entries.

   logs[500]{timestamp,level,service,message}:
     2024-11-20T10:00:01,INFO,api-gateway,Request processed
     2024-11-20T10:00:02,WARN,auth-service,Rate limit approaching
     ...

⚠️ Where TOON's Benefits Diminish: My Real-World Test

I maintain an MCP server that returns Azure DevOps pull request data to LLMs for code reviews. After optimizing my JSON responses to 33,400 tokens (83% reduction), I tested TOON expecting another 30-40% improvement.

What actually happened: TOON saved 11% on average - far below the advertised gains. Here's why:

The Problem: Deeply Nested, Non-Uniform Data

My pull request response looks like this:

interface PullRequestDetails {
  pullRequestId: number;
  title: string;
  description?: string; // Optional!
  createdBy: User; // Nested object
  closedBy?: User; // Optional nested object
  lastMergeSourceCommit: {
    // Different nested structure
    commitId: string;
  };
  reviewers: Reviewer[]; // Array of objects with mixed fields
}

interface Reviewer {
  displayName: string;
  uniqueName?: string;
  vote: number;
  isRequired?: boolean; // Not all reviewers have this
}

Why TOON's efficiency gains are limited here (illustrated right after this list):

  1. Optional fields reduce the uniformity that TOON excels at
  2. Nested objects (createdBy, closedBy) use indentation instead of JSON's compact syntax
  3. Mixed structures (reviewers with varying fields) can't leverage TOON's tabular format efficiently
  4. Non-uniform arrays where objects have different optional fields lose TOON's compression advantage
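
To see what that means in practice, here's roughly how a trimmed-down pull request of that shape comes out in TOON. The values are invented, and the exact output of the reference encoder may differ slightly; this is only to show the shape:

pullRequestId: 1042
title: Fix token refresh race
createdBy:
  displayName: Jane Doe
  uniqueName: jane@example.com
lastMergeSourceCommit:
  commitId: a1b2c3d4
reviewers[2]:
  - displayName: Alice Smith
    vote: 10
    isRequired: true
  - displayName: Bob Jones
    vote: -5

The nested objects become indented key/value lines - roughly as many tokens as compact JSON - and because the two reviewers don't share the same fields, the array can't use the compact {fields} header and falls back to a per-item listing.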

The Complete Picture

My current optimization:

  • Raw Azure DevOps API responses: 200,000 tokens
  • After my JSON optimization: 33,400 tokens (83.3% reduction)

Actual measured TOON savings on my data:

  • Pull Request Details: 564 → 457 tokens = 19% reduction
  • Work Item Details: 1103 → 1035 tokens = 6.2% reduction
  • Unified Diffs (the actual code): 8,629 → 7,978 tokens = 7.5% reduction
  • Average across response types: ~10-12% reduction (vs 40-60% for flat data)

The critical insight: The unified diffs - which contain the actual code being reviewed and make up the bulk of my token usage - get the smallest benefit from TOON.

The Reality:

Starting from 33,400 tokens (already 83% optimized), TOON's average 11% reduction saves 3,670 tokens - a 1.8% overall improvement (83.3% → 85.1%). The cost impact: $0.005 per review (half a cent) vs $0.21 from JSON optimization.
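
A quick sanity check of why a double-digit format saving barely moves the headline number (all constants are the figures quoted above):

// Why an 11% format saving turns into a 1.8% overall improvement
const baseline = 200_000;       // raw Azure DevOps API responses
const jsonOptimized = 33_400;   // after JSON optimization (83.3% reduction)
const afterToon = 29_730;       // after converting the optimized payload to TOON

const before = 1 - jsonOptimized / baseline; // ≈ 0.833
const after = 1 - afterToon / baseline;      // ≈ 0.851
console.log(((after - before) * 100).toFixed(1)); // "1.8"

The denominator is the raw 200,000-token baseline, so any saving applied to an already-slim payload shows up as a small overall delta.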

Finding: Unified diffs (the actual code being reviewed) only compress by 7.5%, while nested structures see 6-19% reduction - far below TOON's 40-60% gains on flat, tabular data.


Which Format Should You Use?

Choose TOON for:

  • Long-running RAG pipelines with thousands of uniform records
  • Database query results, server logs, time-series data
  • Static schemas with consistent fields across all objects
  • High-volume scenarios (10K+ queries/day) where every token counts

Stick with JSON for:

  • API-shaped data with nested objects and optional fields
  • Code diffs, documentation, or free-form text
  • Incremental evolution where schemas change frequently
  • Already-optimized responses (diminishing returns)

The Real Lesson: Data Transformation > Data Format

I expected TOON to deliver another 30-40% reduction. Instead, I got 11%. Why? Because most token waste comes from sending unnecessary data, not from how you encode it.

My 83.3% reduction came from eliminating noise (a sketch of this kind of trimming follows the list):

  • Removing navigation metadata (_links, URLs) - 40% savings
  • Filtering system-generated comments - 25% savings
  • Stripping HTML formatting - 10% savings
  • Simplifying user objects (7 fields → 2) - 8% savings
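
Here's a rough sketch of what that trimming can look like in code. The input field names mirror the Azure DevOps PR payload, but the helper names and exact shapes are illustrative, not my production implementation:

// Illustrative only: slim an Azure DevOps PR response before handing it to the LLM.
interface SlimUser {
  displayName: string;
  uniqueName?: string;
}

function slimUser(user: any): SlimUser {
  // Keep 2 of the ~7 identity fields; ids, avatar URLs, and descriptors are dropped.
  return { displayName: user?.displayName, uniqueName: user?.uniqueName };
}

function slimPullRequest(raw: any) {
  return {
    pullRequestId: raw.pullRequestId,
    title: raw.title,
    // Strip HTML formatting rather than sending markup tokens to the model.
    description: raw.description?.replace(/<[^>]+>/g, ""),
    createdBy: slimUser(raw.createdBy),
    reviewers: (raw.reviewers ?? []).map(slimUser),
    // _links, url, and other navigation metadata are simply never copied across.
  };
}

Filtering system-generated comments works the same way: drop what the model doesn't need before worrying about how to encode what's left.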

The surprise: Code diffs - the actual content I need - barely compress with TOON (7.5%). Format optimization helps with repetitive metadata, not valuable content.


My Verdict

TOON delivers on its promise - for the right data. The 40-60% claim is real for uniform, tabular structures. But for nested APIs, code diffs, and irregular schemas, expect 6-20% gains.

Before implementing TOON, optimize your data first:

  • Remove unnecessary fields
  • Filter system noise
  • Strip formatting
  • Simplify nested objects

If you've already done this and still need more compression, then evaluate TOON. For high-volume RAG systems with uniform data, it's compelling. For already-optimized APIs, it's marginal.


Cost Impact: Why This Matters

Real-World Measurements (6-file PR)

Progressive token reduction across optimization stages for a 6-file PR:

Approach                             Tokens Used    Reduction
Full API Responses                   200,000        baseline
Unified diffs + slimmed responses    33,400         83.3%
+ TOON format conversion             29,730         85.1%

At current GPT 5.1 rates ($1.25/1M input tokens), here's the annual impact (100 reviews/day, 264 working days):

Approach           Annual Cost    Savings vs Baseline
No optimization    $6,600
JSON optimized     $1,056         $5,544 (84%)
JSON + TOON        $977           $5,623 (85%)

TOON adds $79/year (~1.4% more savings) for this volume.

Bottom line: TOON does work - but converting data pipelines, testing LLM comprehension, and maintaining dual formats isn't justified for $79/year when JSON optimization already captured $5,544.


Conclusion

TOON isn't a universal upgrade - it's excellent for uniform, tabular data and high-volume RAG pipelines. For nested, irregular, or code-heavy workloads (especially if you've already optimized your JSON), the gains are marginal.

My results: 83.3% reduction from data transformation, 1.8% from format change.

The real opportunity isn't switching formats—it's eliminating the data you don't need to send.


What's your experience with token optimization? Have you tried TOON in production? I'd love to hear real-world results beyond the marketing benchmarks.
