TOON vs JSON: When 60% Token Savings Becomes 1.8% - A Reality Check

Tejas Page

The data format everyone's talking about - and the caveats most people skip

TOON promises 40-60% token reductions. After testing it on my Azure DevOps code review MCP server, the real improvement was 1.8%.

Here's what happened: TOON delivers massive gains for uniform, tabular data - but my nested API responses and code diffs saw only 6-19% reduction. Since I'd already optimized my JSON (83.3% token reduction), TOON's additional benefit was marginal.

The lesson: TOON works - but only for specific data structures. Here's when it shines, when it doesn't, and why the marketing doesn't tell the full story.


✅ Where TOON Absolutely Shines

1. Uniform Tabular Data (Database Results, Logs, Analytics)

(Image: JSON-to-TOON converter on Scalevise)

Why it works: Identical fields. TOON declares the schema once and streams rows.
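
To make that concrete, here's a minimal before/after with made-up rows; the field names are arbitrary, and the TOON side follows the same pattern as the logs example later in this post.

JSON repeats every key on every row:

[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "editor" },
  { "id": 3, "name": "Carol", "role": "viewer" }
]

TOON declares the header once and streams the values:

users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,editor
  3,Carol,viewer

Every repeated quote, brace, and key name in the JSON version is a token the model has to read; TOON pays that cost once per array.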

2. High-Volume RAG Systems

When you're embedding 1,000 rows of product catalog or customer data into every LLM prompt (sanity-checked in the sketch after this list):

  • Before: 4,500 tokens per query × 10,000 queries/day = 45M tokens/day
  • After: 1,900 tokens per query × 10,000 queries/day = 19M tokens/day
  • Token reduction: 26M tokens/day
  • Savings: $32.50/day at current GPT-5.1 rates ($1.25/1M input tokens) = ~$975/month
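
A quick back-of-the-envelope check of those figures - the token counts and pricing are the example numbers above, not measurements:

// Example-only figures from the list above
const tokensPerQueryJson = 4_500;
const tokensPerQueryToon = 1_900;
const queriesPerDay = 10_000;
const pricePerMillionTokens = 1.25; // GPT-5.1 input pricing used throughout this post

const tokensSavedPerDay = (tokensPerQueryJson - tokensPerQueryToon) * queriesPerDay; // 26,000,000
const dollarsSavedPerDay = (tokensSavedPerDay / 1_000_000) * pricePerMillionTokens;  // 32.5
console.log(`~$${(dollarsSavedPerDay * 30).toFixed(0)}/month`); // ~$975/month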

3. Time-Series & Monitoring Data

Server logs, metrics, events - anything with repeated structure across hundreds of entries.

   logs[500]{timestamp,level,service,message}:
     2024-11-20T10:00:01,INFO,api-gateway,Request processed
     2024-11-20T10:00:02,WARN,auth-service,Rate limit approaching
     ...

⚠️ Where TOON's Benefits Diminish: My Real-World Test

I maintain an MCP server that returns Azure DevOps pull request data to LLMs for code reviews. After optimizing my JSON responses to 33,400 tokens (83% reduction), I tested TOON expecting another 30-40% improvement.

What actually happened: TOON saved 11% on average - far below the advertised gains. Here's why:

The Problem: Deeply Nested, Non-Uniform Data

My pull request response looks like this:

interface PullRequestDetails {
  pullRequestId: number;
  title: string;
  description?: string; // Optional!
  createdBy: User; // Nested object
  closedBy?: User; // Optional nested object
  lastMergeSourceCommit: {
    // Different nested structure
    commitId: string;
  };
  reviewers: Reviewer[]; // Array of objects with mixed fields
}

interface Reviewer {
  displayName: string;
  uniqueName?: string;
  vote: number;
  isRequired?: boolean; // Not all reviewers have this
}

Why TOON's efficiency gains are limited here (illustrated right after this list):

  1. Optional fields reduce the uniformity that TOON excels at
  2. Nested objects (createdBy, closedBy) use indentation instead of JSON's compact syntax
  3. Mixed structures (reviewers with varying fields) can't leverage TOON's tabular format efficiently
  4. Non-uniform arrays where objects have different optional fields lose TOON's compression advantage
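
To see what that means in practice, here's roughly how a trimmed-down pull request of that shape comes out in TOON. The values are invented, and the exact output of the reference encoder may differ slightly; this is only to show the shape:

pullRequestId: 1042
title: Fix token refresh race
createdBy:
  displayName: Jane Doe
  uniqueName: jane@example.com
lastMergeSourceCommit:
  commitId: a1b2c3d4
reviewers[2]:
  - displayName: Alice Smith
    vote: 10
    isRequired: true
  - displayName: Bob Jones
    vote: -5

The nested objects become indented key/value lines - roughly as many tokens as compact JSON - and because the two reviewers don't share the same fields, the array can't use the compact {fields} header and falls back to a per-item listing.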

The Complete Picture

My current optimization:

  • Raw Azure DevOps API responses: 200,000 tokens
  • After my JSON optimization: 33,400 tokens (83.3% reduction)

Actual measured TOON savings on my data:

  • Pull Request Details: 564 → 457 tokens = 19% reduction
  • Work Item Details: 1103 → 1035 tokens = 6.2% reduction
  • Unified Diffs (the actual code): 8,629 → 7,978 tokens = 7.5% reduction
  • Average across response types: ~10-12% reduction (vs 40-60% for flat data)

The critical insight: The unified diffs - which contain the actual code being reviewed and make up the bulk of my token usage - get the smallest benefit from TOON.

The Reality:

Starting from 33,400 tokens (already 83% optimized), TOON's average 11% reduction saves 3,670 tokens - a 1.8% overall improvement (83.3% → 85.1%). The cost impact: $0.005 per review (half a cent) vs $0.21 from JSON optimization.
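
A quick sanity check of why a double-digit format saving barely moves the headline number (all constants are the figures quoted above):

// Why an 11% format saving turns into a 1.8% overall improvement
const baseline = 200_000;       // raw Azure DevOps API responses
const jsonOptimized = 33_400;   // after JSON optimization (83.3% reduction)
const afterToon = 29_730;       // after converting the optimized payload to TOON

const before = 1 - jsonOptimized / baseline; // ≈ 0.833
const after = 1 - afterToon / baseline;      // ≈ 0.851
console.log(((after - before) * 100).toFixed(1)); // "1.8"

The denominator is the raw 200,000-token baseline, so any saving applied to an already-slim payload shows up as a small overall delta.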

Finding: Unified diffs (the actual code being reviewed) only compress by 7.5%, while nested structures see 6-19% reduction - far below TOON's 40-60% gains on flat, tabular data.


Which Format Should You Use?

Choose TOON for:

  • Long-running RAG pipelines with thousands of uniform records
  • Database query results, server logs, time-series data
  • Static schemas with consistent fields across all objects
  • High-volume scenarios (10K+ queries/day) where every token counts

Stick with JSON for:

  • API-shaped data with nested objects and optional fields
  • Code diffs, documentation, or free-form text
  • Incremental evolution where schemas change frequently
  • Already-optimized responses (diminishing returns)

The Real Lesson: Data Transformation > Data Format

I expected TOON to deliver another 30-40% reduction. Instead, I got 11%. Why? Because most token waste comes from sending unnecessary data, not from how you encode it.

My 83.3% reduction came from eliminating noise (a sketch of this kind of trimming follows the list):

  • Removing navigation metadata (_links, URLs) - 40% savings
  • Filtering system-generated comments - 25% savings
  • Stripping HTML formatting - 10% savings
  • Simplifying user objects (7 fields → 2) - 8% savings
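
Here's a rough sketch of what that trimming can look like in code. The input field names mirror the Azure DevOps PR payload, but the helper names and exact shapes are illustrative, not my production implementation:

// Illustrative only: slim an Azure DevOps PR response before handing it to the LLM.
interface SlimUser {
  displayName: string;
  uniqueName?: string;
}

function slimUser(user: any): SlimUser {
  // Keep 2 of the ~7 identity fields; ids, avatar URLs, and descriptors are dropped.
  return { displayName: user?.displayName, uniqueName: user?.uniqueName };
}

function slimPullRequest(raw: any) {
  return {
    pullRequestId: raw.pullRequestId,
    title: raw.title,
    // Strip HTML formatting rather than sending markup tokens to the model.
    description: raw.description?.replace(/<[^>]+>/g, ""),
    createdBy: slimUser(raw.createdBy),
    reviewers: (raw.reviewers ?? []).map(slimUser),
    // _links, url, and other navigation metadata are simply never copied across.
  };
}

Filtering system-generated comments works the same way: drop what the model doesn't need before worrying about how to encode what's left.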

The surprise: Code diffs - the actual content I need - barely compress with TOON (7.5%). Format optimization helps with repetitive metadata, not valuable content.


My Verdict

TOON delivers on its promise - for the right data. The 40-60% claim is real for uniform, tabular structures. But for nested APIs, code diffs, and irregular schemas, expect 6-20% gains.

Before implementing TOON, optimize your data first:

  • Remove unnecessary fields
  • Filter system noise
  • Strip formatting
  • Simplify nested objects

If you've already done this and still need more compression, then evaluate TOON. For high-volume RAG systems with uniform data, it's compelling. For already-optimized APIs, it's marginal.


Cost Impact: Why This Matters

Real-World Measurements (6-file PR)

Progressive token reduction across optimization stages for a 6-file PR:

Approach                             Tokens Used    Reduction
Full API Responses                   200,000        baseline
Unified diffs + slimmed responses    33,400         83.3%
+ TOON format conversion             29,730         85.1%

At current GPT 5.1 rates ($1.25/1M input tokens), here's the annual impact (100 reviews/day, 264 working days):

Approach           Annual Cost    Savings vs Baseline
No optimization    $6,600
JSON optimized     $1,056         $5,544 (84%)
JSON + TOON        $977           $5,623 (85%)

TOON adds $79/year (~1.4% more savings) for this volume.

Bottom line: TOON does work - but converting data pipelines, testing LLM comprehension, and maintaining dual formats isn't justified for $79/year when JSON optimization already captured $5,544.


Conclusion

TOON isn't a universal upgrade - it's excellent for uniform, tabular data and high-volume RAG pipelines. For nested, irregular, or code-heavy workloads (especially if you've already optimized your JSON), the gains are marginal.

My results: 83.3% reduction from data transformation, 1.8% from format change.

The real opportunity isn't switching formats—it's eliminating the data you don't need to send.


What's your experience with token optimization? Have you tried TOON in production? I'd love to hear real-world results beyond the marketing benchmarks.
