Let's be real: every time you shove a bloated JSON blob into an LLM prompt, you're literally burning cash. Those curly braces, endless quotes, and repeated keys? They're token vampires sucking your OpenAI/Anthropic/Cursor bill dry. I've been there – cramming user data, analytics, or repo stats into prompts, only to hit context limits or watch costs skyrocket.
But what if I told you there's a format that cuts tokens by up to 60%, boosts LLM accuracy, and was cleverly designed for exactly this problem? Meet TOON (Token-Oriented Object Notation), the brainchild of Johann Schopplich – a dev who's all about making AI engineering smarter and cheaper.
Johann nailed it with TOON over at his original TypeScript repo: github.com/johannschopplich/toon. It's not just another serialization format; it's a lifeline for anyone building AI apps at scale.
Why JSON is Robbing You Blind in LLM Prompts
JSON is great for APIs and config files. But for LLM context? It's a disaster:
- Verbose AF: Braces {}, brackets [], quotes around every key and string – all eating tokens.
- Repeated Keys: In arrays of objects, every row repeats the same field names. 100 users? That's 100x "id", "name", etc.
- No Built-in Smarts: LLMs have to parse all that noise, leading to higher error rates on retrieval tasks.
- Token Explosion at Scale: A modest dataset can balloon to thousands of unnecessary tokens.
Result? Higher costs, slower responses, and more "context too long" errors. If you're querying GPT-5-nano or Claude with tabular data, JSON is quietly making you poor.
Enter TOON: The Token-Slaying Hero
TOON flips the script by blending YAML's clean indentation with CSV's tabular efficiency – but optimized for LLMs. Key differences from JSON:
- Tabular Arrays: Declare fields once in a header, then stream rows comma/tab/pipe-separated. No repeating keys!
- Minimal Punctuation: Ditches braces/brackets/quotes where possible. Indentation handles nesting.
- Explicit Lengths: [N] prefixes arrays so LLMs know exactly what's coming – reduces parsing errors.
- Smart Quoting: Only quotes when needed (e.g., strings with delimiters or specials).
- Delimiter Options: Comma (default), tab, or pipe for extra token wins (tabs often tokenize best).
TOON shines on uniform arrays of objects – think user lists, analytics rows, or GitHub repos. For non-uniform or deeply nested data, it gracefully falls back to list format (still slimmer than JSON).
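To make the tabular idea concrete, here's a minimal from-scratch sketch of the header-plus-rows encoding for uniform arrays. `toonTable` is a hypothetical helper for illustration, not the actual @toon-format/toon API:

```typescript
// Minimal sketch of TOON's tabular encoding for uniform object arrays.
// `toonTable` is illustrative only — the real library handles nesting,
// quoting, and non-uniform fallbacks.
type Row = Record<string, string | number | boolean>;

function toonTable(name: string, rows: Row[], delimiter = ","): string {
  if (rows.length === 0) return `${name}[0]:`;
  const fields = Object.keys(rows[0]); // field names declared once, in the header
  const header = `${name}[${rows.length}]{${fields.join(delimiter)}}:`;
  const lines = rows.map(
    (row) => "  " + fields.map((f) => String(row[f])).join(delimiter)
  );
  return [header, ...lines].join("\n");
}

console.log(
  toonTable("items", [
    { sku: "A1", qty: 2, price: 9.99 },
    { sku: "B2", qty: 1, price: 14.5 },
  ])
);
// items[2]{sku,qty,price}:
//   A1,2,9.99
//   B2,1,14.5
```

The keys appear exactly once in the header; every row after that carries only values – that's where the bulk of the savings comes from.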
Real Examples: JSON vs TOON
Classic JSON Bloat (257 tokens for a tiny e-commerce order):
{
  "items": [
    { "sku": "A1", "qty": 2, "price": 9.99 },
    { "sku": "B2", "qty": 1, "price": 14.5 }
  ]
}
TOON Magic (166 tokens – 35% savings):
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
Nested? No problem:
orders[1]:
  - users[2]{id,name}:
      1,Ada
      2,Bob
    status: active
Primitive arrays inline:
tags[3]: admin,ops,dev
And for wonky data, it uses clean lists:
mixed[3]:
  - 42
  - name: Ada
  - "quoted, string"
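Going the other way is just as mechanical. Here's a rough sketch of parsing a simple tabular TOON block back into objects – illustrative only (`parseToonTable` is made up for this post; the real libraries' decoders handle nesting, quoting, and type coercion, while this sketch leaves every value as a string):

```typescript
// Sketch: parse a flat tabular TOON block back into an array of objects.
// `parseToonTable` is a hypothetical helper, not the library's decode().
function parseToonTable(
  toon: string,
  delimiter = ","
): Record<string, string>[] {
  const [header, ...rows] = toon.trim().split("\n");
  // Match "name[N]{field,field,...}:"
  const match = header.match(/^(\w+)\[(\d+)\]\{(.+)\}:$/);
  if (!match) throw new Error("not a tabular TOON header");
  const fields = match[3].split(delimiter);
  // The explicit [N] length marker lets us validate the row count.
  if (rows.length !== Number(match[2])) throw new Error("length mismatch");
  return rows.map((line) => {
    const values = line.trim().split(delimiter);
    return Object.fromEntries(fields.map((f, i) => [f, values[i]]));
  });
}

const parsed = parseToonTable(`items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5`);
console.log(parsed[0]); // { sku: "A1", qty: "2", price: "9.99" }
```

Note how the [N] length marker doubles as a sanity check – exactly the property that helps LLMs too.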
Benchmarks That'll Make You Switch Today
Johann's rigorous tests (using GPT-5 tokenizer) across real datasets prove TOON crushes JSON:
- GitHub Repos (top 100): 8,745 tokens vs JSON's 15,145 (42% saved)
- Daily Analytics (180 days): 4,507 vs 10,977 (59% saved)
- E-Commerce Orders: 166 vs 257 (35% saved)
- Total: 13,418 vs 26,379 (49% average savings)
Accuracy? TOON hits 70%+ on retrieval tasks across GPT-5, Claude, Gemini, and Grok – often beating JSON while using half the tokens. Check the full TOON spec for details.
Pro tip: Use tab delimiters for even more savings on big tables!
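You can get a feel for the gap without firing up a tokenizer: character counts are a crude proxy (the real benchmarks use the GPT-5 tokenizer, so exact percentages differ), but the shape of the savings is the same. A quick sketch:

```typescript
// Crude size comparison: pretty-printed JSON vs a hand-rolled TOON table
// for the same 100-row dataset. Character counts are only a rough proxy
// for tokens, but they show why the repeated keys hurt.
const users = Array.from({ length: 100 }, (_, i) => ({
  id: i,
  name: `user${i}`,
}));

const json = JSON.stringify(users, null, 2);
const toon =
  `users[${users.length}]{id,name}:\n` +
  users.map((u) => `  ${u.id},${u.name}`).join("\n");

console.log(`JSON chars: ${json.length}, TOON chars: ${toon.length}`);
console.log(`~${Math.round((1 - toon.length / json.length) * 100)}% smaller`);
```

Every JSON row repeats `"id"` and `"name"` (plus braces and quotes); the TOON version pays for those names exactly once.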
Hands-On: TOON in TypeScript (The OG)
Install the original:
npm install @toon-format/toon
Encode:
import { encode } from "@toon-format/toon";
const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ],
};
console.log(encode(data, { delimiter: "\t" })); // Tab for extra savings!
Output (tabs shown as spaces):
users[2]{id  name  role}:
1  Alice  admin
2  Bob  user
Decode back to JS objects too. CLI for quick conversions: npx @toon-format/cli data.json --stats.
TOON in Java: Meet JToon (My port)
Java devs rejoice! JToon brings TOON to the JVM – Maven Central ready.
Add it:
<dependency>
  <groupId>com.felipestanzani</groupId>
  <artifactId>jtoon</artifactId>
  <version>0.1.1</version>
</dependency>
Code:
import com.felipestanzani.jtoon.JToon;
import com.felipestanzani.jtoon.Delimiter;
import com.felipestanzani.jtoon.EncodeOptions;
import java.util.List;

record Item(String sku, int qty, double price) {}
record Data(List<Item> items) {}

var items = List.of(new Item("A1", 2, 9.99), new Item("B2", 1, 14.5));
var data = new Data(items);
var options = new EncodeOptions(2, Delimiter.TAB, true); // Tabs + length marker
System.out.println(JToon.encode(data, options));
Output (tabs shown as spaces):
items[#2]{sku  qty  price}:
A1  2  9.99
B2  1  14.5
Even encodes JSON strings directly: JToon.encodeJson(jsonString).
TOON Everywhere: Ports Galore
TOON's spec is open and crystal-clear, so ports are popping up:
- TypeScript/JS (original): github.com/johannschopplich/toon
- Java (JToon – battle-tested): github.com/felipestanzani/JToon
- .NET: ToonSharp
- Crystal: toon-crystal
- Dart: toon
- Elixir: toon_ex
- Gleam: toon_codec
- Go: gotoon
- OCaml: ocaml-toon
- PHP: toon-php
- Python: python-toon or pytoon
- Ruby: toon-ruby
- Rust: rtoon
- Swift: TOONEncoder
Pick your poison and start saving.
Stop Losing Money – Switch to TOON Now
JSON had its time. For LLM prompts, TOON is the future: cheaper, faster, more accurate. Thank Johann Schopplich for this gem – follow him at byjohann.link for more AI wizardry.
Try it in your next prompt. Your wallet (and your LLMs) will thank you. What's your biggest token horror story? Drop it in the comments – and share this if you're tired of JSON waste!
Originally posted on my blog, Memory Leak