πŸš€ TOON (Token-Oriented Object Notation) β€” The Smarter, Lighter JSON for LLMs

When building AI and LLM-based applications, one of the biggest hidden costs often comes from something simple β€” the format of your data.

Every {, }, [, ], and " in your JSON adds to the token count when you send it to a Large Language Model (LLM).

With big payloads or complex structured data, this can burn through tokens (and money) fast. ⚑️

That's where TOON (Token-Oriented Object Notation) steps in β€” a format designed specifically for LLMs to make structured data compact, readable, and token-efficient.


πŸ’‘ What Is TOON?

TOON stands for Token-Oriented Object Notation β€” a modern, lightweight data format optimized for LLMs.

Think of it as:

"JSON, reimagined for token efficiency and human readability."

It trims the excess β€” no curly braces, square brackets, or quotes β€” and uses indentation plus tabular patterns instead.

The result is a format that models (and humans) can parse easily, while using far fewer tokens.


βš™οΈ Why TOON Matters

When you send JSON to an LLM:

  • Every punctuation mark adds to the token count.
  • Repeated keys in long arrays multiply the cost.
  • The verbosity doesn't actually help model understanding.

TOON solves this by:

  • Declaring keys once per table-like block
  • Replacing commas/braces with indentation
  • Maintaining data clarity but cutting syntactic noise

πŸ’° The result: typically 30–60% fewer tokens for flat, tabular data.


🧠 Example: TOON in Action

JSON

{
  "users": [
    { "id": 1, "name": "Alice" },
    { "id": 2, "name": "Bob" }
  ]
}

TOON

users[2]{id,name}:
  1,Alice
  2,Bob

Same structure.

Same meaning.

Roughly half the tokens.


🧰 Encode JSON β†’ TOON in TypeScript

Try it yourself using the official TOON package.

Installation

npm install @toon-format/toon
# or
pnpm add @toon-format/toon

Example Code

import { encode, decode } from "@toon-format/toon";

const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ],
};

const toon = encode(data);
console.log("TOON Format:\n", toon);

// Decode back to JSON if needed
const parsed = decode(toon);
console.log("Decoded JSON:\n", parsed);

Output

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

βš–οΈ JSON vs TOON

| Feature | JSON | TOON |
| --- | --- | --- |
| Purpose | Universal data format (APIs, configs, storage) | Token-efficient format for LLMs |
| Syntax | Verbose: {}, [], " | Compact: indentation, tabular style |
| Readability | Moderate | High (human + model friendly) |
| Token usage | High πŸ”₯ | Up to 60% fewer |
| Best use case | APIs, persistence | LLM prompts, structured outputs |
| Nested objects | Excellent | ⚠️ Inefficient for deep nesting |
| Ecosystem | Mature, universal | Emerging, growing fast |

⚠️ When Not to Use TOON

TOON shines for flat, tabular data such as arrays of uniform objects, but it's not ideal for deeply nested structures.

In those cases, the indentation and structural overhead can actually increase the token count.

Example:

{
  "company": {
    "departments": [
      {
        "name": "Engineering",
        "employees": [{ "id": 1, "name": "Alice" }]
      }
    ]
  }
}

➑ Converting this to TOON can produce output that is longer, not shorter.
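If you want to sanity-check this on your own nested data, here's a minimal sketch. It uses character counts only as a rough stand-in for tokens; the benchmark script at the end of this post runs the same comparison with a real tokenizer.

import { encode } from "@toon-format/toon";

// The deeply nested example from above
const nested = {
  company: {
    departments: [
      { name: "Engineering", employees: [{ id: 1, name: "Alice" }] },
    ],
  },
};

const compactJson = JSON.stringify(nested);
const prettyJson = JSON.stringify(nested, null, 2);
const asToon = encode(nested);

// Character counts are only a rough proxy for tokens,
// but they show how little deeply nested data gains from TOON.
console.log("Compact JSON chars:", compactJson.length);
console.log("Pretty JSON chars:", prettyJson.length);
console.log("TOON chars:", asToon.length);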

βœ… Best suited for

  • Flat lists (users, products, messages)
  • Prompt templates
  • Model training or evaluation datasets

❌ Avoid for

  • Deeply nested hierarchies
  • Complex relational data

πŸ“Š Token Efficiency Snapshot

| Dataset | JSON tokens | TOON tokens | Savings |
| --- | --- | --- | --- |
| User list | 150 | 82 | βˆ’45% |
| Product catalog | 320 | 180 | βˆ’44% |
| Nested data | 410 | 435 | ❌ +6% |

🧩 TL;DR

TOON (Token-Oriented Object Notation) is a lightweight, token-efficient alternative to JSON β€” built for AI and LLM workloads.

βœ… Cleaner syntax

βœ… Human-readable

βœ… Up to 60% fewer tokens

But remember β€” it works best for flat JSON objects, not deeply nested structures.

If you're building LLM pipelines, prompt templates, or structured AI datasets, TOON can save tokens, reduce cost, and keep your data clean.
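For instance, a prompt template can embed the TOON-encoded table directly instead of pretty-printed JSON. This is a minimal sketch; the prompt wording and the users variable are just illustrative.

import { encode } from "@toon-format/toon";

const users = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
];

// Embed the compact TOON table in the prompt instead of JSON.
const prompt = `You are given a list of users in TOON format:

${encode({ users })}

Return the names of all users whose role is "admin".`;

console.log(prompt);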


πŸ§ͺ Bonus: Benchmark Token Count (JSON vs TOON)

Here's a quick Node.js script you can use to compare token usage between JSON and TOON using OpenAI's tiktoken tokenizer.

Install Dependencies

npm install @toon-format/toon tiktoken

Script

import { encode } from "@toon-format/toon";
import { encoding_for_model } from "tiktoken";

const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
    { id: 3, name: "Charlie", role: "editor" },
  ],
};

const jsonData = JSON.stringify(data, null, 2);
const toonData = encode(data);

// Use the gpt-4o-mini tokenizer (swap in another model supported by tiktoken if you like)
const tokenizer = encoding_for_model("gpt-4o-mini");

const jsonTokens = tokenizer.encode(jsonData).length;
const toonTokens = tokenizer.encode(toonData).length;

console.log("πŸ“Š Token Comparison");
console.log("-------------------");
console.log("JSON tokens:", jsonTokens);
console.log("TOON tokens:", toonTokens);
console.log("Savings:", (((jsonTokens - toonTokens) / jsonTokens) * 100).toFixed(2) + "%");

tokenizer.free();

Example Output

πŸ“Š Token Comparison
-------------------
JSON tokens: 84
TOON tokens: 32
Savings: 61.90%

You can tweak this for your own datasets (a small helper sketched below makes that easy); for flat, tabular data you'll generally see 30–60% token savings.
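One way to do that is to wrap the comparison in a reusable function and feed it your own objects. This is a sketch; compareTokens is just an illustrative name, not part of the TOON package.

import { encode } from "@toon-format/toon";
import { encoding_for_model } from "tiktoken";

// Illustrative helper: token counts and savings for any JSON-serializable value.
function compareTokens(data: any) {
  const tokenizer = encoding_for_model("gpt-4o-mini");
  try {
    const jsonTokens = tokenizer.encode(JSON.stringify(data, null, 2)).length;
    const toonTokens = tokenizer.encode(encode(data)).length;
    const savings = ((jsonTokens - toonTokens) / jsonTokens) * 100;
    return { jsonTokens, toonTokens, savings: savings.toFixed(2) + "%" };
  } finally {
    tokenizer.free();
  }
}

console.log(compareTokens({
  products: [
    { sku: "A1", name: "Keyboard", price: 49.99 },
    { sku: "B2", name: "Mouse", price: 19.99 },
  ],
}));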


πŸ’¬ Final Thoughts

The ecosystem around LLMs is evolving fast, and even small optimizations β€” like switching from JSON to TOON β€” can create huge cost and performance improvements at scale.

Try it out, benchmark it, and see how many tokens (and dollars) you save! πŸš€


Tags: #AI #LLM #PromptEngineering #JSON #TOON #AIOptimization #OpenAI #DataCompression #DeveloperTools
