Token-Oriented Object Notation (TOON) is a compact, human-readable encoding of the JSON data model, specifically designed to minimize tokens and simplify structure for Large Language Models (LLMs). It acts as a drop-in, lossless representation of JSON, allowing developers to use familiar JSON programmatically while converting to TOON for efficient AI input.
TOON merges YAML’s indentation-based structure for nested objects with CSV-style tabular arrays for uniform data. Its primary strength is with uniform arrays of objects—multiple fields per row with consistent structure—achieving compactness similar to CSV, while maintaining explicit schema information for reliable LLM parsing. For deeply nested or non-uniform data, standard JSON may remain more efficient.
Why TOON?
AI is becoming more accessible and context windows are expanding, but every token still costs money, and standard JSON is verbose:
{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": ["ana", "luis", "sam"],
  "hikes": [
    {"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5, "elevationGain": 320, "companion": "ana", "wasSunny": true},
    {"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2, "elevationGain": 540, "companion": "luis", "wasSunny": false},
    {"id": 3, "name": "Wildflower Loop", "distanceKm": 5.1, "elevationGain": 180, "companion": "sam", "wasSunny": true}
  ]
}
TOON conveys the same information with fewer tokens, combining YAML-style indentation and CSV-style tabular arrays:
context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true
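To make the tabular form concrete, here is a minimal sketch of how a uniform array of flat objects could be collapsed into a header plus rows. It is an illustration only, not the official encoder: `encodeTable`, `Row`, and `Primitive` are hypothetical names, and the sketch assumes values never need quoting (no commas, colons, or newlines inside strings).

```typescript
type Primitive = string | number | boolean | null;
type Row = Record<string, Primitive>;

// Encode a uniform array of flat objects as a TOON-style table.
// Simplification: values are written verbatim, so strings that would
// need quoting are not handled here.
function encodeTable(key: string, rows: Row[]): string {
  if (rows.length === 0) return `${key}[0]:`;

  // The first row defines the field list declared in the header.
  const fields = Object.keys(rows[0]);

  // Every row must carry exactly the same fields, otherwise the
  // array is not uniform and the tabular form does not apply.
  for (const row of rows) {
    const sameShape =
      Object.keys(row).length === fields.length &&
      fields.every((f) => f in row);
    if (!sameShape) throw new Error(`non-uniform row in "${key}"`);
  }

  // Header: key, explicit length, and field names declared once.
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  // Rows: one indented line per object, values in header order.
  const lines = rows.map(
    (row) => "  " + fields.map((f) => String(row[f])).join(",")
  );
  return [header, ...lines].join("\n");
}

const hikes: Row[] = [
  { id: 1, name: "Blue Lake Trail", distanceKm: 7.5, elevationGain: 320, companion: "ana", wasSunny: true },
  { id: 2, name: "Ridge Overlook", distanceKm: 9.2, elevationGain: 540, companion: "luis", wasSunny: false },
  { id: 3, name: "Wildflower Loop", distanceKm: 5.1, elevationGain: 180, companion: "sam", wasSunny: true },
];

console.log(encodeTable("hikes", hikes));
```

Because the field names appear once in the header instead of once per object, the savings grow with the number of rows.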
Key Features
- Token-Efficient & Accurate: In mixed-structure benchmarks, TOON reaches 74% accuracy versus JSON’s 70% while using roughly 40% fewer tokens.
- JSON-Compatible: Encodes objects, arrays, and primitives with deterministic, lossless round-trips.
- LLM-Friendly: Explicit [N] lengths and {fields} headers provide clear schema information for reliable parsing.
- Minimal Syntax: Indentation instead of braces, minimal quoting, YAML-like readability with CSV compactness.
- Tabular Arrays: Uniform arrays collapse into tables, declaring fields once and streaming row values line by line.
- Multi-Language Ecosystem: Implementations exist in TypeScript, Python, Go, Rust, .NET, and more.
Media Type & File Extension
- File extension: .toon
- Media type: text/toon
- Encoding: Always UTF-8.
When Not to Use TOON
- Deeply nested or non-uniform structures: Compact (minified) JSON may use fewer tokens.
- Semi-uniform arrays: Token savings are reduced.
- Pure tabular data: CSV may remain slightly smaller than TOON.
- Latency-critical applications: Test performance against your specific model and setup.
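One rough way to tell which of these cases applies to an array is to measure how many of its elements share the same flat shape. The sketch below is a heuristic of our own, not part of TOON: `tabularShare` is a hypothetical helper, and how to interpret its result (for example, treating only a value of 1 as fully uniform) is an assumption you should tune against real token counts.

```typescript
// Returns the fraction (0..1) of elements that are flat objects with
// the same key set as the first element. 1 means fully uniform,
// values in between suggest a semi-uniform array, 0 means not tabular.
function tabularShare(items: unknown[]): number {
  if (items.length === 0) return 0;

  // A "flat" object has only primitive (or null) values.
  const isFlatObject = (v: unknown): v is Record<string, unknown> =>
    typeof v === "object" &&
    v !== null &&
    !Array.isArray(v) &&
    Object.values(v).every((x) => x === null || typeof x !== "object");

  // Order-independent fingerprint of an object's key set.
  const signatureOf = (v: Record<string, unknown>): string =>
    Object.keys(v).sort().join("\u0000");

  const first = items[0];
  if (!isFlatObject(first)) return 0;
  const signature = signatureOf(first);

  const matching = items.filter(
    (v) => isFlatObject(v) && signatureOf(v) === signature
  ).length;
  return matching / items.length;
}

console.log(tabularShare([{ id: 1, name: "a" }, { id: 2, name: "b" }]));
console.log(tabularShare([{ id: 1 }, { id: 2 }, { id: 3, extra: true }, { id: { nested: 1 } }]));
```

A share well below 1 is a hint to compare TOON against compact JSON on your own data before committing to either.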
Benchmarks
Across four major LLMs, TOON uses fewer tokens than JSON while improving comprehension accuracy:
- Efficiency Score: Accuracy % ÷ Tokens × 1,000
- Mixed-Structure Track: TOON uses 39.6% fewer tokens while improving accuracy over standard JSON.
- Flat-Only Track: TOON uses about 6% more tokens than CSV, the price of its added structure and reliability.
Detailed per-model benchmarks show TOON outperforms JSON, YAML, and XML across varied datasets while remaining competitive with CSV on flat tabular data.
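The efficiency score above is simple to compute yourself. In the sketch below, `efficiencyScore` is a hypothetical helper name, the accuracy figures (74% and 70%) come from this article, and the token counts (1,000 for JSON, 600 for TOON) are made-up round numbers chosen only to reflect a roughly 40% reduction. They are not benchmark results.

```typescript
// Efficiency score as defined above: accuracy % / tokens * 1,000.
function efficiencyScore(accuracyPercent: number, tokens: number): number {
  if (tokens <= 0) throw new RangeError("tokens must be positive");
  return (accuracyPercent / tokens) * 1000;
}

// Hypothetical token counts, for illustration only.
const jsonScore = efficiencyScore(70, 1000);
const toonScore = efficiencyScore(74, 600);

console.log(`JSON: ${jsonScore.toFixed(1)}`);
console.log(`TOON: ${toonScore.toFixed(1)}`);
```

Under these assumed counts, most of the gap comes from the smaller token count rather than from the accuracy difference.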
Installation & Quick Start
CLI (no installation required):
npx @toon-format/cli input.json -o output.toon
echo '{"name": "Ada", "role": "dev"}' | npx @toon-format/cli
TypeScript Library:
npm install @toon-format/toon
Example usage:
import { encode } from '@toon-format/toon'

const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin' },
    { id: 2, name: 'Bob', role: 'user' }
  ]
}

console.log(encode(data))
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user
Playgrounds & Editor Support
- Official Playground: Convert JSON to TOON in real time, compare token counts, share experiments.
- Editor Support: VS Code extension, Tree-sitter grammar, Neovim plugin, and YAML highlighting for other editors.
Using TOON with LLMs
TOON’s structure is self-documenting. When prompting LLMs:
- Wrap data in TOON code blocks.
- Provide [N] lengths and {fields} headers.
- Use tab delimiters for token efficiency.
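The [N] and {fields} markers also give you something to check when a model returns TOON. The following is a minimal sketch of such a check, not the official decoder: `parseTableHeader` and `checkTable` are hypothetical helpers, and they assume a single comma-delimited table with unquoted values, as in the examples in this article.

```typescript
interface TableHeader {
  key: string;
  length: number;
  fields: string[];
}

// Parse a header such as "users[2]{id,name,role}:".
// Only the comma-delimited, unquoted form is recognised.
function parseTableHeader(line: string): TableHeader | null {
  const m = /^([A-Za-z_][\w.]*)\[(\d+)\]\{([^}]*)\}:$/.exec(line.trim());
  if (!m) return null;
  return { key: m[1], length: Number(m[2]), fields: m[3].split(",") };
}

// Return a list of problems; an empty list means the declared row
// count and field count match what is actually in the block.
function checkTable(block: string): string[] {
  const [headerLine, ...rest] = block.split("\n");
  const header = parseTableHeader(headerLine);
  if (!header) return ["first line is not a tabular header"];

  const rows = rest.filter((line) => line.trim() !== "");
  const problems: string[] = [];
  if (rows.length !== header.length) {
    problems.push(`declared ${header.length} rows, found ${rows.length}`);
  }
  rows.forEach((row, i) => {
    // Naive split: quoted values containing commas are not supported.
    const cells = row.trim().split(",");
    if (cells.length !== header.fields.length) {
      problems.push(
        `row ${i + 1}: expected ${header.fields.length} values, found ${cells.length}`
      );
    }
  });
  return problems;
}

const truncated = "users[3]{id,name,role}:\n  1,Alice,admin\n  2,Bob";
console.log(checkTable(truncated));
```

A truncated or padded answer then shows up as a mismatch against the declared length instead of passing silently.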
Other Implementations
- Official: .NET, Dart, Go, Java, Julia, Python, Rust, Swift
- Community: Apex, C++, Clojure, Crystal, Elixir, Scala, Lua, OCaml, Perl, PHP, R, Ruby, Kotlin
TOON is stable, but still evolving. Contributions, feedback, and experimentation are encouraged.
Summary:
TOON provides a token-efficient, readable, LLM-friendly alternative to JSON, especially for uniform arrays of objects. It reduces token costs, increases parsing reliability, and is easy to integrate with existing JSON workflows. For developers working with LLMs at scale, TOON is a powerful addition to the toolset.