How a novel data format is saving developers 30-60% on LLM token costs
If you've been working with Large Language Models, you've probably noticed ...
For further actions, you may consider blocking this person and/or reporting abuse
Thank you for the article, would you mind adding our JSON to TOON tool in your article?
scalevise.com/json-toon-converter
Cobol is that you?
Why not use YAML if you want a more condensed format?
Our system processes billions of tokens each month from the database alone, we can't use YAML or JSON because they are too token heavy. For flat results (like a database query) we simply return it as CSV (which is even 30% fewer tokens than TOON). We haven't adopted TOON yet (we have our own for structured objects) but its definitely more token friendly than YAML. At the scale we operate, YAML is expensive.
I based my statement on the examples I have seen at that moment. And I agree if you use YAML for tabular data it is expensive. That is why in my other comment I mentioned a switch based on the shape of the data. CSV for tabular data and YAML for hierarchical data.
You can even have CSV in YAML.
If that wasn't possible, I would go all in for TOON for those mixed cases.
TOON is YAML with hierarchical data. So it doesn't reduces tokens. And as you mention, in the case of tabular data CSV is better.
If you can show where TOON is saving tokens over the smart use of YAML and CSV, I'm glad to stand corrected.
I got that TOON is one more compact data serialization format, similar in purpose to YAML
True but why the need to invent a new format?
Most languages have mature YAML and CSV libraries if you need to condense the text that is send to an AI.
The main reason for TOON is probably that you can feed it content that is better compacted by YAML and better compacted by CSV. Instead of creating a function to switch the output between the two formats yourself.
Out of curiosity I asked an AI to create that function.
I’ve been experimenting with this new “code execution with MCP” concept and recently implemented it in my OSS mcpproxy project. The current version can generate a JSON → TOON converter on the fly for any MCP tool within an agent session, which makes data format conversion and reuse incredibly easy.
Prompt example: Read this post: dev.to/akki907/toon-vs-json-the-new-format-designed-for-ai-nk5. Then implement and run code that converts the tool's JSON output to TOON format, using the tool. Output ONLY TOON data in code execution response
Surely XML would do better in peak accuracy because a close tag explicitly matches to an opening tag?
Surely Markdown and YAML offer similar compactness and readability to TOON with better support?
It would be good to see the article amended to test against the JSON, XML, YAML, Markdown and CSV (or perhaps better still another delimited format like tab-delimited) data, to get a better idea as to how it compares to a more complete range of options
I'll just leave this here for people still taking this seriously 👀
How it is different from TONL github.com/tonl-dev/tonl ?
Thank you for the article)
Thanks for sharing this. Built a quick tool for anyone wanting to test TOON formatting: bestaitools.tech/tools/json-to-toon Runs client-side, no data sent to servers 🛠️
Thanks for sharing.
Why would you pick TOON over CSV?
Thanks for the article. Had read many articles, but this one explains in much better way!
Thorough resource on how to use TOON, thanks! I've made a free converter from JSON to TOON to showcase the library functionality.