<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Webmaster Ramos</title>
    <description>The latest articles on DEV Community by Webmaster Ramos (@webramos).</description>
    <link>https://dev.to/webramos</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1273297%2F5ca1ac25-c251-4144-a980-975f0b6a7c4d.jpg</url>
      <title>DEV Community: Webmaster Ramos</title>
      <link>https://dev.to/webramos</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/webramos"/>
    <language>en</language>
    <item>
      <title>YAML vs Markdown vs JSON vs TOON: Which Format Is Most Efficient for the Claude API</title>
      <dc:creator>Webmaster Ramos</dc:creator>
      <pubDate>Tue, 14 Apr 2026 20:22:36 +0000</pubDate>
      <link>https://dev.to/webramos/yaml-vs-markdown-vs-json-vs-toon-which-format-is-most-efficient-for-the-claude-api-4l94</link>
      <guid>https://dev.to/webramos/yaml-vs-markdown-vs-json-vs-toon-which-format-is-most-efficient-for-the-claude-api-4l94</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;My own benchmark across three Claude tiers (Haiku, Sonnet, Opus): 120 data files, 8 real-world scenarios, 5 formats. Tokens, cost, and accuracy – numbers, not opinions.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  You Are Overpaying for Prompts
&lt;/h2&gt;

&lt;p&gt;Every time you send data to the Claude API, the format of that data determines how many tokens you spend. The same 200-product catalog in JSON costs 15,879 tokens. In Markdown, it costs 7,814. In TOON, 6,088. That is a 62% difference.&lt;/p&gt;

&lt;p&gt;A 120-task list? JSON consumes 8,500 tokens. TOON uses 2,267. Savings: 73%.&lt;/p&gt;

&lt;p&gt;The problem is that every existing benchmark focuses on GPT, Gemini, and Llama. There has not been a public benchmark for Claude. I decided to fix that.&lt;/p&gt;

&lt;p&gt;I ran 450 API calls on Claude Haiku 4.5, tested Sonnet 4.6 and Opus 4.6, and counted tokens across 120 files using Anthropic’s production tokenizer. Eight real-world scenarios, five formats. In this article – the results, the conclusions, and specific recommendations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Five Formats at a Glance
&lt;/h2&gt;

&lt;h3&gt;
  
  
  JSON (JavaScript Object Notation)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Year created:&lt;/strong&gt; 2001; ECMA-404 standard (2013)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author:&lt;/strong&gt; Douglas Crockford&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary use case:&lt;/strong&gt; APIs, data exchange between systems, configuration files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key characteristic:&lt;/strong&gt; strict typing, nesting via &lt;code&gt;{}&lt;/code&gt; and &lt;code&gt;[]&lt;/code&gt;, mandatory quotes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;JSON is the lingua franca of programmatic interfaces. Every API speaks JSON, and every language can parse it. But that universality comes at a price in an LLM context: quotes, braces, and commas all consume tokens. They carry syntactic weight, but not semantic meaning.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"products"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mouse"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;29.99&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"in_stock"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}]}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  YAML (YAML Ain't Markup Language)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Year created:&lt;/strong&gt; 2001; YAML 1.2 standard (2009)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Clark Evans, Ingy döt Net, Oren Ben-Kiki&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary use case:&lt;/strong&gt; configuration files (Docker Compose, Kubernetes, GitHub Actions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key characteristic:&lt;/strong&gt; indentation-based structure, minimal punctuation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;YAML is the de facto standard of the DevOps world. It reads like pseudocode and usually does not require quotes. The trade-off is that repeating keys for every array item eats up much of the punctuation savings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;products&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Mouse&lt;/span&gt;
    &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;29.99&lt;/span&gt;
    &lt;span class="na"&gt;in_stock&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Markdown
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Year created:&lt;/strong&gt; 2004&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author:&lt;/strong&gt; John Gruber (with Aaron Swartz)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary use case:&lt;/strong&gt; documentation, READMEs, blogs, wikis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key characteristic:&lt;/strong&gt; human-first syntax – headings &lt;code&gt;#&lt;/code&gt;, tables &lt;code&gt;|&lt;/code&gt;, lists &lt;code&gt;-&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Markdown is the most “native” format for LLMs. Models have been trained on billions of READMEs and wiki pages. GitHub, Notion, Obsidian – all rely on Markdown. It is a communication format, not a data format.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Products&lt;/span&gt;

| ID | Name  | Price | In Stock |
|----|-------|-------|----------|
| 1  | Mouse | 29.99 | Yes      |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Plain Text
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primary use case:&lt;/strong&gt; human communication – emails, notes, instructions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key characteristic:&lt;/strong&gt; no syntax, no markup, maximum flexibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Plain text with no markup. It minimizes token overhead, but it provides no explicit structure for programmatic data extraction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Products: Mouse (ID 1, $29.99, in stock)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  TOON (Token-Oriented Object Notation)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Year created:&lt;/strong&gt; 2025 (v1.0 – November 2025, MIT license)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author:&lt;/strong&gt; open-source community (&lt;a href="https://github.com/toon-format/toon" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary use case:&lt;/strong&gt; token optimization in LLM prompts, replacing JSON in AI workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key characteristic:&lt;/strong&gt; a YAML + CSV hybrid (indentation for objects, row-style encoding for arrays)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The newest format in this comparison. TOON was created for one purpose: minimize tokens while preserving lossless JSON round-tripping. For arrays of homogeneous objects, field names are declared once and values are written as CSV-style rows. On GPT-5 Nano, it showed 99.4% accuracy with 46% token savings. Before this benchmark, it had not been tested on Claude.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;products[1]{id,name,price,in_stock}:
1,Mouse,29.99,true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Methodology
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What I Tested
&lt;/h3&gt;

&lt;p&gt;Eight scenarios, each in three sizes (S / M / L), each in five formats. Total: 120 data files.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Data type&lt;/th&gt;
&lt;th&gt;S&lt;/th&gt;
&lt;th&gt;M&lt;/th&gt;
&lt;th&gt;L&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;System prompt / instructions&lt;/td&gt;
&lt;td&gt;Rules, sections&lt;/td&gt;
&lt;td&gt;10 rules&lt;/td&gt;
&lt;td&gt;30 rules&lt;/td&gt;
&lt;td&gt;60 rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Product catalog&lt;/td&gt;
&lt;td&gt;Tabular data&lt;/td&gt;
&lt;td&gt;20 products&lt;/td&gt;
&lt;td&gt;100 products&lt;/td&gt;
&lt;td&gt;200 products&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Roadmap / tasks&lt;/td&gt;
&lt;td&gt;Statuses, dependencies&lt;/td&gt;
&lt;td&gt;15 tasks&lt;/td&gt;
&lt;td&gt;50 tasks&lt;/td&gt;
&lt;td&gt;120 tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Business rules&lt;/td&gt;
&lt;td&gt;Conditional logic&lt;/td&gt;
&lt;td&gt;8 rules&lt;/td&gt;
&lt;td&gt;25 rules&lt;/td&gt;
&lt;td&gt;50 rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Few-shot classification&lt;/td&gt;
&lt;td&gt;Input-output examples&lt;/td&gt;
&lt;td&gt;5 examples&lt;/td&gt;
&lt;td&gt;15 examples&lt;/td&gt;
&lt;td&gt;40 examples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Organizational hierarchy&lt;/td&gt;
&lt;td&gt;3 levels of nesting&lt;/td&gt;
&lt;td&gt;12 people&lt;/td&gt;
&lt;td&gt;60 people&lt;/td&gt;
&lt;td&gt;150 people&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;API documentation&lt;/td&gt;
&lt;td&gt;Endpoints, parameters&lt;/td&gt;
&lt;td&gt;5 endpoints&lt;/td&gt;
&lt;td&gt;15 endpoints&lt;/td&gt;
&lt;td&gt;30 endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Output format&lt;/td&gt;
&lt;td&gt;Requesting data in a given format&lt;/td&gt;
&lt;td&gt;10 countries&lt;/td&gt;
&lt;td&gt;50 countries&lt;/td&gt;
&lt;td&gt;100 countries&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Few-shot&lt;/strong&gt; (scenario 5) is a prompting technique in which several “input → output” examples are included directly in the prompt so the model can infer the task from a pattern. For example: &lt;code&gt;"Great product!" → positive&lt;/code&gt;, &lt;code&gt;"Terrible quality" → negative&lt;/code&gt;, then the question &lt;code&gt;"Love it!" → ?&lt;/code&gt;. Zero examples is zero-shot, one example is one-shot, several examples is few-shot. The format of those examples directly affects cost: 40 pairs in JSON take 2,131 tokens; in TOON, 996.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For scenarios 2, 3, 6, and 7, I prepared questions with precomputed correct answers (ground truth). For scenarios 1, 4, and 5, scoring was manual and rubric-based. For scenario 8, I measured output tokens and format compliance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Models and Pricing
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Input ($/1M)&lt;/th&gt;
&lt;th&gt;Output ($/1M)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Haiku 4.5&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;$0.80&lt;/td&gt;
&lt;td&gt;$4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4.6&lt;/td&gt;
&lt;td&gt;Mid&lt;/td&gt;
&lt;td&gt;$3&lt;/td&gt;
&lt;td&gt;$15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Opus 4.6&lt;/td&gt;
&lt;td&gt;Premium&lt;/td&gt;
&lt;td&gt;$15&lt;/td&gt;
&lt;td&gt;$75&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Accuracy was measured across all three tiers. Sizes S and M were tested for accuracy. L-size was used only for token counts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Clean-Test Principle
&lt;/h3&gt;

&lt;p&gt;All requests were sent directly via the &lt;code&gt;anthropic&lt;/code&gt; Python SDK: plain &lt;code&gt;client.messages.create()&lt;/code&gt; with &lt;code&gt;temperature=0&lt;/code&gt;. No MCP servers, IDE plugins, or agent frameworks.&lt;/p&gt;

&lt;p&gt;Token counting was done with &lt;code&gt;client.messages.count_tokens()&lt;/code&gt; – Anthropic’s production tokenizer, i.e. the same numbers used for billing. &lt;strong&gt;The tokenizer is the same across all Claude tiers&lt;/strong&gt; – so the token-count data applies to all Claude models.&lt;/p&gt;

&lt;p&gt;Benchmark code: &lt;a href="https://github.com/webmaster-ramos/yaml-vs-md-benchmark" rel="noopener noreferrer"&gt;github.com/webmaster-ramos/yaml-vs-md-benchmark&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Input-Token Efficiency
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;These numbers apply to all Claude tiers – Haiku, Sonnet, and Opus all use the same tokenizer. The only cost difference comes from the price per token.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Summary Table: Average Input Tokens Across All Scenarios
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Average tokens&lt;/th&gt;
&lt;th&gt;vs JSON&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;3,252&lt;/td&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;2,208&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-32%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;td&gt;1,514&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-53%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;1,391&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-57%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;1,226&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-62%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;TOON saves 62% of input tokens on average versus JSON. Markdown saves 53%. YAML, despite its minimal punctuation, saves only 32% – because of repeated keys and indentation overhead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Breakdown by Scenario (% Savings vs JSON, L-size)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;YAML&lt;/th&gt;
&lt;th&gt;MD&lt;/th&gt;
&lt;th&gt;TXT&lt;/th&gt;
&lt;th&gt;TOON&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Instructions&lt;/td&gt;
&lt;td&gt;-22%&lt;/td&gt;
&lt;td&gt;-29%&lt;/td&gt;
&lt;td&gt;-24%&lt;/td&gt;
&lt;td&gt;-24%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Products&lt;/td&gt;
&lt;td&gt;-29%&lt;/td&gt;
&lt;td&gt;-51%&lt;/td&gt;
&lt;td&gt;-53%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-62%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tasks&lt;/td&gt;
&lt;td&gt;-35%&lt;/td&gt;
&lt;td&gt;-63%&lt;/td&gt;
&lt;td&gt;-69%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-73%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Business Rules&lt;/td&gt;
&lt;td&gt;-28%&lt;/td&gt;
&lt;td&gt;-52%&lt;/td&gt;
&lt;td&gt;-48%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-63%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Few-shot&lt;/td&gt;
&lt;td&gt;-31%&lt;/td&gt;
&lt;td&gt;-45%&lt;/td&gt;
&lt;td&gt;-37%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-53%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hierarchy&lt;/td&gt;
&lt;td&gt;-37%&lt;/td&gt;
&lt;td&gt;-61%&lt;/td&gt;
&lt;td&gt;-67%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-68%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Docs&lt;/td&gt;
&lt;td&gt;-35%&lt;/td&gt;
&lt;td&gt;-45%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-59%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-53%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  YAML Savings vs JSON (%, L-size)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz60wk1w9ooculhvvjok1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz60wk1w9ooculhvvjok1.png" alt="YAML savings vs JSON by scenario" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  MD Savings vs JSON (%, L-size)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0vl0ox06w09jse0ncgq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0vl0ox06w09jse0ncgq.png" alt="MD savings vs JSON by scenario" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  TXT Savings vs JSON (%, L-size)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsu83ntyoo8rm0ug98vbx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsu83ntyoo8rm0ug98vbx.png" alt="TXT savings vs JSON by scenario" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  TOON Savings vs JSON (%, L-size)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F070ng6789odiolaota0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F070ng6789odiolaota0y.png" alt="TOON savings vs JSON by scenario" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Detailed Charts by Scenario
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: Instructions
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzchje0o1csh4gvxsnh3j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzchje0o1csh4gvxsnh3j.png" alt="Input tokens: Instructions" width="800" height="485"&gt;&lt;/a&gt;)&lt;/p&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: Products
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjvnbl7gwa854kp4jv79.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjvnbl7gwa854kp4jv79.png" alt="Input tokens: Products" width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: Tasks
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flc0qtuu6gqkkt9veu9vm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flc0qtuu6gqkkt9veu9vm.png" alt="Input tokens: Tasks" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: Rules
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiu0vc317rzolm6qit0gc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiu0vc317rzolm6qit0gc.png" alt="Input tokens: Rules" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: Few-shot
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rys14l7hw6f7n6pk8q3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rys14l7hw6f7n6pk8q3.png" alt="Input tokens: Few-shot" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: Hierarchy
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foayupph0gz3j4x0rigs0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foayupph0gz3j4x0rigs0.png" alt="Input tokens: Hierarchy" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Input tokens by scenario: API Docs
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw984vwh94vw6r6e1uzod.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw984vwh94vw6r6e1uzod.png" alt="Input tokens: API Docs" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Observations
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;TOON is the clear leader for tabular data.&lt;/strong&gt; Product catalogs, task lists, few-shot examples – anything that looks like an array of homogeneous objects. Savings: 62–73% versus JSON.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Markdown is the best all-purpose format.&lt;/strong&gt; A stable 50–65% reduction across all data types. It is the only format that performs consistently well across tables, instructions, and hierarchies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;YAML is underwhelming.&lt;/strong&gt; Many people expect YAML to be much more compact than JSON. In practice, the savings are only 14–41%. The reason is repeated keys for every array element.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Plain Text wins on API docs.&lt;/strong&gt; For technical specifications, plain text is more efficient than TOON (59% vs 53%). Without extra syntax, descriptive text compresses better.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scale barely affects the percentage savings.&lt;/strong&gt; The difference between S and L is under 2 percentage points. Format drives efficiency more than data volume does.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Haiku 4.5: When Format Matters
&lt;/h2&gt;

&lt;p&gt;Haiku is the most format-sensitive tier. In 35% of questions, it produced different answers depending on the input format. Accuracy spread reached as high as 36 percentage points between the best and worst format within the same scenario.&lt;/p&gt;

&lt;h3&gt;
  
  
  Accuracy by Scenario
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Accuracy Haiku: Products (product catalog)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3auwn1sahrmazcfa9ih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3auwn1sahrmazcfa9ih.png" alt="Accuracy Haiku: Products" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Accuracy Haiku: Tasks (tasks / roadmap)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rf6ypezr9byqkghfq7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rf6ypezr9byqkghfq7q.png" alt="Accuracy Haiku: Tasks" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Accuracy Haiku: Hierarchy (organizational hierarchy)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3f3q2rmsufbuj0z6726.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3f3q2rmsufbuj0z6726.png" alt="Accuracy Haiku: Hierarchy" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Accuracy Haiku: API Docs (documentation)
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsoum4j3h0tmlw7e36qlw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsoum4j3h0tmlw7e36qlw.png" alt="Accuracy Haiku: API Docs" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;JSON&lt;/th&gt;
&lt;th&gt;YAML&lt;/th&gt;
&lt;th&gt;MD&lt;/th&gt;
&lt;th&gt;TXT&lt;/th&gt;
&lt;th&gt;TOON&lt;/th&gt;
&lt;th&gt;Best&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Products&lt;/td&gt;
&lt;td&gt;63.4%&lt;/td&gt;
&lt;td&gt;61.4%&lt;/td&gt;
&lt;td&gt;69.2%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;70.2%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;66.2%&lt;/td&gt;
&lt;td&gt;TXT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tasks&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;71.0%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;65.7%&lt;/td&gt;
&lt;td&gt;66.7%&lt;/td&gt;
&lt;td&gt;56.7%&lt;/td&gt;
&lt;td&gt;65.3%&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hierarchy&lt;/td&gt;
&lt;td&gt;85.7%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;92.9%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;85.7%&lt;/td&gt;
&lt;td&gt;78.2%&lt;/td&gt;
&lt;td&gt;85.7%&lt;/td&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Docs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;85.7%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;85.7%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;57.1%&lt;/td&gt;
&lt;td&gt;78.6%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;85.7%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JSON/YAML/TOON&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Hierarchy shows the sharpest gap:&lt;/strong&gt; YAML (92.9%) vs Markdown (57.1%) – a 36-point difference. Tree-like structures are clearly easier for Haiku to parse in an indentation-based format.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API Docs: Markdown performs unexpectedly poorly&lt;/strong&gt; – 57.1% vs 85.7% for JSON. For technical specifications with parameters and types, explicit structure matters more than compactness.&lt;/p&gt;

&lt;h3&gt;
  
  
  Accuracy by Size (Haiku)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S (small data)&lt;/td&gt;
&lt;td&gt;80.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M (medium data)&lt;/td&gt;
&lt;td&gt;67.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Scale matters more than format.&lt;/strong&gt; Accuracy drops by 13 points when moving from S to M – more than the average difference between formats (5.7 points). The implication is straightforward: reduce data volume first, then optimize format.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost: Haiku
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Avg tokens&lt;/th&gt;
&lt;th&gt;Cost / request&lt;/th&gt;
&lt;th&gt;100K requests / month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;3,252&lt;/td&gt;
&lt;td&gt;$0.0026&lt;/td&gt;
&lt;td&gt;$260&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;2,208&lt;/td&gt;
&lt;td&gt;$0.0018&lt;/td&gt;
&lt;td&gt;$177&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;1,514&lt;/td&gt;
&lt;td&gt;$0.0012&lt;/td&gt;
&lt;td&gt;$121&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TXT&lt;/td&gt;
&lt;td&gt;1,391&lt;/td&gt;
&lt;td&gt;$0.0011&lt;/td&gt;
&lt;td&gt;$111&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;1,226&lt;/td&gt;
&lt;td&gt;$0.0010&lt;/td&gt;
&lt;td&gt;$98&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON -&amp;gt; TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-62%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$162/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Output Format: Haiku
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Output tokens: S-size (10 countries) – Haiku, Sonnet, Opus
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfvqb33zeladjlrvo53x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfvqb33zeladjlrvo53x.png" alt="Output tokens S-size, all 3 models" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Output tokens: M-size (50 countries) – Haiku, Sonnet, Opus
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q9vzf5yk20kzlf9p6gy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q9vzf5yk20kzlf9p6gy.png" alt="Output tokens M-size, all 3 models" width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requested format&lt;/th&gt;
&lt;th&gt;S (10 countries)&lt;/th&gt;
&lt;th&gt;M (50 countries)&lt;/th&gt;
&lt;th&gt;Savings vs JSON&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;465&lt;/td&gt;
&lt;td&gt;1,985&lt;/td&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;296&lt;/td&gt;
&lt;td&gt;1,352&lt;/td&gt;
&lt;td&gt;-32..36%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Markdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;165&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,125&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-43..65%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;294&lt;/td&gt;
&lt;td&gt;1,381&lt;/td&gt;
&lt;td&gt;-30..37%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;342&lt;/td&gt;
&lt;td&gt;1,369&lt;/td&gt;
&lt;td&gt;-26..31%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Markdown is the cheapest output format on Haiku.&lt;/strong&gt; 165 vs 465 tokens on S-size – a 65% reduction. At $4 per 1M output tokens, that matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important: TOON loses on output.&lt;/strong&gt; Haiku does not know the TOON format and, instead of producing compact CSV-like rows, tends to emit verbose plain text that only vaguely resembles TOON. A few-shot example improves TOON output quality, but it still trails Markdown in efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Output-Format Choice: Technical Requirements
&lt;/h3&gt;

&lt;p&gt;Output cost is not the only thing that matters. Often, Claude’s response must be processed programmatically – parsed, inserted into a database, or passed to another service. The best output format depends on who or what is going to read it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Usage scenario&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;User-facing answer in UI&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Markdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Renders natively, lowest token cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend parsing&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reliable, universal, guaranteed structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Config / YAML pipeline&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;YAML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Human-readable + machine-parsable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rows for CSV / spreadsheet&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;TXT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Minimal overhead, structure via delimiters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compact output for TOON SDK&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only if using Opus, or with a few-shot example&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; if a human reads the output, use Markdown. If code reads it, use JSON or YAML. Do not optimize output cost at the expense of parsing reliability in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations for Haiku
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data type&lt;/th&gt;
&lt;th&gt;Best input&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Best output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;System prompts&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;stable&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MD&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catalogs, lists&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;TXT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;70.2%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MD&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tasks / roadmap&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;71.0%&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;MD&lt;/strong&gt; or &lt;strong&gt;JSON&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hierarchies&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;YAML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;92.9%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;YAML&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API documentation&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;JSON or YAML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;85.7%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Few-shot examples&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;65.3% (-0.5% vs JSON)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MD&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On Haiku, format matters – especially for hierarchies and API documentation. Use TOON on input where token savings are worth a small accuracy trade-off, but &lt;strong&gt;do not use TOON on output&lt;/strong&gt; without a few-shot example.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sonnet 4.6: Format Affects Cost, Not Quality
&lt;/h2&gt;

&lt;p&gt;Sonnet 4.6 produced identical answers across all five formats. In 100% of questions, the result was the same regardless of how the data was represented. For Sonnet, format optimization is pure cost reduction with no quality trade-off.&lt;/p&gt;

&lt;h3&gt;
  
  
  Accuracy: Format-Invariant
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Accuracy by model and format
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faman5i1lej65040r6mqc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faman5i1lej65040r6mqc.png" alt="Accuracy by model and format" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Sonnet 4.6&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The answers are completely identical across all formats. Switching from JSON to TOON saves 62% of input tokens while preserving the same output.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost: Sonnet
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Avg tokens&lt;/th&gt;
&lt;th&gt;Cost / request&lt;/th&gt;
&lt;th&gt;100K requests / month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;3,252&lt;/td&gt;
&lt;td&gt;$0.0098&lt;/td&gt;
&lt;td&gt;$975&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;2,208&lt;/td&gt;
&lt;td&gt;$0.0066&lt;/td&gt;
&lt;td&gt;$663&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;1,514&lt;/td&gt;
&lt;td&gt;$0.0045&lt;/td&gt;
&lt;td&gt;$454&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TXT&lt;/td&gt;
&lt;td&gt;1,391&lt;/td&gt;
&lt;td&gt;$0.0042&lt;/td&gt;
&lt;td&gt;$417&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;1,226&lt;/td&gt;
&lt;td&gt;$0.0037&lt;/td&gt;
&lt;td&gt;$368&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON -&amp;gt; TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-62%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$607/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At 100K requests per month, switching from JSON to TOON saves $607/month. On Sonnet, output costs $15 per 1M tokens, so output optimization also matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Output Format: Sonnet
&lt;/h3&gt;

&lt;p&gt;Output tokens for Sonnet (estimated as characters ÷ 3.5 chars/token):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;S (10 countries)&lt;/th&gt;
&lt;th&gt;M (50 countries)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;~210&lt;/td&gt;
&lt;td&gt;~1,120&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;~195&lt;/td&gt;
&lt;td&gt;~1,023&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Markdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~143&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~746&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;~103&lt;/td&gt;
&lt;td&gt;~549&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~86&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~414&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Comparison of output tokens across all three models (S-size):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfvqb33zeladjlrvo53x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfvqb33zeladjlrvo53x.png" alt="Output tokens S-size, all 3 models" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;M-size (50 countries):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q9vzf5yk20kzlf9p6gy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q9vzf5yk20kzlf9p6gy.png" alt="Output tokens M-size, all 3 models" width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On Sonnet, TOON output requires a few-shot example.&lt;/strong&gt; Without extra context, Sonnet interprets “TOON format” literally – as an abbreviation connected to cartoons – and returns an irrelevant answer. With a format example in the prompt, it generates correct TOON.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Technical requirements for output on Sonnet&lt;/strong&gt; are the same as on Haiku: if a downstream system parses the response programmatically, use JSON or YAML. If a human is going to read it, use Markdown.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations for Sonnet
&lt;/h3&gt;

&lt;p&gt;On Sonnet, format choice is a pure cost optimization. The logic is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input data:&lt;/strong&gt; use TOON (for tables) or MD (for instructions / hierarchies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-readable output:&lt;/strong&gt; Markdown (-65% vs JSON)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Machine-parsed output:&lt;/strong&gt; JSON (most reliable) or YAML (more compact, still parseable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TOON output:&lt;/strong&gt; add a few-shot example to the prompt; otherwise the answer may be incorrect&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Optimal prompt design: &lt;strong&gt;MD for instructions + TOON for data + a request for MD/JSON output&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Opus 4.6: Maximum Capability, Also Format-Invariant
&lt;/h2&gt;

&lt;p&gt;Opus 4.6 is the strongest model and the most expensive one. Like Sonnet, it is completely insensitive to input format. But Opus has one unique advantage: it knows TOON “out of the box.”&lt;/p&gt;

&lt;h3&gt;
  
  
  Accuracy: Format-Invariant
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Opus 4.6&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The answers are 100% identical across all formats. Changing format affects only cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost: Opus
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Avg tokens&lt;/th&gt;
&lt;th&gt;Cost / request&lt;/th&gt;
&lt;th&gt;100K requests / month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;3,252&lt;/td&gt;
&lt;td&gt;$0.0488&lt;/td&gt;
&lt;td&gt;$4,878&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;2,208&lt;/td&gt;
&lt;td&gt;$0.0331&lt;/td&gt;
&lt;td&gt;$3,312&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;1,514&lt;/td&gt;
&lt;td&gt;$0.0227&lt;/td&gt;
&lt;td&gt;$2,271&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TXT&lt;/td&gt;
&lt;td&gt;1,391&lt;/td&gt;
&lt;td&gt;$0.0209&lt;/td&gt;
&lt;td&gt;$2,087&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;1,226&lt;/td&gt;
&lt;td&gt;$0.0184&lt;/td&gt;
&lt;td&gt;$1,839&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON -&amp;gt; TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-62%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$3,039/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On Opus, switching from JSON to TOON saves over $3,000/month at 100K requests. Output costs $75 per 1M tokens – so format optimization has the largest financial impact here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Output Format: Opus
&lt;/h3&gt;

&lt;p&gt;Output tokens for Opus (estimated as characters ÷ 3.5 chars/token):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;S (10 countries)&lt;/th&gt;
&lt;th&gt;M (50 countries)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;~254&lt;/td&gt;
&lt;td&gt;~1,271&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;~286&lt;/td&gt;
&lt;td&gt;~1,414&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Markdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~177&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~814&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;~194&lt;/td&gt;
&lt;td&gt;~986&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~106&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~543&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Comparison of output tokens across all three models (S-size):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfvqb33zeladjlrvo53x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfvqb33zeladjlrvo53x.png" alt="Output tokens S-size, all 3 models" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;M-size (50 countries):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q9vzf5yk20kzlf9p6gy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q9vzf5yk20kzlf9p6gy.png" alt="Output tokens M-size, all 3 models" width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Opus generates TOON without hints.&lt;/strong&gt; That is the key difference from Sonnet and Haiku. Opus knows the format and produces valid TOON output on the first try.&lt;/p&gt;

&lt;h4&gt;
  
  
  Can Claude generate valid TOON output?
&lt;/h4&gt;

&lt;p&gt;&lt;a href="/media/blog/chart-toon-output.png" class="article-body-image-wrapper"&gt;&lt;img src="/media/blog/chart-toon-output.png" alt="TOON output generation across models"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Without example in prompt&lt;/th&gt;
&lt;th&gt;With few-shot example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.6&lt;/td&gt;
&lt;td&gt;Valid TOON&lt;/td&gt;
&lt;td&gt;Valid TOON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;Cartoon / irrelevant&lt;/td&gt;
&lt;td&gt;Valid TOON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;Verbose plain text&lt;/td&gt;
&lt;td&gt;Closer to TOON, but still inaccurate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In practical terms, this means: if you need TOON output and want it to work reliably without prompt scaffolding, use Opus.&lt;/p&gt;

&lt;h3&gt;
  
  
  Technical Requirements for Output: When Parsing Matters More Than Cost
&lt;/h3&gt;

&lt;p&gt;On Opus, output costs $75 per 1M tokens – so output-format savings are highly relevant. But the requirements of the downstream system still take priority:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scenarios where output must be parsed programmatically:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The response goes into a database or structured store – use &lt;strong&gt;JSON&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Another LLM or service consumes the response through an API – use &lt;strong&gt;JSON&lt;/strong&gt; or &lt;strong&gt;YAML&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The response is part of a pipeline (the next step processes the data) – use &lt;strong&gt;JSON&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The response is rendered in the UI as text or a document – use &lt;strong&gt;Markdown&lt;/strong&gt; (lowest token cost)&lt;/li&gt;
&lt;li&gt;You need compact machine-readable output and already have a TOON SDK – use &lt;strong&gt;TOON&lt;/strong&gt; (only Opus works reliably without prompt help)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The key point:&lt;/strong&gt; output on Opus costs $75 per 1M – five times more than input. A 65% output reduction (Markdown vs JSON) can matter even more than input savings. But do not trade away parse reliability just to cut cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendations for Opus
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input:&lt;/strong&gt; TOON for tabular data (-62%), MD for instructions (-53%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-readable output:&lt;/strong&gt; Markdown (-65% output tokens)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Machine-parsed output:&lt;/strong&gt; JSON – reliable and universal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TOON output:&lt;/strong&gt; works without few-shot – Opus’s unique advantage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not use JSON on input:&lt;/strong&gt; it is the most expensive format with no accuracy benefit&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Accuracy Across All Models and Formats
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Haiku 4.5&lt;/th&gt;
&lt;th&gt;Sonnet 4.6&lt;/th&gt;
&lt;th&gt;Opus 4.6&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;75.3%&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;75.1%&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;td&gt;69.6%&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plain Text&lt;/td&gt;
&lt;td&gt;70.6%&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;74.8%&lt;/td&gt;
&lt;td&gt;89.4%&lt;/td&gt;
&lt;td&gt;93.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For Sonnet and Opus, format does not affect accuracy. For Haiku, it matters materially – especially for hierarchies and documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision Matrix: Input Format
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data type&lt;/th&gt;
&lt;th&gt;Haiku&lt;/th&gt;
&lt;th&gt;Sonnet / Opus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;System prompts / instructions&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;MD&lt;/strong&gt; (-29%)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; or &lt;strong&gt;MD&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catalogs, lists&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TXT&lt;/strong&gt; (70.2%)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; (-62%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tasks / roadmap&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;JSON&lt;/strong&gt; (71.0%)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; (-73%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Business rules&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;JSON&lt;/strong&gt; (stable)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; (-63%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Few-shot examples&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; (≈JSON)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; (-53%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hierarchies&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;YAML&lt;/strong&gt; (92.9%)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TOON&lt;/strong&gt; or &lt;strong&gt;MD&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API documentation&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;JSON/YAML&lt;/strong&gt; (85.7%)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TXT&lt;/strong&gt; (-59%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Decision Matrix: Output Format
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Output consumer&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;th&gt;Haiku&lt;/th&gt;
&lt;th&gt;Sonnet&lt;/th&gt;
&lt;th&gt;Opus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UI / end user&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Markdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;native&lt;/td&gt;
&lt;td&gt;native&lt;/td&gt;
&lt;td&gt;native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API / JSON parser&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;reliable&lt;/td&gt;
&lt;td&gt;reliable&lt;/td&gt;
&lt;td&gt;reliable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML pipeline&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;YAML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;reliable&lt;/td&gt;
&lt;td&gt;reliable&lt;/td&gt;
&lt;td&gt;reliable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON SDK&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;TOON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;with few-shot*&lt;/td&gt;
&lt;td&gt;with few-shot*&lt;/td&gt;
&lt;td&gt;native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSV / spreadsheet&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;TXT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;with template&lt;/td&gt;
&lt;td&gt;with template&lt;/td&gt;
&lt;td&gt;with template&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;*Requires a few-shot example in the prompt&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmark Limitations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Accuracy was measured only on S+M sizes.&lt;/strong&gt; L-size includes token counts only. Accuracy may degrade more sharply on larger data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The data is synthetic.&lt;/strong&gt; Catalogs and tasks were script-generated. Real-world data may be messier (missing fields, Unicode, long descriptions).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic scoring covers 4 of 8 cases.&lt;/strong&gt; Cases 1, 4, and 5 require rubric-based evaluation. The accuracy numbers here cover cases 2, 3, 6, and 7.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sonnet / Opus were tested via subscription (subagents).&lt;/strong&gt; Output-token counts are estimated, not directly measured. Haiku was tested via API.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No A/B test on live traffic.&lt;/strong&gt; This is a laboratory benchmark. The impact on a production product must be validated separately.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The code and data are open – reproduce it, extend it, challenge it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Surprised Me
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Opus and Sonnet are completely insensitive to format.&lt;/strong&gt; I expected a 3–5% gap. I got 0%. For the higher tiers, format is pure cost optimization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;YAML is not as efficient as many assume.&lt;/strong&gt; The expectation is usually “YAML is more compact than JSON.” In practice, the savings are only 32%. Repeated keys wipe out much of the benefit from removing braces.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;TOON works on Claude without special training.&lt;/strong&gt; Claude may not have seen much TOON in training data, yet all three tiers parse it correctly – essentially on par with JSON.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Opus knows TOON; Sonnet does not.&lt;/strong&gt; Opus generates valid TOON output without hints. Sonnet interpreted “TOON format” as “cartoon” and produced an irrelevant answer. With a few-shot example, both work correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Markdown is the best output format.&lt;/strong&gt; The gap in output tokens between JSON and Markdown is 65%. At $75 per 1M on Opus, that is significant. It is also the only format every tier generates natively without extra prompting.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;On Haiku, scale matters more than format.&lt;/strong&gt; Accuracy drops from 80.3% (S) to 67.2% (M) – a 13-point drop. The average difference between formats is 5.7 points. On Sonnet and Opus, scale is much less of an issue.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q: Do these results apply to other models (GPT, Gemini)?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The trends are similar, but the numbers differ. Every model has its own tokenizer. On GPT-5 Nano, YAML shows 62% accuracy on nested data (&lt;a href="https://www.improvingagents.com/blog/best-nested-data-format/" rel="noopener noreferrer"&gt;ImprovingAgents&lt;/a&gt;); on Claude Haiku, it reaches 93%. Use these results for Claude, and other benchmarks for other models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: How were tokens counted?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;client.messages.count_tokens()&lt;/code&gt; – the standard Anthropic SDK method and production tokenizer. These are the same numbers used for billing. The tokenizer is the same across all tiers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Why not test XML?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;XML is rarely used in modern LLM workflows. Existing benchmarks (&lt;a href="https://shshell.com/blog/token-efficiency-module-13-lesson-2-format-comparison" rel="noopener noreferrer"&gt;ShShell&lt;/a&gt;) suggest that XML is significantly more expensive than Markdown in token terms, with comparable or worse accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Is TOON a serious format or just hype?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TOON v1.0 was released in November 2025 under MIT, and there are SDKs in 6+ languages. For tabular data, the savings are real – 62% on Claude with JSON-level accuracy. Opus generates TOON output without prompting. Other tiers require a few-shot example.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Does the input format affect the output format?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Partially. If you provide data in YAML, Claude is more likely to structure its answer with indentation. But an explicit instruction such as “Return as a Markdown table” overrides that tendency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Is it worth converting all prompts away from JSON?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At 100K requests/month on Sonnet, moving from JSON to TOON saves $607/month. On Opus, it saves $3,039/month. For hobby projects with 1K requests, the difference is around $6. Run the math on your own usage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Can you combine formats in one prompt?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes – and that is usually the recommended approach. Markdown for instructions + TOON for data + a request for output in the format you need. Claude handles multi-format prompts well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Where is the benchmark source code?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/webmaster-ramos/yaml-vs-md-benchmark" rel="noopener noreferrer"&gt;github.com/webmaster-ramos/yaml-vs-md-benchmark&lt;/a&gt;. All 120 data files, 51 questions, ground truth, runner, and scorer are open for reproduction.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Data format in a prompt is not a cosmetic choice. On the Claude API, the gap between JSON and TOON is 62% on input tokens. Markdown saves 65% on output tokens. At 100K requests/month on Opus, that means $3,039 saved on input and even more on output.&lt;/p&gt;

&lt;p&gt;But the main finding is not about tokens. &lt;strong&gt;Claude Sonnet 4.6 and Opus 4.6 are completely insensitive to format.&lt;/strong&gt; They produced 100% identical answers on JSON, YAML, Markdown, Plain Text, and TOON. For the higher tiers, format optimization is pure savings with no quality trade-off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Only Haiku 4.5 is meaningfully format-sensitive&lt;/strong&gt; – and only there does the choice of format affect accuracy (by up to 36 percentage points). On Haiku, format should be matched to data type: YAML for hierarchies, JSON for tasks with dependencies.&lt;/p&gt;

&lt;p&gt;Beyond cost, there are technical requirements: if the output must be parsed programmatically, JSON is more reliable than Markdown. If a human reads the answer, Markdown is cheaper. Opus is the only tier that generates TOON natively; Sonnet and Haiku require a few-shot example.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR by tier:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Haiku 4.5&lt;/th&gt;
&lt;th&gt;Sonnet 4.6&lt;/th&gt;
&lt;th&gt;Opus 4.6&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Does format affect accuracy?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes, by up to 36 points&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best input (data)&lt;/td&gt;
&lt;td&gt;YAML/JSON/TXT by data type&lt;/td&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best input (instructions)&lt;/td&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best output (human-readable)&lt;/td&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;td&gt;MD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best output (parsing)&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON output without prompt help&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON -&amp;gt; TOON savings&lt;/td&gt;
&lt;td&gt;$162 / 100K&lt;/td&gt;
&lt;td&gt;$607 / 100K&lt;/td&gt;
&lt;td&gt;$3,039 / 100K&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Benchmark run in April 2026 on Claude Opus 4.6, Sonnet 4.6, and Haiku 4.5.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;120 data files, 8 scenarios, 3 sizes, 5 formats, 3 models.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;All code and data: &lt;a href="https://github.com/webmaster-ramos/yaml-vs-md-benchmark" rel="noopener noreferrer"&gt;github.com/webmaster-ramos/yaml-vs-md-benchmark&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>claude</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
