John DeRegnaucourt

Cut Your LLM Token Costs 40-50% — Java TOON Support in json-io

Every time your Spring AI tool call returns a JSON result to the LLM, roughly half the tokens are syntactic noise — braces, brackets, quotes, and repeated key names. You're paying for punctuation.

I maintain json-io (listed on json.org). When I saw the excellent TOON articles here on Dev.to and noticed Java was underserved, I added full TOON support. Here's what it looks like, why it matters, and how to drop it into your Spring AI stack today.

The Problem: JSON's Token Tax

When an LLM calls a tool and gets back structured data, that data is serialized — almost always as JSON. The LLM tokenizes it and feeds it into its context window. Consider this tool call result:

{"employees":[{"name":"Alice Johnson","age":28,"department":"Engineering","salary":95000},{"name":"Bob Smith","age":34,"department":"Marketing","salary":78000},{"name":"Charlie Brown","age":22,"department":"Engineering","salary":72000}]}

Every key is quoted and repeated on every row, every object gets braces, the array gets brackets. The LLM doesn't need any of that structure to understand the data.
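You can see the tax with a quick count of the characters that exist purely for JSON structure. Character counts are only a rough proxy for tokens (real counts depend on the model's vocabulary), but the point stands:

```java
public class JsonOverhead {
    // Characters that exist only to express JSON structure, not data
    static long structuralChars(String json) {
        return json.chars()
            .filter(c -> c == '{' || c == '}' || c == '[' || c == ']'
                      || c == '"' || c == ':' || c == ',')
            .count();
    }

    public static void main(String[] args) {
        String json = "{\"employees\":[{\"name\":\"Alice Johnson\",\"age\":28,"
            + "\"department\":\"Engineering\",\"salary\":95000}]}";
        long structural = structuralChars(json);
        System.out.printf("%d of %d chars are structural (%.0f%%)%n",
            structural, json.length(), 100.0 * structural / json.length());
    }
}
```

Roughly a third of the characters in that single-record payload are pure punctuation, and the ratio gets worse as records repeat.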

What TOON Looks Like

TOON (Token-Oriented Object Notation) is an open format (spec v3.0, Wikipedia) designed specifically for LLM token optimization. The same employee data:

employees[3]{name,age,department,salary}:
  Alice Johnson,28,Engineering,95000
  Bob Smith,34,Marketing,78000
  Charlie Brown,22,Engineering,72000

What's different:

  • No braces or brackets — structure comes from indentation
  • No quotes on keys or values (unless a value contains special characters)
  • Column headers stated once: {name,age,department,salary} replaces repeating every key on every row
  • Full fidelity — this round-trips back to the same Java objects
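You can sanity-check the size difference yourself with plain character counts. This is a crude proxy (the token figures later in this post come from a real tokenizer), but it shows the gap before any tokenization:

```java
public class SizeComparison {
    static final String JSON =
        "{\"employees\":[" +
        "{\"name\":\"Alice Johnson\",\"age\":28,\"department\":\"Engineering\",\"salary\":95000}," +
        "{\"name\":\"Bob Smith\",\"age\":34,\"department\":\"Marketing\",\"salary\":78000}," +
        "{\"name\":\"Charlie Brown\",\"age\":22,\"department\":\"Engineering\",\"salary\":72000}]}";

    static final String TOON =
        "employees[3]{name,age,department,salary}:\n" +
        "  Alice Johnson,28,Engineering,95000\n" +
        "  Bob Smith,34,Marketing,78000\n" +
        "  Charlie Brown,22,Engineering,72000";

    public static void main(String[] args) {
        // Same data, two encodings
        System.out.println("JSON chars: " + JSON.length());
        System.out.println("TOON chars: " + TOON.length());
    }
}
```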

For nested objects, TOON uses indentation:

name: Alice Johnson
address:
  street: 123 Main St
  city: Denver
  state: CO

With key folding enabled, it can flatten one level for even more token savings:

name: Alice Johnson
address.street: 123 Main St
address.city: Denver
address.state: CO
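Key folding amounts to flattening one level of nesting into dotted keys. Here's a minimal sketch of the idea in plain Java (a hypothetical illustration of the transform, not json-io's internal implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyFoldingSketch {
    // Flatten one level of nested maps into dotted keys: address -> address.city
    static Map<String, Object> foldOneLevel(Map<String, Object> input) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : input.entrySet()) {
            if (e.getValue() instanceof Map<?, ?> nested) {
                for (Map.Entry<?, ?> n : nested.entrySet()) {
                    out.put(e.getKey() + "." + n.getKey(), n.getValue());
                }
            } else {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Alice Johnson");
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", "123 Main St");
        address.put("city", "Denver");
        address.put("state", "CO");
        person.put("address", address);

        System.out.println(foldOneLevel(person));
    }
}
```

The folded form drops both an indentation level and the parent key line, which is where the extra savings come from.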

The Numbers: Measured Token Savings

We measured token counts using OpenAI's o200k_base tokenizer (GPT-4o/4.1) across different payload sizes:

| Payload | JSON Tokens | TOON Tokens | Savings |
|---|---|---|---|
| 3 employees, 4 fields | 46 | 35 | 24% |
| 10 employees, 4 fields | 206 | 121 | 41% |
| 20 employees, 4 fields | 413 | 231 | 44% |
| 25 products, 5 fields | 680 | 459 | 33% |

The pattern: savings scale with repetition. A single object saves ~5%. But tool calls rarely return a single object — they return lists of records, search results, database rows. That's where TOON's tabular format delivers 30-50% reduction in token costs.

Java Usage — Two Lines of Code

// Any Java object -> TOON
String toon = JsonIo.toToon(employee);

// TOON -> typed Java object
Employee emp = JsonIo.fromToon(toon).asClass(Employee.class);

// Works with generics
List<Employee> team = JsonIo.fromToon(toon)
    .asType(new TypeHolder<List<Employee>>(){});

// TOON -> Maps (no class definition needed)
Map map = JsonIo.fromToon(toon).asMap();

Spring AI Integration: One Annotation

If you're using Spring AI for LLM tool calling, you can reduce token costs without changing any application logic. Add the dependency:

<dependency>
    <groupId>com.cedarsoftware</groupId>
    <artifactId>json-io-spring-ai-toon</artifactId>
    <version>4.98.0</version>
</dependency>

Then annotate your tools:

@Tool(description = "Look up employees by department",
      resultConverter = ToonToolCallResultConverter.class)
List<Employee> getEmployees(String department) {
    return employeeRepository.findByDepartment(department);
}

That's it. The ToonToolCallResultConverter serializes the return value to TOON before it enters the LLM's context. Every tool call result is now 40-50% smaller.

Programmatic registration works too:

FunctionToolCallback.builder("getEmployees", this::getEmployees)
    .toolCallResultConverter(new ToonToolCallResultConverter())
    .build();

Structured Output Parsing

TOON works in both directions. When you want the LLM to respond with structured data, ToonBeanOutputConverter teaches it to respond in TOON instead of JSON:

ToonBeanOutputConverter<Person> converter =
    new ToonBeanOutputConverter<>(Person.class);

Person person = chatClient.prompt()
    .user("Get info about John")
    .call()
    .entity(converter);

The converter generates format instructions for the LLM and parses TOON responses back into typed Java objects. The LLM produces fewer output tokens too — you save on both sides.

For generic types:

ToonBeanOutputConverter<List<Person>> converter =
    new ToonBeanOutputConverter<>(new TypeHolder<List<Person>>() {});

Configuration

spring:
  json-io:
    ai:
      tool-call:
        key-folding: true      # Flatten nested keys (default: true)
      output:
        strict-toon: false     # Permissive parsing for LLM output (default)

Spring Boot REST Integration

json-io also ships a Spring Boot starter that adds JSON, JSON5, and TOON support to REST APIs via content negotiation:

<dependency>
    <groupId>com.cedarsoftware</groupId>
    <artifactId>json-io-spring-boot-starter</artifactId>
    <version>4.98.0</version>
</dependency>

Your existing REST controllers automatically gain TOON support:

# Request TOON format
curl -H "Accept: application/vnd.toon" http://localhost:8080/api/employees

No controller changes needed. Supports both Spring MVC and WebFlux.
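From Java, the same negotiation is just an Accept header. Here's a sketch using the JDK's built-in java.net.http types (the endpoint URL is illustrative, and actually sending the request requires the server to be running):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ToonRequest {
    // Build a GET request that asks the server for TOON via content negotiation
    static HttpRequest toonRequest(String url) {
        return HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Accept", "application/vnd.toon")
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = toonRequest("http://localhost:8080/api/employees");
        // Send with HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString())
        // once the application is up; the body comes back serialized as TOON
        System.out.println(req.headers().firstValue("Accept").orElse(""));
    }
}
```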

But Can LLMs Actually Read TOON?

Yes. LLMs don't parse JSON structurally — they understand it from training data. TOON's indentation-based format is similar enough to YAML, Python, and other whitespace-significant formats that current LLMs (GPT-4o, Claude, Gemini) handle it without issues. A FreeCodeCamp article covers the accuracy testing — TOON actually achieved slightly higher accuracy (74%) vs JSON (70%) in mixed-structure benchmarks across 4 models.

Where the Savings Come From

It's worth understanding why the savings vary:

  • Single objects: ~5% savings. Keys are stated once in both formats; TOON just drops the quotes and braces.
  • Lists of uniform objects: 30-44% savings. This is the sweet spot. JSON repeats every key name on every object. TOON's tabular format states them once as column headers.
  • Deeply nested structures: 3-10% savings. Indentation overhead partially offsets bracket/brace savings.
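The list case is easy to make concrete with arithmetic: JSON's key overhead grows linearly with row count, while TOON's column header is paid exactly once. A toy model in character counts (assumed costs as a crude stand-in for tokens, not measured figures):

```java
public class KeyOverheadModel {
    // Characters JSON spends re-spelling keys on every row: two quotes plus a colon per key
    static int perRowKeyChars(String[] keys) {
        int total = 0;
        for (String k : keys) total += k.length() + 3;
        return total;
    }

    public static void main(String[] args) {
        String[] keys = {"name", "age", "department", "salary"};
        int perRow = perRowKeyChars(keys);
        int headerChars = "{name,age,department,salary}".length(); // TOON pays this once

        for (int rows : new int[]{1, 10, 100}) {
            System.out.printf("rows=%-4d JSON key overhead=%-5d TOON header=%d%n",
                rows, perRow * rows, headerChars);
        }
    }
}
```

At one row the two are comparable; at a hundred rows JSON has spent over a hundred times more characters on key names alone, which is why savings scale with repetition.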

Tool call results are overwhelmingly lists of records — exactly where TOON performs best. If you're running LLM agents or RAG pipelines that pass structured data through context windows, TOON reduces both cost-per-token spend and context window consumption.

Why json-io?

json-io handles things Jackson and Gson can't — cyclic object graphs, automatic polymorphic types, 25+ annotations (including Jackson annotation compatibility with zero compile-time dependency) — all with zero configuration. 60+ built-in Java types, JDK 8-24, ~1MB total footprint. It reads JSON, JSON5, and TOON, and writes all three formats.

Try It

<dependency>
    <groupId>com.cedarsoftware</groupId>
    <artifactId>json-io</artifactId>
    <version>4.98.0</version>
</dependency>

If you're running Spring AI tool calls in production, I'd be curious to hear what token savings you see on your actual payloads. The 40-50% range has been consistent in our benchmarks, but real-world data is always more interesting.

Happy to answer questions in the comments!
