
TERSE Tool Catalog (TTC): Cut Tool Catalog Token Usage by 66.6% in Your AI Agents

If you’ve ever built or worked with AI agents that use tools via the Model Context Protocol (MCP), you’ve probably felt the pain that nobody talks about out loud:

The tool catalog is eating your entire context window and budget.

A single tool defined in MCP JSON Schema typically consumes 100–270 tokens. With 50 tools installed, you’re already spending 5,000–13,500 tokens before the user even writes their first message.

This isn’t just expensive — it actively hurts performance:

  • Higher cost on every single request
  • Lower tool-selection accuracy as the catalog grows (attention dilution)
  • Less room for actual user instructions, memory, or reasoning

The good news? There’s a clean, elegant solution: TERSE Tool Catalog (TTC).

The Problem with Today’s MCP JSON Schema

The current MCP format was designed for machine-to-machine execution contracts, not for LLM reasoning. As a result:

  • There is no explicit trigger condition (WHEN) — the LLM has to guess from a free-form description string.
  • There is no error contract (ERR) — the model has no idea what to do when a tool fails.
  • There is no retrieval taxonomy (TAGS) — dynamic tool retrieval (RAG over tools) becomes painful.
  • Verbose parameter descriptions add noise with almost zero signal for the LLM.

The result is high cost + mediocre tool selection.

Introducing the TERSE Tool Catalog (TTC)

TTC is an official extension of the TERSE Format — a specification for dense, deterministic, human-and-machine-readable representations optimized for LLMs.

It is not just a compression of MCP JSON. It is a semantic reformulation of the tool contract.

TTC keeps everything the LLM actually needs for execution and defines four structured fields — three of which (WHEN, ERR, TAGS) have no MCP equivalent:

  • PURPOSE — clear one-line intent
  • WHEN — explicit semantic trigger (the most important field for selection)
  • ERR — declared failure modes
  • TAGS — taxonomy for semantic grouping and retrieval

Measured result: average 66.6% token reduction with net information gain.

TTC Syntax — Clean and Simple

TOOL <tool-id>
  PURPOSE: <one-line description of what the tool does>
  IN: <param1>:<type>, <param2>:<type>?
  OUT: <return-type>
  ERR: <error1> | <error2> | <error3>
  WHEN: <natural language trigger condition>
  TAGS: <tag1>, <tag2>, <tag3>

Supported Types

  • string, int, float, bool
  • array[string], array[int], etc.
  • object, any

The ? suffix marks an optional parameter.
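Because the syntax is line-oriented and deterministic, parsing it takes only a few lines. Below is a minimal sketch of a TTC parser; `parse_ttc` is a hypothetical helper, not part of the spec or the reference repo, and it only covers the fields shown above.

```python
def parse_ttc(text: str) -> dict:
    """Parse one TTC tool block into a dict.

    Hypothetical helper (not from the official spec or reference
    implementation); covers only the fields shown in the syntax above.
    """
    tool: dict = {}
    for raw in text.strip().splitlines():
        line = raw.strip()
        if line.startswith("TOOL "):
            tool["id"] = line[5:].strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip().upper(), value.strip()
            if key == "IN":
                # "name:type" pairs; a trailing "?" marks the param optional
                tool["in"] = [
                    {"name": n.strip(), "type": t.rstrip("?"), "optional": t.endswith("?")}
                    for n, _, t in (p.strip().partition(":") for p in value.split(","))
                ]
            elif key == "ERR":
                tool["err"] = [e.strip() for e in value.split("|")]
            elif key == "TAGS":
                tool["tags"] = [t.strip() for t in value.split(",")]
            elif key in ("PURPOSE", "OUT", "WHEN"):
                tool[key.lower()] = value
    return tool
```

The flat key-value shape is what makes the format both human-skimmable and trivially machine-readable.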

Real-World Example: gmail_send_email

MCP JSON Schema (208 tokens):

{
  "name": "gmail_send_email",
  "description": "Sends an email message via the Gmail API to one or more recipients...",
  "input_schema": { ... }  // very verbose
}

TTC (55 tokens):

TOOL gmail_send_email
  PURPOSE: send email via Gmail
  IN: to:string, subject:string, body:string, cc:string?
  OUT: message_id:string
  ERR: auth_failed | quota_exceeded | invalid_recipient
  WHEN: user wants to send or compose an email
  TAGS: gmail, email, communication

Same semantic content. 73.6% fewer tokens. And the LLM now has structured fields to make much better decisions.

Real Benchmark (10 Production Tools)

| Tool | JSON Schema (tokens) | TTC (tokens) | Reduction |
|---|---|---|---|
| gmail_send_email | 208 | 55 | 73.6% |
| gmail_read_inbox | 121 | 52 | 57.0% |
| drive_list_files | 141 | 53 | 62.4% |
| calendar_create_event | 262 | 78 | 70.2% |
| slack_send_message | 206 | 69 | 66.5% |
| github_create_issue | 269 | 84 | 68.8% |
| ... | ... | ... | ... |
| **TOTAL (10 tools)** | **1948** | **650** | **66.6%** |

Projection at scale:

  • 50 tools → ~9,740 tokens (JSON Schema) vs ~3,250 tokens (TTC)
  • 100 tools → ~19,480 tokens vs ~6,500 tokens

Savings at 100 tools: ~13,000 tokens per request.
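The projection is straight arithmetic on the benchmark averages (1948/10 ≈ 194.8 tokens per tool in JSON Schema, 650/10 = 65 in TTC). A small sketch, with `projected_savings` as a hypothetical helper:

```python
def projected_savings(n_tools: int,
                      avg_json: float = 194.8,
                      avg_ttc: float = 65.0) -> tuple[int, int, int]:
    """Project per-request catalog token cost for n_tools tools.

    Hypothetical helper; the default averages come from the 10-tool
    benchmark above (1948/10 and 650/10 tokens per tool).
    Returns (json_total, ttc_total, savings).
    """
    json_total = round(n_tools * avg_json)
    ttc_total = round(n_tools * avg_ttc)
    return json_total, ttc_total, json_total - ttc_total
```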

Why TTC Works So Well

It follows the core TERSE principles:

  • Maximum information density per token
  • Determinism (same input → same output)
  • Human + machine readability
  • Full composability (tools → servers → agent context)

And it adds exactly what LLMs need for better reasoning:

  • WHEN becomes the primary discriminator for tool selection
  • ERR enables graceful degradation and fallback strategies
  • TAGS makes dynamic tool retrieval (RAG over tools) trivial

How to Use It in Your Agent Context

At the start of a conversation (or via dynamic retrieval), you inject:

TOOLS v1.0 [3/47]
  MCP gmail v1.2
    TOOL gmail_send_email
      ...
  MCP google_drive v2.0
    TOOL drive_read_file
      ...

With semantic tool retrieval, you inject only the 3–5 most relevant tools per request. Context cost then stays effectively constant — bounded by your top-k — no matter how large the total catalog grows.
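The retrieval step can be sketched with simple keyword overlap against each tool's WHEN clause and TAGS. `select_tools` is a hypothetical helper; a production system would score with embedding similarity instead, but the structured fields it ranks on are exactly what TTC provides.

```python
def select_tools(query: str, tools: list[dict], k: int = 3) -> list[str]:
    """Return the ids of the k tools most relevant to the query.

    Hypothetical sketch: scores by word overlap between the query and
    each tool's WHEN clause plus TAGS. Real deployments would use
    embedding similarity, but the ranked fields are the same.
    """
    query_terms = set(query.lower().split())

    def score(tool: dict) -> int:
        terms = set(tool["when"].lower().split())
        terms |= {t.lower() for t in tool["tags"]}
        return len(query_terms & terms)

    ranked = sorted(tools, key=score, reverse=True)
    return [t["id"] for t in ranked[:k]]
```

Because WHEN and TAGS are explicit fields rather than buried in a free-form description, the retrieval index needs no extra annotation pass.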

Reference Converter (Python)

The author provides a ready-to-use reference implementation:

github.com/RudsonCarvalho/terse-format

It converts MCP JSON Schema → TTC with sensible defaults. For production use, you simply add explicit annotations for OUT, ERR, WHEN, and TAGS on the server side.
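The core of such a conversion fits in one function. The sketch below is not the repo's actual code — `mcp_to_ttc` is a hypothetical helper with placeholder defaults — but it illustrates the shape: the MCP fields map mechanically, while WHEN, ERR, and TAGS have no MCP source and get stub values you would override server-side.

```python
def mcp_to_ttc(mcp: dict) -> str:
    """Convert an MCP-style tool definition to a TTC block.

    Hypothetical sketch, not the reference converter from the repo.
    WHEN/ERR/TAGS have no MCP equivalent, so defaults are emitted;
    in production you would annotate them explicitly server-side.
    """
    schema = mcp.get("input_schema", {})
    props = schema.get("properties", {})
    required = set(schema.get("required", []))
    type_map = {"string": "string", "integer": "int",
                "number": "float", "boolean": "bool"}
    params = []
    for name, spec in props.items():
        t = type_map.get(spec.get("type", "any"), "any")
        # "?" suffix marks params not listed as required
        params.append(f"{name}:{t}" + ("" if name in required else "?"))
    lines = [
        f"TOOL {mcp['name']}",
        f"  PURPOSE: {mcp.get('description', '').split('.')[0]}",
        f"  IN: {', '.join(params)}",
        "  OUT: any",
        "  ERR: unknown_error",
        f"  WHEN: use when the user needs {mcp['name'].replace('_', ' ')}",
        "  TAGS: " + ", ".join(mcp["name"].split("_")[:2]),
    ]
    return "\n".join(lines)
```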

Planned Future Extensions

  • EXAMPLE block — input/output examples for few-shot learning
  • COST annotation — estimated token/latency cost per call
  • CHAIN annotation — tool dependencies and composition patterns
  • ALIAS field — alternative trigger phrases
  • AUTH annotation — required OAuth scopes

Conclusion

The TERSE Tool Catalog is not just a token-saving trick. It is a genuine improvement in agent quality — better tool selection, better error handling, and native support for semantic tool retrieval.

If you work with agents, MCP, LangGraph, CrewAI, AutoGen, or any modern agentic framework, TTC is worth trying today.

Links

📄 Full spec (Zenodo): https://doi.org/10.5281/zenodo.19869007

💻 GitHub: https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc

🌐 Landing page: https://rudsoncarvalho.github.io/terse-format/

📦 TERSE Format (parent spec): https://doi.org/10.5281/zenodo.19058364
