<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rudson Kiyoshi Souza Carvalho</title>
    <description>The latest articles on DEV Community by Rudson Kiyoshi Souza Carvalho (@rudsoncarvalho).</description>
    <link>https://dev.to/rudsoncarvalho</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F151609%2F89dcc1f7-ea31-49e0-870e-83221de8c418.jpg</url>
      <title>DEV Community: Rudson Kiyoshi Souza Carvalho</title>
      <link>https://dev.to/rudsoncarvalho</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rudsoncarvalho"/>
    <language>en</language>
    <item>
      <title>TERSE Tool Catalog (TTC): Cut Tool Catalog Token Usage by 66.6% in Your AI Agents</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Tue, 05 May 2026 15:14:53 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/terse-tool-catalog-ttc-cut-tool-catalog-token-usage-by-666-in-your-ai-agents-2i9n</link>
      <guid>https://dev.to/rudsoncarvalho/terse-tool-catalog-ttc-cut-tool-catalog-token-usage-by-666-in-your-ai-agents-2i9n</guid>
      <description>&lt;p&gt;If you’ve ever built or worked with &lt;strong&gt;AI agents&lt;/strong&gt; that use tools via the Model Context Protocol (MCP), you’ve probably felt the pain that nobody talks about out loud:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The tool catalog is eating your entire context window and budget.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A single tool defined in MCP JSON Schema typically consumes &lt;strong&gt;100–270 tokens&lt;/strong&gt;. With 50 tools installed, you’re already spending &lt;strong&gt;5,000–13,500 tokens&lt;/strong&gt; &lt;em&gt;before the user even writes their first message&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This isn’t just expensive — it actively hurts performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher cost on every single request&lt;/li&gt;
&lt;li&gt;Lower tool-selection accuracy as the catalog grows (attention dilution)&lt;/li&gt;
&lt;li&gt;Less room for actual user instructions, memory, or reasoning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The good news? There’s a clean, elegant solution: &lt;strong&gt;TERSE Tool Catalog (TTC)&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with Today’s MCP JSON Schema
&lt;/h2&gt;

&lt;p&gt;The current MCP format was designed for &lt;strong&gt;machine-to-machine execution contracts&lt;/strong&gt;, not for &lt;strong&gt;LLM reasoning&lt;/strong&gt;. As a result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is &lt;strong&gt;no explicit trigger condition&lt;/strong&gt; (&lt;code&gt;WHEN&lt;/code&gt;) — the LLM has to guess from a free-form &lt;code&gt;description&lt;/code&gt; string.&lt;/li&gt;
&lt;li&gt;There is &lt;strong&gt;no error contract&lt;/strong&gt; (&lt;code&gt;ERR&lt;/code&gt;) — the model has no idea what to do when a tool fails.&lt;/li&gt;
&lt;li&gt;There is &lt;strong&gt;no retrieval taxonomy&lt;/strong&gt; (&lt;code&gt;TAGS&lt;/code&gt;) — dynamic tool retrieval (RAG over tools) becomes painful.&lt;/li&gt;
&lt;li&gt;Verbose parameter descriptions add noise with almost zero signal for the LLM.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is high cost + mediocre tool selection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing the TERSE Tool Catalog (TTC)
&lt;/h2&gt;

&lt;p&gt;TTC is an official &lt;strong&gt;extension of the TERSE Format&lt;/strong&gt; — a specification for dense, deterministic, human-and-machine-readable representations optimized for LLMs.&lt;/p&gt;

&lt;p&gt;It is &lt;strong&gt;not&lt;/strong&gt; just a compression of MCP JSON. It is a &lt;strong&gt;semantic reformulation&lt;/strong&gt; of the tool contract.&lt;/p&gt;

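&lt;p&gt;TTC keeps everything the LLM actually needs for execution, condenses the free-form description into a one-line &lt;code&gt;PURPOSE&lt;/code&gt;, and adds three fields that MCP is missing (&lt;code&gt;WHEN&lt;/code&gt;, &lt;code&gt;ERR&lt;/code&gt;, &lt;code&gt;TAGS&lt;/code&gt;):&lt;/p&gt;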

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PURPOSE&lt;/code&gt; — clear one-line intent&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;WHEN&lt;/code&gt; — explicit semantic trigger (the most important field for selection)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ERR&lt;/code&gt; — declared failure modes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TAGS&lt;/code&gt; — taxonomy for semantic grouping and retrieval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Measured result&lt;/strong&gt;: &lt;strong&gt;66.6% token reduction&lt;/strong&gt; across the 10 tools benchmarked below, &lt;em&gt;with a net gain in information relevant to tool selection&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  TTC Syntax — Clean and Simple
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TOOL &amp;lt;tool-id&amp;gt;
  PURPOSE: &amp;lt;one-line description of what the tool does&amp;gt;
  IN: &amp;lt;param1&amp;gt;:&amp;lt;type&amp;gt;, &amp;lt;param2&amp;gt;:&amp;lt;type&amp;gt;?
  OUT: &amp;lt;return-type&amp;gt;
  ERR: &amp;lt;error1&amp;gt; | &amp;lt;error2&amp;gt; | &amp;lt;error3&amp;gt;
  WHEN: &amp;lt;natural language trigger condition&amp;gt;
  TAGS: &amp;lt;tag1&amp;gt;, &amp;lt;tag2&amp;gt;, &amp;lt;tag3&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Supported Types
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;string&lt;/code&gt;, &lt;code&gt;int&lt;/code&gt;, &lt;code&gt;float&lt;/code&gt;, &lt;code&gt;bool&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;array[string]&lt;/code&gt;, &lt;code&gt;array[int]&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;object&lt;/code&gt;, &lt;code&gt;any&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;?&lt;/code&gt; suffix marks an optional parameter.&lt;/p&gt;
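&lt;p&gt;As a rough illustration of how a program could read the &lt;code&gt;IN&lt;/code&gt; line and the &lt;code&gt;?&lt;/code&gt; suffix, here is a minimal Python sketch. The function name &lt;code&gt;parse_in_line&lt;/code&gt; is hypothetical and is not part of the TTC spec or the reference converter; it assumes comma-separated &lt;code&gt;name:type&lt;/code&gt; pairs exactly as in the syntax above.&lt;/p&gt;

```python
# Hypothetical helper: parse a TTC "IN:" line into (name, type, optional) triples.
# Assumes comma-separated "name:type" pairs and that a trailing "?" marks an
# optional parameter, as described above.

def parse_in_line(line):
    body = line.strip()
    if body.startswith("IN:"):
        body = body[len("IN:"):]
    params = []
    for chunk in body.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        name, _, type_name = chunk.partition(":")
        optional = type_name.endswith("?")
        if optional:
            type_name = type_name[:-1]
        params.append((name.strip(), type_name.strip(), optional))
    return params


params = parse_in_line("IN: to:string, subject:string, body:string, cc:string?")
for name, type_name, optional in params:
    print(name, type_name, "optional" if optional else "required")
```

&lt;p&gt;With the &lt;code&gt;gmail_send_email&lt;/code&gt; line used in this post, only &lt;code&gt;cc&lt;/code&gt; would come back flagged as optional.&lt;/p&gt;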

&lt;h2&gt;
  
  
  Real-World Example: &lt;code&gt;gmail_send_email&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;MCP JSON Schema (208 tokens):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gmail_send_email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sends an email message via the Gmail API to one or more recipients..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input_schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;very&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;verbose&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;TTC (55 tokens):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TOOL gmail_send_email
  PURPOSE: send email via Gmail
  IN: to:string, subject:string, body:string, cc:string?
  OUT: message_id:string
  ERR: auth_failed | quota_exceeded | invalid_recipient
  WHEN: user wants to send or compose an email
  TAGS: gmail, email, communication
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



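&lt;p&gt;&lt;strong&gt;Same execution contract. 73.6% fewer tokens.&lt;/strong&gt; And the LLM now has structured fields (&lt;code&gt;ERR&lt;/code&gt;, &lt;code&gt;WHEN&lt;/code&gt;, &lt;code&gt;TAGS&lt;/code&gt;) to guide tool selection and error handling.&lt;/p&gt;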

&lt;h2&gt;
  
  
  Real Benchmark (10 Production Tools)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;JSON Schema&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;th&gt;Reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;gmail_send_email&lt;/td&gt;
&lt;td&gt;208&lt;/td&gt;
&lt;td&gt;55&lt;/td&gt;
&lt;td&gt;73.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gmail_read_inbox&lt;/td&gt;
&lt;td&gt;121&lt;/td&gt;
&lt;td&gt;52&lt;/td&gt;
&lt;td&gt;57.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;drive_list_files&lt;/td&gt;
&lt;td&gt;141&lt;/td&gt;
&lt;td&gt;53&lt;/td&gt;
&lt;td&gt;62.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;calendar_create_event&lt;/td&gt;
&lt;td&gt;262&lt;/td&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;70.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;slack_send_message&lt;/td&gt;
&lt;td&gt;206&lt;/td&gt;
&lt;td&gt;69&lt;/td&gt;
&lt;td&gt;66.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;github_create_issue&lt;/td&gt;
&lt;td&gt;269&lt;/td&gt;
&lt;td&gt;84&lt;/td&gt;
&lt;td&gt;68.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;...&lt;/td&gt;
&lt;td&gt;...&lt;/td&gt;
&lt;td&gt;...&lt;/td&gt;
&lt;td&gt;...&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOTAL (10 tools)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1948&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;650&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;66.6%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Projection at scale&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50 tools → ~9,740 → ~3,250 tokens&lt;/li&gt;
&lt;li&gt;100 tools → ~19,480 → ~6,500 tokens (&lt;strong&gt;savings: ~13,000 tokens per request&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
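&lt;p&gt;The percentages and projections above can be re-derived from the raw token counts. The sketch below is only a sanity check: the function names are hypothetical, and the projection assumes a larger catalog keeps the same per-tool average as the 10 measured tools.&lt;/p&gt;

```python
# Token counts copied from the benchmark table above: (JSON Schema, TTC).
MEASURED = {
    "gmail_send_email": (208, 55),
    "gmail_read_inbox": (121, 52),
    "drive_list_files": (141, 53),
    "calendar_create_event": (262, 78),
    "slack_send_message": (206, 69),
    "github_create_issue": (269, 84),
}
TOTAL_10_TOOLS = (1948, 650)  # totals reported for all 10 tools


def reduction_pct(before, after):
    """Percentage of tokens removed, rounded to one decimal."""
    return round(100 * (before - after) / before, 1)


def project(n_tools, total=TOTAL_10_TOOLS, measured_tools=10):
    """Linear extrapolation from the 10-tool totals (an assumption:
    real catalogs will not share exactly the same per-tool average)."""
    before, after = total
    return (before * n_tools // measured_tools, after * n_tools // measured_tools)


for name, (before, after) in MEASURED.items():
    print(name, reduction_pct(before, after))
print("total", reduction_pct(*TOTAL_10_TOOLS))
print("50 tools", project(50))
print("100 tools", project(100))
```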

&lt;h2&gt;
  
  
  Why TTC Works So Well
&lt;/h2&gt;

&lt;p&gt;It follows the core TERSE principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maximum information density per token&lt;/li&gt;
&lt;li&gt;Determinism (same input → same output)&lt;/li&gt;
&lt;li&gt;Human + machine readability&lt;/li&gt;
&lt;li&gt;Full composability (tools → servers → agent context)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And it adds exactly what LLMs need for better reasoning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;WHEN&lt;/code&gt; becomes the primary discriminator for tool selection&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ERR&lt;/code&gt; enables graceful degradation and fallback strategies&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;TAGS&lt;/code&gt; makes dynamic tool retrieval (RAG over tools) trivial&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to Use It in Your Agent Context
&lt;/h2&gt;

&lt;p&gt;At the start of a conversation (or via dynamic retrieval), you inject:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TOOLS v1.0 [3/47]
  MCP gmail v1.2
    TOOL gmail_send_email
      ...
  MCP google_drive v2.0
    TOOL drive_read_file
      ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With semantic tool retrieval, you only inject the 3–5 most relevant tools per request. Per-request context cost then stays roughly constant no matter how large your total catalog grows.&lt;/p&gt;
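&lt;p&gt;To make the retrieval step concrete, here is a deliberately naive Python sketch that ranks tools by word overlap between the user query and each tool's &lt;code&gt;WHEN&lt;/code&gt; and &lt;code&gt;TAGS&lt;/code&gt; fields. Everything in it is an assumption for illustration: the catalog layout, the &lt;code&gt;top_k&lt;/code&gt; helper, and the &lt;code&gt;WHEN&lt;/code&gt; and &lt;code&gt;TAGS&lt;/code&gt; values for &lt;code&gt;drive_read_file&lt;/code&gt;. A real system would more likely use embeddings or TF-IDF than raw word overlap.&lt;/p&gt;

```python
# Hypothetical in-memory catalog: tool id mapped to its WHEN and TAGS fields.
# The first entry comes from the example in this post; the second is invented.
CATALOG = {
    "gmail_send_email": {
        "when": "user wants to send or compose an email",
        "tags": ["gmail", "email", "communication"],
    },
    "drive_read_file": {
        "when": "user wants to read a file in Google Drive",
        "tags": ["drive", "file", "storage"],
    },
}


def words(text):
    """Lowercase word set; a deliberately naive tokenizer."""
    return set(text.lower().replace(",", " ").split())


def score(query, entry):
    """Count query words that appear in the tool's WHEN or TAGS."""
    vocabulary = words(entry["when"]).union(entry["tags"])
    return len(words(query).intersection(vocabulary))


def top_k(query, catalog, k=3):
    """Return the ids of the k best-scoring tools, best first."""
    ranked = sorted(
        catalog,
        key=lambda tool_id: score(query, catalog[tool_id]),
        reverse=True,
    )
    return ranked[:k]


print(top_k("please send an email to Ana", CATALOG, k=1))
```

&lt;p&gt;Only the TTC blocks for the returned ids would then be injected into the prompt.&lt;/p&gt;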

&lt;h2&gt;
  
  
  Reference Converter (Python)
&lt;/h2&gt;

&lt;p&gt;The author provides a ready-to-use reference implementation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;github.com/RudsonCarvalho/terse-format&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It converts MCP JSON Schema → TTC with sensible defaults. For production use, you simply add explicit annotations for &lt;code&gt;OUT&lt;/code&gt;, &lt;code&gt;ERR&lt;/code&gt;, &lt;code&gt;WHEN&lt;/code&gt;, and &lt;code&gt;TAGS&lt;/code&gt; on the server side.&lt;/p&gt;
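&lt;p&gt;For intuition only, a conversion along these lines might look like the Python sketch below. It is not the reference converter: the type mapping, the &lt;code&gt;to_ttc&lt;/code&gt; and &lt;code&gt;ttc_type&lt;/code&gt; helpers, and the choice to reuse the first sentence of the description as &lt;code&gt;PURPOSE&lt;/code&gt; are all assumptions. The fields that JSON Schema does not carry (&lt;code&gt;OUT&lt;/code&gt;, &lt;code&gt;ERR&lt;/code&gt;, &lt;code&gt;WHEN&lt;/code&gt;, &lt;code&gt;TAGS&lt;/code&gt;) are simply left out.&lt;/p&gt;

```python
# Assumed mapping from JSON Schema primitive types to the TTC types listed
# earlier; an illustration, not the reference converter's behavior.
TYPE_MAP = {"string": "string", "integer": "int", "number": "float", "boolean": "bool"}


def ttc_type(schema):
    kind = schema.get("type", "any")
    if kind == "array":
        return "array[" + ttc_type(schema.get("items", {})) + "]"
    if kind == "object":
        return "object"
    return TYPE_MAP.get(kind, "any")


def to_ttc(tool):
    """Render a tool dict (name, description, input_schema) as a partial TTC block."""
    schema = tool.get("input_schema", {})
    required = set(schema.get("required", []))
    params = []
    for name, prop in schema.get("properties", {}).items():
        suffix = "" if name in required else "?"
        params.append(name + ":" + ttc_type(prop) + suffix)
    # Stand-in PURPOSE: first sentence of the free-form description.
    purpose = tool.get("description", "").split(".")[0].strip()
    return "\n".join([
        "TOOL " + tool["name"],
        "  PURPOSE: " + purpose,
        "  IN: " + ", ".join(params),
    ])


# Abbreviated, illustrative input modeled on the gmail_send_email example.
example = {
    "name": "gmail_send_email",
    "description": "Sends an email message via the Gmail API to one or more recipients.",
    "input_schema": {
        "type": "object",
        "required": ["to", "subject", "body"],
        "properties": {
            "to": {"type": "string"},
            "subject": {"type": "string"},
            "body": {"type": "string"},
            "cc": {"type": "string"},
        },
    },
}
print(to_ttc(example))
```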

&lt;h2&gt;
  
  
  Planned Future Extensions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;EXAMPLE&lt;/code&gt; block — input/output examples for few-shot learning&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;COST&lt;/code&gt; annotation — estimated token/latency cost per call&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CHAIN&lt;/code&gt; annotation — tool dependencies and composition patterns&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ALIAS&lt;/code&gt; field — alternative trigger phrases&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AUTH&lt;/code&gt; annotation — required OAuth scopes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The TERSE Tool Catalog is not just a token-saving trick. It is a &lt;strong&gt;genuine improvement in agent quality&lt;/strong&gt; — better tool selection, better error handling, and native support for semantic tool retrieval.&lt;/p&gt;

&lt;p&gt;If you work with agents, MCP, LangGraph, CrewAI, AutoGen, or any modern agentic framework, TTC is worth trying today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;📄 Full spec (Zenodo): &lt;a href="https://doi.org/10.5281/zenodo.19869007" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19869007&lt;/a&gt;&lt;br&gt;&lt;br&gt;
💻 GitHub: &lt;a href="https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc" rel="noopener noreferrer"&gt;https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc&lt;/a&gt;&lt;br&gt;&lt;br&gt;
🌐 Landing page: &lt;a href="https://rudsoncarvalho.github.io/terse-format/" rel="noopener noreferrer"&gt;https://rudsoncarvalho.github.io/terse-format/&lt;/a&gt;&lt;br&gt;&lt;br&gt;
📦 TERSE Format (parent spec): &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19058364&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>mcp</category>
      <category>token</category>
      <category>terse</category>
    </item>
    <item>
      <title>Your AI agent wastes 13,000 tokens before saying "hello"</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Wed, 29 Apr 2026 01:22:37 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/your-ai-agent-wastes-13000-tokens-before-saying-hello-3141</link>
      <guid>https://dev.to/rudsoncarvalho/your-ai-agent-wastes-13000-tokens-before-saying-hello-3141</guid>
      <description>&lt;p&gt;And you probably have no idea.&lt;/p&gt;




&lt;p&gt;If you have an agent with 50 MCP tools installed, here's what happens before any user message is processed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gmail_send_email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sends an email message via the Gmail API to one or more 
    recipients. Use this tool when the user explicitly requests to send, 
    compose and send, or deliver an email message to someone."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input_schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"subject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The recipient email address or comma-separated list"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"subject"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The subject line of the email"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The body content of the email in plain text or HTML"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's &lt;strong&gt;~195 tokens&lt;/strong&gt;. Per tool. Before anything else.&lt;/p&gt;

&lt;p&gt;50 tools × 195 tokens = &lt;strong&gt;9,750 tokens of pure overhead&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And that's just the catalog. You haven't touched user context, conversation history, documents, or anything useful yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  "But there's prompt caching, right?"
&lt;/h2&gt;

&lt;p&gt;Yes. Where the provider supports it, cached input tokens are billed at roughly 10% of the base rate.&lt;/p&gt;

&lt;p&gt;But caching &lt;strong&gt;does not reduce attention cost&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Those tokens still occupy the context window. The model still attends to all of them on every request. And if you use dynamic tool retrieval — selecting different tools per request based on user intent — the cache breaks on every different selection.&lt;/p&gt;

&lt;p&gt;The bill doesn't disappear. It just gets cheaper.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real problem nobody talks about
&lt;/h2&gt;

&lt;p&gt;MCP JSON Schema was designed as a tool execution contract. Not as a semantic tool selection contract.&lt;/p&gt;

&lt;p&gt;The result: information critical for LLM reasoning is either absent or buried in free-form text:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No error contract&lt;/strong&gt; — the LLM doesn't know what to do when &lt;code&gt;auth_failed&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No explicit trigger&lt;/strong&gt; — it has to infer "when to use this tool" from a paragraph of description&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No retrieval taxonomy&lt;/strong&gt; — no standard way to group or filter tools by domain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Verbose AND semantically incomplete. The worst of both worlds.&lt;/p&gt;




&lt;h2&gt;
  
  
  TTC — TERSE Tool Catalog
&lt;/h2&gt;

&lt;p&gt;I spent the last few weeks solving this problem. The result is an extension of the &lt;a href="https://github.com/RudsonCarvalho/terse-format" rel="noopener noreferrer"&gt;TERSE Format&lt;/a&gt; called &lt;strong&gt;TTC — TERSE Tool Catalog&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The same tool above in TTC:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;TOOL gmail_send_email&lt;/span&gt;
  &lt;span class="s"&gt;PURPOSE&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;send email via Gmail&lt;/span&gt;
  &lt;span class="s"&gt;IN&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;to:string, subject:string, body:string, cc:string?&lt;/span&gt;
  &lt;span class="s"&gt;OUT&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;message_id:string&lt;/span&gt;
  &lt;span class="s"&gt;ERR&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;auth_failed | quota_exceeded | invalid_recipient&lt;/span&gt;
  &lt;span class="s"&gt;WHEN&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user wants to send or compose an email&lt;/span&gt;
  &lt;span class="s"&gt;TAGS&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gmail, email, communication&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;~55 tokens. A 73.6% reduction against the 208 tokens measured for this tool in the benchmark below.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And notice what was &lt;em&gt;added&lt;/em&gt;, not just removed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;MCP JSON&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ERR — failure contract&lt;/td&gt;
&lt;td&gt;❌ absent&lt;/td&gt;
&lt;td&gt;✅ explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WHEN — selection trigger&lt;/td&gt;
&lt;td&gt;❌ buried&lt;/td&gt;
&lt;td&gt;✅ explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TAGS — retrieval taxonomy&lt;/td&gt;
&lt;td&gt;❌ absent&lt;/td&gt;
&lt;td&gt;✅ explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  It's not compression. It's reallocation.
&lt;/h2&gt;

&lt;p&gt;This is the most important point in the spec:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;TTC does not reduce tokens by removing semantic content. It reduces syntactic and documentary overhead from JSON Schema — which serves human readability, not LLM reasoning — and reinvests part of those savings into explicit tool-selection semantics.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The actual math:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MCP JSON Schema:         ~195 tokens per tool
TTC without new fields:   ~35 tokens
TTC with all fields:      ~65 tokens

The 30-token "reinvestment" buys:
  ERR  → failure contract (absent from MCP)
  WHEN → selection trigger (buried in MCP's free-form description)
  TAGS → retrieval taxonomy (absent from MCP)

Result: 195 → 65 tokens. -66.6%.
But those 65 tokens carry higher reasoning signal
than the original 195.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is &lt;strong&gt;net reasoning-signal gain&lt;/strong&gt;, not information gain in the classical sense. A critic might say you removed content (parameter descriptions, JSON Schema constraints). Correct: that content serves human documentation, not LLM inference.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real benchmark — 10 measured tools
&lt;/h2&gt;

&lt;p&gt;Measured with BPE tokenizer (cl100k_base) on 10 real MCP tool definitions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;JSON Schema&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;th&gt;Reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;gmail_send_email&lt;/td&gt;
&lt;td&gt;208&lt;/td&gt;
&lt;td&gt;55&lt;/td&gt;
&lt;td&gt;73.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;calendar_create_event&lt;/td&gt;
&lt;td&gt;262&lt;/td&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;70.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;github_create_issue&lt;/td&gt;
&lt;td&gt;269&lt;/td&gt;
&lt;td&gt;84&lt;/td&gt;
&lt;td&gt;68.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;jira_create_ticket&lt;/td&gt;
&lt;td&gt;254&lt;/td&gt;
&lt;td&gt;77&lt;/td&gt;
&lt;td&gt;69.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;slack_send_message&lt;/td&gt;
&lt;td&gt;206&lt;/td&gt;
&lt;td&gt;69&lt;/td&gt;
&lt;td&gt;66.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total (10 tools)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,948&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;650&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;66.6%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Projections for larger catalogs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Catalog size&lt;/th&gt;
&lt;th&gt;JSON Schema&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;th&gt;Absolute saving&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;20 tools&lt;/td&gt;
&lt;td&gt;~3,896&lt;/td&gt;
&lt;td&gt;~1,300&lt;/td&gt;
&lt;td&gt;~2,596 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50 tools&lt;/td&gt;
&lt;td&gt;~9,740&lt;/td&gt;
&lt;td&gt;~3,250&lt;/td&gt;
&lt;td&gt;~6,490 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 tools&lt;/td&gt;
&lt;td&gt;~19,480&lt;/td&gt;
&lt;td&gt;~6,500&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~12,980 tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The absolute saving grows linearly. The larger the catalog, the higher the ROI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Normative WHEN vocabulary
&lt;/h2&gt;

&lt;p&gt;A natural language field without a standard creates another problem: two independent MCP server authors write incompatible WHEN conditions, degrading selection accuracy in large catalogs.&lt;/p&gt;

&lt;p&gt;TTC v1.0 solves this with a normative vocabulary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;WHEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user [wants|requests|asks|needs|intends] to [action] [object]&lt;/span&gt;

&lt;span class="na"&gt;Conformant examples&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;WHEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user wants to send an email message&lt;/span&gt;
  &lt;span class="na"&gt;WHEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user requests to list files in Google Drive&lt;/span&gt;
  &lt;span class="na"&gt;WHEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user needs to create a calendar event&lt;/span&gt;

&lt;span class="na"&gt;Non-conformant&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;WHEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;send email          ← missing intent verb&lt;/span&gt;
  &lt;span class="na"&gt;WHEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user email          ← missing action verb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
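&lt;p&gt;A pattern like this is easy to check mechanically. The Python sketch below is one possible reading of it as a regular expression; the regex and the &lt;code&gt;is_conformant&lt;/code&gt; helper are assumptions for illustration, not the normative grammar from the spec. It expects the text after the &lt;code&gt;WHEN:&lt;/code&gt; label.&lt;/p&gt;

```python
import re

# Intent verbs taken from the pattern above; the regex is an illustrative
# reading of that pattern, not the spec's normative grammar.
WHEN_PATTERN = re.compile(
    r"^user (wants|requests|asks|needs|intends) to (\w+) (\S.*)$"
)


def is_conformant(when_text):
    """True if the text matches: user [intent verb] to [action] [object]."""
    return WHEN_PATTERN.match(when_text.strip()) is not None


for text in [
    "user wants to send an email message",
    "user requests to list files in Google Drive",
    "send email",
    "user email",
]:
    print(text, is_conformant(text))
```

&lt;p&gt;A check of this kind could run in CI for an MCP server so that non-conformant &lt;code&gt;WHEN&lt;/code&gt; lines are caught before they reach a shared catalog.&lt;/p&gt;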



&lt;p&gt;Accuracy simulation (TF-IDF cosine similarity, 12 tools, 36 queries):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Condition&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP free-form description&lt;/td&gt;
&lt;td&gt;63.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TTC WHEN controlled vocabulary&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;72.2%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delta&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+8.3 pp&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Caveat: TF-IDF simulation, not a real LLM benchmark. Directional evidence.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Where it works best
&lt;/h2&gt;

&lt;p&gt;✅ &lt;strong&gt;Large catalogs&lt;/strong&gt; (20+ tools) — where absolute savings justify migration&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Local and smaller models&lt;/strong&gt; — Qwen 7B, Llama 3, Mistral — no cache, narrow windows&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Multi-agent pipelines&lt;/strong&gt; — overhead compounds with every context handoff&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;RAG over tools&lt;/strong&gt; — compact TTC is ideal for vector DB indexing and subset injection  &lt;/p&gt;

&lt;p&gt;❌ Small catalogs with large LLM and wide context — marginal gain&lt;br&gt;&lt;br&gt;
❌ Replacing JSON Schema in API execution contracts — not the use case  &lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;📄 &lt;strong&gt;Full spec (Zenodo):&lt;/strong&gt; &lt;a href="https://doi.org/10.5281/zenodo.19869007" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19869007&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💻 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc" rel="noopener noreferrer"&gt;https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🌐 &lt;strong&gt;Landing page:&lt;/strong&gt; &lt;a href="https://rudsoncarvalho.github.io/terse-format/" rel="noopener noreferrer"&gt;https://rudsoncarvalho.github.io/terse-format/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📦 &lt;strong&gt;TERSE Format (parent spec):&lt;/strong&gt; &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19058364&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;If your agent has 50 tools installed and you haven't thought about catalog attention cost yet — now is a good time.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>agents</category>
      <category>mcp</category>
      <category>llm</category>
    </item>
    <item>
      <title>Your AI agent is wasting 13,000 tokens before saying "hi"</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Wed, 29 Apr 2026 01:22:14 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/seu-agente-de-ia-esta-desperdicando-13000-tokens-antes-de-dizer-oi-3pfg</link>
      <guid>https://dev.to/rudsoncarvalho/seu-agente-de-ia-esta-desperdicando-13000-tokens-antes-de-dizer-oi-3pfg</guid>
      <description>&lt;p&gt;And you probably don't even know it.&lt;/p&gt;




&lt;p&gt;If you have an agent with 50 MCP tools installed, here is what happens before any user message is processed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gmail_send_email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sends an email message via the Gmail API to one or more 
    recipients. Use this tool when the user explicitly requests to send, 
    compose and send, or deliver an email message to someone."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input_schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"subject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The recipient email address or comma-separated list"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"subject"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The subject line of the email"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The body content of the email in plain text or HTML"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is &lt;strong&gt;~195 tokens&lt;/strong&gt;. Per tool. Before anything else.&lt;/p&gt;

&lt;p&gt;50 tools × 195 tokens = &lt;strong&gt;9,750 tokens of pure overhead&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And that is only the catalog. The user's context, the conversation memory, the documents: none of that has been counted yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  "But there's prompt caching, right?"
&lt;/h2&gt;

&lt;p&gt;Yes. And it cuts the financial cost to ~10% of the original.&lt;/p&gt;

&lt;p&gt;But caching &lt;strong&gt;does not reduce the attention cost&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Those tokens still occupy the context window. The model still attends over all of them on every request. And if you use dynamic tool retrieval, selecting different tools per request, the cache breaks every time the selection changes.&lt;/p&gt;

&lt;p&gt;The bill doesn't go away. It just gets cheaper.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real problem nobody talks about
&lt;/h2&gt;

&lt;p&gt;MCP JSON Schema was designed as a contract for executing a tool, not as a semantic contract for selecting one.&lt;/p&gt;

&lt;p&gt;The result: information the LLM needs for reasoning is either missing or buried in free text:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No error contract&lt;/strong&gt;: the LLM does not know what to do when it gets &lt;code&gt;auth_failed&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No explicit trigger&lt;/strong&gt;: "when to use this tool" has to be inferred from a paragraph-long description&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No retrieval taxonomy&lt;/strong&gt;: there is no way to group or filter tools by domain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: verbose AND semantically incomplete. The worst of both worlds.&lt;/p&gt;




&lt;h2&gt;
  
  
  TTC — TERSE Tool Catalog
&lt;/h2&gt;

&lt;p&gt;I spent the last few weeks working on this problem. The result is an extension of the &lt;a href="https://github.com/RudsonCarvalho/terse-format" rel="noopener noreferrer"&gt;TERSE Format&lt;/a&gt; called &lt;strong&gt;TTC (TERSE Tool Catalog)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The same tool as above, in TTC:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TOOL gmail_send_email
  PURPOSE: send email via Gmail
  IN: to:string, subject:string, body:string, cc:string?
  OUT: message_id:string
  ERR: auth_failed | quota_exceeded | invalid_recipient
  WHEN: user wants to send or compose an email
  TAGS: gmail, email, communication
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
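
&lt;p&gt;To make the structure of the block above concrete, here is a minimal Python sketch of a parser for it. It assumes only what the example suggests, not what the spec guarantees: one &lt;code&gt;TOOL name&lt;/code&gt; header followed by &lt;code&gt;KEY: value&lt;/code&gt; lines, comma-separated &lt;code&gt;name:type&lt;/code&gt; parameters, and a trailing &lt;code&gt;?&lt;/code&gt; marking an optional parameter. The names &lt;code&gt;Tool&lt;/code&gt;, &lt;code&gt;parse_ttc&lt;/code&gt; and &lt;code&gt;parse_params&lt;/code&gt; are illustrative and not part of TTC.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    """One parsed TOOL block; keys mirror the TTC field names in the example."""
    name: str
    fields: dict = field(default_factory=dict)


def parse_ttc(text: str) -> list:
    """Parse 'TOOL name' headers followed by 'KEY: value' lines."""
    tools = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("TOOL "):
            tools.append(Tool(name=line[len("TOOL "):].strip()))
        elif tools and ":" in line:
            # Split on the first colon only, so 'to:string' stays in the value.
            key, value = line.split(":", 1)
            tools[-1].fields[key.strip()] = value.strip()
    return tools


def parse_params(spec: str) -> dict:
    """Turn 'to:string, cc:string?' into {name: (type, required)}.

    Assumption: a trailing '?' marks an optional parameter.
    """
    params = {}
    for item in spec.split(","):
        name, _, typ = item.strip().partition(":")
        params[name] = (typ.rstrip("?"), not typ.endswith("?"))
    return params


EXAMPLE = """\
TOOL gmail_send_email
  PURPOSE: send email via Gmail
  IN: to:string, subject:string, body:string, cc:string?
  OUT: message_id:string
  ERR: auth_failed | quota_exceeded | invalid_recipient
  WHEN: user wants to send or compose an email
  TAGS: gmail, email, communication
"""

tool = parse_ttc(EXAMPLE)[0]
print(tool.name, parse_params(tool.fields["IN"]))
```

&lt;p&gt;A line-oriented format like this needs no JSON tokenizer or schema validator to read, which is part of why it is cheap in tokens.&lt;/p&gt;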



&lt;p&gt;&lt;strong&gt;~55 tokens. A 73.6% reduction&lt;/strong&gt; against the 208 tokens measured for this tool's JSON Schema in the benchmark below.&lt;/p&gt;

&lt;p&gt;And notice what was added, not just what was removed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;MCP JSON&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ERR: failure contract&lt;/td&gt;
&lt;td&gt;❌ absent&lt;/td&gt;
&lt;td&gt;✅ explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WHEN: selection trigger&lt;/td&gt;
&lt;td&gt;❌ buried&lt;/td&gt;
&lt;td&gt;✅ explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TAGS: retrieval taxonomy&lt;/td&gt;
&lt;td&gt;❌ absent&lt;/td&gt;
&lt;td&gt;✅ explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  It's not compression. It's reallocation.
&lt;/h2&gt;

&lt;p&gt;This is the most important point in the spec, and it is worth stating plainly:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;TTC does not save tokens by removing semantic content. It eliminates the syntactic and documentation overhead of JSON Schema, which serves human readability rather than LLM reasoning, and reinvests part of those savings in explicit tool-selection semantics.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The actual math:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MCP JSON Schema:        ~195 tokens por tool
TTC sem campos novos:    ~35 tokens
TTC com todos os campos: ~65 tokens

Os 30 tokens de "reinvestimento" compram:
  ERR  → contrato de falha (ausente no MCP)
  WHEN → trigger semântico (ausente no MCP)  
  TAGS → taxonomia de retrieval (ausente no MCP)

Resultado: 195 → 65 tokens. -66.6%.
Mas os 65 tokens carregam mais sinal de raciocínio
do que os 195 originais.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is a &lt;strong&gt;net gain in reasoning signal&lt;/strong&gt;, not a gain in information in the classical sense.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real benchmark: 10 tools measured
&lt;/h2&gt;

&lt;p&gt;I measured 10 real MCP tool definitions with a BPE tokenizer (cl100k_base). Five of them are listed below, followed by the total for all ten:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;JSON Schema&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;th&gt;Reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;gmail_send_email&lt;/td&gt;
&lt;td&gt;208&lt;/td&gt;
&lt;td&gt;55&lt;/td&gt;
&lt;td&gt;73.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;calendar_create_event&lt;/td&gt;
&lt;td&gt;262&lt;/td&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;70.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;github_create_issue&lt;/td&gt;
&lt;td&gt;269&lt;/td&gt;
&lt;td&gt;84&lt;/td&gt;
&lt;td&gt;68.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;jira_create_ticket&lt;/td&gt;
&lt;td&gt;254&lt;/td&gt;
&lt;td&gt;77&lt;/td&gt;
&lt;td&gt;69.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;slack_send_message&lt;/td&gt;
&lt;td&gt;206&lt;/td&gt;
&lt;td&gt;69&lt;/td&gt;
&lt;td&gt;66.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total (10 tools)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,948&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;650&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;66.6%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Projection for larger catalogs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Catalog&lt;/th&gt;
&lt;th&gt;JSON Schema&lt;/th&gt;
&lt;th&gt;TTC&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;20 tools&lt;/td&gt;
&lt;td&gt;~3,896&lt;/td&gt;
&lt;td&gt;~1,300&lt;/td&gt;
&lt;td&gt;~2,596 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50 tools&lt;/td&gt;
&lt;td&gt;~9,740&lt;/td&gt;
&lt;td&gt;~3,250&lt;/td&gt;
&lt;td&gt;~6,490 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 tools&lt;/td&gt;
&lt;td&gt;~19,480&lt;/td&gt;
&lt;td&gt;~6,500&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~12,980 tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Absolute savings grow linearly with catalog size: the larger the catalog, the larger the payoff.&lt;/p&gt;
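
&lt;p&gt;The projection is a plain linear extrapolation from the measured totals (1,948 and 650 tokens for 10 tools). The short Python sketch below reproduces the percentages and the projected rows; the function names are illustrative, and it assumes larger catalogs keep the same per-tool averages, which real catalogs may not.&lt;/p&gt;

```python
MEASURED_TOOLS = 10
JSON_TOKENS = 1948  # measured total for the 10 JSON Schema definitions
TTC_TOKENS = 650    # measured total for the same 10 tools in TTC


def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    return round(100 * (before - after) / before, 1)


def project(n_tools: int) -> tuple:
    """Linear extrapolation: (json_tokens, ttc_tokens, tokens_saved)."""
    json_est = round(JSON_TOKENS * n_tools / MEASURED_TOOLS)
    ttc_est = round(TTC_TOKENS * n_tools / MEASURED_TOOLS)
    return json_est, ttc_est, json_est - ttc_est


print(reduction_pct(JSON_TOKENS, TTC_TOKENS))
for n in (20, 50, 100):
    print(n, project(n))
```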




&lt;h2&gt;
  
  
  Normative vocabulary for WHEN
&lt;/h2&gt;

&lt;p&gt;A free-form natural-language field creates a problem of its own: two authors of different MCP servers will write &lt;code&gt;WHEN&lt;/code&gt; in incompatible ways, which degrades selection accuracy in large catalogs.&lt;/p&gt;

&lt;p&gt;TTC v1.0 addresses this with a normative vocabulary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WHEN: user [wants|requests|asks|needs|intends] to [ação] [objeto]

Exemplos conformantes:
  WHEN: user wants to send an email message
  WHEN: user requests to list files in Google Drive
  WHEN: user needs to create a calendar event

Não-conformante:
  WHEN: send email          ← falta verbo de intenção
  WHEN: user email          ← falta verbo de ação
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
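
&lt;p&gt;A pattern this regular can be checked mechanically. The Python sketch below is one possible linter for it, not the validator from the spec: it assumes the action is a single word and the object is whatever follows, and the names &lt;code&gt;WHEN_PATTERN&lt;/code&gt; and &lt;code&gt;is_conformant&lt;/code&gt; are illustrative.&lt;/p&gt;

```python
import re

# Intent verbs listed in the normative pattern above.
INTENT_VERBS = ("wants", "requests", "asks", "needs", "intends")

# "WHEN: user VERB to ACTION OBJECT", where ACTION is one word (assumption)
# and OBJECT is the non-empty remainder of the line.
WHEN_PATTERN = re.compile(
    r"^WHEN: user (?:%s) to (\w+) (\S.*)$" % "|".join(INTENT_VERBS)
)


def is_conformant(line: str) -> bool:
    """True when the line has an intent verb, an action and an object."""
    return WHEN_PATTERN.match(line.strip()) is not None


for candidate in (
    "WHEN: user wants to send an email message",
    "WHEN: send email",
    "WHEN: user email",
):
    print(candidate, is_conformant(candidate))
```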



&lt;p&gt;Accuracy simulation (TF-IDF cosine similarity, 12 tools, 36 queries):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Condition&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP free-text description&lt;/td&gt;
&lt;td&gt;63.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TTC WHEN with controlled vocabulary&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;72.2%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delta&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+8.3 pp&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Caveat: this is a TF-IDF simulation, not a real benchmark with an LLM. Treat it as directional evidence.&lt;/em&gt;&lt;/p&gt;
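
&lt;p&gt;For readers unfamiliar with the method, this is roughly what TF-IDF cosine selection looks like. The Python sketch below is not the simulation behind the numbers above: it uses a three-tool toy catalog, smoothed IDF, and whitespace tokenization, all of which are assumptions. The &lt;code&gt;WHEN&lt;/code&gt; strings come from the conformant examples; the tool name &lt;code&gt;drive_list_files&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    return text.lower().split()


def tfidf_vectors(docs: list) -> tuple:
    """Return (idf, vectors) using smoothed IDF over the given documents."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return idf, vectors


def cosine(a: dict, b: dict) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_tool(query: str, catalog: dict) -> str:
    """Pick the tool whose WHEN text is most similar to the query."""
    names = list(catalog)
    idf, vectors = tfidf_vectors([catalog[n] for n in names])
    # Query terms unseen in the catalog get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [cosine(q, v) for v in vectors]
    return names[scores.index(max(scores))]


CATALOG = {
    "gmail_send_email": "user wants to send an email message",
    "drive_list_files": "user requests to list files in Google Drive",  # hypothetical name
    "calendar_create_event": "user needs to create a calendar event",
}

print(select_tool("please send an email to Ana", CATALOG))
```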




&lt;h2&gt;
  
  
  Where it works best
&lt;/h2&gt;

&lt;p&gt;✅ &lt;strong&gt;Large catalogs&lt;/strong&gt; (20+ tools), where the absolute savings justify the migration&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Local and smaller models&lt;/strong&gt; (Qwen 7B, Llama 3, Mistral): no cache, narrow context windows&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Multi-agent pipelines&lt;/strong&gt;: the overhead accumulates at every context handoff&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Tool RAG&lt;/strong&gt;: compact TTC is well suited to indexing in a vector DB and injecting subsets&lt;/p&gt;

&lt;p&gt;❌ Small catalogs with a large LLM and a wide context window: marginal gain&lt;br&gt;&lt;br&gt;
❌ Replacing JSON Schema in API contracts: that is not its purpose&lt;/p&gt;
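
&lt;p&gt;As a rough illustration of injecting subsets, the Python sketch below filters a catalog by its &lt;code&gt;TAGS&lt;/code&gt; field before anything reaches the prompt. It stands in for a vector-DB lookup with a plain set intersection, and only the &lt;code&gt;gmail_send_email&lt;/code&gt; tags come from the example earlier in the post; the other two entries are made up for the demo.&lt;/p&gt;

```python
# Tool name -> TAGS. Only the gmail entry is taken from the TTC example;
# the slack and calendar tags are hypothetical.
CATALOG = {
    "gmail_send_email": {"gmail", "email", "communication"},
    "slack_send_message": {"slack", "chat", "communication"},
    "calendar_create_event": {"calendar", "scheduling"},
}


def tools_for(domains: set) -> list:
    """Return the names of tools whose TAGS overlap the requested domains."""
    return sorted(
        name for name, tags in CATALOG.items() if tags.intersection(domains)
    )


print(tools_for({"communication"}))
```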




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;📄 &lt;strong&gt;Full spec (Zenodo):&lt;/strong&gt; &lt;a href="https://doi.org/10.5281/zenodo.19869007" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19869007&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💻 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc" rel="noopener noreferrer"&gt;https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🌐 &lt;strong&gt;Landing page:&lt;/strong&gt; &lt;a href="https://rudsoncarvalho.github.io/terse-format/" rel="noopener noreferrer"&gt;https://rudsoncarvalho.github.io/terse-format/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📦 &lt;strong&gt;TERSE Format (parent spec):&lt;/strong&gt; &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19058364&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;If your agent has 50 tools installed and you haven't thought about the attention cost of the catalog yet, now is a good time to start.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;ai&lt;/code&gt; &lt;code&gt;agents&lt;/code&gt; &lt;code&gt;mcp&lt;/code&gt; &lt;code&gt;llm&lt;/code&gt; &lt;code&gt;tooling&lt;/code&gt; &lt;code&gt;performance&lt;/code&gt; &lt;code&gt;opensource&lt;/code&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>mcp</category>
      <category>llm</category>
    </item>
    <item>
      <title>COA-MAS v2: A Meta-Framework for Cross-Domain Multi-Agent Governance</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Wed, 01 Apr 2026 23:29:15 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/coa-mas-v2-a-meta-framework-for-cross-domain-multi-agent-governance-4mji</link>
      <guid>https://dev.to/rudsoncarvalho/coa-mas-v2-a-meta-framework-for-cross-domain-multi-agent-governance-4mji</guid>
      <description>&lt;p&gt;AI agents are crossing organizational boundaries. They call tools in partner domains, delegate tasks to external services, and operate in chains where no single actor sees the full picture.&lt;/p&gt;

&lt;p&gt;COA-MAS v1 solved the intra-domain governance problem — a four-layer architecture, the Action Claim contract, and the AASG enforcement boundary that ensures zero cognitive load at runtime. If you haven't read it, the paper is at &lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The cross-domain problem is different. And it took a full architectural pivot to solve it correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Silver Bullet Fallacy
&lt;/h2&gt;

&lt;p&gt;Early iterations of COA-MAS v2 tried to build a universal calibration mechanism — a way to translate risk scores between domains with different semantic spaces. After several rounds of debate and stress-testing, it became clear that this approach has the same flaw as trying to replace PIX, TED, wire transfers, and letters of credit with a single payment instrument.&lt;/p&gt;

&lt;p&gt;Each of those instruments exists because different transaction contexts require different guarantees. Resilience in distributed systems comes from routing to the right pattern based on context — not from finding the pattern that works everywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Thesis
&lt;/h2&gt;

&lt;p&gt;COA-MAS v2 is a meta-framework, not a protocol. It standardizes one thing: the &lt;strong&gt;Action Intent&lt;/strong&gt; — a universal artifact that any federated governance pattern can consume. The choice of execution topology is delegated to a &lt;strong&gt;Pattern Selection Protocol&lt;/strong&gt; negotiated during trust peering.&lt;/p&gt;

&lt;p&gt;The Action Intent is the common currency. The federation mode is the exchange mechanism.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Action Intent
&lt;/h2&gt;

&lt;p&gt;The Action Intent is the "passport" of the COA-MAS federation. It is a standardized, cryptographically signed declaration of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Who&lt;/strong&gt; is acting — SPIFFE identity, delegation chain, GOV-RISK attestation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What&lt;/strong&gt; they intend to do — tool URI, operation type, resource scope&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What effect&lt;/strong&gt; they declare — reversibility, estimated scope, data sensitivity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cryptographic binding&lt;/strong&gt; — ephemeral DPoP public key for proof-of-possession&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Domain A's internal policy, prompts, and risk weights are never transmitted. Only the declared intent, authenticated by Domain A's governance layer.&lt;/p&gt;

&lt;p&gt;If Domain A lies — declares &lt;code&gt;bounded_set&lt;/code&gt; but attempts a full-table deletion — the signed intent becomes irrefutable forensic evidence. The problem moves from governance mathematics to organizational accountability, backed by cryptographic proof.&lt;/p&gt;

&lt;p&gt;The canonical JSON Schema is published at &lt;a href="https://doi.org/10.5281/zenodo.19376419" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19376419&lt;/a&gt;.&lt;/p&gt;
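
&lt;p&gt;The published schema is the authoritative definition. Purely as an illustration of the four groups listed above, here is a Python sketch in which every field name is hypothetical and chosen for readability. It shows one way to get a deterministic byte encoding to sign; the signing step itself, and the real schema's field names, are out of scope.&lt;/p&gt;

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ActionIntent:
    # Who is acting (hypothetical field names throughout)
    spiffe_id: str
    delegation_chain: tuple
    gov_risk_attestation: str
    # What they intend to do
    tool_uri: str
    operation: str
    resource_scope: str
    # Declared effect
    reversible: bool
    estimated_scope: str
    data_sensitivity: str
    # Cryptographic binding
    dpop_public_key: str

    def canonical_bytes(self) -> bytes:
        """Deterministic JSON: sorted keys, no insignificant whitespace."""
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return payload.encode("utf-8")

    def digest(self) -> str:
        """SHA-256 over the canonical form, i.e. what a signer would sign."""
        return hashlib.sha256(self.canonical_bytes()).hexdigest()


intent = ActionIntent(
    spiffe_id="spiffe://domain-a.example/agent/billing",  # made-up identity
    delegation_chain=("spiffe://domain-a.example/orchestrator",),
    gov_risk_attestation="attestation-placeholder",
    tool_uri="https://domain-b.example/tools/orders",
    operation="delete",
    resource_scope="orders/2026-03",
    reversible=False,
    estimated_scope="bounded_set",
    data_sensitivity="internal",
    dpop_public_key="public-key-placeholder",
)

# Changing the declared scope changes the digest, so a signature over the
# original declaration cannot be reused for a different one.
tampered = replace(intent, estimated_scope="full_table")
print(intent.digest() != tampered.digest())
```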

&lt;h2&gt;
  
  
  The Four Federation Modes
&lt;/h2&gt;

&lt;p&gt;The Pattern Selection Protocol routes each cross-domain interaction to the appropriate mode based on trust distance, acceptable latency, and cognitive burden tolerance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 0 — Intra-Domain (COA-MAS V1)&lt;/strong&gt;&lt;br&gt;
Same domain. Deterministic, microsecond latency, zero external dependencies. The foundation everything else builds on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 1 — Sovereign Visa&lt;/strong&gt;&lt;br&gt;
Domain A submits the Action Intent to Domain B's authorization endpoint. Domain B's GOV-RISK evaluates it using its own Executable Culture — full sovereignty, no calibration across semantic spaces. GOV-RISK-B issues a standard COA-MAS v1 Action Claim with DPoP binding. AASG-B validates a locally-trusted signature at runtime. Zero cognitive load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 2 — Ambassador&lt;/strong&gt;&lt;br&gt;
Domain B doesn't expose tools to foreign agents at all. It exposes an agent communication interface. Domain A's intent becomes the opening message of an A2A conversation. Domain B's Ambassador agent formulates its own plan, submits it to GOV-RISK-B via Mode 0, and executes locally. Maximum isolation. Non-deterministic latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 3 — Clearinghouse&lt;/strong&gt;&lt;br&gt;
A neutral Domain C — a regulated hub both domains trust — evaluates the intent and issues a universally-accepted Action Claim. Appropriate for regulated industries (Open Finance, healthcare prior authorization). Opt-in only: it trades polycentric sovereignty for operational simplicity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Future Mode 4 — ZK-Policy&lt;/strong&gt;&lt;br&gt;
The CAGA-compliant target. Domain A generates a zero-knowledge proof of correct policy execution without revealing internal data. Domain B verifies mathematically. Not implementable in production today due to ZKML hardware constraints — but the meta-framework is explicitly designed to incorporate it as Mode 4 when viable, without requiring changes to the Action Intent schema or SPIFFE infrastructure.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Pattern Selection Protocol
&lt;/h2&gt;

&lt;p&gt;Domains don't negotiate a single mode — they negotiate a Federation Policy that maps operation families and resource classes to modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode_by_operation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"read"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ttl_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"single_use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"delete"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ttl_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"single_use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"configure"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode_by_resource_class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pii"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"regulated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same pair of domains can use Mode 1 for routine reads and Mode 2 for infrastructure operations — without renegotiating the peering relationship.&lt;/p&gt;
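
&lt;p&gt;A lookup against such a policy could be sketched in Python as follows. The policy document is the one shown above; the precedence rule is an assumption made for this example, since the post does not state one: a matching resource class overrides the operation family, and an operation with no entry resolves to nothing.&lt;/p&gt;

```python
import json

FEDERATION_POLICY = json.loads("""
{
  "mode_by_operation": {
    "read":      { "mode": 1, "ttl_seconds": 1800, "single_use": false },
    "delete":    { "mode": 1, "ttl_seconds": 120,  "single_use": true  },
    "configure": { "mode": 2 }
  },
  "mode_by_resource_class": {
    "pii":       { "mode": 2 },
    "regulated": { "mode": 3 }
  }
}
""")


def resolve(operation, resource_class=None):
    """Return the policy entry for an interaction, or None if unmapped.

    Assumed precedence: resource class first, then operation family.
    """
    by_class = FEDERATION_POLICY["mode_by_resource_class"]
    by_operation = FEDERATION_POLICY["mode_by_operation"]
    if resource_class in by_class:
        return by_class[resource_class]
    return by_operation.get(operation)


print(resolve("read"))
print(resolve("read", "pii"))
```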

&lt;h2&gt;
  
  
  Positioning Against CAGA
&lt;/h2&gt;

&lt;p&gt;Meyman [SSRN 6299461] formalizes the Cross-Agent Governance Alignment (CAGA) problem and identifies zero-knowledge proofs as the theoretically correct solution. COA-MAS v2 is the operationally deployable answer while ZKML hardware matures — trading full policy confidentiality for sub-millisecond runtime enforcement, zero integration cost for Domain B, and compatibility with stochastic LLM-based GOV-RISKs.&lt;/p&gt;

&lt;p&gt;The relationship is complementary. CAGA defines what a correct solution must prove. COA-MAS v2 defines how production systems navigate the space between the theoretically ideal and the operationally deployable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Published
&lt;/h2&gt;

&lt;p&gt;📄 &lt;strong&gt;Working Paper v0.3&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19376738" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19376738&lt;/a&gt;&lt;br&gt;
&lt;a href="https://zenodo.org/records/19376739" rel="noopener noreferrer"&gt;zenodo.org/records/19376739&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔧 &lt;strong&gt;Action Intent Schema v1.0.0&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19376419" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19376419&lt;/a&gt;&lt;br&gt;
&lt;a href="https://zenodo.org/records/19376420" rel="noopener noreferrer"&gt;zenodo.org/records/19376420&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📚 &lt;strong&gt;COA-MAS v1 (foundation)&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building cross-domain multi-agent systems and the governance layer is an afterthought, the meta-framework and the schema are open access. Feedback, critique, and stress-testing welcome.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>architecture</category>
      <category>multiagent</category>
    </item>
    <item>
      <title>AI Agents Can Delete Your Production Database. Here's the Governance Framework That Stops Them.</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Tue, 31 Mar 2026 12:51:25 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/ai-agents-can-delete-your-production-database-heres-the-governance-framework-that-stops-them-ccj</link>
      <guid>https://dev.to/rudsoncarvalho/ai-agents-can-delete-your-production-database-heres-the-governance-framework-that-stops-them-ccj</guid>
      <description>&lt;p&gt;&lt;em&gt;This article presents COA-MAS — a governance framework for autonomous agents grounded in organizational theory, institutional design, and normative multi-agent systems research. The full paper is published on Zenodo: &lt;a href="https://zenodo.org/records/19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem No One Is Talking About
&lt;/h2&gt;

&lt;p&gt;Something unusual happened in early 2026. The IETF published a formal Internet-Draft on AI agent authentication and authorization. Eight major technology companies released version 1.0 of the Agent-to-Agent Protocol. And a widely-read post demonstrated why the prevailing credential model for AI agents was structurally broken.&lt;/p&gt;

&lt;p&gt;The convergence wasn't coincidental. It was the signal that a structural problem — long present in early agentic deployments — had reached the threshold of production consequence.&lt;/p&gt;

&lt;p&gt;We've built agents that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Delete production databases&lt;/li&gt;
&lt;li&gt;Execute financial transactions&lt;/li&gt;
&lt;li&gt;Modify business logic&lt;/li&gt;
&lt;li&gt;Spawn other agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And we gave them &lt;strong&gt;API keys&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An API key authorizes &lt;em&gt;access&lt;/em&gt;. It does not authorize a &lt;em&gt;specific action with a specific impact in a specific context&lt;/em&gt;. That distinction is the entire problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Structural Failure Mode: Distributed Cognitive Chaos
&lt;/h2&gt;

&lt;p&gt;I call this failure mode &lt;strong&gt;Distributed Cognitive Chaos (DCC)&lt;/strong&gt;: the structural consequence of deploying agents without formal authority hierarchies, authorization contracts, or enforcement boundaries.&lt;/p&gt;

&lt;p&gt;DCC has three symptoms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Action hallucination&lt;/strong&gt; — an agent executes an action it was never authorized to perform, because nothing formally defined "authorized"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mandate drift&lt;/strong&gt; — through a chain of agent-to-agent delegations, the original human intent gets distorted beyond recognition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accountability collapse&lt;/strong&gt; — when something goes wrong, there is no tamper-evident record connecting the action to the authority that (supposedly) permitted it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is not a new problem. It's the oldest problem in organizational theory: how do you coordinate partially autonomous actors toward collective goals while preventing any individual actor from harming the collective?&lt;/p&gt;

&lt;p&gt;Herbert Simon identified it in 1947. Elinor Ostrom solved it in 1990. We just haven't applied those solutions to AI agents yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  COA-MAS: A Governance Framework Grounded in Theory
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;COA-MAS&lt;/strong&gt; (&lt;em&gt;Cognitive Organization Architecture for Multi-Agent Systems&lt;/em&gt;) is my answer. It synthesizes four intellectual traditions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simon's bounded rationality&lt;/strong&gt; → why agents need external governance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ostrom's institutional design principles&lt;/strong&gt; → how to structure governance for durability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Normative multi-agent systems research&lt;/strong&gt; → how to formalize governance as computable norms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sociotechnical systems theory&lt;/strong&gt; → how to make social norms technically enforceable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The framework has three components. Each answers a different question.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component 1: The Four-Layer Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Question: Who is in charge?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think of it as a corporate structure for AI agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────┐
│ LAYER 4 — STRATEGIC ORCHESTRATION                  │
│ Receives human objectives · decomposes into tasks  │
└────────────────────────────────────────────────────┘
                          ↕
┌────────────────────────────────────────────────────┐
│ LAYER 3 — COGNITIVE GOVERNANCE                     │
│ Evaluates proposed actions · issues authorization  │
│ documents · maintains audit ledger                 │
└────────────────────────────────────────────────────┘
                          ↕
┌────────────────────────────────────────────────────┐
│ LAYER 2 — FUNCTIONAL SPECIALIZATION                │
│ Domain agents · execute tasks within their         │
│ cognitive authority boundary                       │
└────────────────────────────────────────────────────┘
                          ↕
┌────────────────────────────────────────────────────┐
│ LAYER 1 — EXECUTABLE CULTURE (Constitutional)      │
│ Versioned YAML policies · weights · thresholds     │
│ Human-authored before runtime. Immutable during.   │
└────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical insight, drawn from both Simon and Ostrom, is the &lt;strong&gt;separation between those who propose actions and those who authorize them&lt;/strong&gt;. An agent cannot authorize its own actions. This mirrors the principle of checks and balances in constitutional systems: the body that proposes is not the body that authorizes, and neither of them is the body that records.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component 2: The Action Claim
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Question: What exactly is the agent authorized to do?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;Action Claim&lt;/strong&gt; is a formal authorization document that agents must present before executing any real-world action. It's analogous to a building permit — not just "you're allowed to build," but: the location, the dimensions, the materials, the timeline, the inspector, and the version of the building code that governed the approval.&lt;/p&gt;

&lt;p&gt;The Action Claim has three parts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;DECLARED&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FIELDS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;filled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;agent&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"proposed_transition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DELETE expired sessions older than 90 days"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"originating_goal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"scheduled maintenance task #4421"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"delegation_chain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"human:ops-team"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent:orchestrator-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent:db-cleaner"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"estimated_impact"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"destructivity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"data_exposure"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"resource_consumption"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"privilege_escalation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"logic_integrity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"recursive_autonomy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;DERIVED&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FIELDS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;filled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;GOV-RISK&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(Layer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"justification_gap"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.08&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"decision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"APPROVE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"governance_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:a3f9..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"policy_digest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:1b2c..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;AUDIT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FIELDS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;filled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;infrastructure&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ac_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ac-2026-03-31-00421"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AUTHORIZED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"committed_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-03-31T14:22:01Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tripartite structure reflects Ostrom's principle of separating operational decisions from the collective-choice rules that govern them. The agent operates at the operational level; Layer 3 applies institutional norms; the audit trail creates an immutable record connecting every decision to the rules that governed it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component 3: The AASG (Autonomous Agent Security Gateway)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Question: How is authorization enforced?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think of the AASG as a customs inspector at the boundary between the agents' cognitive world and the real world of executing tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent Cognition (A2A) ────────────────► Real World (MCP)
                              │
                         [ AASG ]
                              │
                    Checks exactly 3 things:
                    1. Is the Action Claim valid and signed?
                    2. Is the agent identity correct?
                    3. Does the actual tool call match 
                       what was declared?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AASG is a &lt;strong&gt;physically separate process&lt;/strong&gt; — not a library in the agent, not a plugin in the MCP server. This separation is the architectural expression of Simon's insight: governance constraints must be external to the decision-maker they constrain. An agent cannot reliably police itself.&lt;/p&gt;
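&lt;p&gt;As a rough illustration of checks 1 and 2, here is a minimal Python sketch. It is not the paper's implementation: the HMAC-SHA256 signing scheme, the shared &lt;code&gt;key&lt;/code&gt;, the function names, the return codes, and the rule that the caller must be the last entry in &lt;code&gt;delegation_chain&lt;/code&gt; are all assumptions made for the example.&lt;/p&gt;

```python
import hashlib
import hmac
import json


def sign_claim(claim: dict, key: bytes) -> str:
    """Hypothetical signing scheme: HMAC-SHA256 over canonical JSON."""
    payload = json.dumps(claim, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hmac.new(key, payload, hashlib.sha256).hexdigest()


def gateway_check(claim: dict, signature: str, key: bytes, caller_id: str) -> str:
    """Run checks 1 and 2 before any tool call is forwarded."""
    # Check 1: the claim is signed by governance and was actually authorized.
    if not hmac.compare_digest(sign_claim(claim, key), signature):
        return "REJECT_BAD_SIGNATURE"
    if claim.get("state") != "AUTHORIZED":
        return "REJECT_NOT_AUTHORIZED"
    # Check 2: the caller is the agent at the end of the delegation chain.
    chain = claim.get("delegation_chain", [])
    if not chain or chain[-1] != caller_id:
        return "REJECT_IDENTITY_MISMATCH"
    return "PASS"


key = b"demo-governance-key"  # placeholder secret, for the example only
claim = {
    "proposed_transition": "DELETE expired sessions older than 90 days",
    "delegation_chain": ["human:ops-team", "agent:orchestrator-01", "agent:db-cleaner"],
    "state": "AUTHORIZED",
}
signature = sign_claim(claim, key)

print(gateway_check(claim, signature, key, "agent:db-cleaner"))       # PASS
print(gateway_check(claim, signature, key, "agent:orchestrator-01"))  # REJECT_IDENTITY_MISMATCH
```

&lt;p&gt;Because the signature covers the whole claim, editing any declared field after approval invalidates it. A production gateway would more plausibly verify an asymmetric signature, so that it only ever holds a public key.&lt;/p&gt;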

&lt;h3&gt;
  
  
  What happens when an agent lies?
&lt;/h3&gt;

&lt;p&gt;Consider an agent authorized to clean up expired sessions. It gets an AUTHORIZED Action Claim with &lt;code&gt;destructivity: 0.25&lt;/code&gt;. Then it tries to execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;  &lt;span class="c1"&gt;-- no WHERE clause&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AASG projects the actual tool call onto the governance space:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Declared: destructivity = 0.25 (bounded deletion)
Actual:   destructivity = 1.00 (full table wipe)

Congruence delta: 0.75 &amp;gt;&amp;gt; threshold (0.20)
→ ERR_AASG_003: CONGRUENCE_VIOLATION
→ Action never reaches the database
→ Merkle Ledger: INTERCEPTED entry recorded
→ GOV-AUDIT: SCOPE_SUBDECLARATION_DETECTED alert
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The action is blocked before any damage occurs — not discovered in a log review afterward.&lt;/p&gt;
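&lt;p&gt;The comparison in that trace can be sketched in a few lines of Python. How the AASG projects a tool call onto the governance space is not described here, so the sketch takes both vectors as given. The function name and the choice to report the single largest excess are assumptions. The 0.20 threshold comes from the trace.&lt;/p&gt;

```python
CONGRUENCE_THRESHOLD = 0.20  # threshold shown in the trace above


def congruence_violation(declared: dict, actual: dict,
                         threshold: float = CONGRUENCE_THRESHOLD):
    """Return (dimension, delta) for the largest excess over threshold, else None."""
    worst = None
    for dim, actual_value in actual.items():
        # A dimension the agent did not declare is treated as declared at 0.
        delta = actual_value - declared.get(dim, 0.0)
        if delta > threshold and (worst is None or delta > worst[1]):
            worst = (dim, delta)
    return worst


declared = {"destructivity": 0.25, "resource_consumption": 0.30}
actual = {"destructivity": 1.00, "resource_consumption": 0.30}  # full table wipe

violation = congruence_violation(declared, actual)
if violation is not None:
    dim, delta = violation
    print(f"ERR_AASG_003: CONGRUENCE_VIOLATION on {dim} (delta={delta:.2f})")
```

&lt;p&gt;Only under-declaration is penalized in this sketch: an action that turns out to be less destructive than declared passes.&lt;/p&gt;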




&lt;h2&gt;
  
  
  The Justification Gap: The First Computable Proportionality Metric
&lt;/h2&gt;

&lt;p&gt;The core innovation in COA-MAS is the &lt;strong&gt;Justification Gap&lt;/strong&gt; — a metric that answers: &lt;em&gt;is this specific action proportional to the mandate that authorized it?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The governance space has six dimensions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;What It Measures&lt;/th&gt;
&lt;th&gt;Weight&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;d₁ Destructivity&lt;/td&gt;
&lt;td&gt;Irreversible state change (DELETE, DROP, PURGE)&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₂ Data Exposure&lt;/td&gt;
&lt;td&gt;Cross-zone data movement, PII handling&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₃ Resource Consumption&lt;/td&gt;
&lt;td&gt;Financial/computational impact&lt;/td&gt;
&lt;td&gt;0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₄ Privilege Escalation&lt;/td&gt;
&lt;td&gt;IAM creation, credential access&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₅ Logic Integrity&lt;/td&gt;
&lt;td&gt;Business logic modification&lt;/td&gt;
&lt;td&gt;0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₆ Recursive Autonomy&lt;/td&gt;
&lt;td&gt;Sub-agent spawning, delegation depth&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gⱼ(V_A, V_M) = Σᵢ wᵢ · φ(aᵢ, mᵢ)      (i = 1..6)

where aᵢ and mᵢ are the i-th components of the action vector V_A
and the mandate vector V_M, and φ(aᵢ, mᵢ) =
  0               if aᵢ ≤ mᵢ          (within mandate)
  ((aᵢ - mᵢ)/mᵢ)² if aᵢ &amp;gt; mᵢ &amp;gt; 0      (quadratic penalty)
  +∞              if aᵢ &amp;gt; 0 ∧ mᵢ = 0  (absolute prohibition)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The decision:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gⱼ ≤ 0.15  →  APPROVE
0.15 &amp;lt; Gⱼ ≤ 0.40  →  ESCALATE (human review with full dossier)
Gⱼ &amp;gt; 0.40 or Gⱼ = +∞  →  BLOCK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The quadratic penalty means small excesses may be tolerable, but large excesses signal a qualitatively different action from what was authorized: exceeding the mandate by 10% on one dimension contributes only 0.01·wᵢ, while exceeding it by 100% contributes the full wᵢ. The ESCALATE tier implements "meaningful human oversight" as required by EU AI Act Article 14, with actual quantitative evidence rather than just a kill switch.&lt;/p&gt;
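&lt;p&gt;To make the arithmetic concrete, here is a minimal Python sketch of the formula and the three decision tiers. The weights and thresholds are the ones given above. The function names and the example mandate and action vectors are hypothetical, and all values are assumed to lie in [0, 1].&lt;/p&gt;

```python
import math

# Weights from the dimension table above; they sum to 1.0.
WEIGHTS = {
    "destructivity": 0.30,
    "data_exposure": 0.25,
    "resource_consumption": 0.10,
    "privilege_escalation": 0.20,
    "logic_integrity": 0.10,
    "recursive_autonomy": 0.05,
}


def phi(a: float, m: float) -> float:
    """Per-dimension penalty for action value a against mandate value m."""
    if m >= a:
        return 0.0                  # within mandate
    if m == 0:
        return math.inf             # absolute prohibition
    return ((a - m) / m) ** 2       # quadratic penalty


def justification_gap(action: dict, mandate: dict) -> float:
    """Weighted sum of the per-dimension penalties."""
    return sum(w * phi(action[d], mandate[d]) for d, w in WEIGHTS.items())


def decide(gap: float) -> str:
    if gap > 0.40:                  # math.inf lands here too
        return "BLOCK"
    if gap > 0.15:
        return "ESCALATE"
    return "APPROVE"


# Hypothetical mandate and action vectors, for illustration only.
mandate = {"destructivity": 0.25, "data_exposure": 0.00, "resource_consumption": 0.30,
           "privilege_escalation": 0.00, "logic_integrity": 0.05, "recursive_autonomy": 0.10}
within = dict(mandate)                          # stays inside the mandate
overshoot = dict(mandate, destructivity=0.45)   # 80% over on one dimension
forbidden = dict(mandate, data_exposure=0.10)   # mandate is 0 on this dimension

for name, action in [("within", within), ("overshoot", overshoot), ("forbidden", forbidden)]:
    gap = justification_gap(action, mandate)
    print(name, round(gap, 3), decide(gap))
```

&lt;p&gt;With these inputs the three actions land in APPROVE, ESCALATE (a gap of about 0.192), and BLOCK (an infinite gap) respectively.&lt;/p&gt;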




&lt;h2&gt;
  
  
  Mandate Laundering: The Attack You Didn't Know Existed
&lt;/h2&gt;

&lt;p&gt;Here's the attack that no existing security framework addresses: &lt;strong&gt;mandate laundering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In a delegation chain &lt;code&gt;Human → Agent₁ → Agent₂ → ... → Agentₖ&lt;/code&gt;, each intermediate agent can marginally expand the mandate it passes on. Each local expansion looks proportionate. But the cumulative expansion is not.&lt;/p&gt;

&lt;p&gt;COA-MAS anchors the Justification Gap to the &lt;strong&gt;root human mandate&lt;/strong&gt;, regardless of intermediate expansions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;G_chain(Aₖ) = Gⱼ(V_{Aₖ}, V_{M₀})  ← root mandate, always

G_total = 0.30 · G_local + 0.70 · G_chain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Non-Improvement Theorem&lt;/strong&gt;: For any permissive subdelegation, &lt;code&gt;G_chain&lt;/code&gt; is monotone non-decreasing. You cannot launder your way out of the original constraint.&lt;/p&gt;
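&lt;p&gt;A small numeric sketch shows why anchoring to the root matters. It assumes that &lt;code&gt;G_local&lt;/code&gt; is the gap against the immediate delegator's mandate, which the formulas above do not spell out. The function names and the example chain are hypothetical.&lt;/p&gt;

```python
import math

WEIGHTS = {
    "destructivity": 0.30, "data_exposure": 0.25, "resource_consumption": 0.10,
    "privilege_escalation": 0.20, "logic_integrity": 0.10, "recursive_autonomy": 0.05,
}


def gap(action: dict, mandate: dict) -> float:
    """Justification Gap; dimensions missing from a vector count as 0."""
    total = 0.0
    for dim, weight in WEIGHTS.items():
        a, m = action.get(dim, 0.0), mandate.get(dim, 0.0)
        if a > m:
            total += weight * (math.inf if m == 0 else ((a - m) / m) ** 2)
    return total


def total_gap(action: dict, local_mandate: dict, root_mandate: dict) -> float:
    g_local = gap(action, local_mandate)   # against the immediate delegator (assumed)
    g_chain = gap(action, root_mandate)    # always against the root human mandate
    return 0.30 * g_local + 0.70 * g_chain


# Each hop widens destructivity a little: 0.20 -> 0.25 -> 0.30 -> 0.40.
chain = [{"destructivity": v} for v in (0.20, 0.25, 0.30, 0.40)]
root, last = chain[0], chain[-1]

# Every hop looks modest when compared only with the hop before it.
hop_gaps = [gap(nxt, prev) for prev, nxt in zip(chain, chain[1:])]

# The final agent acts exactly within its (expanded) local mandate.
g_total = total_gap(action=last, local_mandate=last, root_mandate=root)
print([round(g, 4) for g in hop_gaps], round(g_total, 4))
```

&lt;p&gt;Each hop-to-hop gap stays below the 0.15 APPROVE threshold and the final agent's local gap is zero, yet the total comes out at about 0.21, which falls in the ESCALATE tier.&lt;/p&gt;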




&lt;h2&gt;
  
  
  How COA-MAS Fits the Standards Ecosystem
&lt;/h2&gt;

&lt;p&gt;COA-MAS doesn't compete with existing standards — it implements what they defer:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Initiative&lt;/th&gt;
&lt;th&gt;What It Solves&lt;/th&gt;
&lt;th&gt;What It Defers&lt;/th&gt;
&lt;th&gt;COA-MAS Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;IETF draft-klrc-aiagent-auth&lt;/td&gt;
&lt;td&gt;Identity, authentication, authorization (SPIFFE, OAuth 2.0)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Policy model explicitly out of scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Implements the policy model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A2A Protocol v1.0&lt;/td&gt;
&lt;td&gt;Agent coordination standard&lt;/td&gt;
&lt;td&gt;Authorization at execution boundary&lt;/td&gt;
&lt;td&gt;AASG is the enforcement point A2A lacks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP v1.0&lt;/td&gt;
&lt;td&gt;Agent-to-tool communication&lt;/td&gt;
&lt;td&gt;No semantic authorization layer&lt;/td&gt;
&lt;td&gt;AASG is the authorization gate MCP doesn't have&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The IETF draft's Section 12 explicitly states: "the policy model and document format are out of scope." That is precisely where COA-MAS contributes.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Failure Mode Transition
&lt;/h2&gt;

&lt;p&gt;The most consequential architectural property of COA-MAS is the failure mode it introduces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional agentic systems&lt;/strong&gt;: fail semantically and silently. The agent reinterprets a guideline, slightly expands a scope, finds an unanticipated interpretation. Detectable only after damage, through log analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;COA-MAS&lt;/strong&gt;: introduces the explicit &lt;code&gt;CONGRUENCE_VIOLATION&lt;/code&gt; failure mode. When an agent attempts an action that violates its declared impact vector, the AASG returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A specific error code&lt;/li&gt;
&lt;li&gt;The dimension violated&lt;/li&gt;
&lt;li&gt;The quantitative delta&lt;/li&gt;
&lt;li&gt;A Merkle Ledger entry with full context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the organizational equivalent of a building inspector catching a code violation before the foundation is poured — not after the building collapses.&lt;/p&gt;
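&lt;p&gt;For readers who want a feel for what "tamper-evident" means in practice, here is a minimal Python sketch. It deliberately swaps in a simple linear hash chain rather than a full Merkle tree, and the entry layout and function names are assumptions, not the paper's ledger format.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(ledger: list, event: dict) -> dict:
    """Append an event whose hash also commits to the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"ac_id": "ac-2026-03-31-00421", "state": "AUTHORIZED"})
append_entry(ledger, {"ac_id": "ac-2026-03-31-00421", "state": "INTERCEPTED",
                      "dimension": "destructivity", "delta": 0.75})
print(verify(ledger))               # True

ledger[1]["event"]["delta"] = 0.05  # someone rewrites history
print(verify(ledger))               # False
```

&lt;p&gt;A Merkle tree adds compact inclusion proofs on top of this idea, but the core property is the same: a past entry cannot be edited without changing every hash that follows it.&lt;/p&gt;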




&lt;h2&gt;
  
  
  What's Published
&lt;/h2&gt;

&lt;p&gt;The full paper, &lt;strong&gt;COA-MAS: A Governance Framework for Autonomous Agents in Production Environments&lt;/strong&gt;, is available on Zenodo:&lt;/p&gt;

&lt;p&gt;📄 &lt;strong&gt;&lt;a href="https://zenodo.org/records/19057202" rel="noopener noreferrer"&gt;zenodo.org/records/19057202&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
🔑 &lt;strong&gt;DOI: &lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
📜 License: CC BY 4.0&lt;/p&gt;

&lt;p&gt;The paper covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full formal specification of the Action Claim ontology&lt;/li&gt;
&lt;li&gt;Complete mathematical treatment of the Justification Gap&lt;/li&gt;
&lt;li&gt;Attack pattern neutralization (scope subdeclaration, decomposition attack, mandate laundering)&lt;/li&gt;
&lt;li&gt;EU AI Act regulatory alignment (Articles 9, 11, 13, 14)&lt;/li&gt;
&lt;li&gt;Positioning against IETF, A2A, MCP, and the AIMS model&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;The governance of autonomous agents is not a new problem. Simon identified its theoretical roots in 1947. Ostrom identified the institutional design solutions in 1990. Normative MAS researchers formalized the computational analogues through the 1990s and 2000s.&lt;/p&gt;

&lt;p&gt;What's new in 2026 is the urgency.&lt;/p&gt;

&lt;p&gt;Agents that can delete production databases and execute financial transactions are being deployed without the governance infrastructure this body of knowledge prescribes.&lt;/p&gt;

&lt;p&gt;COA-MAS applies established principles to a new domain. The question is not whether governance is necessary — it's whether we build it before or after the first major incident.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building multi-agent systems in production, I'd be genuinely interested in feedback on whether these primitives map to the problems you're encountering. The paper is open access — feel free to cite, critique, or extend.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;— Rudson Kiyoshi Souza Carvalho, Independent Researcher&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>architecture</category>
      <category>multiagent</category>
    </item>
    <item>
      <title>TERSE — A New Serialization Format Built for LLMs</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Tue, 31 Mar 2026 12:10:36 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/terse-a-new-serialization-format-built-for-llms-4n34</link>
      <guid>https://dev.to/rudsoncarvalho/terse-a-new-serialization-format-built-for-llms-4n34</guid>
      <description>&lt;p&gt;&lt;em&gt;JSON is the default. But defaults were built for a different world.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Every time you send structured data to a Large Language Model, you pay for it token by token. And if you're using JSON — which almost everyone is — you're paying for a lot of characters that carry no information.&lt;/p&gt;

&lt;p&gt;Take this simple payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"feature_a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"feature_b"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"verified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Count the noise: braces, quotes around every key and string value, commas, colons with spaces. Now imagine this multiplied across thousands of API calls per day. That's real money.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;TERSE&lt;/strong&gt; to address this.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is TERSE?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TERSE&lt;/strong&gt; (Token-Efficient Recursive Serialization Encoding) is a text-based data serialization format designed to represent the complete JSON data model with substantially fewer tokens — making it significantly more cost-efficient for use as input to Large Language Models.&lt;/p&gt;

&lt;p&gt;The same payload in TERSE:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;user_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1001&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;active&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;feature_a feature_b&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;verified&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same information. ~47% fewer tokens.&lt;/p&gt;




&lt;h2&gt;
  
  
  How it compares
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Token savings vs JSON&lt;/th&gt;
&lt;th&gt;Full JSON coverage?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;~20%&lt;/td&gt;
&lt;td&gt;✓ (verbose arrays)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;~40%&lt;/td&gt;
&lt;td&gt;✗ (flat data only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TERSE&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~47%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;✓&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;YAML is a genuine improvement over JSON: it's more compact and covers the full data model. But it was designed for humans to write, not for LLMs to consume. Verbose block arrays (&lt;code&gt;- item&lt;/code&gt; per line) and full-word booleans (&lt;code&gt;true&lt;/code&gt;/&lt;code&gt;false&lt;/code&gt;) limit its token savings, and its specification is notoriously complex to parse correctly.&lt;/p&gt;

&lt;p&gt;TOON goes further on token reduction but falls apart with nested objects — it only works for flat, uniform tabular data. If your payload has any nesting, TOON can't represent it.&lt;/p&gt;

&lt;p&gt;TERSE was designed to close that gap: full JSON data model coverage, with token efficiency as the primary design constraint.&lt;/p&gt;




&lt;h2&gt;
  
  
  The five design principles
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Bare strings&lt;/strong&gt; — identifiers and common values require no quotation marks. &lt;code&gt;production&lt;/code&gt; stays &lt;code&gt;production&lt;/code&gt;, not &lt;code&gt;"production"&lt;/code&gt;. Quotes are reserved for strings that actually need them — those containing spaces, reserved characters, or special syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Compact primitives&lt;/strong&gt; — &lt;code&gt;null&lt;/code&gt;, &lt;code&gt;true&lt;/code&gt;, and &lt;code&gt;false&lt;/code&gt; become single characters: &lt;code&gt;~&lt;/code&gt;, &lt;code&gt;T&lt;/code&gt;, &lt;code&gt;F&lt;/code&gt;. Three of the most common values in any payload, each reduced to one token.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Implicit delimiters&lt;/strong&gt; — spaces separate values inside objects and arrays, so no commas are needed between array elements or key-value pairs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Schema arrays&lt;/strong&gt; — the biggest token win for tabular data. Uniform arrays of objects declare their fields once, then list values positionally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;users&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;#[id name role active]&lt;/span&gt;
  &lt;span class="s"&gt;1 Alice admin T&lt;/span&gt;
  &lt;span class="s"&gt;2 Bruno editor T&lt;/span&gt;
  &lt;span class="s"&gt;3 Carla viewer F&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The equivalent JSON repeats &lt;code&gt;"id"&lt;/code&gt;, &lt;code&gt;"name"&lt;/code&gt;, &lt;code&gt;"role"&lt;/code&gt;, &lt;code&gt;"active"&lt;/code&gt; on every single row. For a 100-row dataset, that's 400 unnecessary key repetitions.&lt;/p&gt;
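&lt;p&gt;To show the idea mechanically, here is a small Python sketch that emits a schema array from a list of uniform dicts. It is not the terse-py API: the function names are hypothetical, and the quoting and escaping rules are a simplified guess based on the principles above, not the TERSE specification.&lt;/p&gt;

```python
RESERVED = set('"[]{}:#~')  # assumed set of characters that force quoting


def looks_numeric(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def encode_scalar(value) -> str:
    """Encode one value using the compact primitives and bare strings."""
    if value is None:
        return "~"
    if value is True:
        return "T"
    if value is False:
        return "F"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    ambiguous = text in ("T", "F") or looks_numeric(text)
    unsafe = text == "" or any(ch.isspace() or ch in RESERVED for ch in text)
    if ambiguous or unsafe:
        return '"' + text.replace('"', '\\"') + '"'
    return text


def encode_schema_array(key: str, rows: list) -> str:
    """Declare the fields once, then list each row's values positionally."""
    fields = list(rows[0].keys())
    lines = [key + ":", "  #[" + " ".join(fields) + "]"]
    for row in rows:
        lines.append("  " + " ".join(encode_scalar(row[f]) for f in fields))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "role": "admin", "active": True},
    {"id": 2, "name": "Bruno", "role": "editor", "active": True},
    {"id": 3, "name": "Carla", "role": "viewer", "active": False},
]
print(encode_schema_array("users", users))
```

&lt;p&gt;For the three users above this prints the same five lines as the TERSE example. The sketch assumes every row has the same keys in the same order, which is what makes the positional layout safe.&lt;/p&gt;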

&lt;p&gt;&lt;strong&gt;5. Recursive structure&lt;/strong&gt; — all constructs nest arbitrarily. Objects inside arrays inside schema arrays — all valid, all compact. No flat-only limitations.&lt;/p&gt;




&lt;h2&gt;
  
  
  A real example: nested order
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;JSON&lt;/strong&gt; (~180 tokens):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"orderId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ORD-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"customer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rafael Torres"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"r@email.com"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"sku"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"qty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;9.99&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"sku"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"B3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"qty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;24.50&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"paid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;TERSE&lt;/strong&gt; (~95 tokens):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;orderId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ORD-001&lt;/span&gt;
&lt;span class="na"&gt;customer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Rafael&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Torres"&lt;/span&gt; &lt;span class="nv"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;r@email.com&lt;/span&gt;&lt;span class="pi"&gt;}&lt;/span&gt;
&lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;#[sku qty price]&lt;/span&gt;
  &lt;span class="s"&gt;A1 2 &lt;/span&gt;&lt;span class="m"&gt;9.99&lt;/span&gt;
  &lt;span class="s"&gt;B3 1 &lt;/span&gt;&lt;span class="m"&gt;24.50&lt;/span&gt;
&lt;span class="na"&gt;paid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T&lt;/span&gt;
&lt;span class="na"&gt;notes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where TERSE separates itself from TOON and CSV — deeply nested structures work exactly as expected.&lt;/p&gt;




&lt;h2&gt;
  
  
  You don't write TERSE by hand
&lt;/h2&gt;

&lt;p&gt;The workflow is identical to JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your data (object/dict)
      ↓
serialize()        ← terse-js or terse-py
      ↓
TERSE string       ← sent to the LLM
      ↓
parse()            ← if you need it back
      ↓
Your data again
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
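&lt;p&gt;As a point of comparison, here is the same round trip using Python's standard &lt;code&gt;json&lt;/code&gt; module. It is only a sketch of the workflow's shape: it deliberately does not call terse-py, whose exact function names are not shown in this post, and the &lt;code&gt;order&lt;/code&gt; dict is sample data modeled on the example above.&lt;/p&gt;

```python
import json

# Sample data modeled on the order example above (illustrative values).
order = {
    "orderId": "ORD-001",
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B3", "qty": 1, "price": 24.50},
    ],
    "paid": True,
    "notes": None,
}

# Step 1: serialize the in-memory object to a string (this is what gets sent).
wire_text = json.dumps(order)

# Step 2: parse the string back into an object when the data is needed again.
restored = json.loads(wire_text)

# The round trip is lossless for JSON-compatible data.
round_trip_ok = restored == order
print(round_trip_ok)  # prints True
```

&lt;p&gt;With a TERSE library the two calls would be swapped for its serializer and parser; the surrounding application code would presumably stay the same.&lt;/p&gt;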



&lt;p&gt;Nobody writes &lt;code&gt;JSON.stringify()&lt;/code&gt; output by hand; you call the function. TERSE works the same way. The format is optimized for the one reader that actually matters: the LLM.&lt;/p&gt;




&lt;h2&gt;
  
  
  On design intent: why not compress further?
&lt;/h2&gt;

&lt;p&gt;TERSE could go deeper — automatic key abbreviation, binary type encoding, dictionary compression. We deliberately stopped short of that.&lt;/p&gt;

&lt;p&gt;The goal is a format that remains &lt;strong&gt;human-auditable&lt;/strong&gt;: you can open a &lt;code&gt;.terse&lt;/code&gt; file in any text editor and understand what you're looking at without tooling. In LLM pipelines, auditability is a safety property, not just a convenience. When an agent misbehaves, you need to inspect its inputs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two questions that come up
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can I use TERSE for REST API communication between microservices?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can, but it's not the primary use case. REST APIs are consumed by many clients across different teams and languages — JSON's universal support is a real advantage there. TERSE shines where you control both ends: serializing data before sending it to an LLM, and parsing the response on the other side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use TERSE for application configuration, like YAML?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes — the format supports everything YAML does for config files: nested objects, arrays, typed values, comments. Worth considering if your config is also consumed by an LLM as context.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's available today
&lt;/h2&gt;

&lt;p&gt;The project includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Formal specification&lt;/strong&gt; (v0.7) with ABNF grammar, conformance rules, and security considerations — published on Zenodo with DOI: &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;10.5281/zenodo.19058364&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reference implementations&lt;/strong&gt; in TypeScript, Python, Java, and Go&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live playground&lt;/strong&gt; where you can paste JSON and see the TERSE output in real time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is open source under MIT (implementations) and CC BY 4.0 (specification).&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🌐 &lt;strong&gt;Landing page + playground&lt;/strong&gt;: &lt;a href="https://rudsoncarvalho.github.io/terse-format" rel="noopener noreferrer"&gt;rudsoncarvalho.github.io/terse-format&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📦 &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/RudsonCarvalho/terse-format" rel="noopener noreferrer"&gt;github.com/RudsonCarvalho/terse-format&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📄 &lt;strong&gt;Spec (Zenodo DOI)&lt;/strong&gt;: &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;10.5281/zenodo.19058364&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;npm install terse-js&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pip install terse-py&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;TERSE is still a draft — v0.7 is open for community review. If you work with LLM pipelines at scale, I'd love to hear whether this addresses a real pain point in your stack.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Rudson Kiyoshi Souza Carvalho — Independent Researcher&lt;/em&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>opensource</category>
      <category>ai</category>
      <category>token</category>
    </item>
    <item>
      <title>Resilience Evaluation and Optimization Framework — REOF</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Wed, 12 Jun 2024 12:23:30 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/resilience-evaluation-and-optimization-framework-reof-4f9c</link>
      <guid>https://dev.to/rudsoncarvalho/resilience-evaluation-and-optimization-framework-reof-4f9c</guid>
      <description>&lt;p&gt;Author: Rudson Kiyoshi Souza Carvalho&lt;/p&gt;

&lt;p&gt;Date: April 2024&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Objective:&lt;/strong&gt; This document presents REOF, a framework for evaluating, quantifying, and optimizing the resilience and reliability of systems, with a focus on software applications.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;By systematically evaluating each critical component, the methodology helps proactively identify areas of vulnerability that could compromise the system's reliability and resilience.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;1. Introduction to REOF:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is a standardized tool for analyzing, quantifying, and expressing the resilience and reliability of a system through a numerical index, the IRC (Índice de Resiliência e Confiabilidade, or Resilience and Reliability Index).&lt;br&gt;
The methodology focuses on preventing failures and implementing best practices to increase reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. REOF Analysis Methodology:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Evaluation verticals: REOF divides the analysis into "verticals" that represent the critical points of a system, such as the following (the abbreviations come from the Portuguese names):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EE - External Input (points of interaction with the client)&lt;/li&gt;
&lt;li&gt;SE - External Outputs (sending data to other systems)&lt;/li&gt;
&lt;li&gt;CE - External Queries (integrations with other systems)&lt;/li&gt;
&lt;li&gt;DI - Internal Data (queries to databases, caches, etc.)&lt;/li&gt;
&lt;li&gt;AC - Containerized Application (health check configuration)&lt;/li&gt;
&lt;li&gt;SEC - Security Framework Enabled (e.g., Spring Security)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;One of the most important points about this framework is that it was designed to accommodate any vertical you define. You can create your own evaluation verticals and assess any process that has a set of best practices to check, such as infrastructure or mobile application development techniques.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Protections and Weights:&lt;/strong&gt; For each vertical, "protections" (best practices) that increase resilience are defined, each with a specific weight.&lt;br&gt;
"Together with your engineering or architecture team, you can list the best protection practices for promoting system resilience and reliability, and define a weight for each protection applied."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Index Calculation:&lt;/strong&gt; The IRC is calculated as the weighted sum of the scores of each vertical.&lt;/p&gt;
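&lt;p&gt;A minimal sketch of that weighted sum, assuming a vertical's score is the total weight of the protections actually applied in it. The protection names and weights below are hypothetical placeholders, not values defined by REOF; your team would supply its own catalog.&lt;/p&gt;

```python
# Hypothetical catalog: vertical -> {protection name: weight}.
# Names and weights are illustrative assumptions, not values from REOF.
PROTECTION_WEIGHTS = {
    "EE": {"rate_limiting": 3, "input_validation": 2},
    "CE": {"timeout": 3, "retry_with_backoff": 2, "circuit_breaker": 5},
    "DI": {"connection_pool_limits": 2, "query_timeout": 3},
}


def raw_score(applied):
    """Sum the weights of the protections actually applied in each vertical."""
    total = 0
    for vertical, protections in applied.items():
        weights = PROTECTION_WEIGHTS[vertical]
        total += sum(weights[name] for name in protections)
    return total


def max_score():
    """Highest possible raw score: every protection in every vertical applied."""
    return sum(sum(weights.values()) for weights in PROTECTION_WEIGHTS.values())


# A service that applies only some of the catalogued protections.
service = {
    "EE": ["input_validation"],
    "CE": ["timeout", "circuit_breaker"],
    "DI": ["query_timeout"],
}
print(raw_score(service), "of", max_score())  # prints: 13 of 20
```

&lt;p&gt;Keeping the maximum attainable score alongside the raw score is what later allows the index to be placed on a fixed scale.&lt;/p&gt;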

&lt;p&gt;&lt;strong&gt;Degradation Factor:&lt;/strong&gt; A degradation factor is applied to account for the impact of hosting multiple domains or features in the same microservice (micro-monoliths).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For each additional domain or feature, I reduce the overall index by 10%. Adding extra features or different domains forces the service to share resources, and slowness in one feature can exhaust the resources available to the others in the same microservice.&lt;/p&gt;
&lt;/blockquote&gt;
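&lt;p&gt;One way to read the 10% rule is as a linear multiplier, sketched below. This is an assumption: the text does not say whether the penalty is linear or compounding (a compounding variant would use &lt;code&gt;0.9 ** extra_domains&lt;/code&gt; instead). The function and parameter names are illustrative.&lt;/p&gt;

```python
def degradation_factor(domain_count, penalty_per_extra_domain=0.10):
    """Multiplier applied to the index of a service hosting several domains.

    Assumes a linear penalty: the first domain is free and each additional
    one removes 10% of the index, never going below zero.
    """
    extra_domains = max(domain_count - 1, 0)
    return max(1.0 - penalty_per_extra_domain * extra_domains, 0.0)


for domains in (1, 2, 4):
    print(domains, round(degradation_factor(domains), 2))
# prints:
# 1 1.0
# 2 0.9
# 4 0.7
```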

&lt;p&gt;&lt;strong&gt;Index Normalization:&lt;/strong&gt; The IRC is normalized to a scale of 0 to 10, which makes it easier to communicate and to compare across different systems.&lt;/p&gt;
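&lt;p&gt;A possible normalization, assuming the index is the ratio of the achieved weighted score to the maximum attainable score, scaled to 10 and then multiplied by the degradation factor. The formula and the function name &lt;code&gt;normalized_irc&lt;/code&gt; are assumptions for illustration; the text only states the target scale.&lt;/p&gt;

```python
def normalized_irc(raw_score, max_score, degradation=1.0):
    """Scale a raw weighted score to the 0 to 10 range, then apply degradation."""
    if max_score == 0:
        raise ValueError("max_score must be non-zero")
    ratio = min(max(raw_score / max_score, 0.0), 1.0)  # clamp to [0, 1]
    return round(10.0 * ratio * degradation, 1)


print(normalized_irc(13, 20))       # single-domain service, prints 6.5
print(normalized_irc(13, 20, 0.8))  # same score with a 0.8 degradation factor, prints 5.2
```

&lt;p&gt;Rounding to one decimal place matches the precision used by the service-level bands (for example, 7.9 and 4.9).&lt;/p&gt;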

&lt;p&gt;&lt;strong&gt;3. IRC/REOF as an SLA:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF lets you express the IRC as service levels (SLA):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Excellent (8 to 10)&lt;/li&gt;
&lt;li&gt;Good (5 to 7.9)&lt;/li&gt;
&lt;li&gt;Acceptable (3 to 4.9)&lt;/li&gt;
&lt;li&gt;Unsatisfactory (below 3)&lt;/li&gt;
&lt;/ul&gt;
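&lt;p&gt;The bands above can be turned into a small lookup. This sketch assumes the IRC has already been rounded to one decimal place, so a value always falls cleanly into one band; the function name &lt;code&gt;sla_level&lt;/code&gt; is illustrative.&lt;/p&gt;

```python
from bisect import bisect_right

# Lower bounds of the Acceptable, Good, and Excellent bands, from the list above.
THRESHOLDS = [3.0, 5.0, 8.0]
LEVELS = ["Unsatisfactory", "Acceptable", "Good", "Excellent"]


def sla_level(irc):
    """Map an IRC on the 0 to 10 scale to its service level."""
    # bisect_right counts how many lower bounds the score has reached.
    return LEVELS[bisect_right(THRESHOLDS, irc)]


for score in (2.9, 3.0, 4.9, 7.9, 8.0, 10.0):
    print(score, sla_level(score))
# prints:
# 2.9 Unsatisfactory
# 3.0 Acceptable
# 4.9 Acceptable
# 7.9 Good
# 8.0 Excellent
# 10.0 Excellent
```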

&lt;p&gt;&lt;em&gt;Ruds's REOF reliability pyramid&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3ph37jwowxim33497zo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3ph37jwowxim33497zo.png" alt=" " width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Excellent Service:&lt;/strong&gt; The IRC/REOF must be greater than or equal to 8. This reflects a highly reliable and efficient microservice, with no overhead from additional domains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Good Service:&lt;/strong&gt; The IRC/REOF must be between 5 and 7.9. This reflects a reliable microservice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Acceptable Service:&lt;/strong&gt; The IRC/REOF must be between 3 and 4.9. There is room for improvement: corrective measures should be applied to increase the reliability of the service and reduce the impact of outages caused by the application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Unsatisfactory Service:&lt;/strong&gt; An IRC/REOF below 3 indicates an unsatisfactory level of service. The service needs review and improvement and should not be considered reliable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Flexibility and Automation:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is flexible and can be customized with new verticals and protections.&lt;br&gt;
The IRC calculation can be automated through static code analysis, although the accuracy may be limited.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. REOF vs. MTBF:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is a proactive measure that evaluates the robustness of a system based on how it was built, whereas MTBF (mean time between failures) is a reactive measure that considers only the average time between failures.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;MTBF is the metric of luck over time. A high MTBF may indicate that a system has had a good operational history under ideal environmental conditions, but it does not necessarily distinguish genuinely well-designed systems from those that were simply "lucky" enough to run in a stable environment during the evaluation period.&lt;/p&gt;
&lt;/blockquote&gt;
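&lt;p&gt;To make the point concrete, here is the usual MTBF calculation (observed operating time divided by the number of failures) applied to two hypothetical services. The observation window and failure counts are made-up inputs; the function takes nothing that describes how either service was built.&lt;/p&gt;

```python
def mtbf_hours(operating_hours, failure_count):
    """Mean time between failures: observed operating time per failure."""
    if failure_count == 0:
        return float("inf")  # no failures observed yet, so MTBF is unbounded
    return operating_hours / failure_count


# Two hypothetical services observed for 90 days (2160 hours), one failure each.
well_protected = mtbf_hours(2160, 1)
unprotected_but_lucky = mtbf_hours(2160, 1)

# MTBF cannot tell them apart: only the failure history enters the formula.
print(well_protected == unprotected_but_lucky)  # prints True
```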

&lt;p&gt;REOF is more comprehensive and provides more actionable insights for improving resilience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Relationship with Chaos Engineering:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF and Chaos Engineering are complementary approaches.&lt;br&gt;
REOF ensures that resilience best practices are applied during development, while Chaos Engineering tests the resilience of the system in production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Benefits of REOF:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Effective communication about system reliability.&lt;/li&gt;
&lt;li&gt;Precise identification of areas for improvement.&lt;/li&gt;
&lt;li&gt;A culture of continuous improvement and failure prevention.&lt;/li&gt;
&lt;li&gt;Risk management and SLA compliance.&lt;/li&gt;
&lt;li&gt;Better user experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;8. Cost Considerations:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Implementing REOF can have a significant upfront cost, but it reduces operational costs in the long run.&lt;br&gt;
Chaos Engineering can have a low implementation cost, but operational costs can be high during testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How is the REOF method better than the MTBF method?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MTBF is a statistic about how your system has run, drawn from its operational history and measured over time. A system can run very well under ideal operating conditions: if nothing abnormal happens in its environment or infrastructure, MTBF will report that it is extremely reliable, because failures only show up when the operating conditions provoke them. The method knows nothing about how the system was built. It considers the frequency of failures over a period of time, not how robustly the system was built to handle variations in its environment and protect itself from failures. It is a reactive method.&lt;/p&gt;

&lt;p&gt;MTBF is the metric of luck as a function of time. A high MTBF may indicate that a system has had a good operating history under ideal environmental conditions, but it does not necessarily distinguish between genuinely well-designed systems and those that may simply have been "lucky" enough to run in a stable environment during the evaluation period.&lt;/p&gt;

&lt;p&gt;REOF genuinely evaluates the robustness of the system, that is, how it was built to handle the different kinds of problems that can occur in the production environment. It is a proactive method.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Relationship between the REOF method and Chaos Monkey/Engineering&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The REOF method contrasts with tools such as Chaos Monkey in several fundamental respects. Both approaches aim to improve the resilience and reliability of systems, but they do so in complementary ways. Chaos engineering is the discipline of experimenting on a system to build confidence in its ability to withstand turbulent conditions in production, whereas this method verifies that best practices for withstanding chaos have been applied; in other words, it ensures preparation for failure. The strengths of the reliability evaluation methodology relative to using a Chaos Monkey are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Focus on Prevention and Continuous Improvement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Holistic Evaluation: The methodology provides a comprehensive view of system performance over time, making it possible to identify trends, areas for improvement, and the impact of changes, unlike Chaos Monkey, which tests resilience in a more immediate and isolated way.&lt;/p&gt;

&lt;p&gt;Incentive to Innovate: Gamification (proposed in the excellence challenge topic) encourages teams to pursue continuous improvement and innovative solutions that raise their reliability indexes, promoting a culture of operational excellence.&lt;/p&gt;

&lt;p&gt;Strategic Planning: It provides a basis for strategic planning and resource allocation by identifying critical areas that need attention and investment, something that using Chaos Monkey in isolation does not directly provide.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Risk Management and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Reduction of Operational Risk: By focusing on continuous evaluation and improvement of reliability, this methodology helps mitigate long-term operational risks, whereas Chaos Monkey is more of a stress-testing tool that exposes vulnerabilities.&lt;/p&gt;

&lt;p&gt;SLA Compliance: The methodology enables proactive monitoring and helps ensure that services meet or exceed the agreed SLAs, which is essential for customer satisfaction and regulatory compliance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improved User Experience&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Focus on the User: Evaluating and improving reliability against SLAs emphasizes the importance of the user experience, aiming for uninterrupted operation and optimized service performance.&lt;/p&gt;

&lt;p&gt;Anticipating Problems: It allows potential failures to be identified and corrected proactively before they affect end users, whereas Chaos Monkey simulates failures to test resilience, which may or may not relate directly to the user experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complementarity with Resilience Testing Tools&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Integrated Approach: Although focused on evaluation and improvement, this methodology can be complemented by tools such as Chaos Monkey for a more robust approach to resilience. Together, they offer a defense-in-depth strategy against failures and outages.&lt;/p&gt;

&lt;p&gt;In summary, the reliability evaluation methodology brings a preventive and strategic approach to managing system reliability, with an emphasis on continuous improvement, innovation, and customer satisfaction. While Chaos Monkey is a valuable tool for testing resilience in a specific and isolated way, combining the two approaches offers a powerful path to operational excellence and system resilience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is a powerful framework for building and managing resilient systems. Its proactive approach, focus on prevention, and flexibility make it a valuable tool for any organization seeking operational excellence and customer satisfaction.&lt;/p&gt;

&lt;p&gt;Follow the Medium link for more details about this framework: &lt;a href="https://medium.com/@rudsonkiyoshicarvalho/resilience-evaluation-and-optimization-framework-reof-541d23018460" rel="noopener noreferrer"&gt;Medium REOF&lt;/a&gt;&lt;/p&gt;

</description>
      <category>resilience</category>
      <category>microservices</category>
      <category>softwareengineer</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
