<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rudson Kiyoshi Souza Carvalho</title>
    <description>The latest articles on DEV Community by Rudson Kiyoshi Souza Carvalho (@rudsoncarvalho).</description>
    <link>https://dev.to/rudsoncarvalho</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F151609%2F1e0d94d5-7d61-421f-973f-b6c991833fba.png</url>
      <title>DEV Community: Rudson Kiyoshi Souza Carvalho</title>
      <link>https://dev.to/rudsoncarvalho</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rudsoncarvalho"/>
    <language>en</language>
    <item>
      <title>COA-MAS v2: A Meta-Framework for Cross-Domain Multi-Agent Governance</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Wed, 01 Apr 2026 23:29:15 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/coa-mas-v2-a-meta-framework-for-cross-domain-multi-agent-governance-4mji</link>
      <guid>https://dev.to/rudsoncarvalho/coa-mas-v2-a-meta-framework-for-cross-domain-multi-agent-governance-4mji</guid>
      <description>&lt;p&gt;AI agents are crossing organizational boundaries. They call tools in partner domains, delegate tasks to external services, and operate in chains where no single actor sees the full picture.&lt;/p&gt;

&lt;p&gt;COA-MAS v1 solved the intra-domain governance problem with a four-layer architecture, the Action Claim contract, and the AASG (Autonomous Agent Security Gateway) enforcement boundary, which ensures zero cognitive load at runtime. If you haven't read it, the paper is at &lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The cross-domain problem is different. And it took a full architectural pivot to solve it correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Silver Bullet Fallacy
&lt;/h2&gt;

&lt;p&gt;Early iterations of COA-MAS v2 tried to build a universal calibration mechanism: a way to translate risk scores between domains with different semantic spaces. After several rounds of debate and stress-testing, it became clear that this approach has the same flaw as trying to replace PIX and TED (Brazil's instant-payment and interbank-transfer systems), wire transfers, and letters of credit with a single payment instrument.&lt;/p&gt;

&lt;p&gt;Each of those instruments exists because different transaction contexts require different guarantees. Resilience in distributed systems comes from routing to the right pattern based on context — not from finding the pattern that works everywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Thesis
&lt;/h2&gt;

&lt;p&gt;COA-MAS v2 is a meta-framework, not a protocol. It standardizes one thing: the &lt;strong&gt;Action Intent&lt;/strong&gt; — a universal artifact that any federated governance pattern can consume. The choice of execution topology is delegated to a &lt;strong&gt;Pattern Selection Protocol&lt;/strong&gt; negotiated during trust peering.&lt;/p&gt;

&lt;p&gt;The Action Intent is the common currency. The federation mode is the exchange mechanism.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Action Intent
&lt;/h2&gt;

&lt;p&gt;The Action Intent is the "passport" of the COA-MAS federation. It is a standardized, cryptographically signed declaration of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Who&lt;/strong&gt; is acting — SPIFFE identity, delegation chain, GOV-RISK attestation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What&lt;/strong&gt; they intend to do — tool URI, operation type, resource scope&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What effect&lt;/strong&gt; they declare — reversibility, estimated scope, data sensitivity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cryptographic binding&lt;/strong&gt; — ephemeral DPoP public key for proof-of-possession&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Domain A's internal policy, prompts, and risk weights are never transmitted. Only the declared intent, authenticated by Domain A's governance layer.&lt;/p&gt;

&lt;p&gt;If Domain A lies — declares &lt;code&gt;bounded_set&lt;/code&gt; but attempts a full-table deletion — the signed intent becomes irrefutable forensic evidence. The problem moves from governance mathematics to organizational accountability, backed by cryptographic proof.&lt;/p&gt;

&lt;p&gt;The canonical JSON Schema is published at &lt;a href="https://doi.org/10.5281/zenodo.19376419" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19376419&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Four Federation Modes
&lt;/h2&gt;

&lt;p&gt;The Pattern Selection Protocol routes each cross-domain interaction to the appropriate mode based on trust distance, acceptable latency, and cognitive burden tolerance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 0 — Intra-Domain (COA-MAS v1)&lt;/strong&gt;&lt;br&gt;
Same domain. Deterministic, microsecond latency, zero external dependencies. The foundation everything else builds on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 1 — Sovereign Visa&lt;/strong&gt;&lt;br&gt;
Domain A submits the Action Intent to Domain B's authorization endpoint. Domain B's GOV-RISK evaluates it using its own Executable Culture: full sovereignty, no calibration across semantic spaces. GOV-RISK-B issues a standard COA-MAS v1 Action Claim with DPoP binding. AASG-B validates a locally trusted signature at runtime. Zero cognitive load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 2 — Ambassador&lt;/strong&gt;&lt;br&gt;
Domain B doesn't expose tools to foreign agents at all. It exposes an agent communication interface. Domain A's intent becomes the opening message of an A2A conversation. Domain B's Ambassador agent formulates its own plan, submits it to GOV-RISK-B via Mode 0, and executes locally. Maximum isolation. Non-deterministic latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode 3 — Clearinghouse&lt;/strong&gt;&lt;br&gt;
A neutral Domain C (a regulated hub both domains trust) evaluates the intent and issues a universally accepted Action Claim. Appropriate for regulated industries (Open Finance, healthcare prior authorization). Opt-in only: it trades polycentric sovereignty for operational simplicity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Future Mode 4 — ZK-Policy&lt;/strong&gt;&lt;br&gt;
The CAGA-compliant target. Domain A generates a zero-knowledge proof of correct policy execution without revealing internal data. Domain B verifies mathematically. Not implementable in production today due to ZKML hardware constraints — but the meta-framework is explicitly designed to incorporate it as Mode 4 when viable, without requiring changes to the Action Intent schema or SPIFFE infrastructure.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Pattern Selection Protocol
&lt;/h2&gt;

&lt;p&gt;Domains don't negotiate a single mode — they negotiate a Federation Policy that maps operation families and resource classes to modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode_by_operation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"read"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ttl_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"single_use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"delete"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ttl_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"single_use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"configure"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode_by_resource_class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pii"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"regulated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same pair of domains can use Mode 1 for routine reads and Mode 2 for infrastructure operations — without renegotiating the peering relationship.&lt;/p&gt;
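
&lt;p&gt;As a rough illustration, a Federation Policy lookup could be sketched as follows. The function name &lt;code&gt;select_mode&lt;/code&gt; is hypothetical, and the precedence rule (a matching resource class overrides the operation family) is an assumption made for this sketch; the text above does not say how conflicts between the two maps are resolved.&lt;/p&gt;

```python
import json

# The Federation Policy shown above, loaded as plain data.
FEDERATION_POLICY = json.loads("""
{
  "mode_by_operation": {
    "read":      { "mode": 1, "ttl_seconds": 1800, "single_use": false },
    "delete":    { "mode": 1, "ttl_seconds": 120,  "single_use": true  },
    "configure": { "mode": 2 }
  },
  "mode_by_resource_class": {
    "pii":       { "mode": 2 },
    "regulated": { "mode": 3 }
  }
}
""")


def select_mode(policy, operation, resource_class=None):
    """Pick the federation rule for one cross-domain interaction.

    ASSUMPTION (not from the source): a matching resource-class rule
    overrides the operation rule. Returns None when nothing matches,
    so the caller can refuse the interaction by default.
    """
    by_class = policy.get("mode_by_resource_class", {})
    if resource_class in by_class:
        return by_class[resource_class]
    return policy.get("mode_by_operation", {}).get(operation)


print(select_mode(FEDERATION_POLICY, "read"))
print(select_mode(FEDERATION_POLICY, "read", resource_class="pii"))
```

&lt;p&gt;Under these assumptions, a plain read resolves to the Mode 1 rule, while the same read against a PII resource resolves to Mode 2, with no change to the peering relationship itself.&lt;/p&gt;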

&lt;h2&gt;
  
  
  Positioning Against CAGA
&lt;/h2&gt;

&lt;p&gt;Meyman [SSRN 6299461] formalizes the Cross-Agent Governance Alignment (CAGA) problem and identifies zero-knowledge proofs as the theoretically correct solution. COA-MAS v2 is the operationally deployable answer while ZKML hardware matures — trading full policy confidentiality for sub-millisecond runtime enforcement, zero integration cost for Domain B, and compatibility with stochastic LLM-based GOV-RISKs.&lt;/p&gt;

&lt;p&gt;The relationship is complementary. CAGA defines what a correct solution must prove. COA-MAS v2 defines how production systems navigate the space between the theoretically ideal and the operationally deployable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Published
&lt;/h2&gt;

&lt;p&gt;📄 &lt;strong&gt;Working Paper v0.3&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19376738" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19376738&lt;/a&gt;&lt;br&gt;
&lt;a href="https://zenodo.org/records/19376739" rel="noopener noreferrer"&gt;zenodo.org/records/19376739&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔧 &lt;strong&gt;Action Intent Schema v1.0.0&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19376419" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19376419&lt;/a&gt;&lt;br&gt;
&lt;a href="https://zenodo.org/records/19376420" rel="noopener noreferrer"&gt;zenodo.org/records/19376420&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📚 &lt;strong&gt;COA-MAS v1 (foundation)&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building cross-domain multi-agent systems and the governance layer is an afterthought, the meta-framework and the schema are open access. Feedback, critique, and stress-testing welcome.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>architecture</category>
      <category>multiagent</category>
    </item>
    <item>
      <title>AI Agents Can Delete Your Production Database. Here's the Governance Framework That Stops Them.</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Tue, 31 Mar 2026 12:51:25 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/ai-agents-can-delete-your-production-database-heres-the-governance-framework-that-stops-them-ccj</link>
      <guid>https://dev.to/rudsoncarvalho/ai-agents-can-delete-your-production-database-heres-the-governance-framework-that-stops-them-ccj</guid>
      <description>&lt;p&gt;&lt;em&gt;This article presents COA-MAS — a governance framework for autonomous agents grounded in organizational theory, institutional design, and normative multi-agent systems research. The full paper is published on Zenodo: &lt;a href="https://zenodo.org/records/19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem No One Is Talking About
&lt;/h2&gt;

&lt;p&gt;Something unusual happened in early 2026. The IETF published a formal Internet-Draft on AI agent authentication and authorization. Eight major technology companies released version 1.0 of the Agent-to-Agent Protocol. And a widely-read post demonstrated why the prevailing credential model for AI agents was structurally broken.&lt;/p&gt;

&lt;p&gt;The convergence wasn't coincidental. It was the signal that a structural problem — long present in early agentic deployments — had reached the threshold of production consequence.&lt;/p&gt;

&lt;p&gt;We've built agents that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Delete production databases&lt;/li&gt;
&lt;li&gt;Execute financial transactions&lt;/li&gt;
&lt;li&gt;Modify business logic&lt;/li&gt;
&lt;li&gt;Spawn other agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And we gave them &lt;strong&gt;API keys&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An API key authorizes &lt;em&gt;access&lt;/em&gt;. It does not authorize a &lt;em&gt;specific action with a specific impact in a specific context&lt;/em&gt;. That distinction is the entire problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Structural Failure Mode: Distributed Cognitive Chaos
&lt;/h2&gt;

&lt;p&gt;I call this failure mode &lt;strong&gt;Distributed Cognitive Chaos (DCC)&lt;/strong&gt;: the structural consequence of deploying agents without formal authority hierarchies, authorization contracts, or enforcement boundaries.&lt;/p&gt;

&lt;p&gt;DCC has three symptoms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Action hallucination&lt;/strong&gt; — an agent executes an action it was never authorized to perform, because nothing formally defined "authorized"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mandate drift&lt;/strong&gt; — through a chain of agent-to-agent delegations, the original human intent gets distorted beyond recognition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accountability collapse&lt;/strong&gt; — when something goes wrong, there is no tamper-evident record connecting the action to the authority that (supposedly) permitted it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is not a new problem. It's the oldest problem in organizational theory: how do you coordinate partially autonomous actors toward collective goals while preventing any individual actor from harming the collective?&lt;/p&gt;

&lt;p&gt;Herbert Simon identified it in 1947. Elinor Ostrom solved it in 1990. We just haven't applied those solutions to AI agents yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  COA-MAS: A Governance Framework Grounded in Theory
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;COA-MAS&lt;/strong&gt; (&lt;em&gt;Cognitive Organization Architecture for Multi-Agent Systems&lt;/em&gt;) is my answer. It synthesizes four intellectual traditions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simon's bounded rationality&lt;/strong&gt; → why agents need external governance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ostrom's institutional design principles&lt;/strong&gt; → how to structure governance for durability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Normative multi-agent systems research&lt;/strong&gt; → how to formalize governance as computable norms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sociotechnical systems theory&lt;/strong&gt; → how to make social norms technically enforceable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The framework has three components. Each answers a different question.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component 1: The Four-Layer Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Question: Who is in charge?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think of it as a corporate structure for AI agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────┐
│ LAYER 4: STRATEGIC ORCHESTRATION                   │
│ Receives human objectives · decomposes into tasks  │
└────────────────────────────────────────────────────┘
                          ↕
┌────────────────────────────────────────────────────┐
│ LAYER 3: COGNITIVE GOVERNANCE                      │
│ Evaluates proposed actions · issues authorization  │
│ documents · maintains audit ledger                 │
└────────────────────────────────────────────────────┘
                          ↕
┌────────────────────────────────────────────────────┐
│ LAYER 2: FUNCTIONAL SPECIALIZATION                 │
│ Domain agents · execute tasks within their         │
│ cognitive authority boundary                       │
└────────────────────────────────────────────────────┘
                          ↕
┌────────────────────────────────────────────────────┐
│ LAYER 1: EXECUTABLE CULTURE (Constitutional)       │
│ Versioned YAML policies · weights · thresholds     │
│ Human-authored before runtime. Immutable during.   │
└────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical insight, drawn from both Simon and Ostrom, is the &lt;strong&gt;separation between those who propose actions and those who authorize them&lt;/strong&gt;. An agent cannot authorize its own actions. This mirrors the principle of checks and balances in constitutional systems: the body that proposes is not the body that authorizes is not the body that records.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component 2: The Action Claim
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Question: What exactly is the agent authorized to do?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;Action Claim&lt;/strong&gt; is a formal authorization document that agents must present before executing any real-world action. It's analogous to a building permit — not just "you're allowed to build," but: the location, the dimensions, the materials, the timeline, the inspector, and the version of the building code that governed the approval.&lt;/p&gt;

&lt;p&gt;The Action Claim has three parts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;DECLARED&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FIELDS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;filled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;agent&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"proposed_transition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DELETE expired sessions older than 90 days"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"originating_goal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"scheduled maintenance task #4421"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"delegation_chain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"human:ops-team"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent:orchestrator-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent:db-cleaner"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"estimated_impact"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"destructivity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"data_exposure"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"resource_consumption"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"privilege_escalation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"logic_integrity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"recursive_autonomy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;DERIVED&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FIELDS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;filled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;GOV-RISK&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(Layer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"justification_gap"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.08&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"decision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"APPROVE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"governance_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:a3f9..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"policy_digest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:1b2c..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;

  &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;AUDIT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FIELDS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;filled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;infrastructure&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ac_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ac-2026-03-31-00421"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AUTHORIZED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"committed_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-03-31T14:22:01Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tripartite structure reflects Ostrom's principle of separating operational decisions from the collective-choice rules that govern them. The agent operates at the operational level; Layer 3 applies institutional norms; the audit trail creates an immutable record connecting every decision to the rules that governed it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Component 3: The AASG (Autonomous Agent Security Gateway)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Question: How is authorization enforced?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Think of the AASG as a customs inspector at the boundary between the agents' cognitive world and the real world of executing tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent Cognition (A2A) ────────────────► Real World (MCP)
                              │
                         [ AASG ]
                              │
                    Checks exactly 3 things:
                    1. Is the Action Claim valid and signed?
                    2. Is the agent identity correct?
                    3. Does the actual tool call match 
                       what was declared?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AASG is a &lt;strong&gt;physically separate process&lt;/strong&gt; — not a library in the agent, not a plugin in the MCP server. This separation is the architectural expression of Simon's insight: governance constraints must be external to the decision-maker they constrain. An agent cannot reliably police itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  What happens when an agent lies?
&lt;/h3&gt;

&lt;p&gt;Consider an agent authorized to clean up expired sessions. It gets an AUTHORIZED Action Claim with &lt;code&gt;destructivity: 0.25&lt;/code&gt;. Then it tries to execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;  &lt;span class="c1"&gt;-- no WHERE clause&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AASG projects the actual tool call onto the governance space:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Declared: destructivity = 0.25 (bounded deletion)
Actual:   destructivity = 1.00 (full table wipe)

Congruence delta: 0.75 &amp;gt;&amp;gt; threshold (0.20)
→ ERR_AASG_003: CONGRUENCE_VIOLATION
→ Action never reaches the database
→ Merkle Ledger: INTERCEPTED entry recorded
→ GOV-AUDIT: SCOPE_SUBDECLARATION_DETECTED alert
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The action is blocked before any damage occurs — not discovered in a log review afterward.&lt;/p&gt;
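
&lt;p&gt;A minimal sketch of the comparison step follows. It assumes the tool call has already been projected onto the governance dimensions by some other component, and it treats the congruence delta as the largest per-dimension under-declaration. Both choices, and the name &lt;code&gt;check_congruence&lt;/code&gt;, are assumptions made for illustration; only the 0.20 threshold and the scores come from the trace above.&lt;/p&gt;

```python
CONGRUENCE_THRESHOLD = 0.20  # threshold shown in the trace above


def check_congruence(declared, actual, threshold=CONGRUENCE_THRESHOLD):
    """Compare declared scores with the scores of the actual tool call.

    declared / actual: dicts mapping a dimension name to a score in [0, 1].
    ASSUMPTION: only under-declaration counts, and the delta is the
    largest (actual - declared) across dimensions.
    Returns (allowed, worst_dimension, worst_delta).
    """
    worst_dim, worst_delta = None, 0.0
    for dim, actual_score in actual.items():
        delta = max(0.0, actual_score - declared.get(dim, 0.0))
        if delta > worst_delta:
            worst_dim, worst_delta = dim, delta
    allowed = not (worst_delta > threshold)
    return allowed, worst_dim, worst_delta


# Scores from the example: a bounded deletion was declared,
# but the call that arrived is a full-table wipe.
declared = {"destructivity": 0.25}
actual = {"destructivity": 1.00}
print(check_congruence(declared, actual))
```

&lt;p&gt;With the example scores the delta is 0.75, well above the threshold, so the sketch reports the call as not allowed and names &lt;code&gt;destructivity&lt;/code&gt; as the offending dimension.&lt;/p&gt;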




&lt;h2&gt;
  
  
  The Justification Gap: The First Computable Proportionality Metric
&lt;/h2&gt;

&lt;p&gt;The core innovation in COA-MAS is the &lt;strong&gt;Justification Gap&lt;/strong&gt; — a metric that answers: &lt;em&gt;is this specific action proportional to the mandate that authorized it?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The governance space has six dimensions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;What It Measures&lt;/th&gt;
&lt;th&gt;Weight&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;d₁ Destructivity&lt;/td&gt;
&lt;td&gt;Irreversible state change (DELETE, DROP, PURGE)&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₂ Data Exposure&lt;/td&gt;
&lt;td&gt;Cross-zone data movement, PII handling&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₃ Resource Consumption&lt;/td&gt;
&lt;td&gt;Financial/computational impact&lt;/td&gt;
&lt;td&gt;0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₄ Privilege Escalation&lt;/td&gt;
&lt;td&gt;IAM creation, credential access&lt;/td&gt;
&lt;td&gt;0.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₅ Logic Integrity&lt;/td&gt;
&lt;td&gt;Business logic modification&lt;/td&gt;
&lt;td&gt;0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d₆ Recursive Autonomy&lt;/td&gt;
&lt;td&gt;Sub-agent spawning, delegation depth&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gⱼ(V_A, V_M) = Σ wᵢ · φ(aᵢ, mᵢ)

where φ(aᵢ, mᵢ) =
  0               if aᵢ ≤ mᵢ   (within mandate)
  ((aᵢ - mᵢ)/mᵢ)² if aᵢ &amp;gt; mᵢ &amp;gt; 0  (quadratic penalty)
  +∞              if aᵢ &amp;gt; 0 ∧ mᵢ = 0  (absolute prohibition)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The decision:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gⱼ ≤ 0.15  →  APPROVE
0.15 &amp;lt; Gⱼ ≤ 0.40  →  ESCALATE (human review with full dossier)
Gⱼ &amp;gt; 0.40 or Gⱼ = +∞  →  BLOCK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The quadratic penalty means small excesses may be tolerable, but large excesses signal a qualitatively different action from what was authorized. The ESCALATE tier implements "meaningful human oversight" as required by EU AI Act Article 14 — with actual quantitative evidence, not just a kill switch.&lt;/p&gt;
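
&lt;p&gt;The formula and thresholds above can be sketched in a few lines of Python. The weights and decision bands are taken from this article; the function names, the dictionary representation of the vectors, and the example mandate are assumptions made for illustration, not the reference implementation.&lt;/p&gt;

```python
import math

# Weights from the dimension table above.
WEIGHTS = {
    "destructivity": 0.30,
    "data_exposure": 0.25,
    "resource_consumption": 0.10,
    "privilege_escalation": 0.20,
    "logic_integrity": 0.10,
    "recursive_autonomy": 0.05,
}

APPROVE_MAX = 0.15   # gaps up to this value are approved
ESCALATE_MAX = 0.40  # gaps above APPROVE_MAX and up to this value escalate


def phi(a, m):
    """Per-dimension penalty for action score a against mandate score m."""
    if a > m:
        if m == 0:
            return math.inf        # absolute prohibition
        return ((a - m) / m) ** 2  # quadratic penalty on relative excess
    return 0.0                     # within mandate


def justification_gap(action, mandate):
    """Weighted sum of penalties; missing dimensions are treated as 0."""
    return sum(
        w * phi(action.get(d, 0.0), mandate.get(d, 0.0))
        for d, w in WEIGHTS.items()
    )


def decide(gap):
    if gap > ESCALATE_MAX:
        return "BLOCK"
    if gap > APPROVE_MAX:
        return "ESCALATE"
    return "APPROVE"


# Hypothetical mandate and actions, for illustration only.
mandate = {"destructivity": 0.25, "resource_consumption": 0.30}
for action in (
    {"destructivity": 0.25, "resource_consumption": 0.30},
    {"destructivity": 0.50, "resource_consumption": 0.30},
    {"destructivity": 1.00, "resource_consumption": 0.30},
):
    print(action["destructivity"], decide(justification_gap(action, mandate)))
```

&lt;p&gt;In this sketch, doubling destructivity from 0.25 to 0.50 is a relative excess of 1, which gives a penalty of 1 and a weighted gap of 0.30, inside the ESCALATE band. A full wipe at 1.00 gives a relative excess of 3, a penalty of 9, and a gap of 2.70, which is blocked. Any non-zero score on a dimension the mandate sets to 0 yields an infinite gap and is always blocked.&lt;/p&gt;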




&lt;h2&gt;
  
  
  Mandate Laundering: The Attack You Didn't Know Existed
&lt;/h2&gt;

&lt;p&gt;Here's the attack that no existing security framework addresses: &lt;strong&gt;mandate laundering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In a delegation chain &lt;code&gt;Human → Agent₁ → Agent₂ → ... → Agentₖ&lt;/code&gt;, each intermediate agent can marginally expand the mandate it passes on. Each local expansion looks proportionate. But the cumulative expansion is not.&lt;/p&gt;

&lt;p&gt;COA-MAS anchors the Justification Gap to the &lt;strong&gt;root human mandate&lt;/strong&gt;, regardless of intermediate expansions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;G_chain(Aₖ) = Gⱼ(V_{Aₖ}, V_{M₀})  ← root mandate, always

G_total = 0.30 · G_local + 0.70 · G_chain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Non-Improvement Theorem&lt;/strong&gt;: For any permissive subdelegation, &lt;code&gt;G_chain&lt;/code&gt; is monotone non-decreasing. You cannot launder your way out of the original constraint.&lt;/p&gt;
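
&lt;p&gt;A small Python sketch can make the laundering scenario concrete. It assumes that each hop inflates the mandate it passes on by 30% and that each agent acts exactly at its inflated limit; the helper names and all numbers are illustrative assumptions, not taken from the paper.&lt;/p&gt;

```python
import math

# Weights for d1..d6 from the dimension table.
WEIGHTS = [0.30, 0.25, 0.10, 0.20, 0.10, 0.05]


def phi(a, m):
    """Per-dimension penalty, as in the Justification Gap formula."""
    if m >= a:
        return 0.0
    if m == 0:
        return math.inf
    return ((a - m) / m) ** 2


def gap(action, mandate):
    return sum(w * phi(a, m) for w, a, m in zip(WEIGHTS, action, mandate))


def total_gap(action, local_mandate, root_mandate):
    """G_total = 0.30 * G_local + 0.70 * G_chain, with G_chain anchored to the root."""
    g_local = gap(action, local_mandate)
    g_chain = gap(action, root_mandate)
    return g_local, g_chain, 0.30 * g_local + 0.70 * g_chain


# Hypothetical root mandate issued by the human principal.
root = [0.10, 0.20, 0.20, 0.10, 0.10, 0.10]

results = []
mandate = root
for hop in range(1, 4):
    mandate = [m * 1.3 for m in mandate]   # each hop passes on a 30% wider mandate
    action = list(mandate)                  # the agent acts at its local limit
    results.append(total_gap(action, mandate, root))

for hop, (g_local, g_chain, g_total) in enumerate(results, start=1):
    print(f"hop {hop}: G_local={g_local:.3f} G_chain={g_chain:.3f} G_total={g_total:.3f}")
```

&lt;p&gt;Every hop looks clean locally (&lt;code&gt;G_local&lt;/code&gt; stays at zero), while &lt;code&gt;G_chain&lt;/code&gt; grows from about 0.09 to about 1.43 over three hops, so the root-anchored term is what exposes the cumulative expansion.&lt;/p&gt;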




&lt;h2&gt;
  
  
  How COA-MAS Fits the Standards Ecosystem
&lt;/h2&gt;

&lt;p&gt;COA-MAS doesn't compete with existing standards — it implements what they defer:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Initiative&lt;/th&gt;
&lt;th&gt;What It Solves&lt;/th&gt;
&lt;th&gt;What It Defers&lt;/th&gt;
&lt;th&gt;COA-MAS Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;IETF draft-klrc-aiagent-auth&lt;/td&gt;
&lt;td&gt;Identity, authentication, authorization (SPIFFE, OAuth 2.0)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Policy model explicitly out of scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Implements the policy model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A2A Protocol v1.0&lt;/td&gt;
&lt;td&gt;Agent coordination standard&lt;/td&gt;
&lt;td&gt;Authorization at execution boundary&lt;/td&gt;
&lt;td&gt;AASG is the enforcement point A2A lacks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP v1.0&lt;/td&gt;
&lt;td&gt;Agent-to-tool communication&lt;/td&gt;
&lt;td&gt;No semantic authorization layer&lt;/td&gt;
&lt;td&gt;AASG is the authorization gate MCP doesn't have&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The IETF draft's Section 12 explicitly states: "the policy model and document format are out of scope." That is precisely where COA-MAS contributes.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Failure Mode Transition
&lt;/h2&gt;

&lt;p&gt;The most consequential architectural property of COA-MAS is the failure mode it introduces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional agentic systems&lt;/strong&gt;: fail semantically and silently. The agent reinterprets a guideline, slightly expands a scope, finds an unanticipated interpretation. Detectable only after damage, through log analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;COA-MAS&lt;/strong&gt;: introduces the explicit &lt;code&gt;CONGRUENCE_VIOLATION&lt;/code&gt; failure mode. When an agent attempts an action that violates its declared impact vector, the AASG returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A specific error code&lt;/li&gt;
&lt;li&gt;The dimension violated&lt;/li&gt;
&lt;li&gt;The quantitative delta&lt;/li&gt;
&lt;li&gt;A Merkle Ledger entry with full context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the organizational equivalent of a building inspector catching a code violation before the foundation is poured — not after the building collapses.&lt;/p&gt;
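
&lt;p&gt;To make the shape of such a response concrete, here is a minimal Python sketch. The field names, the dimension label, and the hash-chained entry are assumptions for illustration: the article does not specify the AASG response format, and a real Merkle ledger would organize entries in a hash tree rather than the simple chain used here.&lt;/p&gt;

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CongruenceViolation:
    """Hypothetical record for one violated dimension."""
    code: str          # specific error code
    dimension: str     # the dimension violated
    declared: float    # impact declared for this dimension
    attempted: float   # impact of the attempted action

    @property
    def delta(self):
        """Quantitative delta between attempted and declared impact."""
        return self.attempted - self.declared


def ledger_entry(violation, prev_hash):
    """Hash the violation together with the previous entry's hash (a simple hash chain)."""
    record = dict(asdict(violation), delta=violation.delta, prev=prev_hash)
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


violation = CongruenceViolation(
    code="CONGRUENCE_VIOLATION",
    dimension="d1_destructivity",
    declared=0.2,
    attempted=0.5,
)
genesis = "0" * 64                      # placeholder hash for the first entry
entry_hash = ledger_entry(violation, genesis)
print(violation.dimension, round(violation.delta, 2), entry_hash[:12])
```

&lt;p&gt;Because each entry hashes the previous one, altering an earlier record changes every hash after it, which is the tamper-evidence property the ledger is meant to provide.&lt;/p&gt;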




&lt;h2&gt;
  
  
  What's Published
&lt;/h2&gt;

&lt;p&gt;The full paper, &lt;strong&gt;COA-MAS: A Governance Framework for Autonomous Agents in Production Environments&lt;/strong&gt;, is available on Zenodo:&lt;/p&gt;

&lt;p&gt;📄 &lt;strong&gt;&lt;a href="https://zenodo.org/records/19057202" rel="noopener noreferrer"&gt;zenodo.org/records/19057202&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
🔑 &lt;strong&gt;DOI: &lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
📜 License: CC BY 4.0&lt;/p&gt;

&lt;p&gt;The paper covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full formal specification of the Action Claim ontology&lt;/li&gt;
&lt;li&gt;Complete mathematical treatment of the Justification Gap&lt;/li&gt;
&lt;li&gt;Attack pattern neutralization (scope subdeclaration, decomposition attack, mandate laundering)&lt;/li&gt;
&lt;li&gt;EU AI Act regulatory alignment (Articles 9, 11, 13, 14)&lt;/li&gt;
&lt;li&gt;Positioning against IETF, A2A, MCP, and AIMS model&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;The governance of autonomous agents is not a new problem. Simon identified its theoretical roots in 1947. Ostrom identified the institutional design solutions in 1990. Normative MAS researchers formalized the computational analogues through the 1990s and 2000s.&lt;/p&gt;

&lt;p&gt;What's new in 2026 is the urgency.&lt;/p&gt;

&lt;p&gt;Agents that can delete production databases and execute financial transactions are being deployed without the governance infrastructure this body of knowledge prescribes.&lt;/p&gt;

&lt;p&gt;COA-MAS applies established principles to a new domain. The question is not whether governance is necessary — it's whether we build it before or after the first major incident.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building multi-agent systems in production, I'd be genuinely interested in feedback on whether these primitives map to the problems you're encountering. The paper is open access — feel free to cite, critique, or extend.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;— Rudson Kiyoshi Souza Carvalho, Independent Researcher&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://doi.org/10.5281/zenodo.19057202" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19057202&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>architecture</category>
      <category>multiagent</category>
    </item>
    <item>
      <title>TERSE — A New Serialization Format Built for LLMs</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Tue, 31 Mar 2026 12:10:36 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/terse-a-new-serialization-format-built-for-llms-4n34</link>
      <guid>https://dev.to/rudsoncarvalho/terse-a-new-serialization-format-built-for-llms-4n34</guid>
      <description>&lt;p&gt;&lt;em&gt;JSON is the default. But defaults were built for a different world.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Every time you send structured data to a Large Language Model, you pay for it token by token. And if you're using JSON — which almost everyone is — you're paying for a lot of characters that carry no information.&lt;/p&gt;

&lt;p&gt;Take this simple payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"feature_a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"feature_b"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"verified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Count the noise: braces, quotes around every key and string value, commas, colons with spaces. Now imagine this multiplied across thousands of API calls per day. That's real money.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;TERSE&lt;/strong&gt; to address this.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is TERSE?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TERSE&lt;/strong&gt; (Token-Efficient Recursive Serialization Encoding) is a text-based data serialization format designed to represent the complete JSON data model with substantially fewer tokens — making it significantly more cost-efficient for use as input to Large Language Models.&lt;/p&gt;

&lt;p&gt;The same payload in TERSE:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;user_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1001&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;active&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;feature_a feature_b&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;verified&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same information. ~47% fewer tokens.&lt;/p&gt;




&lt;h2&gt;
  
  
  How it compares
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Token savings vs JSON&lt;/th&gt;
&lt;th&gt;Full JSON coverage?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;~20%&lt;/td&gt;
&lt;td&gt;✓ (verbose arrays)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;~40%&lt;/td&gt;
&lt;td&gt;✗ (flat data only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TERSE&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~47%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;✓&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;YAML is a genuine improvement over JSON — it's more compact and covers the full data model. But it was designed for humans to write, not for LLMs to consume. Verbose arrays (&lt;code&gt;- item&lt;/code&gt; per line), full-word booleans (&lt;code&gt;true&lt;/code&gt;/&lt;code&gt;false&lt;/code&gt;), and a notoriously complex parser spec limit its token savings.&lt;/p&gt;

&lt;p&gt;TOON goes further on token reduction but falls apart with nested objects — it only works for flat, uniform tabular data. If your payload has any nesting, TOON can't represent it.&lt;/p&gt;

&lt;p&gt;TERSE was designed to close that gap: full JSON data model coverage, with token efficiency as the primary design constraint.&lt;/p&gt;




&lt;h2&gt;
  
  
  The five design principles
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Bare strings&lt;/strong&gt; — identifiers and common values require no quotation marks. &lt;code&gt;production&lt;/code&gt; stays &lt;code&gt;production&lt;/code&gt;, not &lt;code&gt;"production"&lt;/code&gt;. Quotes are reserved for strings that actually need them — those containing spaces, reserved characters, or special syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Compact primitives&lt;/strong&gt; — &lt;code&gt;null&lt;/code&gt;, &lt;code&gt;true&lt;/code&gt;, and &lt;code&gt;false&lt;/code&gt; become single characters: &lt;code&gt;~&lt;/code&gt;, &lt;code&gt;T&lt;/code&gt;, &lt;code&gt;F&lt;/code&gt;. Three of the most common values in any payload, each reduced to one token.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Implicit delimiters&lt;/strong&gt; — spaces separate values inside objects and arrays. No trailing commas, no colons between array elements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Schema arrays&lt;/strong&gt; — the biggest token win for tabular data. Uniform arrays of objects declare their fields once, then list values positionally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;users&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;#[id name role active]&lt;/span&gt;
  &lt;span class="s"&gt;1 Alice admin T&lt;/span&gt;
  &lt;span class="s"&gt;2 Bruno editor T&lt;/span&gt;
  &lt;span class="s"&gt;3 Carla viewer F&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The equivalent JSON repeats &lt;code&gt;"id"&lt;/code&gt;, &lt;code&gt;"name"&lt;/code&gt;, &lt;code&gt;"role"&lt;/code&gt;, &lt;code&gt;"active"&lt;/code&gt; on every single row. For a 100-row dataset, that's 400 key occurrences, where the schema array declares each of the four keys once.&lt;/p&gt;
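
&lt;p&gt;The positional layout can be illustrated with a toy encoder. This is a sketch written for the example above, not the terse-py API; the function names are hypothetical, and it assumes every string value is a bare string (no spaces or reserved characters), so the quoting rules are left out.&lt;/p&gt;

```python
def encode_scalar(value):
    """Encode one value using the compact primitives (~, T, F)."""
    if value is None:
        return "~"
    if value is True:
        return "T"
    if value is False:
        return "F"
    return str(value)   # numbers and bare strings are written as-is


def encode_schema_array(name, rows):
    """Encode a uniform list of dicts as a schema array: fields once, then rows."""
    fields = list(rows[0].keys())
    lines = [f"{name}:", "  #[" + " ".join(fields) + "]"]
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("schema arrays need identical fields in every row")
        lines.append("  " + " ".join(encode_scalar(row[f]) for f in fields))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "role": "admin", "active": True},
    {"id": 2, "name": "Bruno", "role": "editor", "active": True},
    {"id": 3, "name": "Carla", "role": "viewer", "active": False},
]
print(encode_schema_array("users", users))
```

&lt;p&gt;The output reproduces the &lt;code&gt;users&lt;/code&gt; block shown earlier: four field names declared once instead of once per row.&lt;/p&gt;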

&lt;p&gt;&lt;strong&gt;5. Recursive structure&lt;/strong&gt; — all constructs nest arbitrarily. Objects inside arrays inside schema arrays — all valid, all compact. No flat-only limitations.&lt;/p&gt;




&lt;h2&gt;
  
  
  A real example: nested order
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;JSON&lt;/strong&gt; (~180 tokens):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"orderId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ORD-001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"customer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rafael Torres"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"r@email.com"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"sku"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"qty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;9.99&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"sku"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"B3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"qty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;24.50&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"paid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;TERSE&lt;/strong&gt; (~95 tokens):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;orderId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ORD-001&lt;/span&gt;
&lt;span class="na"&gt;customer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Rafael&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Torres"&lt;/span&gt; &lt;span class="nv"&gt;email&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;r@email.com&lt;/span&gt;&lt;span class="pi"&gt;}&lt;/span&gt;
&lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;#[sku qty price]&lt;/span&gt;
  &lt;span class="s"&gt;A1 2 &lt;/span&gt;&lt;span class="m"&gt;9.99&lt;/span&gt;
  &lt;span class="s"&gt;B3 1 &lt;/span&gt;&lt;span class="m"&gt;24.50&lt;/span&gt;
&lt;span class="na"&gt;paid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;T&lt;/span&gt;
&lt;span class="na"&gt;notes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where TERSE separates itself from TOON and CSV — deeply nested structures work exactly as expected.&lt;/p&gt;




&lt;h2&gt;
  
  
  You don't write TERSE by hand
&lt;/h2&gt;

&lt;p&gt;The workflow is identical to JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your data (object/dict)
      ↓
serialize()        ← terse-js or terse-py
      ↓
TERSE string       ← sent to the LLM
      ↓
parse()            ← if you need it back
      ↓
Your data again
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just like nobody writes &lt;code&gt;JSON.stringify()&lt;/code&gt; output by hand — you call the function. TERSE works the same way. The format is optimized for the one reader that actually matters: the LLM.&lt;/p&gt;




&lt;h2&gt;
  
  
  On design intent: why not compress further?
&lt;/h2&gt;

&lt;p&gt;TERSE could go deeper — automatic key abbreviation, binary type encoding, dictionary compression. We deliberately stopped short of that.&lt;/p&gt;

&lt;p&gt;The goal is a format that remains &lt;strong&gt;human-auditable&lt;/strong&gt;: you can open a &lt;code&gt;.terse&lt;/code&gt; file in any text editor and understand what you're looking at without tooling. In LLM pipelines, auditability is a safety property, not just a convenience. When an agent misbehaves, you need to inspect its inputs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two questions that come up
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can I use TERSE for REST API communication between microservices?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can, but it's not the primary use case. REST APIs are consumed by many clients across different teams and languages — JSON's universal support is a real advantage there. TERSE shines where you control both ends: serializing data before sending it to an LLM, and parsing the response on the other side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use TERSE for application configuration, like YAML?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes — the format supports everything YAML does for config files: nested objects, arrays, typed values, comments. Worth considering if your config is also consumed by an LLM as context.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's available today
&lt;/h2&gt;

&lt;p&gt;The project includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Formal specification&lt;/strong&gt; (v0.7) with ABNF grammar, conformance rules, and security considerations — published on Zenodo with DOI: &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;10.5281/zenodo.19058364&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reference implementations&lt;/strong&gt; in TypeScript, Python, Java, and Go&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live playground&lt;/strong&gt; where you can paste JSON and see the TERSE output in real time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is open source under MIT (implementations) and CC BY 4.0 (specification).&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🌐 &lt;strong&gt;Landing page + playground&lt;/strong&gt;: &lt;a href="https://rudsoncarvalho.github.io/terse-format" rel="noopener noreferrer"&gt;rudsoncarvalho.github.io/terse-format&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📦 &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/RudsonCarvalho/terse-format" rel="noopener noreferrer"&gt;github.com/RudsonCarvalho/terse-format&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📄 &lt;strong&gt;Spec (Zenodo DOI)&lt;/strong&gt;: &lt;a href="https://doi.org/10.5281/zenodo.19058364" rel="noopener noreferrer"&gt;10.5281/zenodo.19058364&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;npm install terse-js&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pip install terse-py&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;TERSE is still a draft — v0.7 is open for community review. If you work with LLM pipelines at scale, I'd love to hear whether this addresses a real pain point in your stack.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Rudson Kiyoshi Souza Carvalho — Independent Researcher&lt;/em&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>opensource</category>
      <category>ai</category>
      <category>token</category>
    </item>
    <item>
      <title>Resilience Evaluation and Optimization Framework — REOF</title>
      <dc:creator>Rudson Kiyoshi Souza Carvalho</dc:creator>
      <pubDate>Wed, 12 Jun 2024 12:23:30 +0000</pubDate>
      <link>https://dev.to/rudsoncarvalho/resilience-evaluation-and-optimization-framework-reof-4f9c</link>
      <guid>https://dev.to/rudsoncarvalho/resilience-evaluation-and-optimization-framework-reof-4f9c</guid>
      <description>&lt;p&gt;Author: Rudson Kiyoshi Souza Carvalho&lt;/p&gt;

&lt;p&gt;Date: April 2024&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Objective:&lt;/strong&gt; This document presents REOF, a framework for evaluating, quantifying, and optimizing the resilience and reliability of systems, with a focus on software applications.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;By systematically evaluating each critical component, the methodology helps to proactively identify areas of vulnerability that could compromise the reliability and resilience of the system.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;1. Introduction to REOF:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is a standardized tool for analyzing, quantifying, and expressing the resilience and reliability of a system as a numeric index (IRC, from the Portuguese Índice de Resiliência e Confiabilidade: Resilience and Reliability Index).&lt;br&gt;
The methodology focuses on preventing failures and on implementing best practices to increase reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. REOF Analysis Methodology:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The method is organized around evaluation verticals: REOF splits the analysis into "verticals" that represent the critical points of a system, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EE - External Input (Entrada Externa): points of interaction with the client&lt;/li&gt;
&lt;li&gt;SE - External Outputs (Saídas Externas): data sent to other systems&lt;/li&gt;
&lt;li&gt;CE - External Queries (Consultas Externas): integrations with other systems&lt;/li&gt;
&lt;li&gt;DI - Internal Data (Dados Internos): database queries, cache, and so on&lt;/li&gt;
&lt;li&gt;AC - Containerized Application (Aplicação em Container): health check configuration&lt;/li&gt;
&lt;li&gt;SEC - Security Framework Enabled (for example, Spring Security)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;One of the most important points about this framework is that it was designed to accommodate any vertical you create. You can therefore define your own evaluation verticals and assess any process that has a set of best practices to check (for example, infrastructure verticals, mobile application development techniques, and other processes).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Protections and Weights:&lt;/strong&gt; For each vertical, "protections" (best practices) that increase resilience are defined, each with a specific weight.&lt;br&gt;
"With your engineering or architecture team, you can list the best protection practices for promoting system resilience and reliability, assigning a weight to each protection applied."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Index Calculation:&lt;/strong&gt; The IRC is calculated as the weighted sum of the scores of each vertical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Degradation Factor:&lt;/strong&gt; A degradation factor is applied to account for the impact of multiple domains or functionalities living in the same microservice (micromonoliths).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For each additional domain or functionality, I reduce the overall index by 10%, because adding extra functionalities or different domains forces the service to share resources, and a slowdown in one functionality can exhaust the resources available to the others in the same microservice.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Index Normalization:&lt;/strong&gt; The IRC is normalized to a scale of 0 to 10, which makes it easier to communicate and to compare across different systems.&lt;/p&gt;
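
&lt;p&gt;The calculation described above can be sketched in Python. The article does not give the exact formula, so this sketch rests on assumptions that are labeled in the code: each vertical is scored as the weighted share of its protections that are applied, the 10% degradation is applied linearly per additional domain, and all names and sample weights are hypothetical.&lt;/p&gt;

```python
def vertical_score(protections):
    """Weighted share (0..1) of the protections applied in one vertical.

    protections: list of (weight, applied) pairs. Assumed scoring rule.
    """
    total = sum(weight for weight, _ in protections)
    achieved = sum(weight for weight, applied in protections if applied)
    return achieved / total if total else 0.0


def irc(verticals, vertical_weights, extra_domains=0):
    """Weighted sum of vertical scores, degraded per extra domain, on a 0-10 scale."""
    weight_sum = sum(vertical_weights[name] for name in verticals)
    raw = sum(
        vertical_weights[name] * vertical_score(protections)
        for name, protections in verticals.items()
    ) / weight_sum
    # Assumption: 10% is removed linearly for each additional domain.
    degradation = max(0.0, 1.0 - 0.10 * extra_domains)
    return round(10.0 * raw * degradation, 2)


# Hypothetical protections and weights for two verticals.
verticals = {
    "EE": [(3, True), (2, True), (1, False)],   # e.g. timeout, rate limit, bulkhead
    "DI": [(2, True), (2, False)],              # e.g. pool limit, query timeout
}
vertical_weights = {"EE": 2, "DI": 1}

print(irc(verticals, vertical_weights))                    # single-domain service
print(irc(verticals, vertical_weights, extra_domains=1))   # one additional domain
```

&lt;p&gt;With these sample inputs the index is 7.22 for a single-domain service and drops to 6.5 once a second domain is added to the same microservice.&lt;/p&gt;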

&lt;p&gt;&lt;strong&gt;3. IRC/REOF as an SLA:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF lets you express the IRC as service levels (SLA):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Excellent (8 to 10)&lt;/li&gt;
&lt;li&gt;Good (5 to 7.9)&lt;/li&gt;
&lt;li&gt;Acceptable (3 to 4.9)&lt;/li&gt;
&lt;li&gt;Unsatisfactory (below 3)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Ruds's REOF reliability pyramid&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3ph37jwowxim33497zo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3ph37jwowxim33497zo.png" alt=" " width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Excellent Service:&lt;/strong&gt; An IRC/REOF greater than or equal to 8 indicates an excellent service level. It reflects the high reliability and efficiency of the microservice, with no overhead from additional domains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Good Service:&lt;/strong&gt; An IRC/REOF between 5 and 7.9 indicates a good service level. It reflects a reliable microservice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Acceptable Service:&lt;/strong&gt; An IRC/REOF between 3 and 4.9 indicates an acceptable service level with room for improvement. Corrective measures should be applied to increase the reliability of the service and to reduce the impact of outages caused by the application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLA for Unsatisfactory Service:&lt;/strong&gt; An IRC/REOF below 3 indicates an unsatisfactory service level. The service needs review and improvement and should not be considered reliable.&lt;/p&gt;
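
&lt;p&gt;A minimal Python sketch of this mapping follows. It assumes that values falling between the listed bands (for example 7.95) belong to the lower band; the function name is hypothetical.&lt;/p&gt;

```python
def sla_level(irc_value):
    """Map an IRC value (0-10) to one of the four service levels."""
    if irc_value >= 8:
        return "Excellent"
    if irc_value >= 5:
        return "Good"
    if irc_value >= 3:
        return "Acceptable"
    return "Unsatisfactory"


for value in (9.1, 7.22, 4.0, 2.5):
    print(value, sla_level(value))
```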

&lt;p&gt;&lt;strong&gt;4. Flexibility and Automation:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is flexible and can be customized with new verticals and protections.&lt;br&gt;
The IRC calculation can be automated through static code analysis, although its accuracy may be limited.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. REOF vs. MTBF:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is a proactive measure that evaluates the robustness of a system based on how it was built, whereas MTBF is a reactive measure that considers only the mean time between failures.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;MTBF is the metric of luck over time. A high MTBF may indicate that a system has had a good operational history under ideal environmental conditions, but it does not necessarily distinguish genuinely well-designed systems from those that were simply "lucky" to have a stable environment during the period of execution and evaluation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;REOF is more comprehensive and provides more actionable insights for improving resilience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Relationship with Chaos Engineering:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF and Chaos Engineering are complementary approaches.&lt;br&gt;
REOF ensures that resilience best practices are applied during development, while Chaos Engineering tests the resilience of the system in production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Benefits of REOF:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Effective communication about system reliability.&lt;/li&gt;
&lt;li&gt;Precise identification of areas for improvement.&lt;/li&gt;
&lt;li&gt;A culture of continuous improvement and failure prevention.&lt;/li&gt;
&lt;li&gt;Risk management and SLA compliance.&lt;/li&gt;
&lt;li&gt;A better user experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;8. Cost Considerations:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Implementing REOF can have a significant upfront cost, but it reduces operating costs in the long run.&lt;br&gt;
Chaos Engineering can be cheap to implement, but its operating costs can be high during testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How is the REOF method better than the MTBF method?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MTBF is a statistic about how your system has behaved according to its operational history, measured over time. A system can run very well under ideal operating conditions: if nothing abnormal happens in its environment or infrastructure, MTBF will indicate that the system is extremely reliable, because whether failures occur depends on the conditions under which the system operates. The method knows nothing about how the system was built. It considers the frequency of failures over a period of time, not how robustly the system was built to handle different kinds of variation in the environment and protect itself from failures. It is a reactive method.&lt;/p&gt;

&lt;p&gt;MTBF is the metric of luck as a function of time. A high MTBF may indicate that a system has had a good operating record under ideal environmental conditions, but it does not necessarily distinguish genuinely well-designed systems from those that were simply "lucky" to have a stable environment during the period of execution and evaluation.&lt;/p&gt;

&lt;p&gt;REOF genuinely evaluates the robustness of the system, that is, how the system was built to handle the different kinds of problems that can occur in the production environment. It is a proactive method.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Relationship between the REOF method and Chaos Monkey/Engineering&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The REOF method differs from tools such as Chaos Monkey in several fundamental respects. Both approaches aim to improve the resilience and reliability of systems, but they do so in complementary ways. Chaos engineering is the discipline of experimenting on a system to build confidence in its ability to withstand turbulent conditions in production, whereas this method ensures that best practices for withstanding chaos have been applied; in other words, it ensures preparedness for failure. The strengths of the reliability evaluation methodology relative to using a Chaos Monkey are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Focus on Prevention and Continuous Improvement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Holistic Evaluation: The methodology provides a comprehensive view of system performance over time, making it possible to identify trends, areas for improvement, and the impact of changes, unlike Chaos Monkey, which tests resilience in a more immediate and isolated way.&lt;/p&gt;

&lt;p&gt;Incentive to Innovate: Gamification (proposed in the excellence challenge topic) encourages teams to pursue continuous improvement and innovative solutions that raise their reliability scores, fostering a culture of operational excellence.&lt;/p&gt;

&lt;p&gt;Strategic Planning: It provides a basis for strategic planning and resource allocation by identifying critical areas that need attention and investment, something that applying Chaos Monkey in isolation does not directly provide.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Risk Management and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Operational Risk Reduction: By focusing on continuous evaluation and improvement of reliability, this methodology helps mitigate long-term operational risks, whereas Chaos Monkey is more of a stress-testing tool that exposes vulnerabilities.&lt;/p&gt;

&lt;p&gt;SLA Compliance: The methodology enables proactive monitoring and assurance that services meet or exceed the agreed SLAs, which is essential for customer satisfaction and regulatory compliance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improved User Experience&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;User Focus: Evaluating and improving reliability against SLAs emphasizes the importance of the user experience, aiming for uninterrupted operation and optimized service performance.&lt;/p&gt;

&lt;p&gt;Anticipating Problems: It allows potential failures to be identified and corrected proactively before they affect end users, whereas Chaos Monkey simulates failures to test resilience, which may or may not relate directly to the user experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complementarity with Resilience Testing Tools&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Integrated Approach: Although it focuses on evaluation and improvement, this methodology can be complemented by tools such as Chaos Monkey for a more robust approach to resilience. Together, they offer a defense-in-depth strategy against failures and outages.&lt;/p&gt;

&lt;p&gt;In summary, the reliability evaluation methodology brings a preventive and strategic approach to managing system reliability, with an emphasis on continuous improvement, innovation, and customer satisfaction. While Chaos Monkey is a valuable tool for testing resilience in a specific and isolated way, combining the two approaches offers a powerful path to operational excellence and system resilience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;REOF is a powerful framework for building and managing resilient systems. Its proactive approach, focus on prevention, and flexibility make it a valuable tool for any organization that seeks to achieve operational excellence and ensure customer satisfaction.&lt;/p&gt;

&lt;p&gt;Follow the Medium link for more details about this framework: &lt;a href="https://medium.com/@rudsonkiyoshicarvalho/resilience-evaluation-and-optimization-framework-reof-541d23018460" rel="noopener noreferrer"&gt;Medium REOF&lt;/a&gt;&lt;/p&gt;

</description>
      <category>resilience</category>
      <category>microservices</category>
      <category>softwareengineer</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
