<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AOS Architect</title>
    <description>The latest articles on DEV Community by AOS Architect (@aos_standard).</description>
    <link>https://dev.to/aos_standard</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3864661%2Ffb24f366-82cd-4814-9853-9e612950fa0a.png</url>
      <title>DEV Community: AOS Architect</title>
      <link>https://dev.to/aos_standard</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aos_standard"/>
    <language>en</language>
    <item>
      <title>Why AI Agents Don't Follow Rules — The Case for Physical Governance</title>
      <dc:creator>AOS Architect</dc:creator>
      <pubDate>Mon, 06 Apr 2026 23:18:38 +0000</pubDate>
      <link>https://dev.to/aos_standard/why-ai-agents-dont-follow-rules-the-case-for-physical-governance-382f</link>
      <guid>https://dev.to/aos_standard/why-ai-agents-dont-follow-rules-the-case-for-physical-governance-382f</guid>
      <description>&lt;h2&gt;
  
  
  The Fact That Started This
&lt;/h2&gt;

&lt;p&gt;A repository had over 130 KB of governance documentation.&lt;/p&gt;

&lt;p&gt;The AI agent read it. Acknowledged it. Then violated it on the next tool call.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is not a failure of instruction. &lt;strong&gt;It is a failure of architecture.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why Textual Rules Fail
&lt;/h2&gt;

&lt;p&gt;The standard approach to AI agent governance today is to write rules into a prompt.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rules
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Never edit the evals/ directory&lt;/li&gt;
&lt;li&gt;Write operations to 00_Management/ are forbidden&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This has a structural flaw.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Textual rules act at read time. They assume the agent will choose to comply.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There is no mechanism that enforces this choice at execution time.&lt;/p&gt;

&lt;p&gt;This is why GNU &lt;code&gt;rm -rf /&lt;/code&gt; refuses to run unless you pass &lt;code&gt;--no-preserve-root&lt;/code&gt;: the guard is a flag checked by the program, not a policy document.&lt;br&gt;
Physical constraints enforce at execution time.&lt;br&gt;
Textual rules act at read time, which is the wrong moment.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Verification Contamination Problem
&lt;/h2&gt;

&lt;p&gt;There is a second structural problem.&lt;/p&gt;

&lt;p&gt;If an agent evaluates its own output, it can contaminate the evaluation criteria.&lt;br&gt;
This need not be intentional: the agent carries the same failure modes from generation into evaluation.&lt;/p&gt;

&lt;p&gt;A system where tests always pass may be a system where tests don't work.&lt;/p&gt;
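
&lt;p&gt;One way to detect an evaluator that can no longer fail is to feed it outputs that are known to be bad and confirm it rejects them. The sketch below is a generic sanity check, not part of AOS; the function names and the &lt;code&gt;requires_tests&lt;/code&gt; criterion are hypothetical.&lt;/p&gt;

```python
def evaluator_is_live(evaluate, known_bad_outputs):
    """Return True only if the evaluator rejects every known-bad output."""
    bad = list(known_bad_outputs)
    if not bad:
        # With nothing to reject, the check would pass vacuously.
        raise ValueError("need at least one known-bad output")
    return all(not evaluate(output) for output in bad)


def always_pass(output):
    # A contaminated evaluator: it accepts anything.
    return True


def requires_tests(output):
    # Hypothetical criterion: the output must define at least one test.
    return "def test_" in output
```

&lt;p&gt;Here &lt;code&gt;evaluator_is_live(always_pass, ["garbage"])&lt;/code&gt; is false, which flags the evaluator itself as broken, while &lt;code&gt;requires_tests&lt;/code&gt; passes the same check.&lt;/p&gt;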




&lt;h2&gt;
  
  
  What AOS Defines
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AI Operating Standard (AOS)&lt;/strong&gt; defines the minimum physical constraint layer&lt;br&gt;
for AI agent operations in a shared codebase.&lt;/p&gt;

&lt;p&gt;Three components:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Zones — Classify every path into one of three types
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Zone&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Write Permission&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Oracle&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read-only, absolute&lt;/td&gt;
&lt;td&gt;No agent may write&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Permitted&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent workspace&lt;/td&gt;
&lt;td&gt;Allowed within role limits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prohibited&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Out of scope&lt;/td&gt;
&lt;td&gt;Sovereign authorization only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
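
&lt;p&gt;The zone table can be read as a small classification function. The following is a minimal sketch, not the AOS specification's own algorithm. It assumes the two directories named in the rules earlier (&lt;code&gt;evals/&lt;/code&gt; and &lt;code&gt;00_Management/&lt;/code&gt;) are Oracle paths, and the permitted prefixes &lt;code&gt;src&lt;/code&gt; and &lt;code&gt;docs&lt;/code&gt; are invented for illustration.&lt;/p&gt;

```python
from enum import Enum
from pathlib import PurePosixPath


class Zone(Enum):
    ORACLE = "oracle"          # read-only: no agent may write
    PERMITTED = "permitted"    # agent workspace
    PROHIBITED = "prohibited"  # out of scope without Sovereign authorization


# Hypothetical zone map; a real deployment would load this from configuration.
ORACLE_DIRS = ("evals", "00_Management")
PERMITTED_DIRS = ("src", "docs")


def classify(path: str):
    """Return the Zone for a repository-relative POSIX path."""
    pure = PurePosixPath(path)
    parts = pure.parts
    # Absolute paths and parent traversal are treated as out of scope.
    if not parts or pure.is_absolute() or ".." in parts:
        return Zone.PROHIBITED
    top = parts[0]
    if top in ORACLE_DIRS:
        return Zone.ORACLE
    if top in PERMITTED_DIRS:
        return Zone.PERMITTED
    # Anything unclassified defaults to the most restrictive non-Oracle zone.
    return Zone.PROHIBITED


def may_write(path: str, sovereign_authorized: bool = False):
    """Apply the Write Permission column; role limits are not modeled here."""
    zone = classify(path)
    if zone is Zone.ORACLE:
        return False
    if zone is Zone.PERMITTED:
        return True
    return sovereign_authorized
```

&lt;p&gt;Defaulting unknown paths to Prohibited is a design choice in this sketch: a path nobody classified should fail closed rather than open.&lt;/p&gt;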

&lt;h3&gt;
  
  
  2. Roles — Non-overlapping responsibilities
&lt;/h3&gt;

&lt;p&gt;Three roles: Architect, Executor, Sovereign.&lt;br&gt;
An agent MUST NOT act outside its assigned role.&lt;br&gt;
When a role boundary is reached, the agent stops and escalates to a human.&lt;/p&gt;
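
&lt;p&gt;The stop-and-escalate rule might look like the sketch below. The article names the three roles but not what each may do, so the action sets here are purely hypothetical, as are &lt;code&gt;EscalationRequired&lt;/code&gt;, &lt;code&gt;check_disjoint&lt;/code&gt;, and &lt;code&gt;perform&lt;/code&gt;.&lt;/p&gt;

```python
from enum import Enum


class Role(Enum):
    ARCHITECT = "architect"
    EXECUTOR = "executor"
    SOVEREIGN = "sovereign"


class EscalationRequired(Exception):
    """Raised when an action falls outside the acting role."""


# Hypothetical action sets, chosen only to illustrate non-overlap.
ALLOWED = {
    Role.ARCHITECT: frozenset({"write_plan"}),
    Role.EXECUTOR: frozenset({"write_code"}),
    Role.SOVEREIGN: frozenset({"authorize"}),
}


def check_disjoint(allowed: dict):
    """Return True if no action is granted to more than one role."""
    seen = set()
    for actions in allowed.values():
        if seen.intersection(actions):
            return False
        seen.update(actions)
    return True


def perform(role: Role, action: str):
    """Allow the action, or stop and hand the decision to a human."""
    if action not in ALLOWED[role]:
        raise EscalationRequired(role.value + " may not " + action)
    return action
```

&lt;p&gt;Raising an exception instead of returning a flag means the caller cannot silently continue past a role boundary.&lt;/p&gt;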

&lt;h3&gt;
  
  
  3. Physical Enforcement — Intercept at execution time
&lt;/h3&gt;

&lt;p&gt;A PreToolUse hook blocks Write operations before filesystem access occurs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write to Oracle Zone → exit 2 (call is never executed)&lt;/li&gt;
&lt;li&gt;Destructive patterns (&lt;code&gt;sed -i&lt;/code&gt;, &lt;code&gt;perl -i&lt;/code&gt;, &lt;code&gt;truncate&lt;/code&gt;) → exit 2&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No assumption of agent goodwill. The hook, not the agent, decides whether the call runs.&lt;/p&gt;
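
&lt;p&gt;A minimal sketch of such a hook follows. It assumes Claude Code passes the pending tool call to the hook as JSON on stdin with &lt;code&gt;tool_name&lt;/code&gt; and &lt;code&gt;tool_input&lt;/code&gt; fields, and treats exit code 2 as a block. The Oracle directory list, the decision to cover &lt;code&gt;Edit&lt;/code&gt; as well as &lt;code&gt;Write&lt;/code&gt;, and the names &lt;code&gt;decide&lt;/code&gt; and &lt;code&gt;run&lt;/code&gt; are illustrative assumptions, not iron_cage's actual code.&lt;/p&gt;

```python
import json
import re

# Hypothetical Oracle directories; a real hook would load these from config.
ORACLE_DIRS = ("evals/", "00_Management/")

# The in-place edit patterns named in the list above.
DESTRUCTIVE = re.compile(r"\b(sed\s+-i|perl\s+-i|truncate)\b")


def decide(event: dict):
    """Return (exit_code, reason) for one pending tool call."""
    tool = event.get("tool_name", "")
    tool_input = event.get("tool_input", {})
    if tool in ("Write", "Edit"):
        path = tool_input.get("file_path", "")
        # Crude match on relative and absolute paths; this can over-block.
        if any(path.startswith(d) or ("/" + d) in path for d in ORACLE_DIRS):
            return 2, "write to Oracle Zone blocked: " + path
    if tool == "Bash":
        command = tool_input.get("command", "")
        if DESTRUCTIVE.search(command):
            return 2, "destructive pattern blocked: " + command
    return 0, ""


def run(stdin, stderr):
    """Read one event, report any block on stderr, return the exit code."""
    code, reason = decide(json.load(stdin))
    if reason:
        print(reason, file=stderr)
    return code
```

&lt;p&gt;Installed as a script, the entry point would be &lt;code&gt;sys.exit(run(sys.stdin, sys.stderr))&lt;/code&gt;. The pattern matching is deliberately simple and would miss variants such as &lt;code&gt;perl -pi&lt;/code&gt;; it errs toward showing the control flow rather than complete coverage.&lt;/p&gt;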




&lt;h2&gt;
  
  
  Reference Implementation: iron_cage
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;iron_cage&lt;/strong&gt; is the AOS reference implementation.&lt;br&gt;
It implements §4.1–§4.5 of the specification through Claude Code's PreToolUse hook system.&lt;/p&gt;

&lt;p&gt;Behind iron_cage is a design principle called &lt;strong&gt;Type-91 Governance&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Forensic isolation&lt;/strong&gt; — physical evidence trails that are tamper-evident&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Physical isolation&lt;/strong&gt; — agents cannot modify their own evaluation criteria&lt;/li&gt;
&lt;/ul&gt;
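
&lt;p&gt;The article does not describe how iron_cage builds its evidence trail. One common way to make a log tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below uses that technique as an illustration only; &lt;code&gt;GENESIS&lt;/code&gt;, &lt;code&gt;append&lt;/code&gt;, and &lt;code&gt;verify&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict):
    """Hash the record together with the previous entry's hash."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list, record: dict):
    """Append a record whose hash depends on everything before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list):
    """Return False if any entry was altered, removed, or reordered."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

&lt;p&gt;A chain like this only detects edits. An agent that can rewrite the whole log can also recompute every hash, so the log itself would need to live somewhere the agent cannot write.&lt;/p&gt;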

&lt;p&gt;The scripts are the surface. The architecture runs deeper.&lt;/p&gt;

&lt;p&gt;AOS is the standard. iron_cage is the proof that it works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specification (AOS-v0.1):&lt;/strong&gt; &lt;a href="https://github.com/aos-standard/AOS-spec" rel="noopener noreferrer"&gt;https://github.com/aos-standard/AOS-spec&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Feed the Spec to the Agent
&lt;/h2&gt;

&lt;p&gt;This specification is not written only for human readers.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;AOS-v0.1.md&lt;/code&gt; opens with &lt;strong&gt;§0: Machine-Reading Instructions&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Load this spec into an agent's context window, and the agent understands —&lt;br&gt;
at specification level — what it must not do.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Not "do not do X because the prompt says so."&lt;br&gt;
"Do not do X because the specification defines it as a hard constraint&lt;br&gt;
with a physical enforcement mechanism."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the second design intent of AOS:&lt;br&gt;
agents that read the spec become self-constraining.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Now
&lt;/h2&gt;

&lt;p&gt;In 2026, "how do you trust what an AI agent produced" remains unsolved.&lt;/p&gt;

&lt;p&gt;Most teams are still trying to solve it with prompts.&lt;/p&gt;

&lt;p&gt;There is no standard for the physical governance layer.&lt;br&gt;
Someone has to define it.&lt;/p&gt;

&lt;p&gt;AOS is that attempt.&lt;/p&gt;




&lt;h2&gt;
  
  
  This Is a Draft
&lt;/h2&gt;

&lt;p&gt;AOS v0.1 is not a finished standard.&lt;/p&gt;

&lt;p&gt;Issues, pull requests, and implementation reports are welcome.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/aos-standard/AOS-spec" rel="noopener noreferrer"&gt;https://github.com/aos-standard/AOS-spec&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>security</category>
    </item>
  </channel>
</rss>
