<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: slucerodev</title>
    <description>The latest articles on DEV Community by slucerodev (@slucerodev).</description>
    <link>https://dev.to/slucerodev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3873900%2Ff57d7efc-a90c-400f-84f4-ff111c4cb0f6.png</url>
      <title>DEV Community: slucerodev</title>
      <link>https://dev.to/slucerodev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/slucerodev"/>
    <language>en</language>
    <item>
      <title>Your AI Agent Just Did Something. Can You Prove What It Was?</title>
      <dc:creator>slucerodev</dc:creator>
      <pubDate>Sat, 11 Apr 2026 18:06:38 +0000</pubDate>
      <link>https://dev.to/slucerodev/your-ai-agent-just-did-something-can-you-prove-what-it-was-1gdc</link>
      <guid>https://dev.to/slucerodev/your-ai-agent-just-did-something-can-you-prove-what-it-was-1gdc</guid>
      <description>&lt;p&gt;You deployed an AI agent. It took an action. Something went wrong.&lt;/p&gt;

&lt;p&gt;Now answer these questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What exactly did it do?&lt;/li&gt;
&lt;li&gt;What input caused it?&lt;/li&gt;
&lt;li&gt;Can you reproduce the exact sequence?&lt;/li&gt;
&lt;li&gt;Can you prove to your legal team that it didn't do something else?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the answer to any of those is "I'm not sure" — you have a governance problem.&lt;/p&gt;

&lt;h2&gt;The Actual Problem&lt;/h2&gt;

&lt;p&gt;Most AI agent frameworks are built around getting things done. That's fine. But they have no answer for proving what was done, why it was done, or reconstructing the exact execution trace after the fact.&lt;/p&gt;

&lt;p&gt;Logs help. Traces help. But neither gives you deterministic replay — the ability to take a set of inputs and provably reconstruct the same execution, byte for byte, every time.&lt;/p&gt;

&lt;p&gt;Without that, your audit trail is a story you're telling. With it, it's evidence.&lt;/p&gt;
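&lt;p&gt;The byte-for-byte claim rests on canonical serialization. A minimal, standalone sketch of the idea (plain Python and hashlib, not the ExoArmur API):&lt;/p&gt;

```python
import hashlib
import json

def canonical_hash(payload: dict) -> str:
    # Sorted keys and fixed separators make the serialization independent of
    # insertion order, so the same logical payload always hashes the same.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

a = {"actor": "demo", "action": "send_email"}
b = {"action": "send_email", "actor": "demo"}  # same content, different key order

assert canonical_hash(a) == canonical_hash(b)
assert canonical_hash(a) != canonical_hash({"actor": "demo", "action": "delete_db"})
```

&lt;p&gt;This is the same trick the quick-start snippet below uses for its payload hash: hash the canonical bytes, not the in-memory object.&lt;/p&gt;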

&lt;h2&gt;What ExoArmur Does&lt;/h2&gt;

&lt;p&gt;ExoArmur is a governance layer that sits between your AI decision engine and your execution targets. It doesn't replace your agent framework. It wraps it.&lt;/p&gt;

&lt;p&gt;Every action that passes through ExoArmur:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gets evaluated by a policy decision point before it runs&lt;/li&gt;
&lt;li&gt;Produces a cryptographically bound audit record tied to the original intent&lt;/li&gt;
&lt;li&gt;Is deterministically replayable — same inputs always reconstruct the same trace&lt;/li&gt;
&lt;li&gt;Can be vetoed or queued for human operator approval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pipeline looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Decision Source → ActionIntent → PolicyDecisionPoint → SafetyGate → [Approval?] → Executor → ExecutionProofBundle&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The key invariant: no action executes without passing through the governance boundary. Executors are untrusted plugins. The core is immutable. CI enforces determinism with a three-run stability gate on every push.&lt;/p&gt;
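&lt;p&gt;The stage names below mirror the pipeline diagram, but the implementation is an illustrative sketch of the governance-boundary invariant, not ExoArmur's actual classes:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionIntent:
    actor: str
    action: str
    target: str

class PolicyDecisionPoint:
    # Toy policy: a deny-list. The point is the shape, not the policy itself.
    def __init__(self, denied_actions):
        self.denied_actions = set(denied_actions)

    def evaluate(self, intent: ActionIntent) -> str:
        return "deny" if intent.action in self.denied_actions else "allow"

def govern(intent, pdp, executor):
    # The invariant: the executor is only ever reached through this boundary.
    decision = pdp.evaluate(intent)
    if decision != "allow":
        return {"executed": False, "decision": decision}
    return {"executed": True, "decision": decision, "result": executor(intent)}

pdp = PolicyDecisionPoint(denied_actions=["delete_db"])
ok = govern(ActionIntent("agent-1", "send_email", "ops"), pdp, lambda i: "sent")
blocked = govern(ActionIntent("agent-1", "delete_db", "prod"), pdp, lambda i: "boom")
assert ok["executed"] and not blocked["executed"]
```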

&lt;h2&gt;See It in 5 Minutes&lt;/h2&gt;

&lt;pre&gt;&lt;code class="language-bash"&gt;pip install exoarmur-core&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then run this:&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-python"&gt;from exoarmur import ReplayEngine
from exoarmur.replay.event_envelope import CanonicalEvent
import hashlib, json

payload = {"kind": "inline", "ref": {"event_id": "01ARZ3NDEKTSV4RRFFQ69G5FAV"}}
event = CanonicalEvent(
    event_id="01ARZ3NDEKTSV4RRFFQ69G5FAV",
    event_type="belief_creation_started",
    actor="demo",
    correlation_id="corr-1",
    payload=payload,
    payload_hash=hashlib.sha256(
        json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    ).hexdigest(),
)

engine = ReplayEngine(audit_store={"corr-1": [event]})
report = engine.replay_correlation("corr-1")

print("Replay result:", getattr(report.result, "value", report.result))
print("Failures:", report.failures or "none")
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Replay result: success
Failures: none
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That's a real deterministic replay over a real cryptographically bound audit event. Not a mock. Not a demo stub. The same code runs in CI on every push.&lt;/p&gt;

&lt;h2&gt;The Governance Pipeline&lt;/h2&gt;

&lt;p&gt;Turn on the full V2 governance pipeline with three environment flags:&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-bash"&gt;EXOARMUR_FLAG_V2_FEDERATION_ENABLED=true \
EXOARMUR_FLAG_V2_CONTROL_PLANE_ENABLED=true \
EXOARMUR_FLAG_V2_OPERATOR_APPROVAL_REQUIRED=true \
python scripts/demo_v2_restrained_autonomy.py --operator-decision deny
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;DEMO_RESULT=DENIED
ACTION_EXECUTED=false
AUDIT_STREAM_ID=det-...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The action was vetoed. The denial is in the audit trail. Replay the stream:&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-bash"&gt;python scripts/demo_v2_restrained_autonomy.py --replay&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Same trace. Every time. Provably.&lt;/p&gt;

&lt;h2&gt;What It's Not&lt;/h2&gt;

&lt;p&gt;ExoArmur is not an LLM. Not an agent framework. Not a workflow engine. It doesn't care what's making decisions — it only cares that whatever executes passes through the governance boundary and leaves a verifiable trail.&lt;/p&gt;

&lt;p&gt;Use it with LangChain. Use it with CrewAI. Use it with a custom decision layer. It wraps whatever you have.&lt;/p&gt;
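&lt;p&gt;As a sketch of what "wraps it" can mean in practice, here is a hypothetical &lt;code&gt;governed&lt;/code&gt; decorator (not the real ExoArmur, LangChain, or CrewAI API) that routes any tool function through a policy check before it runs:&lt;/p&gt;

```python
import functools

def governed(policy_check):
    # Hypothetical wrapper: any tool function, from any framework, passes
    # through a policy check first. Denied calls never reach the tool body.
    def decorator(tool_fn):
        @functools.wraps(tool_fn)
        def wrapper(*args, **kwargs):
            if not policy_check(tool_fn.__name__, args, kwargs):
                return {"status": "denied", "tool": tool_fn.__name__}
            return {"status": "executed", "result": tool_fn(*args, **kwargs)}
        return wrapper
    return decorator

deny_drops = lambda name, args, kwargs: name != "drop_table"

@governed(policy_check=deny_drops)
def send_report(recipient):
    return "report sent to " + recipient

@governed(policy_check=deny_drops)
def drop_table(table):
    return "dropped " + table

assert send_report("ops")["status"] == "executed"
assert drop_table("users")["status"] == "denied"
```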

&lt;h2&gt;Why Determinism Is the Core Bet&lt;/h2&gt;

&lt;p&gt;The three-run stability gate in CI isn't ceremony. It's the central guarantee: if your system can't reproduce the same trace from the same inputs, your audit trail isn't an audit trail. It's a log. Logs can be explained away. Deterministic replay cannot.&lt;/p&gt;
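&lt;p&gt;The idea behind a multi-run stability gate fits in a few lines. Here &lt;code&gt;run_pipeline&lt;/code&gt; is a hypothetical stand-in for a governed execution, not ExoArmur code; the gate just demands identical digests across runs:&lt;/p&gt;

```python
import hashlib
import json

def run_pipeline(inputs):
    # Stand-in for a governed execution: a pure function of its inputs that
    # emits an event list, hashed in canonical form.
    events = [{"step": i, "value": v * 2} for i, v in enumerate(inputs)]
    blob = json.dumps(events, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

def stability_gate(inputs, runs=3):
    # The CI check: replay the same inputs N times; any divergence fails the gate.
    digests = {run_pipeline(inputs) for _ in range(runs)}
    return len(digests) == 1

assert stability_gate([1, 2, 3])
```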

&lt;p&gt;This matters right now because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enterprises deploying AI agents are facing internal compliance reviews&lt;/li&gt;
&lt;li&gt;Regulated industries (finance, healthcare, legal) need more than "the model decided"&lt;/li&gt;
&lt;li&gt;Incident response for AI systems has no tooling — ExoArmur is a start&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Try It&lt;/h2&gt;

&lt;pre&gt;&lt;code class="language-bash"&gt;git clone https://github.com/slucerodev/ExoArmur-Core.git
cd ExoArmur-Core
pip install ".[dev]"
python -m pytest -q
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;1033 tests. Three deterministic runs. No external infrastructure required for the core suite.&lt;/p&gt;

&lt;p&gt;Repo: github.com/slucerodev/ExoArmur-Core&lt;/p&gt;

&lt;p&gt;If you're building agents in production and care about what they actually did — this is built for you.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
