<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aditya P Dixit</title>
    <description>The latest articles on DEV Community by Aditya P Dixit (@hootsworth).</description>
    <link>https://dev.to/hootsworth</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4005747%2Fe8ed1ab9-bcef-444a-b867-f8669ecc20ea.jpg</url>
      <title>DEV Community: Aditya P Dixit</title>
      <link>https://dev.to/hootsworth</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hootsworth"/>
    <language>en</language>
    <item>
      <title>I Got Tired of AI Agents Having Root Access to Everything, So I Built XRisk</title>
      <dc:creator>Aditya P Dixit</dc:creator>
      <pubDate>Sat, 27 Jun 2026 19:13:01 +0000</pubDate>
      <link>https://dev.to/hootsworth/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk-11k5</link>
      <guid>https://dev.to/hootsworth/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk-11k5</guid>
      <description>&lt;p&gt;Everyone is building AI agents.&lt;/p&gt;

&lt;p&gt;Very few people are building the thing that sits between an AI agent and a disastrous decision.&lt;/p&gt;

&lt;p&gt;That's why I built XRisk.&lt;/p&gt;

&lt;p&gt;XRisk is an open-source autonomous safety engine that acts as a decision layer between an AI agent and the real world.&lt;/p&gt;

&lt;p&gt;Instead of blindly executing an action, an agent asks XRisk:&lt;/p&gt;

&lt;p&gt;"Should I actually do this?"&lt;/p&gt;

&lt;p&gt;XRisk responds with one of three deterministic decisions:&lt;/p&gt;

&lt;p&gt;✅ Allow&lt;br&gt;
⚠️ Confirm&lt;br&gt;
❌ Block&lt;/p&gt;

&lt;p&gt;Why I Started This Project&lt;/p&gt;

&lt;p&gt;As I experimented with increasingly autonomous AI systems, I noticed the same pattern over and over again.&lt;/p&gt;

&lt;p&gt;Most projects focused on making agents more capable.&lt;/p&gt;

&lt;p&gt;Almost nobody was asking:&lt;/p&gt;

&lt;p&gt;"What happens when the agent is wrong?"&lt;/p&gt;

&lt;p&gt;Consider a few examples.&lt;/p&gt;

&lt;p&gt;An agent accidentally leaks API keys.&lt;br&gt;
A prompt injection convinces it to ignore previous instructions.&lt;br&gt;
A model decides on its own to execute a destructive shell command.&lt;br&gt;
An autonomous workflow loops forever and keeps calling expensive APIs.&lt;br&gt;
A deployment bot pushes code without human approval.&lt;/p&gt;

&lt;p&gt;Most agent frameworks assume the model behaves.&lt;/p&gt;

&lt;p&gt;In practice, it often doesn't.&lt;/p&gt;

&lt;p&gt;I wanted something deterministic sitting between intention and execution.&lt;/p&gt;

&lt;p&gt;Not another model.&lt;/p&gt;

&lt;p&gt;Not another prompt.&lt;/p&gt;

&lt;p&gt;An actual policy engine.&lt;/p&gt;

&lt;p&gt;What XRisk Does&lt;/p&gt;

&lt;p&gt;XRisk evaluates every proposed action before it's executed.&lt;/p&gt;

&lt;p&gt;It combines multiple safety signals into a single explainable decision.&lt;/p&gt;

&lt;p&gt;Some of the things it checks include:&lt;/p&gt;

&lt;p&gt;Policy-as-code with layered precedence&lt;br&gt;
Prompt injection detection&lt;br&gt;
Sensitive data and secret detection&lt;br&gt;
Capability token validation&lt;br&gt;
Network egress restrictions&lt;br&gt;
Circuit breakers for autonomous loops&lt;br&gt;
Tamper-evident audit logs&lt;br&gt;
Supply-chain verification&lt;br&gt;
Policy conflict detection&lt;br&gt;
Deterministic forensic replay&lt;/p&gt;

&lt;p&gt;Instead of a mysterious "Safety Score: 67%," XRisk explains why it made a decision.&lt;/p&gt;
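&lt;p&gt;To make "a single explainable decision" concrete, here is a minimal sketch of one way to combine signals: the most restrictive verdict wins, and the reasons behind it travel with the result. The names Verdict, Signal, and combine are hypothetical and chosen for illustration; they are not XRisk's actual API, and failing closed when no checks ran is an assumption of this sketch.&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by severity so the most restrictive verdict wins.
    ALLOW = 0
    CONFIRM = 1
    BLOCK = 2


@dataclass(frozen=True)
class Signal:
    check: str        # hypothetical check name, e.g. "secret_detection"
    verdict: Verdict  # what this single check recommends
    reason: str       # human-readable explanation kept with the decision


def combine(signals):
    """Return the most restrictive verdict plus the reasons that support it."""
    if not signals:
        # Assumption of this sketch: fail closed if nothing was evaluated.
        return Verdict.BLOCK, ["no checks ran"]
    final = max(s.verdict for s in signals)
    reasons = [f"{s.check}: {s.reason}" for s in signals if s.verdict == final]
    return final, reasons


signals = [
    Signal("policy", Verdict.CONFIRM, "production deploys require approval"),
    Signal("secret_detection", Verdict.ALLOW, "no secrets found in payload"),
    Signal("prompt_injection", Verdict.BLOCK, "matched 'ignore previous instructions'"),
]
verdict, reasons = combine(signals)
print(verdict.name, reasons)
```

&lt;p&gt;Because the result is a verdict plus a list of reasons rather than a score, the caller can show exactly which check triggered the outcome.&lt;/p&gt;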

&lt;p&gt;Example&lt;/p&gt;

&lt;p&gt;Imagine an AI assistant wants to execute:&lt;/p&gt;

&lt;p&gt;{&lt;br&gt;
  "tool": "deploy",&lt;br&gt;
  "actor": "release-bot",&lt;br&gt;
  "prompt": "Deploy production immediately."&lt;br&gt;
}&lt;/p&gt;

&lt;p&gt;Instead of sending that directly to your deployment system...&lt;/p&gt;

&lt;p&gt;XRisk intercepts it.&lt;/p&gt;

&lt;p&gt;It evaluates:&lt;/p&gt;

&lt;p&gt;Does policy require approval?&lt;br&gt;
Is the actor allowed to deploy?&lt;br&gt;
Is the destination trusted?&lt;br&gt;
Are capability tokens valid?&lt;br&gt;
Does this resemble prompt injection?&lt;br&gt;
Is this part of a dangerous execution loop?&lt;/p&gt;
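&lt;p&gt;A few of those questions can be sketched as ordinary, deterministic code. This is only an illustration under stated assumptions: POLICY, INJECTION_MARKERS, and evaluate are hypothetical names rather than XRisk's real interface, and the injection check is a naive phrase match standing in for real detection.&lt;/p&gt;

```python
# Hypothetical policy-as-code: plain data, so it can be diffed and reviewed.
POLICY = {
    "deploy": {
        "allowed_actors": {"release-bot"},
        "requires_approval": True,
    },
}

# Illustrative phrases only; real injection detection needs far more than this.
INJECTION_MARKERS = ("ignore previous instructions", "disregard the above")


def evaluate(request, approved=False):
    """Return (decision, reason) for a proposed action. Same input, same output."""
    rule = POLICY.get(request["tool"])
    if rule is None:
        return "block", "no policy defined for tool"
    if request["actor"] not in rule["allowed_actors"]:
        return "block", "actor is not allowed to use this tool"
    prompt = request.get("prompt", "").lower()
    if any(marker in prompt for marker in INJECTION_MARKERS):
        return "block", "prompt resembles an injection attempt"
    if rule["requires_approval"] and not approved:
        return "confirm", "policy requires human approval"
    return "allow", "all checks passed"


request = {
    "tool": "deploy",
    "actor": "release-bot",
    "prompt": "Deploy production immediately.",
}
print(evaluate(request))                 # no approval recorded yet
print(evaluate(request, approved=True))  # a human has signed off
```

&lt;p&gt;In this sketch the checks run in a fixed order and return on the first failure, so the reason always names the specific rule that stopped the action.&lt;/p&gt;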

&lt;p&gt;Only then does it decide whether to:&lt;/p&gt;

&lt;p&gt;Allow&lt;br&gt;
Confirm&lt;br&gt;
Block&lt;/p&gt;

&lt;p&gt;One Design Decision I Feel Strongly About&lt;/p&gt;

&lt;p&gt;I deliberately avoided using another LLM to make safety decisions.&lt;/p&gt;

&lt;p&gt;LLMs are excellent at generating text, but their output is probabilistic and can change from one run to the next.&lt;/p&gt;

&lt;p&gt;Policy enforcement should be deterministic.&lt;/p&gt;

&lt;p&gt;If an action is blocked, I want to know exactly why it was blocked.&lt;/p&gt;

&lt;p&gt;Every decision should be reproducible.&lt;/p&gt;

&lt;p&gt;Every audit should be explainable.&lt;/p&gt;

&lt;p&gt;Every policy should be inspectable.&lt;/p&gt;

&lt;p&gt;That's the philosophy behind XRisk.&lt;/p&gt;
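&lt;p&gt;One common way to make an audit trail tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that general technique only; I'm not claiming it matches XRisk's internal format, and GENESIS, entry_hash, append, and verify are hypothetical names.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    """Hash the previous hash together with a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log, record):
    """Add a record whose hash is chained to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev_hash, record)})


def verify(log):
    """Recompute the chain; an edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"tool": "deploy", "actor": "release-bot", "decision": "confirm"})
append(log, {"tool": "shell", "actor": "helper-bot", "decision": "block"})
print(verify(log))  # chain is intact here

log[0]["record"]["decision"] = "allow"  # simulate someone rewriting history
print(verify(log))  # the stored hash no longer matches the record
```

&lt;p&gt;A chain like this does not detect entries dropped from the end of the log on its own; that would need the latest hash to be stored or signed somewhere the attacker can't reach.&lt;/p&gt;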

&lt;p&gt;What's Next&lt;/p&gt;

&lt;p&gt;I'm currently working toward:&lt;/p&gt;

&lt;p&gt;Threat intelligence correlation&lt;br&gt;
Zero-trust workload identities&lt;br&gt;
Autonomous containment&lt;br&gt;
Adversarial simulation&lt;br&gt;
Multi-party approval workflows&lt;/p&gt;

&lt;p&gt;The long-term vision is to make XRisk a reusable security layer that can sit in front of any AI agent, regardless of framework.&lt;/p&gt;

&lt;p&gt;I'd Love Feedback&lt;/p&gt;

&lt;p&gt;This project is still evolving, and I'd genuinely appreciate feedback from people building AI systems.&lt;/p&gt;

&lt;p&gt;Some questions I'm particularly interested in:&lt;/p&gt;

&lt;p&gt;What attack vectors am I missing?&lt;br&gt;
Which policies would you want in production?&lt;br&gt;
What integrations would make this more useful?&lt;br&gt;
How would you design a safety engine differently?&lt;/p&gt;

&lt;p&gt;If you'd like to contribute, open an issue, suggest improvements, or submit a PR. Even small documentation fixes are welcome.&lt;/p&gt;

&lt;p&gt;Thanks for reading. I hope XRisk becomes something that helps make AI systems not just more capable, but more trustworthy.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://github.com/Hootsworth/XRisk" rel="noopener noreferrer"&gt;https://github.com/Hootsworth/XRisk&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>security</category>
    </item>
  </channel>
</rss>
