DEV Community

yhc

I Spent the Last Few Days Testing AI Agents and Got Scared — So I Built Sentinel v0.3.0

Hey dev.to community 👋

Over the past few days, I’ve been running dozens of AI Agent experiments. The more powerful they got, the more nervous I became.

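They can do amazing things, but they’re also shockingly easy to jailbreak, to push into abusing their tools, or to trick into quietly exfiltrating data. And once they go rogue? Traditional “safety inside the agent” approaches just don’t cut it, because a compromised agent can also talk its way past guardrails that live in its own process.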

So I decided to solve it differently.

The Solution: Sentinel v0.3.0 “The Shield Release”
I moved the entire security layer outside the agent, into an independent Shield Sidecar process.
The agent cannot see the Shield or kill it. Every risky action (shell commands, file I/O, API calls, etc.) must request permission from the Shield first.
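
To make the "ask the Shield first" idea concrete, here is a minimal sketch of an out-of-band permission check. It is not Sentinel's actual protocol or code: the names `decide`, `shield_loop`, `request_permission`, and `BLOCKED_COMMANDS`, the message format, and the use of a `multiprocessing` pipe are all assumptions made for illustration.

```python
import multiprocessing as mp
import shlex

# Hypothetical policy for illustration only, not Sentinel's real rule set.
BLOCKED_COMMANDS = {"rm", "curl", "wget", "nc"}


def decide(action: dict) -> dict:
    """Return an allow/deny verdict for one requested action."""
    if action.get("kind") != "shell":
        return {"allow": False, "reason": "unknown action kind"}
    command = action.get("command")
    if not isinstance(command, str):
        return {"allow": False, "reason": "missing command"}
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: refuse rather than guess.
        return {"allow": False, "reason": "unparseable command"}
    if not argv:
        return {"allow": False, "reason": "empty command"}
    if argv[0] in BLOCKED_COMMANDS:
        return {"allow": False, "reason": f"blocked command: {argv[0]}"}
    return {"allow": True, "reason": "ok"}


def shield_loop(conn) -> None:
    """Runs on the sidecar side: answer requests until told to stop."""
    while True:
        try:
            action = conn.recv()
        except EOFError:
            break
        if action is None:  # Sentinel value (assumed) meaning "shut down".
            break
        conn.send(decide(action))


def request_permission(conn, action: dict) -> dict:
    """Runs on the agent side: block until the shield answers."""
    conn.send(action)
    return conn.recv()


def demo() -> None:
    """Start the shield in a separate process and send it two requests."""
    agent_end, shield_end = mp.Pipe()
    shield = mp.Process(target=shield_loop, args=(shield_end,), daemon=True)
    shield.start()
    print(request_permission(agent_end, {"kind": "shell", "command": "ls -la"}))
    print(request_permission(agent_end, {"kind": "shell", "command": "rm -rf /"}))
    agent_end.send(None)
    shield.join()
```

Call `demo()` from under an `if __name__ == "__main__":` guard to try it. Two caveats about the sketch: a first-token blocklist is trivially bypassed (for example by `sh -c "rm ..."`), and a child process started by the agent is not truly out of the agent's reach. A real sidecar would need to run under separate privileges and default to deny.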

Key Features
• Shield Sidecar — True out-of-band protection + instant SIGKILL + forensic snapshot
• Deterministic Shadow Sandbox — Preview any action safely before it touches your real system
• Red Team Engine — 34 attack vectors, auto scoring (0-100)
• EU AI Act Compliance — One-click report generation
• Python @protect decorator + LangChain plugin
• Clean dashboard for monitoring + one-click kill
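
For readers wondering what a decorator-based guard can look like, here is a rough sketch of the general pattern. It is not Sentinel's `@protect` API, whose signature is not shown in this post; the `protect(check)` form, the `ActionDenied` exception, and the `only_under_tmp` policy are hypothetical names invented for this example.

```python
import functools
import posixpath


class ActionDenied(PermissionError):
    """Raised when the (hypothetical) policy check rejects a call."""


def protect(check):
    """Wrap a function so `check` must approve every call before it runs."""
    def wrap(func):
        @functools.wraps(func)
        def guarded(*args, **kwargs):
            if not check(func.__name__, args, kwargs):
                raise ActionDenied(f"{func.__name__} denied by policy")
            return func(*args, **kwargs)
        return guarded
    return wrap


def only_under_tmp(name, args, kwargs) -> bool:
    """Example policy (an assumption): allow paths under /tmp/ only."""
    path = args[0] if args else kwargs.get("path", "")
    # Normalize first so "/tmp/../etc/passwd" is not mistaken for a /tmp path.
    normalized = posixpath.normpath(str(path))
    return normalized.startswith("/tmp/")


@protect(only_under_tmp)
def delete_file(path: str) -> str:
    # Stand-in for a risky action; it only reports what it would do.
    return f"would delete {path}"
```

Calling `delete_file("/tmp/a.txt")` goes through, while `delete_file("/tmp/../etc/passwd")` raises `ActionDenied`. In this sketch the check runs inside the agent's own process, so on its own it gives none of the out-of-band guarantees described above. In a sidecar design the `check` callable would forward the request to the external process instead.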

Full project here: https://github.com/byte271/Sentinel
It's still very early (v0.3.0 just dropped), but I’m really happy with the direction: a strong focus on determinism, auditability, and local-first operation.
If you’re building AI Agents, I’d love your honest feedback. What safety problems are you running into the most?
Let me know in the comments! 🔥
