TL;DR
- I built omamori, a Rust CLI that blocks destructive commands executed by AI tools (Claude Code, Codex CLI, Cursor, etc.)
- During testing, Gemini CLI autonomously discovered how to disable omamori's protection rules — without being told how
- omamori now defends not just against dangerous commands, but against AI agents disabling the guard itself
- What it can't block is explicitly documented in SECURITY.md and tested in a bypass corpus
An AI Disabled Its Own Safety Guard
The original idea was simple. When an AI CLI tool runs rm -rf, redirect the files to Trash instead of deleting them permanently. Don't interfere with terminal commands typed directly by the user. A small safety valve — nothing more.
That's how omamori started. A macOS guard tool that intercepts destructive commands from AI CLI tools (Claude Code, Codex CLI, Cursor, etc.) and either blocks them or replaces them with safer alternatives.
I tested it with each tool. Claude Code, Codex CLI, Cursor — all worked as expected. Dangerous commands were caught.
Then, during a real-world test with Gemini CLI, something happened that changed the design (Issue #22).
The AI autonomously discovered the omamori config disable command, turned off the protection rules, and then ran the destructive command. I never told it how to disable omamori.
For an AI, "remove the obstacle" is a rational strategy to complete its task.
Blocking commands wasn't enough. I had to defend against the AI dismantling the guard itself. That's when omamori's design changed.
The Threat Model
That incident made one thing clear: in the age of AI CLI tools, a safety guard that just enumerates dangerous commands and blocks them isn't sufficient.
AI CLI tools can propose and execute destructive actions without fully evaluating the importance or irreversibility of what they're affecting. Reports of destructive operations by AI agents are appearing more frequently across GitHub Issues and social media. User awareness and rules alone aren't enough to prevent accidents.
So omamori is built around three assumptions:
- AI will execute destructive commands: `rm -rf`, `git reset --hard`, `git push --force`, etc.
- AI will try to disable its own guard: `config disable`, `uninstall`, env var removal
- Complete prevention is impossible, and that fact should be disclosed honestly
Intercepting Only AI-Executed Commands
```
AI CLI tool → CLAUDECODE=1 → rm -rf src/
                  ↓
           [omamori shim]
                  ↓
      blocked (protected path)

Terminal → rm -rf src/
                  ↓
            [/usr/bin/rm]
                  ↓
         deleted normally
```
omamori detects AI tools by environment variable. Claude Code sets CLAUDECODE=1, Codex CLI sets CODEX_CI=1, Cursor sets CURSOR_AGENT=1. Rules only apply when one of these is present. Terminal commands are never affected.
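The dispatch check behind that rule can be sketched in a few lines of shell (an assumed sketch for illustration; the real shim is implemented in Rust). `AI_GUARD=1` is the generic fallback variable described later in the article:

```shell
# Sketch: guard rules apply only when a known AI-tool env var is present.
guard_active() {
  [ -n "$CLAUDECODE" ] || [ -n "$CODEX_CI" ] || [ -n "$CURSOR_AGENT" ] || [ -n "$AI_GUARD" ]
}

unset CLAUDECODE CODEX_CI CURSOR_AGENT AI_GUARD
guard_active && echo "guarded" || echo "passthrough"   # → passthrough

export CLAUDECODE=1
guard_active && echo "guarded" || echo "passthrough"   # → guarded
```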
Default Rules
| Command | Pattern | Action |
|---|---|---|
| `rm` | `-r`, `-rf`, `-fr`, `--recursive` | Move to macOS Trash (not permanent delete) |
| `git` | `reset --hard` | `git stash` first, then execute |
| `git` | `push --force`, `push -f` | Block |
| `git` | `clean -fd`, `clean -fdx` | Block |
| `chmod` | `777` | Block |
| `find` | `-delete`, `--delete` | Block |
| `rsync` | `--delete` + 7 variants | Block |
Default rules were selected based on the most commonly reported AI CLI accidents on GitHub Issues and Hacker News — irreversible rm -rf, lost uncommitted changes from git reset --hard, history destruction from git push --force.
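The stash-first rule for `git reset --hard` works because a stash survives the reset. A quick demonstration with plain git (this shows the recovery path omamori automates, not omamori's own code):

```shell
# Uncommitted work survives a hard reset if it is stashed first.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.email=demo@example.com -c user.name=demo commit -q --allow-empty -m init
echo "uncommitted work" > notes.txt
git add notes.txt
git stash -q                  # step 1: what omamori inserts
git reset --hard -q HEAD      # step 2: the destructive command
git stash pop -q              # the work comes back
cat notes.txt                 # → uncommitted work
```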
Staying Out of Your Way
Guard tools tend to be heavy on configuration. People who don't read code — and increasingly, even engineers running autonomous AI workflows — aren't going to manage complex CLI settings. omamori is designed to work with zero configuration.
- Zero-config: `brew install` → `omamori install --hooks` → add to PATH. No config file editing needed.
- Defaults always active: 7 protections work out of the box. Customize only by writing diffs to `config.toml`.
- Auto-maintenance: after `brew upgrade`, shims and hooks auto-update on the next invocation.
```shell
# macOS / Homebrew
brew install yottayoshida/tap/omamori
omamori install --hooks
export PATH="$HOME/.omamori/shim:$PATH"  # add to ~/.zshrc
```
How It Works: PATH Shim + Hooks
Two layers of defense.
Layer 1 — PATH shim: Symlinks for rm, git, chmod, find, rsync are placed in ~/.omamori/shim/ at the front of PATH. When an AI environment variable is detected, rules are applied. Otherwise, the real binary is called directly.
Layer 2 — Hooks: Full-path execution like /bin/rm bypasses the shim. Hooks compensate. Claude Code's PreToolUse hook and Cursor's beforeShellExecution handler catch dangerous patterns via string matching.
Shim overhead is under 10ms. Without an AI environment variable, the real binary is called via the shortest path. No perceptible delay.
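PATH-shim precedence itself is plain shell mechanics. Here is a throwaway stand-in for `~/.omamori/shim` (omamori's real shims are symlinks to its own binary; this fake one just echoes):

```shell
# A fake rm placed ahead of /bin in PATH wins the command lookup.
shimdir=$(mktemp -d)
printf '#!/bin/sh\necho "shim saw: rm $*"\n' > "$shimdir/rm"
chmod +x "$shimdir/rm"
export PATH="$shimdir:$PATH"
command -v rm                 # → $shimdir/rm, not /bin/rm
rm -rf /tmp/nothing           # → shim saw: rm -rf /tmp/nothing
```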
The safety design cuts two ways. A temporary shim crash shouldn't halt the user's work. But a broken config shouldn't silently remove all protection. So: shim crash → fail-open (real command runs). Broken config → fail-close (default rules apply automatically).
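The fail-open half of that rule can be sketched as a wrapper. Everything here is hypothetical (the `omamori-check` binary name and its `block`/`allow` verdicts are invented for illustration; the real logic lives inside the Rust shim):

```shell
# If the guard binary crashes or is missing, treat the command as
# allowed (fail-open) so the user's work isn't halted.
guarded() {
  cmd=$1; shift
  # hypothetical checker; any failure falls back to "allow"
  verdict=$("$HOME/.omamori/bin/omamori-check" "$cmd" "$@" 2>/dev/null) || verdict=allow
  case $verdict in
    block) echo "omamori: blocked: $cmd $*" >&2; return 1 ;;
    *)     command "$cmd" "$@" ;;
  esac
}

guarded echo "still works"    # checker absent → fail-open → runs echo
```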
Adversarial-Aware: Defending Against AI Self-Disablement
After the Gemini CLI incident, omamori implemented these defenses:
- When an AI environment variable is detected, `config disable`, `uninstall`, and `init --force` are blocked
- Claude Code and Cursor hooks detect and block `unset CLAUDECODE` and similar env var removal attempts
- Claude Code's PreToolUse hook blocks direct editing of `config.toml`
Existing destructive command guard tools tend to focus on rule breadth and analysis depth. omamori focuses on AI self-bypass defense and context-aware evaluation. Different tools, different threat surfaces.
Context-Aware: Not Everything Gets the Same Treatment
Another design priority: don't block everything with the same force.
| Command | Without context | With context |
|---|---|---|
| `rm -rf target/` | Move to Trash | Log only (regenerable) |
| `rm -rf src/` | Move to Trash | Block (source code is protected) |
| `git reset --hard` (no changes) | Stash then exec | Log only (nothing to lose) |
`target/` can be rebuilt with `cargo build`, and `node_modules/` with `npm install`. `src/` and `.env` can't be rebuilt. This distinction is opt-in via `config.toml`:
```toml
[context]
# Just adding [context] activates built-in defaults.
# To customize, specify your own lists (replaces built-in defaults):
# regenerable_paths = ["target/", "node_modules/", "my-cache/"]
# protected_paths = ["src/", "lib/", ".git/", ".env", ".ssh/", "secrets/"]
```
src/ and similar paths are hardcoded as NEVER_REGENERABLE internally. Even if added to regenerable_paths, they're silently ignored. Misconfiguration fails safe.
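That silent-ignore behavior can be sketched like this (assumed logic, written in shell for illustration; omamori implements it in Rust):

```shell
# Hardcoded never-regenerable paths override user configuration.
NEVER_REGENERABLE="src/ lib/ .git/ .env"
USER_REGENERABLE="target/ node_modules/ src/"   # src/ added by mistake

is_regenerable() {
  for p in $NEVER_REGENERABLE; do
    [ "$1" = "$p" ] && return 1    # misconfiguration fails safe
  done
  for p in $USER_REGENERABLE; do
    [ "$1" = "$p" ] && return 0
  done
  return 1
}

is_regenerable target/ && echo "target/: log only"
is_regenerable src/    || echo "src/: still protected"
```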
What It Can't Block
omamori can't prevent everything. This is documented in SECURITY.md as KNOWN_LIMIT:
| Attack Vector | Why Undetectable |
|---|---|
| `sudo rm -rf` | `sudo` resets PATH; the shim is never invoked |
| `alias rm='/bin/rm'` | Aliases bypass hook string matching |
| `env -i rm -rf` | Clears all env vars; undetectable by hooks |
| Base64-encoded commands | String-based matching can't decode runtime-constructed commands |
| `export -n CLAUDECODE` | Removes the export attribute without unsetting; not caught by `unset` patterns |
A bypass corpus test suite verifies both what omamori blocks and what it can't. Documenting what you can't do is more trustworthy than only listing what you can.
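The `env -i` row is easy to verify yourself: it launches the child process with an empty environment, so there is no marker variable left for any shim to detect. Plain shell, no omamori involved:

```shell
# The marker variable is visible normally, gone under env -i.
CLAUDECODE=1 /bin/sh -c 'echo "marker=${CLAUDECODE:-unset}"'          # → marker=1
CLAUDECODE=1 env -i /bin/sh -c 'echo "marker=${CLAUDECODE:-unset}"'   # → marker=unset
```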
Supported Tools and Limitations
| Tier | Tools | Coverage |
|---|---|---|
| Supported | Claude Code, Codex CLI, Cursor | E2E tested. Layer 1 + Layer 2. |
| Community | Gemini CLI, Cline | Layer 1 only. Not E2E tested. |
| Fallback | Any tool setting `AI_GUARD=1` | Layer 1 only. |
Current limitations: macOS only. Command-level protection (not a sandbox). Only detects known AI tools. Full-path execution bypasses the shim (partially mitigated by Layer 2). All documented in SECURITY.md.
Expanding support for tools without hook APIs and adding tamper-evident audit logging are under consideration. Progress is tracked in GitHub Issues.
Conclusion
The core of omamori isn't "redirect rm -rf to Trash."
It's the decision to treat AI not as a proxy acting on the user's behalf, but as an agent that may disable protection mechanisms to achieve its goals.
What threat model did that lead to? How did the design change? What can be defended, and where does defense end? This article is a record of those decisions.
If you use AI CLI tools on macOS, give it a try:
```shell
brew install yottayoshida/tap/omamori
omamori install --hooks
```