DEV Community: Riotaro OKADA

Hardening Cheatsheet for Codex CLI — and the Design Philosophy Was Fascinating

Riotaro OKADA — Fri, 27 Mar 2026 13:39:14 +0000

Today, I published a hardening cheatsheet for Codex CLI.

Why Hardening Matters

Codex CLI is powerful. It reads and writes files, runs shell commands, and can reach out to external services — all autonomously, on your behalf.

But give it danger-full-access with broad permissions, and that autonomy becomes the blast radius. "Let me tidy up" — and it touches files outside your workspace. "Fixed that for you" — and it rewrites your .gitconfig. Technically correct, maybe. But beyond what you intended.

Then there's indirect prompt injection. A malicious instruction buried in a fetched README, an issue body, or a dependency's postinstall script can lead Codex CLI to "add this package for you" or "run this script first" — actions you never asked for.

And here's the thing about AI confidence: humans and AI both get things wrong with full conviction, but AI doesn't show you a hesitant face. Applying a plausible-sounding suggestion without verification is risky.

"Can't I just write 'don't do that' in AGENTS.md?" — I hear this a lot. The answer is no. Instruction files are prompts, not policies. They have no enforcement power. When you want to prevent something from happening, you enforce it as policy, not request it as a favor. That's what config.toml hardening is for.

A Cat and a Sheepdog

Working with Claude Code feels like having a sharp, slightly unpredictable cat as a thinking partner. It surprises you. It pushes back. Sometimes it goes somewhere you didn't expect, and that detour turns out to be exactly right.

Codex CLI is nothing like that.

It's a sheepdog. Smart, loyal, hardworking — but it stays inside the fence. It doesn't go outside. It can't go outside. Not because of temperament, but because that's how the fence is built.

I wrote this cheatsheet to help people configure those fences well. But in the process of reading through config.toml line by line, I found the fence design itself genuinely fascinating. That's what this post is about.

The Fence

Codex CLI's security configuration is structured around four axes: sandbox, approval, network, and history. Each one functions as an independent defense layer.

If the sandbox config is too loose, approval catches it. If you accidentally approve something you shouldn't have, the network being closed keeps data from leaving. You can see the layers of defense just by looking at how the config file is organized. "Defense in depth" is a phrase that gets thrown around a lot — Codex CLI actually implements it in the structure of config.toml. That's elegant.

Network access is closed by default. You open it only when needed, through a profile. The worldview is: open is the exception. Even if prompt injection hits, data doesn't leave the fence. This "closed by default" decision, baked into the design, is quiet but significant.
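To make the layering concrete, here is a toy model in Python. It is not how Codex CLI is implemented; the `Action` and `Fence` classes and the `blocked_by` function are names made up for illustration. The point it shows is that each layer decides independently, so an action runs only if no layer objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    writes_outside_workspace: bool
    needs_network: bool


@dataclass(frozen=True)
class Fence:
    # Hypothetical model of three layers; defaults mirror "closed by default".
    sandbox_confines_writes: bool = True  # sandbox layer
    approval_required: bool = True        # approval layer
    network_open: bool = False            # network layer


def blocked_by(action: Action, fence: Fence, user_says_yes: bool) -> list[str]:
    """Return the layers that stop the action; an empty list means it runs."""
    layers = []
    if action.writes_outside_workspace and fence.sandbox_confines_writes:
        layers.append("sandbox")
    if fence.approval_required and not user_says_yes:
        layers.append("approval")
    if action.needs_network and not fence.network_open:
        layers.append("network")
    return layers


# An injected "upload this file" step that the user approved by mistake:
upload = Action(writes_outside_workspace=False, needs_network=True)
print(blocked_by(upload, Fence(), user_says_yes=True))  # ['network']
```

Even with a careless "yes", the closed network layer still stops the upload in this model. Flip `network_open` to `True` and nothing is left to catch it.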

And the name of the widest sandbox mode? Not full-access. It's danger-full-access. The name tells you that opening up is dangerous. A small thing, but this is where a designer's values show through.

Declaring Your Posture

The feature that fascinated me most was profiles. Codex CLI lets you define multiple profiles in config.toml and switch between them with --profile. You declare your working posture as a permission before you start.

Obvious in hindsight, but having it as a first-class mechanism in the config file matters.

The thing is, Codex CLI gives you the mechanism but not the names. And naming matters more than you'd think — call something full_auto and people will use it like it's full auto, regardless of what the actual permissions are.

In the cheatsheet, I named them readonly_quiet, local_write, and remote_enabled. Names that say exactly what they do. When names are honest, operations stay honest too.

Audit Was There from the Start

I was genuinely surprised to find OpenTelemetry integration sitting right there in config.toml. Tool approvals and rejections, execution results, API calls — all of it can be exported to your organization's existing observability stack.

It's not prominently featured in the docs, but this is the design of a tool built for organizations, not just individuals. That's sheepdog thinking, too — if you work on a ranch, the rancher needs to know what you did.

If You Open the Fence Yourself

Here's the thing, though. No matter how well the fence is designed, sandbox_mode = "danger-full-access" combined with approval_policy = "never" bypasses everything. If you open the fence yourself, even the sheepdog can't stop you.
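A small check can catch that combination before it bites. The sketch below is a hypothetical helper, not part of Codex CLI: `open_fence_findings` and the `yolo` profile name are invented for illustration, and it assumes the config has already been parsed into a dict (for example with `tomllib`) and that profile keys simply override top-level ones.

```python
def open_fence_findings(config: dict) -> list[str]:
    """Flag postures where the sandbox is off, at top level or in any profile."""
    top = {k: v for k, v in config.items() if k != "profiles"}
    scopes = {"(top level)": top}
    for name, overrides in config.get("profiles", {}).items():
        # Assumption: a profile inherits top-level keys it does not override.
        scopes[f"profile {name}"] = {**top, **overrides}

    findings = []
    for name, settings in scopes.items():
        if settings.get("sandbox_mode") != "danger-full-access":
            continue
        if settings.get("approval_policy") == "never":
            findings.append(f"{name}: sandbox and approval are both off")
        else:
            findings.append(f"{name}: sandbox off, approval is the only layer left")
    return findings


config = {
    "sandbox_mode": "workspace-write",
    "approval_policy": "on-request",
    "profiles": {
        "yolo": {"sandbox_mode": "danger-full-access", "approval_policy": "never"},
    },
}
for finding in open_fence_findings(config):
    print(finding)  # profile yolo: sandbox and approval are both off
```

Running something like this in CI or a pre-commit hook against a shared config is one way to keep an open gate from slipping in unnoticed.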

This cheatsheet exists so that a well-built fence doesn't go to waste. Copy the Quick Start template to ~/.codex/config.toml and you start from a secure baseline. The detailed reasoning and OWASP LLM Top 10 mapping are in the cheatsheet itself, so give it a read.

What I Took Away

AI coding tool security is still uncharted territory. Every tool is figuring it out as it goes.

Reading through Codex CLI's config.toml one line at a time, watching the designer's thinking emerge from what looks like mundane configuration — that was genuinely enjoyable. Solid, honest, unglamorous, but trustworthy. A good sheepdog.

Licensed under CC BY-SA 4.0. Feedback and contributions are welcome.

Now, if you'll excuse me — I'm grabbing a beer. Happy weekend.

Repository: okdt/codex-cli-hardening-cheatsheet

Hardening Cheatsheet for Claude Code's settings.json

Riotaro OKADA — Mon, 23 Mar 2026 06:30:16 +0000

Claude Code is remarkable. It runs shell commands, reads and writes files, connects to external services, and works autonomously toward your goals. Honestly, I can't go back to working without it.

But then I caught myself. I was reflexively choosing "yes" and slamming ENTER on every permission prompt. When you're in the zone, you don't want to stop and read what it's asking. But what if that "yes" was for rm -rf? Or git push --force? Or worse, some abstract task that internally triggers a cascade of deletions or publications, where "undo" isn't an option?

The Risks Are Real

Claude Code doesn't have malicious intent. But it can hallucinate. It can take well-intentioned actions that go far beyond what you asked for — deleting files to "clean up," force-pushing to "fix" a branch, installing packages you never requested. Good intentions, excessive action.

Then there's indirect prompt injection. The source code, documents, and web pages that Claude Code processes during normal work could contain hidden instructions. An attacker embeds a malicious prompt in a dependency or a document, and Claude follows it without question.

And here's one that keeps me up at night: we've always had risk scenarios for local exploits — what happens when an attacker gains user-level access to your machine. Now imagine Claude Code is sitting there too, with the ability to run any command your user account can run. The blast radius just got a lot bigger.

These map directly to the OWASP Top 10 for LLM Applications (2025 edition): Misinformation, which now covers overreliance (LLM09), Excessive Agency (LLM06), and Prompt Injection (LLM01).

Sandbox First

Everyone should enable sandboxing. It's the strongest protection layer available — OS-level isolation of file and network access. Just run /sandbox in Claude Code's interactive mode. The cheatsheet covers the setup and configuration.

Here's why it matters: deny rules alone have gaps. If you deny Read(**/.env), Claude's built-in Read tool is blocked. But cat .env through Bash? That goes right through. Read deny and Bash deny are separate layers that don't cover each other. The sandbox catches what deny rules miss.
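The gap is easy to see in a toy matcher. The Python below is not Claude Code's real rule engine; `parse_rule` and `is_denied` are made-up helpers, the glob matching via `fnmatch` is a simplification, and the `Bash(cat *)` rule is written in this toy syntax only. What it illustrates is that a rule applies only to calls made through its own tool.

```python
from fnmatch import fnmatchcase


def parse_rule(rule: str) -> tuple[str, str]:
    """Split 'Tool(pattern)' into ('Tool', 'pattern')."""
    tool, _, rest = rule.partition("(")
    return tool, rest.rstrip(")")


def is_denied(tool: str, argument: str, deny_rules: list[str]) -> bool:
    for rule in deny_rules:
        rule_tool, pattern = parse_rule(rule)
        # A rule only ever applies to calls made through its own tool.
        if rule_tool == tool and fnmatchcase(argument, pattern):
            return True
    return False


deny = ["Read(**/.env)"]
print(is_denied("Read", "config/.env", deny))      # True
print(is_denied("Bash", "cat config/.env", deny))  # False: different tool

# Patching the hole with a Bash rule only moves it.
deny.append("Bash(cat *)")
print(is_denied("Bash", "cat config/.env", deny))   # True
print(is_denied("Bash", "head config/.env", deny))  # False: different command
```

Every new Bash rule invites another spelling of the same read. That is why an OS-level sandbox, which restricts the file access itself rather than the command text, is the layer to reach for first.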

What Claude Code's settings.json Can Do

Claude Code has a ~/.claude/settings.json file that controls permissions:

deny — commands that are always blocked, no prompt
ask — commands that always prompt, even if you previously said "don't ask again"
allow — commands that are always permitted without prompting

Put rm -rf, sudo, and git push --force in your deny list, and Claude Code simply can't execute them, no matter how much it "wants" to.

But database commands like psql? If you deny those, you block SELECT and DROP TABLE alike. For these, ask is the practical choice — you get prompted every time and can check what's actually being run.
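Here is a minimal sketch of how those three lists could fit together, written as Python so the decision order can be run. The JSON shape and the `Bash(prefix:*)` rule form follow my understanding of Claude Code's settings format, and the deny, then ask, then allow order is likewise my reading of the documentation. The `matches` and `decide` functions are a simplified stand-in, not Claude Code's real matcher.

```python
import json

SETTINGS = json.loads("""
{
  "permissions": {
    "deny":  ["Bash(rm -rf:*)", "Bash(sudo:*)", "Bash(git push --force:*)"],
    "ask":   ["Bash(psql:*)"],
    "allow": ["Bash(git status:*)"]
  }
}
""")


def matches(rule: str, command: str) -> bool:
    """Toy matcher: 'Bash(prefix:*)' matches commands that start with prefix."""
    if not (rule.startswith("Bash(") and rule.endswith(":*)")):
        return False
    prefix = rule[len("Bash("):-len(":*)")]
    return command == prefix or command.startswith(prefix + " ")


def decide(command: str, permissions: dict) -> str:
    # Assumed precedence: deny first, then ask, then allow.
    for verdict in ("deny", "ask", "allow"):
        if any(matches(rule, command) for rule in permissions.get(verdict, [])):
            return verdict
    return "prompt (default)"


perms = SETTINGS["permissions"]
for cmd in ["rm -rf build", "psql -c 'DROP TABLE users'", "git status", "ls"]:
    print(f"{cmd!r}: {decide(cmd, perms)}")
```

In this model `rm -rf build` is denied outright, the psql call always asks, `git status` runs without a prompt, and anything unlisted falls back to the default prompt, which is where your brain comes in.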

The "Your Brain" Problem

My favorite line in the whole cheatsheet is in the permission levels table:

Permission	Behavior	Where to set
(default)	Prompted on first use; "don't ask again" makes it permanent	Your brain — which may say yes too quickly when busy, or can't tell safe from unsafe when tired

The default behavior is "ask the user." But the judgment happens in your brain. The one that's rushing to meet a deadline. The one that's not sure what chmod -R actually does in this context. The one that already said "don't ask again" three prompts ago.

That's why deny exists. It doesn't get tired. It doesn't rush. It doesn't second-guess itself.

The Cheatsheet

I organized all of this into a cheatsheet and published it on GitHub. I plan to keep updating it, so issues and pull requests are welcome.

okdt/claude-code-hardening-cheatsheet

The README and the full cheatsheet are in the repository.

It covers:

- Why Claude Code needs hardening (risk scenarios with references to OWASP LLM Top 10)
- Sandbox setup (everyone should do this)
- The permission system: deny, ask, allow, and how they interact
- Deny lists by threat category (git ops, file deletion, privilege escalation, remote access, macOS-specific commands, and more)
- Human-in-the-Loop (HITL) with ask rules
- Limitations of deny rules and defense in depth with Hooks

Available in both English and Japanese. Loosely styled after the OWASP Cheat Sheet Series — whether it ends up there is another question, but the structure is ready.

How to Use It

Clone the repo and review settings_example.jsonc. Don't copy it blindly — read the cheatsheet to understand why each rule exists, then pick what applies to your environment.

The deny list is a sample of what I wanted to block first. It's not exhaustive. Your environment is different from mine.

Licensed under CC BY-SA 4.0. Use it freely for documentation, training materials, or anything else — just let me know if you do, I'd love to hear about it. If you modify it, share under the same license. Feedback and additional rules for other platforms are welcome.

Now, if you'll excuse me, I'm going to go make some coffee.

Repository: okdt/claude-code-hardening-cheatsheet