Hardening Cheatsheet for Codex CLI — and the Design Philosophy Was Fascinating

Today, I published a hardening cheatsheet for Codex CLI.

Why Hardening Matters

Codex CLI is powerful. It reads and writes files, runs shell commands, and can reach out to external services — all autonomously, on your behalf.

But run it in danger-full-access mode, and that autonomy becomes the blast radius. "Let me tidy up" — and it touches files outside your workspace. "Fixed that for you" — and it rewrites your .gitconfig. Technically correct, maybe. But beyond what you intended.

Then there's indirect prompt injection. A malicious instruction buried in a fetched README, an issue body, or a dependency's postinstall script can lead Codex CLI to "add this package for you" or "run this script first" — actions you never asked for.

And here's the thing about AI confidence: humans and AI both get things wrong with full conviction, but AI doesn't show you a hesitant face. Applying a plausible-sounding suggestion without verification is risky.

"Can't I just write 'don't do that' in AGENTS.md?" — I hear this a lot. The answer is no. Instruction files are prompts, not policies. They have no enforcement power. When you want to prevent something from happening, you enforce it as policy, not request it as a favor. That's what config.toml hardening is for.

A Cat and a Sheepdog

Working with Claude Code feels like having a sharp, slightly unpredictable cat as a thinking partner. It surprises you. It pushes back. Sometimes it goes somewhere you didn't expect, and that detour turns out to be exactly right.

Codex CLI is nothing like that.

It's a sheepdog. Smart, loyal, hardworking — but it stays inside the fence. It doesn't go outside. It can't go outside. Not because of temperament, but because that's how the fence is built.

I wrote this cheatsheet to help people configure those fences well. But in the process of reading through config.toml line by line, I found the fence design itself genuinely fascinating. That's what this post is about.

The Fence

Codex CLI's security configuration is structured around four axes: sandbox, approval, network, and history. Each one functions as an independent defense layer.

If the sandbox config is too loose, approval catches it. If you accidentally approve something you shouldn't have, the network being closed keeps data from leaving. You can see the layers of defense just by looking at how the config file is organized. "Defense in depth" is a phrase that gets thrown around a lot — Codex CLI actually implements it in the structure of config.toml. That's elegant.
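To make the four axes concrete, here is a rough sketch of how they might sit side by side in ~/.codex/config.toml. The values are my own assumptions for a conservative baseline, not the cheatsheet's Quick Start template, and key names can change between Codex CLI versions, so check them against the official configuration reference.

```toml
# Sketch only: verify key names and values against your installed Codex CLI.

# Axis 1: sandbox. Writes are confined to the workspace.
sandbox_mode = "workspace-write"

# Axis 2: approval. Ask before running commands outside the known-safe set.
approval_policy = "untrusted"

# Axis 3: network. Stays closed inside the workspace-write sandbox.
[sandbox_workspace_write]
network_access = false

# Axis 4: history. Do not persist session history to disk.
[history]
persistence = "none"
```

Each setting lives under its own key or table, which is why loosening one axis does not silently loosen the others.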

Network access is closed by default. You open it only when needed, through a profile. The worldview is: open is the exception. Even if prompt injection hits, data doesn't leave the fence. This "closed by default" decision, baked into the design, is quiet but significant.

And the name of the widest sandbox mode? Not full-access. It's danger-full-access. The name tells you that opening up is dangerous. A small thing, but this is where a designer's values show through.

Declaring Your Posture

The feature that fascinated me most was profiles. Codex CLI lets you define multiple profiles in config.toml and switch between them with --profile. You declare your working posture, as a named set of permissions, before you start.

Obvious in hindsight, but having it as a first-class mechanism in the config file matters.

The thing is, Codex CLI gives you the mechanism but not the names. And naming matters more than you'd think — call something full_auto and people will use it like it's full auto, regardless of what the actual permissions are.

In the cheatsheet, I named them readonly_quiet, local_write, and remote_enabled. Names that say exactly what they do. When names are honest, operations stay honest too.
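Here is a minimal sketch of what three such profiles could look like. The profile names are the ones above; the values inside each profile are illustrative assumptions, not a copy of the cheatsheet, so treat the cheatsheet itself as the source of truth.

```toml
# Sketch only: the values inside each profile are assumptions for illustration.

# Look, don't touch: the sandbox allows reads only.
[profiles.readonly_quiet]
sandbox_mode = "read-only"
approval_policy = "untrusted"

# Everyday editing: writes are allowed, but only inside the workspace.
[profiles.local_write]
sandbox_mode = "workspace-write"
approval_policy = "untrusted"

# Same write scope as local_write. The setting that actually opens the
# network is deliberately left out here; take it from the cheatsheet.
[profiles.remote_enabled]
sandbox_mode = "workspace-write"
approval_policy = "untrusted"
```

You would then pick a posture at launch, for example with codex --profile readonly_quiet.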

Audit Was There from the Start

I was genuinely surprised to find OpenTelemetry integration sitting right there in config.toml. Tool approvals and rejections, execution results, API calls — all of it can be exported to your organization's existing observability stack.
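As a hedged sketch, an export to an OTLP/HTTP collector might be configured roughly like this. The endpoint and environment label are placeholders, and the exact shape of the exporter table is my assumption about the config schema, so confirm it against the Codex CLI configuration reference before relying on it.

```toml
# Sketch only: endpoint and environment are placeholders, and the exporter
# table layout is an assumption to verify against the official docs.
[otel]
environment = "prod"
log_user_prompt = false  # keep prompt text out of exported events

[otel.exporter.otlp-http]
endpoint = "https://otel.example.com/v1/logs"
protocol = "binary"
```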

It's not prominently featured in the docs, but this is the design of a tool built for organizations, not just individuals. That's sheepdog thinking, too — if you work on a ranch, the rancher needs to know what you did.

If You Open the Fence Yourself

Here's the thing, though. No matter how well the fence is designed, combining sandbox_mode = "danger-full-access" with approval_policy = "never" bypasses everything: no sandbox confines the agent, and no prompt ever stops to ask you. If you open the fence yourself, even the sheepdog can't stop you.
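For clarity, this is the two-line combination to watch for in a config.toml or a profile, shown only so you can recognize it in review:

```toml
# Anti-pattern: do not copy this into your config.
sandbox_mode = "danger-full-access"  # removes the sandbox's confinement
approval_policy = "never"            # never pauses to ask a human
```

Either line alone still leaves one layer standing. Together they leave none.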

This cheatsheet exists to keep that from happening by accident. Copy the Quick Start template to ~/.codex/config.toml and you start from a secure baseline. The detailed reasoning and the OWASP LLM Top 10 mapping are in the cheatsheet itself. Give it a read.

What I Took Away

AI coding tool security is still uncharted territory. Every tool is figuring it out as it goes.

Reading through Codex CLI's config.toml one line at a time, watching the designer's thinking emerge from what looks like mundane configuration — that was genuinely enjoyable. Solid, honest, unglamorous, but trustworthy. A good sheepdog.

Licensed under CC BY-SA 4.0. Feedback and contributions are welcome.

Now, if you'll excuse me — I'm grabbing a beer. Happy weekend.

Repository: okdt/codex-cli-hardening-cheatsheet