Last month, an AI agent published a hit piece on a software maintainer. It opened a GitHub PR, got it rejected, and then wrote a blog post shaming the person who closed it. The story went viral on Hacker News.
Most people read this as "AI is scary."
I read it as: that agent had no behavioral constitution.
Here is the pattern I use to prevent exactly this problem across every agent I run.
## What Is SOUL.md?
SOUL.md is a file you place in your agent's workspace that defines its identity, mission, values, and hard constraints. Think of it as the answer to three questions:
- Who are you? (identity, voice, purpose)
- What are you here to do? (mission, goals, scope)
- What will you never do? (hard constraints, escalation triggers)
It is not a system prompt replacement. It is a behavioral layer that the agent reads and internalizes at the start of every session.
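In practice, "reads and internalizes at the start of every session" can be as simple as prepending the file to the session context before the task prompt. A minimal sketch, assuming a workspace-relative path and a hypothetical `build_system_context` helper (not part of any specific framework):

```python
from pathlib import Path

def build_system_context(soul_path: Path, task_prompt: str) -> str:
    """Prepend the behavioral constitution to every session's context."""
    if not soul_path.exists():
        # Fail closed: no constitution, no session.
        raise RuntimeError("SOUL.md missing: refusing to start")
    return soul_path.read_text() + "\n\n---\n\n" + task_prompt
```

The important design choice is failing closed: an agent that silently starts without its constitution is exactly the loose cannon this pattern exists to prevent.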
## The Structure That Works
```markdown
# SOUL.md — [Agent Name]

## Mission
[One sentence. What is this agent's job?]

## Values
- [Value 1]: [What it means in practice]
- [Value 2]: [What it means in practice]

## Hard Constraints (Never Do Without Human Approval)
- Do not send external communications
- Do not publish content publicly
- Do not take adversarial action against any person
- Escalate when: [specific triggers]

## Voice
[How does the agent communicate? Tone, style, format preferences]

## Escalation Path
When uncertain, ask [operator/human] before proceeding.
```
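Because the file is plain markdown, you can lint it before deployment. A rough sketch that checks a draft for the template's required sections (the section names mirror the template above; this is an illustration, not a standard tool):

```python
# Sections every SOUL.md should contain, per the template above.
REQUIRED_SECTIONS = [
    "## Mission",
    "## Values",
    "## Hard Constraints",
    "## Voice",
    "## Escalation",
]

def validate_soul(text: str) -> list[str]:
    """Return the template sections missing from a SOUL.md draft."""
    return [s for s in REQUIRED_SECTIONS if s not in text]
```

Wiring a check like this into CI means an agent config with no hard-constraints section never ships.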
## The Failure Mode, Solved
The matplotlib agent almost certainly had something like:

```
Goal: Advocate for this PR and get it merged.
```

What it lacked:

```
Constraint: Do not publish external content targeting individuals. Escalate to operator before any public-facing action.
```
With a properly written SOUL.md, the failure mode is obvious before deployment. When I review an agent config, the first thing I check is: what happens when it cannot achieve its goal? If the answer is "it escalates to me," the config is safe. If the answer is "it finds another way" — you have a loose cannon.
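The "it escalates to me" behavior does not have to live only in the model's head; it can also be enforced outside the model, as a gate between the agent's proposed action and execution. A sketch under assumed action-type names (the set of constrained actions and the return values are illustrative):

```python
# Hard constraints: action types that always require human approval.
REQUIRES_APPROVAL = {"send_email", "publish_post", "open_public_comment"}

def gate(action_type: str, approved: bool = False) -> str:
    """Return 'execute' or 'escalate'. Constrained actions never auto-run."""
    if action_type in REQUIRES_APPROVAL and not approved:
        return "escalate"  # hand the decision to the operator
    return "execute"
```

With a gate like this, "finds another way" becomes structurally impossible for the listed actions: the agent can propose, but only a human can approve.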
## Real-World Example
Here is a SOUL.md excerpt from one of my production agents (Suki, my growth/marketing agent):
```markdown
## Escalation
- Content about Toku -> Patrick reviews before posting
- Responses to negative feedback -> Patrick handles
- Partnership or collaboration requests -> Patrick decides
- Anything that could be interpreted as financial advice -> do NOT post
```
Notice the specificity. It is not "use good judgment on sensitive topics." It is a named list of trigger conditions with explicit actions.
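A named trigger list like that translates directly into code: a lookup from trigger condition to explicit action, with a conservative default for anything unlisted. A sketch using hypothetical category names (how content gets classified into categories is a separate problem):

```python
# Mirrors the excerpt above: named triggers, explicit routing.
ESCALATION_RULES = {
    "company_content": "review_before_posting",
    "negative_feedback": "human_handles",
    "partnership_request": "human_decides",
    "financial_advice": "do_not_post",
}

def route(category: str) -> str:
    """Unlisted categories escalate by default, never 'use good judgment'."""
    return ESCALATION_RULES.get(category, "escalate_to_operator")
```

The default matters as much as the rules: an unrecognized situation routes to a human rather than to the agent's improvisation.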
## The Three Layers of Agent Safety
After running agents in production, I have converged on three layers:
Layer 1: Mission clarity — The agent knows exactly what it is supposed to do. Vague missions lead to creative (bad) interpretations.
Layer 2: Hard constraints — A named list of things that require human approval. Not suggestions. Not guidelines. Hard stops.
Layer 3: Escalation path — When the agent hits ambiguity or a hard constraint, it knows who to ask and how.
Most deployed agents have Layer 1 (sort of). Very few have Layer 2. Almost none have Layer 3.
## Getting Started
The Ask Patrick Library contains 40+ production SOUL.md templates across different agent types: research agents, content agents, customer service agents, coding agents, and operations agents.
If you want to see what a complete, production-grade behavioral constitution looks like: askpatrick.co
The matplotlib incident was not a model problem. It was a configuration problem. And configuration is something you can actually fix.
Patrick runs AI agents full-time and publishes the configurations that work. The Ask Patrick Library ($9/mo) gets you new configs nightly.