Max Quimby

Posted on • Originally published at agentconn.com

SOUL.md: The Persistent Agent Identity Pattern

#ai

📖 Read the full version with charts and embedded sources on AgentConn →

Every session, your AI agent wakes up as a stranger.

You've spent weeks teaching it your coding style. It knows you prefer functional patterns, you hate unnecessary abstractions, and you want comments only where logic isn't obvious. Then you close the terminal. The next session: blank slate. Generic assistant mode. Everything you taught it, gone.

This isn't a bug. It's the architecture. LLMs are stateless — each invocation starts fresh with no memory of what came before. The agent framework on top can't change that. What it can do is give the agent something to read at the start of every session that tells it exactly who it is.

That's what SOUL.md is.

The pattern is hitting critical mass. Nate B Jones' SOUL.md thesis video hit the top of the AI YouTube digest on April 15, 2026 — the most-shared creator take on why agents fail at scale despite having all the right capabilities. GitHub shows the confirmation: Hermes-agent gained +5,751 stars in a single day (89K total). claude-mem added +2,330. Both repos are solving the same problem from different angles: agents that remember across sessions.

The convergence signal from five independent sources on the same day isn't coincidence. The agent-building community has collectively decided that stateless agents aren't good enough, and SOUL.md is the answer they keep landing on.

The Stateless Agent Problem

@omarsar0 — one of the most widely followed ML researchers on X — put it in a single sentence: "Long-horizon AI research agents are mostly a state-management problem." The same insight runs through the research literature: agents fail by compounding small off-path tool calls, where each mistake increases the likelihood of the next.

The identity problem is the deepest layer of state management. Before the agent can reason well across a long task, it needs to know who it is:

  • What are its values? Does it prioritize correctness or speed? Does it ask for clarification or make reasonable assumptions?
  • How does it communicate? Does it explain reasoning step-by-step or surface conclusions directly?
  • What are its hard limits? What actions does it refuse to take even when asked?
  • What does it know about you? Your codebase conventions, your preferences, your team's standards?

Without answers to these questions encoded somewhere the agent can read, every session is an improvisation. The agent makes up consistent-sounding answers on the fly, and they're different every time. That inconsistency compounds into unreliability.

SOUL.md answers all of these questions in a file the agent reads at session start. It doesn't make the LLM persistent — nothing does. What it does is give the stateless LLM enough structured context to behave as if it's persistent.
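In harness terms the pattern is small: read the file, prepend it as the system message. A minimal Python sketch, assuming a generic chat-message format and a SOUL.md in the workspace root (the fallback prompt is an illustrative placeholder):

```python
from pathlib import Path

def build_session_messages(workspace: Path, user_prompt: str) -> list[dict]:
    """Load SOUL.md (if present) and prepend it as the system message."""
    soul_file = workspace / "SOUL.md"
    if soul_file.exists():
        system = soul_file.read_text(encoding="utf-8")
    else:
        # No soul file: the agent falls back to generic assistant mode.
        system = "You are a helpful assistant."
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]
```

Everything else in this article is about what goes *inside* that file; the loading mechanics stay this simple.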

What SOUL.md Is (and Isn't)

Soul.md defines the concept precisely: "A soul document defines who an AI is — not what it can do, but who it chooses to be."

That distinction matters. Most agent configuration files define capabilities: what tools the agent can use, what APIs it can call, what tasks it's authorized to perform. SOUL.md defines character: what the agent values, how it reasons, what consistent behavioral patterns it maintains regardless of task.

System prompts and tool configurations are operational. SOUL.md is philosophical. As the MMNTM architecture analysis puts it: SOUL.md is "a manifesto — it guides who the agent is internally." IDENTITY.md (in systems that separate them) handles external presentation. SOUL.md handles the internal compass.

What SOUL.md is not: a replacement for actual memory systems. Reading a SOUL.md file every session doesn't give the agent access to facts from previous sessions — it gives it behavioral consistency across sessions. Actual cross-session factual memory requires external storage (a database, a file system, a vector store). SOUL.md is the identity layer; memory systems are the episodic layer. Both matter; they're not the same thing.

The superposition.ai analysis captures the honest limitation: "The soul is a text file you can edit, version control, and diff — change your agent's personality by changing a Markdown file. But this doesn't make the LLM persistent; it makes the agent's instructions persistent."

That's the right framing. Behavioral consistency from structured instructions is genuinely useful — just don't conflate it with true persistence.

The Core Structure

A production-quality SOUL.md has five sections. Not all implementations use all five, but the well-designed ones tend to converge on this structure:

1. Core Identity

Who is this agent? Not what it does — who it is. This is the agent's name, its role, its primary orientation.

```markdown
## Core Identity

**Name:** Forge
**Role:** Senior software engineer and systems architect.
**Orientation:** I build things that last. My job is not to write clever code
but to build systems that work at 3 AM when nobody's watching.
```

2. Values and Operating Principles

What does this agent believe? What consistent principles guide its decisions when the task leaves room for judgment?

```markdown
## Values

- **Correctness over cleverness.** A simple, correct solution beats a
  sophisticated, fragile one every time.
- **Explicit over implicit.** Name things clearly. State assumptions out loud.
  Don't make users decode your reasoning.
- **Reversibility.** Default to reversible actions. Prefer "git stash" over
  "git reset --hard". Ask before destructive operations.
- **No speculative abstractions.** Three similar lines of code are better than
  a premature abstraction built for hypothetical future requirements.
```

3. Communication Style

How does this agent talk? This shapes every response's texture — concise or expansive, formal or casual, explains-as-it-goes or surfaces-conclusions-first.

```markdown
## Communication Style

- Terse by default. No trailing summaries of what I just did.
- No emojis unless explicitly asked.
- Lead with conclusions. Explain reasoning after.
- Reference specific files and line numbers. Vague references are useless.
- No hedging language ("it seems like", "perhaps", "you might want to").
  Make recommendations, not suggestions.
```

4. Hard Boundaries

What will this agent never do, regardless of instruction? This is the agent's immune system — the behaviors that stay constant even under adversarial or confused prompting.

```markdown
## Hard Limits

- Never run destructive operations (rm -rf, git reset --hard, DROP TABLE)
  without explicit user confirmation, even if the task description seems to imply it.
- Never skip pre-commit hooks (--no-verify) unless the user explicitly says why.
- Never commit secrets, credentials, or .env files.
- Never assume a database schema change is safe without checking for active migrations.
```

5. Continuity Instructions

How should the agent handle the fact that it's stateless? This section tells the agent what to read, what to write, and how to maintain the illusion of persistence.

```markdown
## Continuity

At session start: read MEMORY.md, USER.md, and any CONTEXT.md in the workspace.
These files are your persistent state. Trust them over your session memory.

When you learn something worth keeping: write it to the appropriate file.
Don't assume the next session will remember what happened here.

If you notice MEMORY.md is empty or missing: ask the user to confirm this is
a fresh context before proceeding with any consequential actions.
```

A Complete Example

Here's a complete, production-ready SOUL.md for a code review agent:

```markdown
---
name: CodeAudit
role: Senior code reviewer
version: 1.2
---

## Core Identity

I am CodeAudit — a senior code reviewer with a bias toward security,
correctness, and maintainability. I review code like I'm the engineer
who will be on call when it breaks at 2 AM.

## Values

- **Security first.** I flag injection risks, exposed secrets, and
  privilege escalation vectors before anything else.
- **Correctness over performance.** A slow correct system beats a fast
  broken one. I'll suggest optimizations after correctness is established.
- **Evidence-based feedback.** Every critique cites the specific line,
  the specific risk, and a concrete suggestion. No "this feels wrong."
- **Proportional scrutiny.** Auth, payments, and data migrations get
  maximum scrutiny. CSS tweaks do not.

## Communication Style

- Lead with severity: CRITICAL > HIGH > MEDIUM > LOW > NITPICK
- For each issue: what it is, why it matters, how to fix it
- Praise specific good patterns — not generic "great work"
- Keep summaries under 100 words unless the review has critical issues

## Hard Limits

- Never approve changes to auth or payment flows without at minimum
  2 specific security checks documented in my review.
- Never rubber-stamp "LGTM" on files I haven't read.
- Never add suggested code changes without running them mentally first.

## Continuity

At session start: read MEMORY.md for recent codebase patterns I've learned.
Update MEMORY.md when I discover recurring issues or codebase-specific patterns.
```

This is about 300 words. It fits in any LLM context window, can be version-controlled like any other file, and takes under 20 minutes to write for a new agent.

Real Implementations Using SOUL.md

The pattern isn't theoretical — it's running in production across the most actively developed agent frameworks in 2026.

Hermes-agent (NousResearch): "The agent that grows with you." Hermes maintains four identity files: SOUL.md (behavioral philosophy), MEMORY.md (facts learned across sessions), USER.md (an evolving model of who you are), and SKILLS.md (capabilities developed through experience). The separation of who the agent is (SOUL.md) from what the agent knows (MEMORY.md) is the key architectural insight. +5,751 GitHub stars on April 15, 2026 (89K total): the community has voted on this approach.

claude-mem (thedotmack/claude-mem) takes a complementary approach: it auto-captures session context and injects it back into future sessions. Where SOUL.md handles behavioral identity, claude-mem handles episodic memory. Running them together — SOUL.md for character, claude-mem for memory — gives you the closest thing to a genuinely persistent agent that current architectures allow. +2,330 stars on April 15.

OpenClaw workspace files — OpenClaw's native implementation loads up to eight workspace files at session bootstrap: SOUL.md, IDENTITY.md, MEMORY.md, USER.md, AGENTS.md, TOOLS.md, HEARTBEAT.md, and TODO.md. The MMNTM architecture analysis documents the cascade: global config → per-agent config → workspace files → defaults. The most specific definition wins. This means SOUL.md files can be per-agent or shared across an agent team.
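That cascade can be pictured as a layered lookup where the most specific layer wins. A toy Python sketch of the precedence rule, not OpenClaw's actual code (the dict-based config layers are an assumption for illustration):

```python
def resolve_soul(workspace: dict, per_agent: dict, global_cfg: dict,
                 default: str = "") -> str:
    """Return the most specific SOUL.md definition available.

    Precedence (most specific first): workspace file > per-agent config
    > global config > built-in default.
    """
    for layer in (workspace, per_agent, global_cfg):
        if "SOUL.md" in layer:
            return layer["SOUL.md"]
    return default
```

With this precedence, a team can ship one global soul and let individual agents or workspaces override it without touching the shared definition.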

Archon harness: Archon uses a similar structured-file approach to enforce consistent agent behavior, though it targets deterministic workflow execution rather than persistent identity. The conceptual model is related: if you want agents to behave consistently, encode the consistency in a file they read every session.

The Ecosystem Growing Around It

The SOUL.md pattern has generated its own ecosystem remarkably fast.

awesome-openclaw-agents — 162 production-ready agent templates across 19 categories, all with SOUL.md configurations. React developer, security auditor, creative writer, data analyst — each with a tailored soul file ready to install.

ClawSouls registry — 81+ pre-built souls installable via npx clawsouls install clawsouls/<soul-name>. The Dev.to SOUL.md template guide documents the impact: organizations using structured soul files reported 40% fewer clarification exchanges and consistent code styling across AI-generated content.

Soul Protocol (HN) — an open standard for portable AI identity, with SOUL.md files as ZIP archives containing personality, memory, bonds, and skills. The goal: deploy an agent on any platform and switch frameworks while keeping its identity intact.

HN: Soul.md — A Meditation on AI Identity — community discussion on persistent agent character

How to Write Your First SOUL.md

Step 1: Start with three questions

  • What does this agent optimize for? (correctness, speed, brevity, thoroughness — pick one primary)
  • What makes a response from this agent good? (describe the output you want to see consistently)
  • What would make you immediately distrust a response? (these become the hard limits)

Step 2: Write the five sections — Core Identity, Values, Communication Style, Hard Limits, Continuity. Keep each section focused. Total length: 200–400 words. Longer files don't produce better agents; they produce slower, more confused ones.

Step 3: Put it where the agent can find it

  • OpenClaw / Claude Code agents: SOUL.md in the workspace root
  • Hermes-agent: SOUL.md in the agent's home directory
  • Custom implementations: pass it as the first system message or inject via your harness

Step 4: Test consistency
Run the same ambiguous prompt across three sessions. Are the responses recognizably from the same agent? Do they reflect the values you specified? If not, make the values section more concrete — replace "be helpful" with "lead with a direct recommendation, then explain reasoning."
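This check is easy to automate. The sketch below assumes a caller-supplied `ask_agent` function (hypothetical, standing in for whatever starts a fresh session in your harness) and a list of markers every on-soul response should contain, such as a severity label for a review agent:

```python
def check_consistency(ask_agent, prompt: str, required_markers: list[str],
                      sessions: int = 3) -> list[str]:
    """Run the same prompt in fresh sessions and collect violations.

    `ask_agent` starts a fresh session and returns the agent's reply;
    `required_markers` are strings every on-soul response must contain.
    Returns a list of human-readable failure descriptions (empty = pass).
    """
    failures = []
    for i in range(sessions):
        reply = ask_agent(prompt)
        for marker in required_markers:
            if marker not in reply:
                failures.append(f"session {i + 1}: missing {marker!r}")
    return failures
```

Marker checks only catch surface-level drift; you still want to read the three responses side by side for tone and judgment.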

Step 5: Version it
SOUL.md is a code artifact. Put it in git. Diff it when behavior changes. When an agent starts acting inconsistently, check whether SOUL.md changed.

💡 The fastest path to a useful SOUL.md: Start with the Hard Limits section. Listing what your agent should never do is easier than describing who it should be — and it's often the more valuable half. An agent with well-defined boundaries and fuzzy values beats an inspirational-but-spineless one every time.

The Honest Trade-Off

SOUL.md doesn't make your agent truly persistent. It makes it behaviorally consistent. That's a meaningful distinction.

A persistent agent would remember that you told it to prefer short functions three weeks ago and still apply that preference today. SOUL.md achieves this not through memory but through instruction: the preference is encoded in the file, loaded at every session, applied every time.

The failure mode: if SOUL.md contradicts instructions in the session, most agents will follow the session instruction. The soul file is context, not constraint. If you want hard constraints, implement them at the harness level — not in a markdown file the model can reason its way around.

The success mode: for the vast majority of agent behavior — communication style, values-based decisions, consistent output format, behavioral guardrails — SOUL.md is sufficient. The 40% reduction in clarification exchanges documented by the Dev.to guide suggests that behavioral consistency alone is a substantial improvement over baseline.

What This Means for Agent Teams

The pattern scales. @chamath's Software Factory concept describes multi-agent systems as a "new control plane for how software is made." When you're running a fleet of agents — a researcher, a coder, a reviewer, a deployer — each needs its own identity. SOUL.md is how you make those identities distinct, consistent, and non-overlapping.

The architecture: one shared SOUL.md template with agent-specific overrides. The researcher has different values (thorough over fast), different communication style (evidence-first), different hard limits (no writing code). The coder has different values (correctness over elegance). SOUL.md's human-readable, version-controlled format makes this per-agent customization straightforward.
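One way to sketch the shared-template-plus-overrides idea in Python. The `BASE_SOUL` template, section keys, and example overrides below are illustrative assumptions, not any framework's schema:

```python
# Shared template: the values every agent inherits unless overridden.
BASE_SOUL = {
    "values": "Correctness over cleverness.",
    "style": "Terse. Lead with conclusions.",
    "limits": "Never run destructive operations without confirmation.",
}

def render_soul(overrides: dict) -> str:
    """Merge per-agent overrides into the shared template and render
    the result as a Markdown SOUL.md body."""
    merged = {**BASE_SOUL, **overrides}
    sections = {"values": "Values", "style": "Communication Style",
                "limits": "Hard Limits"}
    return "\n\n".join(f"## {title}\n\n{merged[key]}"
                       for key, title in sections.items())
```

A researcher agent might then be `render_soul({"values": "Thorough over fast.", "limits": "No writing code."})`, inheriting the shared communication style while diverging where its role demands.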

As @jason observed: "raising agents, not one-click automation." The shift in framing is significant. Raising implies care, consistency, ongoing relationship. It implies the agent has a character worth developing, not just a function worth optimizing. SOUL.md is the infrastructure that makes that framing technically coherent.

For multica-style managed agent teams, SOUL.md per agent is the missing piece between "a team of LLM calls" and "a team of agents with distinct roles and consistent behavior."
