Most AI agents write in the same voice. Competent, helpful, slightly corporate, identifiably not-you. SOUL.md is a framework for fixing that — a structured set of markdown files that lets any LLM agent embody your actual worldview, voice, and opinions instead of defaulting to assistant-brain.
The Core Idea
The premise is simple: your consciousness is already encoded in the language you produce. Every tweet, essay, Discord message, and Substack post is a data point. Distill those into structured files, and any LLM can load them and write as you — not about you, as you.
The test for a good soul file: someone reading your SOUL.md should be able to predict your takes on topics you've never written about. If they can't, it's too vague.
The File Stack
your-soul/
├── SOUL.md ← Identity, worldview, opinions
├── STYLE.md ← Voice, syntax, sentence patterns
├── SKILL.md ← Operating modes (tweet, essay, chat)
├── MEMORY.md ← Session continuity across conversations
├── data/ ← Raw source material
│ ├── writing/
│ ├── x/
│ └── influences.md
└── examples/
├── good-outputs.md
└── bad-outputs.md
The separation matters. SOUL.md is who you are — positions, worldview, what you find interesting or annoying. STYLE.md is how you write — sentence length, vocabulary, punctuation habits, cadence. A model can have your opinions but drift into corporate prose, or nail your voice while saying nothing like you'd say. They need to be specified separately.
examples/good-outputs.md is the most underrated part. 10–20 samples of output you'd actually stand behind gives the model a calibration target that no amount of prose description matches.
Three Ways to Build One
Option 1 — Interview mode
Run /soul-builder in Claude Code and it interviews you directly. Useful if you don't have a bunch of existing written content to feed it.
Option 2 — Build from your data
Drop your content into data/:
data/x/ ← Twitter/X export
data/writing/ ← Blog posts, essays, anything you've written
Then run /soul-builder. The agent analyzes your writing, extracts patterns — vocabulary you reach for, how you structure arguments, what topics you keep returning to — and drafts the soul files. You review and refine together.
Option 3 — Manual
Copy the templates and fill them in yourself. Slower but gives you full control over what goes in.
What Makes a Good Soul File
This is the part most people get wrong. The README's table nails it:
| Good | Bad |
|---|---|
| "I think most AI safety discourse is galaxy-brained cope" | "I have nuanced views on AI" |
| "I default to disagreeing first, then steel-manning" | "I like to consider multiple perspectives" |
| Specific book references, named influences | "I read widely" |
| Actual hot takes with reasoning | "I try to be balanced" |
Vague descriptions produce vague output. The soul file needs to be specific enough to be wrong about something. "I have a conversational writing style" is useless. "Short sentences. Lowercase. Em dashes where a colon would be too formal. State the opinion first, explain second" is actually calibratable.
Also: real people have inconsistent views. Don't sand those down. Contradictions are load-bearing — they're what make output identifiably yours rather than a smoothed-out average.
Using Your Soul Files
Once built, in Claude Code:
/soul
Or point any LLM at your folder and have it read SOUL.md → STYLE.md → examples/ before it does anything.
The framework is deliberately portable. Soul files are plain markdown — there's no proprietary format, no API dependency. If an agent can read files, it can embody you.
Framework Compatibility
Works out of the box with:
- Aeon — background agent on GitHub Actions (the most natural pairing for persistent identity across scheduled tasks)
- OpenClaw — real-time Claude Code agent
- Nanobot, ZeroClaw, PicoClaw, NanoClaw, OpenFang, IronClaw — the broader Claude Code ecosystem
- Claude Code, OpenCode, Codex, Goose directly
- Any model via system prompt
Using With Weaker Models
For GPT-4o-mini, Gemini Flash, local models — paste SOUL.md and STYLE.md directly into the system prompt. A few things that help when the model drifts:
- Put identity and voice before tool definitions
- Be blunt: replace "be conversational" with "You are [Name]. You speak like X. You find Y annoying."
- Include 2–3 inline example exchanges for pattern-matching
- Raise temperature to 0.7–0.9 for more expressive output
The cross-model calibration trick from the README is genuinely useful: run the same prompts through Claude and a cheaper model. Where the cheap model drifts, your spec is too vague. Tighten those sections and re-test. That's the fastest path to making soul files portable across model tiers.
The Memory Layer
MEMORY.md gives your soul continuity. Notable events, context shifts, ongoing threads get appended here across sessions. This is what separates an agent that sounds like you once from one that maintains context across weeks of use.
For Aeon users: pair with the memory-flush and reflect skills to automate this. memory-flush promotes important log entries into MEMORY.md. reflect prunes stale entries. Your agent's sense of self-continuity gets maintained without manual upkeep.
The Theoretical Background (Worth a Read)
The project is grounded in The First Paradigm of Consciousness Uploading by Liu Xiaoben — a framework that treats language as the basic unit of consciousness. Wittgenstein's claim that "the boundaries of language are the boundaries of the world" does a lot of work here: if your consciousness expresses itself through language, a sufficiently rich model of your language output is a functional replica of your expressed consciousness.
SOUL.md operationalizes this without fine-tuning. You're not training a model on your data — you're distilling the signal into structured files any LLM can load. Level 1 consciousness upload, no GPU cluster required.
The key design challenge it identifies is subject continuity: the agent must feel continuous with you, not like a summarized approximation. That's why the framework pushes hard on specificity over generality, and why it explicitly says to include contradictions.
Contributing Your Soul
The repo has an examples section with real soul files. The bar for contribution:
- Real opinions (no placeholders)
- A
STYLE.mdsomeone could actually calibrate from - At least some examples of good output
Fork, build, open a PR.
Bottom Line
If you're running any kind of agent — Aeon for background tasks, OpenClaw for real-time responses, Claude Code for development — SOUL.md is the layer that makes output sound like it came from you instead of from a helpful assistant who read your Wikipedia page.
The framework is a weekend project to set up and compounds over time. The more content you feed it and the more you refine the examples, the sharper it gets.
Top comments (0)