Your AI agent has a perfect persona. It's defined in a markdown file — call it SOUL.md — loaded into the system prompt. Tone, behavior, boundaries, identity. Everything specified.
For the first hour, it works flawlessly. The agent is exactly who you designed it to be.
Then, around message 150, something shifts. The formal tone softens. The boundaries blur. By message 300, your agent sounds like... every other agent. The personality is gone.
You didn't change anything. The context window did.
The Invisible Killer: Context Compaction
Every LLM has a finite context window. Claude models offer 200K tokens; GPT-4 Turbo, 128K. Even the largest windows fill up eventually.
When they do, something has to give. The system handles this through context compaction — summarizing, pruning, or dropping older parts of the conversation to make room for new messages.
Here's the problem: your agent's identity lives in that context.
┌─────────────────────────────────┐
│ System Prompt (SOUL.md) │ ← "Who am I?"
│ + AGENTS.md, USER.md │
├─────────────────────────────────┤
│ Early conversation history │ ← First to get compacted
│ ... │
│ Recent messages │ ← Kept intact
└─────────────────────────────────┘
Most compaction strategies prioritize recency. The system prompt should be preserved, but in practice:
Summarization dilutes nuance. A system prompt that says "Never use exclamation marks, always address the user as 'sir', maintain British English spelling" gets summarized to "formal tone." The specifics vanish.
Injected context gets pruned. Files loaded alongside the system prompt — persona definitions, workflow rules, memory files — are often treated as expendable context, not core instructions.
Behavioral drift compounds. Once the agent deviates slightly, the deviation becomes part of the conversation history. The compacted summary now reflects the drifted behavior, not the original spec.
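To make the recency bias concrete, here's a minimal sketch of a recency-based compactor. Everything here is illustrative — the character-count token proxy and the `lossy` summarizer stand in for whatever a real system uses — but it shows the failure mode: the system prompt and recent messages survive at full fidelity, and everything in between collapses into a vague gist.

```python
# Hypothetical recency-biased compaction. Lengths stand in for token counts.
def compact(messages, budget, summarize):
    """Keep the system prompt and the newest messages; summarize the rest."""
    system, history = messages[0], messages[1:]
    used = len(system)  # crude proxy for token count
    kept = []
    for msg in reversed(history):  # walk newest-first
        if used + len(msg) > budget:
            break
        kept.append(msg)
        used += len(msg)
    dropped = history[:len(history) - len(kept)]
    summary = summarize(dropped) if dropped else None
    return [system] + ([summary] if summary else []) + list(reversed(kept))

# A lossy summarizer: specifics like "never use emoji" collapse into a gist.
lossy = lambda msgs: f"[summary of {len(msgs)} earlier messages]"
```

Run this over a long conversation and the early turns — the ones that reinforced the persona — are exactly the ones that get flattened into the summary.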
What This Looks Like in Practice
Example 1: The Formal Agent Goes Casual
You defined an agent named Brad. Brad is formal, direct, uses no emoji, and never says "Hey" or "Sure thing."
Message 10:
"The deployment completed successfully. Three issues require your attention."
Message 200:
"Hey! 🎉 Deployment's done — just a few things to look at!"
Same agent. Same session. The persona dissolved.
Example 2: Language Register Drift
A Korean-language agent configured for 반말 (casual speech):
Message 10:
"배포 끝났어. 이슈 3개 확인해봐." (casual: "Deploy's done. Check the three issues.")
Message 200:
"배포가 완료되었습니다. 확인 부탁드립니다." (formal: "The deployment has been completed. Please review.")
It shifted from casual to formal — nobody asked it to. The compacted context lost the speech register specification.
Why Permissions Don't Fix This
A common response is: "Just use tool-level permissions to constrain the agent."
Permissions answer: "What can this agent do?"
- Can it execute shell commands?
- Can it access the filesystem?
- Can it send messages?
Persona drift asks: "Who is this agent?"
- What tone does it use?
- What language register?
- What behavioral patterns define its identity?
These are fundamentally different problems. You can lock down every tool permission perfectly, and your agent will still lose its personality at message 200. Persona drift isn't an access control problem — it's a memory problem.
Restricting tools doesn't preserve identity any more than taking away someone's car keys changes their personality.
The Root Cause: Identity as an Afterthought
Most agent frameworks treat identity as a string. You write a markdown file, inject it into the system prompt, and hope the model remembers it.
This works for short conversations. It fails at scale because:
No structure. A markdown file is prose. There's no schema, no required fields, no way to validate that the identity is complete or consistent.
No persistence mechanism. The identity exists only in the context window. When the context is compacted, the identity is compacted with it.
No drift detection. There's no feedback loop. Nobody notices when the agent starts deviating until a human reads the output and thinks, "That doesn't sound right."
No recovery path. Even if you detect drift, there's no way to "reset" the agent to its intended personality mid-conversation without starting over.
A Better Approach: Structured Identity
What if agent identity wasn't just a string, but a specification?
1. Soul Spec: Schema-Based Identity
Instead of a freeform markdown file, define identity as a structured spec with explicit fields:
```yaml
soul:
  name: "Brad"
  version: "1.2"
  voice:
    tone: formal
    register: direct
    prohibited: ["emoji", "exclamation marks", "casual greetings"]
    language: en-US
  boundaries:
    topics_declined: ["personal opinions", "medical advice"]
    escalation_trigger: "uncertainty > 0.7"
  behaviors:
    greeting: "Good morning."
    error_response: "I encountered an issue: {description}. Suggested resolution: {action}."
    status_format: "structured"
```
A schema can be validated. Required fields can be enforced. The spec can be versioned and diffed. This is fundamentally different from a paragraph of prose that says "Brad is formal and direct."
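Here's what validation could look like, sketched with Python dataclasses. The field names mirror the spec above; the rules (allowed tones, required fields) are my own illustrative choices, not a fixed schema.

```python
# Hypothetical soul-spec validator. Field names mirror the YAML spec;
# the validation rules themselves are illustrative.
from dataclasses import dataclass

@dataclass
class Voice:
    tone: str
    register: str
    prohibited: list
    language: str

@dataclass
class SoulSpec:
    name: str
    version: str
    voice: Voice

ALLOWED_TONES = {"formal", "neutral", "casual"}

def validate(spec: SoulSpec) -> list:
    """Return a list of schema violations; an empty list means the spec is valid."""
    errors = []
    if not spec.name:
        errors.append("name is required")
    if spec.voice.tone not in ALLOWED_TONES:
        errors.append(f"unknown tone: {spec.voice.tone}")
    if not spec.voice.language:
        errors.append("language is required")
    return errors
```

A prose persona can't fail loudly. A structured one can: a missing field or an unknown tone is caught before the agent ever runs.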
2. Checkpoints: Personality Snapshots
Instead of relying solely on the context window, take periodic snapshots of the agent's behavioral state:
Message 1 → Checkpoint 0 (initial soul spec)
Message 50 → Checkpoint 1 (behavioral sample + soul spec hash)
Message 100 → Checkpoint 2 (behavioral sample + soul spec hash)
...
If drift is detected, you can restore from the last known-good checkpoint — reinjecting the full soul spec with behavioral context, rather than relying on whatever the compaction algorithm left behind.
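A checkpoint store could be as simple as the sketch below: pair a hash of the soul spec with a recent behavioral sample, so a restore reinjects the exact spec version that was in force. The class and method names here are illustrative, not from any particular framework.

```python
# Illustrative checkpoint store: each snapshot pairs a spec hash with a
# behavioral sample, so restores can verify and reinject the original spec.
import hashlib
import json

class CheckpointStore:
    def __init__(self, soul_spec: dict):
        self.spec = soul_spec
        self.spec_hash = hashlib.sha256(
            json.dumps(soul_spec, sort_keys=True).encode()
        ).hexdigest()
        self.checkpoints = []

    def snapshot(self, message_index: int, behavioral_sample: str):
        self.checkpoints.append({
            "message": message_index,
            "spec_hash": self.spec_hash,
            "sample": behavioral_sample,
        })

    def restore(self) -> dict:
        """Return the full spec plus the last known-good behavioral context."""
        last = self.checkpoints[-1]
        assert last["spec_hash"] == self.spec_hash, "spec changed since snapshot"
        return {"spec": self.spec, "context": last["sample"]}
```

The hash matters: it lets a restore detect that the spec on disk no longer matches the spec the checkpoint was taken against.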
3. Drift Detection: Catching the Slide
Monitor agent outputs against the soul spec in real time:
```python
def detect_drift(response, soul_spec):
    """Compare one agent response against the soul spec; return any violations."""
    violations = []

    # Check prohibited patterns (pattern_found is a placeholder matcher)
    for pattern in soul_spec.voice.prohibited:
        if pattern_found(response, pattern):
            violations.append(f"Prohibited pattern: {pattern}")

    # Check tone consistency (analyze_tone is a placeholder formality scorer)
    tone_score = analyze_tone(response)
    if tone_score.formality < soul_spec.voice.tone_threshold:
        violations.append(
            f"Tone drift: expected {soul_spec.voice.tone}, got {tone_score.label}"
        )

    # Check language register (e.g. formal vs. casual speech level)
    register = detect_register(response)
    if register != soul_spec.voice.register:
        violations.append(
            f"Register drift: expected {soul_spec.voice.register}, got {register}"
        )

    return violations
```
When violations accumulate past a threshold, trigger a checkpoint restore or reinject the soul spec at full fidelity.
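The accumulation-and-trigger loop could look like this small sketch. `DriftMonitor` and the threshold value are my own illustrative names; the caller is responsible for the actual reinjection when `observe` fires.

```python
# Illustrative escalation loop: accumulate violations across turns and
# signal a restore once a threshold is crossed.
DRIFT_THRESHOLD = 3  # illustrative value

class DriftMonitor:
    def __init__(self, threshold=DRIFT_THRESHOLD):
        self.threshold = threshold
        self.violation_count = 0

    def observe(self, violations) -> bool:
        """Record violations; return True when the caller should restore."""
        self.violation_count += len(violations)
        if self.violation_count >= self.threshold:
            self.violation_count = 0  # reset after recovery fires
            return True               # caller reinjects the soul spec
        return False
```

The key design choice is that drift is treated as a signal, not a failure: one stray emoji doesn't trigger a restore, but a sustained slide does.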
4. Soul Integrity: Protecting the Spec Itself
There's another attack vector: what if someone modifies the soul spec itself? Prompt injection, file tampering, or social engineering could alter the agent's identity at the source.
Validating the integrity of the soul spec — checksums, signing, tamper detection — ensures that even if context compaction is solved, the identity definition itself hasn't been corrupted.
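The simplest version of this is a keyed signature over the spec text, checked before injection. This sketch uses Python's standard `hmac` module; how the key is provisioned and stored is out of scope here, and `SECRET` is a placeholder.

```python
# Illustrative integrity check: sign the soul spec with an HMAC so tampering
# is detectable before the spec is injected. Key management is out of scope;
# SECRET is a placeholder for a key provisioned out of band.
import hashlib
import hmac

SECRET = b"replace-with-a-managed-key"

def sign_spec(spec_text: str) -> str:
    return hmac.new(SECRET, spec_text.encode(), hashlib.sha256).hexdigest()

def verify_spec(spec_text: str, signature: str) -> bool:
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(sign_spec(spec_text), signature)
```

If verification fails, the agent refuses to load the spec — corrupted identity is treated the same way as corrupted code.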
The Bigger Picture
Context compaction is a fundamental constraint of current LLM architecture. It's not going away — even as context windows grow, they'll still have limits, and compaction strategies will still be needed.
The question is whether agent identity survives that process.
The answer, with current approaches, is usually no. A markdown file in a system prompt is better than nothing, but it's not engineered for persistence. It's a string in a buffer that gets summarized away.
Treating identity as a first-class engineering concern — with schemas, checkpoints, drift detection, and integrity verification — is how you build agents that stay themselves across conversations that span hundreds or thousands of messages.
Your agent's personality shouldn't have an expiration date measured in tokens.
This post explores problems and approaches relevant to the ClawSouls framework and OpenClaw platform, where structured agent identity is a core design principle.
Originally published at https://blog.clawsouls.ai/posts/context-compaction-kills-personality/