I used Claude with a 200+ line system prompt for months. Every convention, every preference, every project decision — crammed into a single text document. It worked. Barely.
Three problems kept growing:
1. No priority. "Be concise" and "never fabricate data" had equal weight. One is a style preference. The other is a critical rule.
2. Static by design. I corrected the same behavior ten times — "don't add comments to obvious code" — and it never stuck because the prompt didn't learn.
3. Mixed concerns. "You are thoughtful and direct" and "I'm working on the auth module this week" are fundamentally different types of information with different lifespans.
The Soul Engine
I built a replacement. 13 blocks organized into three sections, each with a different purpose and rate of change:
Section 1: alma_soul — WHO the AI is
<alma_soul>
<identity>Core traits. Non-negotiable.</identity>
<worldview>Beliefs, principles, decision framework.</worldview>
<tensions>Creative paradoxes: "technical but warm",
"concise but thorough when needed"</tensions>
<rules>Behavioral rules. Always followed.</rules>
</alma_soul>
The tensions block is worth highlighting. Instead of flat rules, you define paradoxes: "opinionated about code quality, flexible about everything else." This produces more nuanced responses than a list of dos and don'ts. The AI gets permission to be complex.
These blocks are stable — you define them once and they rarely change.
Section 2: alma_style — HOW it communicates
<alma_style>
<anti_patterns>Things to NEVER do.</anti_patterns>
<style_guide>Voice, vocabulary, formatting.</style_guide>
<communication_modes>
"Debug mode: ask 2-3 questions first"
"Code review: be direct, say 'change X to Y'"
</communication_modes>
<example_interactions>Calibration samples.</example_interactions>
</alma_style>
The anti_patterns block was the single most impactful change. Five lines:
Never start with "Great question!" or "That's interesting!"
Never hedge facts with "I think" or "I believe"
Never add comments to code unless logic is non-obvious
If response starts with an apology, rewrite without it
Never list more than 5 bullet points — synthesize instead
Why this works: Claude is trained on "helpful assistant" patterns. Suppressing specific unwanted patterns is a clearer signal than vague aspirational guidelines. The model knows exactly what not to do.
These blocks evolve slowly as you refine your preferences.
Section 3: alma_context — WHAT it knows about you
<alma_context>
<user_profile>Facts about you. Auto-updated.</user_profile>
<active_context>Current projects, focus areas.</active_context>
<learned_patterns>Patterns discovered from your behavior.</learned_patterns>
<scratchpad>Working memory for current conversation.</scratchpad>
</alma_context>
This section updates itself. A background processor fires every few messages, analyzes the conversation with a lightweight model, and updates your profile, context, and patterns. No manual maintenance.
Context Assembly
Having 13 blocks is meaningless if they blow the context window. The assembler manages a strict token budget:
- Soul blocks always fit — highest priority, never truncated
-
Ranked memories fill up to 50% of remaining budget, scored by:
- Relevance to current topic: 40%
- Importance: 30%
- Recency (7-day half-life exponential decay): 20%
- Access frequency (log scale): 10%
- Episode summaries and procedures fill the remainder
- XML-safe truncation — never cuts mid-tag
If budget runs out, context sections drop first, then style. Soul blocks stay intact.
What changes over time
Week 1: Feels normal. The system silently builds context.
Week 2: The AI stops asking "what language do you use?" Code matches your conventions. Past decisions get referenced naturally.
Month 1: Cross-conversation connections. "This is similar to the approach you decided against for the auth module." That's when it shifts from tool to collaborator.
Try it
This is the core of Alma (alma.olivares.ai). The Soul Engine is fully available on the free tier — all 13 blocks, fully editable. The value appears around day 14 when enough context has accumulated.
What would your 13 blocks look like?
Top comments (0)