I need to be honest about something upfront.
I didn't invent persistent memory for Claude Code. By April 2026, there are tools with 46,000+ GitHub stars solving this problem. There are 700,000+ skills indexed on aggregators. The ecosystem is massive.
What I did is something different. I spent four months running Claude Code across 7 projects simultaneously — a content production platform, a marketing site, a backend API, an open-source toolkit, and three more. Real businesses. Real clients. Real deadlines. And I assembled the memory system that actually survives this kind of workload.
Claude Memory Kit v3 is not a research project. It's the system I use every single day, and I'm writing this to explain what's in it and where every piece came from.
## Where The Pieces Come From
I want to be transparent about provenance. This kit is a curated assembly as of April 10, 2026:
From Andrej Karpathy — the core architecture. His LLM Knowledge Base idea: treat your conversations as source code, let an LLM compile them into structured knowledge. Daily logs = source. LLM = compiler. Knowledge articles = executable output. This isn't my idea. It's his. I just built a working implementation.
From Cole Medin (claude-memory-compiler) — three specific features I ported into v3:
- SessionStart injection — the hook that pre-loads your knowledge index into every session
- End-of-day auto-compile — daily logs become wiki articles after 6 PM automatically
- The `CLAUDE_INVOKED_BY` recursion guard — prevents infinite loops when the pipeline calls Claude, which triggers hooks, which call Claude again
Cole built the original Karpathy-inspired prototype. I ported the three features that mattered most and rewrote them for Python stdlib (no uv, no agent-sdk, just subprocess and os).
From Anthropic's own engineers — the hook system itself. Claude Code's PreToolUse, PostToolUse, SessionStart, SessionEnd hooks are what make all of this possible. The additionalContext pattern that lets you inject data at session start? That's Anthropic's API. I just use it aggressively.
From the community — the pre-compact blocking pattern (I first saw it in a GitHub issue requesting layered memory), the periodic-save idea (adapted from multiple "I lost everything after compaction" horror stories), and the [[wikilink]] format for knowledge articles (Obsidian community standard).
From my own production use — everything else. The 5-layer context pyramid. The multi-project `<!-- PROJECT:name -->` tags. The 200-line MEMORY.md cap (learned the hard way — Claude starts ignoring entries after ~200 lines). The experiment sandbox pattern. The daily log → knowledge compilation pipeline that actually runs unattended. The 50-exchange periodic save interval (not 15, not 100 — 50 is the sweet spot after months of tuning).
I'm not claiming to be first. I'm claiming this combination works in production, and I can show you why.
## My Setup: 7 Projects, One System
Here's what I actually run on this right now:
| Project | What it is | Role |
|---|---|---|
| Content platform | 7-stage article generation pipeline | S1-S7 stages, $0.08/article production cost |
| Marketing site | Landing page + content | TanStack Start, Vercel |
| Backend API | NestJS service for clients | Code review, PR workflow |
| CLI tool | Content quality evaluation engine | Burstiness, AI detection, SEO scoring |
| Open source toolkit | This kit + skills + starter templates | Distribution, community |
| R&D hub | Coordination across all projects | Research, decisions, methodology |
| Design collaboration | UI concepts for client projects | Design tokens, visual QA |
Every single one of these runs on Memory Kit. Each project has its own CLAUDE.md, its own rules, its own backlog. But they all share the same architecture — the same hooks, the same knowledge pipeline, the same session lifecycle.
When I switch from the content platform to the marketing site, Claude doesn't ask "what's this project about?" It reads the project's backlog and picks up where I left off. When I find a pattern in one project that applies to another, I write it once in MEMORY.md and it's available everywhere.
This isn't a demo. This is my Tuesday.
## What's Actually Inside (10 Components)
| Component | File | One-line explanation |
|---|---|---|
| Brain | `CLAUDE.md` | Who the agent is, how it behaves, session workflow |
| Hot Memory | `.claude/memory/MEMORY.md` | Fast-access patterns, < 200 lines, loaded every session |
| Deep Memory | `knowledge/` | Wiki articles with [[wikilinks]] — auto-compiled from your work |
| Hooks | `.claude/hooks/` | 5 scripts that fire automatically at key moments |
| Rules | `.claude/rules/` | Your domain conventions — brand voice, client specs, workflow rules |
| Commands | `.claude/commands/` | `/memory-compile`, `/memory-lint`, `/memory-query` |
| Projects | `projects/X/BACKLOG.md` | Per-project task queue with inline decisions |
| Context Hub | `context/next-session-prompt.md` | "Pick up exactly here" — with `<!-- PROJECT:name -->` sections |
| Daily Logs | `daily/` | Automatic session transcripts — you never write these |
| Experiments | `experiments/` | Sandbox for research before committing to a path |
Everything is plain Markdown. No database. No external services. If anything breaks, git checkout fixes it. I chose this deliberately — after evaluating SQLite-based solutions, vector embeddings, and graph databases, plain text won because it's the only format that survives everything: git, backups, editor changes, Claude Code updates.
## The 5 Hooks (This Is Where The Magic Lives)
Hooks are the reason this system works without you thinking about it. Each one fires at a specific moment and does one job.
### Hook 1: Session Start (`session-start.py`)
When: Every time you open Claude Code
What it does: Injects your knowledge index, recent daily logs, and top 3 most recent concept articles into the session. 50K character budget — on Opus 4.6's 1M context window, that's ~5%.
Why it matters: Claude starts every session already knowing what articles exist in your wiki, what you worked on yesterday, and what your key patterns are. No "read my files" prompt needed.
Origin: Adapted from Cole Medin's claude-memory-compiler, rewritten in Python stdlib.
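To make Hook 1 concrete, here is a minimal Python-stdlib sketch of what a SessionStart injection hook can look like. The output shape (`hookSpecificOutput.additionalContext`) follows Claude Code's documented hook JSON; the file list and the budget handling are illustrative, not the kit's exact code.

```python
import json
import os
import sys

def build_injection(paths, budget=50_000):
    """Concatenate memory files into one string, stopping at the
    character budget so the injection stays a small slice of context."""
    parts, used = [], 0
    for path in paths:
        if not os.path.exists(path):
            continue
        with open(path, encoding="utf-8") as f:
            text = f.read()
        take = text[:budget - used]
        if not take:
            break  # budget exhausted
        parts.append(take)
        used += len(take)
    return {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": "\n\n".join(parts),
        }
    }

if __name__ == "__main__":
    # File names here are illustrative, not the kit's actual layout.
    payload = build_injection([
        ".claude/memory/MEMORY.md",
        "context/next-session-prompt.md",
    ])
    json.dump(payload, sys.stdout)  # Claude Code reads hook JSON from stdout
```

The budget-first design matters more than the details: whatever doesn't fit gets truncated rather than blowing the context.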
### Hook 2: Periodic Save (`periodic-save.sh`)
When: Every 50 exchanges (configurable)
What it does: Blocks Claude and forces it to save: update MEMORY.md with new patterns, update next-session-prompt with current state, update BACKLOG task statuses.
Why it matters: I lost 3 hours of work in month one because a long session compacted without saving. Never again. 50 is the sweet spot — 15 was too noisy, 100 was too risky.
Origin: My own pain. Tuned over ~60 production sessions.
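The shipping hook is a shell script, but the counting logic fits in a few lines of Python. This sketch assumes the hook fires once per exchange and uses exit code 2 plus a stderr message to push an instruction back to the model; the counter path and the wording are invented for illustration.

```python
import os
import sys

def bump_counter(path, interval=50):
    """Increment a persistent exchange counter; return True when a save is due."""
    count = 0
    if os.path.exists(path):
        with open(path) as f:
            count = int(f.read().strip() or 0)
    count += 1
    with open(path, "w") as f:
        f.write(str(count))
    return count % interval == 0

if __name__ == "__main__":
    if bump_counter(".claude/state/exchange-count"):
        # Exit code 2 surfaces stderr back to the model, which is how a
        # hook issues an instruction rather than failing silently.
        print("Save checkpoint: update MEMORY.md, next-session-prompt.md, "
              "and BACKLOG task statuses before continuing.", file=sys.stderr)
        sys.exit(2)
```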
### Hook 3: Pre-Compact Guard (`pre-compact.sh`)
When: Before Claude Code compresses the context window
What it does: Checks if MEMORY.md was updated in the last 2 minutes. If not, blocks compaction. Claude must save first.
Why it matters: Compaction is where memory dies. This hook is a seatbelt. The 2-minute window is tight on purpose — it forces the agent to actually touch memory files right before compression, not rely on a save from an hour ago.
Origin: Community request pattern from GitHub issue #27298 about layered memory loss.
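The freshness check itself is tiny. Here is the same logic sketched in Python (the kit ships it as shell); that PreCompact honors exit code 2 the way PreToolUse does is my reading of the blocking behavior, so treat it as a sketch.

```python
import os
import sys
import time

def memory_is_fresh(path, window_seconds=120, now=None):
    """True if the file was modified within the last `window_seconds`."""
    now = time.time() if now is None else now
    try:
        return (now - os.path.getmtime(path)) < window_seconds
    except OSError:  # missing file counts as stale
        return False

if __name__ == "__main__":
    if not memory_is_fresh(".claude/memory/MEMORY.md"):
        print("MEMORY.md not touched in the last 2 minutes - "
              "save memory before compacting.", file=sys.stderr)
        sys.exit(2)  # block the compaction until memory is saved
```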
### Hook 4: Session End (`session-end.sh`)
When: Claude Code process terminates
What it does: Extracts the last 100 turns from the transcript, spawns flush.py in background. flush.py uses claude -p (your existing subscription, zero extra cost) to distill the session into structured Markdown, appends to daily/YYYY-MM-DD.md.
Why it matters: Every conversation becomes searchable history. You don't do anything — it captures automatically.
Origin: flush.py logic adapted from Cole Medin's extractor, rewritten for Python subprocess (no agent-sdk dependency).
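Roughly what the SessionEnd side can look like in stdlib Python. The JSONL transcript format and the exact handoff to flush.py are assumptions for illustration; the detach-and-tag pattern is the part that matters.

```python
import json
import os
import subprocess
import sys

def last_turns(transcript_path, n=100):
    """Read a JSONL transcript and keep only the last n entries."""
    with open(transcript_path, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]
    return [json.loads(line) for line in lines[-n:]]

def spawn_flush(turns, flush_script="flush.py"):
    """Launch the distiller detached, so SessionEnd returns immediately.
    CLAUDE_INVOKED_BY marks the child so its own hooks exit early."""
    env = dict(os.environ, CLAUDE_INVOKED_BY="memory-kit")
    proc = subprocess.Popen(
        [sys.executable, flush_script],
        stdin=subprocess.PIPE,
        env=env,
        start_new_session=True,  # survive the parent process exiting
    )
    proc.stdin.write(json.dumps(turns).encode())
    proc.stdin.close()
    return proc
```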
### Hook 5: Test Protection (`protect-tests.sh`)
When: Any time Claude tries to edit an existing test file
What it does: Blocks the edit. Claude can create new tests, but can't modify existing ones.
Why it matters: When tests fail, the instinct (for both humans and AI) is to "fix the test." This hook forces fixing the implementation instead. Sounds minor, but it saved me from subtle regressions at least twice.
Origin: My own rule after Claude "fixed" a test by relaxing the assertion.
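A sketch of such a PreToolUse guard in Python. The tool names and the test-file regex are assumptions you would tune to your own stack; exit code 2 with a stderr message is the documented way for a hook to deny a tool call and tell the model why.

```python
import json
import os
import re
import sys

# Pattern is illustrative - adjust to your stack's test naming convention.
TEST_FILE = re.compile(r"(^|/)(test_[^/]*\.py|[^/]*\.(test|spec)\.(ts|js))$")

def should_block(tool_name, file_path):
    """Deny edits to test files that already exist; new test files are fine."""
    if tool_name not in ("Edit", "Write", "MultiEdit"):
        return False
    if not file_path or not TEST_FILE.search(file_path):
        return False
    return os.path.exists(file_path)  # existing test -> block the edit

if __name__ == "__main__":
    event = json.load(sys.stdin)  # PreToolUse hooks get the tool call as JSON
    path = event.get("tool_input", {}).get("file_path", "")
    if should_block(event.get("tool_name", ""), path):
        print(f"Blocked: {path} is an existing test. "
              "Fix the implementation instead.", file=sys.stderr)
        sys.exit(2)  # exit 2 = deny; stderr is fed back to the model
```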
## The Knowledge Pipeline (Karpathy's Architecture In Practice)
This is the most powerful part, and it's the piece that comes directly from Karpathy's insight.
The chain:
You have a conversation → Hook captures it → flush.py distills it into structured notes → Appended to daily/2026-04-10.md → After 6 PM, compile.py transforms daily logs into wiki articles with YAML frontmatter and [[wikilinks]] → Next session starts with the updated wiki catalog already injected.
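The compile step of that chain can be sketched like this. Only the use of `claude -p` and the env-var guard come from the kit as described; the prompt wording, file naming, and the 6 PM gate implementation are illustrative.

```python
import datetime
import os
import subprocess

def build_prompt(log_text):
    """Ask for a wiki article with frontmatter and wikilinks (wording illustrative)."""
    return ("Compile this daily log into a knowledge article with YAML "
            "frontmatter and [[wikilinks]]. Output Markdown only.\n\n" + log_text)

def compile_today(daily_dir="daily", after_hour=18):
    """After 6 PM, distill today's log via `claude -p` on the existing subscription."""
    if datetime.datetime.now().hour < after_hour:
        return None  # too early - the day isn't over yet
    log_path = os.path.join(daily_dir, f"{datetime.date.today():%Y-%m-%d}.md")
    if not os.path.exists(log_path):
        return None  # nothing captured today
    with open(log_path, encoding="utf-8") as f:
        log = f.read()
    env = dict(os.environ, CLAUDE_INVOKED_BY="memory-kit")  # recursion guard
    result = subprocess.run(["claude", "-p", build_prompt(log)],
                            capture_output=True, text=True, env=env)
    return result.stdout  # the compiled article, ready to write to knowledge/
```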
The key insight: you don't organize your knowledge. You have conversations, and the LLM handles the synthesis, cross-referencing, and categorization. After a few weeks, you have a personal wiki that grew entirely from your work.
Cost: $0 extra. The pipeline uses claude -p which runs on your existing Max/Pro subscription. No API key charges. I verified this by running it for a month and checking my billing — zero incremental cost.
Safety: recursion guard. flush.py calls claude -p, which starts a new Claude session, which fires SessionEnd hook, which would call flush.py again — infinite loop. The CLAUDE_INVOKED_BY env var breaks the cycle. Every hook checks it at the top and exits if set. Took me an afternoon to debug the first time it happened. Now it's documented and automatic.
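The guard really is about three lines. A sketch, with the env var's value invented (any non-empty string works, since hooks only check presence):

```python
import os
import sys

def invoked_by_pipeline():
    """True when this process was spawned by the memory pipeline itself."""
    return bool(os.environ.get("CLAUDE_INVOKED_BY"))

if __name__ == "__main__":
    if invoked_by_pipeline():
        sys.exit(0)  # pipeline-spawned session: skip the hook, break the loop
    # ...normal hook body runs only in real user sessions...
```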
## The Context Pyramid (Why Not Load Everything?)
| Layer | What | When | Size |
|---|---|---|---|
| L1: Auto | CLAUDE.md + rules + MEMORY.md + SessionStart injection | Every session | ~50K chars |
| L2: Start | next-session-prompt.md | First thing the agent reads | ~2-5K chars |
| L3: Project | BACKLOG.md for current project | When you start working | ~5-20K chars |
| L4: Wiki | Knowledge articles | On-demand (agent knows they exist from L1 index) | Unlimited |
| L5: Raw | daily/ logs | Never read directly — source material for pipeline | Unlimited |
Why the pyramid? Because I tried loading everything. With 7 projects, that's 50+ files. Claude's context filled up in 10 minutes, compacted, and lost the work context. The pyramid ensures 95% of the context window is available for actual work, and the remaining 5% is the right context at the right time.
## For Marketers (Why I'm Writing This For You)
I've been watching this space closely. By April 2026, there are:
- 46,000+ stars on claude-mem (the most popular memory tool)
- 143,000+ stars on Superpowers (the most popular framework)
- 700,000+ skills indexed on SkillsMP
- 107,000+ skills on agentskill.sh, including 25,000+ marketing-specific ones
- Free courses at cc4.marketing and ccforeveryone.com
- An official Anthropic Marketing Plugin
The ecosystem is enormous and growing. So why am I writing this?
Because I noticed a gap. Most of these tools are built by developers, for developers. The marketing skills exist, but nobody is showing how to connect them into a system that remembers your clients, your brand guidelines, and your content strategy across sessions.
A skill that writes SEO content is great. But if it forgets your client's brand voice every time you restart Claude Code, it's just a fancy prompt template.
Memory Kit isn't a skill. It's the operating system layer that makes your skills, your rules, and your context persist. When you install a marketing skill, Memory Kit ensures Claude remembers how to use it the way you use it — with your client data, your conventions, your history.
Practical example: You install an SEO audit skill. First run, you explain your client's niche, target keywords, and competitor URLs. Without Memory Kit, next session you explain it again. With Memory Kit, that context lives in rules/client-name.md and knowledge/concepts/client-seo-strategy.md. Every future audit starts with full context.
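For a sense of what such a rules file looks like, here is a hypothetical `rules/acme-corp.md`. Every name, keyword, and rule below is invented for illustration; the real file is whatever you tell Claude about your client.

```markdown
---
client: Acme Corp
updated: 2026-04
---

# Acme Corp: SEO and voice rules

- Voice: plain, confident, no exclamation marks
- Target keywords: see [[acme-seo-strategy]]
- Competitors to benchmark: competitor-a.com, competitor-b.com
- Never mention pricing without legal review
```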
## "Do I Need to Know How to Code?"
No. And I mean that literally.
After the 3-command install, Claude asks you 5 questions in plain language: project name, your name, language preference, project description, starting fresh or importing existing work. Then it configures everything.
From that point:
- "Write three emails for the Acme campaign" — works
- "What do we know about our SEO gaps?" — Claude searches its memory
- "Save what we discussed" — Claude updates memory and context
- `/tour` — Claude walks you through every file, explains what each one does
The files are plain Markdown. If you've used Notion, you can read these. But you don't have to — Claude manages them.
## Honest Comparison
I'm not going to pretend alternatives don't exist. Here's where things stand:
| | Memory Kit | claude-mem (46K stars) | Cog | Built-in CLAUDE.md |
|---|---|---|---|---|
| Setup | 3 commands, 5 min | Plugin install | Clone + configure | Already there |
| Auto-learns | Yes (hooks + compile) | Yes (SQLite + embeddings) | Manual conventions | No |
| Dependencies | Zero (Python stdlib) | TypeScript + SQLite | Zero | None |
| Multi-project | Built-in (PROJECT tags) | Single project | Single project | Single file |
| Knowledge format | Wiki with wikilinks | Compressed vectors | Filesystem | Flat file |
| Pipeline | Karpathy-style compile | AI compression | None | None |
| Stars | 6 | 46,000+ | ~2,000 | N/A |
Yes, 6 stars. I'm not hiding it. claude-mem has 7,700x more stars and a beautiful marketing site. If you want the most popular option, go there.
Memory Kit's edge: multi-project structure (PROJECT tags, per-project backlogs, shared memory), zero dependencies (no npm, no SQLite, no TypeScript runtime), and the Karpathy-style knowledge compilation pipeline that turns raw conversations into structured, cross-referenced wiki articles.
If you work on one project, claude-mem is probably simpler. If you juggle multiple clients, campaigns, or products — that's where Memory Kit was built and tested.
## Getting Started
### You need
- Claude Code CLI
- Claude Pro ($20/mo) or Max subscription
- A terminal (Terminal on Mac, WSL2 on Windows)
### Install

```shell
git clone https://github.com/awrshift/claude-memory-kit.git my-project
cd my-project
claude
```
### First 10 minutes
- Answer Claude's 5 setup questions — name, project, language, description, fresh/existing
- Type `/tour` — Claude walks through every file with interactive explanations
- Tell Claude about your first client — "Create a rule for [client name] with this brand voice: [paste guidelines]"
- Start working normally — the hooks handle everything else
After a few sessions, check daily/ — you'll see conversation logs appearing. After a few days, check knowledge/ — structured articles growing from your work.
## What I Learned Building This
A few things I didn't expect:
The 200-line limit is real. MEMORY.md is auto-loaded every session. Anthropic truncates after ~200 lines. I hit this wall at month two with 180 entries. Now I aggressively move detailed patterns into wiki articles and keep MEMORY.md as an index. The [YYYY-MM] date tag on every entry helps — I can prune old patterns that no longer apply.
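This is not the kit's actual `/memory-lint` command, just the core check it implies, sketched so you can see how cheap the enforcement is:

```python
def memory_lint(path, cap=200):
    """Return (ok, line_count); flag when MEMORY.md drifts past the cap
    where entries start being ignored."""
    with open(path, encoding="utf-8") as f:
        count = sum(1 for _ in f)
    return count <= cap, count
```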
50 exchanges is the save interval sweet spot. At 15 (v2 default), Claude saved too often — broke flow. At 100+, I lost significant context after compaction. 50 exchanges is roughly 30-45 minutes of focused work. Enough to accumulate meaningful patterns, not so long that you risk losing them.
Plain text beats everything. I evaluated SQLite, vector embeddings, Neo4j graph. Plain Markdown won because: git tracks changes, any editor reads it, Claude Code's Read/Write tools handle it natively, and it survives every upgrade. When claude-mem upgrades their schema, you migrate. When I upgrade Memory Kit, you git pull.
The recursion bug was terrifying. First time flush.py triggered an infinite loop — claude -p spawning claude -p spawning claude -p — I had 47 Claude processes running before I killed them. The CLAUDE_INVOKED_BY guard was born that evening. It's three lines of code that prevent infinite recursion across the entire pipeline. If you build anything that spawns claude -p from hooks, steal this pattern.
The repo is at github.com/awrshift/claude-memory-kit. MIT license. I use it every day. If you try it, let me know what works and what doesn't — the best improvements to this system came from actual production use, not theory.
Third article in the "Claude Code for the Rest of Us" series. I'm @pmserhii — I build open-source AI tools at awrshift and run production content pipelines. This is what I actually use.