WonderLab


Open Source Project of the Day (Part 21): Claude-Mem - Persistent Memory Compression System for Claude Code

Introduction

"Memory is the foundation of intelligence — a true AI assistant is one that still remembers your project after multiple conversations."

This is Part 21 of the "Open Source Project of the Day" series. Today we explore Claude-Mem (GitHub).

When writing code with Claude Code, every new session starts as a blank slate: it doesn't remember what logic you changed last time, what bug you fixed, or what conventions exist in your project. Claude-Mem was built to solve exactly this problem: it automatically captures tool usage and observations during sessions, semantically compresses and summarizes them with AI, and injects relevant context on demand in new sessions — allowing Claude to maintain continuous understanding of your project across multiple sessions, even after reconnecting.

What You'll Learn

  • Claude-Mem's core value: cross-session persistent memory and progressive disclosure
  • How the 5 lifecycle hooks collaborate with the Worker service, SQLite, and Chroma
  • The three-tier retrieval workflow (search → timeline → get_observations) and ~10x token savings
  • How to install, configure, and use it in Claude Code / OpenClaw
  • Comparison with similar "AI memory" solutions and selection criteria

Prerequisites

  • Basic experience using Claude Code (or Claude Desktop for code editing)
  • Understanding of plugins / hooks concepts
  • Basic familiarity with SQLite and vector retrieval (RAG) is helpful but not required

Project Background

Project Introduction

Claude-Mem is a plugin for Claude Code that implements a "persistent memory compression system." It automatically records tool calls and observations from coding sessions, compresses them with AI summarization, stores them locally, and injects relevant context on demand when new sessions start or during ongoing sessions — allowing Claude to continue understanding your current project across sessions and even across device reconnections.

Core problems the project solves:

  • Claude Code has no cross-session memory by default — every new session requires re-introducing the project
  • Complete history from long sessions consumes large amounts of tokens, which is expensive and easily exceeds limits
  • Lack of "on-demand loading" semantic retrieval makes it impossible to precisely recall history relevant to the current task
  • Developers want AI to remember project conventions, fixed bugs, API usage patterns, etc., without manually pasting them every time

Target user groups:

  • Developers who regularly use Claude Code for long-term project development
  • Teams that need AI to remember project context and reduce repeated explanations
  • Technical users interested in AI memory, RAG, and context engineering
  • Users of OpenClaw or similar gateways who want unified memory capabilities

Author/Team Introduction

  • Author: Alex Newman (@thedotmack)
  • Copyright: Copyright (C) 2025 Alex Newman
  • Tech stack: Primarily TypeScript (~81.9%), with JavaScript, Shell, and HTML; built on the Claude Agent SDK
  • Ecosystem: Official documentation at docs.claude-mem.ai, Discord, X @Claude_Memory, and more

Project Stats

  • GitHub Stars: 27.7k+
  • 🍴 Forks: 1.9k+
  • 📦 Version: v10.0.6 (as of February 2026; check GitHub Releases for latest)
  • 📄 License: GNU Affero General Public License v3.0 (AGPL-3.0); the ragtime/ directory is separately licensed under PolyForm Noncommercial 1.0.0
  • 🌐 Website/Docs: claude-mem.ai / docs.claude-mem.ai
  • 💬 Community: GitHub Issues, Discord, X @Claude_Memory

Main Features

Core Purpose

Claude-Mem's core purpose is to provide cross-session persistent memory and on-demand context injection for Claude Code:

  1. Automatic capture: Uses lifecycle hooks to capture tool usage and observations during sessions (e.g., file reads, command execution, edit results, etc.)
  2. AI compression and summarization: Uses Claude (via agent-sdk) to semantically summarize and compress observations, controlling storage and retrieval costs
  3. Persistent storage: Writes sessions, observations, and summaries to SQLite, and uses Chroma for vector retrieval supporting hybrid search
  4. On-demand injection: Injects relevant memories into Claude's context in new sessions or during ongoing sessions based on the current task
  5. Progressive disclosure: Returns lightweight indexes first (e.g., ID lists), then fetches details on demand — significantly reducing token usage (~10x savings)
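
The capture → compress → store → inject pipeline above implies a simple data model: compact summaries that point at detailed records by ID. A minimal TypeScript sketch — the field names here are my assumptions for illustration, not claude-mem's actual schema:

```typescript
// Hypothetical shapes for claude-mem's stored records (illustrative only).
interface Observation {
  id: number;           // referenced later by get_observations / the Web Viewer
  sessionId: string;
  tool: string;         // e.g. "Read", "Edit", "Bash"
  summary: string;      // AI-compressed description of what happened
  createdAt: string;    // ISO timestamp
}

interface SessionSummary {
  sessionId: string;
  overview: string;         // compressed narrative of the whole session
  observationIds: number[]; // pointers back to the detailed records
}

// A summary deliberately stores IDs, not full text: new sessions are seeded
// with the compact overview, and details are fetched by ID only on demand.
function seedContext(s: SessionSummary): string {
  return `${s.overview} (details: ${s.observationIds.length} observations available by ID)`;
}
```

Keeping only IDs in the summary is what makes the later "fetch details on demand" step cheap.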

Use Cases

  1. Continuous multi-session development

    • Fix a bug today, add a feature tomorrow — Claude still remembers previous changes and conventions
    • Switch machines or reconnect without needing to re-explain the project structure
  2. Project conventions and knowledge accumulation

    • Remember "API must include X-API-Key header", "database uses SQLite, schema is at xxx"
    • Use save_memory to proactively save key information for retrieval in future sessions
  3. Bug and change tracing

    • Search history in natural language (e.g., "authentication-related fixes"), view context with timeline
    • Use observation IDs to view full content and references via Web Viewer or API
  4. Team or gateway unified memory (OpenClaw)

    • Install Claude-Mem on an OpenClaw gateway to provide unified persistent memory for multiple users or sessions
  5. Context engineering and cost control

    • Use progressive disclosure strategy to control injected token volume, balancing "remembers everything" with "doesn't overflow context"

Quick Start

In a new Claude Code session, run:

# Add plugin marketplace source
/plugin marketplace add thedotmack/claude-mem

# Install plugin
/plugin install claude-mem

After installation, restart Claude Code — new sessions will automatically include relevant memories from past sessions.

OpenClaw one-click install (on the OpenClaw gateway):

curl -fsSL https://install.cmem.ai/openclaw.sh | bash

The installer handles dependencies, plugin configuration, AI provider configuration, Worker startup, and optional real-time observation streams via Telegram/Discord/Slack.

Core Features

  1. Persistent Memory

    • Context is preserved across sessions; new sessions automatically get summaries of relevant history for the current project/task
  2. Progressive Disclosure

    • Tiered memory retrieval: first get a compact index (~50–100 tokens/entry), then fetch full observations by ID (~500–1000 tokens/entry), significantly reducing token consumption
  3. Skill-based Search (mem-search)

    • Query project history using natural language; supports searching memories directly in Claude Desktop conversations
  4. Web Viewer UI

    • Real-time memory stream and retrieval interface: http://localhost:37777 — view observations, references, and settings
  5. Privacy Controls

    • Content wrapped in a "private" tag (<private>) is excluded from storage — suitable for secrets, passwords, and other sensitive information
  6. Context Configuration

    • Fine-grained control over injected content and behavior via ~/.claude-mem/settings.json (model, port, data directory, log level, etc.)
  7. References and Traceability

    • Observations have IDs; view original content and references via http://localhost:37777/api/observation/{id} or Web Viewer
  8. Beta Capabilities

    • Such as Endless Mode (biologically-inspired memory architecture for longer sessions); toggle between stable/beta in Web Viewer → Settings
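
The observation endpoint mentioned under "References and Traceability" can also be called programmatically. A small TypeScript sketch — the URL pattern comes from the docs above, but the response shape is an assumption:

```typescript
// Sketch: fetching one observation from the local Worker's HTTP API.
const WORKER_BASE = "http://localhost:37777";

function observationUrl(id: number): string {
  return `${WORKER_BASE}/api/observation/${id}`;
}

async function fetchObservation(id: number): Promise<unknown> {
  const res = await fetch(observationUrl(id));
  if (!res.ok) throw new Error(`Worker returned ${res.status}`);
  return res.json(); // full observation content and references (shape assumed)
}
```

Running this against a live Worker naturally requires claude-mem to be installed with the Worker listening on its default port.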

Project Advantages

| Comparison | Claude-Mem | Manual history/rules files | Other memory plugins (if any) |
| --- | --- | --- | --- |
| Cross-session memory | Automatic, on-demand injection | Must paste manually each time | Implementation-dependent |
| Token usage | Progressive disclosure, ~10x savings | Paste entire history, easily overflows | Implementation-dependent |
| Retrieval method | Semantic + keyword hybrid (Chroma) | None | Mostly keyword or simple vector |
| Claude Code integration | Deep integration (hooks + MCP) | No integration | Varies |
| Privacy | "private" tag exclusion | Fully local control | Depends on whether it's local |

Why choose Claude-Mem?

  • Designed specifically for Claude Code, deeply integrated with session lifecycle and MCP toolchain
  • Clear documentation and architecture (lifecycle hooks, Worker, DB, Search all explained), easy to understand and extend
  • Open source, active (27k+ Stars), sustainable community and iteration
  • Supports OpenClaw and similar gateways for team or gateway-level deployment

Detailed Project Analysis

Architecture Overview

Claude-Mem's architecture follows one main flow: hook collection → Worker service → storage & retrieval → injection & search.

  1. Lifecycle Hooks (5+1)
    Execute scripts at key moments, passing "what happened" to the Worker:

    • SessionStart: Session begins, can initialize or load the last summary
    • UserPromptSubmit: Runs when the user submits a prompt, before Claude processes it
    • PostToolUse: After each tool call, submits an observation (tool name, inputs/outputs, etc.)
    • Stop: Session stopped by the user
    • SessionEnd: Session ends normally

    There's also a Smart Install pre-check script (dependencies, etc.) that doesn't belong to the lifecycle above but runs during the install/startup process.
  2. Worker Service
    A persistent HTTP service on default port 37777, managed by Bun. Provides:

    • Web Viewer (memory stream, observation list, settings)
    • ~10 search/write-related APIs (search, fetch observations by ID, write memories, etc.)
    • Bridge between hooks and MCP tools
  3. SQLite Database
    Stores structured data: sessions, observations, summaries, etc. Also uses FTS5 for full-text search supporting the keyword retrieval path.

  4. Chroma Vector Store
    Stores embeddings for semantic retrieval; combined with SQLite to form hybrid search (keyword + semantic), improving recall quality.

  5. mem-search Skill
    Exposes "search memories with natural language" capability to users/Claude, internally using MCP tools and following progressive disclosure (index first, then details).

  6. MCP Tools
    Allows Claude to actively query and write memories during conversation, strictly following the three-tier workflow "search → timeline → get_observations" to control token usage.
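
The hybrid search in step 4 above has to merge a keyword (FTS5) ranking with a semantic (Chroma) ranking somehow. One common technique for this is reciprocal rank fusion — shown here purely as an illustration of the idea, not as claude-mem's documented merging algorithm:

```typescript
// Illustrative hybrid-search merge: combine a keyword (FTS5) result list and a
// semantic (vector) result list with reciprocal rank fusion (RRF).
function rrfMerge(keywordIds: number[], semanticIds: number[], k = 60): number[] {
  const score = new Map<number, number>();
  for (const [rank, id] of keywordIds.entries()) {
    score.set(id, (score.get(id) ?? 0) + 1 / (k + rank + 1));
  }
  for (const [rank, id] of semanticIds.entries()) {
    score.set(id, (score.get(id) ?? 0) + 1 / (k + rank + 1));
  }
  // Highest combined score first; IDs found by both paths float to the top.
  return [...score.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```

The appeal of RRF is that it needs no score normalization across the two very different retrieval paths: an observation that appears in both lists reliably outranks one found by only one path.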

Three-Tier Retrieval Workflow (MCP Search Tools)

To save tokens, Claude-Mem breaks "searching memories" into three steps, only fetching full text for observations that are truly needed:

  1. search
    Search the memory index with natural language or keywords, returning a compact list (with ID, type, timestamp, summary, etc.) at ~50–100 tokens/entry.

  2. timeline
    Around a specific observation or query, retrieve surrounding context in chronological order — still a relatively compact temporal view, useful for deciding "whether to expand."

  3. get_observations
    Based on the IDs filtered in the previous two steps, batch-fetch full observation content (~500–1000 tokens/entry). Only called when relevance is confirmed, achieving ~10x overall token savings.

Additionally:

  • save_memory: Actively write a memory entry (e.g., API conventions, project decisions), making it available for future semantic retrieval.
  • __IMPORTANT: Workflow instructions, always visible to Claude, guiding it to use the three-tier process and avoid bulk fetching full text at once.

Example (logical illustration, not directly executable code):

// 1. Search the index first
search({ query: "authentication bug", type: "bugfix", limit: 10 })

// 2. Check the timeline (optional)
timeline({ observation_id: 123 })

// 3. Fetch full text only for the IDs you need
get_observations({ ids: [123, 456] })

// Manually save a memory entry
save_memory({ text: "API requires auth header X-API-Key", title: "API Auth" })
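
The ~10x figure follows from simple arithmetic over the per-entry costs quoted above. A back-of-the-envelope TypeScript sketch — 75 and 750 tokens are midpoints of the documented ranges, and the candidate counts are illustrative:

```typescript
// Token arithmetic for progressive disclosure vs. dumping full history.
const INDEX_TOKENS = 75;  // midpoint of the ~50-100 tokens/entry index range
const FULL_TOKENS = 750;  // midpoint of the ~500-1000 tokens/entry full range

// Naive approach: inject every full observation into context.
function naiveCost(candidates: number): number {
  return candidates * FULL_TOKENS;
}

// Progressive disclosure: scan the compact index, expand only what's relevant.
function progressiveCost(candidates: number, expanded: number): number {
  return candidates * INDEX_TOKENS + expanded * FULL_TOKENS;
}
```

For example, scanning 30 index entries and expanding only 2 costs 3,750 tokens versus 22,500 for injecting all 30 full observations — a 6x saving here, approaching the documented ~10x when fewer results need expansion.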

Configuration and Extension

  • Configuration: ~/.claude-mem/settings.json — configure AI model, Worker port, data directory, log level, injection strategy, and more.
  • Extension:
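
As a rough illustration, the options described above might look like this in ~/.claude-mem/settings.json — the key names here are guesses based on that description, not claude-mem's documented schema:

```json
{
  "model": "claude-sonnet-4-5",
  "workerPort": 37777,
  "dataDir": "~/.claude-mem/data",
  "logLevel": "info"
}
```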

System Requirements and Dependencies

  • Node.js 18.0.0+
  • Claude Code: Latest version supporting plugins
  • Bun: Worker and process management (auto-installed if missing)
  • uv: Python dependencies for vector search (auto-installed if missing)
  • SQLite 3: Local persistence (usually pre-installed)


Who Should Use This

  • Everyday Claude Code developers: Want to reduce repeated explanations and let AI remember the project and decisions
  • Team or gateway administrators: Provide unified persistent memory in OpenClaw and similar environments
  • People interested in AI memory, RAG, and context engineering: Can learn from the hook design, hybrid retrieval, and progressive disclosure implementation
  • Teams needing references and audit trails: Trace "what the AI saw at the time" through observation IDs and Web Viewer

Feel free to visit my personal homepage for more useful knowledge and interesting products.
