Build a Self-Improving Memory Layer for Claude Code with Hooks and RAG


Implement automatic hooks to capture Claude Code's work into a ChromaDB vector store and a CLAUDE.md file, creating a persistent, searchable memory for your project.

The Technique: Automatic Knowledge Capture with Hooks

The core innovation is using Claude Code's hook system to intercept events and automatically log them to a persistent knowledge base. This solves the "stateless AI" problem where every session starts fresh, forcing you to re-debug the same issues.

The author built a three-layer system:

ChromaDB Vector Store: For semantic search across captured errors, fixes, and learnings.
Graph Memory: To track relationships (e.g., Error → occurred_in → File).
CLAUDE.md File: A living project file updated after each session.

The magic is in the automation. You don't manually type /learn. Scripts watch Claude Code's activity and save the important bits.
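To make the three layers concrete, here is a minimal sketch of what one captured event might look like before it is written anywhere. The `MemoryRecord` class, its field names, and the `to_triples` and `to_markdown` helpers are hypothetical, chosen for illustration; the original does not specify a schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MemoryRecord:
    """One captured event (hypothetical schema, not taken from the original)."""
    kind: str        # e.g. "error", "fix", "learning"
    text: str        # free text that a vector store would embed
    file: str = ""   # file the event relates to, if any
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_triples(self):
        """Express the record as (subject, relation, object) edges for a graph layer."""
        triples = []
        if self.file:
            triples.append((self.text, "occurred_in", self.file))
        return triples

    def to_markdown(self):
        """Render the record as a bullet suitable for CLAUDE.md."""
        location = f" ({self.file})" if self.file else ""
        return f"- [{self.kind}] {self.text}{location}"


record = MemoryRecord(kind="error", text="JWT expired during login", file="auth/session.py")
print(record.to_triples())
print(record.to_markdown())
```

The same record feeds all three layers: `text` would be embedded for semantic search, `to_triples()` yields graph edges, and `to_markdown()` yields a line for CLAUDE.md.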

Why It Works: Turning Ephemeral Sessions into Lasting Knowledge

Claude Code is powerful but forgetful. Its context is limited to the current session. This system externalizes that context into a searchable format. When you ask, "Have we seen this auth error before?" the RAG system can find the exact fix from two weeks ago.

The three-layer approach is key:

ChromaDB finds semantically similar issues, even if the phrasing differs.
Graph Memory answers relational questions like "What files are most error-prone?"
CLAUDE.md gives Claude immediate, high-priority context at the start of every new session, loaded automatically.
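As an illustration of the relational layer, the question "What files are most error-prone?" reduces to counting edges per file. The sketch below uses a plain list of tuples as a stand-in for a graph store; the sample edges and the `most_error_prone` function are made up for the example and are not the author's implementation.

```python
from collections import Counter

# Hypothetical (subject, relation, object) edges; real ones would come from captured events.
triples = [
    ("JWT expired", "occurred_in", "auth/session.py"),
    ("KeyError: user_id", "occurred_in", "auth/session.py"),
    ("Timeout on /orders", "occurred_in", "api/orders.py"),
    ("JWT expired", "fixed_by", "refresh token before request"),
]


def most_error_prone(edges, top_n=3):
    """Count 'occurred_in' edges per file and return the most frequent files."""
    counts = Counter(obj for _, relation, obj in edges if relation == "occurred_in")
    return counts.most_common(top_n)


print(most_error_prone(triples))
# [('auth/session.py', 2), ('api/orders.py', 1)]
```

A vector store cannot answer this kind of aggregate question directly, which is why the relational layer is kept separate from semantic search.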

How To Apply It: Start with a Simple Hook

You don't need to build the full three-layer system immediately. Start by automating your CLAUDE.md file. Here’s a basic session_summary.py hook you can adapt:

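# session_summary.py
# Illustrative hook logic. The on_stop(session_history, project_root) signature
# is a simplification, not a documented Claude Code callback: Claude Code runs
# hooks as shell commands, so this function needs a small wrapper script.
from datetime import datetime
from pathlib import Path

def on_stop(session_history, project_root):
    """Append a short summary of the most recent messages to CLAUDE.md."""
    # Accept either a str or a Path for the project root
    claude_md_path = Path(project_root) / "CLAUDE.md"

    # Use the last few messages as a rough record of the session
    recent_activity = session_history[-5:]  # Get last 5 messages

    # One bullet per message, truncated to 100 characters
    activity_summary = '\n'.join(
        f"- {msg['role']}: {str(msg['content'])[:100]}..." for msg in recent_activity
    )

    summary = """## Session Summary - {date}

### What We Did
{activity_summary}

### Key Learnings / Pitfalls
- Add key insights here from the session.
""".format(
        date=datetime.now().strftime("%Y-%m-%d"),
        activity_summary=activity_summary
    )

    # Append to CLAUDE.md (the file is created if it does not exist yet)
    with open(claude_md_path, 'a', encoding='utf-8') as f:
        f.write(f"\n\n{summary}")

    print(f"[Hook] Session summary appended to {claude_md_path}")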

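To use this:

Save the script somewhere stable and register it as a hook command in your Claude Code settings; hooks are configured there rather than discovered from a directory.
Ensure CLAUDE.md exists in your project root.
Pick the event to attach it to. In the Claude Code hook documentation, Stop fires each time Claude finishes responding, while SessionEnd fires when the session closes (for example with Ctrl+D or /exit), so choose SessionEnd if you want one summary per session.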
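Because Claude Code invokes hooks as shell commands and passes the event as JSON on standard input, a standalone script is closer to what actually runs. The sketch below assumes the payload carries `cwd` and `transcript_path` fields and that the transcript is a JSON Lines file whose entries hold a `message` object with `role` and `content`; treat that transcript layout as an assumption and verify it against your installed version. All function names are hypothetical.

```python
import json
import sys
from datetime import datetime
from pathlib import Path


def text_of(content):
    """Flatten message content that may be a string or a list of content blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = [block.get("text", "") for block in content if isinstance(block, dict)]
        return " ".join(part for part in parts if part)
    return ""


def load_recent_messages(transcript_path, limit=5):
    """Read the last `limit` user/assistant messages from a JSON Lines transcript."""
    path = Path(transcript_path)
    if not path.is_file():
        return []
    messages = []
    for line in path.read_text(encoding="utf-8").splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        message = entry.get("message") if isinstance(entry, dict) else None
        if isinstance(message, dict) and message.get("role") in ("user", "assistant"):
            text = text_of(message.get("content"))
            if text:
                messages.append({"role": message["role"], "content": text})
    return messages[-limit:]


def append_summary(payload):
    """Append a dated summary section to CLAUDE.md in the payload's working directory."""
    project_root = Path(payload.get("cwd", "."))
    recent = load_recent_messages(payload.get("transcript_path", ""))
    bullets = "\n".join(f"- {m['role']}: {m['content'][:100]}" for m in recent)
    section = (
        f"\n\n## Session Summary - {datetime.now():%Y-%m-%d}\n\n"
        f"### What We Did\n{bullets}\n"
    )
    target = project_root / "CLAUDE.md"
    with open(target, "a", encoding="utf-8") as f:
        f.write(section)
    return target


def main():
    """Entry point for use as a hook command: read the event JSON from stdin."""
    raw = sys.stdin.read()
    if raw.strip():
        append_summary(json.loads(raw))
```

To run it as a hook, call `main()` at the bottom of the file and register the script in a Claude Code settings file (for example `.claude/settings.json`) under a `hooks` entry keyed by the event name, with a command such as `python3 /path/to/session_summary.py`.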

For the next level, implement an error-capturing hook using the PostToolUse event. The source provides a blueprint:

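# capture_failure.py (PostToolUse hook)
import json
from datetime import datetime

def capture_failure(tool_result):
    """Append one JSON line per failed shell command to debug_log.json.

    tool_result is an illustrative object with exit_code, output, command and
    file attributes; adapt the field names to the data your hook receives.
    """
    if tool_result.exit_code != 0:  # A non-zero exit code means the command failed
        error = tool_result.output[:500]  # Keep only the first 500 characters
        # Log to a simple JSON Lines file for now (one JSON object per line)
        log_entry = {
            "timestamp": datetime.now().isoformat(),
            "command": tool_result.command,
            "error": error,
            "file": getattr(tool_result, "file", None)  # Not every tool call has a file
        }
        with open("debug_log.json", "a", encoding="utf-8") as f:
            f.write(json.dumps(log_entry) + "\n")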

This creates a searchable log of every failed command. You can later upgrade this to write to a local ChromaDB instance.
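One way to prepare for that upgrade is to reshape the log into the parallel lists of ids, documents, and metadata that a ChromaDB collection accepts. The sketch below is an assumption about how you might do it, not the author's code: the `to_chroma_batch` name, the choice of `command` plus `error` as the embedded document, and the hash-based ids are all illustrative.

```python
import hashlib
import json


def to_chroma_batch(log_lines):
    """Turn JSON Lines log entries into ids, documents and metadatas lists."""
    ids, documents, metadatas = [], [], []
    seen = set()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        # A content hash gives a stable id, so identical entries are added only once
        entry_id = hashlib.sha256(line.encode("utf-8")).hexdigest()[:16]
        if entry_id in seen:
            continue
        seen.add(entry_id)
        entry = json.loads(line)
        ids.append(entry_id)
        documents.append(f"{entry.get('command', '')}\n{entry.get('error', '')}")
        # Keep only simple scalar metadata and drop missing (None) values
        metadata = {"source": "debug_log.json"}
        for key in ("timestamp", "file"):
            if isinstance(entry.get(key), (str, int, float, bool)):
                metadata[key] = entry[key]
        metadatas.append(metadata)
    return {"ids": ids, "documents": documents, "metadatas": metadatas}


sample = ['{"timestamp": "2024-01-01T00:00:00", "command": "pytest", "error": "JWT expired", "file": null}']
batch = to_chroma_batch(sample)
print(batch["documents"])
```

With the `chromadb` package installed, the result could then be passed to a collection, for example `collection.add(**batch)` on a collection obtained from `chromadb.PersistentClient(path=...).get_or_create_collection(...)`. That step is left out so the sketch runs with the standard library alone.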

Integrating with MCP for a Smoother Workflow

Once you have data being captured, you can serve it back to Claude Code using the Model Context Protocol (MCP), which Claude Code supports natively. You can write a simple MCP server that:

Reads your debug_log.json or ChromaDB.
Exposes a tool like search_past_errors(query).
Lets Claude query past failures directly within the chat.

A basic MCP server skeleton:

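# mcp_memory_server.py
# Written against the low-level Server API of the official `mcp` Python SDK;
# check the calls against the SDK version you have installed.
import asyncio
import json

import mcp.server.stdio
import mcp.types as types
from mcp.server import Server

server = Server("claude-memory")

@server.list_tools()
async def handle_list_tools() -> list[types.Tool]:
    # Advertise a single tool that takes a free-text query
    return [
        types.Tool(
            name="search_past_errors",
            description="Search through previously captured error logs.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"],
            },
        )
    ]

@server.call_tool()
async def handle_call_tool(name: str, arguments: dict | None) -> list[types.TextContent]:
    if name != "search_past_errors":
        raise ValueError(f"Unknown tool: {name}")
    query = (arguments or {}).get("query", "")
    # Simple grep-style search for now
    results = []
    try:
        with open("debug_log.json", "r", encoding="utf-8") as f:
            for line in f:
                if line.strip() and query.lower() in line.lower():
                    results.append(json.loads(line))
    except FileNotFoundError:
        pass  # Nothing has been captured yet
    # Return the first 5 matches as one text block
    return [types.TextContent(type="text", text=json.dumps(results[:5], indent=2))]

async def main():
    # Serve over stdio, the transport used for locally launched MCP servers
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream, write_stream, server.create_initialization_options()
        )

# Run the server
if __name__ == "__main__":
    asyncio.run(main())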

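Add this server to your Claude Code configuration. Claude Code reads project-scoped MCP servers from a .mcp.json file in the project root (the claude mcp add command can also register one):

{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["/path/to/mcp_memory_server.py"]
    }
  }
}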

Now, in any session, you can ask Claude: "Use the search_past_errors tool to see if we've hit a 'JWT expired' error before."
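Before wiring the tool into Claude Code, the search logic can be checked on its own as a plain function. The sketch below is a small variation on the grep-style search, not the code from the server: it requires every word of the query to appear in an entry rather than the exact phrase, and it returns the newest matches first. The function signature is hypothetical.

```python
import json
from pathlib import Path


def search_past_errors(query, log_path="debug_log.json", limit=5):
    """Return up to `limit` log entries whose raw text contains every query term."""
    path = Path(log_path)
    if not path.is_file():
        return []
    terms = query.lower().split()
    matches = []
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        if all(term in line.lower() for term in terms):
            try:
                matches.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # ignore a corrupted line instead of failing the search
    # Entries are appended chronologically, so reverse to show recent failures first
    return matches[::-1][:limit]
```

Calling `search_past_errors("jwt expired")` against a log containing a "JWT expired" entry returns that entry regardless of case, which is the behavior the prompt above relies on.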

The Payoff: From Reactive to Proactive Debugging

The end goal is to have your CLAUDE.md file automatically prepopulated with a "Known Pitfalls" section. When you start a new session on a project, Claude immediately knows: