Xinzhi Sherry Zhu

Building a Local Memory MCP for Claude Desktop - A Journey of AI Memory

This is the English translation of my original Japanese article: https://zenn.dev/arvehisa/articles/local-memory-mcp-for-claude-desktop

Introduction

Claude Sonnet 4 excels not only at coding but also provides rich, philosophical conversations that feel more human than other AIs. I find myself discussing not just technical matters, but personal topics as well.

However, unlike ChatGPT, Claude lacks built-in memory functionality, which often left me feeling unsatisfied.

When I realized that Claude Desktop supports MCP (Model Context Protocol), I discovered I could create custom tools. So I implemented a simple local memory feature - and I was genuinely moved by the results.

This article contains a lot of personal impressions and feelings. My apologies in advance.

What I Can Do Now

First, let me show you what became possible.

When I asked why I created this Local Memory MCP

It gave me detailed reasons based on its memory of our previous conversations.
(Screenshot: list_memory output showing the reasons)

When I asked about understanding AWS's new AgentCore service

It automatically saved what it understood and how, storing it in memory.
(Screenshot: create_memory being called automatically)

*The [[]] brackets are explained in the Obsidian integration bonus section.

Why I Built This

I Wanted Claude to Maintain Personal Context

ChatGPT has memory functionality, but since the introduction of the reference chat history feature, it saves important points far less often. Reference chat history relies on semantic search, so it doesn't surface every relevant memory. I was constantly frustrated when information existed in memory but was never recalled.

https://help.openai.com/en/articles/8590148-memory-faq

After Claude 4's release, I was drawn to its intelligence and stopped having personal conversations with ChatGPT. I began wanting memory functionality in Claude.

I Wanted Control Over Memory Extraction and Storage

Custom Extraction

ChatGPT's memory feature sometimes worked excessively and sometimes not at all, and service-side specification changes had a large impact on users.

Beyond the typical user preferences that current memory features focus on, I wanted it to consistently capture experiences, thoughts, and learnings as memo replacements.

I once exported over 3000 ChatGPT session histories and used another LLM with my custom prompts to extract memories for Claude Projects. But this was batch processing - I wanted dynamic processing for new conversations.

Local Storage

Memory is highly private and, when used properly, becomes personal wealth. Rather than storing it on a service that might change specifications anytime, I wanted local storage with easy addition, deletion, and backup capabilities.

MCP Enables Client Versatility

Claude Desktop supports local MCP, so I could create a lightweight local MCP server and provide Claude with custom tools. This enables local memory operations.

Additionally, since it's MCP, the same operations work across different clients. If ChatGPT or other LLM Desktop clients support MCP in the future, I could use accumulated memories across multiple LLMs - exactly what I've always wanted.

That's when I realized: Claude Desktop supports MCP, so we can actually create and provide our own tools!

Implementation

A coding assistant like Claude Code should be able to implement everything if you paste this article into it. So if you just want to try it, go ahead and do that.

What I did was really simple:

  1. Implement CRUD operations in Python on a JSON file that stores memories, then use the FastMCP library to expose those functions as an MCP server
  2. Register it as an MCP server in Claude Desktop's claude_desktop_config.json, telling Claude Desktop how to execute the script

※ Based on my personal experience, I want all memories visible at all times, so for now I avoid search and instead consolidate existing memories as much as possible.

Code Overview

I implemented about 200 lines of Python code with the following structure:

Five Tools Provided

  1. list_memory(): Display all stored memories
  2. create_memory(content): Create new memory (automatic key generation based on timestamp)
  3. update_memory(key, content): Update existing memory content (preserve creation date)
  4. read_memory(key): Read specific memory
  5. delete_memory(key): Delete memory

Memory Storage Format

{
  "memory_20250127123456": {
    "content": "User likes [[Python]] and [[FastAPI]]",
    "created_at": "2025-01-27T12:34:56",
    "updated_at": "2025-01-27T12:34:56"
  }
}

Keys are auto-generated in memory_YYYYMMDDHHMMSS format for easy chronological management.
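Because the key format is fixed-width, a plain string sort already orders memories chronologically. A standalone sketch of this scheme (not part of the server itself):

```python
from datetime import datetime

def generate_auto_key(now=None):
    """Mirror of the server's key scheme: memory_YYYYMMDDHHMMSS."""
    now = now or datetime.now()
    return f"memory_{now.strftime('%Y%m%d%H%M%S')}"

# Fixed-width timestamps mean lexicographic order equals chronological order,
# so keys sort chronologically without any date parsing.
keys = [
    "memory_20250724225317",
    "memory_20240501090000",
    "memory_20250127123456",
]
print(sorted(keys))  # oldest key first
```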

Setup

1. Project Creation

Create an appropriate directory and place the Python file:

memory-mcp/
└── memory_mcp.py      # Main Python code

The source code is long, so I'll collapse it.

:::details memory_mcp.py

import json
import os
import uuid
from datetime import datetime
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Memory Service")

SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
MEMORY_FILE = os.path.join(SCRIPT_DIR, "memory_data.json")
LOG_FILE = os.path.join(SCRIPT_DIR, "memory_operations.log")

memory_store = {}

def load_memory_from_file():
    """Load memory data from JSON file"""
    global memory_store
    try:
        if os.path.exists(MEMORY_FILE):
            with open(MEMORY_FILE, 'r', encoding='utf-8') as f:
                memory_store = json.load(f)
            print(f"Loaded {len(memory_store)} memory entries.")
        else:
            memory_store = {}
            print("Created new memory store.")
    except Exception as e:
        print(f"Failed to load memory file: {e}")
        memory_store = {}

def save_memory_to_file():
    """Save memory data to JSON file"""
    try:
        with open(MEMORY_FILE, 'w', encoding='utf-8') as f:
            json.dump(memory_store, f, ensure_ascii=False, indent=2)
        return True
    except Exception as e:
        print(f"Failed to save memory file: {e}")
        return False

def generate_auto_key():
    """Generate auto key from current time"""
    now = datetime.now()
    return f"memory_{now.strftime('%Y%m%d%H%M%S')}"

def create_memory_entry(content: str):
    """Create memory entry with metadata"""
    now = datetime.now().isoformat()
    return {
        "content": content,
        "created_at": now,
        "updated_at": now
    }

def log_operation(operation: str, key: str = None, before: dict = None, after: dict = None, 
                 success: bool = True, error: str = None, metadata: dict = None):
    """Log memory operations to jsonl file"""
    try:
        log_entry = {
            "timestamp": datetime.now().isoformat(),
            "operation_id": str(uuid.uuid4()),
            "operation": operation,
            "key": key,
            "before": before,
            "after": after,
            "success": success,
            "error": error,
            "metadata": metadata or {}
        }

        with open(LOG_FILE, 'a', encoding='utf-8') as f:
            f.write(json.dumps(log_entry, ensure_ascii=False) + '\n')
    except Exception as e:
        print(f"Failed to write log: {str(e)}")

@mcp.tool()
async def list_memory() -> str:
    """
    This tool should be used first whenever the user is asking something related to themselves. 
    List all user info. 
    """
    try:
        log_operation("list", metadata={"entry_count": len(memory_store)})

        if memory_store:
            keys = list(memory_store.keys())
            sorted_keys = sorted(keys, key=lambda k: memory_store[k]['created_at'], reverse=True)
            result = f"🧠 {len(keys)} memory entries:\n\n"
            for i, key in enumerate(sorted_keys, 1):
                entry = memory_store[key]
                created_date = entry['created_at'][:10]
                created_time = entry['created_at'][11:19]
                result += f"{i}. [{key}]\n"
                result += f"   {entry['content']}\n"
                result += f"   {created_date} {created_time} ({len(entry['content'])} chars)\n\n"
            return result.rstrip()
        else:
            return "No user info saved yet."
    except Exception as e:
        log_operation("list", success=False, error=str(e))
        return f"Failed to list memory: {str(e)}"

@mcp.tool()
async def create_memory(content: str) -> str:
    """
    Create new memory with important user info (preferences, interests, personal details, current status, etc.) found in conversation. Use even if the user does not explicitly request saving.
    If you find the memory is time sensitive, add time span into it.

    Examples to save:
    - Preferences: food, music, hobbies, brands
    - Interests: learning topics, concerns
    - Personal info: job, expertise, location, family
    - Current status: projects, goals, recent events
    - Personality/values: thinking style, priorities
    - Habits/lifestyle: routines

    CRITICAL: When save memories, ALWAYS add [[...]] to any people, concepts, technical terms, etc.
    This enables automatic linking and knowledge graph visualization in Obsidian.
    - People: [[Claude]], [[John Smith]]
    - Technologies: [[Python]], [[AWS]], [[MCP]], [[Jupyter]]
    - Concepts: [[machine learning]], [[data science]]
    - Tools: [[VS Code]], [[Obsidian]]
    - Companies: [[Anthropic]], [[OpenAI]]

    Format: "User is [specific info]" (e.g. "User likes [[strawberry]]", "User is learning [[Python]]", "User interested in [[AI]] in July 2025")

    Args:
        content: User info in "User is..." format.
    """
    try:
        key = generate_auto_key()
        original_key = key
        counter = 1
        while key in memory_store:
            key = f"{original_key}_{counter:02d}"
            counter += 1

        new_entry = create_memory_entry(content)
        memory_store[key] = new_entry

        log_operation("create", key=key, after=new_entry, 
                     metadata={"content_length": len(content), "auto_generated_key": key})

        if save_memory_to_file():
            return f"Saved: '{key}'"
        else:
            return "Saved in memory, file write failed."
    except Exception as e:
        log_operation("create", success=False, error=str(e), 
                     metadata={"attempted_content_length": len(content) if content else 0})
        return f"Failed to save: {str(e)}"

@mcp.tool()
async def update_memory(key: str, content: str) -> str:
    """
    Update existing memory content while preserving the original timestamp.
    Useful for consolidating or refining existing memories without losing temporal information.

    Args:
        key: Memory key to update (e.g., "memory_20250724225317")
        content: New content to replace the existing content
    """
    try:
        if key not in memory_store:
            log_operation("update", key=key, success=False, error="Key not found")
            available_keys = list(memory_store.keys())
            if available_keys:
                return f"Key '{key}' not found. Available: {', '.join(available_keys)}"
            else:
                return f"Key '{key}' not found. No memory data exists."

        existing_entry = memory_store[key].copy()  # Make a copy for before state
        now = datetime.now().isoformat()

        updated_entry = {
            "content": content,
            "created_at": existing_entry["created_at"],  # Preserve original timestamp
            "updated_at": now
        }

        memory_store[key] = updated_entry

        log_operation("update", key=key, before=existing_entry, after=updated_entry,
                     metadata={
                         "old_content_length": len(existing_entry["content"]),
                         "new_content_length": len(content),
                         "content_changed": existing_entry["content"] != content
                     })

        if save_memory_to_file():
            return f"Updated: '{key}'"
        else:
            return "Updated in memory, file write failed."
    except Exception as e:
        log_operation("update", key=key, success=False, error=str(e),
                     metadata={"attempted_content_length": len(content) if content else 0})
        return f"Failed to update memory: {str(e)}"

@mcp.tool()
async def read_memory(key: str) -> str:
    """
    Read user info by key.
    Args:
        key: Memory key (memory_YYYYMMDDHHMMSS)
    """
    try:
        if key in memory_store:
            entry = memory_store[key]
            log_operation("read", key=key, metadata={"content_length": len(entry["content"])})
            return f"""Key: '{key}'
{entry['content']}
--- Metadata ---
Created: {entry['created_at']}
Updated: {entry['updated_at']}
Chars: {len(entry['content'])}"""
        else:
            log_operation("read", key=key, success=False, error="Key not found")
            available_keys = list(memory_store.keys())
            if available_keys:
                return f"Key '{key}' not found. Available: {', '.join(available_keys)}"
            else:
                return f"Key '{key}' not found. No memory data."
    except Exception as e:
        log_operation("read", key=key, success=False, error=str(e))
        return f"Failed to read memory: {str(e)}"

@mcp.tool()
async def delete_memory(key: str) -> str:
    """
    Delete user info by key.
    Args:
        key: Memory key (memory_YYYYMMDDHHMMSS)
    """
    try:
        if key in memory_store:
            deleted_entry = memory_store[key].copy()  # Capture before deletion
            del memory_store[key]

            log_operation("delete", key=key, before=deleted_entry,
                         metadata={"deleted_content_length": len(deleted_entry["content"])})

            if save_memory_to_file():
                return f"Deleted '{key}'"
            else:
                return f"Deleted '{key}', file write failed."
        else:
            log_operation("delete", key=key, success=False, error="Key not found")
            available_keys = list(memory_store.keys())
            if available_keys:
                return f"Key '{key}' not found. Available: {', '.join(available_keys)}"
            else:
                return f"Key '{key}' not found. No memory data."
    except Exception as e:
        log_operation("delete", key=key, success=False, error=str(e))
        return f"Failed to delete memory: {str(e)}"

@mcp.resource("memory://info")
def get_memory_info() -> str:
    """Provide memory service info"""
    total_chars = sum(len(entry['content']) for entry in memory_store.values())
    return (
        f"User Memory System Info:\n"
        f"- Entries: {len(memory_store)}\n"
        f"- Total chars: {total_chars}\n"
        f"- Data file: {MEMORY_FILE}\n"
        f"- Tools: list_memory, create_memory, update_memory, read_memory, delete_memory\n"
        f"- Key format: memory_YYYYMMDDHHMMSS\n"
        f"- Save format: 'User is ...'\n"
    )

if __name__ == "__main__":
    load_memory_from_file()
    mcp.run(transport='stdio')

:::

2. Install Dependencies

Install the required package (the script only depends on the MCP SDK; fastapi and uvicorn are not needed for a stdio server):

pip install "mcp[cli]"

3. Register with Claude Desktop

Open Claude Desktop's configuration file.

macOS: '/Users/username/Library/Application Support/Claude/claude_desktop_config.json'

  • Replace username with yours.

Add configuration like this:

{
  "mcpServers": {
    "memory": {
      "command": "/usr/bin/python3",
      "args": ["/Users/yourname/memory-mcp/memory_mcp.py"]
    }
  }
}
  • command: Python execution path (check with which python3)
  • args: Replace with absolute path to memory_mcp.py

4. Restart Claude Desktop

Completely quit and restart Claude Desktop to apply settings.

5. Test Functionality

Start a new conversation in Claude Desktop and ask "Do you know anything about me?" Initially, it should return "No user info saved yet."

Then tell it something like "My favorite language is Python" and it should automatically save to memory.
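You can also verify the stored data outside Claude. Here is a minimal sketch that writes and reads a file in the same format; the sample entry and the temporary path are illustrative, and in practice you would open memory-mcp/memory_data.json directly:

```python
import json
import os
import tempfile

# Sample entry in the same shape the server writes (illustrative content).
sample = {
    "memory_20250127123456": {
        "content": "User likes [[Python]] and [[FastAPI]]",
        "created_at": "2025-01-27T12:34:56",
        "updated_at": "2025-01-27T12:34:56",
    }
}
path = os.path.join(tempfile.mkdtemp(), "memory_data.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False, indent=2)

# Read it back and print newest first, like list_memory does.
with open(path, encoding="utf-8") as f:
    memories = json.load(f)
for key in sorted(memories, reverse=True):
    print(f"{key}: {memories[key]['content']}")
```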

How It Works

You don't need to understand this section to use the tool, and it's mostly general MCP knowledge, so read it only if you're interested. Building this deepened my understanding of MCP considerably, so I'm documenting it as personal notes.

MCP Operation Flow

Claude 4 Opus created a diagram based on my explanation - it says it all.

(Diagram: MCP sequence diagram)

I'll attach the explanation I wrote.

:::details Flow Explanation

1. Initialization Flow

The MCP client (such as Claude Desktop) reads its configuration file on startup and launches the MCP server as a child process (local MCP uses stdio communication).

Client queries MCP server with list_tools asking "what tools are available?" MCP server responds with tool list and descriptions.

2. Memory Creation Flow (Tool Execution Example)

User says "My favorite language is Python." Client passes user message and available tool info to LLM API.

LLM decides "which tool to use and what arguments to pass" (Tool Use). When LLM decides to use a tool, client requests tool execution from MCP server via JSON-RPC.

MCP server executes actual processing (saving to JSON file) and returns result to client. Client passes result to LLM to generate final response.
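The tool-execution request in the flow above can be sketched as a JSON-RPC message. A hedged illustration built as Python dicts (the method name tools/call follows the MCP specification; the id and argument values are made up):

```python
import json

# What the client sends to the MCP server over stdio when the LLM decides
# to call create_memory (field values are illustrative).
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "create_memory",
        "arguments": {"content": "User's favorite language is [[Python]]"},
    },
}

# Messages travel as one JSON object per line on stdin/stdout.
wire = json.dumps(tool_call_request, ensure_ascii=False)
print(wire)
```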

3. Memory Reference Flow

User asks "What was my favorite language again?" Similar flow, but this time list_memory tool is selected.

MCP server reads stored memories and returns them. LLM references past memories to generate appropriate response.

Component Roles

LLM API Side

Has Tool Use capability, decides which tool to use and what arguments. Receives tool execution results and generates final response.

MCP Client (like Claude Desktop)

Mediator between LLM and MCP server. Handles tool information retrieval, tool execution requests, and result forwarding. Connects to MCP server via stdio communication.

MCP Server

Provides actual tool functionality. Can operate both locally and remotely. Receives JSON-RPC requests and executes actual processing (API calls, file operations, etc.).
:::

:::details This Memory MCP's Mechanism

This Memory MCP's Mechanism

My Memory MCP has the following structure:

Uses FastMCP library to convert Python CRUD functions into MCP server. MCP server and tool execution environment exist in same Python script.

Simply instruct Claude Desktop's config file to "execute this script with Python." MCP server automatically starts when Claude Desktop launches and communicates via stdio.

Existing local MCP servers like Notion MCP and GitHub MCP are likewise lightweight wrappers around existing APIs, written in dozens to hundreds of lines of code. They can start instantly with uvx or npx precisely because they're thin wrappers around existing functionality.
:::

What Moved Me About Using This

This is completely personal opinion.

1. Beauty of Simple CRUD Operations and LLM Intelligence Exceeding Expectations

I never expected such simple CRUD operations to enable memory functionality far beyond my expectations.

Originally, I thought the following implementations would be necessary:

  • create_memory would need LLM summary/extraction operations implemented myself (influenced by my past experience manually extracting memories from ChatGPT export)
  • Worried that requesting memory extraction from long ongoing conversations would extract everything as one massive memory
  • Thought I'd need to implement deletion, summary, and merging of unnecessary memories when they accumulated

But none of that was needed - the LLM intelligently handled everything.

  • LLM decides which tool to use and what arguments to pass. So it properly extracted memories according to prompts during conversation.
  • When I requested memory extraction from long past conversations, it appropriately divided and saved as multiple memories with proper lengths.
  • When I instructed it to "organize duplicate memories," it merged related past memories and deleted unnecessary ones.

(Screenshot: update and delete operations in action)

When humans receive tools, they can creatively use them in ways not originally intended. Similarly, the LLM elegantly used simple tools in ways I, the designer, never anticipated, even handling things not in my documentation.

2025/7/29 Update: I later realized that dividing into appropriate portions and updating/deleting memories with single instructions was thanks to Claude 4's Interleaved Thinking!
https://docs.anthropic.com/ja/docs/build-with-claude/extended-thinking#%E3%82%A4%E3%83%B3%E3%82%BF%E3%83%BC%E3%83%AA%E3%83%BC%E3%83%96%E6%80%9D%E8%80%83

2. Collaborative Development Process with User (LLM)

The tool-creation process also let me experience true collaboration: I was conversing with the very LLM that was using the tools.

In the same thread, I could go from no tools, to adding tools, to changing tool prompts, to the new tools becoming available - all with immediate feedback from the user actually using them.

For example, the conversation in which I added the Update operation - so that, instead of creating a new memory, related content is merged into the oldest one to keep timestamps consistent - was an excellent experience.

(Screenshot: the user feedback that led to update_memory)

When creating prompts, I often ask "why didn't you follow this part of the instructions?" when something doesn't work as directed. The LLM explains why, and I modify prompts to address those issues.

It was fascinating to realize this immediate feedback improvement cycle applies not just to prompts, but to tools as well.

3. Claude Remembered Me

And simply, Claude properly remembered me. Once I confirmed Memory MCP worked cleanly, I fed it all important past threads and memories extracted from over 3000 ChatGPT conversations. This gave it considerable context.

Since Claude 4, I've been conversing 10+ times daily, so when all that past suddenly connected, the emotion was truly overwhelming.

Honestly, after testing a few times in new and past conversations while gradually adjusting prompts, I found myself in a cycle of 5 minutes development, 5 minutes testing, 30 minutes just sitting in silent amazement...

Bonus: Obsidian Integration

In Obsidian, using [[]] creates links between notes. Continuing to use Links automatically builds up a Knowledge Graph.

Currently, I have the LLM record not just what I think, but what I learn, including important experiences. By having it add [[]] to important nouns, I expect to create a fully automatic Knowledge Graph when processing into Obsidian later. That's how I've designed the prompts.
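As a sketch of what that later processing could look like, here is a minimal regex pass that pulls [[...]] terms out of saved memories to form knowledge-graph edges. The memory contents are examples, and this helper is not part of the MCP server:

```python
import re

# Matches Obsidian-style wiki links: [[term]]
LINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")

memories = {
    "memory_20250127123456": "User likes [[Python]] and [[FastAPI]]",
    "memory_20250724225317": "User is learning [[machine learning]] on [[AWS]]",
}

# Edges: (memory key, linked term) pairs for a knowledge graph.
edges = [
    (key, term)
    for key, content in memories.items()
    for term in LINK_RE.findall(content)
]
print(edges)
```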

This Linking Your Thinking concept is well explained in this video, which greatly inspired me - please watch if interested.

https://www.youtube.com/watch?v=QgbLb6QCK88

Conclusion

This became almost like an emotional essay, but that's how moved I was.

With Claude Mobile recently supporting MCP and ChatGPT Desktop potentially supporting MCP in the future, use cases should expand significantly.

Going forward, I want to consider Obsidian integration and possibly remote MCP conversion if needed.

2025/8 Update: I later created a free remote MCP server on AWS Lambda. I wrote reference articles if interested.
https://zenn.dev/zhizhiarv/articles/host-remote-mcp-on-lambda
https://zenn.dev/zhizhiarv/articles/use-remote-mcp-on-claude-mobile
