DEV Community

MOHAMMAD SAQUIB DAIYAN

How I Built a Persistent Memory Layer for AI Coding Assistants Using MCP

Every time I start a new Claude or Cursor session, I have the same conversation:

"I use TypeScript with strict mode, pnpm for packages, deploy to Vercel, and prefer functional components with hooks."

I got tired of it. So I built PersistMemory — a persistent memory server that plugs into any MCP-compatible AI tool.

The Problem

LLM-based coding assistants are powerful but amnesiac. They don't remember:

  • Your tech stack and preferences
  • Project-specific architecture decisions
  • Past debugging sessions and solutions
  • Your coding conventions

Every session is a blank slate.

The Solution: MCP + Semantic Memory

MCP (Model Context Protocol) is Anthropic's open standard for connecting AI tools to external data sources. PersistMemory is an MCP server that provides 9 tools to any connected AI assistant:

Tool                What it does
------------------  -----------------------------------
add_to_memory       Store a piece of knowledge
search_memory       Semantically search stored memories
create_space        Create an isolated memory space
switch_space        Switch between spaces
list_spaces         List all your spaces
get_current_space   Show active space
publish_message     Send a message to a space
fetch_messages      Retrieve message history
authenticate        OAuth login
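
Under the hood, MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests, per the MCP spec. A call to `add_to_memory` might look roughly like this (the argument name `content` is illustrative, not taken from the PersistMemory docs):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "add_to_memory",
    "arguments": {
      "content": "I use TypeScript with strict mode and pnpm"
    }
  }
}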

Setup: One Config Line

For Claude Desktop, add this to your config:

{
  "mcpServers": {
    "persistmemory": {
      "url": "https://mcp.persistmemory.com/sse"
    }
  }
}

That's it. No API keys — authentication happens via OAuth when Claude first connects.

Works the same way for Cursor, VS Code/Copilot, Windsurf, and Cline (see docs for each client's config).

How Semantic Search Works

When you tell your AI "remember that I use Neon Postgres with Drizzle ORM", PersistMemory:

  1. Embeds the text using bge-large-en-v1.5 (1024-dimensional vectors)
  2. Stores the vector in Cloudflare Vectorize with metadata (space, type, timestamp)
  3. Saves the raw text in Neon Postgres

When you later ask "what's my database setup?", PersistMemory:

  1. Embeds your query
  2. Performs vector similarity search
  3. Returns the top matching memories

No keyword matching needed. "Database setup" finds "Neon Postgres with Drizzle ORM" because they're semantically close.
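
The retrieval step boils down to cosine similarity between vectors. Here's a minimal self-contained sketch of the idea, with toy 2-dimensional vectors standing in for the real 1024-dimensional BGE embeddings (the function names are mine, not PersistMemory's API):

```typescript
type Memory = { text: string; vector: number[] };

// Cosine similarity: dot product of the vectors divided by the
// product of their magnitudes. 1 = identical direction, 0 = orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored memories by similarity to the query vector and return the top K.
function searchMemory(query: number[], memories: Memory[], topK = 3): Memory[] {
  return [...memories]
    .sort((m1, m2) =>
      cosineSimilarity(query, m2.vector) - cosineSimilarity(query, m1.vector))
    .slice(0, topK);
}
```

The real system does the same thing, except the vectors come from the embedding model and the ranking happens inside Vectorize rather than in application code.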

Spaces: Context Isolation

Spaces let you organize memories by project or context:

  • startup-app — your startup's stack, architecture decisions, deployment config
  • freelance-client-x — client-specific requirements and preferences
  • personal — your general coding preferences

When your AI is in the "startup-app" space, it only searches memories from that space. No accidental context bleed.
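
Scoping is essentially a metadata filter applied before the similarity search runs. A sketch of the idea (the types and names here are mine, not the actual PersistMemory schema):

```typescript
interface ScopedMemory {
  space: string;  // e.g. "startup-app", "freelance-client-x", "personal"
  text: string;
}

// Return only the memories belonging to the active space;
// semantic search then runs over this filtered subset,
// so other spaces can never leak into the results.
function visibleMemories(all: ScopedMemory[], activeSpace: string): ScopedMemory[] {
  return all.filter((m) => m.space === activeSpace);
}
```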

Architecture

┌──────────────────┐     MCP/SSE     ┌──────────────────────┐
│  Claude/Cursor/  │ ◄─────────────► │   PersistMemory      │
│  Copilot/etc.    │                 │   (CF Workers)       │
└──────────────────┘                 │                      │
                                     │  ┌────────────────┐  │
                                     │  │ Vectorize      │  │
                                     │  │ (embeddings)   │  │
                                     │  └────────────────┘  │
                                     │  ┌────────────────┐  │
                                     │  │ Neon Postgres  │  │
                                     │  │ (memory store) │  │
                                     │  └────────────────┘  │
                                     │  ┌────────────────┐  │
                                     │  │ Workers AI     │  │
                                     │  │ (BGE-large)    │  │
                                     │  └────────────────┘  │
                                     └──────────────────────┘

Stack:

  • Cloudflare Workers — serverless backend, globally distributed
  • Neon Postgres — persistent storage via serverless driver
  • Drizzle ORM — type-safe database access
  • Cloudflare Vectorize — vector database for semantic search
  • Workers AI — embedding generation (BGE-large-en-v1.5)
  • Cloudflare R2 — file storage for uploaded documents
  • Next.js on Vercel — web dashboard
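
On the Workers side, writing a memory comes down to embedding the text with Workers AI and upserting a record into Vectorize. Here's a sketch of the record shape — the metadata field names are my guess at the schema, while the binding calls in the trailing comment follow Cloudflare's documented `env.AI.run` / `env.VECTORIZE.upsert` API:

```typescript
interface MemoryRecord {
  id: string;
  values: number[];  // the 1024-dim BGE embedding
  metadata: { space: string; type: string; createdAt: string };
}

// Pure helper: build the Vectorize record for a freshly embedded memory.
function toVectorizeRecord(
  id: string,
  vector: number[],
  space: string,
  type: string,
  createdAt: string
): MemoryRecord {
  return { id, values: vector, metadata: { space, type, createdAt } };
}

// Inside a Worker handler, the write path would look roughly like:
//   const { data } = await env.AI.run("@cf/baai/bge-large-en-v1.5", { text: [memoryText] });
//   await env.VECTORIZE.upsert([
//     toVectorizeRecord(crypto.randomUUID(), data[0], space, "note", new Date().toISOString()),
//   ]);
```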

What's Next

  • Meeting bot integration — automatically capture and store context from meetings
  • Auto-memory — AI proactively stores important context without being told
  • Memory sharing — share spaces with team members
  • Webhooks — trigger actions when memories are added

Try It

Free tier available: persistmemory.com

Docs: persistmemory.com/docs

Setup takes 30 seconds. I'd love feedback from anyone using AI coding tools daily.
