Abhishek Nair

Posted on • Originally published at padawanabhi.de

How I Built a Cross-Tool Memory and Skill System for AI-Assisted Development

10 min read | Intermediate


I use four AI coding tools daily: Claude Code, Cursor, Codex CLI, and occasionally Gemini/Antigravity. Each one is good at different things. Claude Code handles complex multi-file refactors. Cursor is fast for inline edits. Codex runs background tasks. Gemini brings a different perspective.

But here's the problem: each tool starts every conversation from zero. It doesn't know my preferences, my project architecture, or the lessons I learned last week in a different tool. I end up repeating myself constantly — "we switched email providers last sprint", "always create a branch before editing", "use pnpm not npm".

So I built a system that gives all four tools a shared brain. They share memory (what I've learned), skills (how to do things), and rules (what to always/never do) — without any tool knowing about the others directly.

This post walks through the exact setup. Everything here is running in production on my projects right now.

The Architecture

                     Knowledge Graph (MCP)
                     localhost:8765
                    ┌─────────────────┐
                    │  Entities       │
                    │  Observations   │
                    │  Relations      │◄──── All 4 tools read/write
                    └────────┬────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
     Claude Code          Cursor            Codex CLI
    ┌────────────┐     ┌────────────┐     ┌────────────┐
    │ CLAUDE.md  │     │.cursorrules│     │ AGENTS.md  │
    │ skills/    │     │ .cursor/   │     │ .codex/    │
    │ commands/  │     │  rules/    │     │            │
    │ agents/    │     │            │     │            │
    │ memory/    │     │            │     │            │
    └────────────┘     └────────────┘     └────────────┘
          │                  │                  │
          └──────────────────┼──────────────────┘
                             │
                    Skills MCP Server
                    (universal template)
                    ┌─────────────────┐
                    │  51 skills      │
                    │  13 commands    │
                    │  17 agents      │
                    └─────────────────┘

Three layers:

  1. Knowledge Graph — shared memory across all tools (user preferences, project context, cross-project learnings)
  2. Universal Template — reusable skills, commands, and agents available to any project
  3. Per-Tool Config — tool-specific instructions (CLAUDE.md, .cursorrules, AGENTS.md)

Layer 1: The Knowledge Graph (Shared Memory)

The foundation is a knowledge graph running as an MCP (Model Context Protocol) server on localhost:8765. Every AI tool connects to it. It stores three types of data:

Entities — things that exist: projects, people, tools, concepts.

Observations — facts attached to entities: "prefers pnpm over npm", "Project A uses Supabase + Resend", "Project B required ISO 13482 compliance".

Relations — connections between entities: "Project A uses Supabase", "User works-on Project C".
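In the memory server's vocabulary, writing those three record types looks roughly like this. The call names follow common memory-MCP conventions and the payloads are illustrative, not the server's exact schema:

```
create_entities([{ name: "Project A", entityType: "project",
                   observations: ["Uses Supabase + Resend"] }])

add_observations("User", ["prefers pnpm over npm"])

create_relations([{ from: "Project A", to: "Supabase",
                    relationType: "uses" }])
```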

Why Not Just Files?

File-based memory (like Claude Code's ~/.claude/projects/<project>/memory/) is project-scoped and tool-scoped. It works well for "remember this for next time in this project with this tool." But it can't share context across projects or tools.

The knowledge graph solves both problems:

  • Cross-project: Working on auth in Project B? Search the graph for "authentication" and find patterns you established in Project A.
  • Cross-tool: Fix a bug in Cursor, and the graph remembers the root cause. Next time Claude Code encounters something similar, it finds the insight.

The Read Strategy: Scoped, Not Full Dump

The most important lesson I learned: never load the entire graph. As it grows, dumping everything into context wastes tokens on irrelevant data.

Instead, I use scoped searches at two moments:

Session start — two targeted searches:

search_nodes("user preferences workflow")   → load personal profile
search_nodes("<current project name>")      → load current project context

Mid-conversation — on-demand when I need cross-project context:

search_nodes("auth authentication")   → find patterns from other projects
search_nodes("ROS2 robotics")         → pull relevant robotics knowledge

The Write Strategy: Continuous Mining

I configured all tools to continuously mine conversations for durable knowledge and write to the graph immediately — not batch it for later.

Two categories of triggers:

Artifact triggers — when something is created:

  • New skill created → create_entities with name, purpose, key rules
  • Architecture decision made → add_observations to project entity
  • New tool/library adopted → create_entities + create_relations

Conversation triggers — when something is said:

  • User corrects approach → add_observations to user entity with preference
  • Debugging insight reveals pattern → add_observations
  • Cross-project learning → add_observations to both project entities

What NOT to save: ephemeral task details, in-progress debugging, things derivable from code or git history.
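As a concrete trigger example: when a new skill lands, the artifact trigger above might produce writes like these (illustrative payloads in the same pseudo-call style as the read examples):

```
create_entities([{ name: "deploy-check", entityType: "skill",
                   observations: ["Pre-deployment validation checklist"] }])

create_relations([{ from: "Project A", to: "deploy-check",
                    relationType: "uses-skill" }])
```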

Layer 2: The Universal Template (Shared Skills)

I maintain a universal template at ~/universal-claude-template/ with 51 skills, 13 commands, and 17 agents. Any project can access these via the Skills MCP server.

What's a Skill vs. a Command vs. an Agent?

Skills are deep procedural knowledge — multi-step workflows with detailed instructions. Examples:

  • write-blog — full blog creation pipeline (research → outline → write → translate → publish)
  • create-prd — product requirements document generation
  • debugging — systematic debugging methodology
  • financial-analysis — financial modeling and analysis
  • competitive-research — market research synthesis

Commands are user-invocable shortcuts (/command):

  • /commit — conventional commit workflow
  • /explore — deep codebase exploration (read-only)
  • /fix-issue — end-to-end GitHub issue resolution
  • /deploy-check — pre-deployment validation

Agents are specialized sub-agents for parallel work:

  • researcher — codebase/domain exploration (never writes code)
  • code-reviewer — catches bugs and security issues
  • architect — design decisions and trade-off analysis
  • test-runner — runs tests and fixes failures
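Claude Code defines sub-agents as markdown files with YAML frontmatter under .claude/agents/. A minimal sketch of what the researcher agent could look like (the body text here is my illustration, not the template's actual file):

```markdown
---
name: researcher
description: Explores the codebase or a domain question. Read-only.
tools: Read, Grep, Glob
---

You are a research sub-agent. Answer the question you are given by
reading the codebase. Report findings as a structured summary with
file references. Never modify files.
```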

How Skills Load Across Tools

The Skills MCP server makes all skills available to any connected tool. When Claude Code needs a workflow:

search_skills("blog")        → finds the write-blog skill
read_skill("write-blog")     → loads the full SKILL.md

Then it follows the skill's instructions as if it were a local file.
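A SKILL.md is just markdown with a small frontmatter header. A stripped-down sketch of what write-blog might contain (the steps are condensed from the pipeline described above; the real file is longer):

```markdown
---
name: write-blog
description: Full blog creation pipeline, from research to publishing.
---

# write-blog

1. Research the topic and collect sources.
2. Draft an outline; confirm it with the user.
3. Write the post section by section.
4. Translate if required, then publish.

Constraints: cite sources; keep code examples runnable.
```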

Skill Sync: Keeping the Template Current

When a new generalizable skill is created in any project:

diff_skills("/path/to/project")                             → see what's new
sync_skill("skill-name", "to_template", "/path/to/project") → copy to template

This keeps the universal template growing from real project work, not hypothetical planning.

Layer 3: Per-Tool Configuration

Each tool gets its own configuration format, but they all reference the same knowledge graph and skills.

Claude Code: CLAUDE.md + Rules + Memory

~/.claude/
├── CLAUDE.md              ← Global instructions (knowledge graph + skills MCP setup)
├── settings.json          ← MCP server configs, permissions
└── projects/
    └── <project>/
        ├── CLAUDE.md      ← Project-specific instructions
        └── memory/        ← File-based memory (project-scoped)
            ├── MEMORY.md  ← Index of memory files
            ├── user_profile.md
            └── feedback_*.md

The global CLAUDE.md tells Claude Code how to use the knowledge graph and skills MCP. Project-level CLAUDE.md files add project-specific context (stack, commands, gotchas).

Claude Code also has .claude/rules/ — always-loaded constraint files:

  • quality.md — definition of done, naming conventions
  • architecture.md — separation of concerns, error handling
  • git-workflow.md — branching strategy, commit format
  • security.md — secrets handling, input validation
  • testing.md — TDD workflow, mocking rules

Cursor: .cursorrules + .cursor/rules/

Cursor uses .cursorrules (project root) and .cursor/rules/ (MDC format). I mirror the same rules from the universal template, adapted for Cursor's syntax.
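An MDC rule file pairs a YAML frontmatter header (description, globs, alwaysApply) with the rule text. A sketch of the git-workflow rule in Cursor's format (the rule content mirrors the git-workflow.md described above; the exact wording is illustrative):

```
---
description: Git workflow rules
globs: ["**/*"]
alwaysApply: true
---

- Always create a branch before editing.
- Use conventional commit messages.
- Never force-push to main.
```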

Codex CLI: AGENTS.md

OpenAI's Codex CLI reads AGENTS.md for its principal agent instructions. Same content, different format.

The Dual Memory Pattern

Claude Code has a unique advantage: it supports both file-based memory AND the knowledge graph. I use both intentionally:

File-based memory (~/.claude/projects/<project>/memory/) for:

  • Claude Code-specific context that other tools don't need
  • Project-scoped feedback and preferences
  • Quick recall within the same tool

Knowledge graph (localhost:8765) for:

  • Cross-project patterns and learnings
  • User preferences all tools should know
  • Architecture decisions and their rationale
  • Tool/library evaluations

The rule: if only Claude Code needs it, use file memory. If any other tool should know, use the graph.

Domain Packs: Specializing Per Project

The universal template includes domain packs that add domain-specific skills:

bash ~/universal-claude-template/setup.sh web       # Web dev skills
bash ~/universal-claude-template/setup.sh robotics  # ROS2, hardware, sim2real
bash ~/universal-claude-template/setup.sh creative  # Design, brand, content
bash ~/universal-claude-template/setup.sh finance   # Financial modeling
bash ~/universal-claude-template/setup.sh research  # Academic methodology

Each pack adds relevant rules, skills, and context without bloating projects that don't need them.
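I don't show the real setup.sh here, but the essence of a domain-pack install can be sketched in a few lines of bash (the packs/ layout and file names below are hypothetical, not the template's actual structure):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a domain-pack install step.
# Assumes a template layout of rules/ plus packs/<pack>/skills/.
set -euo pipefail

TEMPLATE="${TEMPLATE:-$HOME/universal-claude-template}"
PACK="${1:-web}"

mkdir -p .claude/rules .claude/skills

# Shared rules first, then the pack-specific skills on top.
cp -r "$TEMPLATE/rules/." .claude/rules/ 2>/dev/null || true
cp -r "$TEMPLATE/packs/$PACK/skills/." .claude/skills/ 2>/dev/null || true

echo "Installed '$PACK' pack into $(pwd)/.claude"
```

The `|| true` guards make the sketch idempotent even when a pack has no skills directory, so re-running it with a different pack only layers files on top.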

What This Actually Looks Like in Practice

Here's a real workflow:

  1. Start a session in Claude Code on a portfolio project. It auto-loads project memory, searches the graph for recent context.

  2. I say "replace the email provider with Resend". Claude Code reads the existing code, designs the migration, implements it across backend + frontend + Docker + CI. It uses the deployment skill for Docker patterns and database-patterns for the Supabase schema.

  3. Mid-implementation, it discovers a Node 18 compatibility issue with crypto.randomUUID(). It writes this to the knowledge graph as a cross-project learning: "crypto.randomUUID() is not a global in Node 18 — use require('crypto').randomUUID()."

  4. Later, in Cursor, I'm working on a different project that also runs Node 18. Cursor searches the graph, finds the crypto.randomUUID observation, and avoids the same mistake.

  5. I switch to Codex CLI for a background task. It reads from the same graph, knows my preferences, and follows the same commit conventions.

No context was lost. No preferences were repeated. Each tool contributed to and benefited from the shared brain.

Setting This Up Yourself

Prerequisites

  • Claude Code installed
  • An MCP-compatible memory server (I use @anthropic/memory-mcp)
  • Optional: Cursor, Codex CLI

Step 1: Set Up the Knowledge Graph

# Install the memory MCP server
npm install -g @anthropic/memory-mcp

# Add to Claude Code
claude mcp add memory -- npx @anthropic/memory-mcp --port 8765

Step 2: Clone the Universal Template

git clone https://github.com/your-username/universal-claude-template ~/universal-claude-template

Or build your own. Start with 5-10 skills that match your actual workflow, then grow organically.

Step 3: Configure Global CLAUDE.md

Create ~/.claude/CLAUDE.md with instructions for the knowledge graph (how to read, when to write) and the skills MCP (how to search and load).
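Mine boils down to about a page of instructions. A condensed sketch of the knowledge-graph section (abridged and paraphrased, not the file verbatim):

```markdown
## Memory (knowledge graph MCP, localhost:8765)

At session start, run two scoped searches:
- search_nodes("user preferences workflow")  → my profile
- search_nodes("<current project name>")     → project context

Write to the graph immediately when:
- I correct your approach           → add_observations (user entity)
- An architecture decision is made  → add_observations (project entity)
- A new tool/library is adopted     → create_entities + create_relations

Never save ephemeral task details or in-progress debugging.
```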

Step 4: Set Up Per-Project Config

cd your-project
bash ~/universal-claude-template/setup.sh web  # or robotics, creative, etc.

This copies rules, creates the CLAUDE.md template, and links skills.

Step 5: Connect Other Tools

Add the memory MCP server to Cursor and Codex CLI using their respective config formats. They'll share the same knowledge graph.
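For Cursor, that's an entry in ~/.cursor/mcp.json; the server name and arguments below simply mirror the Claude Code command from Step 1 (check your tool's current docs for the exact config location and schema):

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["@anthropic/memory-mcp", "--port", "8765"]
    }
  }
}
```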

Lessons Learned

Start small. Don't create 50 skills on day one. Create them as you need them, from real work. My template grew from 5 skills to 51 over three months.

Scope your reads. Loading the entire graph into context is worse than having no graph. Always search for specific topics.

Let tools teach each other. The most valuable graph entries come from debugging sessions — insights that prevent the same mistake in a different context.

Don't fight the tool. Each AI tool has different strengths. Claude Code is best for complex refactors. Cursor for quick edits. Codex for background jobs. The shared brain lets you use the right tool without losing context.

Skills are better than prompts. A well-written SKILL.md file with clear steps, constraints, and examples outperforms even the best prompt. Prompts are ephemeral. Skills are durable.


I use this exact setup across all my projects. If you want help setting up a cross-tool AI workflow for your team, let's talk.



Top comments (2)

Daniel Meppiel

The thing that surprised me most when I started working on cross-tool agent config was how quickly the "just sync files" approach breaks down. It seems like the problem is format translation — Copilot wants .github/instructions/, Cursor wants .cursor/rules/, Claude wants its own thing — but the real difficulty is the dependency graph between those files. Your security instructions reference your API standards, which reference your architecture context, and suddenly you're not syncing files, you're resolving a DAG.

I've been prototyping a dependency-management approach to this (github.com/microsoft/apm) and the part I keep going back and forth on is the memory/context boundary. You can version and lock instruction files pretty cleanly, but memory — the stuff that's project-specific, evolving, sometimes personal — resists the package-manager model. It wants to be mutable in ways that lock files don't love.

Curious how you handle that tension with four tools in play. When you update a piece of shared memory, does it fan out automatically or do you have a manual sync step? And have you run into cases where the "right" context for Claude is actually different from what Cursor needs — not just in format but in substance?

Abhishek Nair • Edited

Great question. Here's how my setup handles this in practice:

Memory sync is automatic. The knowledge graph runs as an MCP server that all four tools connect to. When any tool learns something (a correction, an architecture decision, a debugging pattern), it writes to the graph immediately. Next time I open a different tool, it reads the same graph. No file syncing, no manual step.

The instructions for each tool are global and minimal. They say "here's where to look, how to search, and when to pull context." Each tool fetches context only when relevant (working on auth? search for auth patterns, debugging? pull deployment context). The instructions don't reference each other because the knowledge lives in the graph, not in flat files.

On the DAG problem specifically: I'll be honest, I sidestepped it rather than solved it. My instruction files are deliberately flat and self-contained. They don't reference each other, so there's no dependency chain to resolve when syncing. The relational knowledge ("this security decision exists because of that architecture constraint") lives in the graph as entities with relations. The tradeoff is that instruction files can't express "follow the API conventions defined in X," they just state the rule and trust the graph has deeper context if needed. For a solo/small team setup this works well because AI tools are good at pulling context on demand, but I can see a dependency-managed approach being more valuable at larger team scale.

Skills are the part that gets versioned deliberately. I have a universal template with 46+ reusable skills, and specific sync functions (diff_skills, sync_skill) handle two-way flow between projects and the template. So skills are portable and versioned, memory is living and mutable. Different lifecycle, different mechanism.

What makes it sustainable: I have a post-session hook that reviews closed conversations, extracts knowledge, validates whether it was already recorded, and appends if not. Plus a weekly cleanup script that removes duplicates and stale entries. The graph accumulates noise if you don't prune it.

Your APM approach sounds like it tackles the part I deliberately avoided (the dependency resolution between instruction files). Curious how you handle the mutable memory side. My answer has been: version the instructions and skills (files in git), let the memory be a living graph that tools query at runtime. Different concerns, different mechanisms.