DEV Community: Elvis Mørales Fdz

How to give Claude Code persistent memory with a self-hosted mem0 MCP server

Elvis Mørales Fdz — Thu, 19 Feb 2026 01:06:11 +0000

Last week I spent two hours with Claude Code debugging a token refresh race condition. I traced it through the auth middleware, tested four approaches, and finally found that the session timeout window overlaps with the token refresh cycle on my setup. Three-line fix. The next day, a similar auth timing issue appeared in a different service. Claude suggested some of the exact approaches we'd already tried and rejected the day before.

That's the kind of knowledge that falls through the cracks between Claude Code sessions. Yes, CLAUDE.md stores static rules and Auto Memory saves compressed summaries. But neither captures the full diagnostic path: which approaches you tried, why three of them failed, and the specific conditions that made the fourth one work. That detail disappears when the session ends.

I went looking for MCP memory servers and other solutions that could fill that gap. Most either depended on running in the cloud, gave me too little control over the local setup, or required adding a separate API key for their internal LLM operations. Claude Code already authenticates through an OAuth Access Token (OAT), and the SDK supports it, so adding another key felt redundant and came with extra API costs.

During that search I came across mem0. I went through their documentation, tried the OpenClaw plugin to see how the library handles memory extraction and semantic search, and liked the approach. I patched it to reuse Claude Code's existing OAT token instead of requiring a separate key and submitted the change upstream. Their official MCP integration server is cloud-only though, so I built mem0-mcp-selfhosted, a local version backed by infrastructure I can fully control.

The stack runs on Qdrant for vector storage, Ollama for local embeddings, and optional Neo4j for a knowledge graph that I added later. I also set it up to route different operations to the best LLM for each task. It provides eleven tools for your Claude Code instance to manage long-term memory, and your stored memories stay on your machine.

This article covers how this MCP server works, how to set it up in about 15 minutes, and how to get Claude using memory automatically without you triggering it.

Why Claude Code's built-in memory falls short for accumulated knowledge

Does Claude Code remember between sessions?

Partially. Claude Code has three persistence mechanisms that carry context forward: CLAUDE.md files you write yourself, Auto Memory where Claude saves notes about your project patterns and preferences, and Session Memory that extracts summaries from past conversations. All three load at session start, and they cover a lot of ground.

Static rules, project conventions, and compressed summaries of past work carry forward just fine. If you told Claude to use PostgreSQL last week, it might remember it.

What doesn't carry forward is the detailed reasoning behind your decisions. When you spend an afternoon choosing between Redis and database-backed sessions, weighing operational complexity and infrastructure costs, and ultimately picking database sessions because your traffic doesn't justify a separate Redis instance yet, that full reasoning chain gets compressed into a one-line summary at best. The next session, Claude might suggest Redis for caching and you have to walk through the tradeoff analysis again.

Three categories of knowledge get lost or compressed beyond usefulness:

Decision reasoning. Not what you decided, but why and under what conditions. "We chose in-memory caching over Redis because at current scale it's premature optimization. Revisit at 10k rps." Auto Memory might note the decision, but the conditional logic that makes it useful, the part about when to revisit, gets lost in compression.
Debugging insights. "The flaky test failures on CI were caused by state leakage between test groups, not async issues. We proved this by isolating test groups last Tuesday." Session Memory might summarize "fixed flaky tests" but not the three-hour diagnostic path that saves you from repeating the same investigation.
Cross-project patterns. You build JWT middleware on Project A. Two weeks later, Project B needs authentication. Auto Memory and Session Memory are project-scoped, and while a global CLAUDE.md can carry some context across repos, it's a static file, not a searchable knowledge base. The pattern exists in a different repo, but Claude has no way to find it.

The built-in memory helps, but it has structural limits

CLAUDE.md works well for project rules. Auto Memory adds automatic note-taking, which is a real improvement over manual curation alone. I use both, and I recommend them.

But they share three structural limitations:

No search. Everything loads at session start regardless of relevance. At 200+ entries, you're burning context tokens on information Claude doesn't need for this particular task.
Summaries, not reasoning. Auto Memory and Session Memory compress multi-hour sessions into short notes. The compression loses the detail that matters most, which approaches failed and why.
Mostly project-scoped. Auto Memory is strictly per-project. A global CLAUDE.md can carry rules across repos, but it's a flat file you maintain by hand, not a searchable store of accumulated knowledge.

That's the gap that persistent memory with semantic search fills. Ask "what went wrong with Redis last month?" and get back the full reasoning: "rejected Redis for session storage because the operational overhead wasn't justified at our traffic levels. Switched to database-backed sessions. Revisit if we hit 10k concurrent users." The question and the stored memory share almost no wording, but the meaning matches.

What mem0 gives Claude Code: persistent memory with semantic search

This MCP server for Claude Code uses mem0ai as a library and exposes 11 MCP tools that Claude Code calls directly.

Here's what it looks like in practice:

Session 1 -- debugging a test suite:

> Remember: flaky test failures in CI were caused by state leakage between
  test groups, not async timing. Fixed by resetting database between groups.
  Took 3 hours to isolate. Don't chase the async red herring again.

Session 2 -- two weeks later, different project, tests start flaking:

> Search my memories for flaky test debugging
-> "flaky test failures in CI were caused by state leakage between test
   groups, not async timing. Fixed by resetting database between groups."

Claude retrieves the debugging insight and skips the three-hour investigation. It starts with the proven fix.

The difference from a flat file: semantic vector search. "Flaky test debugging" matches "state leakage between test groups" even with completely different wording. The server embeds memories using Ollama's bge-m3 model and stores them in Qdrant for approximate nearest neighbor search. Claude finds memories by meaning, not keywords.

The 11 tools

Tool	What it does
`add_memory`	Store text or conversations. The LLM extracts key facts automatically.
`search_memories`	Semantic vector search with filters, threshold, and reranking.
`get_memories`	Browse and filter stored memories (non-search).
`get_memory`	Fetch a single memory by UUID.
`update_memory`	Replace memory text. Re-embeds and re-indexes.
`delete_memory`	Delete a single memory.
`delete_all_memories`	Safe bulk delete (never nukes your collection).
`list_entities`	List which users/agents/runs have stored memories.
`delete_entities`	Cascade-delete an entity and all its memories.
`search_graph`	Search Neo4j entities by substring (optional).
`get_entity`	Get all relationships for a specific entity (optional).

The last two require Neo4j, which is entirely optional. You get full Claude Code persistent memory with the first nine tools and nothing but Qdrant + Ollama running.

Check out the full source and documentation at the mem0-mcp-selfhosted GitHub repo.

How the MCP server delivers Claude Code persistent memory

Claude Code <-- stdio --> FastMCP Server
                            |-- auth.py          <- OAT token auto-discovery
                            |-- config.py        <- Env vars -> config
                            |-- helpers.py       <- Error handling, safe bulk-delete
                            |-- graph_tools.py   <- Direct Neo4j Cypher queries
                            '-- server.py        <- 11 MCP tools + prompt
                                  |
                                  |-- mem0ai Memory class
                                  |     |-- Vector: LLM fact extraction -> Ollama embed -> Qdrant
                                  |     '-- Graph: LLM entity extraction -> Neo4j (optional)
                                  |
                                  '-- Infrastructure
                                        |-- Qdrant     <- Vector store
                                        |-- Ollama     <- Embeddings (local)
                                        '-- Neo4j      <- Knowledge graph (optional)

The server is 7 modules (the diagram above shows five of them), each with a specific responsibility. FastMCP handles the MCP protocol layer. The mem0ai library handles memory operations. Everything else is configuration, auth, and safety wrappers. Each Claude Code session connects via stdio, so the memory tools are available the moment you start working.

The vector memory path

When Claude calls add_memory:

The text goes to Anthropic's API for fact extraction (using your Claude subscription)
The extracted facts get embedded locally via Ollama (bge-m3, 1024 dimensions)
The embedding vectors get stored in Qdrant

When Claude calls search_memories, Ollama embeds the query and Qdrant finds the nearest vectors by cosine similarity. The whole pipeline runs in 2-5 seconds.
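To make "nearest vectors by cosine similarity" concrete, here's a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration (bge-m3 produces 1024 dimensions), and the brute-force scan stands in for Qdrant's approximate index, so treat it as a picture of the idea rather than what the server actually runs. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, stored, top_k=1):
    """Brute-force search: score every stored vector, return the best top_k."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored.items()]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:top_k]


# Toy embeddings (assumption: real ones come from the embedding model).
stored = {
    "state leakage between test groups caused flaky CI": [0.9, 0.1, 0.2],
    "rejected Redis for session storage": [0.1, 0.9, 0.3],
}
query = [0.8, 0.2, 0.1]  # stands in for the embedding of "flaky test debugging"

best = nearest(query, stored)
print(best[0][1])
```

The query vector points in nearly the same direction as the first memory, so that memory wins even though no keyword matching is involved.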

Zero-config auth with OAT auto-discovery

Most memory MCP servers require separate API key configurations. This one reads your existing OAT (OAuth Access Token) directly from ~/.claude/.credentials.json. No configuration needed, and your persistent memory setup works the moment you connect.

The server uses a 3-tier fallback chain:

MEM0_ANTHROPIC_TOKEN env var (explicit override)
~/.claude/.credentials.json (auto-discovery, zero config)
ANTHROPIC_API_KEY env var (standard API key)

It detects whether the token is an OAT (sk-ant-oat...) or an API key (sk-ant-api...) and configures the SDK accordingly. OAT tokens use your existing Claude subscription. No separate billing, no additional API key to manage.
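The fallback chain and prefix detection can be sketched in a few lines. This is an illustration under assumptions, not the server's actual code: resolve_token, classify_token, and the read_credentials_file callback are hypothetical names, and reading ~/.claude/.credentials.json is left to the callback because the file's layout isn't covered in this article.

```python
def classify_token(token):
    """Tell an OAT from a standard API key by its prefix."""
    if token.startswith("sk-ant-oat"):
        return "oat"
    if token.startswith("sk-ant-api"):
        return "api_key"
    return "unknown"


def resolve_token(env, read_credentials_file):
    """Walk the 3-tier chain and return (source, token, kind), or None.

    env is a mapping such as os.environ; read_credentials_file is a
    hypothetical callable returning the token from the credentials file,
    or None when the file is missing or unreadable.
    """
    tiers = (
        ("MEM0_ANTHROPIC_TOKEN", lambda: env.get("MEM0_ANTHROPIC_TOKEN")),
        ("~/.claude/.credentials.json", read_credentials_file),
        ("ANTHROPIC_API_KEY", lambda: env.get("ANTHROPIC_API_KEY")),
    )
    for source, getter in tiers:
        token = getter()
        if token:  # first non-empty tier wins
            return source, token, classify_token(token)
    return None


# Example with placeholder values: the credentials file beats the API key.
result = resolve_token(
    {"ANTHROPIC_API_KEY": "sk-ant-api-placeholder"},
    lambda: "sk-ant-oat-placeholder",
)
print(result[0], result[2])
```

Ordering the tiers this way keeps the explicit override first and the plain API key as the last resort, which matches the list above.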

Setting up Claude Code persistent memory in 15 minutes

Prerequisites

Two services running locally:

Qdrant -- self-hosted vector database (one Docker command)
Ollama -- local embeddings (native install or Docker)

And Claude Code with an active subscription.

Step 1: start the infrastructure

# Start Qdrant
docker run -d -p 6333:6333 -p 6334:6334 \
  -v qdrant_storage:/qdrant/storage \
  --name qdrant qdrant/qdrant

# Start Ollama (skip if already installed natively)
docker run -d -p 11434:11434 \
  -v ollama:/root/.ollama \
  --name ollama ollama/ollama

# Pull the embedding model
docker exec ollama ollama pull bge-m3

If Ollama is already running natively on your machine, skip the Docker container and run ollama pull bge-m3 directly. That's it for infrastructure. Your self-hosted AI memory backend is ready for Claude Code to connect. See the full configuration guide for all available environment variables.
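If you want to confirm both services are reachable before wiring up Claude Code, a small check like this can help. It's a sketch under a few assumptions: Qdrant answering on /healthz and Ollama listing models on /api/tags, both at the default ports used above. The names http_ok and check_services are illustrative, not part of the project.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=3.0):
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def check_services(probe=http_ok):
    """Probe both backends; probe is injectable so it can be faked in tests."""
    endpoints = {
        "qdrant": "http://localhost:6333/healthz",    # assumed health endpoint
        "ollama": "http://localhost:11434/api/tags",  # assumed model-list endpoint
    }
    return {name: probe(url) for name, url in endpoints.items()}
```

Calling check_services() returns a dictionary such as {"qdrant": True, "ollama": False}, which tells you which container to look at first.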

Step 2: add the MCP server to Claude Code

One command, available across all your projects:

claude mcp add --scope user --transport stdio mem0 \
  --env MEM0_QDRANT_URL=http://localhost:6333 \
  --env MEM0_EMBED_URL=http://localhost:11434 \
  --env MEM0_EMBED_MODEL=bge-m3 \
  --env MEM0_EMBED_DIMS=1024 \
  --env MEM0_USER_ID=your-user-id \
  -- uvx --from git+https://github.com/elvismdev/mem0-mcp-selfhosted.git mem0-mcp-selfhosted

uvx downloads, installs, and runs the server in an isolated environment. No manual pip install, no virtual env, no dependency conflicts.

Or add it to a single project with .mcp.json in the project root:

.mcp.json for project-scoped setup

{
  "mcpServers": {
    "mem0": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/elvismdev/mem0-mcp-selfhosted.git", "mem0-mcp-selfhosted"],
      "env": {
        "MEM0_QDRANT_URL": "http://localhost:6333",
        "MEM0_EMBED_URL": "http://localhost:11434",
        "MEM0_EMBED_MODEL": "bge-m3",
        "MEM0_EMBED_DIMS": "1024",
        "MEM0_USER_ID": "your-user-id"
      }
    }
  }
}

Step 3: make it automatic with CLAUDE.md

Add this to ~/.claude/CLAUDE.md (global) so Claude uses memory without you asking:

## MCP Servers

- **mem0**: Persistent memory across sessions. At the start of each session,
  `search_memories` for relevant context before asking the user to re-explain
  anything. Use `add_memory` whenever you discover project architecture, coding
  conventions, debugging insights, key decisions, or user preferences. Use
  `update_memory` when prior context changes. When in doubt, save it -- future
  sessions benefit from over-remembering.

With this, Claude proactively searches memory at session start and saves things it learns as it goes. You stop re-explaining. Sessions build on each other. Your Claude Code memory across sessions is now fully automatic.

Step 4: try it

Restart Claude Code, then:

> Search my memories for authentication decisions
> Remember that we rejected Redis for caching because connection pooling
  caused issues at our scale. Revisit at 10k concurrent users.
> Show me all entities in my memory

That's it. Qdrant stores your vectors, Ollama generates embeddings locally, and Claude Code now has persistent memory across every session and project.

Optional: add a knowledge graph with Neo4j

Vector search handles the core memory use case. If you want structured entity relationships on top, Neo4j adds a second dimension.

When you store "I prefer TypeScript with strict mode," the graph layer extracts entities and relationships:

user -> PREFERS -> TypeScript
user -> PREFERS -> strict_mode

You can then ask "what does this user prefer?" and traverse the graph for structured answers rather than relying on text similarity alone.
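As a mental model for that traversal, here's a tiny in-memory graph in Python. It is not Neo4j or Cypher, and TinyGraph is a hypothetical class for illustration only. It just shows why a relationship lookup gives an exact, structured answer instead of a similarity score.

```python
from collections import defaultdict


class TinyGraph:
    """Toy directed graph storing (relation, target) pairs per source node."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        self._edges[source].append((relation, target))

    def targets(self, source, relation):
        """All nodes reachable from source over edges labelled relation."""
        return [t for r, t in self._edges.get(source, []) if r == relation]


graph = TinyGraph()
graph.add("user", "PREFERS", "TypeScript")
graph.add("user", "PREFERS", "strict_mode")

print(graph.targets("user", "PREFERS"))  # ['TypeScript', 'strict_mode']
```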

Quick setup

docker run -d -p 7687:7687 -e NEO4J_AUTH=neo4j/mem0graph neo4j:5

Add to your MCP config:

MEM0_ENABLE_GRAPH=true
MEM0_NEO4J_URL=bolt://127.0.0.1:7687
MEM0_NEO4J_PASSWORD=mem0graph

The quota cost and how to avoid it

Each add_memory with graph enabled triggers 3 additional LLM calls: entity extraction, relationship generation, and contradiction resolution. That's a real quota cost on your Claude subscription.

To protect your quota, route graph operations to a cheaper model:

Provider	Cost	Quality	Notes
Ollama (Qwen3:14b)	Free	0.971 tool-calling F1	~7-8GB VRAM (Q4_K_M)
Gemini 2.5 Flash Lite	Near-free	85.4% entity extraction	Cloud
`gemini_split`	Gemini + Claude	Best combined accuracy	85.4% extraction + 100% contradiction

With the Ollama path, the entire graph pipeline runs locally. Zero cloud dependency.
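A rough way to reason about the quota impact is to count the calls that land on your Claude subscription. The 3 graph calls per add_memory come from the description above. The number of base calls on the vector path is an assumption (1 in this sketch), so replace it with whatever you observe. calls_on_claude is a hypothetical helper.

```python
def calls_on_claude(num_adds, base_calls_per_add, graph_enabled,
                    graph_on_claude, graph_calls_per_add=3):
    """Estimate LLM calls billed to the Claude subscription."""
    per_add = base_calls_per_add
    if graph_enabled and graph_on_claude:
        per_add += graph_calls_per_add  # entity, relationship, contradiction
    return num_adds * per_add


NUM_ADDS = 40  # hypothetical number of add_memory calls
BASE = 1       # assumption: one fact-extraction call per add_memory

vector_only = calls_on_claude(NUM_ADDS, BASE, graph_enabled=False, graph_on_claude=False)
graph_claude = calls_on_claude(NUM_ADDS, BASE, graph_enabled=True, graph_on_claude=True)
graph_routed = calls_on_claude(NUM_ADDS, BASE, graph_enabled=True, graph_on_claude=False)

print(vector_only, graph_claude, graph_routed)  # 40 160 40
```

Under these assumptions, routing the graph work to Ollama or Gemini brings the Claude-side call count back to the vector-only level.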

Environment variables for each graph provider

Ollama (free, local):

MEM0_GRAPH_LLM_PROVIDER=ollama
MEM0_GRAPH_LLM_MODEL=qwen3:14b

Gemini (near-free):

MEM0_GRAPH_LLM_PROVIDER=gemini
GOOGLE_API_KEY=your-google-api-key

Split-model (best accuracy):

MEM0_GRAPH_LLM_PROVIDER=gemini_split
GOOGLE_API_KEY=your-google-api-key
MEM0_GRAPH_CONTRADICTION_LLM_PROVIDER=anthropic

Neo4j is entirely optional. You get useful self-hosted AI memory with Qdrant and Ollama alone. See the project README for the complete list of environment variables.

How self-hosted mem0 compares to other Claude Code persistent memory solutions

Developers I talked to on Reddit had an interesting setup: an Obsidian vault connected to Claude via MCP, with all their chat logs and notes organized by project. When they needed context, they told Claude to load a specific project folder. It worked, but every load pulled in full transcripts, and as their vault grew, the context cost grew linearly with it.

One of the developers posted a good question: "Isn't this setup I have the same as what you built?" Not quite. The retrieval model is fundamentally different.

Approach	Search	Storage	Curation	Cross-project
CLAUDE.md + Auto Memory	None (loads all)	Markdown files	Mixed (manual + auto)	Per-project (global option)
mem0-mcp-selfhosted	Semantic vector	Qdrant vectors	Automatic	Global
Graphiti (Zep)	Hybrid graph + vector	Graph DB (required)	Automatic	Depends
Obsidian + MCP	Keyword or semantic	Vault files	Manual	Per-vault

When each approach fits

CLAUDE.md + Auto Memory is perfect for small projects with manageable context. Zero setup, immediate value, and Auto Memory adds automatic note-taking on top. I let Claude Code do its thing and use both alongside mem0, and they complement each other well.

The CLAUDE.md tells Claude Code how to use memory tools. mem0 handles the semantic storage and retrieval.

mem0-mcp-selfhosted makes sense when you need long-term memory that works across multiple projects and accumulates knowledge over weeks, or when your preferences have outgrown what a flat file handles gracefully. Semantic search is the differentiator at scale.

Graphiti is worth evaluating if structured temporal relationships are your primary need. It's graph-first, meaning a graph database is required, not optional. Neo4j is the primary backend, with FalkorDB, Kuzu, and Amazon Neptune also supported. It offers bi-temporal tracking that mem0 doesn't, recording both when a fact became true and when the system learned it. The infrastructure is heavier, and depending on your LLM provider you may need separate API keys for LLM and embedding operations.

Obsidian + MCP works well if you're already an Obsidian power user who wants visual browsing and manual editing of notes. Basic implementations use keyword search over vault files, though some servers like obsidian-mcp-tools add semantic search via the Smart Connections plugin. All implementations store full documents rather than distilled facts, so context costs scale with vault size.

Get started and let me know how it goes

Here's what we covered:

Claude Code's built-in memory captures rules and summaries, but not detailed reasoning chains. Claude Code persistent memory with semantic search requires an external tool.
mem0-mcp-selfhosted gives Claude Code 11 memory tools backed by self-hosted Qdrant + Ollama.
Semantic vector search finds memories by meaning, not keywords.
The CLAUDE.md integration makes memory usage automatic. No manual triggering needed.
Neo4j adds structured entity relationships, but it's entirely optional.
Zero-config auth reads your existing OAT token. No API key setup.

The setup takes about 15 minutes: two Docker containers, one claude mcp add command, and a CLAUDE.md snippet. After that, Claude Code persistent memory builds up knowledge over time across all your projects.

claude mcp add --scope user --transport stdio mem0 \
  --env MEM0_QDRANT_URL=http://localhost:6333 \
  --env MEM0_EMBED_URL=http://localhost:11434 \
  --env MEM0_EMBED_MODEL=bge-m3 \
  --env MEM0_EMBED_DIMS=1024 \
  --env MEM0_USER_ID=your-user-id \
  -- uvx --from git+https://github.com/elvismdev/mem0-mcp-selfhosted.git mem0-mcp-selfhosted

Full source code, documentation, and the issue tracker for mem0-mcp-selfhosted are on GitHub. If you're interested in more Claude Code tooling, check out my WordPress performance review skill.

I'd love to know:

Does Claude use memory proactively with the CLAUDE.md setup in your experience?
What would you want Claude to remember that it currently forgets?
How's the setup experience? Too many pieces or manageable?

Install it, search for something, and open an issue or drop a comment if the results surprise you.

elvismdev / mem0-mcp-selfhosted

Self-hosted mem0 MCP server for Claude Code. Run a complete memory server against self-hosted Qdrant + Neo4j + Ollama while using Claude as the main LLM.

mem0-mcp-selfhosted

Self-hosted mem0 MCP server for Claude Code. Run a complete memory server against self-hosted Qdrant + Neo4j + Ollama, with your choice of Anthropic (Claude) or Ollama as the main LLM.

Uses the mem0ai package directly as a library, supports both Claude's OAT token and fully local Ollama setups, and exposes 11 MCP tools for full memory management.

Prerequisites

Service	Required	Purpose
Qdrant	Yes	Vector memory storage and search
Ollama	Yes	Embedding generation (`bge-m3`) and optionally local LLM
Neo4j 5+	Optional	Knowledge graph (entity relationships)
Google API Key	Optional	Required only for `gemini`/`gemini_split` graph providers

Python >= 3.10 and uv.

Authentication: The default setup uses Claude (Anthropic) as the LLM for fact extraction. No API key is needed: the server automatically uses your Claude Code session token. For fully local setups, set MEM0_PROVIDER=ollama. See Authentication for advanced options.

Quick Start

Default (Anthropic)

Add the…

Claude Code Skill for WordPress Performance Reviews

Elvis Mørales Fdz — Sat, 29 Nov 2025 03:42:16 +0000

What if you could get a performance code review from someone who's seen WordPress sites fail under load and remembers all the patterns that caused it?

That's what this Claude Code skill does. It knows 50+ anti-patterns that cause WordPress performance problems, and it flags them with severity levels, line numbers, and explanations of what actually happens when they run in production.

I built it because I kept seeing the same issues over and over. A site works fine in development, handles QA traffic without problems, then gets on the news and falls over within an hour. The code wasn't "wrong", it just had patterns that don't survive real traffic. Unbounded database queries. Cache bypass. N+1 problems. Accidental self-DDoS from polling. These things are invisible until they're not.

TL;DR: This Claude Code skill performs AI-powered WordPress code review that catches performance anti-patterns static analysis can't see. It understands context across files, knows platform-specific fixes for WordPress VIP, WP Engine, and shared hosting, and explains why patterns fail under load - not just that they violate a rule.

What WordPress Performance Issues It Catches

The obvious stuff is here: posts_per_page => -1, query_posts(), session_start() on the frontend. If you've worked on WordPress at any scale, you know these. But the skill also catches the patterns that look fine until they aren't:

Database writes on page loads. This one looks harmless:

add_action( 'wp_head', function() {
 update_option( 'page_views', get_option( 'page_views', 0 ) + 1 );
} );

It's just incrementing a counter, right? But update_option is an INSERT or UPDATE query. Every visitor triggers a database write. At 1,000 visitors per hour, that's 1,000 write queries competing for row locks on wp_options. Your database CPU spikes, other queries queue up, page loads slow down, and eventually things start timing out.

Dynamic transient keys. This pattern creates rows in wp_options for every user:

set_transient( "user_{$user_id}_cart", $cart_data, HOUR_IN_SECONDS );

Syntactically fine. Functionally fine for a few hundred users. At 10,000 users, you have 20,000 rows in wp_options - WordPress stores the value and the timeout in separate rows. I've seen this pattern bloat wp_options to 40GB. The fix is to use wp_cache_set() instead, which uses the object cache and doesn't touch the database. Keep in mind that wp_cache_set() data only survives beyond the current request when a persistent object cache (Redis or Memcached) is installed.

Polling patterns. Someone adds a "live updates" feature:

setInterval(() => fetch('/wp-json/myapp/v1/updates'), 5000);

Every browser tab polls your server every 5 seconds. With 100 concurrent users, that's 100 requests every 5 seconds - 20 per second hitting your origin. These requests can't be cached because they're checking for "new" data. You've accidentally DDoS'd yourself.
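The arithmetic is worth keeping handy when someone proposes a polling interval. Here's a small sketch in Python. Both function names are hypothetical, and it assumes every open tab polls at the same fixed interval with no caching in between.

```python
def polling_rps(open_tabs, interval_seconds):
    """Average requests per second hitting the origin."""
    return open_tabs / interval_seconds


def min_interval_seconds(open_tabs, max_rps):
    """Shortest polling interval that keeps load at or under max_rps."""
    return open_tabs / max_rps


print(polling_rps(100, 5))           # 20.0
print(min_interval_seconds(100, 2))  # 50.0
```

With the numbers from the example, keeping those same 100 tabs under 2 requests per second would mean polling every 50 seconds instead of every 5.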

Cron jobs that never stop multiplying. This is subtle:

register_activation_hook( __FILE__, function() {
 wp_schedule_event( time(), 'hourly', 'my_plugin_sync' );
} );

Every time the plugin is activated, it schedules a new cron job. Deactivate and reactivate a few times during debugging, and you've got 5 instances of my_plugin_sync running every hour. The fix is checking wp_next_scheduled() first - but if you don't know to look for it, you won't.

Figure: Output summary from reviewing a WooCommerce site with custom plugins.

How a Claude Code Skill Compares to Static Analysis Tools

I still run PHPCS with WordPress Coding Standards in my workflow - it's integrated with VS Code via this PHP Sniffer extension and catches real issues on every save. In some projects, tools like PHPStan and Psalm also add type checking and can catch bugs before runtime.

But static analysis has limits. It can tell you "this line violates rule X" but not "this pattern across your codebase will cause problems under specific conditions."

When to use static analysis:

Enforcing coding standards on every save
Type checking and catching null reference errors
CI/CD pipeline checks for syntax and style
Catching obvious mistakes before code review

When to use AI-powered code reviews:

Finding performance issues that span multiple files
Understanding WordPress-specific runtime behavior
Getting context-aware recommendations for your hosting platform
Catching patterns that are syntactically valid but semantically problematic

The best approach (if they're already part of your workflow): Use both. Run PHPCS and PHPStan in your IDE and CI pipeline. Then use this Claude Code skill for deeper WordPress performance analysis before major releases or when debugging production issues.

Here's an example of something static analysis can't catch. This code silently turns into an unbounded query when it fails:

$post_id = get_post_id_from_external_api(); // Returns false on timeout or 404
$query = new WP_Query( array( 'p' => intval( $post_id ) ) );

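When get_post_id_from_external_api() returns false, intval(false) is 0. And in WP_Query, 'p' => 0 doesn't mean "get post with ID 0" - it means "don't filter by post ID." The query now matches every post instead of one. If there's no posts_per_page set, WordPress applies the site default, so you silently get the latest posts rather than the one you asked for; with posts_per_page => -1, you've just queried every post in the database.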

Static analysis can't catch this. It sees a valid function call with a valid argument type. It doesn't know that $post_id might be falsy, and it doesn't know the WordPress-specific behavior of 'p' => 0. The skill can reason about this and flag it: "You're passing a potentially falsy value through intval() into a post ID argument - validate before querying."

Another example: session_start() in a single plugin makes your entire site uncacheable. Page caching serves identical HTML to everyone. PHP sessions create per-visitor state - incompatible by design. One line in one plugin you installed last year can be the reason your page cache hit rate is 0% and you're wondering why Cloudflare isn't helping.

Static analysis sees session_start() and might flag it generically. But it doesn't understand the site-wide caching implications. The skill explains: "This bypasses all page caching. Every request hits PHP. Check if this plugin actually needs sessions on the frontend, or if it can be limited to logged-in users."

Finding N+1 Database Queries Across Files

This one is hard to spot in manual code review and very challenging for file-by-file static analysis to catch:

// In archive.php
while ( have_posts() ) {
 the_post();
 $author = get_extended_author_data( get_the_author_meta( 'ID' ) );
 // render post...
}

// In functions.php
function get_extended_author_data( $user_id ) {
 $user = get_user_by( 'id', $user_id ); // Database query
 $meta = get_user_meta( $user_id ); // Another query if not primed
 return array( 'user' => $user, 'meta' => $meta );
}

The loop runs 10 times. Each iteration calls get_extended_author_data(), which runs 2 queries. That's 20 database queries where there should be 2 - one to get all the users, one to get all the meta.

When you're looking at archive.php, you see a function call. When you're looking at functions.php, you see a function that does a couple of queries. Neither file is "wrong" in isolation. The problem is the interaction between them.

One fix could be to prime the caches before the loop - and the skill can suggest the best approach for your specific case depending on the context it learns from your codebase as a whole:

// Prime author cache before the loop
$author_ids = wp_list_pluck( $wp_query->posts, 'post_author' );
cache_users( $author_ids ); // Single query loads all authors

while ( have_posts() ) {
 the_post();
 $author = get_extended_author_data( get_the_author_meta( 'ID' ) );
 // Now uses cached data - no additional queries
}

The skill sees across files. It understands that calling a function with database queries inside a WordPress loop is a red flag pattern, regardless of where that function is defined.
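The query counts are easy to verify with a language-agnostic model. The sketch below is Python, not WordPress code: CountingDB is a hypothetical in-memory stand-in that increments a counter on every lookup, and it assumes ten posts by ten different authors, as in the example above.

```python
class CountingDB:
    """In-memory stand-in for a database that counts every query."""

    def __init__(self, users, meta):
        self.users, self.meta, self.queries = users, meta, 0

    def get_user(self, user_id):
        self.queries += 1
        return self.users[user_id]

    def get_meta(self, user_id):
        self.queries += 1
        return self.meta[user_id]

    def get_users(self, ids):
        self.queries += 1  # one query for the whole batch
        return {i: self.users[i] for i in ids}

    def get_meta_bulk(self, ids):
        self.queries += 1  # one query for the whole batch
        return {i: self.meta[i] for i in ids}


def render_naive(db, author_ids):
    """N+1 shape: two lookups inside the loop for every post."""
    return [(db.get_user(a), db.get_meta(a)) for a in author_ids]


def render_primed(db, author_ids):
    """Primed shape: batch-load once, then read from the local cache."""
    unique = set(author_ids)
    users, meta = db.get_users(unique), db.get_meta_bulk(unique)
    return [(users[a], meta[a]) for a in author_ids]


def make_db():
    ids = range(1, 11)
    return CountingDB({i: f"author-{i}" for i in ids},
                      {i: {"bio": f"bio {i}"} for i in ids})


author_ids = list(range(1, 11))  # ten posts, ten distinct authors
naive_db, primed_db = make_db(), make_db()
naive_rows = render_naive(naive_db, author_ids)
primed_rows = render_primed(primed_db, author_ids)

print(naive_db.queries, primed_db.queries)  # 20 2
```

Both versions render the same rows. Only the number of round trips changes, which is why the problem stays invisible until traffic grows.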

WooCommerce Performance Anti-Patterns

WooCommerce sites face unique WordPress performance challenges. The skill catches several WooCommerce-specific issues that can tank your store's performance:

Cart fragment AJAX overhead. In older WooCommerce versions (pre-7.8), the cart fragments script ran on every page load - even pages with no cart widget. While WooCommerce 7.8+ improved this behavior, many stores still have legacy code or third-party plugins that explicitly enqueue the script:

// Legacy pattern that can still cause unnecessary AJAX on every page
wp_enqueue_script( 'wc-cart-fragments' );

The skill flags when cart fragments are being loaded unnecessarily and suggests limiting them to pages that actually display cart data, or reviewing whether your theme/plugins truly need real-time cart updates on every page.

Order queries without limits. This pattern appears in admin dashboards and reports:

$orders = wc_get_orders( array(
 'status' => 'completed',
 'date_created' => '>' . strtotime( '-30 days' ),
) );

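Looks reasonable - get last 30 days of orders. But stores with high volume can have 10,000+ orders in 30 days. With no explicit 'limit', wc_get_orders() falls back to the site's posts_per_page setting, so a report built on this silently sees only the first page of orders, and the tempting workaround of 'limit' => -1 loads them all into memory. The skill recommends adding pagination or using 'limit' => 100 with proper iteration.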

Product query loops in templates. Custom product displays often trigger extra queries:

// In a custom template
$related = wc_get_related_products( $product_id, 4 );
foreach ( $related as $related_id ) {
 $related_product = wc_get_product( $related_id ); // Query per product
 // display...
}

The skill suggests using wc_get_products() with the IDs to batch-load products in a single query.

Platform-Specific WordPress Performance Advice

The right fix for an issue can also depend on where your production code runs. The Claude Code skill adapts its recommendations to your hosting environment - if it's able to identify it from your codebase. It can also point out the alternative approaches that WordPress-specific hosting platforms provide.

On WordPress VIP, there are platform-specific functions that handle caching for you. Instead of url_to_postid() - which does an expensive lookup on every call - you use wpcom_vip_url_to_postid(), which caches the result. For external HTTP calls, vip_safe_wp_remote_get() adds circuit-breaker logic that fails gracefully after repeated timeouts. The skill knows these alternatives exist and recommends them when it identifies VIP-specific patterns in your code.

On managed hosts like WP Engine or Pantheon, you usually have Redis or Memcached available. Transients use the object cache instead of the database, so the dynamic transient key problem I mentioned earlier is less severe (though still not ideal). Query Monitor is typically available for profiling slow queries.

On shared hosting, there's often no persistent object cache at all. Transients fall back to wp_options. That 20,000-row bloat is very real. You also can't rely on server-level cron, so DISABLE_WP_CRON may not be an option. The skill can flag these patterns more aggressively when it identifies, or you mention in your prompts, that you're not on a platform with object cache.

Real Results from WordPress Code Reviews

To give you a sense of what the skill finds in practice:

On a 50-file custom theme: The skill identified 12 performance issues in about 30 seconds. The most critical was an N+1 query pattern in the archive template that was causing 47 database queries per page load instead of 5. After fixing, page generation time dropped from 800ms to 180ms.

On a commercial Pro WooCommerce plugin: Found 4 issues including an unbounded order query in a reporting function and a missing nonce check (security, not performance, but still caught it). The order query was loading 15,000 orders into memory on a client with high volume.

On a migration from shared hosting to WP Engine: The skill identified 6 patterns that would work differently with Redis object cache enabled, including transient usage that was previously falling back to database but would now persist correctly.

The skill won't catch everything - it's not running your code or profiling actual queries. But it catches patterns that experienced WordPress developers learn to avoid after seeing them cause problems in production.

How to Install and Use the WordPress Performance Skill

Installation

First, add the marketplace to Claude Code:

/plugin marketplace add elvismdev/claude-wordpress-skills

Then install the plugin:

/plugin install claude-wordpress-skills

Important: After installation, exit Claude Code completely and relaunch it. The skill won't load until you restart:

# Exit Claude Code (Ctrl+C or type 'exit')
# Then relaunch from your project directory
claude

To verify the skill is loaded, you can check your installed plugins:

/plugin

You should see claude-wordpress-skills listed as installed and enabled.

Usage

Once installed, ask Claude Code to review your code:

"Review this plugin for performance issues"
"Check theme/functions.php before we launch"
"Audit this PR for WordPress anti-patterns"
"Look for N+1 queries in this WooCommerce extension"

The skill activates automatically when you're reviewing WordPress PHP code or mention performance-related terms like "slow," "timeout," or "performance review."

Troubleshooting

Skill not activating? Make sure you restarted Claude Code after installation. Check that the plugin shows as "enabled" in /plugin output.

Getting too many warnings? The skill errs on the side of caution. You can ask it to focus on critical issues only: "Show me only high-severity performance issues."

Need platform-specific advice? Tell Claude Code your hosting environment: "We're on WP Engine with Redis" and it will adjust recommendations.

Get It on GitHub

GitHub: https://github.com/elvismdev/claude-wordpress-skills

If you find patterns it misses, false positives it flags, or anti-patterns you've seen in the wild that aren't covered, open an issue or PR. The goal is to capture what experienced WordPress developers know to look for and make it available to everyone doing WordPress code review.

Contributing

The skill is defined in markdown files that Claude Code reads to understand WordPress performance patterns. To add a new anti-pattern:

1. Fork the repository
2. Add the pattern to the relevant category file
3. Include: pattern description, code example, severity level, fix recommendation
4. Submit a PR with before/after examples

Some Example Questions That Come to Mind

How do I check WordPress plugin performance?

Install the Claude Code WordPress performance skill, navigate to your plugin directory, and ask Claude to "review this plugin for performance issues." It will analyze your PHP files for 50+ anti-patterns including database query issues, caching problems, and code patterns that don't scale. For runtime profiling, combine with Query Monitor to see actual query counts and execution times.

What causes slow WordPress database queries?

The most common causes are unbounded queries (posts_per_page => -1), N+1 query patterns where loops trigger individual queries instead of batch fetching, missing indexes on custom tables, meta_query clauses on unindexed meta values, and autoloaded options in wp_options that load on every page. The skill checks for each of these patterns and explains the specific fix for each.
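For the unbounded-query case, a bounded alternative might look like the sketch below. The helper name and the default limit are hypothetical; posts_per_page, fields and no_found_rows are standard WP_Query arguments:

```php
<?php
/**
 * Fetch a capped list of recent post IDs instead of using posts_per_page => -1.
 * Hypothetical helper: the name and default limit are illustrative only.
 */
function myprefix_get_recent_post_ids( $limit = 100 ) {
    $query = new WP_Query(
        array(
            'post_type'      => 'post',
            'post_status'    => 'publish',
            'posts_per_page' => $limit, // Hard cap instead of -1.
            'fields'         => 'ids',  // Return IDs only, not full post objects.
            'no_found_rows'  => true,   // Skip the extra count query used for pagination.
        )
    );

    return $query->posts;
}
```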

How do I find N+1 queries in WordPress?

N+1 queries happen when a loop triggers a database query on each iteration. Look for function calls inside while ( have_posts() ) or foreach loops that call get_post_meta(), get_user_by(), get_the_terms(), or similar functions. The Claude Code skill specifically identifies these patterns even when the querying function is defined in a different file than the loop.
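One common fix, sketched here under the assumption that you already have the list of post IDs, is to prime the meta cache once with update_postmeta_cache() before looping. The helper name and the meta key parameter are hypothetical:

```php
<?php
/**
 * Read one meta value for many posts without a query per post.
 * Hypothetical helper: the name is illustrative only.
 */
function myprefix_get_meta_for_posts( array $post_ids, $meta_key ) {
    if ( empty( $post_ids ) ) {
        return array();
    }

    // Load the meta for all posts up front, so the loop below reads from cache.
    update_postmeta_cache( $post_ids );

    $values = array();
    foreach ( $post_ids as $post_id ) {
        $values[ $post_id ] = get_post_meta( $post_id, $meta_key, true );
    }

    return $values;
}
```

Posts fetched through WP_Query usually have their meta cache primed already, so this matters most when the IDs come from a custom query or an external source.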

What performance issue burned you the worst? I'm especially curious about patterns specific to WooCommerce or multisite - might end up in the next version.