<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gregory Dickson</title>
    <description>The latest articles on DEV Community by Gregory Dickson (@gregory_dickson_6dd6e2b55).</description>
    <link>https://dev.to/gregory_dickson_6dd6e2b55</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3621284%2Feaeb3a73-3aff-4141-b3e9-a16200e5aba6.png</url>
      <title>DEV Community: Gregory Dickson</title>
      <link>https://dev.to/gregory_dickson_6dd6e2b55</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gregory_dickson_6dd6e2b55"/>
    <language>en</language>
    <item>
      <title>MemoryGraph vs Graphiti: Choosing the Right Memory for Your AI Agent</title>
      <dc:creator>Gregory Dickson</dc:creator>
      <pubDate>Fri, 26 Dec 2025 13:56:17 +0000</pubDate>
      <link>https://dev.to/gregory_dickson_6dd6e2b55/memorygraph-vs-graphiti-choosing-the-right-memory-for-your-ai-agent-526k</link>
      <guid>https://dev.to/gregory_dickson_6dd6e2b55/memorygraph-vs-graphiti-choosing-the-right-memory-for-your-ai-agent-526k</guid>
      <description>&lt;p&gt;&lt;em&gt;When general-purpose memory meets coding-specific memory&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;December 2025 - Gregory Dickson&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;You've decided your AI agent needs persistent memory. Context loss between sessions is one of the biggest friction points in AI-assisted development.&lt;/p&gt;

&lt;p&gt;Now you're comparing options. If you've done any research, you've probably found &lt;a href="https://github.com/getzep/graphiti" rel="noopener noreferrer"&gt;Graphiti&lt;/a&gt;. With 21,000+ GitHub stars, Y Combinator backing, and a &lt;a href="https://arxiv.org/abs/2501.13956" rel="noopener noreferrer"&gt;peer-reviewed architecture paper&lt;/a&gt;, it's the category leader in AI agent memory.&lt;/p&gt;

&lt;p&gt;So why would you consider anything else?&lt;/p&gt;

&lt;p&gt;Because &lt;strong&gt;the best tool depends on what you're building&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This post offers an honest comparison to help you choose. We built MemoryGraph, so we're biased. But we'll be fair about where Graphiti excels and where we think MemoryGraph is the better fit.&lt;/p&gt;




&lt;h2&gt;
  
  
  The TL;DR
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If You're Building...&lt;/th&gt;
&lt;th&gt;Consider...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;A general AI agent (customer service, personal assistant, enterprise bot)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Graphiti&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A coding agent (Claude Code, Cursor, Aider, Continue)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;An agent that needs temporal queries across all entity types&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Graphiti&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;An agent that needs to know what &lt;em&gt;solved&lt;/em&gt; what, what &lt;em&gt;caused&lt;/em&gt; what&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production infrastructure with Neo4j/FalkorDB already deployed&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Graphiti&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zero-infrastructure local development&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you're building a coding agent and want to get started in 60 seconds without infrastructure, MemoryGraph is purpose-built for you. If you're building a general-purpose agent and have database infrastructure, Graphiti is excellent.&lt;/p&gt;




&lt;h2&gt;
  
  
  What They Have in Common
&lt;/h2&gt;

&lt;p&gt;Both MemoryGraph and Graphiti are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Graph-based&lt;/strong&gt;: not flat vector stores&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP-compatible&lt;/strong&gt;: work with Claude Desktop, Cursor, and other MCP clients&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apache 2.0 licensed&lt;/strong&gt;: open source, enterprise-friendly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python-native&lt;/strong&gt;: built for the AI/ML ecosystem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relationship-aware&lt;/strong&gt;: store entities &lt;em&gt;and&lt;/em&gt; their connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both emerged from the same insight: &lt;strong&gt;vector similarity alone isn't enough for agent memory&lt;/strong&gt;. When you ask "What did we decide last week?" or "What caused this bug?", you need relationships and temporal context, not just embedding similarity.&lt;/p&gt;
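&lt;p&gt;As a rough illustration of that insight (toy code, not either library's API), compare what similarity ranking and an explicit relationship can each answer:&lt;/p&gt;

```python
# Illustrative sketch (neither library's API): why text similarity alone
# cannot answer "what solved this error?".
memories = {
    "error_1": "Redis timeout error under load",
    "note_1": "Notes on Redis timeout tuning",       # similar wording, no causal link
    "solution_1": "Increased connection pool size",  # different wording, the actual fix
}

def word_overlap(a, b):
    """Crude stand-in for embedding similarity: Jaccard overlap of words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa.intersection(wb)) / len(wa.union(wb))

ranked = sorted(memories, key=lambda k: word_overlap(memories["error_1"], memories[k]), reverse=True)
# Similarity ranks the unrelated note above the actual fix:
print(ranked)  # ['error_1', 'note_1', 'solution_1']

# An explicit relationship answers the question directly:
edges = [("solution_1", "SOLVES", "error_1")]
fixes = [src for src, rel, dst in edges if rel == "SOLVES" and dst == "error_1"]
print(fixes)  # ['solution_1']
```

&lt;p&gt;The note shares words with the error, so similarity ranks it higher than the fix; the &lt;code&gt;SOLVES&lt;/code&gt; edge makes the answer unambiguous.&lt;/p&gt;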




&lt;h2&gt;
  
  
  Where They Differ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Target Use Case
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; is designed for &lt;em&gt;any&lt;/em&gt; AI agent. Its tagline is "Build Real-Time Knowledge Graphs for AI Agents," and the examples in its docs include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Kendra loves Adidas shoes"&lt;/li&gt;
&lt;li&gt;Customer preferences across sessions&lt;/li&gt;
&lt;li&gt;Business entity relationships&lt;/li&gt;
&lt;li&gt;User interaction history&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This generality means Graphiti can model any domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; is designed specifically for &lt;em&gt;coding&lt;/em&gt; agents. Every feature is optimized for software development workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;12 memory types built for code (&lt;code&gt;solution&lt;/code&gt;, &lt;code&gt;problem&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;, &lt;code&gt;fix&lt;/code&gt;, &lt;code&gt;code_pattern&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;li&gt;35+ relationship types for development (&lt;code&gt;SOLVES&lt;/code&gt;, &lt;code&gt;CAUSES&lt;/code&gt;, &lt;code&gt;DEPENDS_ON&lt;/code&gt;, &lt;code&gt;IMPROVES&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;li&gt;Integration patterns for Claude Code, Cursor, Aider, Continue&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This specificity means less configuration for coding use cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Relationship Model
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; uses a flexible triplet model where you define your own ontology:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Graphiti: Define custom entity and edge types
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EntityNode&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EntityNode&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Loves&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EntityEdge&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;strength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This flexibility enables custom ontologies for any domain, but requires upfront design work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; provides 35+ pre-defined relationship types organized into 7 categories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# MemoryGraph: Use built-in coding relationships
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_relationship&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from_memory_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution_123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;to_memory_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;problem_456&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relationship_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SOLVES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# One of 35+ built-in types
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Causal&lt;/strong&gt;: &lt;code&gt;CAUSES&lt;/code&gt;, &lt;code&gt;TRIGGERS&lt;/code&gt;, &lt;code&gt;LEADS_TO&lt;/code&gt;, &lt;code&gt;PREVENTS&lt;/code&gt;, &lt;code&gt;BREAKS&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution&lt;/strong&gt;: &lt;code&gt;SOLVES&lt;/code&gt;, &lt;code&gt;ADDRESSES&lt;/code&gt;, &lt;code&gt;ALTERNATIVE_TO&lt;/code&gt;, &lt;code&gt;IMPROVES&lt;/code&gt;, &lt;code&gt;REPLACES&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context&lt;/strong&gt;: &lt;code&gt;OCCURS_IN&lt;/code&gt;, &lt;code&gt;APPLIES_TO&lt;/code&gt;, &lt;code&gt;WORKS_WITH&lt;/code&gt;, &lt;code&gt;REQUIRES&lt;/code&gt;, &lt;code&gt;USED_IN&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learning&lt;/strong&gt;: &lt;code&gt;BUILDS_ON&lt;/code&gt;, &lt;code&gt;CONTRADICTS&lt;/code&gt;, &lt;code&gt;CONFIRMS&lt;/code&gt;, &lt;code&gt;GENERALIZES&lt;/code&gt;, &lt;code&gt;SPECIALIZES&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Similarity&lt;/strong&gt;: &lt;code&gt;SIMILAR_TO&lt;/code&gt;, &lt;code&gt;VARIANT_OF&lt;/code&gt;, &lt;code&gt;RELATED_TO&lt;/code&gt;, &lt;code&gt;ANALOGY_TO&lt;/code&gt;, &lt;code&gt;OPPOSITE_OF&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflow&lt;/strong&gt;: &lt;code&gt;FOLLOWS&lt;/code&gt;, &lt;code&gt;DEPENDS_ON&lt;/code&gt;, &lt;code&gt;ENABLES&lt;/code&gt;, &lt;code&gt;BLOCKS&lt;/code&gt;, &lt;code&gt;PARALLEL_TO&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality&lt;/strong&gt;: &lt;code&gt;EFFECTIVE_FOR&lt;/code&gt;, &lt;code&gt;INEFFECTIVE_FOR&lt;/code&gt;, &lt;code&gt;PREFERRED_OVER&lt;/code&gt;, &lt;code&gt;DEPRECATED_BY&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For coding agents, these relationships are immediately useful without ontology design.&lt;/p&gt;
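&lt;p&gt;A toy sketch (plain Python, not MemoryGraph's API) of what typed edges buy you: a traversal can follow only &lt;code&gt;CAUSES&lt;/code&gt; links to find a root cause, ignoring other relationship types along the way:&lt;/p&gt;

```python
# Hedged sketch: walking only CAUSES edges backwards to find root causes,
# while an ADDRESSES edge on the same node is deliberately not followed.
from collections import deque

edges = [
    ("pool_exhaustion", "CAUSES", "redis_timeout"),
    ("redis_timeout", "CAUSES", "checkout_failures"),
    ("retry_patch", "ADDRESSES", "redis_timeout"),  # present, but not a causal link
]

def causal_ancestors(node):
    """Breadth-first walk over incoming CAUSES edges only."""
    found, queue = [], deque([node])
    while queue:
        current = queue.popleft()
        for src, rel, dst in edges:
            if rel == "CAUSES" and dst == current:
                found.append(src)
                queue.append(src)
    return found

print(causal_ancestors("checkout_failures"))  # ['redis_timeout', 'pool_exhaustion']
```

&lt;p&gt;With an untyped or free-form ontology you would first have to decide what "causes" means in your domain; here the edge type already carries that semantics.&lt;/p&gt;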

&lt;h3&gt;
  
  
  3. Entity Extraction
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; uses LLM-powered entity extraction. When you add an episode (a piece of text), it automatically extracts entities and relationships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Graphiti: Automatic extraction
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;graphiti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;episode_body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I fixed the timeout by adding retry logic with exponential backoff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EpisodeType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# LLM extracts: entities, relationships, timestamps
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This eliminates manual data structuring, but adds latency (LLM calls) and cost (tokens).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; uses explicit storage. You decide what to store:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# MemoryGraph: Explicit storage
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;store_memory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fixed timeout with retry logic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Added exponential backoff with max 3 retries...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;exponential-backoff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you control over exactly what's stored, with no LLM extraction overhead. The tradeoff is that your agent (or you) must explicitly store memories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture comparison:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Graphiti (automatic extraction):
┌─────────┐    LLM     ┌──────────┐   Neo4j   ┌───────────┐
│ Episode │ ─────────▶ │ Entities │ ────────▶ │ Knowledge │
│  (text) │  Extract   │ + Edges  │   Store   │   Graph   │
└─────────┘  500ms-2s  └──────────┘           └───────────┘

MemoryGraph (explicit storage):
┌────────┐   Direct    ┌────────────┐
│ Memory │ ──────────▶ │ SQLite/Neo │   No LLM required
└────────┘    &amp;lt;5ms     └────────────┘
     │
     ▼ (explicit)
┌──────────────┐
│ Relationship │   You control what's linked
└──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The extraction trade-off:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Graphiti (automatic)&lt;/th&gt;
&lt;th&gt;MemoryGraph (explicit)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cognitive load&lt;/td&gt;
&lt;td&gt;Lower: just feed it text&lt;/td&gt;
&lt;td&gt;Higher: you decide what to store&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Relationship discovery&lt;/td&gt;
&lt;td&gt;May find implicit connections&lt;/td&gt;
&lt;td&gt;Only what you specify&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage latency&lt;/td&gt;
&lt;td&gt;500ms-2s (LLM call)&lt;/td&gt;
&lt;td&gt;&amp;lt;5ms (direct write)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost per memory&lt;/td&gt;
&lt;td&gt;$0.003-$0.01 (token cost)&lt;/td&gt;
&lt;td&gt;$0 (no LLM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extraction quality&lt;/td&gt;
&lt;td&gt;Depends on model/prompts&lt;/td&gt;
&lt;td&gt;Deterministic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
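
&lt;p&gt;Back-of-envelope arithmetic using the table's own figures shows what automatic extraction adds up to over a month of heavy use (these are the post's estimates, not benchmarks):&lt;/p&gt;

```python
# Monthly cost and latency of LLM-based extraction, using the ranges
# from the comparison table above.
memories_per_day = 50
days = 30
cost_per_memory = (0.003, 0.01)  # Graphiti token-cost range from the table
latency_s = (0.5, 2.0)           # LLM extraction latency range from the table

total = memories_per_day * days
print(f"{total} memories per month")
print(f"LLM cost: ${total * cost_per_memory[0]:.2f} to ${total * cost_per_memory[1]:.2f}")
print(f"Time spent extracting: {total * latency_s[0] / 60:.1f} to {total * latency_s[1] / 60:.1f} minutes")
```

&lt;p&gt;At 50 memories a day, that is roughly $4.50 to $15.00 and 12 to 50 minutes of extraction time per month; explicit storage pays neither.&lt;/p&gt;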

&lt;h3&gt;
  
  
  4. Infrastructure Requirements
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; requires a graph database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Graphiti setup&lt;/span&gt;
docker run neo4j...              &lt;span class="c"&gt;# Or FalkorDB, Kuzu, Neptune&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;NEO4J_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;bolt://localhost:7687
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;NEO4J_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...        &lt;span class="c"&gt;# Required for entity extraction&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;graphiti-core[neo4j]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is appropriate for production systems, but it's friction for getting started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; defaults to SQLite with zero configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# MemoryGraph setup&lt;/span&gt;
pipx &lt;span class="nb"&gt;install &lt;/span&gt;memorygraphMCP
claude mcp add &lt;span class="nt"&gt;--scope&lt;/span&gt; user memorygraph &lt;span class="nt"&gt;--&lt;/span&gt; memorygraph
&lt;span class="c"&gt;# Done. Database created automatically.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can upgrade to Neo4j, FalkorDB, or cloud sync later. But the default works immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Temporal Model
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; has a sophisticated bi-temporal model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Valid time&lt;/strong&gt;: When the fact was true in the real world&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction time&lt;/strong&gt;: When the fact was recorded&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This enables queries like "What did we know about X as of March 2024?" and handles contradictions by invalidating old edges rather than deleting them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; also supports bi-temporal tracking (added in v0.10.0, inspired by Graphiti):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# MemoryGraph temporal queries
&lt;/span&gt;&lt;span class="n"&gt;march_2024&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;solutions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_related_memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_of&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;march_2024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;changes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;what_changed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;since&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;one_week_ago&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both handle temporal queries well. Graphiti's bi-temporal model is more sophisticated, tracking validity intervals on every edge; MemoryGraph covers the common cases: point-in-time queries and change tracking.&lt;/p&gt;
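
&lt;p&gt;A minimal sketch of the bi-temporal idea itself (illustrative, not Graphiti's actual schema): each fact carries a valid-time interval plus the time it was recorded, and a contradiction closes the old interval rather than deleting the row:&lt;/p&gt;

```python
# Bi-temporal facts: point-in-time queries return what was believed then.
from datetime import datetime, timezone
from operator import le  # le(a, b) means "a is at or before b"

def ts(y, m, d):
    return datetime(y, m, d, tzinfo=timezone.utc)

facts = [
    # (fact, valid_from, valid_to, recorded_at)
    ("timeout=30s solves it", ts(2024, 2, 1), ts(2024, 3, 10), ts(2024, 2, 1)),  # later invalidated
    ("pool_size=50 solves it", ts(2024, 3, 10), None, ts(2024, 3, 10)),          # current belief
]

def known_as_of(when):
    """What we believed at `when`: recorded by then, and still valid then."""
    return [fact for fact, valid_from, valid_to, recorded in facts
            if le(recorded, when) and le(valid_from, when)
            and (valid_to is None or not le(valid_to, when))]

print(known_as_of(ts(2024, 3, 1)))  # ['timeout=30s solves it']
print(known_as_of(ts(2024, 4, 1)))  # ['pool_size=50 solves it']
```

&lt;p&gt;Because the superseded fact keeps its closed interval, "what did we believe in March?" stays answerable after the belief changes.&lt;/p&gt;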

&lt;h3&gt;
  
  
  6. Query Model
&lt;/h3&gt;

&lt;p&gt;Both systems are "graph-based" but query differently:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; uses hybrid retrieval (from the &lt;a href="https://arxiv.org/abs/2501.13956" rel="noopener noreferrer"&gt;arXiv paper&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Semantic similarity search (embeddings)&lt;/li&gt;
&lt;li&gt;BM25 full-text search (Lucene via Neo4j)&lt;/li&gt;
&lt;li&gt;Breadth-first graph traversal from seed nodes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FTS5 full-text search (SQLite) or native graph queries (Neo4j/FalkorDB)&lt;/li&gt;
&lt;li&gt;Tag-based filtering with exact match&lt;/li&gt;
&lt;li&gt;Typed relationship traversal with configurable depth&lt;/li&gt;
&lt;li&gt;Three search tolerance modes: &lt;code&gt;strict&lt;/code&gt;, &lt;code&gt;normal&lt;/code&gt; (stemming), &lt;code&gt;fuzzy&lt;/code&gt; (typo-tolerant)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Graphiti's hybrid approach excels at finding semantically related content across large, unstructured graphs. MemoryGraph's typed traversal excels at answering specific questions like "what solved this error?" or "what depends on this component?"&lt;/p&gt;
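
&lt;p&gt;The tolerance modes map onto tokenizer choices in SQLite's FTS5, which the SQLite backend builds on. Treating &lt;code&gt;strict&lt;/code&gt; as no stemming and &lt;code&gt;normal&lt;/code&gt; as porter stemming is our assumption here, but FTS5 itself makes the difference easy to see:&lt;/p&gt;

```python
# Two FTS5 indexes over the same row: the default tokenizer requires the
# exact token, while the porter tokenizer matches 'retries' to 'retry'.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE strict_idx USING fts5(title)")                     # exact tokens
conn.execute("CREATE VIRTUAL TABLE normal_idx USING fts5(title, tokenize='porter')")  # stemmed tokens
for table in ("strict_idx", "normal_idx"):
    conn.execute(f"INSERT INTO {table} VALUES (?)", ("Fixed timeout with retry logic",))

query = "retries"  # not the literal word that was stored
strict = conn.execute("SELECT title FROM strict_idx WHERE strict_idx MATCH ?", (query,)).fetchall()
normal = conn.execute("SELECT title FROM normal_idx WHERE normal_idx MATCH ?", (query,)).fetchall()
print(len(strict), len(normal))
```

&lt;p&gt;The exact-token index finds nothing; the stemmed index returns the stored memory.&lt;/p&gt;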




&lt;h2&gt;
  
  
  Practical Comparison: A Debugging Workflow
&lt;/h2&gt;

&lt;p&gt;Here's how each tool handles a common coding scenario.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Scenario
&lt;/h3&gt;

&lt;p&gt;You're debugging a Redis timeout issue. Over several sessions, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Encounter the error&lt;/li&gt;
&lt;li&gt;Try a fix (increasing the timeout); it doesn't work&lt;/li&gt;
&lt;li&gt;Try another fix (adding retry logic); it causes a memory leak&lt;/li&gt;
&lt;li&gt;Find the root cause (connection pool exhaustion)&lt;/li&gt;
&lt;li&gt;Implement the real fix (increasing the pool size)&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  With Graphiti
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Session 1: Encounter error
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;graphiti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debug_session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;episode_body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Got RedisTimeoutError after 30 seconds. Stack trace shows connection.execute() hanging.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EpisodeType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Session 2: Try timeout fix
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;graphiti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debug_session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;episode_body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Increased Redis timeout to 60s. Still getting timeouts under load.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EpisodeType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Session 3: Try retry logic
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;graphiti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_episode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debug_session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;episode_body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Added retry logic with exponential backoff. Now seeing memory growth - possible leak.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EpisodeType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# ... and so on
&lt;/span&gt;
&lt;span class="c1"&gt;# Later: Query what happened
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;graphiti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Redis timeout fixes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Graphiti's LLM extraction will create entities and relationships from this text. The quality depends on the extraction prompts and model.&lt;/p&gt;

&lt;h3&gt;
  
  
  With MemoryGraph
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Session 1: Store the error
&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;store_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RedisTimeoutError under load&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Connection.execute() hangs after 30s under concurrent requests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;production&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Session 2: Store failed attempt
&lt;/span&gt;&lt;span class="n"&gt;attempt1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;store_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Increased Redis timeout to 60s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Changed timeout config. Still fails under load - not the root cause.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempt1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ADDRESSES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Attempted to address
&lt;/span&gt;&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempt1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INEFFECTIVE_FOR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# But didn't work
&lt;/span&gt;
&lt;span class="c1"&gt;# Session 3: Store attempt that caused new problem
&lt;/span&gt;&lt;span class="n"&gt;attempt2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;store_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Added retry with exponential backoff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Implemented retry logic. Works for timeout but causes memory growth.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;partial-fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;leak&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;store_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;problem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Memory leak from retry logic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Each retry holds connection reference, causing memory growth under load.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory-leak&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempt2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ADDRESSES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempt2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;leak&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CAUSES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# This fix caused a new problem
&lt;/span&gt;
&lt;span class="c1"&gt;# Session 4: Find root cause and real fix
&lt;/span&gt;&lt;span class="n"&gt;root_cause&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;store_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;problem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Redis connection pool exhaustion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Default pool size of 10 is exhausted under load, causing queued connections to timeout.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;connection-pool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;root-cause&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;real_fix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;store_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Increased Redis connection pool to 50&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Set REDIS_POOL_SIZE=50. Handles concurrent load without timeouts or retries.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;connection-pool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root_cause&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CAUSES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_fix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;root_cause&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SOLVES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_fix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attempt1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IMPROVES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;create_relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_fix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attempt2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REPLACES&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Later: Query the full picture
&lt;/span&gt;&lt;span class="nf"&gt;recall_memories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result is a queryable graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[pool_exhaustion] ──CAUSES──▶ [timeout_error]
       │                            ▲
       │                            │
       ▼                    ┌───────┴───────┐
[real_fix: pool=50]         │               │
       │              [attempt1: 60s]  [attempt2: retry]
       │                    │               │
       ├──IMPROVES─────────▶│               │
       │                    │               ▼
       └──REPLACES─────────────────────▶[memory_leak]
                                            ▲
                                            │
                              [attempt2]──CAUSES──┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you ask "What happened with Redis?" six months later, MemoryGraph returns this entire causal chain, including what &lt;em&gt;didn't&lt;/em&gt; work and why.&lt;/p&gt;
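&lt;p&gt;As an illustrative sketch (plain Python, not MemoryGraph's actual query API), walking that chain is just a reachability pass over the edges created above:&lt;/p&gt;

```python
# Edge list mirroring the relationships created in the sessions above
# (the variable names from the example are reused here as node IDs).
edges = [
    ("attempt1", "ADDRESSES", "timeout_error"),
    ("attempt1", "INEFFECTIVE_FOR", "timeout_error"),
    ("attempt2", "ADDRESSES", "timeout_error"),
    ("attempt2", "CAUSES", "memory_leak"),
    ("root_cause", "CAUSES", "timeout_error"),
    ("real_fix", "SOLVES", "root_cause"),
    ("real_fix", "IMPROVES", "attempt1"),
    ("real_fix", "REPLACES", "attempt2"),
]

def causal_chain(start):
    """Every memory reachable from `start`, following edges in both directions."""
    seen = {start}
    changed = True
    while changed:
        changed = False
        for src, _rel, dst in edges:
            if (src in seen) != (dst in seen):  # edge crosses the frontier
                seen.update((src, dst))
                changed = True
    return seen

# All six memories, including the failed attempts, belong to one chain
chain = causal_chain("timeout_error")
```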




&lt;h2&gt;
  
  
  Decision Framework
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Choose Graphiti If:
&lt;/h3&gt;

&lt;p&gt;✅ You're building a &lt;strong&gt;general-purpose AI agent&lt;/strong&gt; (not specifically for coding)&lt;/p&gt;

&lt;p&gt;✅ You want &lt;strong&gt;automatic entity extraction&lt;/strong&gt; from unstructured text&lt;/p&gt;

&lt;p&gt;✅ You need &lt;strong&gt;sophisticated temporal queries&lt;/strong&gt; across arbitrary entity types&lt;/p&gt;

&lt;p&gt;✅ You already have &lt;strong&gt;Neo4j, FalkorDB, or similar infrastructure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;✅ You want a &lt;strong&gt;commercial platform&lt;/strong&gt; with support (Zep Cloud)&lt;/p&gt;

&lt;p&gt;✅ You're okay with &lt;strong&gt;LLM costs&lt;/strong&gt; for entity extraction&lt;/p&gt;

&lt;h3&gt;
  
  
  Choose MemoryGraph If:
&lt;/h3&gt;

&lt;p&gt;✅ You're building with &lt;strong&gt;Claude Code, Cursor, Aider, or Continue&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;✅ You want &lt;strong&gt;coding-specific relationships&lt;/strong&gt; out of the box (&lt;code&gt;SOLVES&lt;/code&gt;, &lt;code&gt;CAUSES&lt;/code&gt;, &lt;code&gt;DEPENDS_ON&lt;/code&gt;)&lt;/p&gt;

&lt;p&gt;✅ You want &lt;strong&gt;zero infrastructure&lt;/strong&gt;: SQLite by default, with the option to upgrade later&lt;/p&gt;

&lt;p&gt;✅ You prefer &lt;strong&gt;explicit control&lt;/strong&gt; over what gets stored&lt;/p&gt;

&lt;p&gt;✅ You want to &lt;strong&gt;get started in 60 seconds&lt;/strong&gt;, not 60 minutes&lt;/p&gt;

&lt;p&gt;✅ You want &lt;strong&gt;local-first&lt;/strong&gt; with optional cloud sync&lt;/p&gt;




&lt;h2&gt;
  
  
  What About Using Both?
&lt;/h2&gt;

&lt;p&gt;This is a valid architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;Graphiti&lt;/strong&gt; for your product's user-facing memory (customer preferences, conversation history, business entities)&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;MemoryGraph&lt;/strong&gt; for your &lt;em&gt;development&lt;/em&gt; workflow (what you learned building the product)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They solve different problems. Graphiti helps your AI agent remember your &lt;em&gt;users&lt;/em&gt;. MemoryGraph helps your coding agent remember your &lt;em&gt;codebase&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What If You Choose Wrong?
&lt;/h2&gt;

&lt;p&gt;Both systems use standard data formats. Migration is possible:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph → Graphiti&lt;/strong&gt;: Export memories as JSON, feed them as episodes. Graphiti's LLM will re-extract entities and relationships (you'll lose your explicit relationship types but gain Graphiti's automatic extraction).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graphiti → MemoryGraph&lt;/strong&gt;: Export entities and edges. Map entity types to MemoryGraph's 12 memory types, map edge types to the 35 relationship types. Manual mapping required, but no data loss.&lt;/p&gt;

&lt;p&gt;Neither system creates vendor lock-in at the data layer. Choose based on current needs; you can migrate if requirements change.&lt;/p&gt;
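&lt;p&gt;The MemoryGraph-to-Graphiti direction can be sketched as a small transformation step. The export field names (&lt;code&gt;title&lt;/code&gt;, &lt;code&gt;content&lt;/code&gt;, &lt;code&gt;created_at&lt;/code&gt;) and the episode payload shape are assumptions for illustration; adapt them to the actual export and ingestion schemas:&lt;/p&gt;

```python
import json

def memories_to_episodes(export_json):
    """Convert a MemoryGraph JSON export into episode payloads for ingestion.

    Field names here are illustrative, not a documented schema.
    """
    episodes = []
    for memory in json.loads(export_json):
        episodes.append({
            "name": memory["title"],
            # Fold title + content into one text body; the receiving system's
            # LLM re-extracts entities and relationships from this text.
            "episode_body": f'{memory["title"]}. {memory["content"]}',
            "reference_time": memory.get("created_at"),
        })
    return episodes

export = json.dumps([
    {"title": "Increased Redis connection pool to 50",
     "content": "Set REDIS_POOL_SIZE=50.",
     "created_at": "2025-12-01T00:00:00Z"},
])
episodes = memories_to_episodes(export)
```

&lt;p&gt;Note that the explicit &lt;code&gt;SOLVES&lt;/code&gt;/&lt;code&gt;CAUSES&lt;/code&gt; edges are flattened into prose here, which is exactly the relationship-type loss described above.&lt;/p&gt;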




&lt;h2&gt;
  
  
  Getting Started with MemoryGraph
&lt;/h2&gt;

&lt;p&gt;If MemoryGraph sounds right for your use case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
pipx &lt;span class="nb"&gt;install &lt;/span&gt;memorygraphMCP

&lt;span class="c"&gt;# Add to Claude Code&lt;/span&gt;
claude mcp add &lt;span class="nt"&gt;--scope&lt;/span&gt; user memorygraph &lt;span class="nt"&gt;--&lt;/span&gt; memorygraph

&lt;span class="c"&gt;# Start using&lt;/span&gt;
claude
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"Remember this: Use pytest fixtures for database tests"&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"What do you remember about testing?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No database setup. No Docker. No API keys.&lt;/p&gt;

&lt;p&gt;See &lt;a href="https://memorygraph.dev" rel="noopener noreferrer"&gt;memorygraph.dev&lt;/a&gt; for documentation, or &lt;a href="https://github.com/gregorydickson/memory-graph" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; for the source.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started with Graphiti
&lt;/h2&gt;

&lt;p&gt;If Graphiti is the better fit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start Neo4j&lt;/span&gt;
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 7474:7474 &lt;span class="nt"&gt;-p&lt;/span&gt; 7687:7687 neo4j

&lt;span class="c"&gt;# Install&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;graphiti-core[neo4j]

&lt;span class="c"&gt;# Configure&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;NEO4J_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;bolt://localhost:7687
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;a href="https://github.com/getzep/graphiti" rel="noopener noreferrer"&gt;github.com/getzep/graphiti&lt;/a&gt; for documentation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Graphiti and MemoryGraph both solve the fundamental problem of AI agent memory. They're both graph-based, both MCP-compatible, both Apache 2.0 licensed.&lt;/p&gt;

&lt;p&gt;The difference is focus.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graphiti&lt;/strong&gt; is a general-purpose temporal knowledge graph for any AI agent. It's mature, well-funded, and production-proven.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoryGraph&lt;/strong&gt; is a coding-specific memory system for AI coding agents. It's opinionated, zero-config, and built for developers who want to start in 60 seconds.&lt;/p&gt;

&lt;p&gt;Choose the tool that matches your use case. For coding agents, we think MemoryGraph is the better fit. For general AI agents, Graphiti is excellent.&lt;/p&gt;

&lt;p&gt;And if you're building both? Use both.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;MemoryGraph is open source under Apache 2.0. Try it at &lt;a href="https://memorygraph.dev" rel="noopener noreferrer"&gt;memorygraph.dev&lt;/a&gt; or star us on &lt;a href="https://github.com/memorygraphdev/memorygraph" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Gregory Dickson is a Senior AI Developer &amp;amp; Solutions Architect specializing in AI/ML development and cloud architecture. He's the creator of MemoryGraph, an open-source MCP memory server using graph-based relationship tracking.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>ai</category>
      <category>tooling</category>
      <category>agents</category>
    </item>
    <item>
      <title>What Deep Learning Theory Teaches Us About AI Memory</title>
      <dc:creator>Gregory Dickson</dc:creator>
      <pubDate>Fri, 26 Dec 2025 13:53:52 +0000</pubDate>
      <link>https://dev.to/gregory_dickson_6dd6e2b55/what-deep-learning-theory-teaches-us-about-ai-memory-796</link>
      <guid>https://dev.to/gregory_dickson_6dd6e2b55/what-deep-learning-theory-teaches-us-about-ai-memory-796</guid>
      <description>&lt;p&gt;&lt;em&gt;How rate reduction and lossy compression principles from Berkeley's new textbook could reshape how we build persistent memory for LLMs&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Memory Problem No One Talks About
&lt;/h2&gt;

&lt;p&gt;Every AI coding assistant you use today has the same dirty secret: it forgets everything the moment your session ends. That brilliant debugging session where Claude figured out your codebase architecture? Gone. The context about your team's coding conventions that took 20 messages to establish? Evaporated.&lt;/p&gt;

&lt;p&gt;We're building MemoryGraph to solve this problem—a graph-based memory system that gives LLMs persistent, queryable memory across sessions. But as we dove deeper into the architecture, we kept hitting the same fundamental question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What does it actually mean to "remember" something well?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It's not enough to just store text and retrieve it. Human memory doesn't work that way. We compress experiences into schemas, organize knowledge hierarchically, and somehow retrieve exactly what's relevant from decades of accumulated experience in milliseconds.&lt;/p&gt;

&lt;p&gt;Then we discovered a new textbook that changed how we think about this problem entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  "Learning Deep Representations of Data Distributions"
&lt;/h2&gt;

&lt;p&gt;In August 2025, Yi Ma's lab at Berkeley released &lt;a href="https://ma-lab-berkeley.github.io/deep-representation-learning-book/" rel="noopener noreferrer"&gt;&lt;em&gt;Learning Deep Representations of Data Distributions&lt;/em&gt;&lt;/a&gt;—an open-source textbook that presents a unified mathematical framework for understanding deep learning through the lens of compression.&lt;/p&gt;

&lt;p&gt;Their central thesis is deceptively simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"We compress to learn, and we learn to compress."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The book argues that intelligence—whether biological or artificial—fundamentally involves discovering low-dimensional structure in high-dimensional data and transforming that data into compact, structured representations.&lt;/p&gt;

&lt;p&gt;This isn't just philosophy. They provide rigorous mathematics showing that popular neural network architectures (ResNets, Transformers, CNNs) can be derived as iterative optimization steps that maximize something called &lt;strong&gt;rate reduction&lt;/strong&gt;—a measure of how well representations compress data while preserving important distinctions.&lt;/p&gt;

&lt;p&gt;Reading this, we realized: this framework maps directly onto the memory storage problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Rate Reduction: A New Way to Think About Memory Quality
&lt;/h2&gt;

&lt;p&gt;The book introduces a principle called &lt;strong&gt;Maximal Coding Rate Reduction (MCR²)&lt;/strong&gt;. Here's the intuition:&lt;/p&gt;

&lt;p&gt;Imagine you have a collection of memories from different categories—bug fixes, architectural decisions, API documentation, team preferences. A good memory representation should do two things simultaneously:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Maximize expansion between categories&lt;/strong&gt;: Memories about bug fixes should live in a completely different "region" of your representation space than memories about team preferences. You want these categories to be as distinguishable as possible.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Maximize compression within categories&lt;/strong&gt;: All your bug fix memories should cluster tightly together. They share common structure—problem, cause, solution—and your representation should capture that.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Mathematically, this is expressed as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ΔR = R(all memories) - Σ R(memories in each category)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where R is the "coding rate"—essentially, how many bits you'd need to encode the data. You want to maximize ΔR: the total coding rate should be high (diverse memories), but the sum of within-category rates should be low (similar memories cluster).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This gives us a concrete metric for memory quality that goes beyond simple retrieval accuracy.&lt;/strong&gt;&lt;/p&gt;
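&lt;p&gt;The article's simplified ΔR can be computed directly with the logdet rate estimate used in the MCR² literature. This toy sketch (the distortion parameter &lt;code&gt;eps&lt;/code&gt; and the 2-D data are assumptions for illustration) shows that grouping memories by their true category yields a higher ΔR than mixing categories:&lt;/p&gt;

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Approximate bits needed to encode the rows of Z up to distortion eps
    (the logdet rate estimate from the MCR^2 line of work)."""
    n, d = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * Z.T @ Z)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Simplified ΔR from the text: total rate minus summed per-category rates."""
    labels = np.asarray(labels)
    within = sum(coding_rate(Z[labels == c], eps) for c in np.unique(labels))
    return coding_rate(Z, eps) - within

# Two memory "categories" lying along orthogonal directions
bug_fixes = np.tile([1.0, 0.0], (4, 1))
preferences = np.tile([0.0, 1.0], (4, 1))
Z = np.vstack([bug_fixes, preferences])

dr_true = rate_reduction(Z, [0] * 4 + [1] * 4)   # correct grouping
dr_mixed = rate_reduction(Z, [0, 1] * 4)         # categories scrambled
# The true grouping scores higher: categories separate, members cluster
```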




&lt;h2&gt;
  
  
  How This Applies to LLM Memory Systems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem with Flat Embeddings
&lt;/h3&gt;

&lt;p&gt;Most vector databases treat all memories the same way: convert text to a 384- or 768-dimensional embedding, store it, retrieve by cosine similarity.&lt;/p&gt;

&lt;p&gt;But this ignores the structure we know exists in the data. A memory about a "person" is fundamentally different from a memory about a "code pattern." Treating them identically wastes representational capacity and makes retrieval harder.&lt;/p&gt;

&lt;p&gt;The Berkeley framework suggests a different approach: &lt;strong&gt;type-specific subspaces&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structured Embedding Spaces
&lt;/h3&gt;

&lt;p&gt;Instead of one flat embedding space, imagine memories organized into learned subspaces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│                    Embedding Space (384-dim)                │
│                                                             │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│   │   Person     │  │   Project    │  │   Solution   │     │
│   │  Subspace    │  │  Subspace    │  │  Subspace    │     │
│   │   (64-dim)   │  │  (128-dim)   │  │   (96-dim)   │     │
│   │              │  │              │  │              │     │
│   │  • Alice     │  │ • ProjectX   │  │ • Fix#123    │     │
│   │  • Bob       │  │ • MemGraph   │  │ • Fix#456    │     │
│   │  • Carol     │  │ • API-v2     │  │ • Pattern#7  │     │
│   └──────────────┘  └──────────────┘  └──────────────┘     │
│                                                             │
│        ↑ Orthogonal subspaces (maximally separated)         │
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each memory type gets projected into its own subspace. These subspaces are learned to be orthogonal—maximizing separation between types. Within each subspace, similar memories cluster together—maximizing compression.&lt;/p&gt;

&lt;p&gt;This is rate reduction in action: expand between categories, compress within them.&lt;/p&gt;
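&lt;p&gt;A minimal sketch of this idea uses disjoint coordinate blocks, the simplest way to get mutually orthogonal subspaces inside one embedding space. The block layout follows the diagram's dimensions; a real system would learn these projections rather than hard-code axis-aligned blocks:&lt;/p&gt;

```python
import numpy as np

# Hypothetical layout matching the diagram: 64 + 128 + 96 dims inside 384
TOTAL_DIM = 384
SUBSPACES = {"person": (0, 64), "project": (64, 192), "solution": (192, 288)}

def project(embedding, memory_type):
    """Keep only the coordinates belonging to the type's subspace."""
    lo, hi = SUBSPACES[memory_type]
    mask = np.zeros(TOTAL_DIM)
    mask[lo:hi] = 1.0
    return embedding * mask

rng = np.random.default_rng(0)
e_person = project(rng.normal(size=TOTAL_DIM), "person")
e_project = project(rng.normal(size=TOTAL_DIM), "project")
# Different types land in orthogonal subspaces: their dot product is exactly 0
```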




&lt;h2&gt;
  
  
  The Graph as a Compression Mechanism
&lt;/h2&gt;

&lt;p&gt;Here's where things get interesting for MemoryGraph specifically.&lt;/p&gt;

&lt;p&gt;The Berkeley book shows that neural network layers can be understood as iterative compression steps. Each layer transforms representations to be more compact and more structured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We realized: a knowledge graph already does this.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Consider how MemoryGraph stores information:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw Input: "Alice fixed the authentication bug in the login 
            service yesterday by adding proper token validation"

Graph Representation:
  (Alice:Person) --[AUTHORED]--&amp;gt; (Fix#892:Solution)
  (Fix#892:Solution) --[RESOLVES]--&amp;gt; (AuthBug:Error)
  (AuthBug:Error) --[AFFECTS]--&amp;gt; (LoginService:Project)
  (Fix#892:Solution) --[INVOLVES]--&amp;gt; (TokenValidation:CodePattern)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This graph representation is a &lt;strong&gt;lossy compression&lt;/strong&gt; of the original text. We've extracted the essential structure—who, what, where, how—and discarded the rest. The entities are compressed representations (cluster centers), and the relationships define how to navigate between them.&lt;/p&gt;

&lt;p&gt;In the language of the Berkeley book:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Entities&lt;/strong&gt; = compressed representations of many observations (low-dimensional subspace centers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relations&lt;/strong&gt; = transformation operators between subspaces&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observations&lt;/strong&gt; = high-dimensional raw data that gets compressed into entity updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The graph structure itself encodes the low-dimensional manifold that rate reduction seeks to discover.&lt;/p&gt;




&lt;h2&gt;
  
  
  Compression in Action: MemoryGraph's Inference Engine
&lt;/h2&gt;

&lt;p&gt;This isn't just theory—we've already implemented automatic compression in MemoryGraph's inference engine. When you save a memory, the system automatically discovers and creates new relationships you didn't explicitly define.&lt;/p&gt;

&lt;h3&gt;
  
  
  Transitive Compression
&lt;/h3&gt;

&lt;p&gt;Consider this scenario. You create two explicit relationships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Auth Service" --[DEPENDS_ON]--&amp;gt; "JWT Library"
"JWT Library" --[DEPENDS_ON]--&amp;gt; "Crypto Utils"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The inference engine automatically adds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Auth Service" --[DEPENDS_ON]--&amp;gt; "Crypto Utils" (inferred, transitive)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is &lt;strong&gt;path compression&lt;/strong&gt;—reducing a multi-hop traversal to a single edge. In information-theoretic terms, we're eliminating redundancy in the graph structure. The transitive relationship was always &lt;em&gt;implicitly&lt;/em&gt; there; the inference engine makes it &lt;em&gt;explicit&lt;/em&gt; and queryable.&lt;/p&gt;
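&lt;p&gt;One pass of this inference fits in a few lines. As a sketch (the confidence decay factor is an illustrative assumption, not the engine's actual parameter):&lt;/p&gt;

```python
def infer_transitive(explicit_edges, decay=0.7):
    """A --rel--> B and B --rel--> C yields an inferred A --rel--> C.

    Each edge is (src, dst, confidence); inferred confidence decays per hop.
    """
    known = {(src, dst) for src, dst, _ in explicit_edges}
    inferred = []
    for a, b, conf_ab in explicit_edges:
        for b2, c, conf_bc in explicit_edges:
            if b == b2 and a != c and (a, c) not in known:
                inferred.append((a, c, decay * min(conf_ab, conf_bc)))
    return inferred

explicit = [
    ("Auth Service", "JWT Library", 1.0),
    ("JWT Library", "Crypto Utils", 1.0),
]
new_edges = infer_transitive(explicit)
# Yields the single-hop shortcut: Auth Service --DEPENDS_ON--> Crypto Utils
```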

&lt;h3&gt;
  
  
  Type Inference as Semantic Compression
&lt;/h3&gt;

&lt;p&gt;The engine also performs type inference:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;type_from_solves&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;If memory SOLVES a problem → type becomes "solution"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;type_from_fixes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;If memory FIXES an error → type becomes "fix"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;type_from_causes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;If memory CAUSES a problem → type becomes "problem"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is semantic compression: instead of storing "this memory has a SOLVES relationship to a problem-type entity," we compress that pattern into a single type label. The type &lt;em&gt;is&lt;/em&gt; the compressed representation of the memory's structural role in the graph.&lt;/p&gt;
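&lt;p&gt;The rules table compresses to a small lookup. This sketch is illustrative; the real engine's rule precedence and default type are assumptions here:&lt;/p&gt;

```python
# Mapping from the rules table above (rule names in comments)
TYPE_RULES = {
    "SOLVES": "solution",   # type_from_solves
    "FIXES": "fix",         # type_from_fixes
    "CAUSES": "problem",    # type_from_causes
}

def infer_memory_type(outgoing_relationships, default="note"):
    """Compress a memory's structural role in the graph into one type label."""
    for rel_type, _target in outgoing_relationships:
        if rel_type in TYPE_RULES:
            return TYPE_RULES[rel_type]
    return default  # "note" is a hypothetical fallback type

inferred = infer_memory_type([("REFERENCES", "docs"), ("SOLVES", "AuthBug")])
```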

&lt;h3&gt;
  
  
  Co-occurrence Affinity
&lt;/h3&gt;

&lt;p&gt;Our cloud tier includes an even more interesting rule: &lt;strong&gt;co-occurrence affinity&lt;/strong&gt;. When two memories share multiple common connections (say, 3+ shared neighbors), the engine infers they're related—even if no one explicitly connected them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Memory A --[USES]--&amp;gt; React
Memory A --[USES]--&amp;gt; TypeScript  
Memory A --[PART_OF]--&amp;gt; Frontend Module

Memory B --[USES]--&amp;gt; React
Memory B --[USES]--&amp;gt; TypeScript
Memory B --[PART_OF]--&amp;gt; Frontend Module

Inferred: Memory A --[RELATED_TO]--&amp;gt; Memory B (confidence: 0.45)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the graph equivalent of finding low-dimensional structure: memories that occupy similar "positions" in the relationship space (share many neighbors) are likely semantically related, even without explicit links.&lt;/p&gt;
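&lt;p&gt;The mechanics reduce to shared-neighbor counting. In this sketch the per-neighbor confidence weight is an assumption chosen to reproduce the 0.45 score from the example; the actual scoring formula may differ:&lt;/p&gt;

```python
from itertools import combinations

def cooccurrence_affinity(neighbors, threshold=3, weight=0.15):
    """Infer RELATED_TO edges between memories sharing at least `threshold`
    neighbors. Confidence grows with the number of shared neighbors."""
    inferred = []
    for a, b in combinations(sorted(neighbors), 2):
        shared = neighbors[a].intersection(neighbors[b])
        if len(shared) >= threshold:
            inferred.append((a, "RELATED_TO", b, round(weight * len(shared), 2)))
    return inferred

neighbors = {
    "Memory A": {"React", "TypeScript", "Frontend Module"},
    "Memory B": {"React", "TypeScript", "Frontend Module"},
}
links = cooccurrence_affinity(neighbors)
# Three shared neighbors cross the threshold, linking A and B
```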

&lt;h3&gt;
  
  
  Confidence Scores and Lossy Compression
&lt;/h3&gt;

&lt;p&gt;Every inferred relationship carries a confidence score (0-1). Lower confidence means the inference is more speculative—it's a "lossier" compression of the underlying evidence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"relationship_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DEPENDS_ON"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inferred"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rule"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"transitive_depends_on"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"depth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Users can tune this tradeoff: accept more inferred relationships (richer graph, more noise) or fewer (sparser graph, higher precision). This is exactly the rate-distortion tradeoff that information theory describes—you choose how much fidelity to sacrifice for how much compression.&lt;/p&gt;

&lt;p&gt;The cleanup API makes this explicit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST /inference/cleanup?min_confidence&lt;span class="o"&gt;=&lt;/span&gt;0.3&amp;amp;max_age_days&lt;span class="o"&gt;=&lt;/span&gt;30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This removes low-confidence edges older than 30 days—literally pruning the graph to maintain a target compression quality.&lt;/p&gt;
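&lt;p&gt;Under the hood, such an endpoint presumably amounts to a filter over inferred edges. A minimal sketch of that logic, assuming a simple &lt;code&gt;Edge&lt;/code&gt; record—the field names are illustrative, not MemoryGraph's actual schema:&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Edge:
    source: str
    target: str
    inferred: bool
    confidence: float
    created_at: datetime

def cleanup(edges: list[Edge], min_confidence: float = 0.3,
            max_age_days: int = 30) -> list[Edge]:
    """Drop inferred edges that are both low-confidence and stale.

    Explicit (non-inferred) edges are never pruned.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        e for e in edges
        if not (e.inferred
                and e.confidence < min_confidence
                and e.created_at < cutoff)
    ]

old = datetime.now(timezone.utc) - timedelta(days=90)
edges = [
    Edge("a", "b", True, 0.2, old),    # pruned: inferred, weak, stale
    Edge("a", "c", True, 0.9, old),    # kept: high confidence
    Edge("a", "d", False, 0.1, old),   # kept: explicit edge
]
print(len(cleanup(edges)))  # 2
```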




&lt;h2&gt;
  
  
  Practical Implementation Ideas
&lt;/h2&gt;

&lt;p&gt;The inference engine demonstrates that compression principles already work in MemoryGraph. Based on the Berkeley framework, we're exploring several enhancements that go deeper:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Type-Aware Embeddings
&lt;/h3&gt;

&lt;p&gt;As currently planned for the SDK, our semantic search would treat all content identically. But the Berkeley framework suggests projecting embeddings through type-specific learned transformations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TypeAwareEmbedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Learned projections for each entity type
&lt;/span&gt;        &lt;span class="c1"&gt;# Dimensions chosen based on type complexity
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;projections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;person&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LinearProjection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LinearProjection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LinearProjection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;96&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LinearProjection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_pattern&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LinearProjection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;projection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;projections&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;projection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;projection&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dimensionality of each subspace reflects the inherent complexity of that type. People are relatively simple to characterize; projects have more nuance. This extends our existing type inference—instead of just labeling types, we'd represent them in mathematically distinct subspaces.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Progressive Memory Consolidation
&lt;/h3&gt;

&lt;p&gt;The book describes how deep networks progressively compress representations layer by layer. Our inference engine already does single-pass compression. We can extend this to multi-layer consolidation over time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Layer 0: Raw Observations (high-dimensional, ephemeral)
         "User mentioned they prefer tabs over spaces"
         "User asked about Python formatting"
         "User corrected a spacing issue in code"
              │
              ▼ Compression (after session) [EXISTING: type inference]

Layer 1: Working Memory (mid-dimensional, session-persistent)  
         "User has strong code formatting preferences"
              │
              ▼ Compression (after multiple sessions) [NEW: consolidation]

Layer 2: Consolidated Knowledge (low-dimensional, long-term)
         Entity property: coding_style = "strict_formatting"
              │
              ▼ Compression (over time) [NEW: schema evolution]

Layer 3: Core Identity (minimal, permanent)
         Entity: User with trait "detail_oriented"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mirrors how human memory consolidation works—episodic memories compress into semantic knowledge over time. Our existing &lt;code&gt;similar_tags_affinity&lt;/code&gt; rule hints at this: memories that share structure get linked. The next step is actually &lt;em&gt;merging&lt;/em&gt; them.&lt;/p&gt;
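&lt;p&gt;A toy illustration of what Layer 0 to Layer 1 merging could look like: group raw observations by topic and replace them with a counted summary. All names here are hypothetical, not MemoryGraph's implementation:&lt;/p&gt;

```python
from collections import Counter

def consolidate(observations: list[dict], min_count: int = 3) -> list[dict]:
    """Merge raw observations sharing a topic into one summary record.

    Toy stand-in for Layer 0 -> Layer 1 consolidation: keeps an evidence
    count instead of every individual episode.
    """
    by_topic = Counter(obs["topic"] for obs in observations)
    return [
        {"topic": topic, "evidence_count": count, "layer": 1}
        for topic, count in by_topic.items()
        if count >= min_count
    ]

obs = [
    {"topic": "formatting", "text": "prefers tabs"},
    {"topic": "formatting", "text": "asked about Python formatting"},
    {"topic": "formatting", "text": "corrected spacing"},
    {"topic": "testing", "text": "ran pytest once"},
]
print(consolidate(obs))
# [{'topic': 'formatting', 'evidence_count': 3, 'layer': 1}]
```

A real implementation would summarize the merged content (likely with an LLM) rather than just counting, but the trigger logic is the same shape.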

&lt;h3&gt;
  
  
  3. Expansion-Compression Retrieval
&lt;/h3&gt;

&lt;p&gt;Our planned hybrid search (ADR-005) optimizes for similarity. The rate reduction framework suggests we should &lt;em&gt;also&lt;/em&gt; optimize for distinctiveness:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieval_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                    &lt;span class="n"&gt;other_retrieved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Compression term: how relevant is this memory?
&lt;/span&gt;    &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
        &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Expansion term: how distinct is this from other results?
&lt;/span&gt;    &lt;span class="c1"&gt;# This prevents returning 5 near-duplicate memories
&lt;/span&gt;    &lt;span class="n"&gt;distinctiveness&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="nf"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;other&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;other_retrieved&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="c1"&gt;# Balance both objectives (like rate reduction's R - Rc)
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;distinctiveness&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This directly implements the MCR² principle: maximize total coding rate (diverse results) while minimizing within-group coding rate (each result is relevant). It prevents retrieval from returning redundant results—a common failure mode of pure similarity search.&lt;/p&gt;
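&lt;p&gt;In practice a score like this would typically be applied greedily, re-scoring candidates against what has already been selected (the same idea as maximal marginal relevance). A self-contained sketch with toy 2-D vectors—it penalizes the &lt;em&gt;maximum&lt;/em&gt; redundancy rather than the mean, and everything here is illustrative rather than MemoryGraph's API:&lt;/p&gt;

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def select_diverse(query_vec, candidates, k=2, alpha=0.7):
    """Greedily pick k items, balancing relevance and distinctiveness."""
    selected = []
    remaining = dict(candidates)  # name -> embedding vector
    while remaining and len(selected) < k:
        def score(name):
            sim = cosine(query_vec, remaining[name])
            if not selected:
                return sim
            # Penalize similarity to anything already chosen
            redundancy = max(cosine(remaining[name], candidates[s])
                             for s in selected)
            return alpha * sim + (1 - alpha) * (1.0 - redundancy)
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

candidates = {
    "dup_a": [1.0, 0.0],    # near-duplicate of dup_b
    "dup_b": [0.99, 0.05],
    "other": [0.6, 0.8],    # less similar to the query, but distinct
}
# With a strong diversity weight, the near-duplicate is skipped
print(select_diverse([1.0, 0.0], candidates, alpha=0.3))
# ['dup_a', 'other']
```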

&lt;h3&gt;
  
  
  4. Graph-Guided Manifold Navigation
&lt;/h3&gt;

&lt;p&gt;Our inference engine already exploits graph structure for discovery. We can extend this to retrieval:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;graph_aware_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Find entry points via embedding similarity
&lt;/span&gt;    &lt;span class="n"&gt;seeds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;semantic_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Expand along graph edges (follow the manifold)
&lt;/span&gt;    &lt;span class="c1"&gt;# This uses our existing relationship structure
&lt;/span&gt;    &lt;span class="n"&gt;expanded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seeds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;frontier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seeds&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entity&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;frontier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Relations define valid transitions on the manifold
&lt;/span&gt;            &lt;span class="c1"&gt;# Inferred edges from our engine help here!
&lt;/span&gt;            &lt;span class="n"&gt;neighbors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;get_related_entities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;expanded&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;neighbors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;frontier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;expanded&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seeds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Re-rank by combined graph + semantic relevance
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;rank_by_rate_reduction_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expanded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The graph provides a strong inductive bias about which memories are likely relevant together. Transitive inferred edges (from our inference engine) act as "shortcuts" on the manifold—they represent compressed paths through the relationship space.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Inference Rules as Compression Operators
&lt;/h3&gt;

&lt;p&gt;Looking at our inference rules through the Berkeley lens reveals their true nature:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;Compression Type&lt;/th&gt;
&lt;th&gt;Information-Theoretic Interpretation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;transitive_depends_on&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Path compression&lt;/td&gt;
&lt;td&gt;Removes redundant traversals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;type_from_solves&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Semantic compression&lt;/td&gt;
&lt;td&gt;Encodes role in single label&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;co_occurrence_affinity&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structural compression&lt;/td&gt;
&lt;td&gt;Finds shared subspace membership&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reverse_solves&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Bidirectional encoding&lt;/td&gt;
&lt;td&gt;Enables queries from either direction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This suggests new rules we could add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Cluster compression: memories with 5+ shared tags → merge into summary entity
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high_overlap_merge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shared_tags &amp;gt;= 5 AND same_project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_summary_entity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compression_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lossy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Temporal compression: daily memories → weekly summary
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temporal_consolidation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;age &amp;gt; 7 days AND same_topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compress_to_summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preserve&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key_decisions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;blockers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outcomes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The inference engine proves that compression principles work for AI memory. We're now exploring how to go deeper:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Already Built:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Transitive inference (path compression)&lt;/li&gt;
&lt;li&gt;Type inference (semantic compression)&lt;/li&gt;
&lt;li&gt;Co-occurrence affinity (structural compression)&lt;/li&gt;
&lt;li&gt;Confidence-based cleanup (rate-distortion tradeoff)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actively Researching:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Type-aware embeddings&lt;/strong&gt;: How do we train type-specific projections without massive labeled datasets? Self-supervised approaches using the graph structure itself look promising—the relationships themselves encode a supervision signal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consolidation triggers&lt;/strong&gt;: When should observations compress into entity updates? Too aggressive and we lose detail; too conservative and memory bloats. We're exploring information-theoretic triggers: consolidate when adding a new observation wouldn't increase the coding rate significantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-type retrieval&lt;/strong&gt;: How do we handle queries spanning multiple types? "Find solutions that Alice worked on for projects using Python" crosses Person, Solution, and Project subspaces. The graph edges provide a natural answer—they're the transformation operators between subspaces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate reduction metrics&lt;/strong&gt;: Can we use MCR² as an actual quality metric during development? This would let us evaluate memory architectures in a principled way, not just via retrieval benchmarks. Early experiments suggest the metric correlates well with subjective "memory usefulness."&lt;/p&gt;
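&lt;p&gt;For concreteness, the coding rate from the MCR² papers, R(Z) = ½ log det(I + d/(nε²) ZZᵀ), is cheap to compute at small scale. Here's a NumPy sketch of rate reduction (total rate minus the weighted within-group rates), intended as an illustration of the metric rather than production code:&lt;/p&gt;

```python
import numpy as np

def coding_rate(Z: np.ndarray, eps: float = 0.5) -> float:
    """R(Z) = 1/2 log det(I + d/(n eps^2) Z Z^T), where Z is d x n."""
    d, n = Z.shape
    M = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    return 0.5 * np.linalg.slogdet(M)[1]

def rate_reduction(Z: np.ndarray, labels: np.ndarray, eps: float = 0.5) -> float:
    """Delta R = R(Z) - sum_j (n_j / n) R(Z_j).

    High when the whole set is diverse but each group is compact.
    """
    _, n = Z.shape
    within = sum(
        (np.sum(labels == j) / n) * coding_rate(Z[:, labels == j], eps)
        for j in np.unique(labels)
    )
    return coding_rate(Z, eps) - within

# Two well-separated clusters: correct grouping scores higher than
# a random shuffle of the same labels.
rng = np.random.default_rng(0)
A = 0.1 * rng.normal(size=(2, 20)) + np.array([[1.0], [0.0]])
B = 0.1 * rng.normal(size=(2, 20)) + np.array([[0.0], [1.0]])
Z = np.hstack([A, B])
labels = np.array([0] * 20 + [1] * 20)
good = rate_reduction(Z, labels)
bad = rate_reduction(Z, rng.permutation(labels))
```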




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;The Berkeley book's framework suggests something profound: compression isn't just a storage optimization—it's the fundamental operation of learning itself.&lt;/p&gt;

&lt;p&gt;Every time you explain a complex system by its key components, you're doing rate reduction. Every time you recognize a pattern across multiple experiences, you're finding low-dimensional structure. Every time you organize knowledge hierarchically, you're building a manifold.&lt;/p&gt;

&lt;p&gt;For AI memory systems, this means we shouldn't think of memory as a retrieval problem with storage attached. &lt;strong&gt;Memory is a compression problem with retrieval as a side effect.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Get the compression right—find the true low-dimensional structure in the experiences—and retrieval becomes almost trivial. The memories that matter will naturally cluster together, and the right memory for a given context will be the one that reduces uncertainty the most.&lt;/p&gt;

&lt;p&gt;MemoryGraph's inference engine is our first step down this path. Transitive compression, type inference, and co-occurrence affinity are all compression operators—they find structure and make it explicit. The next steps—type-aware embeddings, progressive consolidation, and rate-reduction-guided retrieval—push further toward a system that truly learns from experiences rather than just storing them.&lt;/p&gt;

&lt;p&gt;The theoretical grounding matters because it tells us &lt;em&gt;why&lt;/em&gt; these approaches work and &lt;em&gt;what&lt;/em&gt; to try next. When your inference rule creates a transitive edge, that's not just a database optimization—it's the system discovering that a three-hop path can be compressed to one hop without losing essential information. When type inference labels a memory as "solution," it's compressing the memory's structural role into a single semantic label.&lt;/p&gt;

&lt;p&gt;That's what we're building toward with MemoryGraph. Not just a database that stores what AI assistants experience, but a system that truly learns from those experiences—compressing them into structured knowledge that makes every future interaction more intelligent.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;MemoryGraph is an open-source graph-based memory system for AI assistants. Check out the project at &lt;a href="https://github.com/gregorydickson/memory-graph" rel="noopener noreferrer"&gt;github.com/gregorydickson/memory-graph&lt;/a&gt; or try the cloud platform at &lt;a href="https://memorygraph.dev" rel="noopener noreferrer"&gt;memorygraph.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Berkeley textbook "Learning Deep Representations of Data Distributions" is freely available at &lt;a href="https://ma-lab-berkeley.github.io/deep-representation-learning-book/" rel="noopener noreferrer"&gt;ma-lab-berkeley.github.io/deep-representation-learning-book&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Buchanan, S., Pai, D., Wang, P., &amp;amp; Ma, Y. (2025). &lt;em&gt;Learning Deep Representations of Data Distributions&lt;/em&gt;. Online textbook.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Chan, K.H.R., Yu, Y., You, C., Qi, H., Wright, J., &amp;amp; Ma, Y. (2022). ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction. &lt;em&gt;Journal of Machine Learning Research&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Yu, Y., Buchanan, S., Pai, D., et al. (2024). White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? &lt;em&gt;Journal of Machine Learning Research&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;Tags: #AI #Memory #LLM #MachineLearning #KnowledgeGraphs #Compression #Research&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>memory</category>
      <category>agents</category>
    </item>
    <item>
      <title>Building a Prolog-Inspired Inference Engine for AI Coding Agents</title>
      <dc:creator>Gregory Dickson</dc:creator>
      <pubDate>Thu, 11 Dec 2025 14:32:03 +0000</pubDate>
      <link>https://dev.to/gregory_dickson_6dd6e2b55/building-a-prolog-inspired-inference-engine-for-ai-coding-agents-48l</link>
      <guid>https://dev.to/gregory_dickson_6dd6e2b55/building-a-prolog-inspired-inference-engine-for-ai-coding-agents-48l</guid>
      <description>&lt;p&gt;&lt;em&gt;How we're adding automatic relationship discovery to MemoryGraph using FalkorDB and good old-fashioned AI techniques&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;If you've ever used an AI coding assistant like Claude Code, Cursor, or GitHub Copilot, you've probably noticed they have the memory of a goldfish. Every session starts fresh. You explain your project architecture, your coding conventions, your preferences—and tomorrow, you do it all again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/gregorydickson/memory-graph" rel="noopener noreferrer"&gt;MemoryGraph&lt;/a&gt; is an open-source project that gives AI coding agents persistent, graph-based memory. But storing memories is only half the battle. The real magic happens when the system starts &lt;em&gt;understanding&lt;/em&gt; the connections you didn't explicitly create.&lt;/p&gt;

&lt;p&gt;We're building an inference engine. Here's how.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Prolog Connection
&lt;/h2&gt;

&lt;p&gt;Before diving into implementation, let's talk about why graph databases and inference feel so natural together.&lt;/p&gt;

&lt;p&gt;If you squint at a graph database query, it looks suspiciously like Prolog, my first (well, actually my second) programming language:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight prolog"&gt;&lt;code&gt;&lt;span class="c1"&gt;% Prolog&lt;/span&gt;
&lt;span class="ss"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;tom&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;mary&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
&lt;span class="ss"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;mary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;ann&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
&lt;span class="ss"&gt;grandparent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:-&lt;/span&gt; &lt;span class="ss"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="ss"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Z&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Cypher (FalkorDB/Neo4j)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tom&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:PARENT&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mary&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mary&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:PARENT&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ann&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Query: find grandparents&lt;/span&gt;
&lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:PARENT&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:PARENT&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both are fundamentally declarative. You describe &lt;em&gt;what&lt;/em&gt; you want, not &lt;em&gt;how&lt;/em&gt; to find it. The system figures out the traversal.&lt;/p&gt;

&lt;p&gt;This insight shapes our entire approach: &lt;strong&gt;inference rules are just parameterized Cypher queries&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;When a developer stores a memory like "Auth Service depends on JWT Library," and later adds "JWT Library depends on Crypto Utils," we want the system to automatically understand that Auth Service &lt;em&gt;transitively&lt;/em&gt; depends on Crypto Utils.&lt;/p&gt;

&lt;p&gt;More ambitiously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If something &lt;code&gt;SOLVES&lt;/code&gt; a &lt;code&gt;problem&lt;/code&gt;, it's probably a &lt;code&gt;solution&lt;/code&gt; (type inference)&lt;/li&gt;
&lt;li&gt;If two memories share two or more connections, they're probably related (affinity detection)&lt;/li&gt;
&lt;li&gt;If A &lt;code&gt;CAUSES&lt;/code&gt; problem P and B &lt;code&gt;SOLVES&lt;/code&gt; P, then A and B are connected (problem-solution bridging)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this should happen automatically, in the background, without slowing down writes.&lt;/p&gt;
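&lt;p&gt;To make the transitive case concrete, here's the same derivation as a self-contained Python toy (an illustration, not MemoryGraph internals): walk explicit &lt;code&gt;DEPENDS_ON&lt;/code&gt; edges and give each derived edge a confidence of 1/depth, the same idea the inference rules use.&lt;/p&gt;

```python
# Toy transitive-dependency derivation (illustration, not MemoryGraph code).
explicit = {("auth-service", "jwt-library"), ("jwt-library", "crypto-utils")}

def infer_transitive(edges, max_depth=3):
    """Derive {(a, c): confidence} for chains of length 2..max_depth."""
    inferred = {}
    frontier = dict.fromkeys(edges, 1)  # paths of length 1
    for depth in range(2, max_depth + 1):
        next_frontier = {}
        for (a, b) in frontier:
            for (x, c) in edges:
                if x == b and a != c and (a, c) not in edges:
                    next_frontier[(a, c)] = depth
                    # Longer chains get lower confidence.
                    inferred.setdefault((a, c), 1.0 / depth)
        frontier = next_frontier
    return inferred

derived = infer_transitive(explicit)
# auth-service transitively depends on crypto-utils with confidence 0.5
```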

&lt;h2&gt;
  
  
  Why FalkorDB?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.falkordb.com/" rel="noopener noreferrer"&gt;FalkorDB&lt;/a&gt; is a Redis-based graph database with full Cypher support. For MemoryGraph, it offers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt; - Sub-millisecond queries for the graph sizes we're dealing with&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cypher&lt;/strong&gt; - Industry-standard query language, portable knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Protocol&lt;/strong&gt; - Easy deployment, familiar ops story&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In-Database Processing&lt;/strong&gt; - We can push inference logic into the database itself&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last point is crucial. Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Read data → Process in Python → Write results
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Run Cypher query that reads AND writes in one transaction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;Here's the high-level flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│                     Memory Write                            │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│              Store Memory (immediate)                        │
│              Return to User (&amp;lt; 10ms)                        │
│              Queue for Inference                            │
└──────────────────────────┬──────────────────────────────────┘
                           │ (async, batched)
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                   Inference Engine                          │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Rule: transitive_depends_on                         │   │
│  │  Rule: type_from_solves                              │   │
│  │  Rule: co_occurrence_affinity                        │   │
│  └─────────────────────────────────────────────────────┘   │
│                           │                                 │
│                           ▼                                 │
│              FalkorDB (Cypher execution)                    │
│              Creates edges marked {inferred: true}          │
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;inference is decoupled from the write path&lt;/strong&gt;. Users never wait for inference to complete.&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining Rules as Cypher
&lt;/h2&gt;

&lt;p&gt;Each inference rule is a self-contained Cypher query that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Matches a pattern involving the triggering memory&lt;/li&gt;
&lt;li&gt;Creates new relationships (marked as inferred)&lt;/li&gt;
&lt;li&gt;Returns a count for logging&lt;/li&gt;
&lt;/ol&gt;
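&lt;p&gt;The &lt;code&gt;InferenceRule&lt;/code&gt; object used in the snippets below is assumed to be a small value type. A minimal sketch (the real class may carry more fields, such as priority or an enabled flag):&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InferenceRule:
    """A named, self-contained Cypher query run against FalkorDB.

    Each query takes a $memory_id parameter, creates relationships
    marked {inferred: true}, and returns a count for logging.
    """
    name: str
    description: str
    query: str

rule = InferenceRule(
    name="transitive_depends_on",
    description="Propagate DEPENDS_ON transitively",
    query="MATCH ... MERGE ... RETURN count(r) as created",  # body elided
)
```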

&lt;p&gt;Here's the transitive dependency rule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;InferenceRule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transitive_depends_on&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Propagate DEPENDS_ON transitively (A→B→C means A→C)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        MATCH path = (a:Memory {id: $memory_id})-[:DEPENDS_ON*2..3]-&amp;gt;(c:Memory)
        WHERE a &amp;lt;&amp;gt; c
          AND NOT (a)-[:DEPENDS_ON {inferred: true}]-&amp;gt;(c)
        WITH a, c, min(length(path)) as depth
        MERGE (a)-[r:DEPENDS_ON {inferred: true}]-&amp;gt;(c)
        ON CREATE SET r.rule = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;transitive_depends_on&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,
                      r.depth = depth,
                      r.confidence = 1.0 / depth,
                      r.created_at = datetime()
        RETURN count(r) as created
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's break this down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;[:DEPENDS_ON*2..3]&lt;/code&gt; - Match paths of length 2-3 (we don't want infinite chains)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;WHERE NOT (a)-[:DEPENDS_ON {inferred: true}]-&amp;gt;(c)&lt;/code&gt; - Don't create duplicates&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;confidence: 1.0 / depth&lt;/code&gt; - Longer chains = lower confidence&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;inferred: true&lt;/code&gt; - Mark it so we can filter/weight differently in search&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The beauty is that this runs entirely in FalkorDB. No data leaves the database.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Batching Strategy
&lt;/h2&gt;

&lt;p&gt;Running inference on every single write would be wasteful. If a developer is rapidly creating memories, we'd thrash the database with redundant queries.&lt;/p&gt;

&lt;p&gt;Instead, we batch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;InferenceService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pending_memories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;deque&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxlen&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;batch_delay&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;queue_for_inference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Called on every write - returns immediately&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pending_memories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_processor_running&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_batch_processor&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_batch_processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Waits, then processes accumulated memories&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_delay&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Let writes accumulate
&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pending_memories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pending_memories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;popleft&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
                     &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pending_memories&lt;/span&gt;&lt;span class="p"&gt;)))]&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_run_inference_batch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 2-second delay means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single writes: a 2-second delay before inference runs (invisible to the user)&lt;/li&gt;
&lt;li&gt;Burst writes: All processed together efficiently&lt;/li&gt;
&lt;li&gt;No thundering herd on the database&lt;/li&gt;
&lt;/ul&gt;
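&lt;p&gt;The debounce-and-drain behavior is easy to check in isolation. This toy (a simplified stand-in for the service above) simulates a burst of 25 writes and drains them in batches of at most 10:&lt;/p&gt;

```python
import asyncio
from collections import deque

pending = deque()   # stands in for self.pending_memories
batches = []        # records what each inference pass would receive

async def batch_processor(batch_size=10, batch_delay=0.01):
    await asyncio.sleep(batch_delay)  # let a burst of writes accumulate
    while pending:
        batch = [pending.popleft() for _ in range(min(batch_size, len(pending)))]
        batches.append(batch)

async def main():
    for i in range(25):               # burst of 25 writes
        pending.append(f"mem-{i}")
    await batch_processor()           # one processor run drains them all

asyncio.run(main())
# batches now hold 10, 10, and 5 memory ids respectively
```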

&lt;h2&gt;
  
  
  Inference-Aware Search
&lt;/h2&gt;

&lt;p&gt;Creating inferred edges is useless if search doesn't leverage them. Here's how we blend explicit and inferred relationships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;m:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;m.title&lt;/span&gt; &lt;span class="ow"&gt;CONTAINS&lt;/span&gt; &lt;span class="n"&gt;$query&lt;/span&gt; &lt;span class="ow"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;m.content&lt;/span&gt; &lt;span class="ow"&gt;CONTAINS&lt;/span&gt; &lt;span class="n"&gt;$query&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;base_score&lt;/span&gt;

&lt;span class="c1"&gt;// Boost from explicit (user-created) relationships&lt;/span&gt;
&lt;span class="k"&gt;OPTIONAL&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;related:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r1.inferred&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="ow"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;r1.inferred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_score&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;explicit_boost&lt;/span&gt;

&lt;span class="c1"&gt;// Smaller boost from inferred relationships&lt;/span&gt;
&lt;span class="k"&gt;OPTIONAL&lt;/span&gt; &lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;&lt;span class="py"&gt;inferred:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="ss"&gt;}]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;inferred:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="n"&gt;base_score&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;explicit_boost&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;coalesce&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r2.confidence&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;final_score&lt;/span&gt;

&lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="n"&gt;final_score&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;final_score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Explicit relationships get more weight (0.3) than inferred ones (0.15), and inferred edges are further scaled by their confidence score. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User-created connections are always prioritized&lt;/li&gt;
&lt;li&gt;High-confidence inferences boost results&lt;/li&gt;
&lt;li&gt;Low-confidence guesses have minimal impact&lt;/li&gt;
&lt;/ul&gt;
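&lt;p&gt;The weighting is easier to sanity-check outside Cypher. Here is the same blend in plain Python (a sketch mirroring the constants above, not production code):&lt;/p&gt;

```python
def blended_score(explicit_count, inferred_confidences,
                  base=1.0, w_explicit=0.3, w_inferred=0.15):
    """Blend a base text-match score with relationship boosts.

    explicit_count: number of user-created relationships.
    inferred_confidences: confidence of each inferred relationship.
    """
    return (base
            + explicit_count * w_explicit
            + sum(w_inferred * c for c in inferred_confidences))

# Two explicit edges, plus inferred edges at confidence 1.0 and 0.5:
score = blended_score(2, [1.0, 0.5])  # 1.0 + 0.6 + 0.15 + 0.075
```

&lt;p&gt;High-confidence inferred edges approach half the weight of an explicit edge; low-confidence ones barely move the score.&lt;/p&gt;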

&lt;h2&gt;
  
  
  The Type Inference Pattern
&lt;/h2&gt;

&lt;p&gt;One of my favorite rules is type inference. MemoryGraph has a taxonomy of memory types: &lt;code&gt;solution&lt;/code&gt;, &lt;code&gt;problem&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;, &lt;code&gt;fix&lt;/code&gt;, &lt;code&gt;pattern&lt;/code&gt;, etc.&lt;/p&gt;

&lt;p&gt;But users often just dump content without classifying it. The inference engine can help:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;InferenceRule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type_from_solves&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If memory SOLVES a problem, infer it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s a solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        MATCH (m:Memory {id: $memory_id})-[:SOLVES]-&amp;gt;(p:Memory {type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;problem&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;})
        WHERE m.type = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;general&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; OR m.type IS NULL
        SET m.type = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, m.type_inferred = true
        RETURN m.id as updated
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you create a memory and link it with &lt;code&gt;SOLVES&lt;/code&gt; to something typed as &lt;code&gt;problem&lt;/code&gt;, the system infers your memory is a &lt;code&gt;solution&lt;/code&gt;. Simple, but surprisingly useful for keeping the knowledge graph clean.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cloud-Only Premium Features
&lt;/h2&gt;

&lt;p&gt;We're building MemoryGraph as open-source with a cloud offering. Some inference rules only make sense (or are only cost-effective) in the cloud:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Affinity Detection&lt;/strong&gt; - Find memories that share multiple connections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;a:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;&lt;span class="py"&gt;id:&lt;/span&gt; &lt;span class="n"&gt;$memory_id&lt;/span&gt;&lt;span class="ss"&gt;})&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r1&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;common:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r2&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;b:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="ow"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:AFFINITY&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;common&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;shared_count&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;shared_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="k"&gt;MERGE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="py"&gt;r:&lt;/span&gt;&lt;span class="n"&gt;AFFINITY&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;
    &lt;span class="py"&gt;inferred:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt;
    &lt;span class="py"&gt;strength:&lt;/span&gt; &lt;span class="nf"&gt;toFloat&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shared_count&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt; &lt;span class="err"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt;
    &lt;span class="py"&gt;shared_connections:&lt;/span&gt; &lt;span class="n"&gt;shared_count&lt;/span&gt;
&lt;span class="ss"&gt;}]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem-Solution Bridging&lt;/strong&gt; - Connect root causes to their fixes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;cause:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:CAUSES&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;problem:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:SOLVES&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;solution:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;cause&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;solution&lt;/span&gt;
&lt;span class="k"&gt;MERGE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cause&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:ADDRESSED_BY&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;&lt;span class="py"&gt;inferred:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="py"&gt;via_problem:&lt;/span&gt; &lt;span class="n"&gt;problem.id&lt;/span&gt;&lt;span class="ss"&gt;}]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solution&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These run asynchronously in the cloud, invisible to users but enriching their knowledge graphs over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling False Positives
&lt;/h2&gt;

&lt;p&gt;Inference isn't perfect. Sometimes the system will create relationships that don't make sense. Our mitigations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Everything is marked&lt;/strong&gt; - &lt;code&gt;{inferred: true}&lt;/code&gt; means we can always filter it out&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence scores&lt;/strong&gt; - Lower confidence = less impact on search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Periodic cleanup&lt;/strong&gt; - A background job prunes old, low-confidence edges:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;&lt;span class="py"&gt;inferred:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="ss"&gt;}]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r.confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;
  &lt;span class="ow"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r.created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="ss"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'P30D'&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;
&lt;strong&gt;User feedback&lt;/strong&gt; - Future: let users thumbs-down bad inferences, feeding back into rule tuning&lt;/li&gt;
&lt;/ol&gt;
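&lt;p&gt;The pruning policy is worth unit-testing on its own. A Python predicate mirroring the thresholds in the cleanup query (a sketch, not the shipped job):&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

def should_prune(confidence, created_at, now=None,
                 min_confidence=0.3, max_age_days=30):
    """Prune inferred edges that are both low-confidence and stale."""
    now = now or datetime.now(timezone.utc)
    too_old = now - created_at > timedelta(days=max_age_days)
    return confidence < min_confidence and too_old

now = datetime(2025, 12, 26, tzinfo=timezone.utc)
old = datetime(2025, 11, 1, tzinfo=timezone.utc)
```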

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This inference engine is the foundation for more ambitious features:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM-Powered Classification&lt;/strong&gt; - For memories the rules can't classify, use a small/fast model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;llm_classify_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;general&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3-haiku-20240307&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Classify: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Update memory type based on response
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Temporal Inference&lt;/strong&gt; - Memories created close together with shared tags are probably related:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="k"&gt;MATCH&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;a:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;),&lt;/span&gt; &lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;b:&lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;duration.between&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a.created_at&lt;/span&gt;&lt;span class="ss"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b.created_at&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="n"&gt;.minutes&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
  &lt;span class="ow"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="ow"&gt;IN&lt;/span&gt; &lt;span class="n"&gt;a.tags&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="ow"&gt;IN&lt;/span&gt; &lt;span class="n"&gt;b.tags&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;MERGE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:TEMPORAL_PROXIMITY&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;&lt;span class="py"&gt;inferred:&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="ss"&gt;}]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cross-Project Patterns&lt;/strong&gt; - In enterprise deployments, detect common problem-solution pairs across teams (anonymized, of course).&lt;/p&gt;
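&lt;p&gt;A minimal sketch of what that detection could look like, assuming anonymized records that carry only a project label, problem tags, and a solution tag (the field names and grouping below are illustrative, not the shipped implementation):&lt;br&gt;
&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical anonymized records: a project label, problem tags, and the
# tag of the solution that resolved the problem.
memories = [
    {"project": "team-a", "problem_tags": ("timeout", "retry"), "solution": "connection_pooling"},
    {"project": "team-b", "problem_tags": ("retry", "timeout"), "solution": "connection_pooling"},
    {"project": "team-c", "problem_tags": ("auth", "jwt"), "solution": "token_refresh"},
]

def cross_project_patterns(records):
    """Return problem/solution pairs that recur in more than one project."""
    seen = defaultdict(set)
    for record in records:
        key = (frozenset(record["problem_tags"]), record["solution"])
        seen[key].add(record["project"])
    return {key: projects for key, projects in seen.items() if len(projects) > 1}

patterns = cross_project_patterns(memories)
# The timeout/retry -> connection_pooling pair shows up in two teams.
```

&lt;p&gt;In the real engine this would run as a graph query rather than a Python loop, but the shape of the rule is the same: group by problem signature, count distinct projects.&lt;/p&gt;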

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;MemoryGraph is open source: &lt;a href="https://github.com/gregorydickson/memory-graph" rel="noopener noreferrer"&gt;github.com/gregorydickson/memory-graph&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The inference engine is coming in the next release. If you're building AI-powered developer tools and need persistent memory, give it a look. &lt;/p&gt;

&lt;p&gt;Or if you just think graph databases and declarative inference are cool (they are), come contribute. We're always looking for new rules to add to the engine.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building MemoryGraph at &lt;a href="https://memorygraph.dev" rel="noopener noreferrer"&gt;memorygraph.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;#ai&lt;/code&gt; &lt;code&gt;#graphdatabase&lt;/code&gt; &lt;code&gt;#python&lt;/code&gt; &lt;code&gt;#opensource&lt;/code&gt; &lt;code&gt;#devtools&lt;/code&gt; &lt;code&gt;#falkordb&lt;/code&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Discussion Questions
&lt;/h2&gt;

&lt;p&gt;I'd love to hear from the community:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What inference rules would be useful for your workflow?&lt;/strong&gt; We're always looking for patterns that would help developers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;How do you handle "memory" in your AI tooling today?&lt;/strong&gt; Curious what workarounds people have built.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prolog nostalgia?&lt;/strong&gt; Anyone else miss declarative logic programming? There's something elegant about it that modern systems have lost.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Drop a comment below or find me on &lt;a href="https://github.com/gregorydickson" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>database</category>
      <category>ai</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Context-Efficient AI Coding Agent Memory Without Abandoning MCP</title>
      <dc:creator>Gregory Dickson</dc:creator>
      <pubDate>Sun, 07 Dec 2025 13:29:38 +0000</pubDate>
      <link>https://dev.to/gregory_dickson_6dd6e2b55/memorygraph-context-efficient-mcp-memory-without-abandoning-mcp-ii0</link>
      <guid>https://dev.to/gregory_dickson_6dd6e2b55/memorygraph-context-efficient-mcp-memory-without-abandoning-mcp-ii0</guid>
      <description>&lt;h2&gt;
  
  
  The Context Window Problem Is Real
&lt;/h2&gt;

&lt;p&gt;If you've worked with AI coding agents, you've experienced it: your agent slows down, token costs spike, or tasks fail because the context window hit its limit. A recent article highlighted this pain point, showing that just three popular MCP servers consumed 26% of a coding agent's context window.&lt;/p&gt;

&lt;p&gt;The culprit? MCP servers that pre-load dozens of tool definitions into the context window whether the agent needs them or not. Some memory solutions expose 40+ tools, each with verbose descriptions that compound into thousands of tokens before your agent even starts working.&lt;/p&gt;
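&lt;p&gt;A quick back-of-envelope sketch using the common rough heuristic of about 4 characters per token (the description lengths are illustrative, not measurements of any particular server):&lt;br&gt;
&lt;/p&gt;

```python
def estimated_tokens(text):
    """Common rough heuristic: about 4 characters per token for English prose."""
    return len(text) // 4

# Illustrative description sizes, not measurements of any real server:
# a verbose tool description runs ~600 characters, a concise one ~150.
verbose_server = ["d" * 600 for _ in range(40)]   # 40 tools, verbose
lean_server = ["d" * 150 for _ in range(9)]       # 9 tools, concise

verbose_cost = sum(estimated_tokens(d) for d in verbose_server)  # 6000 tokens
lean_cost = sum(estimated_tokens(d) for d in lean_server)        # 333 tokens
# Every one of those tokens is spent before the agent does any work.
```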

&lt;p&gt;This is a legitimate concern. But the solution isn't to abandon MCP entirely—it's to design MCP servers with context efficiency as a first-class requirement.&lt;/p&gt;

&lt;h2&gt;
  
  
  MemoryGraph's Approach: Judicious Tool Design
&lt;/h2&gt;

&lt;p&gt;MemoryGraph takes a different path. Instead of offering every conceivable memory operation as a separate tool, we designed around a core principle: &lt;strong&gt;minimum tools, maximum capability&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Numbers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;th&gt;Default Tools&lt;/th&gt;
&lt;th&gt;Typical Context Usage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Heavy MCP servers&lt;/td&gt;
&lt;td&gt;40+ tools&lt;/td&gt;
&lt;td&gt;20-30% of context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MemoryGraph Core&lt;/td&gt;
&lt;td&gt;9 tools&lt;/td&gt;
&lt;td&gt;~2-3% of context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MemoryGraph Extended&lt;/td&gt;
&lt;td&gt;11 tools&lt;/td&gt;
&lt;td&gt;~3-4% of context&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Nine tools. That's it for 95% of use cases. And each tool description is crafted to be concise while remaining discoverable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool Profiles: Context When You Need It
&lt;/h3&gt;

&lt;p&gt;We implemented &lt;strong&gt;tool profiles&lt;/strong&gt; to give users explicit control over their context footprint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Core mode (default) - 9 tools, minimal context&lt;/span&gt;
memorygraph

&lt;span class="c"&gt;# Extended mode - 11 tools, adds statistics and advanced queries&lt;/span&gt;
memorygraph &lt;span class="nt"&gt;--profile&lt;/span&gt; extended
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most users never need extended mode. The core profile provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Memory CRUD&lt;/strong&gt;: store, get, update, delete, search (5 tools)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relationships&lt;/strong&gt;: create links, traverse graph (2 tools)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt;: fuzzy recall, session briefings (2 tools)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That covers storing solutions, linking problems to fixes, recalling past work, and catching up on project context. Extended mode adds database statistics and complex relationship queries—useful for power users, but users have to opt in.&lt;/p&gt;
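&lt;p&gt;Internally, profile selection can be as simple as swapping tool lists. The identifiers in this sketch are approximated from the summary above; the names MemoryGraph actually registers may differ:&lt;br&gt;
&lt;/p&gt;

```python
# Tool identifiers approximated from the core-profile summary above; the
# names MemoryGraph actually registers may differ.
CORE_TOOLS = [
    "store_memory", "get_memory", "update_memory",     # memory CRUD
    "delete_memory", "search_memories",
    "create_relationship", "traverse_graph",           # relationships
    "recall_memories", "session_briefing",             # discovery
]
EXTENDED_TOOLS = CORE_TOOLS + ["graph_statistics", "advanced_query"]

def tools_for_profile(profile="core"):
    """Return the tool set a given --profile flag would expose."""
    return EXTENDED_TOOLS if profile == "extended" else CORE_TOOLS
```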

&lt;h2&gt;
  
  
  Why We Didn't Abandon MCP
&lt;/h2&gt;

&lt;p&gt;Some memory vendors have moved from MCP to CLI interfaces, arguing that agents are "natively fluent" in shell commands. While there's merit to this argument, we believe it conflates two separate concerns:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Problem Isn't MCP—It's Tool Sprawl
&lt;/h3&gt;

&lt;p&gt;MCP itself is a thin protocol. The context cost comes from tool &lt;em&gt;definitions&lt;/em&gt;, not the protocol. A well-designed MCP server with 9 concise tools uses far less context than a CLI wrapper with verbose &lt;code&gt;--help&lt;/code&gt; output that gets loaded anyway.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. CLI Loses MCP's Ecosystem Benefits
&lt;/h3&gt;

&lt;p&gt;MCP provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standardized tool discovery across clients (Claude Code, Cursor, VS Code Copilot, etc.)&lt;/li&gt;
&lt;li&gt;Consistent installation and configuration&lt;/li&gt;
&lt;li&gt;Client-managed tool execution and error handling&lt;/li&gt;
&lt;li&gt;Cross-platform support without wrapper scripts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Moving to CLI means maintaining separate integrations for each coding agent, handling authentication differently per environment, and losing the growing MCP ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Graph Relationships Are Our Value Prop
&lt;/h3&gt;

&lt;p&gt;A CLI interface forces flat, document-style storage. MemoryGraph's power comes from &lt;strong&gt;typed relationships&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[timeout_fix] --CAUSES--&amp;gt; [memory_leak] --SOLVED_BY--&amp;gt; [connection_pooling]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Query: "What happened with retry logic?" returns the full causal chain—something flat storage can't provide efficiently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Concise Tool Descriptions: How We Stay Lean
&lt;/h2&gt;

&lt;p&gt;Here's an example of how we approach tool descriptions. Compare a verbose approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Verbose (typical)
recall_memories: This is the recommended starting point for recalling past 
memories and learnings from your knowledge graph. This tool wraps search_memories with optimal defaults for natural language queries. When you want to search for past work, solutions, problems, patterns, or project context, use this tool first. 
It automatically uses fuzzy matching which handles plurals, tenses, and case 
variations. Results always include relationship context showing what connects to what. This is simpler than search_memories for common use cases because it has optimized default settings applied. Pass a natural language query and optionally filter by memory types or project path. Results are ranked by relevance with match quality hints included.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Versus our actual approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Concise (MemoryGraph)
recall_memories: Search memories with fuzzy matching and relationship context. 
Best starting point for "What did we learn about X?" queries. Handles plurals and tenses automatically.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same capability, fraction of the tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means in Practice
&lt;/h2&gt;

&lt;p&gt;When you add MemoryGraph to Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add &lt;span class="nt"&gt;--scope&lt;/span&gt; user memorygraph &lt;span class="nt"&gt;--&lt;/span&gt; memorygraph
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your agent gets persistent memory with graph relationships while consuming roughly &lt;strong&gt;2-3% of context&lt;/strong&gt;—leaving the rest for your actual work.&lt;/p&gt;

&lt;p&gt;Compare that to solutions that consume 20%+ before you've even asked a question.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our Commitment
&lt;/h2&gt;

&lt;p&gt;We're adding context footprint tracking to our documentation and website. Users should know exactly how much context each MCP server costs before they install it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Upcoming improvements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Published context token counts per tool profile&lt;/li&gt;
&lt;li&gt;Tool description audit to minimize verbosity&lt;/li&gt;
&lt;li&gt;Continued focus on "minimum tools, maximum capability"&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The context window problem is real, but MCP isn't the enemy. Tool sprawl is. MemoryGraph proves you can have powerful graph-based memory with relationship tracking while staying context-efficient.&lt;/p&gt;

&lt;p&gt;Nine tools. Graph relationships. 2-3% context usage.&lt;/p&gt;

&lt;p&gt;That's the balance we've found.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Get started:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;memorygraphMCP
claude mcp add &lt;span class="nt"&gt;--scope&lt;/span&gt; user memorygraph &lt;span class="nt"&gt;--&lt;/span&gt; memorygraph
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/gregorydickson/memory-graph" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://github.com/gregorydickson/memory-graph/blob/main/docs/" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;MemoryGraph is an open-source MCP memory server for AI coding agents. We believe context efficiency and powerful features aren't mutually exclusive.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>MCP Gets Tasks: A Game-Changer for Long-Running AI Operations</title>
      <dc:creator>Gregory Dickson</dc:creator>
      <pubDate>Fri, 05 Dec 2025 19:54:25 +0000</pubDate>
      <link>https://dev.to/gregory_dickson_6dd6e2b55/mcp-gets-tasks-a-game-changer-for-long-running-ai-operations-2kel</link>
      <guid>https://dev.to/gregory_dickson_6dd6e2b55/mcp-gets-tasks-a-game-changer-for-long-running-ai-operations-2kel</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Model Context Protocol is adding async task support—and it's going to fundamentally change how AI agents handle complex, time-intensive work.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;December 5, 2025 - Gregory Dickson&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The Model Context Protocol (MCP) has been revolutionizing how AI agents interact with external tools and data sources since its release. But there's been a significant limitation holding back more sophisticated use cases: every tool call blocks until completion. No way to check progress. No way to retrieve results later. No way to handle operations that take minutes or hours.&lt;/p&gt;

&lt;p&gt;That's about to change.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: When Tool Calls Take Too Long
&lt;/h2&gt;

&lt;p&gt;If you've built any serious MCP server, you've hit this wall. Maybe you're wrapping a workflow API that processes large datasets. Maybe you're orchestrating multiple AI agents. Maybe you're running comprehensive test suites or complex data analysis pipelines.&lt;/p&gt;

&lt;p&gt;The current pattern forces an uncomfortable choice:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Block and wait&lt;/strong&gt; - Your agent sits idle for minutes or hours while a single operation completes. If the connection drops, you lose everything and start over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Split into multiple tools&lt;/strong&gt; - Create &lt;code&gt;start_job&lt;/code&gt;, &lt;code&gt;check_status&lt;/code&gt;, and &lt;code&gt;get_result&lt;/code&gt; tools. Now you're relying on prompt engineering to make the agent poll correctly. Sometimes it works. Sometimes the agent "forgets" to check back. Sometimes it hallucinates job IDs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 3: Build a polling server&lt;/strong&gt; - Your MCP server does nothing but poll other services. You're just moving the problem around.&lt;/p&gt;

&lt;p&gt;None of these are good solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter SEP-1686: Tasks
&lt;/h2&gt;

&lt;p&gt;The MCP core team has accepted &lt;a href="https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1686" rel="noopener noreferrer"&gt;SEP-1686&lt;/a&gt;, a specification for &lt;strong&gt;first-class async task support&lt;/strong&gt; in the protocol. And it's elegant.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;Tasks introduce a three-phase pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. CREATE - Start the operation, get task metadata back immediately&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;callTool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;analyze_dataset&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;large_file.csv&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
  &lt;span class="na"&gt;createTask&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3600000&lt;/span&gt; &lt;span class="c1"&gt;// Keep results for 1 hour&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Returns immediately with taskId: "abc-123", status: "working"&lt;/span&gt;

&lt;span class="c1"&gt;// 2. POLL - Check status when you want&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTaskStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// { status: "working", pollInterval: 5000 }&lt;/span&gt;

&lt;span class="c1"&gt;// 3. RETRIEVE - Get the actual result when complete&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTaskResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Returns the actual tool call result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your host application stays in control. The agent can do other work. You can show progress in your UI. If the connection drops, you can reconnect and fetch results using the task ID.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Generic Primitive&lt;/strong&gt;: This isn't just for tools. Tasks work with &lt;em&gt;any&lt;/em&gt; MCP request type—tools, resources, prompts, sampling, you name it. The same pattern, consistently applied across the entire protocol.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Idempotent &amp;amp; Retry-Safe&lt;/strong&gt;: Client-generated task IDs mean you can safely retry requests without creating duplicate tasks. Perfect for unreliable networks.&lt;/p&gt;
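&lt;p&gt;A toy sketch of why client-generated IDs make retries safe—a dict-backed stand-in for a task-supporting server, not the actual SEP-1686 wire protocol:&lt;br&gt;
&lt;/p&gt;

```python
import uuid

class ToyTaskServer:
    """Dict-backed stand-in: the same task ID always maps to the same task."""

    def __init__(self):
        self.tasks = {}

    def create_task(self, task_id, operation):
        # A retried request re-sends the same client-generated ID, so it
        # finds the existing task instead of starting a duplicate operation.
        if task_id not in self.tasks:
            self.tasks[task_id] = {"status": "working", "operation": operation}
        return self.tasks[task_id]

server = ToyTaskServer()
task_id = str(uuid.uuid4())                              # generated client-side
first = server.create_task(task_id, "analyze_dataset")
retry = server.create_task(task_id, "analyze_dataset")   # network blip, resent
# first is retry: only one task was ever created
```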

&lt;p&gt;&lt;strong&gt;Resource Management&lt;/strong&gt;: Built-in TTL (time-to-live) support means servers can clean up completed tasks automatically. No memory leaks from abandoned operations.&lt;/p&gt;
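&lt;p&gt;A small sketch of what TTL-based cleanup could look like server-side (timestamps in seconds; the bookkeeping here is illustrative, not spec text):&lt;br&gt;
&lt;/p&gt;

```python
import time

class ToyTaskStore:
    """Completed tasks carry an expiry; a sweep reclaims the stale ones."""

    def __init__(self):
        self.tasks = {}

    def complete(self, task_id, result, ttl_ms, now=None):
        now = time.monotonic() if now is None else now
        self.tasks[task_id] = {"result": result, "expires_at": now + ttl_ms / 1000}

    def sweep(self, now=None):
        """Drop tasks whose TTL has elapsed; return the reclaimed IDs."""
        now = time.monotonic() if now is None else now
        alive = {tid: t for tid, t in self.tasks.items() if t["expires_at"] > now}
        reclaimed = sorted(set(self.tasks) - set(alive))
        self.tasks = alive
        return reclaimed

store = ToyTaskStore()
store.complete("abc-123", {"rows": 10}, ttl_ms=3600000, now=0)  # 1-hour TTL
swept_early = store.sweep(now=10)     # within the TTL: nothing reclaimed
swept_late = store.sweep(now=3601)    # past it: "abc-123" is cleaned up
```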

&lt;p&gt;&lt;strong&gt;Graceful Degradation&lt;/strong&gt;: Servers that don't support tasks just ignore the metadata and return results normally. No version negotiation needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bidirectional&lt;/strong&gt;: Either clients &lt;em&gt;or&lt;/em&gt; servers can create tasks. A server can task-ify a sampling request that needs user input, for example.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Impact
&lt;/h2&gt;

&lt;p&gt;Amazon cited several production use cases driving this specification:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Healthcare &amp;amp; Life Sciences&lt;/strong&gt;: Molecular analysis jobs processing hundreds of thousands of data points over several hours&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise Automation&lt;/strong&gt;: SDLC workflows spanning multiple teams and systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code Migration&lt;/strong&gt;: Automated refactoring across large codebases with dependency analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test Execution&lt;/strong&gt;: Comprehensive test suites with thousands of cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Agent Systems&lt;/strong&gt;: Agents that need to coordinate without blocking each other&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't edge cases. These are fundamental patterns for production AI applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  MemoryGraph + Tasks = Powerful Memory Operations
&lt;/h2&gt;

&lt;p&gt;I'm particularly excited about this because of what it means for &lt;a href="https://github.com/gregorydickson/memorygraph" rel="noopener noreferrer"&gt;MemoryGraph&lt;/a&gt;, my open-source MCP memory server.&lt;/p&gt;

&lt;p&gt;MemoryGraph uses graph-based relationship tracking to give AI agents sophisticated, queryable memory. But some operations are computationally expensive:&lt;/p&gt;

&lt;h3&gt;
  
  
  Complex Graph Traversals
&lt;/h3&gt;

&lt;p&gt;Finding all solutions related to a problem, following relationship chains, or exploring multi-hop connections across hundreds of memories—these queries can take time, especially as the graph grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Batch Memory Operations
&lt;/h3&gt;

&lt;p&gt;Importing large conversation histories, bulk relationship creation, or memory consolidation operations that process hundreds of nodes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Semantic Search at Scale
&lt;/h3&gt;

&lt;p&gt;Vector similarity searches across large memory sets, especially with complex filtering or multi-term queries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory Curation
&lt;/h3&gt;

&lt;p&gt;Background cleanup operations, relationship strength decay, automated summarization of old memories, or graph optimization.&lt;/p&gt;
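&lt;p&gt;Relationship strength decay, for instance, could be as simple as an exponential half-life. The formula and the 30-day half-life here are illustrative choices, not MemoryGraph's shipped curation policy:&lt;br&gt;
&lt;/p&gt;

```python
def decayed_strength(strength, age_days, half_life_days=30.0):
    """Exponential decay: a relationship's strength halves every half-life."""
    return strength * 0.5 ** (age_days / half_life_days)

decayed_strength(1.0, 0)    # 1.0  - fresh relationship, full strength
decayed_strength(1.0, 30)   # 0.5  - one half-life old
decayed_strength(1.0, 60)   # 0.25 - two half-lives old
```

&lt;p&gt;Running that over every edge in a large graph is exactly the kind of background job that benefits from a task: kick it off, keep working, check back later.&lt;/p&gt;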

&lt;p&gt;&lt;strong&gt;With task support, MemoryGraph can:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Return immediately&lt;/strong&gt; for expensive queries, letting agents continue other work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provide progress updates&lt;/strong&gt; as complex traversals complete&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache results&lt;/strong&gt; so agents can retrieve them multiple times without re-computation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Support background operations&lt;/strong&gt; without blocking the conversation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable proactive polling&lt;/strong&gt; from host applications to show memory operation status in the UI&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's what it might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Start a complex memory query&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;callTool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;memorygraph:recall_memories&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;authentication solutions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxDepth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Deep relationship traversal&lt;/span&gt;
    &lt;span class="na"&gt;includeRelated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;createTask&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Agent continues with other tasks...&lt;/span&gt;

&lt;span class="c1"&gt;// Host application polls and shows progress&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTaskStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// UI shows: "Searching memories... (traversed 450 nodes)"&lt;/span&gt;

&lt;span class="c1"&gt;// Retrieve when ready&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;memories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTaskResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Timeline &amp;amp; Implementation
&lt;/h2&gt;

&lt;p&gt;The specification is &lt;strong&gt;already accepted&lt;/strong&gt; and targeted for the &lt;strong&gt;DRAFT-2025-11-25&lt;/strong&gt; milestone. The full spec text is available in &lt;a href="https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1732" rel="noopener noreferrer"&gt;PR #1732&lt;/a&gt;, and SDK updates are in progress.&lt;/p&gt;

&lt;p&gt;MemoryGraph will add task support once the official SDKs land. I'm planning to start with:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Semantic search operations&lt;/strong&gt; (initial implementation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex graph traversals&lt;/strong&gt; with relationship depth &amp;gt; 2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch imports&lt;/strong&gt; for large memory sets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background memory curation&lt;/strong&gt; operations&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Future Possibilities
&lt;/h2&gt;

&lt;p&gt;The task primitive is designed to be extensible. Future enhancements being discussed include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Push notifications&lt;/strong&gt; for state changes (no polling needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intermediate results&lt;/strong&gt; (stream partial outputs as they're available)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nested tasks&lt;/strong&gt; (hierarchical workflows with parent/child relationships)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These would enable even more sophisticated patterns, like a memory query that spawns subtasks for different relationship types, or real-time streaming of search results as they're found.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Tasks aren't just a nice-to-have feature. They're a fundamental building block that unlocks entire categories of MCP applications that simply weren't feasible before.&lt;/p&gt;

&lt;p&gt;You can now build MCP servers that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrap existing workflow APIs cleanly&lt;/li&gt;
&lt;li&gt;Handle genuinely long-running operations (minutes to hours)&lt;/li&gt;
&lt;li&gt;Support sophisticated multi-step processes&lt;/li&gt;
&lt;li&gt;Enable true agent concurrency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And you can do it with a &lt;strong&gt;standard, well-defined protocol pattern&lt;/strong&gt; instead of ad-hoc conventions that every server implements differently.&lt;/p&gt;

&lt;p&gt;For MemoryGraph specifically, this means more sophisticated memory operations without blocking agents, better user experience in host applications, and the ability to handle much larger memory graphs efficiently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Involved
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Follow the specification&lt;/strong&gt;: &lt;a href="https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1686" rel="noopener noreferrer"&gt;SEP-1686 on GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Try MemoryGraph&lt;/strong&gt;: &lt;a href="https://github.com/gregorydickson/memorygraph" rel="noopener noreferrer"&gt;github.com/gregorydickson/memorygraph&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The future of AI tooling is async. And it's arriving in MCP.
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Gregory Dickson is a Senior AI Developer &amp;amp; Solutions Architect specializing in AI/ML development and cloud architecture. He's the creator of MemoryGraph, an open-source MCP memory server using graph-based relationship tracking.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
