<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matteo Tuzi</title>
    <description>The latest articles on DEV Community by Matteo Tuzi (@matteo_tuzi_db01db7df0671).</description>
    <link>https://dev.to/matteo_tuzi_db01db7df0671</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3700758%2Fc8c354d1-8bdf-459e-8629-0ec9059512c1.png</url>
      <title>DEV Community: Matteo Tuzi</title>
      <link>https://dev.to/matteo_tuzi_db01db7df0671</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/matteo_tuzi_db01db7df0671"/>
    <language>en</language>
    <item>
      <title>Beyond RAG: Building Intelligent Memory Systems for AI Agents</title>
      <dc:creator>Matteo Tuzi</dc:creator>
      <pubDate>Thu, 08 Jan 2026 17:53:25 +0000</pubDate>
      <link>https://dev.to/matteo_tuzi_db01db7df0671/beyond-rag-building-intelligent-memory-systems-for-ai-agents-3kah</link>
      <guid>https://dev.to/matteo_tuzi_db01db7df0671/beyond-rag-building-intelligent-memory-systems-for-ai-agents-3kah</guid>
      <description>&lt;p&gt;Vector search alone isn't memory. Real AI memory needs structured extraction, multi-strategy retrieval, and separation of concerns. Here's how I built it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem Nobody Talks About&lt;/strong&gt;&lt;br&gt;
You've built a RAG system. Congratulations! You can now retrieve documents based on semantic similarity. But here's the uncomfortable truth:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vector search ≠ Memory&lt;/strong&gt;&lt;br&gt;
When a user asks "What did I order last week?", your RAG system needs to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Understand that "last week" is a &lt;em&gt;temporal filter&lt;/em&gt;, not a search term&lt;/li&gt;
&lt;li&gt;Know that "orders" live in a specific &lt;strong&gt;logical bucket&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Recognize the intent is &lt;em&gt;direct lookup&lt;/em&gt;, not semantic search&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A simple &lt;code&gt;cosine_similarity(query_embedding, document_embeddings)&lt;/code&gt; won't cut it.&lt;/p&gt;
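&lt;p&gt;To make the first point concrete, here is a minimal sketch of treating "last week" as a date filter rather than as text to embed. It assumes "last week" means the previous Monday-to-Sunday calendar week; &lt;code&gt;last_week_range&lt;/code&gt;, &lt;code&gt;orders_last_week&lt;/code&gt; and the sample &lt;code&gt;orders&lt;/code&gt; list are hypothetical names made up for illustration, not part of any SDK.&lt;/p&gt;

```python
from datetime import date, timedelta


def last_week_range(today):
    """Return (monday, sunday) of the calendar week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    start = this_monday - timedelta(days=7)
    end = this_monday - timedelta(days=1)
    return start, end


# Hypothetical stored orders; a real system would read these from its store.
orders = [
    {"order_id": "ORD-1", "date": date(2025, 1, 6)},
    {"order_id": "ORD-2", "date": date(2025, 1, 14)},
]


def orders_last_week(today):
    """Filter orders by date range; no embedding or similarity score involved."""
    start, end = last_week_range(today)
    return [o for o in orders if o["date"] >= start and end >= o["date"]]
```

&lt;p&gt;The point of the sketch is that nothing here touches an embedding: the date has to be resolved and applied as a filter before (or instead of) any vector search.&lt;/p&gt;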

&lt;p&gt;&lt;strong&gt;The Three Paths&lt;/strong&gt;&lt;br&gt;
When building memory for AI agents, developers typically face three choices. We didn't like either of the first two.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The "Do It Yourself" RAG&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;The Promise:&lt;/strong&gt; Full control over every component.&lt;br&gt;
&lt;strong&gt;The Reality:&lt;/strong&gt; Weeks of infrastructure work. You're building ingestion pipelines, vector stores, and retrieval logic from scratch. You end up maintaining glue code instead of building your product.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Black-box Memory APIs&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;The Promise:&lt;/strong&gt; Quick start, "just works".&lt;br&gt;
&lt;strong&gt;The Reality:&lt;/strong&gt; Zero control over the schema. You dump text in, you get text out. You can't define structured fields or custom extraction logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The Middle Ground: memorymodel.dev&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;The Approach:&lt;/strong&gt; You define the &lt;strong&gt;schema&lt;/strong&gt; (the "Memory Nodes") and the &lt;strong&gt;intent&lt;/strong&gt;; memorymodel.dev handles the &lt;strong&gt;infrastructure&lt;/strong&gt; (embedding, storage, retrieval strategies).&lt;br&gt;
&lt;strong&gt;Result:&lt;/strong&gt; The flexibility of DIY with the speed of a managed service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Code Reality&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;DIY RAG Setup&lt;/strong&gt; (simplified — real implementations are worse):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Vector store setup
&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pinecone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PINECONE_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memories&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dimension&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Embedding pipeline
&lt;/span&gt;&lt;span class="n"&gt;embedder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Extraction logic (you write this)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_fields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extract order_id, date, items from: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Ingestion (you write this)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ingest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_fields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

&lt;span class="c1"&gt;# Retrieval with temporal logic (you write this)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# TODO: Parse dates, detect intent, handle entity lookups...
&lt;/span&gt;    &lt;span class="k"&gt;pass&lt;/span&gt;  &lt;span class="c1"&gt;# Good luck.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;memorymodel.dev&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;memory_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MemoryClient&lt;/span&gt;

&lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cluster_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-cluster&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Customer ordered 3 units of SKU-789 on 2025-01-15&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orders from last week&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Just works.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Architecture&lt;/strong&gt;&lt;br&gt;
Here's what's actually needed for production-grade AI memory:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1w85odggw9nt7atpecos.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1w85odggw9nt7atpecos.png" alt="Architecture Graph" width="800" height="819"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's break down the key components.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Memory Nodes: Beyond Flat Vector Stores&lt;/strong&gt;&lt;br&gt;
The core abstraction is the &lt;strong&gt;Memory Node&lt;/strong&gt; - a logical classification with its own extraction schema.&lt;/p&gt;

&lt;p&gt;Instead of dumping everything into one vector collection, you define nodes like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UserProfile
├── extraction_prompt: "Extract user_id, plan, balance..."
├── embedding_template: "User {{user_id}} on {{plan}} plan"
└── fields: [user_id, plan, balance, activity]

OrderHistory  
├── extraction_prompt: "Extract order_id, items, total..."
├── embedding_template: "Order {{order_id}}: {{items}}"
└── fields: [order_id, items, total, date]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each node has its own &lt;strong&gt;LLM extraction prompt&lt;/strong&gt; - structured data, not raw text&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;embedding template&lt;/strong&gt; controls &lt;strong&gt;what gets vectorized&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Retrieval can target specific nodes or let the system decide&lt;/li&gt;
&lt;/ul&gt;
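&lt;p&gt;If it helps to see the idea as code, here is a rough sketch of a node and of how an embedding template could be filled in. The &lt;code&gt;MemoryNode&lt;/code&gt; dataclass and its &lt;code&gt;render_embedding_text&lt;/code&gt; method are assumptions made for illustration; they are not the actual memorymodel.dev data model.&lt;/p&gt;

```python
import re
from dataclasses import dataclass


@dataclass
class MemoryNode:
    """Illustrative stand-in for a node definition (not the real SDK type)."""
    name: str
    extraction_prompt: str
    embedding_template: str
    fields: list

    def render_embedding_text(self, record):
        """Fill {{field}} placeholders with values from an extracted record."""
        def fill(match):
            return str(record.get(match.group(1), ""))
        return re.sub(r"\{\{(\w+)\}\}", fill, self.embedding_template)


order_history = MemoryNode(
    name="OrderHistory",
    extraction_prompt="Extract order_id, items, total...",
    embedding_template="Order {{order_id}}: {{items}}",
    fields=["order_id", "items", "total", "date"],
)

# Only the rendered template text would be vectorized, not the raw input.
text = order_history.render_embedding_text(
    {"order_id": "ORD-12345", "items": "3x SKU-789", "total": 42.0}
)
```

&lt;p&gt;Note that &lt;code&gt;total&lt;/code&gt; is extracted but never reaches the rendered text: the template, not the raw record, decides what ends up in the vector.&lt;/p&gt;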

&lt;p&gt;&lt;strong&gt;2. The Relevance Router: Intent-Aware Retrieval&lt;/strong&gt;&lt;br&gt;
When a query comes in, we don't just embed and search. First, the &lt;strong&gt;Relevance Router&lt;/strong&gt; determines which Memory Nodes are semantically relevant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fckb179240aap7ni95da4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fckb179240aap7ni95da4.png" alt="Relevance Graph" width="800" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The router returns relevance scores (0-1) for each available node. Only high-scoring nodes are queried, reducing noise and improving precision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; This is &lt;strong&gt;zero-shot routing&lt;/strong&gt; - no training required, just node names that semantically describe their content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance note:&lt;/strong&gt; The router uses a fast model (Gemini Flash) with aggressive in-memory caching. Repeated or similar queries hit the cache, keeping latency low.&lt;/p&gt;
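&lt;p&gt;Here is a toy sketch of the routing shape: score every node, keep the ones above a threshold, and cache the scores. Everything in it is an assumption for illustration. The keyword overlap in &lt;code&gt;score_nodes&lt;/code&gt; stands in for the model call, &lt;code&gt;NODE_HINTS&lt;/code&gt; and the &lt;code&gt;0.3&lt;/code&gt; threshold are invented, and &lt;code&gt;lru_cache&lt;/code&gt; only catches exact repeats, not similar queries.&lt;/p&gt;

```python
from functools import lru_cache

NODE_NAMES = ("UserProfile", "OrderHistory", "AppKnowledge")

# Hypothetical keyword hints standing in for a model's judgement.
NODE_HINTS = {
    "UserProfile": {"plan", "balance", "profile"},
    "OrderHistory": {"order", "orders", "ordered"},
    "AppKnowledge": {"policy", "refund", "how"},
}


@lru_cache(maxsize=1024)
def score_nodes(query):
    """Return ((node, score), ...) with scores in [0, 1], cached per query string."""
    words = set(query.lower().split())
    scores = []
    for node in NODE_NAMES:
        hints = NODE_HINTS[node]
        scores.append((node, len(words.intersection(hints)) / len(hints)))
    return tuple(scores)


def route(query, threshold=0.3):
    """Keep only the nodes whose relevance score reaches the threshold."""
    return [node for node, score in score_nodes(query) if score >= threshold]
```

&lt;p&gt;Calling &lt;code&gt;route&lt;/code&gt; twice with the same string scores the nodes only once; the second call is served from the cache.&lt;/p&gt;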

&lt;p&gt;&lt;strong&gt;3. Multi-Strategy Retrieval&lt;/strong&gt;&lt;br&gt;
Here's where it gets interesting. Memorymodel.dev doesn't rely on a single retrieval method. The system detects &lt;strong&gt;query intent&lt;/strong&gt; and selects the appropriate strategy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Intent:&lt;/strong&gt; "Tell me about the refund policy" uses standard &lt;strong&gt;vector similarity search&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct Lookup:&lt;/strong&gt; "Show order #12345" uses an &lt;strong&gt;exact match&lt;/strong&gt; on the &lt;code&gt;order_id&lt;/code&gt; field.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporal Query:&lt;/strong&gt; "What happened last week?" uses &lt;strong&gt;date range filters&lt;/strong&gt; combined with vector search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Entity Anchor:&lt;/strong&gt; "Everything about Company X" uses &lt;strong&gt;entity filtering&lt;/strong&gt; plus expansion.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Search:&lt;/strong&gt; an image input uses &lt;strong&gt;multimodal embedding&lt;/strong&gt; search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;Resonator&lt;/strong&gt; orchestrates these strategies:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gfs18pys0zeyww5trdj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gfs18pys0zeyww5trdj.png" alt="Resonator Graph" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;
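&lt;p&gt;One way to picture that orchestration is a plain dispatch table from intent type to strategy function. This is only a sketch under assumptions: it takes an already-detected intent dictionary such as &lt;code&gt;{"type": "direct", "key": "order_id", "value": "ORD-12345"}&lt;/code&gt;, runs against a small in-memory list, and every name in it (&lt;code&gt;resonate&lt;/code&gt;, &lt;code&gt;STRATEGIES&lt;/code&gt;, the sample &lt;code&gt;MEMORIES&lt;/code&gt;) is hypothetical rather than the real Resonator.&lt;/p&gt;

```python
from datetime import date

# Hypothetical in-memory records standing in for stored memories.
MEMORIES = [
    {"order_id": "ORD-12345", "happened_at": date(2025, 1, 10), "entity_anchors": ["Acme Corp"]},
    {"order_id": "ORD-67890", "happened_at": date(2025, 1, 20), "entity_anchors": ["Globex"]},
]


def direct_lookup(intent, memories):
    """Exact match on a single field."""
    return [m for m in memories if m.get(intent["key"]) == intent["value"]]


def temporal_filter(intent, memories):
    """Keep memories dated on or before the `lte` bound."""
    cutoff = date.fromisoformat(intent["range"]["lte"])
    return [m for m in memories if cutoff >= m["happened_at"]]


def entity_anchor(intent, memories):
    """Keep memories tagged with the requested entity."""
    return [m for m in memories if intent["entity"] in m["entity_anchors"]]


STRATEGIES = {"direct": direct_lookup, "temporal": temporal_filter, "anchor": entity_anchor}


def resonate(intent, memories):
    """Pick the strategy for the detected intent type and run it."""
    strategy = STRATEGIES.get(intent["type"])
    if strategy is None:
        raise ValueError(f"no strategy for intent type {intent['type']!r}")
    return strategy(intent, memories)
```

&lt;p&gt;Semantic and visual search are left out on purpose, since they need a real embedding model; in this shape they would simply be two more entries in the table.&lt;/p&gt;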

&lt;p&gt;&lt;strong&gt;Intent Detection Examples&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// "Show me order ORD-12345" → Direct Lookup&lt;/span&gt;
&lt;span class="nf"&gt;detectIntent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Show me order ORD-12345&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;// → { type: 'direct', key: 'order_id', value: 'ORD-12345' }&lt;/span&gt;

&lt;span class="c1"&gt;// "What happened before January 15th?" → Temporal&lt;/span&gt;
&lt;span class="nf"&gt;detectIntent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What happened before January 15th?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="c1"&gt;// → { type: 'temporal', range: { lte: '2025-01-15' } }&lt;/span&gt;

&lt;span class="c1"&gt;// "Tell me about Acme Corp" → Entity Anchor&lt;/span&gt;
&lt;span class="nf"&gt;detectIntent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Tell me about Acme Corp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;// → { type: 'anchor', entity: 'Acme Corp' }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
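&lt;p&gt;For a feel of what a detector with that output shape might look like, here is a deliberately naive, regex-only Python sketch. It is not how memorymodel.dev detects intent. The patterns, the &lt;code&gt;default_year&lt;/code&gt; parameter (the query itself carries no year) and the &lt;code&gt;semantic&lt;/code&gt; fallback are all assumptions, and it mirrors the example above in mapping "before" to an inclusive &lt;code&gt;lte&lt;/code&gt; bound.&lt;/p&gt;

```python
import re

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

ORDER_ID = re.compile(r"\b(ORD-\d+)\b")
BEFORE_DATE = re.compile(r"before (\w+) (\d{1,2})(?:st|nd|rd|th)?", re.IGNORECASE)
ABOUT_ENTITY = re.compile(r"about ((?:[A-Z]\w+ ?)+)")


def detect_intent(query, default_year=2025):
    """Classify a query as direct, temporal, anchor, or (fallback) semantic."""
    match = ORDER_ID.search(query)
    if match:
        return {"type": "direct", "key": "order_id", "value": match.group(1)}
    match = BEFORE_DATE.search(query)
    if match and match.group(1).lower() in MONTHS:
        month = MONTHS[match.group(1).lower()]
        day = int(match.group(2))
        bound = f"{default_year}-{month:02d}-{day:02d}"
        return {"type": "temporal", "range": {"lte": bound}}
    match = ABOUT_ENTITY.search(query)
    if match:
        return {"type": "anchor", "entity": match.group(1).strip()}
    # Nothing structured found: fall back to plain vector search.
    return {"type": "semantic"}
```

&lt;p&gt;Hand-written rules like these break quickly on real phrasing, which is the argument for letting a model do the detection.&lt;/p&gt;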



&lt;p&gt;&lt;strong&gt;4. Automatic Field Injection&lt;/strong&gt;&lt;br&gt;
When you ingest data, the Extraction Engine doesn't just run your custom prompt. It &lt;strong&gt;automatically injects system fields&lt;/strong&gt; that power advanced retrieval:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;entity_anchors[]&lt;/code&gt;: Business entities, structural references&lt;/li&gt;
&lt;li&gt;&lt;code&gt;happened_at&lt;/code&gt;: Temporal context (resolved from relative dates)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;context_ref_id&lt;/code&gt;: Links to parent documents&lt;/li&gt;
&lt;/ul&gt;
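&lt;p&gt;As a rough illustration of what injecting these fields means, here is a small sketch that adds all three to an already-extracted record. The logic is an assumption, not the Extraction Engine: it only understands "today" and "yesterday", guesses entities as runs of two or more capitalised words, and &lt;code&gt;inject_system_fields&lt;/code&gt; and &lt;code&gt;parent_id&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import re
from datetime import date, timedelta

RELATIVE_DAYS = {"today": 0, "yesterday": 1}


def inject_system_fields(fields, text, ingested_on, parent_id=None):
    """Return a copy of `fields` with the three system fields added."""
    enriched = dict(fields)
    # Resolve a relative date word against the ingestion date.
    happened_at = ingested_on
    for word, days_back in RELATIVE_DAYS.items():
        if re.search(rf"\b{word}\b", text, re.IGNORECASE):
            happened_at = ingested_on - timedelta(days=days_back)
            break
    enriched["happened_at"] = happened_at.isoformat()
    # Naive entity guess: runs of two or more capitalised words.
    enriched["entity_anchors"] = re.findall(r"[A-Z]\w+(?: [A-Z]\w+)+", text)
    enriched["context_ref_id"] = parent_id
    return enriched


record = inject_system_fields(
    {"order_id": "ORD-1"},
    "Yesterday the customer called Acme Corp about a refund",
    ingested_on=date(2025, 1, 15),
    parent_id="doc-7",
)
```

&lt;p&gt;The user-defined field (&lt;code&gt;order_id&lt;/code&gt;) passes through untouched; the system fields are layered on top of it.&lt;/p&gt;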

&lt;p&gt;You define what you need. &lt;strong&gt;Memorymodel.dev&lt;/strong&gt; adds what retrieval needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. The Developer Experience&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Console: Visual Schema Design&lt;/strong&gt;&lt;br&gt;
Configure your memory architecture visually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create &lt;strong&gt;Projects&lt;/strong&gt; and &lt;strong&gt;Clusters&lt;/strong&gt; (logical environments)&lt;/li&gt;
&lt;li&gt;Define &lt;strong&gt;Memory Nodes&lt;/strong&gt; with extraction schemas&lt;/li&gt;
&lt;li&gt;Configure which nodes are active for ingestion vs retrieval&lt;/li&gt;
&lt;li&gt;Monitor memory usage and analytics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;SDK: 4 Lines to Production&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;MemoryClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;memory-model&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;MemoryClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;your-key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;clusterId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;your-cluster&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; 
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Ingest&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Customer ordered 3 units of SKU-789 on 2025-01-15&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Retrieve  &lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;recent orders&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Python SDK also available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;memory_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MemoryClient&lt;/span&gt;

&lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cluster_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-cluster&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Customer ordered 3 units of SKU-789&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recent orders&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real-World Pattern: Asymmetric Clusters&lt;/strong&gt;&lt;br&gt;
Here's a powerful pattern I've seen in production: &lt;strong&gt;the same Memory Node can behave differently across clusters&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Case: Customer Care + Sales Intelligence&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnkjtjevma7dtrwq33zi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnkjtjevma7dtrwq33zi.png" alt="Customer Care Sales" width="800" height="310"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's happening:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Customer Care agent ingests conversations into all three nodes&lt;/li&gt;
&lt;li&gt;It only &lt;em&gt;retrieves&lt;/em&gt; from &lt;code&gt;AppKnowledge&lt;/code&gt; and &lt;code&gt;UserProfile&lt;/code&gt; to answer questions&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;SalesInsight&lt;/code&gt; node is &lt;strong&gt;ingestion-only&lt;/strong&gt; in this cluster&lt;/li&gt;
&lt;li&gt;A separate Sales Ops cluster has &lt;code&gt;SalesInsight&lt;/code&gt; as &lt;strong&gt;retrieval-only&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
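&lt;p&gt;Written down as data, the pattern could look something like this. The shape is hypothetical (in practice this is configured in the Console, and the real settings may differ), and it assumes the Sales Ops cluster only reads from &lt;code&gt;SalesInsight&lt;/code&gt;; the cluster names and the &lt;code&gt;active_nodes&lt;/code&gt; helper are invented for the example.&lt;/p&gt;

```python
# Hypothetical per-cluster node settings; not an actual memorymodel.dev config format.
CLUSTERS = {
    "customer-care": {
        "AppKnowledge": {"ingest": True, "retrieve": True},
        "UserProfile": {"ingest": True, "retrieve": True},
        "SalesInsight": {"ingest": True, "retrieve": False},
    },
    "sales-ops": {
        "SalesInsight": {"ingest": False, "retrieve": True},
    },
}


def active_nodes(cluster, mode):
    """List node names enabled for `mode` ("ingest" or "retrieve") in a cluster."""
    return [name for name, modes in CLUSTERS[cluster].items() if modes[mode]]
```

&lt;p&gt;The same node name appears in both clusters with opposite switches, which is all "asymmetric" means here.&lt;/p&gt;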

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Customer support automatically generates sales leads without any additional code. The sales team sees a real-time dashboard of opportunities with structured fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"target_user_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user001"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"implied_need"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"High withdrawal limits for travel"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"life_event_trigger"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Traveling to Japan"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"suggested_product"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Metal Plan"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"conversion_probability"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"High"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What Runs Behind the Scenes&lt;/strong&gt;&lt;br&gt;
This is where memorymodel.dev stops being "just a managed RAG" and becomes an &lt;strong&gt;autonomous memory system&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Architect: Self-Tuning Retrieval&lt;/strong&gt;&lt;br&gt;
Every 24 hours, the Architect analyzes your retrieval logs and &lt;strong&gt;automatically adjusts system parameters&lt;/strong&gt; using control-theory principles (PID-like damping):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;meta_threshold&lt;/code&gt;: How aggressively to filter low-relevance results&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cluster_margin&lt;/code&gt;: Tolerance for centroid-based matching&lt;/li&gt;
&lt;li&gt;&lt;code&gt;top_k&lt;/code&gt; limits: How many results to return per strategy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It adapts based on your actual usage patterns. High-precision workload? It tightens thresholds. Recall-heavy queries? It loosens them. &lt;strong&gt;You don't configure this. It learns.&lt;/strong&gt;&lt;/p&gt;
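&lt;p&gt;To give a sense of what a damped update can look like, here is a minimal sketch of one tuning step for a relevance threshold. It is a simplification under stated assumptions, not the Architect's algorithm: it uses only a proportional term (no integral or derivative), and &lt;code&gt;tune_threshold&lt;/code&gt;, the precision target, the gain and the step cap are all invented values.&lt;/p&gt;

```python
def tune_threshold(current, observed_precision, target_precision=0.9,
                   gain=0.5, max_step=0.05, bounds=(0.1, 0.95)):
    """One damped proportional step: raise the threshold when precision is too low."""
    error = target_precision - observed_precision
    # Damping: cap how far a single daily update may move the parameter.
    step = max(-max_step, min(max_step, gain * error))
    low, high = bounds
    # Keep the result inside a sane operating range.
    return max(low, min(high, current + step))
```

&lt;p&gt;The step cap is what keeps one noisy day of logs from swinging the threshold; the parameter drifts toward the target over several runs instead of jumping.&lt;/p&gt;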

&lt;p&gt;&lt;strong&gt;The Dreamer: Meta-Insight Synthesis&lt;/strong&gt;&lt;br&gt;
The Dreamer periodically scans recent memories and &lt;strong&gt;generates higher-order insights&lt;/strong&gt; that weren't explicitly stated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input memories:
- "Bought organic vegetables at Whole Foods"
- "Searched for plant-based protein recipes"  
- "Cancelled steakhouse reservation"

Generated insight:
→ "User is likely transitioning to a vegetarian lifestyle"
  (confidence: 0.85)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These synthesized insights become searchable memories themselves, enabling queries like &lt;strong&gt;"What are my behavioral patterns?"&lt;/strong&gt; to return meaningful results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Maintenance Workers&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Deduplication:&lt;/strong&gt; Detects semantically similar memories and merges them to avoid redundancy.&lt;br&gt;
&lt;strong&gt;Consolidation:&lt;/strong&gt; Compacts old memories into summaries to prevent database bloat.&lt;br&gt;
&lt;strong&gt;Cleanup:&lt;/strong&gt; Prunes stale or low-confidence data to maintain high quality.&lt;br&gt;
&lt;strong&gt;Centroid Calculation:&lt;/strong&gt; Pre-computes cluster centroids to enable ultra-fast retrieval.&lt;/p&gt;
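&lt;p&gt;Two of these workers are easy to sketch in plain Python. The version below is illustrative only: it assumes each memory carries an &lt;code&gt;embedding&lt;/code&gt; list, it keeps the first of each near-duplicate group rather than merging them, and the &lt;code&gt;0.95&lt;/code&gt; similarity threshold is an arbitrary choice.&lt;/p&gt;

```python
import math


def cosine(a, b):
    """Cosine similarity of two equally sized vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(memories, threshold=0.95):
    """Keep the first memory of every group whose embeddings are near-identical."""
    kept = []
    for memory in memories:
        if all(threshold > cosine(memory["embedding"], k["embedding"]) for k in kept):
            kept.append(memory)
    return kept


def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]
```

&lt;p&gt;This pairwise comparison is quadratic in the number of memories, so a real worker would lean on an approximate nearest-neighbour index instead of a double loop.&lt;/p&gt;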

&lt;p&gt;&lt;strong&gt;The result:&lt;/strong&gt; A memory system that doesn't just store — it &lt;strong&gt;evolves&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to Use This&lt;/strong&gt;&lt;br&gt;
✅ &lt;strong&gt;AI Agents&lt;/strong&gt; that need persistent, structured memory across conversations&lt;br&gt;
✅ &lt;strong&gt;Document Intelligence&lt;/strong&gt; — contracts, manuals, knowledge bases&lt;br&gt;
✅ &lt;strong&gt;Customer Support Bots&lt;/strong&gt; with user context and history&lt;br&gt;
✅ &lt;strong&gt;Personal Assistants&lt;/strong&gt; that remember preferences and events &lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;Simple Q&amp;amp;A&lt;/strong&gt; over static documents (vanilla RAG is fine)&lt;br&gt;
❌ &lt;strong&gt;Real-time streaming&lt;/strong&gt; use cases (we're optimized for persistence)&lt;/p&gt;

&lt;p&gt;💡 &lt;strong&gt;Memorymodel.dev approach:&lt;/strong&gt; I optimize for &lt;strong&gt;retrieval quality&lt;/strong&gt; and &lt;strong&gt;developer experience&lt;/strong&gt; over raw throughput. If you need sub-100ms ingestion for IoT streams, use a time-series DB. If you need AI agents that actually remember context correctly, I've got you.&lt;/p&gt;

&lt;p&gt;📊 See memorymodel's &lt;a href="https://docs.memorymodel.dev/benchmarks" rel="noopener noreferrer"&gt;LoCoMo benchmark results&lt;/a&gt; for a detailed accuracy comparison with other memory systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Getting Started&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Sign up at &lt;a href="https://memorymodel.dev" rel="noopener noreferrer"&gt;memorymodel.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Create a Project and Cluster via the Console&lt;/li&gt;
&lt;li&gt;Define your first Memory Node&lt;/li&gt;
&lt;li&gt;Install the SDK: &lt;code&gt;npm install memory-model&lt;/code&gt; or &lt;code&gt;pip install memory-model&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Start ingesting and retrieving&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Building AI memory isn't about finding the nearest vector. It's about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured extraction&lt;/strong&gt; — LLM-powered field parsing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intent-aware retrieval&lt;/strong&gt; — knowing &lt;strong&gt;how&lt;/strong&gt; to search, not just &lt;strong&gt;what&lt;/strong&gt; to search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema control&lt;/strong&gt; — you define the shape of your memories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Managed complexity&lt;/strong&gt; — workers, deduplication, and optimization handled for you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;I built memorymodel.dev because I needed this myself. Now it's available for everyone.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>memory</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
