<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jeff Witters</title>
    <description>The latest articles on DEV Community by Jeff Witters (@jeffwitters).</description>
    <link>https://dev.to/jeffwitters</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818487%2Fc00fb410-4ea8-4b63-bf99-c0144f6d17e1.jpeg</url>
      <title>DEV Community: Jeff Witters</title>
      <link>https://dev.to/jeffwitters</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jeffwitters"/>
    <language>en</language>
    <item>
      <title>Give Your AI Agent Persistent Memory in 30 Seconds</title>
      <dc:creator>Jeff Witters</dc:creator>
      <pubDate>Thu, 12 Mar 2026 12:04:39 +0000</pubDate>
      <link>https://dev.to/jeffwitters/give-your-ai-agent-persistent-memory-in-30-seconds-29c6</link>
      <guid>https://dev.to/jeffwitters/give-your-ai-agent-persistent-memory-in-30-seconds-29c6</guid>
      <description>&lt;p&gt;Your AI agent is brilliant in the moment. Then the session ends, and it forgets everything.&lt;/p&gt;

&lt;p&gt;Every new conversation starts from zero. It doesn't remember that you prefer TypeScript. It doesn't know the architectural decision you made last week. It doesn't know it already tried that approach and it didn't work.&lt;/p&gt;

&lt;p&gt;This is the agent memory problem. Most solutions involve vector databases, API keys, and cloud infrastructure. &lt;strong&gt;engram-mcp&lt;/strong&gt; doesn't.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @cartisien/engram-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Persistent semantic memory for Claude Desktop, Cursor, Windsurf, or any MCP client — in 30 seconds, no signup, no cloud.&lt;/p&gt;




&lt;h2&gt;The Problem with Agent Memory Today&lt;/h2&gt;

&lt;p&gt;The common approach: dump everything into a vector store. Every message, every fact, every decision — stored with equal confidence, recalled with equal weight.&lt;/p&gt;

&lt;p&gt;The result after a few weeks: contradictory facts at similar confidence scores. The agent remembers both "user prefers dark mode" and "user prefers light mode" and doesn't know which is current. It remembers five different attempts at the same problem with no signal about which one worked.&lt;/p&gt;

&lt;p&gt;More memory doesn't automatically mean better memory.&lt;/p&gt;




&lt;h2&gt;How engram-mcp Works&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Storage:&lt;/strong&gt; SQLite. No server to run, no port to expose, no Docker container. The database lives at &lt;code&gt;~/.engram/memory.db&lt;/code&gt; by default. It's a file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantic search:&lt;/strong&gt; Uses &lt;a href="https://ollama.ai" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; + &lt;code&gt;nomic-embed-text&lt;/code&gt; locally. Embeddings are computed on your machine. No API key, no data leaving your box.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fallback:&lt;/strong&gt; If Ollama isn't running, it falls back to keyword search automatically. You never get a crash — you get a slightly less smart search.&lt;/p&gt;
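
&lt;p&gt;Conceptually that's a try/catch around the embedding call. A sketch, not engram-mcp's actual source; the endpoint and payload are Ollama's standard embeddings API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Try to embed locally; return null when Ollama isn't reachable so the
// caller can drop to keyword search instead of crashing.
async function embedOrNull(text: string): Promise&amp;lt;number[] | null&amp;gt; {
  try {
    const res = await fetch('http://localhost:11434/api/embeddings', {
      method: 'POST',
      body: JSON.stringify({ model: 'nomic-embed-text', prompt: text }),
    });
    if (!res.ok) return null;
    const { embedding } = await res.json();
    return embedding; // 768-dim float array
  } catch {
    return null; // Ollama down: fall back to keyword search
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;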

&lt;p&gt;&lt;strong&gt;Sessions:&lt;/strong&gt; Memories are scoped by &lt;code&gt;sessionId&lt;/code&gt;. Your Claude Desktop agent, your Cursor agent, and your personal automation scripts can each have their own isolated memory space — or share one.&lt;/p&gt;
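
&lt;p&gt;In practice that's just a &lt;code&gt;sessionId&lt;/code&gt; argument on every call. A sketch in the same style as the recall example below; argument names other than &lt;code&gt;sessionId&lt;/code&gt; and &lt;code&gt;query&lt;/code&gt; are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;remember(sessionId="cursor-work", content="User prefers strict TypeScript")
remember(sessionId="home-automation", content="Lights off at 23:30")

recall(sessionId="cursor-work", query="coding style preferences")
→ searches only the "cursor-work" memory space
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;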




&lt;h2&gt;Setup: Claude Desktop&lt;/h2&gt;

&lt;p&gt;Add to &lt;code&gt;~/Library/Application Support/Claude/claude_desktop_config.json&lt;/code&gt; (macOS; on Windows, &lt;code&gt;%APPDATA%\Claude\claude_desktop_config.json&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"engram"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@cartisien/engram-mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart Claude Desktop. You now have 5 new tools:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;remember&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Store a memory with automatic embedding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;recall&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Semantic search — "what did I say about auth?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;history&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Recent entries in chronological order&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;forget&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Delete a memory, a session, or entries before a date&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stats&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How much is in there, embedding coverage, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
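
&lt;p&gt;Cleanup and introspection look roughly like this (argument names are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;forget(sessionId="scratch")   → delete an entire session's memories
stats()                       → entry counts, embedding coverage
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;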




&lt;h2&gt;Setup: Cursor / Windsurf&lt;/h2&gt;

&lt;p&gt;Same server entry, added to the client's MCP config (in Cursor, &lt;code&gt;~/.cursor/mcp.json&lt;/code&gt;):&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"engram"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@cartisien/engram-mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;What It Looks Like in Practice&lt;/h2&gt;

&lt;p&gt;After a few sessions, your agent builds up real context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;recall(sessionId="myproject", query="why did we choose SQLite over postgres?")

→ "Chose SQLite to avoid infra requirements for local-first tools. 
   Postgres adds a server dependency that breaks the zero-config install story."
   (similarity: 0.91)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent can ask itself questions about its own history and get back coherent, relevant answers — not a flat list of everything it's ever stored.&lt;/p&gt;




&lt;h2&gt;Local-First Is a Real Constraint&lt;/h2&gt;

&lt;p&gt;Most agent memory tools are hosted. That's a fine choice for many teams, but it means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your agent's memory (and by extension, context about your work) lives on someone else's server&lt;/li&gt;
&lt;li&gt;There's a network call on every recall&lt;/li&gt;
&lt;li&gt;There's a subscription or usage cost as memory grows&lt;/li&gt;
&lt;li&gt;There's a new service to keep running&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;engram-mcp stores everything in a SQLite file on your machine. The embedding model runs locally via Ollama. The search happens in-process. There's no external service to maintain, authenticate against, or pay for.&lt;/p&gt;




&lt;h2&gt;Semantic Search Without a Vector Database&lt;/h2&gt;

&lt;p&gt;This is the part people ask about most.&lt;/p&gt;

&lt;p&gt;Traditional approach: run a vector database (Qdrant, Pinecone, Chroma), push embeddings into it, query by cosine similarity. Works great, but requires running and maintaining a separate process.&lt;/p&gt;

&lt;p&gt;Our approach: store embeddings as raw floats in SQLite, compute cosine similarity in the application layer at query time. For personal-scale memory (thousands to tens of thousands of entries), this is fast enough — and it eliminates the dependency.&lt;/p&gt;

&lt;p&gt;Ollama generates the embeddings locally. &lt;code&gt;nomic-embed-text&lt;/code&gt; is small, fast, and good at semantic similarity on natural-language text.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-time setup&lt;/span&gt;
ollama pull nomic-embed-text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, &lt;code&gt;recall&lt;/code&gt; finds semantically similar memories even when the exact words don't match.&lt;/p&gt;




&lt;h2&gt;Install&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Verify it works&lt;/span&gt;
npx &lt;span class="nt"&gt;-y&lt;/span&gt; @cartisien/engram-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.npmjs.com/package/@cartisien/engram-mcp" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Cartisien/engram-mcp" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://glama.ai/mcp/servers/Cartisien/engram-mcp" rel="noopener noreferrer"&gt;Glama listing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built on &lt;a href="https://github.com/Cartisien/engram" rel="noopener noreferrer"&gt;&lt;code&gt;@cartisien/engram&lt;/code&gt;&lt;/a&gt; — the underlying memory SDK if you want to integrate it directly rather than through MCP.&lt;/p&gt;




&lt;p&gt;I'm curious what other approaches people are using for agent memory. The certainty/contradiction problem in particular — most tools I've seen treat all stored facts as equally valid, which compounds badly over time.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>showdev</category>
      <category>agents</category>
      <category>typescript</category>
    </item>
    <item>
      <title>I Gave My AI Assistant Permanent Memory — Here's Exactly How</title>
      <dc:creator>Jeff Witters</dc:creator>
      <pubDate>Wed, 11 Mar 2026 17:47:05 +0000</pubDate>
      <link>https://dev.to/jeffwitters/i-gave-my-ai-assistant-permanent-memory-heres-exactly-how-1g8l</link>
      <guid>https://dev.to/jeffwitters/i-gave-my-ai-assistant-permanent-memory-heres-exactly-how-1g8l</guid>
      <description>&lt;p&gt;My AI assistant woke up every morning with no idea who I was.&lt;/p&gt;

&lt;p&gt;I'd been running the same assistant for months. It knew my stack, my projects, my preferences — but only within a session. The next day? Blank slate. Every conversation started with context-dumping. "Here's what we're building. Here's where we left off. Here's what matters."&lt;/p&gt;

&lt;p&gt;I got tired of it. So I built the thing that fixes it.&lt;/p&gt;




&lt;h2&gt;The Problem With Context Windows&lt;/h2&gt;

&lt;p&gt;Most people solve AI memory by stuffing everything into the system prompt. Project docs, previous decisions, preferences — all of it, every session.&lt;/p&gt;

&lt;p&gt;This works until it doesn't. Context windows have limits. More importantly, not all memory is equal. You don't need to know &lt;em&gt;everything&lt;/em&gt; — you need the &lt;em&gt;right things&lt;/em&gt; at the right time.&lt;/p&gt;

&lt;p&gt;That's a retrieval problem, not a storage problem.&lt;/p&gt;




&lt;h2&gt;What I Built&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;@cartisien/engram&lt;/code&gt; — persistent, queryable memory for AI assistants. SQLite-backed, TypeScript-first, zero config.&lt;/p&gt;

&lt;p&gt;The core API is intentionally simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Engram&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@cartisien/engram&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Engram&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;dbPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./assistant.db&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Store something&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remember&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User is building a federal contracting app in React 19&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Retrieve what's relevant&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;what are we building?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Drop it into any agent loop, any chat handler, any LLM integration.&lt;/p&gt;




&lt;h2&gt;v0.1: Keyword Search&lt;/h2&gt;

&lt;p&gt;The first version was straightforward: a SQLite table, indexes on session + timestamp, and LIKE-based keyword matching on recall.&lt;/p&gt;
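
&lt;p&gt;Roughly this shape (a sketch assuming better-sqlite3, with illustrative column names):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import Database from 'better-sqlite3';

const db = new Database('./assistant.db');

// v0.1-style recall: literal keyword match, newest first.
function recallKeyword(sessionId: string, term: string, limit = 5) {
  return db.prepare(
    `SELECT content, created_at FROM memories
     WHERE session_id = ? AND content LIKE ?
     ORDER BY created_at DESC
     LIMIT ?`
  ).all(sessionId, `%${term}%`, limit);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;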

&lt;p&gt;It worked. But keyword search has the obvious problem — it only finds what you literally asked for. "What are we building?" wouldn't surface a memory stored as "working on GovScout, a federal contracting app."&lt;/p&gt;




&lt;h2&gt;v0.2: Semantic Search via Local Embeddings&lt;/h2&gt;

&lt;p&gt;This week I shipped v0.2 with semantic search. The key decision: &lt;strong&gt;no external API, no managed vector database&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I'm running an RTX 5090 with Ollama locally. &lt;code&gt;nomic-embed-text&lt;/code&gt; is already pulled. So the embedding call is a local HTTP request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:11434/api/embeddings&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;nomic-embed-text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// 768-dim float array&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On &lt;code&gt;remember()&lt;/code&gt;, we embed the content and store the vector as JSON alongside the memory. On &lt;code&gt;recall()&lt;/code&gt;, we embed the query, compute cosine similarity against every stored vector, and return the top-k by score.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nf"&gt;cosineSimilarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;dot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;magA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;magB&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;dot&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nx"&gt;magA&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nx"&gt;magB&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;dot&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;magA&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;magB&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
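
&lt;p&gt;Putting it together, the recall path is roughly this (a sketch: &lt;code&gt;getEmbedding&lt;/code&gt; wraps the Ollama call above, and &lt;code&gt;db&lt;/code&gt; and the column names are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Embed the query once, score every stored vector, return the top k.
async function recall(sessionId: string, query: string, k = 5) {
  const queryVec = await getEmbedding(query);
  const rows = db.prepare(
    'SELECT content, embedding FROM memories WHERE session_id = ?'
  ).all(sessionId) as { content: string; embedding: string }[];
  return rows
    .map(r =&amp;gt; ({ content: r.content, similarity: cosineSimilarity(queryVec, JSON.parse(r.embedding)) }))
    .sort((a, b) =&amp;gt; b.similarity - a.similarity)
    .slice(0, k);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;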



&lt;p&gt;No sqlite-vss extension. No pgvector. No Pinecone. Just math on JSON arrays.&lt;/p&gt;

&lt;p&gt;For the scale Engram targets (one assistant, thousands of memories — not millions), this is plenty fast.&lt;/p&gt;

&lt;p&gt;If Ollama is unreachable, it falls back to keyword search automatically. No crashes, no config required.&lt;/p&gt;




&lt;h2&gt;The Real Test: Does It Actually Work?&lt;/h2&gt;

&lt;p&gt;I'm running Engram as my own assistant's memory store right now. Every significant memory gets posted to a local API server (PM2, port 3470) alongside the markdown files I was already using.&lt;/p&gt;

&lt;p&gt;First semantic query I ran:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:3470/memory/charli?query=what+projects+is+jeff+working+on&amp;amp;limit=5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Jeff is building GovScout, a federal contracting app..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"similarity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.525&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Engram v0.2 ships semantic search via nomic-embed-text..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"similarity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.396&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"What projects is Jeff working on" surfaced the GovScout memory (0.53 similarity) over the Engram memory (0.40). No keyword overlap. Right answer.&lt;/p&gt;




&lt;h2&gt;The Architecture Behind It&lt;/h2&gt;

&lt;p&gt;Engram is part of a larger framework I'm calling the &lt;strong&gt;Cartisien Memory Suite&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;@cartisien/engram&lt;/code&gt; — persistent memory (this)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@cartisien/extensa&lt;/code&gt; — vector infrastructure layer&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@cartisien/cogito&lt;/code&gt; — agent identity and wake/sleep lifecycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The framing comes from Descartes. &lt;em&gt;Res cogitans&lt;/em&gt; (thinking substance) and &lt;em&gt;res extensa&lt;/em&gt; (extended substance) — mind and body. Cogito is the agent's sense of self. Extensa is the vector layer it thinks through. Engram is where experience accumulates.&lt;/p&gt;

&lt;p&gt;The thesis: agents need more than a context window. They need a substrate of self.&lt;/p&gt;




&lt;h2&gt;Install It&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @cartisien/engram
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;v0.2.0 is live on GitHub: &lt;a href="https://github.com/Cartisien/engram" rel="noopener noreferrer"&gt;github.com/Cartisien/engram&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Still testing the semantic search in production before pushing to npm — watching for edge cases, checking Ollama timeout handling, making sure the cosine math holds up at scale.&lt;/p&gt;

&lt;p&gt;If you're building agents and hitting the memory problem, I'd love to know what you're doing about it. The space is wide open.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>agents</category>
      <category>memory</category>
    </item>
    <item>
      <title>Why AI Agents Forget Everything (And How to Fix It)</title>
      <dc:creator>Jeff Witters</dc:creator>
      <pubDate>Wed, 11 Mar 2026 12:53:56 +0000</pubDate>
      <link>https://dev.to/jeffwitters/why-ai-agents-forget-everything-and-how-to-fix-it-3nf1</link>
      <guid>https://dev.to/jeffwitters/why-ai-agents-forget-everything-and-how-to-fix-it-3nf1</guid>
      <description>&lt;p&gt;If you've built anything with AI agents, you've hit this wall.&lt;/p&gt;

&lt;p&gt;Your agent has a great conversation. It learns the user's preferences, picks up context, starts feeling like it actually &lt;em&gt;knows&lt;/em&gt; something. Then the session ends. Next time? Blank slate. It asks the same onboarding questions. It forgot the user hates dark mode. It forgot the decision you made last Tuesday.&lt;/p&gt;

&lt;p&gt;This isn't a bug — it's how LLMs work. But it doesn't have to be how your &lt;em&gt;agent&lt;/em&gt; works.&lt;/p&gt;




&lt;h2&gt;The Problem With "Just Use Context"&lt;/h2&gt;

&lt;p&gt;The first instinct is to dump everything into the context window. Just pass in the conversation history, right?&lt;/p&gt;

&lt;p&gt;This breaks down fast:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context windows are expensive.&lt;/strong&gt; Sending 50k tokens of history with every request adds up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;They have limits.&lt;/strong&gt; Even 200k tokens isn't infinite — and most relevant history is older than that.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More context ≠ better recall.&lt;/strong&gt; LLMs are famously bad at finding the needle in a haystack. Relevant information buried in a long context often gets missed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;They don't persist.&lt;/strong&gt; Context is ephemeral by definition. When the session ends, it's gone.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What you need isn't more context. You need &lt;em&gt;memory&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;Memory vs. Context: What's the Difference?&lt;/h2&gt;

&lt;p&gt;Context is what the model can see right now. Memory is what the agent &lt;em&gt;retains&lt;/em&gt; across sessions.&lt;/p&gt;

&lt;p&gt;Real memory has properties that raw context doesn't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semantic retrieval&lt;/strong&gt; — find related memories by meaning, not just keyword match&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Importance weighting&lt;/strong&gt; — not all information is equally worth remembering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistence&lt;/strong&gt; — survives session resets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent-scoped&lt;/strong&gt; — each agent has its own memory space&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is what we built &lt;code&gt;@cartisien/engram&lt;/code&gt; for.&lt;/p&gt;




&lt;h2&gt;How Engram Works&lt;/h2&gt;

&lt;p&gt;Engram gives your agent a persistent memory store with semantic search. The API is intentionally simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Engram&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@cartisien/engram&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Engram&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;memory&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;agentId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my-agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wake&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;// Store something worth remembering&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;store&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;The user prefers dark mode and works late at night&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;observation&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;importance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Later — semantic search, not keyword search&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user interface preferences&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(({&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toFixed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;wake()&lt;/code&gt; / &lt;code&gt;sleep()&lt;/code&gt; lifecycle mirrors how agents actually work — they come online, do work, and go dormant. Memory initializes on wake and persists on sleep.&lt;/p&gt;




&lt;h2&gt;The &lt;code&gt;importance&lt;/code&gt; Field Actually Matters&lt;/h2&gt;

&lt;p&gt;One thing that separates this from just "storing strings in a database" is the &lt;code&gt;importance&lt;/code&gt; score.&lt;/p&gt;

&lt;p&gt;Not all memories are equal. "User mentioned they like coffee" is less important than "User said they're about to cancel their subscription." When you retrieve memories, importance influences what surfaces first.&lt;/p&gt;

&lt;p&gt;This is closer to how human memory works — emotionally significant or practically important information is retained more reliably than background noise.&lt;/p&gt;
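
&lt;p&gt;The exact blend is an implementation detail, but conceptually retrieval ranks on a weighted combination. Purely illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// One plausible way importance could bias ranking; not necessarily
// the weighting Engram actually uses.
function rank(similarity: number, importance: number): number {
  return 0.8 * similarity + 0.2 * importance
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;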




&lt;h2&gt;Multiple Adapters, Same API&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;memory'&lt;/span&gt;    &lt;span class="s"&gt;// In-process, great for testing&lt;/span&gt;
&lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sqlite'&lt;/span&gt;    &lt;span class="s"&gt;// Local file, no server needed&lt;/span&gt;
&lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;postgres'&lt;/span&gt;  &lt;span class="s"&gt;// Production scale with pgvector&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same &lt;code&gt;Engram&lt;/code&gt; interface regardless of where you're storing. Swap adapters without changing your agent code.&lt;/p&gt;
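
&lt;p&gt;Swapping is a one-line constructor change. A sketch, where &lt;code&gt;dbPath&lt;/code&gt; and &lt;code&gt;connectionString&lt;/code&gt; are illustrative option names:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { Engram } from '@cartisien/engram'

// Local development: a single file, zero infra
const dev = new Engram({ adapter: 'sqlite', agentId: 'my-agent', dbPath: './agent.db' })

// Production: same interface, Postgres + pgvector underneath
const prod = new Engram({
  adapter: 'postgres',
  agentId: 'my-agent',
  connectionString: process.env.DATABASE_URL,
})
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;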




&lt;h2&gt;Where This Fits in the Stack&lt;/h2&gt;

&lt;p&gt;Engram sits in the middle of the Cartisien memory stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cogito  ←→  Engram  ←→  Extensa
identity    memory      vectors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cogito&lt;/strong&gt; handles agent identity and lifecycle. &lt;strong&gt;Extensa&lt;/strong&gt; handles the vector infrastructure and embeddings layer. &lt;strong&gt;Engram&lt;/strong&gt; is the bridge — the part your agent actually talks to.&lt;/p&gt;

&lt;p&gt;You don't need the whole stack. Engram works standalone.&lt;/p&gt;




&lt;h2&gt;Install&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @cartisien/engram
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Docs and source: &lt;a href="https://github.com/cartisien/engram" rel="noopener noreferrer"&gt;github.com/cartisien/engram&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;If you're building agents that need to remember things across sessions, give it a try. And if you're hitting memory architecture questions that aren't covered here — drop them in the comments. This is a problem worth solving properly.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>typescript</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
