<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Suraj Goyal</title>
    <description>The latest articles on DEV Community by Suraj Goyal (@surajkgoyal).</description>
    <link>https://dev.to/surajkgoyal</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3955630%2F26ae49d7-1313-4d98-9621-a456bcd55e3c.jpg</url>
      <title>DEV Community: Suraj Goyal</title>
      <link>https://dev.to/surajkgoyal</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/surajkgoyal"/>
    <language>en</language>
    <item>
      <title>I built an MCP server that gives AI persistent memory of your SQL database</title>
      <dc:creator>Suraj Goyal</dc:creator>
      <pubDate>Thu, 28 May 2026 03:58:56 +0000</pubDate>
      <link>https://dev.to/surajkgoyal/i-built-an-mcp-server-that-gives-ai-persistent-memory-of-your-sql-database-3d2a</link>
      <guid>https://dev.to/surajkgoyal/i-built-an-mcp-server-that-gives-ai-persistent-memory-of-your-sql-database-3d2a</guid>
      <description>&lt;p&gt;A while ago I tried to build a local coding assistant. I downloaded Qwen3, fired it up on my MacBook with 16GB of RAM, and within a day realized the output quality was nowhere close to Claude or GPT-5. The model could &lt;em&gt;fit&lt;/em&gt;. It just couldn't &lt;em&gt;compete&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;So I changed the question.&lt;/p&gt;

&lt;p&gt;If I can't make the model smarter on my hardware, can I make what I feed it smarter?&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the tokens actually go
&lt;/h2&gt;

&lt;p&gt;I started watching where my Claude / Cursor / Copilot sessions actually spent their tokens. The surprise: most of it wasn't reasoning. It was &lt;strong&gt;lookup&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Every fresh chat about my company's database re-discovered the same things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What does &lt;code&gt;status = 3&lt;/code&gt; mean? (cancelled)&lt;/li&gt;
&lt;li&gt;How does &lt;code&gt;orders&lt;/code&gt; join to &lt;code&gt;users&lt;/code&gt;? (&lt;code&gt;orders.user_id → users.id&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;What's that cryptic &lt;code&gt;JobStatus&lt;/code&gt; enum? (a dozen integer codes nobody remembers)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model figured it out, the session ended, and tomorrow it figured it out &lt;em&gt;again&lt;/em&gt;. Same tokens, same latency, every single time. The expensive part of working with an AI wasn't the thinking — it was re-teaching it things it had already learned yesterday.&lt;/p&gt;

&lt;p&gt;There's a lot of attention right now on trimming AI &lt;strong&gt;output&lt;/strong&gt; tokens (talk like a caveman, strip the pleasantries, etc.). But in my workflow the bigger leak was on the &lt;strong&gt;input&lt;/strong&gt; side: paying full token cost every session to re-establish context that never changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Memory" isn't a feature, it's an architecture question
&lt;/h2&gt;

&lt;p&gt;AI clients are starting to bolt on "memory" features. But they're proprietary, opaque, and locked to one tool. Claude's memory doesn't help Cursor. Cursor's doesn't help Copilot. You can't inspect it, you can't share it with a teammate, and you can't diff it.&lt;/p&gt;

&lt;p&gt;What I actually wanted was an &lt;strong&gt;explicit, inspectable, shareable context layer&lt;/strong&gt; that &lt;em&gt;any&lt;/em&gt; AI client could read deterministically — same answer every time, same file my team could hand off.&lt;/p&gt;

&lt;p&gt;I picked the highest re-learn cost in my world to start with: &lt;strong&gt;SQL databases.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter amnesic
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/SurajKGoyal/amnesic" rel="noopener noreferrer"&gt;amnesic&lt;/a&gt; is an open-source MCP server that gives any AI client persistent semantic memory of your SQL databases. The name is ironic — it's anything but amnesic. It remembers.&lt;/p&gt;

&lt;p&gt;You (or the AI) annotate a table or column once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;db_annotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orders&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;column&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;column_description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Order lifecycle state&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;enum_values&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shipped&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cancelled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delivered&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…and it's stored in a local SQLite file. Every future &lt;code&gt;db_get_schema&lt;/code&gt; call merges those annotations back into the response — across sessions, across AI clients, forever:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;You&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;How&lt;/span&gt; &lt;span class="n"&gt;many&lt;/span&gt; &lt;span class="n"&gt;cancelled&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;this&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;
&lt;span class="n"&gt;AI&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt; &lt;span class="n"&gt;db_get_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;"orders"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
      &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;enum&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;"3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;"cancelled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...}&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;writes&lt;/span&gt; &lt;span class="n"&gt;correct&lt;/span&gt; &lt;span class="k"&gt;SQL&lt;/span&gt; &lt;span class="n"&gt;immediately&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;no&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;discovery&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No re-explaining. No wasted turns. The annotation persisted.&lt;/p&gt;

&lt;h2&gt;
  
  
  The technical decisions I'd defend
&lt;/h2&gt;

&lt;p&gt;A few choices that might interest people building similar tools:&lt;/p&gt;

&lt;h3&gt;
  
  
  SQLite FTS5 over a vector DB
&lt;/h3&gt;

&lt;p&gt;I started with ChromaDB for search — "find the table that handles payments." Then I ripped it out. SQLite's built-in &lt;strong&gt;FTS5&lt;/strong&gt; with BM25 ranking covers the "find the right table/column" use case at &lt;strong&gt;zero dependency cost&lt;/strong&gt;. No embeddings, no model download, no external service. For a tool that's supposed to be a lightweight local layer, pulling in a 50MB+ vector stack was the wrong trade.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;db_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# → ranked: orders.payment_method, consumerpayments table, ...
#   all from a local FTS5 index, no network, no embeddings
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Two-layer read-only enforcement
&lt;/h3&gt;

&lt;p&gt;amnesic connects to production databases, so the AI must &lt;strong&gt;never&lt;/strong&gt; be able to mutate anything. Two independent layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Static SQL analysis&lt;/strong&gt; — reject anything that isn't &lt;code&gt;SELECT&lt;/code&gt; / &lt;code&gt;WITH&lt;/code&gt;; catch write keywords, &lt;code&gt;SELECT ... INTO OUTFILE&lt;/code&gt;, and writes smuggled inside CTEs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction rollback&lt;/strong&gt; — every query runs inside a transaction that's immediately rolled back. Even if a write slipped past layer 1, nothing commits.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Belt and suspenders. The AI shouldn't be able to drop your table even by accident.&lt;/p&gt;

&lt;h3&gt;
  
  
  One SQLite file per connection
&lt;/h3&gt;

&lt;p&gt;Schema cache + annotations + FK relationship graph + the FTS5 index all live in one SQLite file per database connection. Portable, inspectable, &lt;code&gt;chmod 600&lt;/code&gt;. Want to hand your accumulated knowledge to a teammate? It's a single file.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data minimization as a side effect
&lt;/h3&gt;

&lt;p&gt;A nice property fell out of the design: a well-annotated schema means the AI answers most questions from the &lt;strong&gt;local knowledge file&lt;/strong&gt; — without ever querying the database. &lt;em&gt;"What does status=3 mean?"&lt;/em&gt; resolves from the annotation. &lt;em&gt;"How do orders join users?"&lt;/em&gt; resolves from the FK graph. That's measurably less row data leaving your machine than a "naked" SQL MCP that runs &lt;code&gt;SELECT DISTINCT status FROM orders&lt;/code&gt; every time it's confused.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it's &lt;em&gt;not&lt;/em&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;It doesn't make the model smarter.&lt;/li&gt;
&lt;li&gt;It doesn't do natural-language-to-SQL.&lt;/li&gt;
&lt;li&gt;It's not a replacement for execution-focused MCP servers — those handle query execution and live introspection well. amnesic's only job is the &lt;strong&gt;persistence/annotation layer&lt;/strong&gt; I couldn't find in any of them.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;amnesic
amnesic init          &lt;span class="c"&gt;# interactive setup wizard&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then add it to your AI client's &lt;code&gt;mcp.json&lt;/code&gt; and restart. Works with &lt;strong&gt;PostgreSQL, MySQL, MSSQL, and SQLite&lt;/strong&gt;. MIT-licensed, on PyPI, and registered on the official Linux Foundation MCP Registry.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/SurajKGoyal/amnesic" rel="noopener noreferrer"&gt;github.com/SurajKGoyal/amnesic&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;Not every AI problem needs a smarter model. Sometimes the win is an external context layer that's &lt;strong&gt;deterministic, inspectable, and shared&lt;/strong&gt; — so the model never has to learn the same thing twice.&lt;/p&gt;

&lt;p&gt;I'd love feedback, especially on the read-only enforcement — that's the part that has to be bulletproof. Issues and PRs welcome.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>python</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
