<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Emad Jumaah</title>
    <description>The latest articles on DEV Community by Emad Jumaah (@emadjumaah).</description>
    <link>https://dev.to/emadjumaah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4005981%2Fa1bd1b66-77a5-45b7-8f2a-051a81dfec0e.png</url>
      <title>DEV Community: Emad Jumaah</title>
      <link>https://dev.to/emadjumaah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/emadjumaah"/>
    <language>en</language>
    <item>
      <title>Monlite – documents, vectors, cache, and job queue in one SQLite file</title>
      <dc:creator>Emad Jumaah</dc:creator>
      <pubDate>Sun, 28 Jun 2026 02:45:32 +0000</pubDate>
      <link>https://dev.to/emadjumaah/monlite-documents-vectors-cache-and-job-queue-in-one-sqlite-file-4fh3</link>
      <guid>https://dev.to/emadjumaah/monlite-documents-vectors-cache-and-job-queue-in-one-sqlite-file-4fh3</guid>
      <description>&lt;p&gt;Every local AI agent project I start begins the same way — not with agent code, but with infrastructure. MongoDB for memory, Redis for cache and locks, Qdrant for vectors, BullMQ for the task queue. An hour in and I haven't written a single line of application logic yet.&lt;/p&gt;

&lt;p&gt;There's nothing wrong with these tools at scale. But running five Docker containers just to test an idea locally started to feel like a tax I was paying on every experiment. So I started asking: what if SQLite could just do all of this?&lt;/p&gt;

&lt;p&gt;That question became monlite. It's a TypeScript library where one &lt;code&gt;.db&lt;/code&gt; file covers the whole local stack — document store, vector search, full-text search, key-value cache, job queue, and cron scheduler. Everything shares the same file and the same connection. No Docker, no glue code connecting services together, no "start them in the right order."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createDb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./agent.db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;kv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The foundation is SQLite doing what it already does well. ACID transactions, WAL mode for durability, FTS5 built right into the engine. Vector search comes from the &lt;code&gt;sqlite-vec&lt;/code&gt; extension — it adds a &lt;code&gt;vec0&lt;/code&gt; virtual table type that handles KNN queries. The document API is &lt;code&gt;json_extract()&lt;/code&gt; and a TypeScript layer on top, with Mongo/Prisma-style &lt;code&gt;where&lt;/code&gt;/&lt;code&gt;orderBy&lt;/code&gt; that narrows return types when you use typed collections.&lt;/p&gt;

&lt;p&gt;The piece that took the longest to get right was exactly-once job claiming across multiple worker processes. The naive approach — read a pending job, then update it to active — has a race when several workers share the same file. I tried optimistic locking first but the retry logic was messy. The real answer was simpler: &lt;code&gt;BEGIN IMMEDIATE&lt;/code&gt;. SQLite's write-intent lock makes the read-and-claim a single atomic step, and the whole thing collapses to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findOneAndUpdate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pending&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;summarize&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;active&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;$inc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;returnDocument&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;after&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If another process beat you to it, you get null. I tested this with 8 concurrent workers racing for a single job — exactly one wins every time. It's the same guarantee Redis and BullMQ give you, just over a file on disk instead of a network socket.&lt;/p&gt;

&lt;p&gt;Something I didn't plan but turned out to be genuinely useful: Python reads the same &lt;code&gt;.db&lt;/code&gt;.&lt;br&gt;
Because the format is plain SQLite with documented conventions — no proprietary encoding, no wire protocol — the Python port can open the file and query it directly. So a common pattern is Python doing the heavy lifting (chunking, embedding, ingesting) while Node handles the serving side, and they share one file without any translation layer between them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# the same file Node is writing
&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;find_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ready&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;kv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_nx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lock:job:42&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5_000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We verify the interop with a round-trip test suite — write from Node, read from Python, write from Python, read from Node. The schema conventions are documented so any runtime can join.&lt;/p&gt;

&lt;p&gt;On Node &amp;gt;= 22.5 the core runs on the built-in &lt;code&gt;node:sqlite&lt;/code&gt; with zero native dependencies. For Node 18/20 you install &lt;code&gt;better-sqlite3&lt;/code&gt; and it gets picked up automatically. Same interface,&lt;br&gt;
same tests either way.&lt;/p&gt;

&lt;p&gt;I should be clear about the limits. SQLite is single-writer — great for local agent workloads, not right for thousands of concurrent writes per second. The reactive &lt;code&gt;watch()&lt;/code&gt; works in-process and doesn't automatically fire across separate processes. And this is deliberately not a distributed system; &lt;code&gt;@monlite/sync&lt;/code&gt; can replicate to MongoDB, Postgres, or MySQL when you need the cloud, but monlite's real home is a single machine.&lt;/p&gt;

&lt;p&gt;The core is at v2.6.1 with a frozen API. Separate opt-in packages cover vectors, FTS, cache, queue, cron, sync, browser (via SQLite-WASM), and Electron. The Python port ships documents and kv today with the rest in progress.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @monlite/core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/qataruts/monlite" rel="noopener noreferrer"&gt;https://github.com/qataruts/monlite&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy to talk through the implementation — the sqlite-vec wiring, how the plugin system keeps FTS and vector indexes in sync with &lt;code&gt;afterWrite&lt;/code&gt;, the LWW conflict resolution in the sync engine, or why &lt;code&gt;BEGIN IMMEDIATE&lt;/code&gt; beat optimistic locking for the CAS problem.&lt;/p&gt;

</description>
      <category>typescript</category>
      <category>sqlite</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
