Every local AI agent project I start begins the same way — not with agent code, but with infrastructure. MongoDB for memory, Redis for cache and locks, Qdrant for vectors, BullMQ for the task queue. An hour in and I haven't written a single line of application logic yet.
There's nothing wrong with these tools at scale. But running five Docker containers just to test an idea locally started to feel like a tax I was paying on every experiment. So I started asking: what if SQLite could just do all of this?
That question became monlite. It's a TypeScript library where one .db file covers the whole local stack — document store, vector search, full-text search, key-value cache, job queue, and cron scheduler. Everything shares the same file and the same connection. No Docker, no glue code connecting services together, no "start them in the right order."
const db = createDb("./agent.db")
const store = createVectorStore(db)
const queue = createQueue(db)
const cache = kv(db)
The foundation is SQLite doing what it already does well: ACID transactions, WAL mode so readers keep working while a write is in progress, and FTS5 built right into the engine. Vector search comes from the sqlite-vec extension, which adds a vec0 virtual table type that handles KNN queries. The document API is a TypeScript layer over json_extract(), with Mongo/Prisma-style where/orderBy that narrows return types when you use typed collections.
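To make the json_extract() idea concrete, here is a minimal sketch of how a flat where/orderBy object could be compiled into SQL over a JSON column. This is not monlite's actual query builder: the `doc` column name, the one-table-per-collection layout, and the `compileFind` helper are all assumptions made for illustration, and only simple equality filters are handled.

```typescript
// Hypothetical sketch: compile an equality-only `where` plus `orderBy` into
// SQLite SQL over a JSON text column. All names here are assumptions.

type Scalar = string | number | boolean | null;
type Where = Record<string, Scalar>;
type OrderBy = Record<string, "asc" | "desc">;

interface CompiledQuery {
  sql: string;
  params: Array<string | number>;
}

// Only plain identifiers are accepted, so a field name can be inlined
// into the JSON path without opening an injection hole.
const IDENT = /^[A-Za-z_][A-Za-z0-9_]*$/;

function jsonPath(field: string): string {
  if (!IDENT.test(field)) throw new Error("unsupported field name: " + field);
  return `json_extract(doc, '$.${field}')`;
}

function compileFind(collection: string, where: Where, orderBy: OrderBy = {}): CompiledQuery {
  if (!IDENT.test(collection)) throw new Error("unsupported collection name: " + collection);

  const params: Array<string | number> = [];
  const conditions: string[] = [];
  for (const field of Object.keys(where)) {
    const value = where[field];
    if (value === undefined) continue;
    if (value === null) {
      conditions.push(jsonPath(field) + " IS NULL");
    } else {
      // json_extract() yields 1/0 for JSON true/false, so booleans bind as integers.
      params.push(typeof value === "boolean" ? (value ? 1 : 0) : value);
      conditions.push(jsonPath(field) + " = ?");
    }
  }

  const ordering: string[] = [];
  for (const field of Object.keys(orderBy)) {
    ordering.push(jsonPath(field) + (orderBy[field] === "desc" ? " DESC" : " ASC"));
  }

  let sql = "SELECT doc FROM " + collection;
  if (conditions.length > 0) sql += " WHERE " + conditions.join(" AND ");
  if (ordering.length > 0) sql += " ORDER BY " + ordering.join(", ");
  return { sql, params };
}

const example = compileFind("docs", { status: "ready" }, { createdAt: "desc" });
console.log(example.sql);
```

Values always travel as bound parameters; only validated identifiers are spliced into the SQL text. A real layer would also need operators like $gt or $in and expression indexes on hot paths, which this sketch leaves out.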
The piece that took the longest to get right was exactly-once job claiming across multiple worker processes. The naive approach — read a pending job, then update it to active — has a race when several workers share the same file. I tried optimistic locking first but the retry logic was messy. The real answer was simpler: BEGIN IMMEDIATE. SQLite's write-intent lock makes the read-and-claim a single atomic step, and the whole thing collapses to this:
const job = await jobs.findOneAndUpdate({
  where: { status: "pending", type: "summarize" },
  data: { $set: { status: "active" }, $inc: { version: 1 } },
  returnDocument: "after",
})
If another process beat you to it, you get null. I tested this with 8 concurrent workers racing for a single job — exactly one wins every time. It's the same guarantee Redis and BullMQ give you, just over a file on disk instead of a network socket.
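For anyone curious what the read-and-claim looks like at the SQL level, here is a rough sketch of the pattern. It is not monlite's internals: the flat `jobs` table with `id`, `type`, `status`, and `version` columns, the `claimJob` helper, and the `SqlDb` interface are assumptions for illustration. The interface is shaped so that a synchronous driver with `exec` and `prepare` (such as node:sqlite's DatabaseSync or better-sqlite3) should fit it.

```typescript
// Hypothetical sketch of claiming a job under BEGIN IMMEDIATE.
// Table layout and helper names are assumptions, not monlite's schema.

interface Statement {
  get(...params: unknown[]): unknown;
  run(...params: unknown[]): unknown;
}

interface SqlDb {
  exec(sql: string): unknown;
  prepare(sql: string): Statement;
}

interface JobRow {
  id: number;
  type: string;
  status: string;
  version: number;
}

const SELECT_PENDING =
  "SELECT id, type, status, version FROM jobs " +
  "WHERE status = 'pending' AND type = ? ORDER BY id LIMIT 1";

const MARK_ACTIVE =
  "UPDATE jobs SET status = 'active', version = version + 1 WHERE id = ?";

function claimJob(db: SqlDb, type: string): JobRow | null {
  // BEGIN IMMEDIATE takes the write lock up front, so no other connection
  // can write between the SELECT and the UPDATE below. If another writer
  // holds the lock, this statement fails or waits, depending on busy_timeout.
  db.exec("BEGIN IMMEDIATE");
  try {
    const row = db.prepare(SELECT_PENDING).get(type) as JobRow | undefined | null;
    if (!row) {
      // Nothing pending, or another worker already claimed it.
      db.exec("COMMIT");
      return null;
    }
    db.prepare(MARK_ACTIVE).run(row.id);
    db.exec("COMMIT");
    // Mirror the update in the returned copy ("after" semantics).
    return { id: row.id, type: row.type, status: "active", version: row.version + 1 };
  } catch (err) {
    db.exec("ROLLBACK");
    throw err;
  }
}
```

With a plain BEGIN (deferred), two workers could both run the SELECT, see the same pending row, and only collide when they try to write. Taking the lock before the read removes that window, which is why no version check or retry loop is needed here.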
Something I didn't plan but turned out to be genuinely useful: Python reads the same .db.
Because the format is plain SQLite with documented conventions — no proprietary encoding, no wire protocol — the Python port can open the file and query it directly. So a common pattern is Python doing the heavy lifting (chunking, embedding, ingesting) while Node handles the serving side, and they share one file without any translation layer between them.
db = create_db("agent.db") # the same file Node is writing
db.collection("docs").find_many(where={"status": "ready"})
kv(db).set_nx("lock:job:42", 1, ttl=5_000)
We verify the interop with a round-trip test suite — write from Node, read from Python, write from Python, read from Node. The schema conventions are documented so any runtime can join.
On Node >= 22.5 the core runs on the built-in node:sqlite with zero native dependencies. For Node 18/20 you install better-sqlite3 and it gets picked up automatically. Same interface, same tests either way.
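One way to picture the automatic pick is a simple version gate. The sketch below is my guess at the rule implied by the version numbers above, not a copy of monlite's loader; `pickDriver` and the `Driver` type are hypothetical names, and a real implementation might probe by attempting the import instead of parsing the version string.

```typescript
// Hypothetical sketch: choose a SQLite driver from a Node version string.
// The 22.5 threshold comes from the text above; everything else is assumed.

type Driver = "node:sqlite" | "better-sqlite3";

function pickDriver(nodeVersion: string): Driver {
  // Accepts "v22.5.0" (the process.version format) or "22.5.0".
  const match = /^v?(\d+)\.(\d+)/.exec(nodeVersion);
  if (match === null) {
    throw new Error("unrecognized Node version: " + nodeVersion);
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const hasBuiltin = major > 22 || (major === 22 && minor >= 5);
  return hasBuiltin ? "node:sqlite" : "better-sqlite3";
}

console.log(pickDriver("v20.11.0")); // prints "better-sqlite3"
console.log(pickDriver("v22.5.0")); // prints "node:sqlite"
```

In an application you would pass process.version and then load the chosen module behind a shared interface, so the rest of the code never needs to know which driver is underneath.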
I should be clear about the limits. SQLite is single-writer — great for local agent workloads, not right for thousands of concurrent writes per second. The reactive watch() works in-process and doesn't automatically fire across separate processes. And this is deliberately not a distributed system; @monlite/sync can replicate to MongoDB, Postgres, or MySQL when you need the cloud, but monlite's real home is a single machine.
The core is at v2.6.1 with a frozen API. Separate opt-in packages cover vectors, FTS, cache, queue, cron, sync, browser (via SQLite-WASM), and Electron. The Python port ships documents and kv today with the rest in progress.
npm install @monlite/core
GitHub: https://github.com/qataruts/monlite
Happy to talk through the implementation — the sqlite-vec wiring, how the plugin system keeps FTS and vector indexes in sync with afterWrite, the LWW conflict resolution in the sync engine, or why BEGIN IMMEDIATE beat optimistic locking for the CAS problem.