I'm building a local AI agent orchestration tool. Multiple agents run asynchronously — user messages come in, executors return results, approvals fire callbacks. All of these are events, and they need to flow reliably through the system.
The key word is reliably. An agent crashes mid-task. The process gets killed. The machine restarts. When things come back up, pending events must still be there, ready to retry. Losing an event means losing work.
So I needed a durable message queue. But here's the constraint: this is a local developer tool, not a cloud service. Users install it with npm install and it should just work. Requiring them to run Redis or RabbitMQ — or any external process — is a non-starter. SQLite was already in the stack for persistent storage — and it has strong transactional guarantees. Could it also be the queue?
## Why not just a "jobs" table?
The first instinct is to create a table with a status column and poll it:
```sql
SELECT * FROM jobs WHERE status = 'pending' ORDER BY created LIMIT 1;
-- process it, then:
UPDATE jobs SET status = 'done' WHERE id = ?;
```
This works until it doesn't. Two consumers run the SELECT before either runs the UPDATE and grab the same row. Claim rows by marking them 'processing' to prevent that, and a consumer that crashes mid-job leaves its row stuck in 'processing' forever: the job vanishes. You start adding locked_by, locked_at, retry counters, timeout checks... and you realize you're reinventing a message queue, poorly.
I found goqite, a Go library that does exactly this — a durable message queue on top of SQLite, modeled after Amazon SQS. It's a proven design. But there was nothing like it in the TypeScript ecosystem. So I built sqliteq.
## The design
### Single table, all queues
Every queue shares one table. No migrations when you add a queue — just pass a different name:
```typescript
const emails = new Queue(db, 'emails')
const sync = new Queue(db, 'sync-jobs')
```
The schema:
```sql
create table if not exists sqliteq (
  id text primary key default ('m_' || lower(hex(randomblob(16)))),
  created text not null default (strftime('%Y-%m-%dT%H:%M:%fZ')),
  queue text not null,
  body text not null,
  timeout text not null default (strftime('%Y-%m-%dT%H:%M:%fZ')),
  received integer not null default 0,
  priority integer not null default 0
) strict;
```
No status column. The timeout timestamp is the status. If timeout is in the past, the message is available. If it's in the future, it's claimed by a consumer and invisible to everyone else. This is the SQS visibility timeout model.
### Atomic claim — no locks needed
Receiving a message is a single SQL statement:
```sql
update sqliteq
set timeout = ?, received = received + 1
where id = (
  select id from sqliteq
  where queue = ? and ? >= timeout and received < ?
  order by priority desc, created
  limit 1
)
returning id, body, received
```
The subquery finds the next eligible message. The outer UPDATE claims it by bumping the timeout into the future — atomically. No advisory locks. No two-step SELECT-then-UPDATE race. SQLite's single-writer guarantee does the rest.
### Fencing with received
Here's a subtle problem: consumer A receives a message, takes too long processing it, and the visibility timeout expires. Consumer B receives the same message (redelivery). Now A finishes and deletes the message — but it's B's message now.
The received column solves this. It increments on every receive. When you delete or extend a message, you must pass the received value you got:
```typescript
const msg = queue.receive()
// msg.received === 1

// later:
queue.delete(msg.id, msg.received)
// DELETE ... WHERE id = ? AND received = ?
```
If the message was redelivered to someone else, received no longer matches, the DELETE affects zero rows, and you get false back. Deletes and extensions are conditional on both id and received, so a stale consumer can at most fail to delete a message — never delete one it doesn't own. This is the same idea as a fencing token in distributed locking.
### Dead letters
After maxReceive deliveries (default 3), a message is no longer eligible for receive. It's dead-lettered — still in the table but won't be delivered again. You can inspect it, requeue it, or purge it:
```typescript
const dead = queue.deadLetters()
// [{ id, body, received: 3 }]
```
## Benchmarks
On an M-series MacBook Pro, with file-backed SQLite (not `:memory:`) and better-sqlite3:
| Operation | Throughput | Latency |
|---|---|---|
| send + receive + delete | ~20,000 ops/sec | 49 μs/op |
| send only | ~31,000 ops/sec | — |
| receive + delete (100K rows, 10 queues) | ~18,000 ops/sec | 55 μs/op |
| sendBatch (100 msgs/tx) | ~120,000 ops/sec | — |
For a local tool or single-machine service, this is more than enough. The bottleneck in my AI agent system is the LLM API call (seconds), not the queue (microseconds).
## Quick start
```sh
npm install @minnzen/sqliteq better-sqlite3
```
```typescript
import Database from 'better-sqlite3'
import { Queue } from '@minnzen/sqliteq'

const db = new Database('app.db')
const q = new Queue(db, 'tasks')

// Send
q.send({ type: 'process', payload: 'data' })

// Receive and acknowledge
const msg = q.receive()
if (msg) {
  console.log(msg.body) // { type: 'process', payload: 'data' }
  q.delete(msg.id, msg.received)
}
```
There's also a Processor class for long-running consumers with auto-retry and auto-extend:
```typescript
import { Processor } from '@minnzen/sqliteq'

const processor = new Processor(q, {
  handler(msg) {
    // return normally = auto-delete
    // throw = leave for retry after timeout
  },
  concurrency: 4,
})
processor.start()
```
## Wrapping up
sqliteq is MIT-licensed, has zero runtime dependencies, and works with both better-sqlite3 and bun:sqlite. It's written in TypeScript with full generic support (`Queue<T>`).
The queue is currently powering the event-driven runtime of an AI agent orchestration project I'm working on — where crashes, timeouts, and retries are not edge cases but the normal operating mode. Having a durable queue that's just SQLite, with no external processes to manage, has been exactly the right trade-off. In that world, the queue is the backbone; everything else is optional.
If you have a similar "I just need a reliable local queue" situation, give it a try.