D_SECURITY
Why I Built a Job Queue With SQLite Instead of Redis (And What I Learned)

I needed to coordinate background scripts running across different machines.

The obvious answer was Redis. Everyone uses Redis for this. The tutorials all use Redis. The Stack Overflow answers all say "just use Redis."

So I looked at what deploying Redis would actually cost me:

  • A running Redis server I had to maintain
  • A broker to connect workers to it
  • Celery or RQ on top of that
  • Memory-based storage with no persistence by default
  • A whole new failure mode: what happens when Redis goes down?

I'm running scripts on PythonAnywhere and a phone running Termux. I don't want to maintain infrastructure. I just want my cloud script to say "hey, do this" and have a worker somewhere execute it.

So I built Intent Bus — a pure HTTP job queue backed by SQLite. No Redis. No broker. No open ports required.

Here's what I learned.


The Core Idea

Any script POSTs a job to a Flask server. Workers anywhere poll for jobs matching their goal, claim them atomically, execute them, and mark them fulfilled.

Cloud Script
    │
    │ POST /intent  {"goal": "send_notification", "payload": {...}}
    ▼
Intent Bus (Flask + SQLite)
    │
    │ POST /claim?goal=send_notification
    ▼
Worker (Termux / Linux / anywhere)
    │
    │ executes job → POST /fulfill/<id>
    ▼
Done

Workers poll outbound. They don't need an open port. They work behind any firewall.
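That outbound-polling loop is simple enough to sketch. Below is a minimal, dependency-free version of the worker side; `claim`, `execute`, and `fulfill` are stand-ins for the HTTP calls in the diagram (`POST /claim?goal=...`, the job itself, `POST /fulfill/<id>`), so only the control flow is shown:

```python
import time

def run_worker(claim, execute, fulfill, interval=2.0, cycles=None):
    """Generic poll loop: try to claim a job; if one arrives, execute it
    and report fulfillment; otherwise back off and poll again.

    claim()          -> job dict like {"id": ..., "payload": {...}}, or None (HTTP 204)
    execute(payload) -> does the actual work
    fulfill(job_id)  -> marks the job fulfilled on the bus
    """
    handled = []
    n = 0
    while cycles is None or n < cycles:
        n += 1
        job = claim()
        if job is None:           # nothing matching our goal right now
            time.sleep(interval)  # back off before polling again
            continue
        execute(job["payload"])
        fulfill(job["id"])
        handled.append(job["id"])
    return handled
```

Because the worker only ever initiates connections, this same loop runs unchanged on a Linux box, a PythonAnywhere task, or Termux on a phone.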


The SQLite Locking Question

The part everyone pushes back on: can SQLite actually handle concurrent workers without race conditions?

Yes — with the right setup. Here's the exact claim query:

BEGIN IMMEDIATE;

WITH candidate AS (
    SELECT id
    FROM intents
    WHERE expires_at > :now
      AND status != 'failed'
      AND claim_attempts < :max_attempts
      AND (status='open' OR (status='claimed' AND claimed_at < :stale))
      AND (publisher=:worker_key OR visibility='public')
      AND goal=:target_goal
    ORDER BY claim_attempts ASC, created_at ASC, id ASC
    LIMIT 1
)
UPDATE intents
SET status='claimed',
    claimed_at=:now,
    claimed_by=:claimer,
    claim_attempts=claim_attempts+1
WHERE id = (SELECT id FROM candidate)
  AND claim_attempts < :max_attempts
  AND (status='open' OR (status='claimed' AND claimed_at < :stale))
RETURNING id, goal, payload, claim_attempts;

BEGIN IMMEDIATE acquires the write lock before the CTE evaluates. The entire block — candidate selection, UPDATE, and RETURNING — is a single atomic statement, and no other writer can begin a transaction until this one commits. One worker gets the job; by the time any competing claim runs, the row is already marked claimed with a fresh claimed_at, so its candidate CTE comes up empty and the caller gets a 204 back. The claim_attempts < :max_attempts and status guards are re-checked inside the UPDATE itself, so a job can't be claimed past its retry limit even under contention.

With PRAGMA journal_mode=WAL and PRAGMA busy_timeout=30000, readers never block writers and the system handles short bursts cleanly.

In practice on PythonAnywhere's free tier: tested with 3 concurrent workers, zero double-claims.


What SQLite Gets You That Redis Doesn't

| Aspect | SQLite (Intent Bus) | Redis + Celery |
| --- | --- | --- |
| Infrastructure | Zero — just a file | Redis server + broker |
| Persistence | ACID by default | Requires AOF/RDB config |
| Inspect queue | Any SQL tool | Redis CLI or dashboard |
| Self-hostable | Yes, free tier works | Needs paid hosting or VPS |
| Mobile workers | Yes, just HTTP | Complex from Termux |
| Failure recovery | 60s visibility timeout, auto-requeue | Ack-based, more complex |
| Throughput ceiling | ~50-100 writes/sec | Tens of thousands/sec |

The throughput ceiling is the honest limitation. SQLite WAL mode serializes writes. At high concurrency (>10 req/sec sustained) you'll see queue buildup. For that use case, use Redis — it's the right tool.
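You can sanity-check that ceiling on your own hardware with a crude probe: commit one insert per transaction, the way one claim per HTTP request behaves. The on-disk database is deliberate; the dominant cost is the sync on each WAL commit, which an in-memory database would hide, so treat the number as machine-dependent and rough:

```python
import os
import sqlite3
import tempfile
import time

def measure_write_rate(n=100):
    """Commit n single-row transactions to an on-disk SQLite database in WAL
    mode and return the achieved writes/sec. It's per-job commit overhead,
    not raw INSERT speed, that sets the ceiling."""
    path = os.path.join(tempfile.mkdtemp(), "probe.db")
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE intents (id INTEGER PRIMARY KEY, goal TEXT)")
    start = time.perf_counter()
    for _ in range(n):
        conn.execute("BEGIN IMMEDIATE")   # one write transaction per job, like a claim
        conn.execute("INSERT INTO intents (goal) VALUES ('probe')")
        conn.execute("COMMIT")
    rate = n / (time.perf_counter() - start)
    conn.close()
    return rate
```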

For low-to-medium frequency workloads (scripts, bots, scrapers, phone notifications), SQLite is not just adequate — it's better, because you get persistence, inspectability, and zero infrastructure.


Delivery Guarantees

Intent Bus provides at-least-once delivery. If a worker crashes after claiming but before fulfilling, the 60-second visibility timeout auto-requeues the job. The next available worker picks it up.

This means workers should be idempotent where possible. For most script automation use cases (send a notification, write a log, trigger a webhook), this is fine.
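A cheap way to get that idempotency is to dedupe on the job id before running the side effect. The sketch below keeps seen ids in a plain in-process set; a worker that must survive restarts would persist them instead (even a small local SQLite table works):

```python
def make_idempotent(handler):
    """Wrap a handler so a redelivered job (same id) runs its side effect
    at most once. At-least-once delivery means the visibility timeout can
    hand the same job to a second worker if the first stalls after claiming."""
    seen = set()  # swap for durable storage if the worker restarts

    def wrapped(job_id, payload):
        if job_id in seen:
            return False  # duplicate redelivery, skip the side effect
        handler(payload)
        seen.add(job_id)
        return True

    return wrapped
```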


The Scale Ceiling

I get asked: "what happens when this gets popular?"

The upgrade path is straightforward:

  1. Swap SQLite for PostgreSQL — minimal code changes, the query logic is identical
  2. Move to a paid VPS for higher CPU allocation
  3. Split read/write paths if needed

But the honest answer is: if you need to worry about that upgrade path, Intent Bus isn't the right tool for you today. It's designed for the tier below that problem.


Try It

pip install intent-bus

from intent_bus import IntentClient

bus = IntentClient(api_key="your_key")

# Publish a job
bus.publish(goal="send_notification", payload={"message": "Scrape complete"})

# Listen for jobs
def handle(payload):
    print(payload["message"])

bus.listen(goal="send_notification", handler=handle)

Full server code, spec, and worker examples: https://github.com/dsecurity49/Intent-Bus

The server runs free on PythonAnywhere. Workers run anywhere that can make HTTP requests — including a phone.


Have you used SQLite as a message broker in production? I'd genuinely like to know where it held up and where it broke down — drop a comment.
