HelperX

Posted on Jun 7

SQLite in Production: Why We Chose It Over Postgres for a Multi-Tenant SaaS

#database #node #architecture #saas

"You're using SQLite in production? For a SaaS?" — every developer who hears about our stack.

Yes. And after a year of running it with hundreds of active accounts, I'd make the same choice again. Here's why — and the specific patterns that make it work for multi-tenant workloads.

The architecture decision

HelperX manages multiple X accounts. Each account is an isolated "slot" with its own configuration, auth tokens, audit logs, and operational state. When we designed the data layer, we had two options:

Option A: Shared Postgres
One PostgreSQL database. All slots share tables. Isolation enforced by WHERE slot_id = ? on every query.

Option B: SQLite per slot
One SQLite database file per slot. Complete physical isolation. No shared state.

We chose Option B. Here's the decision matrix:

Factor	Shared Postgres	SQLite per slot
Data isolation	Logical (WHERE clause)	Physical (separate files)
Deployment complexity	Requires Postgres server	Zero — embedded
Backup granularity	Full DB or table-level	Per-slot file copy
Slot deletion	DELETE + potential orphans	`rm slot_abc.db`
Concurrent writes	Excellent	Limited (WAL mode helps)
Read performance	Network round-trip	Local disk, sub-ms
Operational overhead	Monitoring, upgrades, HA	None
Cost	$20-100/mo (managed DB)	$0

For our workload — lots of reads, moderate writes, strong isolation requirements — SQLite won on every axis except concurrent writes. And our write pattern (one slot writes sequentially, never concurrently) means that limitation doesn't apply.

The implementation

Database-per-slot pattern

import Database from 'better-sqlite3';
import path from 'path';
import fs from 'fs';

const DB_DIR = path.join(process.cwd(), 'data', 'slots');

function getSlotDb(slotId) {
  const dbPath = path.join(DB_DIR, `${slotId}.db`);
  const db = new Database(dbPath);

  db.pragma('journal_mode = WAL');
  db.pragma('synchronous = NORMAL');
  db.pragma('foreign_keys = ON');
  db.pragma('busy_timeout = 5000');

  return db;
}

Each slot gets its own .db file. The pragmas are critical:

WAL mode: Allows concurrent reads while writing. Without this, readers block on writes.
synchronous = NORMAL: Trades a small durability risk for 5-10x write speed. For our workload (losing the last 1-2 log entries in a crash is acceptable), this is the right tradeoff.
busy_timeout: If a write is in progress, wait up to 5 seconds instead of throwing immediately. Prevents spurious SQLITE_BUSY errors.

Schema per slot

Every slot database has identical schema:

function initSlotSchema(db) {
  db.exec(`
    CREATE TABLE IF NOT EXISTS config (
      key TEXT PRIMARY KEY,
      value TEXT NOT NULL,
      updated_at TEXT DEFAULT (datetime('now'))
    );

    CREATE TABLE IF NOT EXISTS audit_log (
      id TEXT PRIMARY KEY,
      module TEXT NOT NULL,
      action TEXT NOT NULL,
      target TEXT,
      detail TEXT,
      status TEXT NOT NULL,
      duration_ms INTEGER,
      timestamp TEXT NOT NULL DEFAULT (datetime('now')),
      metadata TEXT
    );

    CREATE TABLE IF NOT EXISTS queue (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      module TEXT NOT NULL,
      payload TEXT NOT NULL,
      status TEXT DEFAULT 'pending',
      created_at TEXT DEFAULT (datetime('now')),
      processed_at TEXT
    );

    CREATE INDEX IF NOT EXISTS idx_audit_ts ON audit_log(timestamp DESC);
    CREATE INDEX IF NOT EXISTS idx_queue_status ON queue(status, created_at);
  `);
}

Three tables cover everything a slot needs. Config stores module settings (encrypted). Audit log stores action history. Queue stores pending actions.

Connection pooling (you don't need it)

With Postgres, connection pooling is essential. With SQLite, it's unnecessary — each database connection is a file handle, not a network socket. Opening a SQLite database takes microseconds.

We do cache connections to avoid repeated file opens:

const connections = new Map();

function getDb(slotId) {
  if (connections.has(slotId)) {
    return connections.get(slotId);
  }

  const db = getSlotDb(slotId);
  initSlotSchema(db);
  connections.set(slotId, db);
  return db;
}

function closeDb(slotId) {
  const db = connections.get(slotId);
  if (db) {
    db.close();
    connections.delete(slotId);
  }
}

Simple Map caching. No pool manager, no connection limits, no timeout configuration.

What about migrations?

Schema migrations on 200+ SQLite databases sound nightmarish. In practice, it's straightforward:

const MIGRATIONS = [
  {
    version: 1,
    up: `ALTER TABLE audit_log ADD COLUMN metadata TEXT;`
  },
  {
    version: 2,
    up: `CREATE INDEX IF NOT EXISTS idx_audit_module ON audit_log(module, status);`
  },
];

function migrate(db) {
  db.exec(`
    CREATE TABLE IF NOT EXISTS _migrations (
      version INTEGER PRIMARY KEY,
      applied_at TEXT DEFAULT (datetime('now'))
    )
  `);

  const applied = db.prepare('SELECT MAX(version) as v FROM _migrations').get();
  const currentVersion = applied?.v || 0;

  for (const migration of MIGRATIONS) {
    if (migration.version > currentVersion) {
      db.exec(migration.up);
      db.prepare('INSERT INTO _migrations (version) VALUES (?)').run(migration.version);
    }
  }
}

Migrations run on database open. Each slot migrates independently. If a migration fails on one slot, it doesn't affect others. Compare this to a shared Postgres migration that locks the entire database.

The global database

Not everything belongs in per-slot databases. Platform-level data — user accounts, billing, plan configuration — lives in a single global SQLite database:

data/
├── global.db          ← users, plans, billing
└── slots/
    ├── slot_abc.db    ← slot A config + logs
    ├── slot_def.db    ← slot B config + logs
    └── slot_ghi.db    ← slot C config + logs

The global database handles authentication and authorization. Slot databases handle operations. No cross-database joins needed — the application layer bridges them when necessary.

Performance reality

After a year of production data:

Read performance:

Config lookup: 0.02ms (vs 2-5ms with Postgres over network)
Audit log query (last 24h): 0.3ms
Dashboard aggregation: 1.2ms

Write performance:

Single audit log insert: 0.05ms
Batch insert (100 rows): 2.1ms
Config update: 0.04ms

Storage:

Average slot database: 2.4MB after 90 days
Global database: 1.8MB
Total for 200 slots: ~480MB

For comparison, a managed Postgres instance with equivalent data would cost $20-50/month and add 2-5ms latency to every query. Our SQLite setup costs $0 and responds in microseconds.

The tradeoffs (honestly)

SQLite isn't perfect. Here's what we gave up:

1. No concurrent writes across connections.
If two processes try to write to the same slot database simultaneously, one blocks. For our workload (one worker per slot, sequential operations), this isn't an issue. For a web app with 50 concurrent users editing the same record, it would be.

2. No replication.
SQLite doesn't replicate. Our backup strategy is filesystem-level: daily snapshots of the data/ directory. If we needed real-time replication, we'd need Litestream or a similar tool.

3. No network access.
SQLite is local. If we split into microservices, the database can't be shared over the network. For our monolithic Node.js server, this is fine.

4. Schema changes require iteration.
Applying a migration to 200 databases takes 200x longer than one Postgres migration. In practice, each migration takes < 10ms, so 200 databases complete in under 2 seconds. Not a real problem at our scale.

When to choose SQLite for SaaS

SQLite per tenant works when:

Tenants are isolated. No cross-tenant queries needed.
Write concurrency is low per tenant. One writer at a time per database.
Data volume per tenant is moderate. Under 1GB per database.
Deployment is single-server. No need for database replication.
Operational simplicity matters. No DBA, no database server management.

SQLite per tenant breaks when:

Tenants need to query each other's data
Multiple processes write to the same tenant concurrently
You need distributed transactions
Individual databases exceed ~10GB
You need real-time read replicas

For HelperX, every checkbox in the "works when" column was checked. A year later, the database layer is the most boring part of the system — exactly what you want from infrastructure.

HelperX runs on SQLite with per-slot isolation — zero database management, sub-millisecond queries. Try it free for 30 days.

DEV Community