Last week I got frustrated with Redis.
Not because Redis is bad — it’s great for caching. But using it as a job queue backend? That’s where things get painful.
Every job = a network round trip. Every poll = more commands to Redis. BullMQ issues 10-15 Redis commands just to check whether there's work. My Upstash bill was climbing because of polling, not actual jobs.
So I built flashQ — a job queue server in Rust that doesn’t need Redis.
## The Problem with Redis-based Queues
Here's what happens when you push 10,000 jobs with BullMQ, one `add()` at a time:
```text
10,000 jobs × 1 network round trip = 10,000 TCP calls
Average RTT: 0.3 ms
Total time: ~3 seconds
```
And that’s just pushing. Pulling, acknowledging, retrying — each operation hits Redis over the network.
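To make the round-trip cost concrete, here's the per-job pattern (standard BullMQ `add()`) that produces those 10,000 TCP calls:

```ts
import { Queue } from 'bullmq';

const queue = new Queue('emails');

// Each add() costs at least one network round trip to Redis,
// so 10,000 jobs means 10,000 sequential TCP calls.
for (let i = 0; i < 10_000; i++) {
  await queue.add('send', { to: `user${i}@example.com` });
}
```

(BullMQ's `addBulk()` can pipeline the pushes, but pulls and acknowledgements still cross the network per job.)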
Redis is single-threaded. One CPU core handles ALL your queue operations. When you need more throughput, you start clustering, sharding, managing multiple Redis instances.
## The flashQ Approach
What if the queue lived in-process? What if batch operations were truly atomic?
```text
┌─────────────────────────────────────────┐
│              flashQ Server              │
│   ┌─────────────────────────────────┐   │
│   │       32 Parallel Shards        │   │
│   │       (one per CPU core)        │   │
│   └─────────────────────────────────┘   │
│      In-Process: ~100 nanoseconds       │
└─────────────────────────────────────────┘
                     ↑
           Single TCP connection
                     ↑
┌─────────────────────────────────────────┐
│          Your App (batch push)          │
└─────────────────────────────────────────┘
```
Same 10,000 jobs:
```text
10,000 jobs × 1 batch command = 1 TCP call
Processing time: ~5 ms
```
That’s 600x faster for batch operations.
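Client-side, that batch is a single call. A minimal sketch, assuming the flashq client mirrors BullMQ's `addBulk()` signature:

```ts
import { Queue } from 'flashq';

const queue = new Queue('emails');

// One batched call instead of 10,000 individual round trips.
// Sketch assumes a BullMQ-style addBulk() on the flashq client.
await queue.addBulk(
  Array.from({ length: 10_000 }, (_, i) => ({
    name: 'send',
    data: { to: `user${i}@example.com` },
  }))
);
```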
## Real Benchmarks
Tested on an Apple M2. No synthetic tests, no asterisks.
| Operation | flashQ | BullMQ | Improvement |
|---|---|---|---|
| Batch throughput | 2,127,660 ops/sec | 36,232 ops/sec | 58x |
| Pull + Ack | 519,388 ops/sec | ~10,000 ops/sec | 52x |
| P99 Latency | 127-196 μs | 606-647 μs | 3-5x |
| Memory per 1M jobs | ~200 MB | ~2 GB | 10x less |
## BullMQ-Compatible API
I didn’t want to reinvent the wheel. If you know BullMQ, you know flashQ:
```ts
// Before (BullMQ)
import { Queue, Worker } from 'bullmq';

// After (flashQ)
import { Queue, Worker } from 'flashq';

// Same code works
const queue = new Queue('emails');

await queue.add('send', {
  to: 'user@example.com',
  subject: 'Welcome!'
});

const worker = new Worker('emails', async (job) => {
  await sendEmail(job.data);
  return { sent: true };
});
```
Change one import. That’s it.
## Features
Everything you'd expect from a production queue (a combined example follows the list):
- Priority queues — critical jobs first
- Delayed jobs — schedule for later
- Job dependencies — DAG-style workflows
- Rate limiting — protect your APIs
- Dead letter queue — failed jobs isolated
- Retries with backoff — exponential, configurable
- Cron scheduling — repeatable jobs
- Real-time dashboard — monitor everything
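Several of these compose directly on `add()`. A minimal sketch using BullMQ-style job options, which flashq's compatible API is assumed to accept as well:

```ts
import { Queue } from 'flashq';

const queue = new Queue('emails');

// BullMQ-style options; assumed to carry over to flashq's
// compatible API. In BullMQ, priority 1 is the highest.
await queue.add(
  'send',
  { to: 'user@example.com', subject: 'Welcome!' },
  {
    priority: 1,                                  // critical jobs first
    delay: 60_000,                                // run no earlier than one minute from now
    attempts: 5,                                  // retry up to 5 times...
    backoff: { type: 'exponential', delay: 500 }, // ...with exponential backoff
  }
);
```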
## Built with Rust
Why Rust?
- No garbage collector — predictable latency
- Memory safe — no segfaults in production
- Fearless concurrency — 32 shards, zero data races
- Small footprint — 30MB base memory
Libraries used:
- `tokio` for the async runtime
- `tonic` for gRPC
- `parking_lot` for fast locks
- `fxhash` for fast hashing
- `mimalloc` for memory allocation
- `simd-json` for fast JSON parsing
## When to Use Redis (Still)
Redis is still great for:
- ✅ Caching
- ✅ Pub/Sub
- ✅ Session storage
- ✅ Low-volume queues (<1K jobs/sec)
## When to Use flashQ
- ✅ High throughput (>10K jobs/sec)
- ✅ Low latency (<1ms P99)
- ✅ Batch operations at scale
- ✅ No Redis infrastructure to manage
- ✅ Cost efficiency (less RAM, fewer servers)
## Try It
```bash
# Docker
docker run -p 6789:6789 -p 6790:6790 flashq/flashq
# Dashboard at http://localhost:6790

# Node client (from npm)
bun add flashq
```
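Then point a client at the server. The connection options below are hypothetical (borrowed from BullMQ's constructor shape); check the docs for the real client config:

```ts
import { Queue } from 'flashq';

// Hypothetical connection options, following BullMQ's
// constructor shape; see flashq.dev/docs for the real config.
const queue = new Queue('hello', {
  connection: { host: 'localhost', port: 6789 },
});

await queue.add('greet', { name: 'world' });
```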
GitHub: github.com/egeominotti/flashq
Docs: flashq.dev/docs
It’s MIT licensed, open source, and I’m actively developing it.
Currently at v0.2 — battle-testing welcome. If you try it, let me know what breaks.
Questions? Drop a comment or open an issue.