I've run online exam systems before.
For a middle school with around 1,000 students, I needed a database server with 4 CPUs and 8GB RAM, plus an app server with 2 CPUs and 8GB RAM. That's a lot of hardware just for 1,000 users.
The system worked, but I kept wondering — what if I had 10,000 students? Would I need 10x the resources?
So I built two versions of an exam submission system and threw 10,000 concurrent users at them. One used direct database inserts. One used a message queue.
Direct insert failed 28% of requests. Queue-based succeeded 99.7% of the time — using way less hardware.
The Problem
If you've run an online exam system, you know how this goes. Testing with 10 users works perfectly. QA with 50 concurrent users is fine. Then exam day hits with 1,000+ students and you're just hoping the server survives.
Running exams for around 1,000 middle school students taught me that you need serious hardware just to handle everyone hitting the system at the same time. They all log in together. They all submit together. And when it lags, they all refresh together.
Here's what actually happened during one of our exams:
The moment the exam started, all 1,000 students tried to log in simultaneously. But it wasn't just them — we had no idea who else was hitting the system. Curious parents checking if the portal was up? Random people who found the URL? Someone trying to probe for vulnerabilities? All of it piled on top of our legitimate traffic.
The result? CPU spiked to 80%. RAM hit 80%. The login requests were stacking up faster than the server could process them.
I had to make an embarrassing call: ask teachers to stagger logins by class, waiting 10-15 minutes between groups. Imagine telling anxious students "please wait, the system can't handle everyone at once." Not a great look.
And that was with "only" 1,000 users on a 4 CPU / 8GB database server.
But here's what kept me up at night: What happens at 10,000 users?
At that scale, the API receives roughly 3,000 requests per second. Meanwhile, PostgreSQL ships with max_connections = 100 by default, and most application-side pools are sized to match.
Do the math: 3,000 requests per second competing for 100 database connections.
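That arithmetic can be made concrete. The ~50ms-per-insert figure below is an assumption for illustration, not a measurement from the test:

```go
package main

import "fmt"

func main() {
	const (
		arrivalRate   = 3000.0 // requests per second at peak
		poolSize      = 100.0  // max concurrent DB connections
		insertLatency = 0.050  // seconds per insert (assumed ~50ms under load)
	)

	// Each connection serves 1/insertLatency inserts per second, so the
	// pool caps total throughput no matter how fast the CPU is.
	maxThroughput := poolSize / insertLatency

	// Everything arriving above that rate piles up in a growing backlog.
	backlogPerSec := arrivalRate - maxThroughput

	fmt.Printf("pool throughput: %.0f req/s\n", maxThroughput) // 2000 req/s
	fmt.Printf("backlog growth:  %.0f req/s\n", backlogPerSec) // 1000 req/s
}
```

At those assumed numbers the pool tops out at 2,000 req/s, so the backlog grows by 1,000 requests every second until timeouts start.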
I had to see this for myself.
The Test Setup
I built two identical exam submission APIs in Go:
| Component | Direct Insert (Port 8080) | Queue-Based (Port 8081) |
|---|---|---|
| API Server | Go + Fiber | Go + Fiber |
| Database | PostgreSQL (1 CPU, 512MB) | PostgreSQL (1 CPU, 512MB) |
| Message Queue | None | NATS JetStream |
| Workers | None | 10 concurrent workers |
Notice the database specs: 1 CPU, 512MB RAM. My real exam system needed 4 CPUs and 8GB RAM for just 1,000 students. I wanted to see if architecture choices could make up for limited hardware.
Load Test Configuration
- Concurrent Users: 10,000 students
- Peak Load: 3,000 requests/second
- Duration: 5 minutes
- Tool: k6 load testing
Same traffic. Same database. Two different architectures.
Approach 1: Direct Database Insert
The straightforward approach. User submits → API writes to database → Return response.
3,000 requests fighting for 100 database connections. The other 2,900? They wait. And wait. And eventually... timeout.
Approach 2: Queue-Based Architecture
The API doesn't talk to the database at all. It just publishes a message to a queue and responds immediately. Workers consume messages in the background at whatever pace the database can handle.
The queue acts as a shock absorber. Traffic spikes hit the queue, not your database.
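The shock-absorber idea can be sketched with nothing more than Go channels: a burst of 1,000 "submissions" lands instantly in a buffered channel, while a small fixed pool of workers drains it at its own pace. The numbers here are illustrative, not from the test:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	const burst = 1000

	// The buffered channel plays the role of the queue:
	// it absorbs the whole spike without blocking producers.
	queue := make(chan int, burst)
	for i := 0; i < burst; i++ {
		queue <- i // "submit" returns immediately
	}
	close(queue)

	// A fixed pool of 10 workers drains the queue at a pace
	// the database could sustain.
	var processed int64
	var wg sync.WaitGroup
	for w := 0; w < 10; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range queue {
				atomic.AddInt64(&processed, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println("processed:", processed) // processed: 1000
}
```

The producers never wait on the consumers; only the drain rate, not the spike, touches the "database" side.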
Results
I expected the queue to win. I didn't expect it to be a blowout.
Response Time Comparison
| Metric | Direct Insert | Queue-Based | Improvement |
|---|---|---|---|
| Average | 487ms | 8ms | 60x faster |
| P50 (median) | 392ms | 6ms | 65x faster |
| P95 | 1,245ms | 18ms | 69x faster |
| P99 | 2,891ms | 34ms | 85x faster |
| Maximum | 8,432ms | 127ms | 66x faster |
Success Rate Comparison
| Approach | Successful | Failed | Success Rate |
|---|---|---|---|
| Direct Insert | 12,459 | 4,831 | 72.1% |
| Queue-Based | 17,234 | 56 | 99.7% |
With direct insert, 28% of exam submissions failed. In a real exam scenario, that's potentially thousands of students who need to resubmit, file complaints, or worse — lose their work entirely.
Database Resource Usage
| Metric | Direct Insert | Queue-Based |
|---|---|---|
| DB Connections | 98-100 (maxed out) | 12-18 |
| CPU Usage | 95-100% | 35-60% |
| Query Queue | Backed up | Smooth |
The direct approach pushed the database to its absolute limit. The queue approach? The database barely noticed.
Why Did Direct Insert Fail?
Analyzing the 28% failure rate, here's the breakdown:
| Error Type | Percentage | What Happened |
|---|---|---|
| Connection Timeout | 52% | Connection pool exhausted, requests waited > 10s |
| Database Deadlock | 23% | Too many concurrent writes, lock contention |
| Connection Refused | 18% | Max connections reached, new ones rejected |
| Context Canceled | 7% | Users gave up waiting |
The connection pool was the bottleneck. No matter how fast your database is, it can only handle so many concurrent connections.
The User Experience Difference
Direct Insert: The Frustrating Experience
User clicks "Submit"
→ Spinner...
→ Still waiting...
→ 3 seconds pass...
→ "Connection timeout" error
(28% of the time)
Average wait: 487ms (but often 1-3 seconds, sometimes 8+ seconds)
Queue-Based: The Smooth Experience
User clicks "Submit"
→ "Submission received!"
(Processing happens in background)
(User continues immediately)
Average wait: 8ms (consistently fast)
What This Costs in Practice
My original setup for ~1,000 users:
- Database: 4 CPU, 8GB RAM
- App Server: 2 CPU, 8GB RAM
- Monthly cost: around $200-300
- Stress level during exams: high
The queue-based test with 10,000 users:
- Database: 1 CPU, 512MB RAM
- Everything else: minimal
- Success rate: 99.7%
- The database didn't even notice the load
That's 10x the users with way less hardware.
Business Impact Comparison
| Aspect | Direct Insert | Queue-Based |
|---|---|---|
| User Errors | 28% see failures | 0.3% see failures |
| Infrastructure | Need bigger DB for spikes | Smaller DB handles more |
| Exam Day Stress | Engineers on-call | System stays calm |
| Scalability | Linear cost increase | Efficient scaling |
When to Use What
Queues aren't always the answer.
Direct insert works fine when traffic is low and predictable, you need data in the database immediately (like real-time dashboards), or you're doing mostly reads with occasional writes.
Queues make sense when you have traffic spikes, lots of concurrent writes, users who care about response time, or you can't afford to lose data. The queue gives you a buffer — it absorbs the spike and lets workers process at whatever pace the database can handle.
Think exam submissions, checkout during sales, ticket purchases, push notifications — anything where everyone hits the system at once.
How the Queue Architecture Works
Here's how it works under the hood:
1. API Layer (Fast Response)
The API receives the exam submission and publishes it to NATS JetStream. No database call here — just a message publish (~2ms).
// Sketch: Fiber handler publishing via a nats.go JetStream context `js`
func submitExam(c *fiber.Ctx) error {
	var submission Submission
	if err := c.BodyParser(&submission); err != nil {
		return fiber.ErrBadRequest
	}
	payload, err := json.Marshal(submission)
	if err != nil {
		return fiber.ErrBadRequest
	}
	// Publish to the queue (durable, ~2ms); no database call here
	if _, err := js.Publish("exam.submissions", payload); err != nil {
		return fiber.ErrServiceUnavailable
	}
	// Respond to the user immediately
	return c.JSON(fiber.Map{"status": "accepted", "id": submission.ID})
}
2. Message Queue (Shock Absorber)
NATS JetStream stores messages durably. Even if workers are slow, messages aren't lost. The queue absorbs traffic spikes and releases them at a steady pace.
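With the nats.go client, that durability comes from declaring a stream backed by file storage. A minimal configuration might look like the following; the stream name and retention policy are my own choices, not taken from the test repo:

// Assumes an established *nats.Conn named nc.
js, err := nc.JetStream()
if err != nil {
	log.Fatal(err)
}
_, err = js.AddStream(&nats.StreamConfig{
	Name:      "EXAMS",
	Subjects:  []string{"exam.submissions"},
	Storage:   nats.FileStorage,      // survive broker restarts
	Retention: nats.WorkQueuePolicy,  // delete each message once acked
})

With WorkQueuePolicy, a message stays on disk until a worker acknowledges it, which is exactly the "slow workers can't lose data" guarantee described above.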
3. Workers (Controlled Processing)
10 concurrent workers pull messages from the queue. Each worker processes in batches of 100 records — reducing database round trips.
// Sketch: JetStream pull subscription `sub`, DB handle `db`
func worker(sub *nats.Subscription, db *pgxpool.Pool) {
	for {
		// Pull up to 100 messages per round trip
		msgs, err := sub.Fetch(100, nats.MaxWait(2*time.Second))
		if err != nil {
			continue // timeout or transient error; just retry
		}
		submissions := parseAll(msgs)
		// Single batched INSERT for the whole chunk
		if err := batchInsert(db, submissions); err != nil {
			continue // leave unacked; JetStream redelivers
		}
		for _, msg := range msgs {
			msg.Ack()
		}
	}
}
4. Database (Happy and Healthy)
Instead of 3,000 concurrent connections, the database sees a steady stream from 10 workers. CPU stays at 35-60%. No deadlocks. No timeouts.
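Batching is what keeps the round trips low: instead of 100 single-row INSERTs, each worker issues one multi-row statement. Here's a sketch of building the numbered-placeholder SQL; the helper name is mine, not from the repo:

```go
package main

import (
	"fmt"
	"strings"
)

// batchInsertSQL builds one multi-row INSERT with numbered
// placeholders ($1, $2, ...) for n rows of the given columns.
func batchInsertSQL(table string, cols []string, n int) string {
	rows := make([]string, n)
	arg := 1
	for i := 0; i < n; i++ {
		ph := make([]string, len(cols))
		for j := range cols {
			ph[j] = fmt.Sprintf("$%d", arg)
			arg++
		}
		rows[i] = "(" + strings.Join(ph, ", ") + ")"
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES %s",
		table, strings.Join(cols, ", "), strings.Join(rows, ", "))
}

func main() {
	fmt.Println(batchInsertSQL("submissions", []string{"student_id", "answer"}, 2))
	// INSERT INTO submissions (student_id, answer) VALUES ($1, $2), ($3, $4)
}
```

One statement per 100 records means the database does one parse, one plan, and one commit per batch instead of 100 of each.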
What I Learned
Connection pools have hard limits. Doesn't matter how fast your database is — 3,000 requests can't share 100 connections without waiting. I learned this the hard way.
Queues trade immediate consistency for reliability. The user gets a fast "accepted" response, but the data isn't in the database yet. For most write-heavy workloads, that tradeoff is worth it.
The performance difference is huge. Not 2x or 3x — we're talking 60-85x faster response times.
Queues handle spikes gracefully. During a traffic spike, the queue just grows. Nothing crashes. When traffic drops, workers catch up. No 2 AM pages.
Architecture matters more than hardware. My 1,000-user system needed 4 CPU and 8GB RAM. This queue-based test handled 10,000 users on 1 CPU and 512MB.
Wish I'd known this earlier. Could've saved money and stress.
Wrapping Up
Queues aren't always the answer. But for traffic spikes and heavy writes? Worth considering.
Looking back at my exam systems with their beefy servers, I wish I'd tried this sooner. Usually you trade performance for cost — this gave me both.
Next time you're building something write-heavy, ask yourself: what happens when everyone hits it at once?
Code is on GitHub: jufianto/blog-resource/exam-app-queue
Have you dealt with database bottlenecks under load? I'd like to hear about it.

