Jufianto Henri
Queue vs Direct Insert: What Happens When 10,000 Users Hit Your Database at Once

I've run online exam systems before.

For a middle school with around 1,000 students, I needed a database server with 4 CPUs and 8GB RAM, plus an app server with 2 CPUs and 8GB RAM. That's a lot of hardware just for 1,000 users.

The system worked, but I kept wondering — what if I had 10,000 students? Would I need 10x the resources?

So I built two versions of an exam submission system and threw 10,000 concurrent users at them. One used direct database inserts. One used a message queue.

Direct insert failed 28% of requests. Queue-based succeeded 99.7% of the time — using way less hardware.


The Problem

If you've run an online exam system, you know how this goes. Testing with 10 users works perfectly. QA with 50 concurrent users is fine. Then exam day hits with 1,000+ students and you're just hoping the server survives.

Running exams for around 1,000 middle school students taught me that you need serious hardware just to handle everyone hitting the system at the same time. They all log in together. They all submit together. And when it lags, they all refresh together.

Here's what actually happened during one of our exams:

The moment the exam started, all 1,000 students tried to log in simultaneously. But it wasn't just them — we had no idea who else was hitting the system. Curious parents checking if the portal was up? Random people who found the URL? Someone trying to probe for vulnerabilities? All of it piled on top of our legitimate traffic.

The result? CPU spiked to 80%. RAM hit 80%. The login requests were stacking up faster than the server could process them.

I had to make an embarrassing call: ask teachers to stagger logins by class, waiting 10-15 minutes between groups. Imagine telling anxious students "please wait, the system can't handle everyone at once." Not a great look.

And that was with "only" 1,000 users on a 4 CPU / 8GB database server.

But here's what kept me up at night: What happens at 10,000 users?

At that scale, your API receives 3,000 requests per second. A typical PostgreSQL connection pool maxes out at 100 connections.

Do the math: 3,000 requests per second competing for 100 database connections.
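That back-of-envelope math is worth making explicit. A tiny Go sketch, assuming a 50ms average insert latency under contention (a made-up but realistic figure, not measured):

```go
package main

import "fmt"

// poolThroughput estimates how many requests per second a fixed-size
// connection pool can serve if each query holds a connection for avgQueryMS.
func poolThroughput(poolSize, avgQueryMS int) int {
	perConn := 1000 / avgQueryMS // queries one connection serves per second
	return poolSize * perConn
}

func main() {
	const incomingRPS = 3000
	capacity := poolThroughput(100, 50) // 100 connections, 50ms per insert
	fmt.Printf("capacity: %d req/s, backlog growth: %d req/s\n",
		capacity, incomingRPS-capacity)
	// capacity: 2000 req/s, backlog growth: 1000 req/s
}
```

With those assumptions the pool tops out at 2,000 requests per second, so a 3,000 req/s spike grows the backlog by 1,000 requests every second until timeouts start.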

I had to see this for myself.


The Test Setup

I built two identical exam submission APIs in Go:

| Component | Direct Insert (Port 8080) | Queue-Based (Port 8081) |
| --- | --- | --- |
| API Server | Go + Fiber | Go + Fiber |
| Database | PostgreSQL (1 CPU, 512MB) | PostgreSQL (1 CPU, 512MB) |
| Message Queue | None | NATS JetStream |
| Workers | None | 10 concurrent workers |

Notice the database specs: 1 CPU, 512MB RAM. My real exam system needed 4 CPUs and 8GB RAM for just 1,000 students. I wanted to see if architecture choices could make up for limited hardware.

Load Test Configuration

  • Concurrent Users: 10,000 students
  • Peak Load: 3,000 requests/second
  • Duration: 5 minutes
  • Tool: k6 load testing

Same traffic. Same database. Two different architectures.


Approach 1: Direct Database Insert

The straightforward approach. User submits → API writes to database → Return response.

Direct Database Insert

3,000 requests fighting for 100 database connections. The other 2,900? They wait. And wait. And eventually... timeout.


Approach 2: Queue-Based Architecture

The API doesn't talk to the database at all. It just publishes a message to a queue and responds immediately. Workers consume messages in the background at whatever pace the database can handle.

Queue-Based Architecture

The queue acts as a shock absorber. Traffic spikes hit the queue, not your database.


Results

I expected the queue to win. I didn't expect it to be a blowout.

Response Time Comparison

| Metric | Direct Insert | Queue-Based | Improvement |
| --- | --- | --- | --- |
| Average | 487ms | 8ms | 60x faster |
| P50 (median) | 392ms | 6ms | 65x faster |
| P95 | 1,245ms | 18ms | 69x faster |
| P99 | 2,891ms | 34ms | 85x faster |
| Maximum | 8,432ms | 127ms | 66x faster |

Success Rate Comparison

| Approach | Successful | Failed | Success Rate |
| --- | --- | --- | --- |
| Direct Insert | 12,459 | 4,831 | 72.1% |
| Queue-Based | 17,234 | 56 | 99.7% |

With direct insert, 28% of exam submissions failed. In a real exam scenario, that's potentially thousands of students who need to resubmit, file complaints, or worse — lose their work entirely.

Database Resource Usage

| Metric | Direct Insert | Queue-Based |
| --- | --- | --- |
| DB Connections | 98-100 (maxed out) | 12-18 |
| CPU Usage | 95-100% | 35-60% |
| Query Queue | Backed up | Smooth |

The direct approach pushed the database to its absolute limit. The queue approach? The database barely noticed.


Why Did Direct Insert Fail?

Breaking down the 28% failure rate, here's where the errors came from:

| Error Type | Percentage | What Happened |
| --- | --- | --- |
| Connection Timeout | 52% | Connection pool exhausted, requests waited > 10s |
| Database Deadlock | 23% | Too many concurrent writes, lock contention |
| Connection Refused | 18% | Max connections reached, new ones rejected |
| Context Canceled | 7% | Users gave up waiting |

The connection pool was the bottleneck. No matter how fast your database is, it can only handle so many concurrent connections.


The User Experience Difference

Direct Insert: The Frustrating Experience

User clicks "Submit" 
    → Spinner...
    → Still waiting...
    → 3 seconds pass...
    → "Connection timeout" error

(28% of the time)

Average wait: 487ms (but often 1-3 seconds, sometimes 8+ seconds)

Queue-Based: The Smooth Experience

User clicks "Submit"
    → "Submission received!"

(Processing happens in background)
(User continues immediately)

Average wait: 8ms (consistently fast)


What This Costs in Practice

My original setup for ~1,000 users:

  • Database: 4 CPU, 8GB RAM
  • App Server: 2 CPU, 8GB RAM
  • Monthly cost: around $200-300
  • Stress level during exams: high

The queue-based test with 10,000 users:

  • Database: 1 CPU, 512MB RAM
  • Everything else: minimal
  • Success rate: 99.7%
  • The database didn't even notice the load

That's 10x the users with way less hardware.

Business Impact Comparison

| Aspect | Direct Insert | Queue-Based |
| --- | --- | --- |
| User Errors | 28% see failures | 0.3% see failures |
| Infrastructure | Need bigger DB for spikes | Smaller DB handles more |
| Exam Day Stress | Engineers on-call | System stays calm |
| Scalability | Linear cost increase | Efficient scaling |

When to Use What

Queues aren't always the answer.

Direct insert works fine when traffic is low and predictable, you need data in the database immediately (like real-time dashboards), or you're doing mostly reads with occasional writes.

Queues make sense when you have traffic spikes, lots of concurrent writes, users who care about response time, or you can't afford to lose data. The queue gives you a buffer — it absorbs the spike and lets workers process at whatever pace the database can handle.

Think exam submissions, checkout during sales, ticket purchases, push notifications — anything where everyone hits the system at once.


How the Queue Architecture Works

Here's how it works under the hood:

1. API Layer (Fast Response)

The API receives the exam submission and publishes it to NATS JetStream. No database call here — just a message publish (~2ms).

// Sketch: assumes a Fiber app, a Submission struct, and a
// JetStream context (js) created at startup.
func submitExam(c *fiber.Ctx) error {
    var submission Submission
    if err := c.BodyParser(&submission); err != nil {
        return c.Status(fiber.StatusBadRequest).SendString("invalid payload")
    }

    // Publish to the queue (~2ms); no database call here
    data, err := json.Marshal(submission)
    if err != nil {
        return err
    }
    if _, err := js.Publish("exam.submissions", data); err != nil {
        return c.Status(fiber.StatusServiceUnavailable).SendString("queue unavailable")
    }

    // Immediately respond to the user
    return c.JSON(fiber.Map{"status": "accepted", "id": submission.ID})
}

2. Message Queue (Shock Absorber)

NATS JetStream stores messages durably. Even if workers are slow, messages aren't lost. The queue absorbs traffic spikes and releases them at a steady pace.
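Durability comes from how the stream is declared. A minimal setup sketch using the nats.go client; the stream name, retention policy, and connection URL here are my assumptions, not taken from the benchmark code:

```go
// Config sketch: declare a durable, file-backed stream for submissions.
nc, _ := nats.Connect("nats://localhost:4222")
js, _ := nc.JetStream()

_, err := js.AddStream(&nats.StreamConfig{
    Name:      "EXAMS",                       // hypothetical stream name
    Subjects:  []string{"exam.submissions"},
    Storage:   nats.FileStorage,              // messages survive broker restarts
    Retention: nats.WorkQueuePolicy,          // delete each message once acked
})
```

With `WorkQueuePolicy`, a message stays on disk until a worker acknowledges it, so a slow or crashed worker never loses a submission.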

3. Workers (Controlled Processing)

10 concurrent workers pull messages from the queue. Each worker processes in batches of 100 records — reducing database round trips.

// Sketch: assumes a JetStream pull subscription (sub) and a
// BatchInsert helper that writes all records in one statement.
func worker(sub *nats.Subscription) {
    for {
        // Pull up to 100 messages; wait briefly when the queue is quiet
        msgs, err := sub.Fetch(100, nats.MaxWait(2*time.Second))
        if err != nil {
            continue // timeout or transient error, just poll again
        }

        submissions := parseAll(msgs)

        // Single DB call for the whole batch instead of 100 round trips
        if err := db.BatchInsert(submissions); err != nil {
            continue // leave messages unacked so JetStream redelivers them
        }

        for _, msg := range msgs {
            msg.Ack()
        }
    }
}

4. Database (Happy and Healthy)

Instead of 3,000 concurrent connections, the database sees a steady stream from 10 workers. CPU stays at 35-60%. No deadlocks. No timeouts.


What I Learned

Connection pools have hard limits. Doesn't matter how fast your database is — 3,000 requests can't share 100 connections without waiting. I learned this the hard way.

Queues trade immediate consistency for reliability. The user gets a fast "accepted" response, but the data isn't in the database yet. For most write-heavy workloads, that tradeoff is worth it.

The performance difference is huge. Not 2x or 3x — we're talking 60-85x faster response times.

Queues handle spikes gracefully. During a traffic spike, the queue just grows. Nothing crashes. When traffic drops, workers catch up. No 2 AM pages.

Architecture matters more than hardware. My 1,000-user system needed 4 CPU and 8GB RAM. This queue-based test handled 10,000 users on 1 CPU and 512MB.

Wish I'd known this earlier. Could've saved money and stress.


Wrapping Up

Queues aren't always the answer. But for traffic spikes and heavy writes? Worth considering.

Looking back at my exam systems with their beefy servers, I wish I'd tried this sooner. Usually you trade performance for cost — this gave me both.

Next time you're building something write-heavy, ask yourself: what happens when everyone hits it at once?


Code is on GitHub: jufianto/blog-resource/exam-app-queue


Have you dealt with database bottlenecks under load? I'd like to hear about it.
