Dhruv Garg

Originally published at dhruvg.dev

Stop Hammering Your Database: A Guide to Request Coalescing

As we already know, for the majority of products the bottleneck is usually the database rather than the application code. So let’s look at one easy but smart trick to reduce the load on the database and make everything faster: request coalescing. I first heard about it in a blog post by Discord.

Request coalescing is a technique where we make only one call to the DB for all concurrent requests for the same data, and then return the same result to all the callers.
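At its core, it’s just a map of in-flight promises keyed by request identity. Here is a minimal sketch of the idea in TypeScript (the coalesce helper and the fetchFn parameter are illustrative, not from the Discord post):

// A minimal request-coalescing helper: `fetchFn` is a stand-in for any
// async call (a DB query, an HTTP request) whose concurrent duplicates
// we want to collapse into a single in-flight call.
const inFlight = new Map<string, Promise<unknown>>();

function coalesce<T>(key: string, fetchFn: () => Promise<T>): Promise<T> {
  const pending = inFlight.get(key) as Promise<T> | undefined;
  if (pending) return pending; // someone is already fetching this key

  // Clean the entry up whether the fetch resolves or rejects.
  const promise = fetchFn().finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}

Every caller that arrives while the first fetch is pending gets the same promise back, so N concurrent requests for the same key cost exactly one round trip.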

Let’s look at the code and run some benchmarks.

Benchmark Setup:

  • MacBook Pro (M3)
  • Autocannon as the benchmarking tool
  • Redis and Postgres as the databases
  • Fastify server (it’s criminal to use Express nowadays)

Even though I am using Node.js for this experiment, the same pattern can be implemented in any language (Go, for example, ships a ready-made implementation in the x/sync/singleflight package).

Redis Benchmarks

I will be using Redis for this test, so we can see how much of a difference request coalescing makes even for the fastest in-memory store. The effect will be much larger for other databases or IO operations that take longer to process. We will also run the Redis Docker image locally, so that the test is easy to replicate and network latency stays minimal.

Run Redis locally using Docker:

docker run -d --name redis -p 6379:6379 redis

Let’s first take the base case:

import Fastify from "fastify";
import Redis from "ioredis";

const fastify = Fastify({});

const redis = new Redis("redis://127.0.0.1:6379", {
  noDelay: true,
  keepAlive: 100000,
});

fastify.get("/test", async function (request, reply) {
  const res = await redis.get(request.url.slice(1));
  reply.send(res);
});

fastify.listen({ port: 3000 }, function (err, address) {
  if (err) {
    fastify.log.error(err);
    process.exit(1);
  }
  console.log("Server is now listening on", address);
});

When you run the benchmark for this, you will observe:

> node --experimental-strip-types base.ts
(node:87719) ExperimentalWarning: Type Stripping is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Server is now listening on http://[::1]:3000

❯ autocannon -d 10 -c 5000 --renderStatusCodes http://0.0.0.0:3000/test
Running 10s test @ http://0.0.0.0:3000/test
5000 connections

┌─────────┬───────┬───────┬────────┬────────┬──────────┬───────────┬─────────┐
│ Stat    │ 2.5%  │ 50%   │ 97.5%  │ 99%    │ Avg      │ Stdev     │ Max     │
├─────────┼───────┼───────┼────────┼────────┼──────────┼───────────┼─────────┤
│ Latency │ 12 ms │ 92 ms │ 151 ms │ 169 ms │ 92.57 ms │ 118.16 ms │ 9943 ms │
└─────────┴───────┴───────┴────────┴────────┴──────────┴───────────┴─────────┘
┌───────────┬─────────┬─────────┬────────┬─────────┬──────────┬──────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%    │ 97.5%   │ Avg      │ Stdev    │
├───────────┼─────────┼─────────┼────────┼─────────┼──────────┼──────────┤
│ Req/Sec   │ 37,343  │ 37,343  │ 48,735 │ 51,199  │ 47,670.4 │ 3,738.22 │
├───────────┼─────────┼─────────┼────────┼─────────┼──────────┼──────────┤
│ Bytes/Sec │ 20.7 MB │ 20.7 MB │ 27 MB  │ 28.4 MB │ 26.5 MB  │ 2.08 MB  │
└───────────┴─────────┴─────────┴────────┴─────────┴──────────┴──────────┘
┌──────┬────────┐
│ Code │ Count  │
├──────┼────────┤
│ 200  │ 476742 │
└──────┴────────┘

Now, the version with request coalescing:


import Fastify from "fastify";
import Redis from "ioredis";

const fastify = Fastify({});

const redis = new Redis("redis://127.0.0.1:6379", {
  noDelay: true,
  keepAlive: 100000,
});

// One entry per key that currently has a Redis call in flight.
const pendingPromises = new Map<string, Promise<string | null>>();

fastify.get("/test", async function (request, reply) {
  const key = request.url.slice(1);

  // If another request already started this fetch, piggyback on its promise.
  const inFlight = pendingPromises.get(key);
  if (inFlight !== undefined) {
    return reply.send(await inFlight);
  }

  const promise = redis.get(key);
  pendingPromises.set(key, promise);
  try {
    const res = await promise;
    reply.send(res);
  } finally {
    // Clean up even if the call rejects, so later requests don't keep
    // awaiting a stale, failed promise.
    pendingPromises.delete(key);
  }
});

fastify.listen({ port: 3000 }, function (err, address) {
  if (err) {
    fastify.log.error(err);
    process.exit(1);
  }
  console.log("Server is now listening on", address);
});
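Before benchmarking, a quick sanity check can confirm that coalescing actually kicks in. A hypothetical snippet, assuming Node 18+ (built-in fetch) and an ESM context for top-level await:

// Fire 100 concurrent requests for the same key; while the first one is in
// flight, the other 99 piggyback on its promise instead of hitting Redis.
const responses = await Promise.all(
  Array.from({ length: 100 }, () =>
    fetch("http://localhost:3000/test").then((r) => r.text()),
  ),
);
// All callers receive the same payload.
console.log(new Set(responses).size);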

When you run the benchmark for this version, you will observe:


❯ node --experimental-strip-types requestCoalascing.ts
(node:87719) ExperimentalWarning: Type Stripping is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Server is now listening on http://[::1]:3000

❯ autocannon -d 10 -c 5000 --renderStatusCodes http://0.0.0.0:3000/test
Running 10s test @ http://0.0.0.0:3000/test
5000 connections

┌─────────┬───────┬───────┬────────┬────────┬──────────┬──────────┬─────────┐
│ Stat    │ 2.5%  │ 50%   │ 97.5%  │ 99%    │ Avg      │ Stdev    │ Max     │
├─────────┼───────┼───────┼────────┼────────┼──────────┼──────────┼─────────┤
│ Latency │ 30 ms │ 58 ms │ 106 ms │ 119 ms │ 64.46 ms │ 109.4 ms │ 9851 ms │
└─────────┴───────┴───────┴────────┴────────┴──────────┴──────────┴─────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬──────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev    │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│ Req/Sec   │ 66,815  │ 66,815  │ 68,351  │ 71,935  │ 68,960  │ 1,440.79 │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│ Bytes/Sec │ 37.1 MB │ 37.1 MB │ 37.9 MB │ 39.9 MB │ 38.3 MB │ 794 kB   │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴──────────┘
┌──────┬────────┐
│ Code │ Count  │
├──────┼────────┤
│ 200  │ 689591 │
└──────┴────────┘

Results

As we can see, just by adding request coalescing:

  • We were able to process 44% more requests.
  • There was lower load on the Redis server (we will see actual numbers with the Postgres test; a quick way to verify it yourself is shown right after this list).
  • We even improved p99 latency from 169ms to 119ms.
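To see the reduced Redis load directly, you can compare Redis’s own command counters before and after each run. These are standard Redis commands, nothing specific to this setup:

❯ redis-cli config resetstat
❯ redis-cli info commandstats | grep cmdstat_get
cmdstat_get:calls=...,usec=...,usec_per_call=...

In the coalesced run, the calls counter for GET ends up far below the total request count that autocannon reports.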

Postgres Benchmarks

We will run the same test with Postgres as well to see the difference, since it’s one of the most used databases.

Setup:


❯ docker run --name post -e POSTGRES_PASSWORD=password -p 5432:5432 -d postgres:17.6-alpine
❯ psql -h localhost -p 5432 -U postgres


CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    phone_number VARCHAR(20),
    hire_date DATE NOT NULL,
    job_title VARCHAR(50),
    salary NUMERIC(10, 2),
    department_id INTEGER
);

INSERT INTO employees (first_name, last_name, email, phone_number, hire_date, job_title, salary, department_id) VALUES
('John', 'Doe', 'john.doe@example.com', '555-1234', '2020-01-15', 'Software Engineer', 75000.00, 101),
('Jane', 'Smith', 'jane.smith@example.com', '555-5678', '2018-03-20', 'Project Manager', 90000.00, 102),
('Peter', 'Jones', 'peter.jones@example.com', '555-9012', '2021-07-01', 'Data Analyst', 60000.00, 101),
('Alice', 'Brown', 'alice.brown@example.com', '555-3456', '2019-11-10', 'HR Specialist', 65000.00, 103),
('Robert', 'Davis', 'robert.davis@example.com', '555-7890', '2022-04-25', 'Marketing Coordinator', 50000.00, 104);


import Fastify from "fastify";
import postgres from "postgres";

const fastify = Fastify({});

const postgresClient = postgres({
  host: "localhost",
  port: 5432,
  database: "postgres",
  username: "postgres",
  password: "password",
  max: 10, // connection pool size
  prepare: true,
  keep_alive: 60,
});

// One entry per key that currently has a Postgres query in flight.
const pendingPromises = new Map<string, Promise<unknown>>();

fastify.get("/coalesced", async function (request, reply) {
  const key = request.url.slice(1);

  // If another request already started this query, piggyback on its promise.
  const inFlight = pendingPromises.get(key);
  if (inFlight !== undefined) {
    return reply.send(await inFlight);
  }

  const promise = postgresClient`select * from employees`;
  pendingPromises.set(key, promise);
  try {
    const res = await promise;
    reply.send(res);
  } finally {
    // Clean up even if the query rejects.
    pendingPromises.delete(key);
  }
});

// Base case: every request runs its own query.
fastify.get("/base", async function (request, reply) {
  const res = await postgresClient`select * from employees`;
  reply.send(res);
});

fastify.listen({ port: 3000 }, function (err, address) {
  if (err) {
    fastify.log.error(err);
    process.exit(1);
  }
  console.log("Server is now listening on", address);
});

Benchmark Results - Base case


❯ autocannon -d 10 -c 5000 --renderStatusCodes http://0.0.0.0:3000/base
Running 10s test @ http://0.0.0.0:3000/base
5000 connections

┌─────────┬────────┬────────┬────────┬────────┬──────────┬───────────┬─────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg      │ Stdev     │ Max     │
├─────────┼────────┼────────┼────────┼────────┼──────────┼───────────┼─────────┤
│ Latency │ 136 ms │ 156 ms │ 192 ms │ 352 ms │ 163.7 ms │ 162.57 ms │ 9040 ms │
└─────────┴────────┴────────┴────────┴────────┴──────────┴───────────┴─────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬──────────┬──────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg      │ Stdev    │
├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼──────────┤
│ Req/Sec   │ 24,639  │ 24,639  │ 30,623  │ 33,471  │ 30,391.2 │ 2,121.86 │
├───────────┼─────────┼─────────┼─────────┼─────────┼──────────┼──────────┤
│ Bytes/Sec │ 32.3 MB │ 32.3 MB │ 40.1 MB │ 43.9 MB │ 39.8 MB  │ 2.78 MB  │
└───────────┴─────────┴─────────┴─────────┴─────────┴──────────┴──────────┘
┌──────┬────────┐
│ Code │ Count  │
├──────┼────────┤
│ 200  │ 303896 │
└──────┴────────┘

Benchmark Results - Request Coalesced case


❯ autocannon -d 10 -c 5000 --renderStatusCodes http://0.0.0.0:3000/coalesced
Running 10s test @ http://0.0.0.0:3000/coalesced
5000 connections

┌─────────┬───────┬────────┬────────┬────────┬──────────┬──────────┬─────────┐
│ Stat    │ 2.5%  │ 50%    │ 97.5%  │ 99%    │ Avg      │ Stdev    │ Max     │
├─────────┼───────┼────────┼────────┼────────┼──────────┼──────────┼─────────┤
│ Latency │ 99 ms │ 115 ms │ 129 ms │ 237 ms │ 117.2 ms │ 48.96 ms │ 2500 ms │
└─────────┴───────┴────────┴────────┴────────┴──────────┴──────────┴─────────┘
┌───────────┬────────┬────────┬─────────┬─────────┬──────────┬──────────┐
│ Stat      │ 1%     │ 2.5%   │ 50%     │ 97.5%   │ Avg      │ Stdev    │
├───────────┼────────┼────────┼─────────┼─────────┼──────────┼──────────┤
│ Req/Sec   │ 38,143 │ 38,143 │ 41,151  │ 45,439  │ 42,422.4 │ 2,217.76 │
├───────────┼────────┼────────┼─────────┼─────────┼──────────┼──────────┤
│ Bytes/Sec │ 50 MB  │ 50 MB  │ 53.9 MB │ 59.5 MB │ 55.6 MB  │ 2.9 MB   │
└───────────┴────────┴────────┴─────────┴─────────┴──────────┴──────────┘
┌──────┬────────┐
│ Code │ Count  │
├──────┼────────┤
│ 200  │ 424220 │
└──────┴────────┘

Postgres Resource Usage After Request Coalescing

Results

So for Postgres we were able to:

  • Process 40% more requests.
  • Reduce p99 latency from 352ms to 237ms.
  • But the real win is resource usage: note the huge difference between the two cases.
    • These numbers were measured over 30 seconds using the OrbStack Activity Monitor (maybe not the most rigorous method, but good enough to give the idea).

When to Use Request Coalescing?

High-Concurrency, Low-Cardinality Resources

This is the absolute ideal scenario: many users requesting one of a small set of shared resources. With a very small memory footprint on the application server side, we can achieve our goal.

Some of the use cases:

  • Leaderboards: The global or daily leaderboard is the same for everyone (see the sketch after this list).
  • Live game scores or stock tickers: Thousands of users request the same score or price simultaneously.
  • Application setup data: Usually the same for most users, with some variance.
  • Trending products/items: Usually the same for everyone; even when it isn’t, the cardinality stays on the lower side.
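As a concrete sketch of the leaderboard case (the endpoint path and Redis key are hypothetical, and it reuses the coalesce helper sketched earlier):

// One shared key with huge read concurrency: every caller that arrives while
// a fetch is in flight shares a single ZREVRANGE round trip to Redis.
fastify.get("/leaderboard", async function (request, reply) {
  const top = await coalesce("leaderboard:global", () =>
    redis.zrevrange("leaderboard:global", 0, 9, "WITHSCORES"),
  );
  reply.send(top);
});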

Read-Heavy Workloads

Caching works perfectly for read-heavy workloads, and adding a layer of request coalescing in front of it is like caching on steroids: even under concurrent requests, your database does not get bombarded. In use cases where the TTL is very short, request coalescing helps even more; a sketch of this combination follows.
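A sketch of the short-TTL cache plus coalesced refill, again using the hypothetical coalesce helper from earlier; the cache key and the 5-second TTL are illustrative:

fastify.get("/employees", async function (request, reply) {
  // Fast path: serve straight from the short-TTL cache.
  const cached = await redis.get("employees");
  if (cached !== null) return reply.send(cached);

  // Cache miss: all concurrent misses share one Postgres query,
  // and the first one to finish repopulates the cache.
  const rows = await coalesce("employees", async () => {
    const res = await postgresClient`select * from employees`;
    await redis.set("employees", JSON.stringify(res), "EX", 5); // 5s TTL
    return res;
  });
  reply.send(rows);
});

With this in place, even when the cache expires every few seconds, each expiry costs the database a single query no matter how many requests arrive in that window.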

When to Avoid It?

High-Cardinality or Unique Resources

Let’s say we want to fetch some data by user ID, which is unique to each user. In this case request coalescing won’t really help, as the resource cannot be served to any of the other concurrent requests. We would be adding every unique item to memory for no reason; in this scenario there will be a performance degradation and a resource-usage hit instead.
