DEV Community

Darshan Raval


🚀 How We Reduced API Response Time by 60% in Node.js Using Caching, Query Optimization & Performance Profiling

Performance issues in backend systems rarely come from one obvious bug.

In our production system, a high-traffic API showed:

  • 📉 P95 latency ~820ms
  • 📈 Frequent database CPU spikes
  • ⛔ Increased timeout rates during traffic bursts

After a structured optimization process:

  • ⚡ P95 latency: 820ms → 310ms
  • 🧠 Database CPU usage: reduced by ~40%
  • 📊 Timeout rate: reduced by ~70%

Here's the exact breakdown of what we did.


1๏ธโƒฃ Step One: Measure Before Optimizing

Before touching any code, we collected:

  • P50 / P95 / P99 latency
  • Slow query logs
  • DB execution plans
  • CPU & memory metrics
  • Event loop lag
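
Of these, event loop lag is the easiest to overlook. One minimal way to sample it is a repeating timer that measures how much later than expected it fires (the 100ms interval and handler below are our own illustration, not from any library):

```javascript
// Sample event loop lag: schedule a repeating timer and measure
// how much later than expected it actually fires.
function monitorEventLoopLag(intervalMs = 100, onLag = console.log) {
  let last = process.hrtime.bigint();

  const timer = setInterval(() => {
    const now = process.hrtime.bigint();
    const lagMs = Number(now - last) / 1_000_000 - intervalMs;
    last = now;
    onLag(`event loop lag: ${Math.max(0, lagMs).toFixed(1)}ms`);
  }, intervalMs);

  timer.unref(); // don't keep the process alive just for monitoring
  return timer;
}
```

Node also ships `perf_hooks.monitorEventLoopDelay` if you want a histogram-based version of the same idea.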

We added a lightweight response-time logger:

// Log response time for every request using a monotonic clock
app.use((req, res, next) => {
  const start = process.hrtime.bigint();

  res.on("finish", () => {
    // Convert nanoseconds to milliseconds
    const duration = Number(process.hrtime.bigint() - start) / 1_000_000;

    console.log(`${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
  });

  next();
});

🔎 Observation

  • ~68% of latency was database time
  • ~20% was repeated identical queries
  • The remainder was serialization + network overhead

Conclusion:
🧩 Primary bottleneck = database layer.


2๏ธโƒฃ Query Optimization (Biggest Impact ๐Ÿ”ฅ)

โŒ Problem 1: Missing Composite Index

Original query:

SELECT * 
FROM orders
WHERE user_id = $1
AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;

Execution plan showed:

  • Sequential scan
  • High disk I/O
  • Large row filtering

✅ Fix: Composite Index

CREATE INDEX idx_orders_user_status_created
ON orders(user_id, status, created_at DESC);

📊 Result

  • Query time: 180ms → 35ms
  • Removed full table scan
  • Reduced disk reads significantly

Indexing alone reduced endpoint latency by ~25%.


โŒ Problem 2: Over-Fetching Data

Original:

SELECT * FROM users WHERE id = $1;

But the API only needed:

  • name
  • profile_picture

✅ Optimized Query

SELECT name, profile_picture
FROM users
WHERE id = $1;

🎯 Impact

  • Reduced payload size
  • Lower memory allocation
  • Faster JSON serialization

Small change. Measurable gain.


โŒ Problem 3: N+1 Query Pattern

Original logic:

const orders = await getOrders(userId);

for (const order of orders) {
  order.items = await getItems(order.id);
}

With 20 orders → 21 queries: one for the orders, plus one per order for its items.

✅ Optimized Join Query

SELECT o.id, o.created_at, i.product_id, i.quantity
FROM orders o
LEFT JOIN order_items i ON i.order_id = o.id
WHERE o.user_id = $1;


📈 Result

  • Drastically reduced DB round trips
  • Improved P95 latency stability
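
The joined rows come back flat, so they still need to be regrouped into orders with a nested items array on the application side. A sketch, with field names matching the query above:

```javascript
// Regroup flat LEFT JOIN rows into orders with nested items.
function groupOrderRows(rows) {
  const byId = new Map();

  for (const row of rows) {
    let order = byId.get(row.id);
    if (!order) {
      order = { id: row.id, created_at: row.created_at, items: [] };
      byId.set(order.id, order);
    }
    // LEFT JOIN: product_id is null when an order has no items.
    if (row.product_id !== null) {
      order.items.push({ product_id: row.product_id, quantity: row.quantity });
    }
  }

  return [...byId.values()];
}
```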


3๏ธโƒฃ Introducing Multi-Layer Caching ๐Ÿง 

Important principle:

โš ๏ธ Do NOT cache blindly.
Cache only:

  • Read-heavy endpoints
  • Expensive aggregations
  • Low-volatility data

🟢 Layer 1: In-Memory Cache (Short TTL)

Used for ultra-hot endpoints.

import NodeCache from "node-cache";

// In-process cache; entries expire after 30 seconds
const cache = new NodeCache({ stdTTL: 30 });

async function getUserProfile(userId) {
  const key = `user:${userId}`;

  // Serve from memory when possible
  const cached = cache.get(key);
  if (cached) return cached;

  const data = await fetchFromDB(userId);
  cache.set(key, data);

  return data;
}

Use case:

  • Frequently accessed profile endpoints
  • Dashboard metadata

Latency improvement: ~10–15%

🟡 Layer 2: Redis Distributed Cache

Used for:

  • Aggregated stats
  • Leaderboards
  • Expensive computations

async function getDashboardStats(userId) {
  const key = `dashboard:${userId}`;

  // Serve from Redis when available
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const data = await computeStats(userId);

  // Cache for 120 seconds ("EX" sets the expiry in seconds)
  await redis.set(key, JSON.stringify(data), "EX", 120);

  return data;
}

📊 Impact

  • Significant DB load reduction
  • Improved P95 & P99 latency
  • Reduced spike sensitivity

4๏ธโƒฃ Cache Invalidation Strategy โš ๏ธ

Caching without invalidation creates stale data problems.

We used:

  • Event-based invalidation after writes
  • Short TTL for volatile data
  • Versioned keys when necessary

Example:

await redis.del(`dashboard:${userId}`);

Triggered immediately after order creation.
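
Versioned keys (the third strategy above) can be sketched like this: instead of deleting entries, bump a version counter so old keys are simply never read again and fall off via TTL. Key names here are hypothetical; `redis` is any client exposing `get`/`incr`:

```javascript
// Versioned cache keys: invalidation = incrementing a version counter.
async function dashboardKey(redis, userId) {
  const version = (await redis.get(`dashboard:ver:${userId}`)) ?? 0;
  return `dashboard:${userId}:v${version}`;
}

async function invalidateDashboard(redis, userId) {
  // Old `dashboard:<id>:v<n>` entries become unreachable and expire via TTL.
  await redis.incr(`dashboard:ver:${userId}`);
}
```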

We avoided:

  • Long-lived static caches
  • Global flushes
  • Blind TTL-only approaches

5๏ธโƒฃ Connection Pool Optimization ๐Ÿ”Œ

We observed:

  • DB pool exhaustion during spikes
  • Increased wait time for connections

Original pool size: 10
Optimized to: 25 (after validating DB capacity)
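
For reference, the shape of the pool settings (node-postgres option names; only `max: 25` is the change described here — the two timeout values are illustrative):

```javascript
// Pool settings after tuning — passed to new Pool(...) from the "pg" package.
const poolConfig = {
  max: 25,                        // raised from 10 after validating DB capacity
  idleTimeoutMillis: 30_000,      // recycle idle clients
  connectionTimeoutMillis: 2_000, // fail fast instead of queuing indefinitely
};
```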

๐Ÿ“ˆ Result

  • Reduced queuing delay
  • Stabilized latency under burst traffic

6๏ธโƒฃ JSON Serialization Optimization โšก

Large nested objects increased serialization cost.

Instead of returning deeply populated objects, we:

  • Reduced unnecessary fields
  • Flattened response structure
  • Avoided over-population
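
A sketch of what that response shaping looks like in practice (field names here are hypothetical):

```javascript
// Shape a deeply populated order into a flat, minimal response object.
function toOrderResponse(order) {
  return {
    id: order.id,
    createdAt: order.created_at,
    // Flatten the nested user object down to the two fields the client renders.
    userName: order.user.name,
    userAvatar: order.user.profile_picture,
    itemCount: order.items.length,
  };
}
```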

Serialization overhead dropped ~8–10%.


7๏ธโƒฃ Final Optimized Flow ๐Ÿ—๏ธ

Client
  ↓
Load Balancer
  ↓
Node.js (Clustered)
  ↓
In-Memory Cache
  ↓
Redis
  ↓
Optimized Indexed Queries
  ↓
Database
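
The flow above can be sketched as a single read-through helper: memory first, then Redis, then the database. Names are hypothetical; `memCache` and `redis` stand in for the clients from the earlier sections, injected here so the helper stays testable:

```javascript
// Read-through across the layers: memory → Redis → database.
async function readThrough({ memCache, redis, queryDb }, key, ttlSeconds) {
  const hot = memCache.get(key);
  if (hot !== undefined) return hot;          // Layer 1 hit

  const warm = await redis.get(key);
  if (warm !== null) {
    const parsed = JSON.parse(warm);
    memCache.set(key, parsed);                // backfill Layer 1
    return parsed;
  }

  const fresh = await queryDb(key);           // indexed query, last resort
  await redis.set(key, JSON.stringify(fresh), "EX", ttlSeconds);
  memCache.set(key, fresh);
  return fresh;
}
```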

Thanks :)
