DEV Community

Darshan Raval


🚀 How We Reduced API Response Time by 60% in Node.js Using Caching, Query Optimization & Performance Profiling

Performance issues in backend systems rarely come from one obvious bug.

In our production system, a high-traffic API showed:

  • 📉 P95 latency ~820ms
  • 📈 Frequent database CPU spikes
  • ⛔ Increased timeout rates during traffic bursts

After a structured optimization process:

  • ⚡ P95 latency: 820ms → 310ms
  • 🧠 Database CPU usage: reduced by ~40%
  • 📊 Timeout rate: reduced by ~70%

Here's the exact breakdown of what we did.


1๏ธโƒฃ Step One: Measure Before Optimizing

Before touching any code, we collected:

  • P50 / P95 / P99 latency
  • Slow query logs
  • DB execution plans
  • CPU & memory metrics
  • Event loop lag
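
Of these, event loop lag is the easiest to overlook. One minimal way to sample it is a repeating timer that measures how much later than expected it fires (the 100ms interval and handler below are our own illustration, not from any library):

```javascript
// Sample event loop lag: schedule a repeating timer and measure
// how much later than expected it actually fires.
function monitorEventLoopLag(intervalMs = 100, onLag = console.log) {
  let last = process.hrtime.bigint();

  const timer = setInterval(() => {
    const now = process.hrtime.bigint();
    const lagMs = Number(now - last) / 1_000_000 - intervalMs;
    last = now;
    onLag(`event loop lag: ${Math.max(0, lagMs).toFixed(1)}ms`);
  }, intervalMs);

  timer.unref(); // don't keep the process alive just for monitoring
  return timer;
}
```

Node also ships `perf_hooks.monitorEventLoopDelay` if you want a histogram-based version of the same idea.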

We added a lightweight response-time logger:

// Log response time for every request using a monotonic clock
app.use((req, res, next) => {
  const start = process.hrtime.bigint();

  res.on("finish", () => {
    // Convert nanoseconds to milliseconds
    const duration = Number(process.hrtime.bigint() - start) / 1_000_000;

    console.log(`${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
  });

  next();
});

🔎 Observation

  • ~68% of latency was database time
  • ~20% was repeated identical queries
  • The remainder was serialization + network overhead

Conclusion:
🧩 Primary bottleneck = database layer.


2๏ธโƒฃ Query Optimization (Biggest Impact ๐Ÿ”ฅ)

โŒ Problem 1: Missing Composite Index

Original query:

SELECT * 
FROM orders
WHERE user_id = $1
AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;

Execution plan showed:

  • Sequential scan
  • High disk I/O
  • Large row filtering

✅ Fix: Composite Index

CREATE INDEX idx_orders_user_status_created
ON orders(user_id, status, created_at DESC);

📊 Result

  • Query time: 180ms → 35ms
  • Removed full table scan
  • Reduced disk reads significantly

Indexing alone reduced endpoint latency by ~25%.


โŒ Problem 2: Over-Fetching Data

Original:

SELECT * FROM users WHERE id = $1;

But the API only needed:

  • name
  • profile_picture

✅ Optimized Query

SELECT name, profile_picture
FROM users
WHERE id = $1;

🎯 Impact

  • Reduced payload size
  • Lower memory allocation
  • Faster JSON serialization

Small change. Measurable gain.


โŒ Problem 3: N+1 Query Pattern

Original logic:

const orders = await getOrders(userId);

for (const order of orders) {
  order.items = await getItems(order.id);
}

With 20 orders → 21 queries: one for the orders, plus one per order for its items.

✅ Optimized Join Query

SELECT o.id, o.created_at, i.product_id, i.quantity
FROM orders o
LEFT JOIN order_items i ON i.order_id = o.id
WHERE o.user_id = $1;


📈 Result

  • Drastically reduced DB round trips
  • Improved P95 latency stability
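
The joined rows come back flat, so they still need to be regrouped into orders with a nested items array on the application side. A sketch, with field names matching the query above:

```javascript
// Regroup flat LEFT JOIN rows into orders with nested items.
function groupOrderRows(rows) {
  const byId = new Map();

  for (const row of rows) {
    let order = byId.get(row.id);
    if (!order) {
      order = { id: row.id, created_at: row.created_at, items: [] };
      byId.set(order.id, order);
    }
    // LEFT JOIN: product_id is null when an order has no items.
    if (row.product_id !== null) {
      order.items.push({ product_id: row.product_id, quantity: row.quantity });
    }
  }

  return [...byId.values()];
}
```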


3๏ธโƒฃ Introducing Multi-Layer Caching ๐Ÿง 

Important principle:

โš ๏ธ Do NOT cache blindly.
Cache only:

  • Read-heavy endpoints
  • Expensive aggregations
  • Low-volatility data

🟢 Layer 1: In-Memory Cache (Short TTL)

Used for ultra-hot endpoints.

import NodeCache from "node-cache";

// In-process cache; entries expire after 30 seconds
const cache = new NodeCache({ stdTTL: 30 });

async function getUserProfile(userId) {
  const key = `user:${userId}`;

  // Serve from memory when possible
  const cached = cache.get(key);
  if (cached) return cached;

  const data = await fetchFromDB(userId);
  cache.set(key, data);

  return data;
}

Use case:

  • Frequently accessed profile endpoints
  • Dashboard metadata

Latency improvement: ~10–15%

🟡 Layer 2: Redis Distributed Cache

Used for:

  • Aggregated stats
  • Leaderboards
  • Expensive computations

async function getDashboardStats(userId) {
  const key = `dashboard:${userId}`;

  // Serve from Redis when available
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const data = await computeStats(userId);

  // Cache for 120 seconds ("EX" sets the expiry in seconds)
  await redis.set(key, JSON.stringify(data), "EX", 120);

  return data;
}

📊 Impact

  • Significant DB load reduction
  • Improved P95 & P99 latency
  • Reduced spike sensitivity

4๏ธโƒฃ Cache Invalidation Strategy โš ๏ธ

Caching without invalidation creates stale data problems.

We used:

  • Event-based invalidation after writes
  • Short TTL for volatile data
  • Versioned keys when necessary

Example:

await redis.del(`dashboard:${userId}`);

Triggered immediately after order creation.
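
Versioned keys (the third strategy above) can be sketched like this: instead of deleting entries, bump a version counter so old keys are simply never read again and fall off via TTL. Key names here are hypothetical; `redis` is any client exposing `get`/`incr`:

```javascript
// Versioned cache keys: invalidation = incrementing a version counter.
async function dashboardKey(redis, userId) {
  const version = (await redis.get(`dashboard:ver:${userId}`)) ?? 0;
  return `dashboard:${userId}:v${version}`;
}

async function invalidateDashboard(redis, userId) {
  // Old `dashboard:<id>:v<n>` entries become unreachable and expire via TTL.
  await redis.incr(`dashboard:ver:${userId}`);
}
```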

We avoided:

  • Long-lived static caches
  • Global flushes
  • Blind TTL-only approaches

5๏ธโƒฃ Connection Pool Optimization ๐Ÿ”Œ

We observed:

  • DB pool exhaustion during spikes
  • Increased wait time for connections

Original pool size: 10
Optimized to: 25 (after validating DB capacity)
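
For reference, the shape of the pool settings (node-postgres option names; only `max: 25` is the change described here — the two timeout values are illustrative):

```javascript
// Pool settings after tuning — passed to new Pool(...) from the "pg" package.
const poolConfig = {
  max: 25,                        // raised from 10 after validating DB capacity
  idleTimeoutMillis: 30_000,      // recycle idle clients
  connectionTimeoutMillis: 2_000, // fail fast instead of queuing indefinitely
};
```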

๐Ÿ“ˆ Result

  • Reduced queuing delay
  • Stabilized latency under burst traffic

6๏ธโƒฃ JSON Serialization Optimization โšก

Large nested objects increased serialization cost.

Instead of returning deeply populated objects, we:

  • Reduced unnecessary fields
  • Flattened response structure
  • Avoided over-population
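
A sketch of what that response shaping looks like in practice (field names here are hypothetical):

```javascript
// Shape a deeply populated order into a flat, minimal response object.
function toOrderResponse(order) {
  return {
    id: order.id,
    createdAt: order.created_at,
    // Flatten the nested user object down to the two fields the client renders.
    userName: order.user.name,
    userAvatar: order.user.profile_picture,
    itemCount: order.items.length,
  };
}
```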

Serialization overhead dropped ~8–10%.


7๏ธโƒฃ Final Optimized Flow ๐Ÿ—๏ธ

Client
  ↓
Load Balancer
  ↓
Node.js (Clustered)
  ↓
In-Memory Cache
  ↓
Redis
  ↓
Optimized Indexed Queries
  ↓
Database
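
The flow above can be sketched as a single read-through helper: memory first, then Redis, then the database. Names are hypothetical; `memCache` and `redis` stand in for the clients from the earlier sections, injected here so the helper stays testable:

```javascript
// Read-through across the layers: memory → Redis → database.
async function readThrough({ memCache, redis, queryDb }, key, ttlSeconds) {
  const hot = memCache.get(key);
  if (hot !== undefined) return hot;          // Layer 1 hit

  const warm = await redis.get(key);
  if (warm !== null) {
    const parsed = JSON.parse(warm);
    memCache.set(key, parsed);                // backfill Layer 1
    return parsed;
  }

  const fresh = await queryDb(key);           // indexed query, last resort
  await redis.set(key, JSON.stringify(fresh), "EX", ttlSeconds);
  memCache.set(key, fresh);
  return fresh;
}
```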

Thanks :)
