Performance issues in backend systems rarely come from one obvious bug.
In our production system, a high-traffic API showed:
- P95 latency: ~820ms
- Frequent database CPU spikes
- Increased timeout rates during traffic bursts
After a structured optimization process, we achieved:
- P95 latency: 820ms → 310ms
- Database CPU usage reduced by ~40%
- Timeout rate reduced by ~70%
Here's the exact breakdown of what we did.
Step 1: Measure Before Optimizing
Before touching any code, we collected:
- P50 / P95 / P99 latency
- Slow query logs
- DB execution plans
- CPU & memory metrics
- Event loop lag
We added a lightweight response-time logger:
app.use((req, res, next) => {
  const start = process.hrtime.bigint();
  res.on("finish", () => {
    const duration = Number(process.hrtime.bigint() - start) / 1_000_000;
    console.log(`${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
  });
  next();
});
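We also sampled event loop lag with Node's built-in perf_hooks histogram; a minimal sketch (the 10-second reporting window and 20ms resolution are arbitrary choices):
import { monitorEventLoopDelay } from "perf_hooks";
// Built-in histogram of event-loop delay; values are reported in nanoseconds.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();
setInterval(() => {
  // Report the p99 lag for the window, then reset it.
  const p99Ms = histogram.percentile(99) / 1e6;
  console.log(`event loop lag p99: ${p99Ms.toFixed(2)}ms`);
  histogram.reset();
}, 10_000);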
Observation
- ~68% of latency was database time
- ~20% was repeated identical queries
- The rest was serialization + network overhead
Conclusion:
Primary bottleneck = the database layer.
Step 2: Query Optimization (Biggest Impact)
Problem 1: Missing Composite Index
Original query:
SELECT *
FROM orders
WHERE user_id = $1
AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;
Execution plan showed:
- Sequential scan
- High disk I/O
- Large row filtering
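For reference, the plan can be pulled straight from Node with node-postgres; a sketch (the connection string is an assumption, and a literal user_id stands in for the $1 bind parameter):
import pg from "pg";
const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
async function explainOrderQuery() {
  const { rows } = await pool.query(
    `EXPLAIN (ANALYZE, BUFFERS)
     SELECT * FROM orders
     WHERE user_id = 42 -- literal stands in for the $1 bind parameter
       AND status = 'completed'
     ORDER BY created_at DESC
     LIMIT 20`
  );
  // Each result row holds one plan line under the "QUERY PLAN" column.
  for (const row of rows) console.log(row["QUERY PLAN"]);
}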
Fix: Composite Index
CREATE INDEX idx_orders_user_status_created
ON orders(user_id, status, created_at DESC);
Result
- Query time: 180ms → 35ms
- Removed the full table scan
- Reduced disk reads significantly
Indexing alone reduced endpoint latency by ~25%.
Problem 2: Over-Fetching Data
Original:
SELECT * FROM users WHERE id = $1;
But the API only needed:
- name
- profile_picture
Optimized Query
SELECT name, profile_picture
FROM users
WHERE id = $1;
Impact
- Reduced payload size
- Lower memory allocation
- Faster JSON serialization
Small change. Measurable gain.
Problem 3: N+1 Query Pattern
Original logic:
const orders = await getOrders(userId);
for (const order of orders) {
  order.items = await getItems(order.id);
}
For 20 orders, that's 21 queries (1 + N).
Optimized Join Query
SELECT o.id, o.created_at, i.product_id, i.quantity
FROM orders o
LEFT JOIN order_items i ON i.order_id = o.id
WHERE o.user_id = $1;
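The single query returns flat rows, so the application regroups them into orders; a minimal sketch (column names follow the SELECT above):
// Regroup flat join rows into orders with nested item arrays.
function groupOrders(rows) {
  const byId = new Map();
  for (const row of rows) {
    if (!byId.has(row.id)) {
      byId.set(row.id, { id: row.id, created_at: row.created_at, items: [] });
    }
    // product_id is null for orders with no items (LEFT JOIN).
    if (row.product_id !== null) {
      byId.get(row.id).items.push({
        product_id: row.product_id,
        quantity: row.quantity,
      });
    }
  }
  return [...byId.values()];
}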
Result
- Drastically reduced DB round trips
- Improved P95 latency stability
Step 3: Introducing Multi-Layer Caching
Important principle: do NOT cache blindly.
Cache only:
- Read-heavy endpoints
- Expensive aggregations
- Low-volatility data
Layer 1: In-Memory Cache (Short TTL)
Used for ultra-hot endpoints.
import NodeCache from "node-cache";
const cache = new NodeCache({ stdTTL: 30 }); // default TTL: 30 seconds
async function getUserProfile(userId) {
  const key = `user:${userId}`;
  const cached = cache.get(key);
  if (cached) return cached;
  const data = await fetchFromDB(userId);
  cache.set(key, data);
  return data;
}
Use case:
- Frequently accessed profile endpoints
- Dashboard metadata
Latency improvement: ~10–15%
Layer 2: Redis Distributed Cache
Used for:
- Aggregated stats
- Leaderboards
- Expensive computations
import Redis from "ioredis"; // client assumed from the set(key, value, "EX", ttl) call style below
const redis = new Redis();
async function getDashboardStats(userId) {
  const key = `dashboard:${userId}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);
  const data = await computeStats(userId);
  await redis.set(key, JSON.stringify(data), "EX", 120); // expire after 120s
  return data;
}
Impact
- Significant DB load reduction
- Improved P95 & P99 latency
- Reduced spike sensitivity
Step 4: Cache Invalidation Strategy
Caching without invalidation creates stale data problems.
We used:
- Event-based invalidation after writes
- Short TTL for volatile data
- Versioned keys when necessary
Example:
await redis.del(`dashboard:${userId}`);
Triggered immediately after order creation.
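Where explicit deletes get awkward, versioned keys do the same job; a sketch reusing the Redis client and computeStats from above (the version-counter key name is hypothetical):
// Bump a version counter on write instead of deleting cache entries;
// old entries become unreachable and expire via their TTL.
async function invalidateDashboard(userId) {
  await redis.incr(`dashboard:ver:${userId}`);
}
async function getDashboardStatsVersioned(userId) {
  const version = (await redis.get(`dashboard:ver:${userId}`)) ?? "0";
  const key = `dashboard:v${version}:${userId}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);
  const data = await computeStats(userId);
  await redis.set(key, JSON.stringify(data), "EX", 120);
  return data;
}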
We avoided:
- Long-lived static caches
- Global flushes
- Blind TTL-only approaches
Step 5: Connection Pool Optimization
We observed:
- DB pool exhaustion during spikes
- Increased wait time for connections
Original pool size: 10
Optimized to: 25 (after validating DB capacity)
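In node-postgres terms the change looks like this (a sketch; only max reflects our validated number, the timeout values are assumptions):
import pg from "pg";
const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL,
  max: 25, // was 10; raised only after checking DB headroom
  idleTimeoutMillis: 30_000, // recycle idle connections
  connectionTimeoutMillis: 2_000, // fail fast instead of queuing forever
});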
Result
- Reduced queuing delay
- Stabilized latency under burst traffic
Step 6: JSON Serialization Optimization
Large nested objects increased serialization cost.
Instead of returning deeply populated objects, we:
- Reduced unnecessary fields
- Flattened response structure
- Avoided over-population
Serialization overhead dropped ~8–10%.
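The pattern, roughly (a hypothetical mapper; field names are illustrative):
// Return a flat summary instead of the deeply populated object.
function toOrderSummary(order) {
  return {
    id: order.id,
    createdAt: order.created_at,
    userName: order.user?.name, // flatten the nested user object
    itemCount: order.items.length,
  };
}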
Step 7: Final Optimized Flow
Client
  ↓
Load Balancer
  ↓
Node.js (Clustered)
  ↓
In-Memory Cache
  ↓
Redis
  ↓
Optimized Indexed Queries
  ↓
Database
Thanks for reading!