TL;DR: Discover the exact backend optimization strategies that reduced API response times from 800ms to 120ms, scaled throughput from 120 req/s to 8,500 req/s, and cut cost per request by over 98% - all while handling 100K+ concurrent users. Real metrics and production-ready patterns included!
Frontend performance means nothing if your backend can't keep up. At 100K+ users, every millisecond of API latency matters. Here's how I transformed my Node.js/Express backend from struggling with hundreds of requests per second to smoothly handling thousands.
The Backend Challenge: Speed, Scale & Reliability
When you scale from 1K to 100K+ users, backend challenges multiply:
- API response times that were acceptable at 800ms become unacceptable
- Single server architecture can't handle the load
- Database connections become the bottleneck
- Memory leaks that were hidden now crash servers
- Error rates spike without proper handling
- Costs skyrocket without optimization
The key insight: You can't just "add more servers" - you need systematic optimization.
Starting Point vs. Results
Before Backend Optimization:
Performance:
├── Avg Response Time: 800ms
├── P95 Response Time: 2,400ms
├── P99 Response Time: 4,500ms
├── Throughput: 120 req/s
└── Error Rate: 2.3%
Infrastructure:
├── Servers: 2 instances
├── Database Connections: Direct
├── Caching: None
└── Load Balancing: Basic
Cost:
└── Monthly: $450/month
After Backend Optimization:
Performance:
├── Avg Response Time: 120ms (85% faster)
├── P95 Response Time: 310ms (87% faster)
├── P99 Response Time: 580ms (87% faster)
├── Throughput: 8,500 req/s (70x increase)
└── Error Rate: 0.08% (96% reduction)
Infrastructure:
├── Servers: Auto-scaling (2-20 instances)
├── Database Connections: Pool + replicas
├── Caching: Redis (87% hit rate)
└── Load Balancing: Advanced with health checks
Cost:
└── Monthly: $680/month (1.5x cost, 70x capacity!)
Cost per request dropped from $0.0031 to $0.000044 - a 98.6% reduction.
Strategy #1: API Response Optimization
The Problem: Slow Endpoints Killing UX
Before: Inefficient data fetching
// ❌ BAD: Multiple sequential database queries
@Get('/api/teams/:teamId/dashboard')
async getTeamDashboard(@Param('teamId') teamId: number): Promise<any> {
// Query 1: Get team info (200ms)
const team = await this.db.query(
'SELECT * FROM teams WHERE id = $1',
[teamId]
);
// Query 2: Get team members (300ms)
const members = await this.db.query(
'SELECT * FROM users WHERE team_id = $1',
[teamId]
);
// Query 3: Get metrics for each member (400ms each!)
const memberMetrics = [];
for (const member of members) {
const metrics = await this.db.query(
'SELECT * FROM metrics WHERE user_id = $1',
[member.id]
);
memberMetrics.push(metrics);
}
// Query 4: Get team stats (250ms)
const stats = await this.db.query(
'SELECT * FROM team_stats WHERE team_id = $1',
[teamId]
);
return { team, members, memberMetrics, stats };
}
// Total time: 200 + 300 + (400 × members) + 250 = 2,000ms+ for 3 members!
After: Optimized with parallel queries and caching
// β
GOOD: Parallel queries with caching
@Get('/api/teams/:teamId/dashboard')
async getTeamDashboard(@Param('teamId') teamId: number): Promise<any> {
const cacheKey = `dashboard:team:${teamId}`;
// Check cache first
const cached = await this.redis.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
// Execute all queries in parallel using Promise.all
const [team, members, metrics, stats] = await Promise.all([
// Query 1: Team info
this.db.query('SELECT * FROM teams WHERE id = $1', [teamId]),
// Query 2: Team members
this.db.query('SELECT * FROM users WHERE team_id = $1', [teamId]),
// Query 3: All metrics in one query using JOIN
this.db.query(`
SELECT m.*, u.name as user_name
FROM metrics m
JOIN users u ON m.user_id = u.id
WHERE u.team_id = $1
`, [teamId]),
// Query 4: Team stats
this.db.query('SELECT * FROM team_stats WHERE team_id = $1', [teamId])
]);
const result = {
team: team.rows[0],
members: members.rows,
metrics: metrics.rows,
stats: stats.rows[0]
};
// Cache for 5 minutes
await this.redis.setex(cacheKey, 300, JSON.stringify(result));
return result;
}
// Total time: max(200, 300, 150, 250) + cache overhead = ~320ms
// With cache hit: ~5ms!
Results:
- Response time: 2,000ms → 320ms (84% faster)
- With cache: 320ms → 5ms (98% faster)
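One caveat with Promise.all: a single failed query rejects the whole dashboard response. Where partial data is acceptable, Promise.allSettled lets the endpoint degrade gracefully instead. Here's a minimal sketch - the settleAll helper and the null-fallback behavior are my additions, not part of the original endpoint:

// Run queries in parallel but tolerate individual failures,
// returning null for any query that rejected
async function settleAll<T>(promises: Promise<T>[]): Promise<(T | null)[]> {
  const results = await Promise.allSettled(promises);
  return results.map(r => (r.status === 'fulfilled' ? r.value : null));
}

// Inside the dashboard handler, required data can still fail fast while
// optional data (e.g., stats) falls back to null:
// const [team, members, metrics, stats] = await settleAll([...same queries...]);
// if (!team) throw new NotFoundException('Team not found');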
Request Batching & Debouncing
// Batch multiple API requests into a single database query
interface BatchRequest {
  url: string;
  params: any;
  resolvers: Array<{ resolve: (value: any) => void; reject: (err: any) => void }>;
}

@Injectable()
export class BatchRequestService {
  private batchQueue: Map<string, BatchRequest> = new Map();
  private batchTimer: NodeJS.Timeout | null = null;
  private readonly BATCH_WINDOW = 50; // ms

  constructor(private db: DatabaseService) {}

  async get(url: string, params: any): Promise<any> {
    return new Promise((resolve, reject) => {
      const key = `${url}:${JSON.stringify(params)}`;
      if (!this.batchQueue.has(key)) {
        this.batchQueue.set(key, { url, params, resolvers: [] });
      }
      this.batchQueue.get(key)!.resolvers.push({ resolve, reject });
      this.scheduleBatch();
    });
  }

  private scheduleBatch(): void {
    if (this.batchTimer) return;
    this.batchTimer = setTimeout(() => {
      this.executeBatch();
    }, this.BATCH_WINDOW);
  }

  private async executeBatch(): Promise<void> {
    const batch = Array.from(this.batchQueue.values());
    this.batchQueue.clear();
    this.batchTimer = null;
    // Group requests by resource type for efficient querying
    const grouped = this.groupRequests(batch);
    for (const [type, requests] of Object.entries(grouped)) {
      try {
        const results = await this.executeBatchQuery(type, requests);
        // Distribute results to the waiting promises
        requests.forEach((req, index) => {
          req.resolvers.forEach(r => r.resolve(results[index]));
        });
      } catch (error) {
        requests.forEach(req => {
          req.resolvers.forEach(r => r.reject(error));
        });
      }
    }
  }

  private groupRequests(batch: BatchRequest[]): Record<string, BatchRequest[]> {
    // Requests against the same resource share one batched query
    return batch.reduce((acc, req) => {
      (acc[req.url] ||= []).push(req);
      return acc;
    }, {} as Record<string, BatchRequest[]>);
  }

  private async executeBatchQuery(type: string, requests: BatchRequest[]): Promise<any[]> {
    // type comes from internal grouping keys, never from raw user input
    const ids = requests.map(r => r.params.id);
    const result = await this.db.query(`SELECT * FROM ${type} WHERE id = ANY($1)`, [ids]);
    // ANY($1) doesn't guarantee row order, so map results back by id
    const rowsById = new Map(result.rows.map((row: any) => [row.id, row]));
    return requests.map(req => rowsById.get(req.params.id));
  }
}
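To make the coalescing concrete, here is a hypothetical consumer (UserLookupService and the users resource are illustrative). Lookups landing inside the same 50ms window produce a single database round trip:

@Injectable()
export class UserLookupService {
  constructor(private batcher: BatchRequestService) {}

  async getUsers(ids: number[]): Promise<any[]> {
    // Each call registers a resolver; the batch window collapses them
    // into one SELECT ... WHERE id = ANY($1) query
    return Promise.all(ids.map(id => this.batcher.get('users', { id })));
  }
}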
Strategy #2: Redis Caching Layer
Multi-Level Caching Strategy
@Injectable()
export class CachedDataService {
// Public (not private) so consumers like TeamMetricsService below can reference the tiers
readonly CACHE_TTL = {
SHORT: 60, // 1 minute - highly dynamic data
MEDIUM: 300, // 5 minutes - semi-static data
LONG: 3600, // 1 hour - rarely changing data
VERY_LONG: 86400 // 24 hours - static reference data
};
constructor(
private redis: RedisClient,
private db: DatabaseService
) {}
async getWithCache<T>(
key: string,
fetchFn: () => Promise<T>,
ttl: number = this.CACHE_TTL.MEDIUM
): Promise<T> {
// Try cache first
const cached = await this.redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Cache miss - fetch from source
const data = await fetchFn();
// Store in cache
await this.redis.setex(key, ttl, JSON.stringify(data));
return data;
}
// Cache with automatic invalidation
async setWithInvalidation(
key: string,
data: any,
relatedKeys: string[] = []
): Promise<void> {
// Invalidate related caches
if (relatedKeys.length > 0) {
await this.redis.del(...relatedKeys);
}
// Persist the update (updateData writes through to the database layer)
await this.updateData(key, data);
}
// Pattern-based cache invalidation
// Note: KEYS is O(N) and blocks Redis; prefer SCAN for large keyspaces
async invalidatePattern(pattern: string): Promise<void> {
const keys = await this.redis.keys(pattern);
if (keys.length > 0) {
await this.redis.del(...keys);
}
}
}
// Usage example
@Injectable()
export class TeamMetricsService {
constructor(private cache: CachedDataService) {}
async getTeamMetrics(teamId: number): Promise<TeamMetrics> {
return this.cache.getWithCache(
`metrics:team:${teamId}`,
async () => {
// Expensive database query
return await this.fetchTeamMetricsFromDb(teamId);
},
this.cache.CACHE_TTL.MEDIUM
);
}
async updateTeamMetrics(teamId: number, data: any): Promise<void> {
// Invalidate related caches
await this.cache.setWithInvalidation(
`metrics:team:${teamId}`,
data,
[
`dashboard:team:${teamId}`,
`metrics:team:${teamId}`,
`stats:team:${teamId}`
]
);
}
}
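One gap worth noting in getWithCache: when a hot key expires, every in-flight request misses simultaneously and stampedes the database. A common mitigation is a short-lived Redis lock so only one caller recomputes while the others briefly wait and retry. A minimal sketch, assuming an ioredis-style client; the 10s lock TTL and 100ms retry delay are arbitrary choices:

import Redis from 'ioredis';

async function getWithStampedeProtection<T>(
  redis: Redis,
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number
): Promise<T> {
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  // SET NX: only the first caller acquires the lock and recomputes
  const lock = await redis.set(`${key}:lock`, '1', 'EX', 10, 'NX');
  if (lock === 'OK') {
    try {
      const data = await fetchFn();
      await redis.setex(key, ttl, JSON.stringify(data));
      return data;
    } finally {
      await redis.del(`${key}:lock`);
    }
  }

  // Everyone else waits briefly, then re-checks the cache
  await new Promise(resolve => setTimeout(resolve, 100));
  return getWithStampedeProtection(redis, key, fetchFn, ttl);
}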
Cache Warming Strategy
// Proactively populate cache for frequently accessed data
@Injectable()
export class CacheWarmingService {
constructor(
private redis: RedisClient,
private db: DatabaseService
) {
this.startWarmingSchedule();
}
private startWarmingSchedule(): void {
// Warm cache every 4 minutes (before 5-minute expiry)
setInterval(() => {
this.warmFrequentlyAccessedData();
}, 4 * 60 * 1000);
}
private async warmFrequentlyAccessedData(): Promise<void> {
try {
// Get list of active teams
const activeTeams = await this.db.query(`
SELECT DISTINCT team_id
FROM user_sessions
WHERE last_activity > NOW() - INTERVAL '1 hour'
`);
// Warm cache for each active team
const warmingPromises = activeTeams.rows.map(async (team) => {
// fetchTeamMetrics (defined elsewhere) runs the same expensive query the dashboard uses
const metrics = await this.fetchTeamMetrics(team.team_id);
await this.redis.setex(
`metrics:team:${team.team_id}`,
300,
JSON.stringify(metrics)
);
});
await Promise.all(warmingPromises);
console.log(`Cache warmed for ${activeTeams.rows.length} teams`);
} catch (error) {
console.error('Cache warming failed:', error);
}
}
}
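One refinement to consider: with several app instances running this service, a fixed 4-minute interval makes every instance warm the same keys at the same moment, hitting the database in lockstep. Adding jitter spreads that load - a sketch, where the ±30-second range is an arbitrary assumption:

// Drop-in replacement for startWarmingSchedule(): chain setTimeout calls
// with random jitter so instances don't all warm the cache simultaneously
private scheduleNextWarming(): void {
  const baseMs = 4 * 60 * 1000;                       // 4 minutes
  const jitterMs = (Math.random() - 0.5) * 60 * 1000; // +/- 30 seconds
  setTimeout(async () => {
    await this.warmFrequentlyAccessedData();
    this.scheduleNextWarming();
  }, baseMs + jitterMs);
}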
Results:
- Cache hit rate: 0% → 87%
- Database load: Reduced by 85%
- API response time: 800ms → 120ms average
Strategy #3: Connection Pooling & Database Optimization
PostgreSQL Connection Pool
// Optimized connection pool configuration
import { Pool } from 'pg';
const poolConfig = {
host: process.env.DB_HOST,
port: 5432,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
// Connection pool settings
min: 10, // Minimum connections
max: 100, // Maximum connections
idleTimeoutMillis: 30000, // Close idle connections after 30s
connectionTimeoutMillis: 2000,
// Performance tuning
statement_timeout: 10000, // Kill queries after 10s
query_timeout: 10000,
keepAlive: true,
keepAliveInitialDelayMillis: 10000
};
class DatabaseService {
private pool: Pool;
private readPool: Pool;
constructor() {
// Write pool (primary database)
this.pool = new Pool(poolConfig);
// Read pool (read replicas)
this.readPool = new Pool({
...poolConfig,
host: process.env.DB_READ_REPLICA_HOST
});
this.setupPoolMonitoring();
}
private setupPoolMonitoring(): void {
// Monitor pool health
setInterval(() => {
console.log('Pool stats:', {
total: this.pool.totalCount,
idle: this.pool.idleCount,
waiting: this.pool.waitingCount
});
// Alert if pool is saturated
if (this.pool.waitingCount > 10) {
console.error('Connection pool saturated!');
// Send alert to monitoring service
}
}, 60000);
}
async executeWrite(query: string, params: any[]): Promise<any> {
const client = await this.pool.connect();
try {
return await client.query(query, params);
} finally {
client.release();
}
}
async executeRead(query: string, params: any[]): Promise<any> {
const client = await this.readPool.connect();
try {
return await client.query(query, params);
} finally {
client.release();
}
}
async transaction<T>(callback: (client: any) => Promise<T>): Promise<T> {
const client = await this.pool.connect();
try {
await client.query('BEGIN');
const result = await callback(client);
await client.query('COMMIT');
return result;
} catch (error) {
await client.query('ROLLBACK');
throw error;
} finally {
client.release();
}
}
}
Read/Write Splitting
@Injectable()
export class DataAccessService {
constructor(private db: DatabaseService) {}
// Read operations use read replicas
async getTeamMetrics(teamId: number): Promise<any> {
return this.db.executeRead(
'SELECT * FROM team_metrics WHERE team_id = $1',
[teamId]
);
}
// Write operations use primary database
async updateTeamMetrics(teamId: number, data: any): Promise<void> {
await this.db.executeWrite(
'UPDATE team_metrics SET data = $1, updated_at = NOW() WHERE team_id = $2',
[data, teamId]
);
}
// Transactions always use primary
async createTeamWithMembers(team: any, members: any[]): Promise<void> {
await this.db.transaction(async (client) => {
// Insert team
const teamResult = await client.query(
'INSERT INTO teams (name, created_at) VALUES ($1, NOW()) RETURNING id',
[team.name]
);
const teamId = teamResult.rows[0].id;
// Insert members
for (const member of members) {
await client.query(
'INSERT INTO users (team_id, name, email) VALUES ($1, $2, $3)',
[teamId, member.name, member.email]
);
}
});
}
}
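One refinement: the member loop above issues one INSERT per row, each a separate round trip inside the transaction. For larger teams, a single multi-row INSERT cuts that to one query - a sketch using pg's numbered placeholders:

// Build VALUES ($1,$2,$3), ($4,$5,$6), ... so all members land in one query
const values: any[] = [];
const placeholders = members.map((member, i) => {
  values.push(teamId, member.name, member.email);
  const offset = i * 3;
  return `($${offset + 1}, $${offset + 2}, $${offset + 3})`;
});

await client.query(
  `INSERT INTO users (team_id, name, email) VALUES ${placeholders.join(', ')}`,
  values
);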
Strategy #4: Pagination & Efficient Data Transfer
Cursor-Based Pagination
// Efficient pagination for large datasets
@Get('/api/metrics')
async getMetrics(
@Query('limit') limit: number = 50,
@Query('cursor') cursor?: string
): Promise<PaginatedResponse> {
// Validate and coerce (query params arrive as strings unless a transform pipe is applied)
const safeLimit = Math.min(Math.max(Number(limit) || 50, 1), 100);
let query: string;
let params: any[];
if (cursor) {
// Decode cursor (base64 encoded ID)
const cursorId = Buffer.from(cursor, 'base64').toString('utf-8');
query = `
SELECT id, name, value, created_at
FROM metrics
WHERE id > $1
ORDER BY id ASC
LIMIT $2
`;
params = [cursorId, safeLimit];
} else {
query = `
SELECT id, name, value, created_at
FROM metrics
ORDER BY id ASC
LIMIT $1
`;
params = [safeLimit];
}
const result = await this.db.executeRead(query, params);
const items = result.rows;
// Generate next cursor
const nextCursor = items.length === safeLimit
? Buffer.from(items[items.length - 1].id.toString()).toString('base64')
: null;
return {
items,
nextCursor,
hasMore: items.length === safeLimit
};
}
interface PaginatedResponse {
items: any[];
nextCursor: string | null;
hasMore: boolean;
}
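From the consumer's side the cursor contract is simple: keep passing nextCursor until it comes back null. A hypothetical client loop (the base URL is illustrative):

// Walk every page of /api/metrics by following the cursor
async function fetchAllMetrics(baseUrl: string): Promise<any[]> {
  const all: any[] = [];
  let cursor: string | null = null;

  do {
    const url = new URL('/api/metrics', baseUrl);
    url.searchParams.set('limit', '100');
    if (cursor) url.searchParams.set('cursor', cursor);

    const page: PaginatedResponse = await (await fetch(url)).json();
    all.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor);

  return all;
}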
Response Compression
// Enable compression for API responses
import compression from 'compression';
import express from 'express';
const app = express();
// Compression middleware
app.use(compression({
filter: (req, res) => {
if (req.headers['x-no-compression']) {
return false;
}
return compression.filter(req, res);
},
level: 6, // Compression level (1-9, 6 is good balance)
threshold: 1024 // Only compress responses > 1KB
}));
// Result: Typical API response reduced from 45KB to 8KB (82% smaller)
Strategy #5: Error Handling & Circuit Breaker
Comprehensive Error Handling
@Injectable()
export class ErrorHandlerService {
// Sentry wrapper service injected via DI
constructor(private sentryService: SentryService) {}
handleError(error: any, context: string): never {
// Log error with context
console.error(`Error in ${context}:`, {
message: error.message,
stack: error.stack,
timestamp: new Date().toISOString()
});
// Send to monitoring service (Sentry)
if (process.env.NODE_ENV === 'production') {
this.sentryService.captureException(error, { context });
}
// Return appropriate error response
if (error instanceof ValidationError) {
throw new BadRequestException(error.message);
}
if (error instanceof NotFoundError) {
throw new NotFoundException(error.message);
}
if (error instanceof UnauthorizedError) {
throw new UnauthorizedException(error.message);
}
// Generic error response
throw new InternalServerErrorException(
'An unexpected error occurred. Please try again later.'
);
}
}
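Wired into a controller, the handler stays thin and every failure path funnels through one place. A hypothetical usage (TeamsService and findById are illustrative):

@Controller('teams')
export class TeamsController {
  constructor(
    private teamsService: TeamsService,
    private errorHandler: ErrorHandlerService
  ) {}

  @Get(':teamId')
  async getTeam(@Param('teamId') teamId: number) {
    try {
      return await this.teamsService.findById(teamId);
    } catch (error) {
      // Maps domain errors to consistent HTTP exceptions and logs once
      this.errorHandler.handleError(error, 'TeamsController.getTeam');
    }
  }
}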
Circuit Breaker Pattern
@Injectable()
export class CircuitBreakerService {
private failures = new Map<string, number>();
private lastFailureTime = new Map<string, number>();
private state = new Map<string, CircuitState>();
private readonly FAILURE_THRESHOLD = 5;
private readonly RESET_TIMEOUT = 60000; // 1 minute
private readonly HALF_OPEN_MAX_CALLS = 3; // probe budget (not enforced in this simplified version)
async execute<T>(
key: string,
fn: () => Promise<T>,
fallback?: () => Promise<T>
): Promise<T> {
const currentState = this.state.get(key) || 'closed';
if (currentState === 'open') {
const lastFailure = this.lastFailureTime.get(key) || 0;
if (Date.now() - lastFailure > this.RESET_TIMEOUT) {
this.state.set(key, 'half-open');
} else {
if (fallback) {
return fallback();
}
throw new ServiceUnavailableException(
'Service temporarily unavailable'
);
}
}
try {
const result = await fn();
this.onSuccess(key);
return result;
} catch (error) {
this.onFailure(key);
if (fallback && this.state.get(key) === 'open') {
return fallback();
}
throw error;
}
}
private onSuccess(key: string): void {
this.failures.set(key, 0);
this.state.set(key, 'closed');
}
private onFailure(key: string): void {
const currentFailures = this.failures.get(key) || 0;
const newFailures = currentFailures + 1;
this.failures.set(key, newFailures);
this.lastFailureTime.set(key, Date.now());
if (newFailures >= this.FAILURE_THRESHOLD) {
this.state.set(key, 'open');
console.error(`Circuit breaker opened for: ${key}`);
}
}
}
type CircuitState = 'closed' | 'open' | 'half-open';
// Usage
@Injectable()
export class ExternalApiService {
constructor(private circuitBreaker: CircuitBreakerService) {}
async fetchFromExternalApi(url: string): Promise<any> {
return this.circuitBreaker.execute(
`external-api:${url}`,
async () => {
const response = await fetch(url);
return response.json();
},
async () => {
// Fallback: return cached data or a default response
// (getCachedData is defined elsewhere in this service)
return this.getCachedData(url);
}
);
}
}
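Circuit breakers work best paired with per-call timeouts: without one, a hung upstream pins requests open instead of tripping the breaker quickly. A sketch using AbortController, which is available natively in Node 18+; the 5-second budget is an assumption:

// Abort the request if the upstream doesn't answer within the budget,
// so slow calls register as failures instead of hanging indefinitely
async function fetchWithTimeout(url: string, timeoutMs = 5000): Promise<any> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(url, { signal: controller.signal });
    if (!response.ok) {
      throw new Error(`Upstream responded ${response.status}`);
    }
    return await response.json();
  } finally {
    clearTimeout(timer);
  }
}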
Strategy #6: Request Rate Limiting
Protect APIs from Abuse
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
// Global rate limiter
const globalLimiter = rateLimit({
store: new RedisStore({
client: redis,
prefix: 'rl:global:'
}),
windowMs: 15 * 60 * 1000, // 15 minutes
max: 1000, // 1000 requests per window per IP
message: 'Too many requests, please try again later',
standardHeaders: true,
legacyHeaders: false
});
// Stricter limits for expensive endpoints
const expensiveLimiter = rateLimit({
store: new RedisStore({
client: redis,
prefix: 'rl:expensive:'
}),
windowMs: 60 * 1000, // 1 minute
max: 10, // 10 requests per minute
message: 'Rate limit exceeded for this endpoint'
});
// Apply middleware
app.use('/api/', globalLimiter);
app.use('/api/reports/generate', expensiveLimiter);
// Custom rate limiter by user ID
const createUserRateLimiter = (maxRequests: number) => {
return rateLimit({
store: new RedisStore({
client: redis,
prefix: 'rl:user:'
}),
windowMs: 60 * 1000,
max: maxRequests,
keyGenerator: (req) => {
// Rate limit by user ID instead of IP
return req.user?.id || req.ip;
}
});
};
app.use('/api/user/*', createUserRateLimiter(100));
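On the client side, a 429 from these limiters ships a Retry-After header (express-rate-limit sets one by default), so well-behaved consumers can back off instead of hammering the API. A hypothetical retry helper:

// Retry on 429, honoring the server's Retry-After header
async function fetchWithBackoff(url: string, maxRetries = 3): Promise<Response> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url);
    if (res.status !== 429) return res;
    const retryAfterSec = Number(res.headers.get('Retry-After') ?? '1');
    await new Promise(resolve => setTimeout(resolve, retryAfterSec * 1000));
  }
  throw new Error('Rate limited: retries exhausted');
}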
Real-World Performance Metrics
Load Testing Results
# Artillery load test - sustained load
artillery run loadtest.yml
# Configuration (loadtest.yml)
config:
  target: 'https://api.orgsignals.com'
  phases:
    - duration: 300
      arrivalRate: 100
      rampTo: 1000
      name: "Ramp to peak"
    - duration: 600
      arrivalRate: 1000
      name: "Sustained peak load"
# Results after optimization:
Summary:
✅ Scenarios: 960,000 (100%)
✅ Requests: 4,800,000
✅ Success Rate: 99.92%
✅ Response Times:
   - Min: 35ms
   - Median: 118ms
   - P95: 298ms
   - P99: 562ms
   - Max: 1,841ms
✅ Throughput: 8,000 req/s sustained
✅ Error Rate: 0.08%
Database Performance:
✅ Connection Pool:
   - Total: 100
   - Idle: 45
   - Active: 55
   - Waiting: 0
✅ Query Performance:
   - Avg: 12ms
   - P95: 45ms
   - P99: 120ms
Production Metrics (30 days)
API Performance:
✅ Total Requests: 45.2M
✅ Avg Response Time: 118ms
✅ P95 Response Time: 298ms
✅ P99 Response Time: 562ms
✅ Error Rate: 0.08%
✅ Peak Throughput: 8,500 req/s
Cache Performance:
✅ Redis Hit Rate: 87%
✅ Avg Cache Response: 5ms
✅ Total Cache Hits: 39.3M
✅ Total Cache Misses: 5.9M
✅ Database Load Reduction: 85%
Infrastructure Health:
✅ Uptime: 99.98%
✅ Avg CPU: 45%
✅ Avg Memory: 52%
✅ Connection Pool: Healthy
✅ Auto-scaling Events: 47
Key Lessons Learned
What Made the Biggest Impact
- Redis Caching (40% improvement): 87% hit rate eliminated most database queries
- Connection Pooling (25% improvement): Eliminated connection overhead
- Parallel Queries (20% improvement): Reduced response time by 60%
- Read Replicas (10% improvement): Distributed database load
- Compression (5% improvement): Reduced bandwidth by 80%
What Didn't Work
❌ Microservices too early: Added complexity without benefits at this scale
❌ Over-caching: Caused stale data issues; had to fine-tune TTLs
❌ GraphQL: Added overhead without clear advantages for our use case
❌ Too much middleware: Each layer added latency
Build APIs That Scale
These backend optimization strategies transformed our API from struggling at 120 req/s to smoothly handling 8,500 req/s - a 70x improvement. But backend performance is just one component of delivering world-class developer productivity insights.
Experience Lightning-Fast APIs
Ready to see sub-200ms API responses in action?
OrgSignals leverages every backend optimization strategy covered in this article:
- 120ms average API response times
- 8,500+ requests/second capacity
- 99.98% uptime with automatic failover
- Real-time data sync across all integrations
- Enterprise-grade security and reliability
Transform Your Development Team's Productivity
Stop flying blind with your engineering metrics. OrgSignals provides:
β
Lightning-fast analytics - Get insights in milliseconds, not seconds
β
Real-time DORA metrics - Track deployment frequency, lead time, MTTR, and change failure rate
β
Seamless integrations - GitHub, GitLab, Jira, Slack - all your tools unified
β
AI-powered insights - Automatically identify bottlenecks and improvement opportunities
β
Developer-friendly dashboards - Beautiful visualizations that tell the story
β
Team & individual metrics - From C-suite to individual contributors
Learn More About Building Scalable Systems
Read the complete series:
- Part 1: How I Built an Enterprise Angular App in 30 Days →
- Part 2: From Code to Production: Deployment Strategies →
- Part 3: Frontend Performance at Scale →
- Part 4: You are here - Backend & API Optimization
- Part 5: Database & Caching Strategies at Scale (upcoming)
Questions about scaling your backend? Drop them in the comments - I respond to every question!
Found this helpful? Follow for more backend optimization and system design content.