Atlas Whoff

Designing for Failure: Circuit Breakers and Bulkheads in Node.js

Distributed systems fail. The question isn't if — it's how gracefully.

Circuit Breaker Pattern

A circuit breaker wraps external calls and, after repeated failures, short-circuits further calls so they fail fast instead of piling up behind a broken dependency. This prevents cascading failures:

import CircuitBreaker from 'opossum';

const options = {
  timeout: 3000,         // If fn takes longer than 3s, trigger failure
  errorThresholdPercentage: 50, // Open circuit when 50% of requests fail
  resetTimeout: 30000,   // Try again after 30s
};

const breaker = new CircuitBreaker(callExternalAPI, options);

breaker.fallback(() => ({ data: [], fromCache: true }));

breaker.on('open', () => console.warn('Circuit opened — external API failing'));
breaker.on('halfOpen', () => console.info('Circuit half-open — testing recovery'));
breaker.on('close', () => console.info('Circuit closed — external API recovered'));

// Usage
const result = await breaker.fire(requestData);
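
To make the three states concrete, here is a minimal hand-rolled sketch of the state machine. It is not how opossum is implemented: `SimpleBreaker`, `failureThreshold`, and the injectable `now` clock are hypothetical names for illustration, and it counts consecutive failures instead of tracking a failure percentage over a rolling window.

```typescript
type State = 'closed' | 'open' | 'halfOpen';

// Illustrative only: a consecutive-failure breaker, not opossum's algorithm.
class SimpleBreaker {
  private state: State = 'closed';
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold = 3,      // failures before opening
    private readonly resetTimeoutMs = 30_000,   // how long to stay open
    private readonly now: () => number = Date.now, // injectable clock for tests
  ) {}

  // Current state; promotes open -> halfOpen once the reset timeout has elapsed.
  getState(): State {
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'halfOpen';
    }
    return this.state;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial call in half-open, or too many failures while closed, opens the circuit.
    if (this.state === 'halfOpen' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = this.now();
    }
  }

  // Simplification: half-open is not limited to a single trial call here.
  async fire<T>(fn: () => Promise<T>): Promise<T> {
    if (this.getState() === 'open') {
      throw new Error('Circuit open: call rejected without contacting the service');
    }
    try {
      const result = await fn();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}
```

The key property is that while the circuit is open, `fire` rejects immediately and never invokes `fn`, so a struggling dependency gets time to recover.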

Bulkhead Pattern

Bulkheads isolate failure. Like a ship's watertight compartments, they ensure one flooded compartment doesn't sink the ship. Here, each external service gets its own capped pool of capacity:

import Bottleneck from 'bottleneck';

// Separate rate limiters for different external services
// A spike in Stripe calls won't impact email sending
const stripeLimiter = new Bottleneck({ maxConcurrent: 10, minTime: 100 });
const emailLimiter = new Bottleneck({ maxConcurrent: 5, minTime: 200 });
const aiLimiter = new Bottleneck({ maxConcurrent: 3, minTime: 500 });

// Wrap calls with their respective limiters
const chargeCard = stripeLimiter.wrap(stripe.charges.create.bind(stripe.charges));
const sendEmail = emailLimiter.wrap(resend.emails.send.bind(resend.emails));
const callClaude = aiLimiter.wrap(anthropic.messages.create.bind(anthropic.messages));
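
The isolation itself comes from giving each dependency its own concurrency cap. As a rough sketch of that idea without a library, the hypothetical `Bulkhead` class below is a small semaphore; it is not Bottleneck's implementation and omits rate limiting (`minTime`), queue limits, and priorities.

```typescript
// Illustrative semaphore-style bulkhead; names are assumptions, not a library API.
class Bulkhead {
  private active = 0;
  private readonly queue: Array<() => void> = [];

  constructor(private readonly maxConcurrent: number) {}

  get activeCount(): number {
    return this.active;
  }

  get queuedCount(): number {
    return this.queue.length;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Compartment is full: wait until a finishing task hands over its slot.
      await new Promise<void>(resolve => this.queue.push(resolve));
    } else {
      this.active += 1;
    }
    try {
      return await task();
    } finally {
      const next = this.queue.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter
      } else {
        this.active -= 1; // nobody waiting: free the slot
      }
    }
  }
}

// One bulkhead per dependency: a backlog in one never consumes another's slots.
const paymentsBulkhead = new Bulkhead(10);
const emailBulkhead = new Bulkhead(5);
```

Because each instance keeps its own counter and queue, saturating `paymentsBulkhead` leaves `emailBulkhead` untouched, which is the same property the separate limiters above provide.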

Timeout + Retry Pattern

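interface RetryOptions {
  attempts?: number; // total tries, including the first
  delay?: number;    // base backoff in ms, doubled after each failure
  timeout?: number;  // per-attempt time limit in ms
}

async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, delay = 1000, timeout = 5000 }: RetryOptions = {}
): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      // Note: a timed-out fn() is not cancelled; it keeps running in the background
      return await Promise.race([
        fn(),
        new Promise<never>((_, reject) => {
          timer = setTimeout(() => reject(new Error('Timeout')), timeout);
        }),
      ]);
    } catch (err) {
      if (i === attempts - 1) throw err; // out of attempts: surface the last error
    } finally {
      clearTimeout(timer); // don't leave the timeout timer pending
    }
    // Exponential backoff: delay, 2x delay, 4x delay, ...
    await new Promise(r => setTimeout(r, delay * Math.pow(2, i)));
  }
  throw new Error('unreachable');
}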
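
It helps to work out what the backoff formula `delay * Math.pow(2, i)` means for total latency. The helpers below are hypothetical, written only to do that arithmetic; `worstCaseMs` assumes every attempt runs until its timeout.

```typescript
// Waits between attempts: one after every failed attempt except the last.
function backoffSchedule(attempts = 3, delay = 1000): number[] {
  const waits: number[] = [];
  for (let i = 0; i < attempts - 1; i++) {
    waits.push(delay * Math.pow(2, i));
  }
  return waits;
}

// Upper bound on total time if every attempt hits the per-attempt timeout.
function worstCaseMs(attempts = 3, delay = 1000, timeout = 5000): number {
  const totalWait = backoffSchedule(attempts, delay).reduce((sum, w) => sum + w, 0);
  return attempts * timeout + totalWait;
}

console.log(backoffSchedule());       // [1000, 2000]
console.log(backoffSchedule(5, 200)); // [200, 400, 800, 1600]
console.log(worstCaseMs());           // 18000
```

With the defaults, a caller can wait up to 18 seconds before seeing the final error, so the retry budget should be chosen with the upstream request timeout in mind.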

Health Checks

app.get('/health', async (req, res) => {
  const checks = await Promise.allSettled([
    db.$queryRaw`SELECT 1`,
    redis.ping(),
  ]);

  const status = checks.every(c => c.status === 'fulfilled') ? 200 : 503;
  res.status(status).json({
    db: checks[0].status,
    redis: checks[1].status,
  });
});
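
One thing the handler above does not guard against is a dependency that hangs instead of failing, which would leave the health endpoint itself waiting. A possible variation, sketched under assumptions, puts a time limit on each check. `runHealthChecks`, `withTimeout`, and the 2000 ms default are hypothetical, not part of Express, Prisma, or any Redis client.

```typescript
type Check = () => Promise<unknown>;
type CheckStatus = 'fulfilled' | 'rejected';

interface HealthReport {
  status: 200 | 503;
  checks: Record<string, CheckStatus>;
}

// Rejects if `p` has not settled within `ms`; always clears its timer.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('Health check timed out')), ms);
    p.then(
      value => { clearTimeout(timer); resolve(value); },
      err => { clearTimeout(timer); reject(err); },
    );
  });
}

// Runs all named checks in parallel; a failed or timed-out check yields a 503.
async function runHealthChecks(
  checks: Record<string, Check>,
  timeoutMs = 2000,
): Promise<HealthReport> {
  const outcomes = await Promise.all(
    Object.entries(checks).map(async ([name, check]): Promise<[string, CheckStatus]> => {
      try {
        await withTimeout(check(), timeoutMs);
        return [name, 'fulfilled'];
      } catch {
        return [name, 'rejected'];
      }
    }),
  );
  const report: HealthReport = { status: 200, checks: {} };
  for (const [name, status] of outcomes) {
    report.checks[name] = status;
    if (status === 'rejected') report.status = 503;
  }
  return report;
}
```

In a route handler this could be used as `const report = await runHealthChecks({ db: () => db.$queryRaw`SELECT 1`, redis: () => redis.ping() })`, followed by `res.status(report.status).json(report.checks)`.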

Resilience patterns — circuit breakers, bulkheads, health checks — are production-ready in the AI SaaS Starter Kit.
