Ali nazari

Implementing a Circuit Breaker in Node.js

A circuit breaker prevents an application from repeatedly invoking an operation that is likely to fail (for example, a downstream service or external API).

Instead of continually waiting for timeouts and consuming resources, the circuit breaker trips after a measured failure threshold and begins rejecting requests immediately; after a cooldown it allows a small number of test requests (half-open) to check recovery.

This prevents cascading failures, reduces wasted concurrency and latency, and gives operators a clear signal when a downstream service is unhealthy.

The implementation below is a classical three-state circuit breaker with a time-based sliding window built from short-duration buckets:

  • States: Closed (normal), Open (reject requests), HalfOpen (allow test requests). (Enum CircuitState.)

  • Metrics: buckets (HealthBucket[]). Each bucket counts successes and failures and carries a timestamp. The code creates 1s buckets, keeps a rolling 30s window of them, and computes the error rate across that window.

  • Config: threshold (error rate), minRequests (minimum calls before evaluation), and sleepWindowMs (how long to stay Open). Constructor defaults: 50% threshold, 10 minRequests, 30s sleep window. lastTripTime is also a constructor parameter, but it is internal bookkeeping (the time of the last trip, default 0) rather than something callers normally set.
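The three states and the moves between them can be written down as a small lookup table. This is only an illustrative sketch: the string-literal `State` type and the `canTransition` helper are assumptions made here and do not appear in the class itself, which uses the CircuitState enum.

```typescript
// Illustrative only: the article's class does not contain this table.
type State = "Closed" | "Open" | "HalfOpen";

// Allowed transitions of the three-state breaker described above.
const transitions: Record<State, readonly State[]> = {
  Closed: ["Open"],             // error rate exceeded the threshold
  Open: ["HalfOpen"],           // sleep window elapsed
  HalfOpen: ["Closed", "Open"], // trial succeeded / trial failed
};

function canTransition(from: State, to: State): boolean {
  return transitions[from].indexOf(to) !== -1;
}

console.log(canTransition("Closed", "Open"));     // true
console.log(canTransition("Closed", "HalfOpen")); // false
```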

Walkthrough

Key fields and constructor:

private state: CircuitState = CircuitState.Closed
private buckets: HealthBucket[] = []

constructor(
  private threshold: number = 0.5,
  private minRequests: number = 10,
  private sleepWindowMs: number = 30000,
  private lastTripTime: number = 0
) { super(); }

Meaning:

state holds the current breaker state.

buckets holds recent 1-second buckets with { successes, failures, timestamp }.

threshold and minRequests control sensitivity: the breaker evaluates failures / total > threshold only once the number of requests in the window is strictly greater than minRequests.

The minRequests guard avoids tripping on tiny sample sizes (this is a common parameter in mature libraries like Resilience4j).

sleepWindowMs is the cooldown used to move Open -> HalfOpen for a trial.


execute method

public async execute<T>(task: () => Promise<T>): Promise<T> {
  this.updateState();

  if (this.state === CircuitState.Open) {
    throw new Error("Circuit is OPEN: Request rejected to protect the system.");
  }

  try {
    const result = await task();
    this.record(true);
    return result;
  } catch (error) {
    this.record(false);
    throw error;
  }
}
  1. State refresh: updateState() is called at the start to transition Open -> HalfOpen if the sleep window has elapsed. This ensures the breaker can try recovery attempts.

  2. Fail-fast: If the circuit is Open, the method immediately throws — the service caller receives an immediate failure and the protected call is not made.

This implements the "fast fail" behavior.

  3. Instrumentation: On success or failure, record(...) is invoked to update metrics and (possibly) change state.

Note: the implementation records after the underlying promise settles. That is the correct semantics: success/failure must reflect the outcome of the protected operation.
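One consequence of throwing a plain Error is that callers cannot easily tell a breaker rejection from a failure of the protected call. A possible refinement, sketched here under the assumption that you are free to change the thrown type, is a dedicated error class. `CircuitOpenError` and `isBreakerRejection` are hypothetical names, not part of the class above.

```typescript
// Hypothetical error type; the article's class throws a plain Error instead.
class CircuitOpenError extends Error {
  constructor(message = "Circuit is OPEN: request rejected") {
    super(message);
    this.name = "CircuitOpenError";
    // Keeps instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// True when the failure came from the breaker itself, not the downstream call.
function isBreakerRejection(error: unknown): boolean {
  return error instanceof CircuitOpenError;
}
```

With this in place, execute would throw new CircuitOpenError() in the Open branch, and callers could branch on isBreakerRejection(err) to choose between a fallback and normal error handling.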

Edge concerns

In HalfOpen, execute still allows a request through (because state is not Open), but the implementation relies on record to decide whether to Close or Open again.

There is no throttling or token bucket: every request that arrives while the breaker is HalfOpen is let through as a trial.

Production breakers normally restrict trial attempts to a small number to avoid hammering the recovering system.
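One way to cap trial traffic is a small permit counter consulted only while the breaker is HalfOpen. The sketch below is an assumption about how that could look. `TrialGate`, `tryAcquire`, and `release` are hypothetical names and are not part of the class in this article.

```typescript
// Hypothetical helper, not part of the article's class.
class TrialGate {
  private inFlight = 0;

  constructor(private readonly maxTrials: number = 1) {}

  // Returns true if the caller may send a trial request right now.
  tryAcquire(): boolean {
    if (this.inFlight >= this.maxTrials) return false;
    this.inFlight++;
    return true;
  }

  // Call once the trial request settles (success or failure).
  release(): void {
    if (this.inFlight > 0) this.inFlight--;
  }
}
```

In execute, the HalfOpen path would reject immediately when tryAcquire() returns false and call release() in a finally block around the awaited task.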

record method

Relevant excerpt. In summary, record gets the active bucket and first handles the HalfOpen special case (on success, close and clear the buckets; on failure, open and set lastTripTime). Otherwise it increments successes or failures and checks the threshold.

public record(success: boolean) {
    const bucket = this.getActiveBucket();

    if (this.state === CircuitState.HalfOpen) {
        if (success) {
            this.state = CircuitState.Closed;
            this.buckets = [];
            console.log("Circuit is CLOSED System recovered!");
        } else {
            this.state = CircuitState.Open;
            this.lastTripTime = Date.now(); // Reset the clock for a new rest period
            console.warn("Test failed. Circuit back to OPEN.");
        }
        this.emit('state:changed', {
            from: CircuitState.HalfOpen,
            to: this.state, // enum value, same type as `from`
            timestamp: Date.now()
        })
        return;
    }

    if (success) {
        bucket.successes++;
    } else {
        bucket.failures++
    }

    this.checkThreshold();
}

Behavioral details:

Handling HalfOpen: If the state is HalfOpen, the first recorded success immediately transitions to Closed and clears recorded buckets — effectively "we recovered".

A single failure during HalfOpen sets state back to Open and restarts the cooldown.

This is a simple and valid policy, but it's aggressive: one success immediately restores traffic.

Depending on your risk profile you may prefer: require N consecutive successes to close, or allow a limited number of trial calls and evaluate a small sample.
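The "N consecutive successes" option can be sketched as a tiny policy object that record would consult while HalfOpen. `ConsecutiveSuccessPolicy`, `TrialVerdict`, and the default of 3 are assumptions chosen for illustration. The class in this article closes on the first success.

```typescript
// Hypothetical policy object; the article's breaker closes on the first success.
type TrialVerdict = "keep-testing" | "close" | "reopen";

class ConsecutiveSuccessPolicy {
  private streak = 0;

  constructor(private readonly required: number = 3) {}

  // Feed each HalfOpen outcome in; the return value says what to do next.
  onTrialResult(success: boolean): TrialVerdict {
    if (!success) {
      this.streak = 0;
      return "reopen";
    }
    this.streak++;
    if (this.streak >= this.required) {
      this.streak = 0;
      return "close";
    }
    return "keep-testing";
  }
}
```

The trade-off is slower recovery in exchange for more evidence that the downstream service is actually healthy.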

Recording in Closed: For normal operation, record increments counters in the active bucket and calls checkThreshold() to decide whether to trip.

Potential race conditions

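Multiple in-flight execute calls all end up mutating buckets and state through record.

Node's single-threaded model rules out classical data races: record is synchronous, so each call runs to completion without interruption. The hazard sits at the await in execute. The state can change while a request is in flight, so an outcome may be recorded against a different state than the one the request started in. For example, a slow request that began before a trip can settle during HalfOpen and be treated as the trial result.

For high-concurrency scenarios, consider tagging each request with the state it started in and ignoring stale outcomes, or funnelling all metric updates through a single queue.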
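One hedged way to detect stale outcomes is an epoch counter that is bumped on every state transition. A request remembers the epoch it started in and its result is ignored if the epoch has moved on. `EpochGuard` and its methods are hypothetical and not part of the class in this article.

```typescript
// Hypothetical guard, not part of the article's class.
class EpochGuard {
  private epoch = 0;

  // Call at every state transition.
  bump(): void {
    this.epoch++;
  }

  // Call before awaiting the task; keep the returned ticket.
  begin(): number {
    return this.epoch;
  }

  // Call when the task settles; false means the outcome is stale.
  isCurrent(ticket: number): boolean {
    return ticket === this.epoch;
  }
}
```

In execute, the ticket would be taken before `await task()` and checked before calling record, so results from requests that straddle a transition are dropped rather than miscounted.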

checkThreshold method

Algorithm:

  1. If not Closed, return (only evaluate while Closed).

  2. Sum successes and failures across buckets.

  3. If totalRequest > minRequests, compute errorRate = failures / totalRequest.

  4. If errorRate > threshold, call trip().

private checkThreshold() {
    if (this.state !== CircuitState.Closed) return;

    const totals = this.buckets.reduce((acc, b) => ({
        s: acc.s + b.successes,
        f: acc.f + b.failures
    }), { s: 0, f: 0 })

    const totalRequest = totals.s + totals.f;

    if (totalRequest > this.minRequests) {
        const errorRate = totals.f / totalRequest
        if (errorRate > this.threshold) {
            this.trip();
        }
    }
}
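Both comparisons in the threshold check are strict, which matters at the boundaries. The pure function below restates the check so the arithmetic can be tried in isolation. `shouldTrip` is a hypothetical helper written for this illustration, with the constructor defaults as its default arguments.

```typescript
// Pure restatement of the threshold check, for experimentation only.
function shouldTrip(
  successes: number,
  failures: number,
  threshold = 0.5,
  minRequests = 10
): boolean {
  const total = successes + failures;
  if (total <= minRequests) return false; // sample too small to evaluate
  return failures / total > threshold;
}

console.log(shouldTrip(0, 10)); // false: 10 is not greater than minRequests
console.log(shouldTrip(0, 11)); // true: 11 requests, error rate 1.0
console.log(shouldTrip(5, 6));  // true: 6/11 is about 0.545, above 0.5
console.log(shouldTrip(6, 6));  // false: exactly 0.5 does not exceed the threshold
```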

getActiveBucket method - sliding window logic

Key behavior:

  • Each call uses now = Date.now().

  • Buckets are 1s (bucketDurationMs = 1000).

  • If existing last bucket is still within its 1s window, that bucket is reused; otherwise a new bucket is pushed.

  • The buckets array is then filtered to retain only buckets within the rolling window (hardcoded windowSizeMs = 30000 — 30s), i.e., now - windowSizeMs < b.timestamp.

This is a standard time-based sliding window implemented with fixed-size time buckets. It’s readable and efficient for moderate throughput. Key choices to review:

  • bucketDurationMs controls granularity — smaller yields more precise temporal resolution but more buckets.

  • windowSizeMs controls the complete window over which you compute error rate (here hardcoded 30s; could be parameterized).

    private getActiveBucket(): HealthBucket {
        const now = Date.now();

        const bucketDurationMs = 1000;

        const lastBucket = this.buckets[this.buckets.length - 1];

        if (lastBucket && now < lastBucket.timestamp + bucketDurationMs) {
            return lastBucket;
        }

        const newBucket: HealthBucket = { failures: 0, successes: 0, timestamp: now }
        this.buckets.push(newBucket);

        const windowSizeMs = 30000;
        this.buckets = this.buckets.filter(b => now - windowSizeMs < b.timestamp)

        return newBucket

    }

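The function both returns the active bucket and prunes stale buckets, which keeps memory bounded. Note that pruning runs only when a new bucket is created, so the window can briefly include a bucket up to one bucket duration past its expiry.

Time-based bucketed sliding windows are a common approach used in circuit breaker implementations and routers.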
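If you want the window size and bucket duration to be configurable, the bucketing logic can be pulled out into its own class. The following is a minimal sketch under stated assumptions: `SlidingWindow`, its constructor parameters, and the injectable `now` clock are all hypothetical. Unlike getActiveBucket, it prunes on every write and read.

```typescript
// Hypothetical class; names and constructor shape are assumptions.
interface Bucket {
  successes: number;
  failures: number;
  timestamp: number;
}

class SlidingWindow {
  private buckets: Bucket[] = [];

  constructor(
    private readonly windowSizeMs: number = 30000,
    private readonly bucketDurationMs: number = 1000,
    private readonly now: () => number = Date.now // injectable clock for tests
  ) {}

  record(success: boolean): void {
    const t = this.now();
    this.prune(t);
    let bucket = this.buckets[this.buckets.length - 1];
    // Start a new bucket when there is none or the last one has expired.
    if (!bucket || t >= bucket.timestamp + this.bucketDurationMs) {
      bucket = { successes: 0, failures: 0, timestamp: t };
      this.buckets.push(bucket);
    }
    if (success) bucket.successes++;
    else bucket.failures++;
  }

  totals(): { successes: number; failures: number } {
    this.prune(this.now());
    let successes = 0;
    let failures = 0;
    for (const b of this.buckets) {
      successes += b.successes;
      failures += b.failures;
    }
    return { successes, failures };
  }

  // Keeps only buckets that started inside the rolling window.
  private prune(t: number): void {
    this.buckets = this.buckets.filter(b => t - this.windowSizeMs < b.timestamp);
  }
}
```

Passing a fake clock (for example `() => fakeTime`) makes the expiry behaviour testable without real waiting.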


trip method

Behavior:

  • Save oldState.

  • Set state = Open, set lastTripTime = Date.now().

  • Emit state:changed with { from, to, timestamp }.

  • Log a warning.

private trip() {
    const oldState = this.state;
    this.state = CircuitState.Open;
    this.lastTripTime = Date.now();
    console.warn("Circuit Breaker TRIP! State is now OPEN.");
    this.emit('state:changed', {
        from: oldState,
        to: CircuitState.Open,
        timestamp: Date.now()
    })
}

This is straightforward. Emitting an event gives you a hook for metrics, alerts or side effects (for example, incrementing a Prometheus counter, or triggering alerting in SRE workflows).

The class extends EventEmitter expressly for this reason.
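As a stand-in for a real metrics client, the sketch below counts transitions in memory. `createTransitionCounter` and the `StateChange` interface are hypothetical. The payload shape mirrors the { from, to, timestamp } object emitted by the class.

```typescript
// Hypothetical listener factory; the payload mirrors the `state:changed` event.
interface StateChange {
  from: number | string;
  to: number | string;
  timestamp: number;
}

function createTransitionCounter() {
  const counts = new Map<string, number>();
  return {
    // Pass this to breaker.on('state:changed', ...).
    listener(change: StateChange): void {
      const key = `${change.from}->${change.to}`;
      counts.set(key, (counts.get(key) ?? 0) + 1);
    },
    // Number of times the given transition has been observed.
    count(from: number | string, to: number | string): number {
      return counts.get(`${from}->${to}`) ?? 0;
    },
  };
}
```

Usage would look like `const counter = createTransitionCounter(); breaker.on('state:changed', counter.listener);`, after which `counter.count(from, to)` reports how often a given transition happened.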


updateState() method - Open -> HalfOpen transition

Logic:

  • If state === Open and (Date.now() - lastTripTime) > sleepWindowMs, transition to HalfOpen, emit state:changed, and log.

This implements the cooldown: after sleepWindowMs elapses, the breaker tries again. In HalfOpen, the class relies on the next real request(s), via record, to decide whether to close or reopen. The transition is lazy: it happens only when execute is called, not on a timer.
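The cooldown check is easy to reason about as a pure function of the current state and the clock. The sketch below restates it that way. `afterCooldown` and the string-literal `BreakerState` type are assumptions made for illustration, and `now` is passed in so the boundary can be exercised directly.

```typescript
// Pure restatement of the Open -> HalfOpen check; `now` is passed in for testability.
type BreakerState = "Closed" | "Open" | "HalfOpen";

function afterCooldown(
  state: BreakerState,
  lastTripTime: number,
  sleepWindowMs: number,
  now: number
): BreakerState {
  if (state === "Open" && now - lastTripTime > sleepWindowMs) {
    return "HalfOpen";
  }
  return state;
}

console.log(afterCooldown("Open", 1000, 30000, 31000)); // "Open": exactly 30000 ms is not enough
console.log(afterCooldown("Open", 1000, 30000, 31001)); // "HalfOpen"
console.log(afterCooldown("Closed", 0, 30000, 99999));  // "Closed": only Open is affected
```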

The whole picture

import { EventEmitter } from "node:events";

enum CircuitState {
    Open,
    Closed,
    HalfOpen
}

interface HealthBucket {
    successes: number;
    failures: number;
    timestamp: number; // To know when this bucket expires
}


export class CircuitBreaker extends EventEmitter {
    private state: CircuitState = CircuitState.Closed
    private buckets: HealthBucket[] = []


    constructor(
        private threshold: number = 0.5, // 50% failure rate
        private minRequests: number = 10,
        private sleepWindowMs: number = 30000, // 30s to stay Open
        private lastTripTime: number = 0
    ) {
        super();
    }


    public async execute<T>(task: () => Promise<T>): Promise<T> {
        this.updateState();

        if (this.state === CircuitState.Open) {
            throw new Error("Circuit is OPEN: Request rejected to protect the system.");
        }

        try {
            const result = await task();
            this.record(true);
            return result;
        } catch (error) {
            this.record(false);
            throw error;
        }
    }

    public record(success: boolean) {
        const bucket = this.getActiveBucket();

        if (this.state === CircuitState.HalfOpen) {
            if (success) {
                this.state = CircuitState.Closed;
                this.buckets = [];
                console.log("Circuit is CLOSED System recovered!");
            } else {
                this.state = CircuitState.Open;
                this.lastTripTime = Date.now(); // Reset the clock for a new rest period
                console.warn("Test failed. Circuit back to OPEN.");
            }
            this.emit('state:changed', {
                from: CircuitState.HalfOpen,
                to: this.state, // enum value, same type as `from`
                timestamp: Date.now()
            })
            return;
        }

        if (success) {
            bucket.successes++;
        } else {
            bucket.failures++
        }

        this.checkThreshold();
    }

    // True while the circuit is Open, i.e. requests are being rejected (does not call updateState())
    public isOpen(): boolean {
        return this.state === CircuitState.Open;
    }

    private getActiveBucket(): HealthBucket {
        const now = Date.now();

        const bucketDurationMs = 1000;

        const lastBucket = this.buckets[this.buckets.length - 1];

        if (lastBucket && now < lastBucket.timestamp + bucketDurationMs) {
            return lastBucket;
        }

        const newBucket: HealthBucket = { failures: 0, successes: 0, timestamp: now }
        this.buckets.push(newBucket);

        const windowSizeMs = 30000;
        this.buckets = this.buckets.filter(b => now - windowSizeMs < b.timestamp)

        return newBucket

    }

    private checkThreshold() {
        if (this.state !== CircuitState.Closed) return;

        const totals = this.buckets.reduce((acc, b) => ({

            s: acc.s + b.successes,
            f: acc.f + b.failures

        }), { s: 0, f: 0 })

        const totalRequest = totals.s + totals.f;

        if (totalRequest > this.minRequests) {
            const errorRate = totals.f / totalRequest
            if (errorRate > this.threshold) {
                this.trip();
            }
        }
    }

    private trip() {
        const oldState = this.state;
        this.state = CircuitState.Open;
        this.lastTripTime = Date.now();
        console.warn("Circuit Breaker TRIP! State is now OPEN.");
        this.emit('state:changed', {
            from: oldState,
            to: CircuitState.Open,
            timestamp: Date.now()
        })
    }

    private updateState() {
        const oldState = this.state;
        if (this.state === CircuitState.Open && Date.now() - this.lastTripTime > this.sleepWindowMs) {
            this.state = CircuitState.HalfOpen
            console.log("Circuit is HALF-OPEN. Testing the waters...");
            this.emit('state:changed', {
                from: oldState,
                to: CircuitState.HalfOpen,
                timestamp: Date.now()
            })
        }

    }
}

How to use this class

Small usage example (consumer side):

const breaker = new CircuitBreaker();

breaker.on('state:changed', info => {
  console.log('breaker', info);
});

async function callRemote() {
  return breaker.execute(async () => {
    // call the downstream API, e.g. fetch/axios
    return await fetchSomeApi();
  });
}

Because execute throws immediately when the breaker is Open, callers should handle that and optionally provide fallbacks.

For example, return cached results or degrade features rather than retrying aggressively.
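A cached fallback can be sketched as a thin wrapper around execute. This is an assumption about one reasonable shape, not part of the class. `cachedCall` is a hypothetical helper, and `Executor` describes only the part of the breaker it needs.

```typescript
// Minimal shape of the breaker as used here.
interface Executor {
  execute<T>(task: () => Promise<T>): Promise<T>;
}

// Hypothetical helper: serves the last good value when the call fails
// or the breaker rejects it.
function cachedCall<T>(
  breaker: Executor,
  task: () => Promise<T>,
  initial: T
): () => Promise<T> {
  let lastGood = initial;
  return async (): Promise<T> => {
    try {
      lastGood = await breaker.execute(task);
      return lastGood;
    } catch {
      return lastGood; // degrade instead of retrying
    }
  };
}
```

For instance, `const getProfile = cachedCall(breaker, fetchSomeApi, defaultValue)` would return fresh data while the downstream service is healthy and the most recent good value otherwise. Here `defaultValue` stands for whatever placeholder your application can tolerate.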

💡 Have questions? Drop them in the comments!
