In the past few days I faced something I didn't expect.
Features breaking one after another. Endpoints not working. Failures coming in a chain: one failing, that causing another to fail, and that causing another. And I had no control over any of it. No way to stop it. No way to understand why it was happening. The failures just kept coming and my system kept getting more overloaded.
I won't lie, I wanted to quit. Multiple times. I hit a point where I thought maybe this is not for me.
But there's something that keeps pulling me back. Curiosity. That one feeling of "I want to understand this. I want to fix this." That's the fuel that kept me going even when I was completely frustrated.
And I gave it another chance. I sat with the problem again. And this time I found something that explained exactly what was happening and gave me a way to fix it.
That thing is called a circuit breaker.
What Was Actually Happening
Before I found this solution, my system was doing something really inefficient and I didn't even realize it.
Every time a service was failing, I was still sending requests to it. The service is down, failing, completely broken, and I'm still passing requests through the router to it. Every single one of them failing. My server kept trying, resources kept getting consumed, and nothing was recovering. The failures were just piling up.
What I needed was a way to say: okay, this service has failed too many times. Stop sending requests to it. Give it time to recover. Then try again carefully.
That's exactly what a circuit breaker does.
The Electrical Analogy
Think about an electrical circuit breaker in your home. When too much current flows (something is overloaded, something is wrong), the breaker trips. It opens the circuit. Power stops flowing. This protects everything from burning out.
Then after a while, you reset it carefully. If everything is fine, the circuit closes again and power flows normally.
Circuit breakers in code work on the same principle.
Three States: This is the Core Idea
CLOSED
Normal operation. All requests are allowed through, but failures are being tracked. Once the number of failures reaches the threshold you set, the circuit opens.
OPEN
The circuit has tripped. All requests are rejected immediately; the wrapped function is never even called. The system waits for the cooldown period to pass. This is the healing time: your server recovers, and the failing service gets time to come back up.
HALF_OPEN
After the cooldown, the circuit doesn't go straight back to CLOSED. It moves to HALF_OPEN first. A limited number of requests are allowed through, just enough to test whether the service has actually recovered. If those requests succeed, the circuit closes again and everything goes back to normal. If they fail, the circuit opens again and the cooldown starts over.
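The three states and the moves between them fit in a small lookup table. This is only a sketch to make the transitions concrete: the event names (THRESHOLD_REACHED, COOLDOWN_EXPIRED, TRIAL_SUCCEEDED, TRIAL_FAILED) and the nextState helper are made up for illustration and are not part of the class in this post.

```javascript
// Hypothetical transition table: for each state, which event moves it where.
const TRANSITIONS = {
  CLOSED: { THRESHOLD_REACHED: "OPEN" },
  OPEN: { COOLDOWN_EXPIRED: "HALF_OPEN" },
  HALF_OPEN: { TRIAL_SUCCEEDED: "CLOSED", TRIAL_FAILED: "OPEN" },
}

// Pure function: returns the state that follows `state` after `event`.
function nextState(state, event) {
  const row = TRANSITIONS[state]
  if (!row) throw new Error(`Unknown state: ${state}`)
  // An event that does not apply to the current state leaves it unchanged.
  const applies = Object.prototype.hasOwnProperty.call(row, event)
  return applies ? row[event] : state
}

console.log(nextState("CLOSED", "THRESHOLD_REACHED")) // OPEN
console.log(nextState("OPEN", "COOLDOWN_EXPIRED"))    // HALF_OPEN
console.log(nextState("HALF_OPEN", "TRIAL_FAILED"))   // OPEN
console.log(nextState("OPEN", "TRIAL_SUCCEEDED"))     // OPEN (event ignored)
```

Notice there is no direct path from OPEN to CLOSED. The only way back to normal is through a successful trial in HALF_OPEN.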
The Code
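// Stand-in for the ApiError used below. The snippet does not show where
// ApiError comes from, so swap this for your project's own error class.
class ApiError extends Error {
  constructor(statusCode, message) {
    super(message)
    this.statusCode = statusCode
  }
}

export class CircuitBreaker {
  // failureThreshold: number of failures that trips the circuit
  // cooldownMs: how long the circuit stays OPEN before a trial is allowed
  constructor(failureThreshold, cooldownMs) {
    this.failureThreshold = failureThreshold
    this.cooldownMs = cooldownMs
    this.state = "CLOSED"
    this.failureCount = 0
    this.lastFailureTime = null
  }

  openCircuit() {
    this.state = "OPEN"
    this.lastFailureTime = Date.now()
  }

  closeCircuit() {
    this.state = "CLOSED"
    this.failureCount = 0
    this.lastFailureTime = null
  }

  halfOpenCircuit() {
    this.state = "HALF_OPEN"
  }

  async execute(fn) {
    if (this.state === "OPEN") {
      const cooldownExpired = Date.now() - this.lastFailureTime >= this.cooldownMs
      if (!cooldownExpired) {
        // Fail fast: fn is never called while the circuit is open.
        throw new ApiError(503, "Circuit is open.")
      }
      // Cooldown is over, so let this call through as a trial.
      // Note: every call that arrives while the state is HALF_OPEN also
      // goes through; this class does not cap the number of trials.
      this.halfOpenCircuit()
    }

    try {
      const result = await fn()
      if (this.state === "HALF_OPEN") {
        // The trial succeeded, so trust the service again.
        this.closeCircuit()
      } else if (this.state === "CLOSED") {
        // A success clears the count, so only consecutive failures trip
        // the circuit instead of failures adding up forever.
        this.failureCount = 0
      }
      return result
    } catch (error) {
      if (this.state === "HALF_OPEN") {
        // The trial failed: reopen and restart the cooldown.
        this.openCircuit()
        throw error
      }
      this.failureCount++
      if (this.failureCount >= this.failureThreshold) {
        this.openCircuit()
      }
      throw error
    }
  }
}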
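One thing the caller still has to decide is what to do when execute throws, whether because the circuit is open or because the call itself failed. Here is a minimal sketch of one option, returning a fallback value. The callWithFallback helper is hypothetical and only assumes the breaker is an object with an execute(fn) method, so the demo uses a stub in place of the real class.

```javascript
// Hypothetical helper: run fn through the breaker, and return `fallback`
// instead of throwing when the call is rejected or fails.
async function callWithFallback(breaker, fn, fallback) {
  try {
    return await breaker.execute(fn)
  } catch (error) {
    // Covers both the fast "circuit is open" rejection and real failures.
    return fallback
  }
}

// Demo with a stub breaker that behaves like an open circuit.
const alwaysOpen = {
  execute: async () => {
    throw new Error("Circuit is open.")
  },
}

callWithFallback(alwaysOpen, async () => "live data", "cached data")
  .then((value) => console.log(value)) // cached data
```

Swallowing every error like this is a design choice, not a rule. If the caller needs to tell "circuit open" apart from a real failure, checking the error's status code before falling back would be the place to do it.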
Where This Actually Gets Used
This isn't just for one specific case. Any time your system depends on an external service that can fail, a circuit breaker makes sense.
Payment gateways. External APIs. Microservices talking to each other. Database connections. Third-party integrations. All of these can fail temporarily. And without a circuit breaker, your system will keep hammering them even when they're down: wasting resources, slowing everything else down, and making recovery harder.
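With several dependencies like these, one reasonable arrangement (an assumption on my part, not something the class requires) is to give each dependency its own breaker, so a failing payment gateway doesn't block calls to a healthy email service. A rough sketch follows. createBreakerRegistry and the dependency names are hypothetical, and the factory in the demo returns a plain object so the snippet runs on its own.

```javascript
// Hypothetical registry: one breaker per named dependency, created on first use.
function createBreakerRegistry(createBreaker) {
  const breakers = new Map()
  return function getBreaker(name) {
    if (!breakers.has(name)) {
      breakers.set(name, createBreaker(name))
    }
    return breakers.get(name)
  }
}

// With the class from this post, the factory could be something like:
//   () => new CircuitBreaker(5, 30000)  // example threshold and cooldown
// A plain object stands in here to keep the sketch self-contained.
const getBreaker = createBreakerRegistry((name) => ({ name, state: "CLOSED" }))

console.log(getBreaker("payments") === getBreaker("payments")) // true
console.log(getBreaker("payments") === getBreaker("email"))    // false
```

Each name maps to exactly one breaker, so the failure count for "payments" never affects "email".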
What This Actually Fixed for Me
Before the circuit breaker, failures came in a chain. One service went down, requests kept hitting it, my system kept consuming resources on requests that were definitely going to fail, and everything got worse.
After the circuit breaker, the moment failures reach the threshold, the circuit opens. Requests stop hitting the failing service. The system gets breathing room. The service gets time to recover. And when it comes back, HALF_OPEN tests it carefully before trusting it again.
That frustration I had, the failures I couldn't control, the system I couldn't stabilize: all of it was because I had no mechanism to stop the bleeding when something went wrong.
A circuit breaker is that mechanism.
If I got something wrong or anything can be improved — please drop it in the comments. I'm still learning and I want to get this right.
