Maksat Ramazan

How a “Kind of Working” Payment Gateway Made Me Write This Circuit Breaker

I got bored and wrote a circuit breaker

A few years ago I was working at a fintech company. We had an integration with a payment gateway. Critical one — money literally depended on it.

One day it started behaving weird.

Not down. Not healthy. Just slow enough to destroy everything.

Requests were hanging. Threads piling up. Retries amplifying the problem. Other services started lagging. The gateway was still responding — just slowly enough to kill us.

We needed a circuit breaker. We didn't have one. So we hacked something together in production. Ugly, tightly coupled, impossible to test.

Fast forward to now — I built the circuit breaker I wish we had back then.

Show me the code

// assumes: import cb "github.com/aqylsoft/circuitbreaker"
breaker, _ := cb.New("stripe",
    cb.WithConsecutiveFailures(5),
    cb.WithOpenTimeout(30*time.Second),
)

err := breaker.Execute(ctx, func() error {
    return stripeClient.Charge(amount)
})

if errors.Is(err, cb.ErrCircuitOpen) {
    // fallback or fail fast
}

That's it. Wrap any unreliable call — payment gateways, external APIs, legacy services, that one dependency that "kinda works".

Why not just use timeouts?

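Timeouts protect your goroutine: each call eventually gives up, but the next one still goes out to the same slow dependency.
Circuit breakers protect your system: after enough failures, calls stop going out at all.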

When a dependency becomes slow:

  • retries multiply the load
  • queues grow
  • latency spreads
  • healthy parts fail because of one sick dependency

A circuit breaker says: "nope, not calling this for a while". The system breathes again.
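To put rough numbers on the list above, here is a back-of-the-envelope sketch using Little's law (calls in flight = arrival rate × time per call). The traffic figures, the 5-second timeout, and the `inFlight` helper are made up for illustration; they are not measurements from the incident.

```go
package main

import "fmt"

// inFlight estimates how many calls are in progress at once.
// attempts counts the first try plus retries. All inputs are illustrative.
func inFlight(requestsPerSec, latencyMs, attempts int) int {
	return requestsPerSec * latencyMs * attempts / 1000
}

func main() {
	healthy := inFlight(200, 50, 1)    // fast dependency, no retries needed
	degraded := inFlight(200, 5000, 3) // every attempt runs into a 5s timeout
	fmt.Println("healthy:", healthy)   // healthy: 10
	fmt.Println("degraded:", degraded) // degraded: 3000
}
```

Same traffic, 300 times more calls in flight, even though every single call has a timeout. A breaker that rejects calls up front keeps that number near zero while the dependency recovers.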

What's inside

Two trip strategies:

// Trip after N consecutive failures
cb.WithConsecutiveFailures(5)

// Trip when error rate exceeds threshold in time window
cb.WithSlidingWindow(time.Minute, 0.5, 10) // 50% errors, min 10 requests
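For intuition, the sliding-window strategy can be sketched in a few lines of standalone Go. This is not the library's implementation: `slidingWindow`, `Record`, `ShouldTrip`, `demoWindow`, and `at` are hypothetical names, the sketch keeps every outcome in a slice rather than doing anything allocation-free, and it assumes "exceeds" means strictly greater than the threshold.

```go
package main

import (
	"fmt"
	"time"
)

// outcome is one finished call: when it completed and whether it failed.
type outcome struct {
	at     time.Time
	failed bool
}

// slidingWindow is a hypothetical helper, not the library's type.
type slidingWindow struct {
	window      time.Duration // how far back to look
	maxRate     float64       // trip when the error rate exceeds this
	minRequests int           // ignore windows with fewer calls than this
	outcomes    []outcome
}

// Record stores the result of one call.
func (w *slidingWindow) Record(now time.Time, failed bool) {
	w.outcomes = append(w.outcomes, outcome{at: now, failed: failed})
}

// ShouldTrip drops outcomes older than the window, then checks the error rate.
func (w *slidingWindow) ShouldTrip(now time.Time) bool {
	cutoff := now.Add(-w.window)
	kept := w.outcomes[:0] // filter in place, reusing the backing array
	failures := 0
	for _, o := range w.outcomes {
		if o.at.After(cutoff) {
			kept = append(kept, o)
			if o.failed {
				failures++
			}
		}
	}
	w.outcomes = kept
	if len(kept) < w.minRequests {
		return false // too little traffic to judge
	}
	return float64(failures)/float64(len(kept)) > w.maxRate
}

// demoWindow uses the numbers from the option above: 1 minute, 50%, min 10.
func demoWindow() *slidingWindow {
	return &slidingWindow{window: time.Minute, maxRate: 0.5, minRequests: 10}
}

// at returns a fixed timestamp sec seconds after the Unix epoch.
func at(sec int) time.Time { return time.Unix(int64(sec), 0) }

func main() {
	w := demoWindow()
	for i := 0; i < 9; i++ {
		w.Record(at(i), true)
	}
	fmt.Println(w.ShouldTrip(at(8))) // false: 9 calls is below the minimum
	w.Record(at(9), true)
	fmt.Println(w.ShouldTrip(at(9)))  // true: 10 of 10 calls failed
	fmt.Println(w.ShouldTrip(at(90))) // false: everything has aged out
}
```

The minimum request count is what stops a quiet service from tripping on a single unlucky call.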

State machine:

Closed → Open → Half-Open → Closed
         ↑___________|  (probe fails)
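The diagram can be turned into a small standalone sketch. It is a simplified illustration, not the library's code: `Breaker`, `New`, `State`, and `walkthrough` are hypothetical, the clock is injected so nothing has to sleep, and the sketch does not limit how many probes run concurrently in half-open. Note that Open → Half-Open happens lazily on the next call, which is one way to get by without a background goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

type State string

const (
	Closed   State = "closed"
	Open     State = "open"
	HalfOpen State = "half-open"
)

var ErrOpen = errors.New("circuit open")

// Breaker is a hypothetical, stripped-down breaker.
type Breaker struct {
	mu          sync.Mutex
	state       State
	failures    int
	maxFailures int
	openTimeout time.Duration
	openedAt    time.Time
	now         func() time.Time // injected clock
}

func New(maxFailures int, openTimeout time.Duration, now func() time.Time) *Breaker {
	return &Breaker{state: Closed, maxFailures: maxFailures, openTimeout: openTimeout, now: now}
}

func (b *Breaker) State() State {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}

func (b *Breaker) Execute(op func() error) error {
	b.mu.Lock()
	if b.state == Open {
		if b.now().Sub(b.openedAt) < b.openTimeout {
			b.mu.Unlock()
			return ErrOpen // fail fast without touching the dependency
		}
		b.state = HalfOpen // timeout elapsed: let this call through as a probe
	}
	b.mu.Unlock()

	err := op() // run the call without holding the lock

	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil {
		b.state, b.failures = Closed, 0
		return nil
	}
	b.failures++
	if b.state == HalfOpen || b.failures >= b.maxFailures {
		b.state, b.openedAt = Open, b.now()
	}
	return err
}

// walkthrough drives one full cycle with a fake clock and records what it saw.
func walkthrough() []string {
	clock := time.Unix(0, 0)
	b := New(2, 30*time.Second, func() time.Time { return clock })
	var trace []string
	note := func(s string) { trace = append(trace, s) }
	fail := func() error { note(string(b.State())); return errors.New("gateway timeout") }
	ok := func() error { note(string(b.State())); return nil }

	b.Execute(fail)
	b.Execute(fail) // second consecutive failure trips the breaker
	note(string(b.State()))
	if errors.Is(b.Execute(ok), ErrOpen) {
		note("rejected") // the operation never ran
	}
	clock = clock.Add(31 * time.Second)
	b.Execute(fail) // probe runs in half-open and fails
	note(string(b.State()))
	clock = clock.Add(31 * time.Second)
	b.Execute(ok) // probe succeeds
	note(string(b.State()))
	return trace
}

func main() {
	// [closed closed open rejected half-open open half-open closed]
	fmt.Println(walkthrough())
}
```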

Distributed support:

// Share state across instances via Redis
store := redisstore.New(redisClient)
breaker, _ := cb.New("stripe", cb.WithStore(store))

The numbers

BenchmarkExecute_Closed-8     89 ns/op    0 B/op    0 allocs/op
BenchmarkExecute_Open-8       15 ns/op    0 B/op    0 allocs/op
BenchmarkExecute_Parallel-8   45 ns/op    0 B/op    0 allocs/op

Zero allocations in the hot path. No background goroutines. No magic.
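If you want to check a zero-allocation claim on your own wrapper, `testing.AllocsPerRun` from the standard library is a quick way to do it. The sketch below uses a hypothetical `gate` type as a stand-in for a breaker's fast path; it is not the library's code. The trick it shows is returning a sentinel error created once at startup instead of building a new error on every rejected call.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"testing"
)

// errOpen is created once; returning it allocates nothing per call.
var errOpen = errors.New("circuit open")

// gate is a hypothetical stand-in for a breaker's fast path.
type gate struct{ open atomic.Bool }

func (g *gate) Execute(op func() error) error {
	if g.open.Load() {
		return errOpen
	}
	return op()
}

func noop() error { return nil }

// allocsPerCall reports the average number of heap allocations per Execute.
func allocsPerCall(g *gate) float64 {
	return testing.AllocsPerRun(1000, func() { _ = g.Execute(noop) })
}

func main() {
	g := &gate{}
	fmt.Println("closed:", allocsPerCall(g))
	g.open.Store(true)
	fmt.Println("open:", allocsPerCall(g))
}
```

Both paths should report zero. Swap the sentinel for `fmt.Errorf(...)` inside `Execute` and the open path starts allocating on every rejection.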

Design constraints

| Rule | Why |
|------|-----|
| No background workers | Predictable resource usage |
| No framework coupling | Drop into any codebase |
| Explicit configuration | No surprises in production |
| Simple state machine | Easy to reason about at 3 AM |

~500 lines of code. One dependency (Redis client, optional).

Features

  • Consecutive failure counting
  • Sliding window with error rate threshold
  • Configurable half-open probes
  • Custom failure detection (WithIsFailure)
  • State change callbacks (WithOnStateChange)
  • Redis store for distributed systems
  • Thread-safe, race-free
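Custom failure detection matters because not every error means the dependency is sick. The sketch below shows the kind of classifier you might hand to `WithIsFailure`. This post does not show that option's signature, so `func(error) bool` is an assumption, and `ErrCardDeclined` and `ErrGatewayTimeout` are hypothetical errors invented for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical errors a payment client might return.
var (
	ErrCardDeclined   = errors.New("card declined")
	ErrGatewayTimeout = errors.New("gateway timeout")
)

// isFailure decides whether an error says something about the gateway's health.
func isFailure(err error) bool {
	switch {
	case err == nil:
		return false
	case errors.Is(err, context.Canceled):
		return false // the caller gave up; the gateway did nothing wrong
	case errors.Is(err, ErrCardDeclined):
		return false // a business outcome, delivered by a healthy gateway
	default:
		return true // timeouts, connection resets, and anything unknown
	}
}

func main() {
	wrapped := fmt.Errorf("charge failed: %w", ErrCardDeclined)
	fmt.Println(isFailure(nil))               // false
	fmt.Println(isFailure(wrapped))           // false
	fmt.Println(isFailure(context.Canceled))  // false
	fmt.Println(isFailure(ErrGatewayTimeout)) // true
}
```

Without a filter like this, a burst of declined cards could open the circuit on a gateway that is working perfectly.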

Installation

go get github.com/aqylsoft/circuitbreaker
go get github.com/aqylsoft/circuitbreaker/redisstore # optional

The real reason this exists

I still remember staring at dashboards at 3 AM, watching everything slow down because of one broken dependency, and thinking: "we should have done this properly from day one".

This is me doing it properly. Years later.


GitHub: https://github.com/aqylsoft/circuitbreaker

PRs and issues welcome. If this is useful — a star would be nice.
