Circuit breakers and bulkheads are design patterns — their numbers are operational weapons. Failure ratio 50% or 30%? Max concurrent calls 25 or 100? During an outage the right answer changes hourly. Code the pattern once; tune thresholds live.
Kiponos.io separates resilience structure (in Java) from resilience parameters (in live config tree).
Pattern in code, numbers in Kiponos
public boolean allowCall(String downstream) {
var cfg = kiponos.path("resilience", downstream);
return breaker(downstream)
.failureRateThreshold(cfg.getFloat("failure_rate_threshold"))
.waitDurationInOpenState(cfg.getInt("open_seconds"))
.permittedInHalfOpen(cfg.getInt("half_open_calls"))
.tryAcquire();
}
Ops opens circuit sensitivity during brownout — dashboard edit, not redeploy.
Resilience tree
resilience/
payments-api/
failure_rate_threshold: 0.5
open_seconds: 30
half_open_calls: 5
bulkhead_max_concurrent: 40
inventory-api/
failure_rate_threshold: 0.35
open_seconds: 60
bulkhead_max_concurrent: 25
global/
force_open_all: false
Extreme: coordinated degradation
Platform SRE sets force_open_all: false normally. During regional disaster, flip selective open_seconds sky-high on non-critical downstreams — bulkhead by configuration, Java still executes pattern logic.
Performance
Breaker checks are per-call — getFloat() must be local. See rate limits article.
Getting started
- Externalize Resilience4j YAML values to
resilience/* - Incident drill: tighten
failure_rate_thresholdlive
Resources: github.com/kiponos-io/kiponos-io
Kiponos.io — resilience patterns with live numbers. Breakers that bend during the outage.
Top comments (0)