Let’s build a practical, copy-pasteable guide you can use right away: patterns, pom.xml / build.gradle snippets, application.yml examples, complete code samples (CircuitBreaker, Semaphore and ThreadPool Bulkhead, Retry, TimeLimiter, RateLimiter), combining annotations, testing tips, monitoring, tuning, and deployment notes.
Assumptions: you’re using Spring Boot with the resilience4j-spring-boot2 or resilience4j-spring-boot3 integration (Resilience4j 1.x and 2.x work similarly). I'll show plain Java + Spring examples (non-reactive). If you want reactive examples later, I can add them.
Choose Resilience4j versions that are compatible with your Spring Boot version. (I avoided locking to a single Boot version.)
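Resilience4j settings can be defined per instance or as shared defaults in application.yml, for example under resilience4j.circuitbreaker.instances.<name> for one instance and resilience4j.circuitbreaker.configs.default for defaults. The same layout applies to each annotation/feature.
Also expose the actuator endpoints you need (for example health, metrics, and prometheus) via management.endpoints.web.exposure.include.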
- Circuit Breaker: stop calling a failing downstream service so callers fail fast and the service can recover.
- Bulkhead (Semaphore): protect CPU/memory by limiting concurrent calls within the same process.
- Bulkhead (ThreadPool): isolate blocking calls by running them on a dedicated thread pool.
- Retry: retry transient errors, but with backoff and a limited number of attempts.
- TimeLimiter: bound latency for async calls by timing out the returned future.
- RateLimiter: limit throughput to a downstream service (or cap your own outgoing request rate).
- Combine: a common pattern is RateLimiter → Bulkhead → CircuitBreaker → TimeLimiter → Retry. The right order depends on the semantics you want; retries usually wrap network calls, but take care not to retry in ways that add load to a struggling service.
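To make the "backoff and a limited number of attempts" part of the Retry bullet concrete, here is a minimal plain-JDK sketch of an exponential backoff schedule. The class name `Backoff` and its parameters are hypothetical and only for illustration; Resilience4j has its own interval functions (such as `IntervalFunction.ofExponentialBackoff`), and this is not their implementation.

```java
// Plain-JDK sketch of an exponential backoff schedule (hypothetical helper,
// not part of Resilience4j).
class Backoff {

    // Delay before retry number `attempt` (1-based):
    // baseMillis * multiplier^(attempt - 1), capped at maxMillis.
    static long delayMillis(int attempt, long baseMillis, double multiplier, long maxMillis) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        double raw = baseMillis * Math.pow(multiplier, attempt - 1);
        return (long) Math.min(raw, (double) maxMillis);
    }
}
```

With a base of 200 ms and a multiplier of 2, the delays grow as 200, 400, 800 ms and so on until they hit the cap, which keeps a burst of retries from hammering a service that is already struggling.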
This example demonstrates combining annotations plus fallback.
ExternalClient.java - a thin HTTP client (using RestTemplate)
ResilientService.java - apply Resilience4j annotations
Important notes about fallback signatures:
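- Method name must match fallbackMethod (case-sensitive), and the fallback must live in the same class as the annotated method.
- Fallback method parameters must be the original method parameters followed by one exception parameter (Throwable or a more specific exception type).
- Return type must match the original method's return type.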
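The signature rules above are easier to see side by side. The sketch below is plain Java so that it compiles without Resilience4j on the classpath: the annotation appears only in a comment, and `callWithFallback` is a hand-written stand-in for what the annotation's aspect does on failure. The names `FallbackShape`, `fetch`, and `fetchFallback` are hypothetical.

```java
// Plain-Java illustration of the fallback signature rules.
class FallbackShape {

    // In a Spring bean this method would carry:
    // @CircuitBreaker(name = "myCb", fallbackMethod = "fetchFallback")
    static String fetch(String id) {
        throw new IllegalStateException("downstream unavailable for " + id);
    }

    // Same parameters as fetch(...), then a trailing Throwable; same return type.
    static String fetchFallback(String id, Throwable cause) {
        return "cached-" + id;
    }

    // Stand-in for the aspect: call the method, route failures to the fallback.
    static String callWithFallback(String id) {
        try {
            return fetch(id);
        } catch (RuntimeException e) {
            return fetchFallback(id, e);
        }
    }
}
```

If the fallback's name, parameter list, or return type drifts away from the annotated method, the mismatch typically surfaces only at runtime, so it is worth covering each fallback with a unit test.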
AsyncResilientService.java - asynchronous pattern (TimeLimiter + ThreadPool Bulkhead)
Notes:
- @TimeLimiter works on CompletionStage / CompletableFuture.
- @Bulkhead with Type.THREADPOOL expects an async return type; with the Spring annotations that means a CompletionStage such as CompletableFuture.
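As a rough picture of what the TimeLimiter + ThreadPool Bulkhead combination gives you, here is a plain-JDK sketch: a blocking call runs on a dedicated pool and the returned future is timed out. It uses `CompletableFuture.orTimeout` (Java 9+) rather than Resilience4j, and the class name, pool size, and thread name are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Plain-JDK sketch: dedicated pool + timeout + fallback (not Resilience4j's implementation).
class AsyncTimeoutSketch {

    // Dedicated pool of daemon threads so slow downstream calls cannot occupy request threads.
    static final ExecutorService POOL = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "downstream-pool");
        t.setDaemon(true);
        return t;
    });

    static CompletableFuture<String> callWithTimeout(Supplier<String> blockingCall,
                                                     long timeoutMillis,
                                                     String fallback) {
        return CompletableFuture.supplyAsync(blockingCall, POOL)
                // Completes the future with a TimeoutException after the deadline (Java 9+).
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                // Any failure, including the timeout, is mapped to the fallback value.
                .exceptionally(ex -> fallback);
    }
}
```

Note that timing out the future does not stop the task that is still running on the pool; the worker thread stays busy until the blocking call returns. That is why the pool needs to be sized and bounded separately from the timeout.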
Order can matter. A typical ordering for outbound calls:
- RateLimiter — avoid hitting downstream too frequently.
- Bulkhead — limit concurrency so your service doesn’t exhaust resources.
- CircuitBreaker — prevent repeated calls to failing service.
- TimeLimiter — bound call latency (for async calls).
- Retry — apply retries only when appropriate (often after circuit/bulkhead/timeouts depending on the semantics you want).
But reality is nuanced. For example, retries can amplify load on a service that is already struggling, so combine them with circuit breakers and backoff.
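Why order matters can be shown with two tiny hand-rolled decorators. In this plain-JDK sketch, `guard` stands in for any protective layer (rate limiter, bulkhead, circuit breaker) and simply counts how often it is entered; `DecoratorOrder`, `retry`, and `guard` are hypothetical names. For reference, the Resilience4j documentation lists the default aspect order for the Spring annotations as Retry ( CircuitBreaker ( RateLimiter ( TimeLimiter ( Bulkhead ( Function ) ) ) ) ), which puts Retry outermost.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Plain-JDK sketch showing that decorator nesting order changes behavior.
class DecoratorOrder {

    // Retries the supplier up to maxAttempts times, rethrowing the last failure.
    static <T> Supplier<T> retry(int maxAttempts, Supplier<T> inner) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return inner.get();
                } catch (RuntimeException e) {
                    last = e;
                }
            }
            throw last;
        };
    }

    // Stand-in for a guard layer: counts how many times it is entered.
    static <T> Supplier<T> guard(AtomicInteger entries, Supplier<T> inner) {
        return () -> {
            entries.incrementAndGet();
            return inner.get();
        };
    }
}
```

With retry outermost, `retry(3, guard(counter, call))`, a failing call passes through the guard three times, so every attempt consumes a permit or counts as a failure. With the guard outermost, `guard(counter, retry(3, call))`, the guard sees one logical call regardless of how many attempts happen inside it.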
You can also configure default circuit breaker settings programmatically in a @Configuration class.
(Resilience4j also supports configuring defaults via application.yml, which is simpler for most teams.)
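To show what circuit breaker settings such as a failure threshold, a wait duration in the open state, and half-open probing actually control, here is a deliberately simplified plain-JDK state machine. It is not Resilience4j's implementation: it opens after a number of consecutive failures, whereas Resilience4j evaluates a failure rate over a sliding window. `SimpleCircuitBreaker` and its constructor parameters are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Minimal circuit breaker sketch (consecutive-failure based, for illustration only).
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long to stay OPEN before probing
    private final LongSupplier clock;     // injectable time source in milliseconds

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized State state() {
        // Once the wait period has elapsed, move to HALF_OPEN to allow a probe call.
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Simplification: the lock is held during the call, so calls are serialized.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get();            // fail fast; downstream is not called
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;             // a successful probe closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;           // trip (or re-trip after a failed probe)
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }
}
```

The injectable clock is there only to make the OPEN to HALF_OPEN transition testable without sleeping.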
- Add micrometer-registry-prometheus and spring-boot-starter-actuator.
- Resilience4j exposes meters that Micrometer picks up. Prometheus can scrape /actuator/prometheus.
- CircuitBreaker state (OPEN/HALF_OPEN/CLOSED)
- Failure rate, slow-call rate
- Bulkhead queue sizes and rejected calls
- Retry calls count and successes/failures
- Timeouts
Unit test: mock the external client and simulate failures.
Integration test: use WireMock to simulate downstream behavior (timeouts, slow responses, 500s) and test circuit transitions and metrics.
Load test: use Gatling or JMeter to drive fault-injection scenarios and measure how the circuit breaker and bulkhead behave under load.
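For the unit-test level, a hand-written test double is often enough to simulate a downstream that fails a few times and then recovers. The sketch below is plain JDK with no mocking library; `FlakyClient` is a hypothetical stand-in for the external client, not a class from the original project.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical test double: fails the first `failures` calls, then succeeds.
class FlakyClient implements Supplier<String> {
    private final int failures;
    private final AtomicInteger calls = new AtomicInteger();

    FlakyClient(int failures) {
        this.failures = failures;
    }

    @Override
    public String get() {
        if (calls.incrementAndGet() <= failures) {
            throw new IllegalStateException("simulated 500");
        }
        return "ok";
    }

    // Lets a test assert how many times the code under test actually called downstream.
    int callCount() {
        return calls.get();
    }
}
```

A test can then assert both the final result and `callCount()`, which is a direct way to check that a retry policy made exactly as many attempts as configured.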
- Start conservative: permit enough calls for early testing, then tighten thresholds with real telemetry.
- minimumNumberOfCalls: set it high enough that the circuit doesn't open on a tiny sample.
- Half-open trials: allow a small number of calls to probe the downstream service (permittedNumberOfCallsInHalfOpenState).
- Retries: use exponential backoff (tune the initial interval and multiplier) and make sure non-idempotent operations are never retried by mistake.
- Bulkheads: prefer semaphore for low-latency operations and threadpool for blocking calls (DB, legacy blocking HTTP).
- TimeLimiter: don’t rely solely on TimeLimiter; combine it with proper thread pool management to avoid exhaustion.
- Fallbacks: return cached values or degrade gracefully. Avoid heavy logic in fallback methods.
- Metrics: instrument the system and set alerts (e.g., circuit open > X minutes, failure rate > Y%).
- Observability: trace distributed calls with OpenTelemetry/Zipkin and tag traces with circuit/bulkhead outcomes.
- Retry + Bulkhead: retries inside the same process can exhaust concurrency, so be careful when combining retry with a semaphore bulkhead.
- Retrying non-idempotent operations: can cause side effects (e.g., duplicate payments).
- Fallback signature mismatch: causes runtime exceptions; ensure parameter order and types are correct.
- Blocking calls on main server threads: if you use a threadpool bulkhead but your fallback or calling code blocks the calling thread, you may still exhaust the web server's connector threads.
- Overly aggressive thresholds: opening circuits too early causes unnecessary failures.
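The Retry + Bulkhead pitfall is easier to reason about with the bulkhead mechanism in view. Here is a plain-JDK sketch of a semaphore bulkhead built on `java.util.concurrent.Semaphore`; it is an illustration of the idea, not Resilience4j's implementation, and `SemaphoreBulkhead` is a hypothetical name.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Plain-JDK sketch of a semaphore bulkhead that rejects instead of waiting.
class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Runs the action if a permit is free; otherwise returns the rejection value immediately.
    <T> T call(Supplier<T> action, Supplier<T> onRejected) {
        if (!permits.tryAcquire()) {
            return onRejected.get();
        }
        try {
            return action.get();
        } finally {
            permits.release();   // always hand the permit back, even on failure
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

A permit is held for the whole duration of `action`. If the action retries internally and sleeps between attempts, it keeps its permit through every backoff pause, and other callers are rejected in the meantime even though no real work is happening.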
src/main/java/com/example/resilience/
- config/
  - Resilience4jConfig.java
- client/
  - ExternalClient.java
- service/
  - ResilientService.java
  - AsyncResilientService.java
- web/
  - DemoController.java
src/test/java/...
application.yml
pom.xml
- Cache fallback: keep a small cache (Caffeine) of last-known-good responses and return them from fallbacks.
- Bulkhead metrics exporter: create a scheduled job to emit bulkhead queue metrics if you want fine-grained alerts.
- Circuit breaker event listener: subscribe to events for logging/alerts (for example via circuitBreaker.getEventPublisher().onStateTransition(...)).
- @CircuitBreaker(name = "myCb", fallbackMethod = "fallback")
- @Bulkhead(name = "b1", type = Bulkhead.Type.SEMAPHORE, fallbackMethod = "fb")
- @Bulkhead(name = "tpb", type = Bulkhead.Type.THREADPOOL, fallbackMethod = "fb") - method should be async (CompletableFuture)
- @Retry(name = "r1", fallbackMethod = "fb")
- @TimeLimiter(name = "tl1", fallbackMethod = "fb") - for async methods returning CompletionStage
- @RateLimiter(name = "rl1", fallbackMethod = "fb")
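Most of the fallbackMethod entries above need something sensible to return. One option mentioned earlier is a last-known-good cache; the sketch below shows the idea with a plain `ConcurrentHashMap` instead of Caffeine, so it has no eviction or size limit. `LastKnownGood` and its parameters are hypothetical, and a real cache would need bounds and expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Plain-JDK sketch of a last-known-good fallback (no eviction; illustration only).
class LastKnownGood<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // the real downstream call; must not return null
    private final V defaultValue;          // returned when nothing has been cached yet

    LastKnownGood(Function<K, V> loader, V defaultValue) {
        this.loader = loader;
        this.defaultValue = defaultValue;
    }

    V get(K key) {
        try {
            V fresh = loader.apply(key);
            cache.put(key, fresh);          // remember the latest successful response
            return fresh;
        } catch (RuntimeException e) {
            // Degrade gracefully: stale value if we have one, otherwise the default.
            return cache.getOrDefault(key, defaultValue);
        }
    }
}
```

Serving stale data is a product decision as much as a technical one, so it helps to document which endpoints are allowed to do this and for how long.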
Add micrometer-registry-prometheus dependency and ensure /actuator/prometheus is exposed.
- ✅ Feature flags for toggling aggressive resilience settings
- ✅ End-to-end tests with injected downstream failures
- ✅ Metrics & dashboards set up (Prometheus + Grafana)
- ✅ Alerts on circuit open duration and failure rate thresholds
- ✅ Observability (tracing) to correlate client and server traces
- ✅ Documented fallback behaviors (what the system returns when degraded)
- ✅ Load testing to validate bulkhead and threadpool sizing
Originally published at https://dev.to on March 20, 2026.
