DEV Community

crypto programer

Posted on • Originally published at Medium on

Full resiliency guide for Spring Boot microservices — using all Resilience4j annotations

This is a practical, copy-pasteable guide you can use right away: patterns, pom.xml / build.gradle snippets, application.yml examples, complete code samples (CircuitBreaker, Semaphore and ThreadPool Bulkhead, Retry, TimeLimiter, RateLimiter), combining annotations, testing tips, monitoring, tuning, and deployment notes.

Assumptions: you’re using Spring Boot with the resilience4j-spring-boot2 or resilience4j-spring-boot3 integration (Resilience4j 1.x and 2.x work similarly). The examples are plain Java + Spring (non-reactive); reactive variants are not covered here.

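Choose versions compatible with your Spring Boot release; this guide deliberately avoids locking to a single Boot version. The annotation support also expects spring-boot-starter-aop and spring-boot-starter-actuator on the classpath.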

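Resilience4j settings can be defined per instance or as shared defaults. In application.yml each feature has its own section, for example resilience4j.circuitbreaker.configs.default for shared defaults and resilience4j.circuitbreaker.instances.<name> for a single instance.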

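Also expose the Actuator endpoints you need (such as health, metrics, and prometheus) via management.endpoints.web.exposure.include.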

  • Circuit Breaker: stop calling a failing downstream service so callers fail fast and the service can recover.
  • Bulkhead (Semaphore): protect CPU/memory by limiting concurrent calls within the same process.
  • Bulkhead (ThreadPool): isolate blocking calls by running them on a dedicated thread pool.
  • Retry: retry transient errors, but with backoff and a limited number of attempts.
  • TimeLimiter: bound latency for async calls by applying a timeout to the returned future.
  • RateLimiter: limit throughput to a downstream service (or limit your own outgoing calls).
  • Combine: a common pattern is RateLimiter → Bulkhead → CircuitBreaker → TimeLimiter → Retry. The order depends on the semantics you want; retries usually wrap network calls, but take care not to retry in ways that worsen load.
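To make the circuit breaker idea from the list above concrete, here is a minimal plain-JDK sketch of the CLOSED / OPEN / HALF_OPEN state machine. It is not Resilience4j's implementation: the class name `TinyCircuitBreaker` is hypothetical, it counts consecutive failures rather than a failure rate over a sliding window, and the caller passes the clock in explicitly so the behavior is easy to follow.

```java
import java.util.function.Supplier;

/** Hand-rolled illustration of the circuit breaker pattern (hypothetical class, not Resilience4j). */
final class TinyCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures that open the circuit
    private final long openMillis;      // how long to stay OPEN before allowing a probe
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    TinyCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized State state() {
        return state;
    }

    /** Runs the action unless the circuit is OPEN; failures and rejections go to the fallback. */
    <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (!permits(nowMillis)) {
            return fallback.get(); // fail fast without touching the downstream
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure(nowMillis);
            return fallback.get();
        }
    }

    private synchronized boolean permits(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN; // wait elapsed: let probe calls through
        }
        return state != State.OPEN;
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure(long nowMillis) {
        consecutiveFailures++;
        // A failed probe reopens immediately; otherwise open once the threshold is reached.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = nowMillis;
        }
    }
}
```

A production breaker would also cap the number of probe calls in HALF_OPEN and require a minimum sample size before opening, which is what the library's configuration options are for.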

This example demonstrates combining annotations plus fallback.

ExternalClient.java - a thin HTTP client (using RestTemplate)

ResilientService.java - apply Resilience4j annotations

Important notes about fallback signatures:

  • Method name must match fallbackMethod (case-sensitive).
  • The fallback method must live in the same class and take the original method parameters, in the same order, followed by one exception parameter (Throwable or a more specific type).
  • Return type must match.
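The signature rules above are easier to see side by side. The following plain-Java sketch shows only the shape of a method and its fallback; `ProfileService` and its methods are hypothetical names, and the explicit try/catch stands in for what the annotation-driven proxy would do for you.

```java
/** Illustration of the fallback contract (hypothetical names, no Resilience4j dependency). */
final class ProfileService {

    // Hypothetical primary call; it always fails here to force the fallback path.
    String loadProfile(String userId, int version) {
        throw new IllegalStateException("downstream unavailable");
    }

    // Fallback: same parameters in the same order, one trailing exception parameter,
    // and the same return type as the primary method.
    String loadProfileFallback(String userId, int version, Throwable cause) {
        return "default-profile:" + userId + ":v" + version + " (" + cause.getMessage() + ")";
    }

    // Roughly what happens on failure: the original arguments plus the exception
    // are handed to the fallback, and its result is returned to the caller.
    String loadProfileResilient(String userId, int version) {
        try {
            return loadProfile(userId, version);
        } catch (RuntimeException e) {
            return loadProfileFallback(userId, version, e);
        }
    }
}
```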

AsyncResilientService.java - asynchronous pattern (TimeLimiter + ThreadPool Bulkhead)

Notes:

  • @TimeLimiter works on CompletionStage / CompletableFuture.
  • @Bulkhead with Type.THREADPOOL expects an async return type (e.g., CompletionStage / Future / CompletableFuture).
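As a rough picture of what the async combination does, here is a plain-JDK sketch (Java 9+ for `orTimeout`) that runs a blocking call on a small dedicated pool and bounds its latency. `IsolatedAsyncCaller`, the pool sizes, and the fallback value are assumptions for illustration; this is not how Resilience4j implements its thread pool bulkhead or TimeLimiter.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Dedicated pool plus timeout, sketched with the JDK only (hypothetical class). */
final class IsolatedAsyncCaller {
    // Bounded pool: 2 daemon threads and at most 4 queued tasks; further submissions are rejected.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(4),
            runnable -> {
                Thread thread = new Thread(runnable, "downstream-pool");
                thread.setDaemon(true);
                return thread;
            });

    /** Runs the blocking call off the caller's thread and falls back on timeout, failure, or a full pool. */
    CompletableFuture<String> call(Supplier<String> blockingCall, long timeoutMillis, String fallbackValue) {
        try {
            return CompletableFuture.supplyAsync(blockingCall, pool)
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS) // completes exceptionally when too slow
                    .exceptionally(error -> fallbackValue);
        } catch (RejectedExecutionException poolFull) {
            // The pool and its queue are saturated: reject instead of piling up more work.
            return CompletableFuture.completedFuture(fallbackValue);
        }
    }
}
```

Note that `orTimeout` only completes the future; the slow task keeps its pool thread until it finishes, which is one reason a bounded pool matters alongside a timeout.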

Order can matter. A typical ordering for outbound calls:

  1. RateLimiter: avoid hitting the downstream too frequently.
  2. Bulkhead: limit concurrency so your service doesn’t exhaust resources.
  3. CircuitBreaker: prevent repeated calls to a failing service.
  4. TimeLimiter: bound call latency (for async calls).
  5. Retry: apply retries only when appropriate (often after circuit/bulkhead/timeouts depending on the semantics you want).

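But reality is nuanced. With the Spring annotations, for example, Resilience4j's default aspect order is Retry ( CircuitBreaker ( RateLimiter ( TimeLimiter ( Bulkhead ( method ) ) ) ) ): Retry is the outermost decorator, so every retry attempt passes through the circuit breaker again.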
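To see why order matters, it helps to remember that the outermost decorator runs first and observes everything inside it. The plain-Java sketch below (Java 9+) only records entry and exit of each layer; `DecoratorOrder` and the layer names are illustrative and no resilience logic is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Records the order in which nested decorators run (illustration only). */
final class DecoratorOrder {

    /** Wraps inner so that entering and leaving the named layer is written to the log. */
    static <T> Supplier<T> layer(String name, List<String> log, Supplier<T> inner) {
        return () -> {
            log.add("enter " + name);
            try {
                return inner.get();
            } finally {
                log.add("exit " + name);
            }
        };
    }

    /** Builds retry( circuitBreaker( bulkhead( call ) ) ), invokes it once, and returns the log. */
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Supplier<String> call = () -> {
            log.add("call");
            return "ok";
        };
        Supplier<String> decorated =
                layer("retry", log, layer("circuitBreaker", log, layer("bulkhead", log, call)));
        decorated.get();
        return log;
    }
}
```

With this nesting, a retry layer would re-enter the circuit breaker and bulkhead on every attempt; swapping the layers changes which component sees each attempt.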

Be careful: retries can amplify load on a downstream that is already struggling, so pair them with circuit breakers and backoff.
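The backoff part can be sketched in a few lines of plain Java. This is a simplified stand-in for what a retry library does, not Resilience4j's code: `BackoffRetry` and its parameters are hypothetical, there is no jitter, and every RuntimeException is treated as retryable, which a real setup would narrow down.

```java
import java.util.function.Supplier;

/** Bounded retry with exponential backoff (hypothetical helper, JDK only). */
final class BackoffRetry {

    /** Delay before the given retry (1-based): initial * multiplier^(retry - 1), capped at maxMillis. */
    static long delayMillis(int retryNumber, long initialMillis, double multiplier, long maxMillis) {
        double raw = initialMillis * Math.pow(multiplier, retryNumber - 1);
        return (long) Math.min(raw, (double) maxMillis);
    }

    /** Calls the action up to maxAttempts times, sleeping with backoff between failed attempts. */
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts,
                               long initialMillis, double multiplier, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // attempts exhausted: give up and rethrow below
                }
                try {
                    Thread.sleep(delayMillis(attempt, initialMillis, multiplier, maxMillis));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw e; // stop retrying when the thread is interrupted
                }
            }
        }
        throw last;
    }
}
```

With an initial delay of 100 ms and a multiplier of 2, the waits grow as 100, 200, 400 ms and so on until the cap; the attempt limit keeps a persistent outage from turning into unbounded extra traffic.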

You can configure default circuit breaker settings via a @Configuration:

(Resilience4j also supports configuring defaults via application.yml which is simpler for most teams.)

  • Add micrometer-registry-prometheus and spring-boot-starter-actuator.
  • Resilience4j exposes meters that Micrometer picks up. Prometheus can scrape /actuator/prometheus.

Key metrics to watch:

  • CircuitBreaker state (OPEN/HALF_OPEN/CLOSED)
  • Failure rate, slow-call rate
  • Bulkhead queue sizes and rejected calls
  • Retry calls count and successes/failures
  • Timeouts

Unit test: mock the external client and simulate failures.

Integration test: use WireMock to simulate downstream behavior (timeouts, slow responses, 500s) and verify circuit state transitions and metrics.

Load test: use Gatling or JMeter with fault injection and measure how the circuit breaker and bulkhead behave under load.
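For the unit-test level, a hand-written test double is often enough to simulate downstream failures. The sketch below is dependency-free on purpose; in a real project a mocking library would typically play this role. `QuoteClient`, `FlakyQuoteClient`, and `QuoteService` are hypothetical names standing in for the external client and the service under test.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical client interface standing in for the real HTTP client. */
interface QuoteClient {
    String fetchQuote(String symbol);
}

/** Test double that throws for the first N calls and succeeds afterwards. */
final class FlakyQuoteClient implements QuoteClient {
    private final AtomicInteger remainingFailures;
    private final AtomicInteger calls = new AtomicInteger();

    FlakyQuoteClient(int failures) {
        this.remainingFailures = new AtomicInteger(failures);
    }

    @Override
    public String fetchQuote(String symbol) {
        calls.incrementAndGet();
        if (remainingFailures.getAndDecrement() > 0) {
            throw new IllegalStateException("simulated 503");
        }
        return "quote:" + symbol;
    }

    int callCount() {
        return calls.get();
    }
}

/** Service under test: degrades to a default value when the client fails. */
final class QuoteService {
    private final QuoteClient client;

    QuoteService(QuoteClient client) {
        this.client = client;
    }

    String quoteOrDefault(String symbol) {
        try {
            return client.fetchQuote(symbol);
        } catch (RuntimeException e) {
            return "quote-unavailable";
        }
    }
}
```

A test can then assert both the degraded response while the stub is failing and the normal response once it recovers, plus the number of calls that reached the client.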

  • Start conservative: use lenient thresholds during early testing, then tighten them using real telemetry.
  • MinimumNumberOfCalls: set minimumNumberOfCalls so the circuit doesn't open on a tiny sample.
  • Half-open trials: allow a small number of calls to probe the downstream (permittedNumberOfCallsInHalfOpenState).
  • Retries: use exponential backoff (tuned to your downstream) and make sure non-idempotent operations are never retried by mistake.
  • Bulkheads: prefer semaphore for low-latency operations and threadpool for blocking calls (DB, legacy blocking HTTP).
  • TimeLimiter: don’t rely solely on TimeLimiter; a timed-out task can keep occupying its thread, so combine it with proper thread pool management to avoid exhaustion.
  • Fallbacks: return cached values or degrade gracefully. Avoid heavy logic in fallback methods.
  • Metrics: instrument the system and use alerts (e.g., circuit open > X minutes, failure rate > Y%).
  • Observability: trace distributed calls with OpenTelemetry/Zipkin and tag traces with circuit/bulkhead outcomes.

Common pitfalls:

  • Retry + Bulkhead: retries inside the same process can exhaust concurrency, so be careful when combining retry with a semaphore bulkhead.
  • Retrying non-idempotent operations: can cause side effects (e.g., duplicate payments).
  • Fallback signature mismatch: causes runtime exceptions; ensure parameter order and types are correct.
  • Blocking calls on main server threads: even with a threadpool bulkhead, if your fallback or calling code blocks the request thread (for example by waiting on the future), you can still exhaust the server's request threads.
  • Overly aggressive thresholds: opening circuits too early causes unnecessary failures.
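The semaphore bulkhead mentioned in the list above boils down to a permit counter around the call. Here is a minimal plain-JDK sketch; `SemaphoreBulkhead` is a hypothetical class that rejects immediately when no permit is free, and it is not Resilience4j's implementation.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Limits concurrent calls with a semaphore (illustration only). */
final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the action if a permit is free; otherwise returns the rejection result without waiting. */
    <T> T call(Supplier<T> action, Supplier<T> onRejected) {
        if (!permits.tryAcquire()) {
            return onRejected.get(); // bulkhead full
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always hand the permit back, even when the action throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Because the permit is held for the whole duration of the action, anything that re-enters the same bulkhead from inside a protected call (a nested call, or a retry loop placed inside it) competes for the same small set of permits.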

Suggested project layout:

src/main/java/com/example/resilience/
  - config/
    - Resilience4jConfig.java
  - client/
    - ExternalClient.java
  - service/
    - ResilientService.java
    - AsyncResilientService.java
  - web/
    - DemoController.java
src/test/java/...
application.yml
pom.xml

  • Cache Fallback: keep a small cache (Caffeine) of last-known-good responses and return them from fallbacks.
  • Bulkhead metrics exporter: create a scheduled job to emit bulkhead queue metrics if you want fine-grained alerts.
  • Circuit breaker event listener: subscribe to events for logging/alerts.

Annotation quick reference:

  • @CircuitBreaker(name = "myCb", fallbackMethod = "fallback")
  • @Bulkhead(name = "b1", type = Bulkhead.Type.SEMAPHORE, fallbackMethod = "fb")
  • @Bulkhead(name = "tpb", type = Bulkhead.Type.THREADPOOL, fallbackMethod = "fb") - method should be async (CompletableFuture)
  • @Retry(name = "r1", fallbackMethod = "fb")
  • @TimeLimiter(name = "tl1", fallbackMethod = "fb") - for async methods returning CompletionStage
  • @RateLimiter(name = "rl1", fallbackMethod = "fb")
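The cache fallback idea from the list above can be sketched without any library. The example uses a plain ConcurrentHashMap as a stand-in for Caffeine, so it has no size bound or expiry; `LastKnownGood` and its loader function are hypothetical names.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Remembers the last successful value per key and serves it when the loader fails. */
final class LastKnownGood<K, V> {
    // Unbounded map used only for illustration; a real cache would limit size and age.
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    LastKnownGood(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns a fresh value when possible, else the last known good one, else empty. */
    Optional<V> get(K key) {
        try {
            V fresh = loader.apply(key);
            cache.put(key, fresh); // a null result throws here and is handled like a failure
            return Optional.of(fresh);
        } catch (RuntimeException e) {
            return Optional.ofNullable(cache.get(key));
        }
    }
}
```

In an annotated service the lookup in the catch branch would sit in the fallback method, which keeps the fallback cheap: a single map read and no further remote calls.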

Add micrometer-registry-prometheus dependency and ensure /actuator/prometheus is exposed.

  • ✅ Feature flags for toggling aggressive resilience settings
  • ✅ End-to-end tests with injected downstream failures
  • ✅ Metrics & dashboards set up (Prometheus + Grafana)
  • ✅ Alerts on circuit open duration and failure rate thresholds
  • ✅ Observability (tracing) to correlate client and server traces
  • ✅ Documented fallback behaviors (what the system returns when degraded)
  • ✅ Load testing to validate bulkhead and threadpool sizing

Originally published at https://dev.to on March 20, 2026.
