This is a practical, copy-pasteable guide you can use right away: patterns, pom.xml/build.gradle snippets, application.yml examples, complete code samples (CircuitBreaker, Semaphore & ThreadPool Bulkhead, Retry, TimeLimiter, RateLimiter), combining annotations, testing tips, monitoring, tuning, and deployment notes.
Assumptions: you're using Spring Boot with the resilience4j-spring-boot2 or resilience4j-spring-boot3 integration. Resilience4j 1.x and 2.x work similarly, but the resilience4j-spring-boot3 module only exists from Resilience4j 2.0 onward. I'll show plain Java + Spring examples (non-reactive).
1) Dependencies
Maven (pom.xml)
<!-- core Spring Boot dependencies omitted for brevity -->
<dependencies>
    <!-- Spring Boot starter web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Required for the Resilience4j annotations, which are implemented as Spring AOP aspects -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <!-- Resilience4j starters -->
    <dependency>
        <groupId>io.github.resilience4j</groupId>
        <artifactId>resilience4j-spring-boot3</artifactId><!-- or resilience4j-spring-boot2 for boot2 -->
        <version>2.2.0</version> <!-- the spring-boot3 module requires Resilience4j 2.x; pick a compatible version for your Spring Boot -->
    </dependency>
    <dependency>
        <groupId>io.github.resilience4j</groupId>
        <artifactId>resilience4j-all</artifactId>
        <version>2.2.0</version>
    </dependency>
    <!-- Optionally for metrics -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>
    <!-- Spring Boot actuator to expose metrics -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <!-- For tests -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <!-- Optional: WireMock for integration tests (not managed by Spring Boot, so it needs a version) -->
    <dependency>
        <groupId>com.github.tomakehurst</groupId>
        <artifactId>wiremock-jre8</artifactId>
        <version>2.27.2</version>
        <scope>test</scope>
    </dependency>
</dependencies>
Gradle (Kotlin DSL)
dependencies {
    implementation("org.springframework.boot:spring-boot-starter-web")
    // Required for the Resilience4j annotations, which are implemented as Spring AOP aspects
    implementation("org.springframework.boot:spring-boot-starter-aop")
    // The spring-boot3 module requires Resilience4j 2.x
    implementation("io.github.resilience4j:resilience4j-spring-boot3:2.2.0")
    implementation("io.github.resilience4j:resilience4j-all:2.2.0")
    implementation("io.micrometer:micrometer-registry-prometheus")
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    testImplementation("org.springframework.boot:spring-boot-starter-test")
    testImplementation("com.github.tomakehurst:wiremock-jre8:2.27.2")
}
Choose versions compatible with your Spring Boot. (I avoided locking to a single Boot version.)
2) application.yml — configuration examples
Resilience4j configs are either per instance or global defaults. Here are examples for each annotation/feature:
resilience4j:
  circuitbreaker:
    configs:
      default:
        registerHealthIndicator: true
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 20
        minimumNumberOfCalls: 10
        permittedNumberOfCallsInHalfOpenState: 5
        waitDurationInOpenState: 30s
        failureRateThreshold: 50
        automaticTransitionFromOpenToHalfOpenEnabled: false
    instances:
      externalServiceCB:
        baseConfig: default
        waitDurationInOpenState: 10s
        failureRateThreshold: 40
  retry:
    instances:
      externalServiceRetry:
        maxAttempts: 3
        waitDuration: 500ms
        retryExceptions:
          - java.io.IOException
          - java.util.concurrent.TimeoutException
          # RestTemplate wraps I/O errors in this exception, so list it explicitly
          - org.springframework.web.client.ResourceAccessException
  timelimiter:
    instances:
      externalServiceTL:
        timeoutDuration: 2s
        cancelRunningFuture: true
  bulkhead: # semaphore bulkheads
    configs:
      default:
        maxConcurrentCalls: 10
        maxWaitDuration: 0ms
    instances:
      semaphoreBulkhead:
        baseConfig: default
        maxConcurrentCalls: 20
  thread-pool-bulkhead: # thread pool bulkheads are configured under their own key
    configs:
      default:
        maxThreadPoolSize: 10
        coreThreadPoolSize: 5
        queueCapacity: 50
        keepAliveDuration: 30s
    instances:
      threadPoolBulkhead:
        baseConfig: default
  ratelimiter:
    instances:
      externalServiceRateLimiter:
        limitForPeriod: 10
        limitRefreshPeriod: 1s
        timeoutDuration: 0s
Also enable actuator endpoints:
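management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
  endpoint:
    health:
      show-details: always
  health:
    circuitbreakers:
      enabled: true # lets the circuit breaker health indicator show up under /actuator/health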
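To make the circuit breaker settings above less abstract, here is a minimal plain-Java sketch of how a count-based window decides when to open. It is not Resilience4j's implementation: the class name CountBasedWindowSketch is made up for illustration, and it leaves out the half-open state, waitDurationInOpenState, and slow-call tracking. It only shows how slidingWindowSize, minimumNumberOfCalls, and failureRateThreshold interact.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical, simplified count-based failure window (illustration only). */
final class CountBasedWindowSketch {
    private final int slidingWindowSize;
    private final int minimumNumberOfCalls;
    private final double failureRateThreshold; // percent, e.g. 40.0
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private boolean open;

    CountBasedWindowSketch(int slidingWindowSize, int minimumNumberOfCalls, double failureRateThreshold) {
        this.slidingWindowSize = slidingWindowSize;
        this.minimumNumberOfCalls = minimumNumberOfCalls;
        this.failureRateThreshold = failureRateThreshold;
    }

    /** Records one call outcome and opens the circuit if the threshold is reached. */
    void record(boolean failed) {
        if (outcomes.size() == slidingWindowSize) {
            outcomes.removeFirst(); // drop the oldest outcome once the window is full
        }
        outcomes.addLast(failed);
        // The failure rate is only evaluated once enough calls have been seen.
        if (outcomes.size() >= minimumNumberOfCalls && failureRate() >= failureRateThreshold) {
            open = true;
        }
    }

    /** Percentage of failed calls in the current window. */
    double failureRate() {
        if (outcomes.isEmpty()) {
            return 0.0;
        }
        long failures = outcomes.stream().filter(failed -> failed).count();
        return 100.0 * failures / outcomes.size();
    }

    boolean isOpen() {
        return open;
    }
}
```

With the externalServiceCB values (window 20, minimum 10 calls, threshold 40), nine failures in a row leave the sketch closed because the minimum has not been reached; the tenth call triggers the evaluation.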
3) Pattern explanation — when to use what (short)
- Circuit Breaker: stop calling a failing downstream service to fail fast and let it recover.
- Bulkhead (Semaphore): protect CPU/memory by limiting concurrent calls within the same process.
- Bulkhead (ThreadPool): isolate blocking calls by running them on a dedicated thread pool.
- Retry: retry transient errors but with backoff and limited attempts.
- TimeLimiter: bound the latency of async calls by failing the returned future with a TimeoutException once timeoutDuration elapses.
- RateLimiter: limit throughput to a downstream service (or limit your own outgoing).
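- Combine: the annotations can be stacked on one method. With the Spring annotations, Resilience4j applies them in a fixed default order, Retry ( CircuitBreaker ( RateLimiter ( TimeLimiter ( Bulkhead ( method ) ) ) ) ), regardless of how they are listed in the source. Retry wraps everything else, so be careful not to retry in ways that worsen load.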
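As a rough illustration of the semaphore bulkhead idea from the list above, here is a plain-Java sketch built on java.util.concurrent.Semaphore. The class SemaphoreBulkheadSketch is hypothetical and is not how Resilience4j implements its bulkhead; it just mirrors the behaviour you get with maxWaitDuration set to 0, where a call is rejected immediately when no permit is free.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical fail-fast semaphore bulkhead (illustration only). */
final class SemaphoreBulkheadSketch {
    private final Semaphore permits;

    SemaphoreBulkheadSketch(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free; otherwise returns the fallback right away. */
    <T> T execute(Supplier<T> call, Supplier<T> fallback) {
        if (!permits.tryAcquire()) { // no waiting, comparable to maxWaitDuration: 0ms
            return fallback.get();
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always hand the permit back, even if the call throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

The key design point is the finally block: a leaked permit permanently shrinks the bulkhead, which is one reason to let a library manage this rather than hand-rolling it.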
4) Example: service calling external HTTP API (synchronous)
This example demonstrates combining annotations plus fallback.
ExternalClient.java — a thin HTTP client (using RestTemplate)
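import java.time.Duration;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class ExternalClient {
    private final RestTemplate restTemplate;

    public ExternalClient(RestTemplateBuilder builder) {
        // HTTP-level timeouts still apply underneath any Resilience4j pattern
        this.restTemplate = builder
                .setReadTimeout(Duration.ofSeconds(5))
                .setConnectTimeout(Duration.ofSeconds(2))
                .build();
    }

    public String getRemoteData(String id) {
        String url = "https://external.service/api/resource/" + id;
        return restTemplate.getForObject(url, String.class);
    }
}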
ResilientService.java — apply Resilience4j annotations
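import io.github.resilience4j.bulkhead.BulkheadFullException;
import io.github.resilience4j.bulkhead.annotation.Bulkhead;
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.ratelimiter.RequestNotPermitted;
import io.github.resilience4j.ratelimiter.annotation.RateLimiter;
import io.github.resilience4j.retry.annotation.Retry;
import org.springframework.stereotype.Service;

@Service
public class ResilientService {
    private final ExternalClient externalClient;

    public ResilientService(ExternalClient externalClient) {
        this.externalClient = externalClient;
    }

    // Use CircuitBreaker + Semaphore Bulkhead + Retry + RateLimiter.
    // Caveat: a catch-all (Throwable) fallback on an inner annotation returns normally,
    // so outer aspects such as Retry never see the failure. If you want retries to run,
    // keep the catch-all fallback on the outermost annotation only.
    @RateLimiter(name = "externalServiceRateLimiter", fallbackMethod = "rateLimiterFallback")
    @Bulkhead(name = "semaphoreBulkhead", type = Bulkhead.Type.SEMAPHORE, fallbackMethod = "bulkheadFallback")
    @Retry(name = "externalServiceRetry", fallbackMethod = "retryFallback")
    @CircuitBreaker(name = "externalServiceCB", fallbackMethod = "circuitFallback")
    public String getData(String id) {
        return externalClient.getRemoteData(id);
    }

    // Fallback signatures: same return type, same args, plus one trailing exception parameter
    public String circuitFallback(String id, Throwable t) {
        // fallback behavior for circuit breaker
        return "circuit-fallback: cached-or-default";
    }

    public String retryFallback(String id, Throwable t) {
        return "retry-fallback: sorry";
    }

    public String bulkheadFallback(String id, BulkheadFullException ex) {
        return "bulkhead-fallback: overloaded";
    }

    public String rateLimiterFallback(String id, RequestNotPermitted ex) {
        return "rate-limited-fallback: try-later";
    }
}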
Important notes about fallback signatures:
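- The method name must match fallbackMethod (case-sensitive), and the fallback must live in the same class.
- Fallback parameters must be the original method parameters followed by one exception parameter (Throwable, Exception, or a more specific type such as BulkheadFullException).
- The return type must match.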
5) TimeLimiter (async) + ThreadPoolBulkhead example
AsyncResilientService.java — asynchronous pattern (TimeLimiter + ThreadPool Bulkhead)
import io.github.resilience4j.bulkhead.BulkheadFullException;
import io.github.resilience4j.bulkhead.annotation.Bulkhead;
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.timelimiter.annotation.TimeLimiter;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.stereotype.Service;

@Service
public class AsyncResilientService {
    private final ExternalClient externalClient;
    // Note: the THREADPOOL bulkhead already runs this method on its own pool,
    // so this extra executor is optional.
    private final ExecutorService executor = Executors.newFixedThreadPool(10);

    public AsyncResilientService(ExternalClient externalClient) {
        this.externalClient = externalClient;
    }

    // TimeLimiter expects a CompletableFuture (async)
    @Bulkhead(name = "threadPoolBulkhead", type = Bulkhead.Type.THREADPOOL, fallbackMethod = "tpbFallback")
    @TimeLimiter(name = "externalServiceTL", fallbackMethod = "timeLimiterFallback")
    @CircuitBreaker(name = "externalServiceCB", fallbackMethod = "circuitFallback")
    public CompletableFuture<String> getDataAsync(String id) {
        // run blocking/rest call in CompletableFuture using executor
        return CompletableFuture.supplyAsync(() -> externalClient.getRemoteData(id), executor);
    }

    public CompletableFuture<String> tpbFallback(String id, BulkheadFullException ex) {
        return CompletableFuture.completedFuture("threadpool-bulkhead-fallback");
    }

    public CompletableFuture<String> timeLimiterFallback(String id, Throwable t) {
        return CompletableFuture.completedFuture("time-limiter-fallback");
    }

    public CompletableFuture<String> circuitFallback(String id, Throwable t) {
        return CompletableFuture.completedFuture("circuit-fallback-async");
    }
}
Notes:
- @TimeLimiter works on CompletionStage/CompletableFuture return types.
- @Bulkhead with Type.THREADPOOL expects a CompletionStage/CompletableFuture return type; a plain Future is not supported by the annotation.
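If the timeout-plus-fallback behaviour is hard to picture, here is a JDK-only sketch of the same idea using CompletableFuture.orTimeout (Java 9+). It is a stand-in, not Resilience4j's TimeLimiter: the class TimeoutSketch and its method are hypothetical, and the sketch does not stop the underlying work when the timeout fires.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for "time limit + fallback" on an async call (illustration only). */
final class TimeoutSketch {

    /**
     * Completes with the call's result if it arrives in time; otherwise the future
     * fails with a TimeoutException and the fallback value is returned instead.
     */
    static CompletableFuture<String> withTimeout(CompletableFuture<String> call,
                                                 long timeoutMillis,
                                                 String fallbackValue) {
        return call
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(throwable -> fallbackValue); // also catches non-timeout failures
    }
}
```

Calling TimeoutSketch.withTimeout(slowFuture, 2000, "time-limiter-fallback") roughly corresponds to timeoutDuration: 2s with the timeLimiterFallback method above. The bounded latency applies to the caller only, so the thread doing the blocking call can stay busy well past the timeout.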
6) Combining annotations — recommended order & rationale
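The order in which you write the annotations does not control the order in which they run. With the Spring annotations, the aspects are applied in a fixed default order:
Retry ( CircuitBreaker ( RateLimiter ( TimeLimiter ( Bulkhead ( method ) ) ) ) )
- Retry is outermost, so every attempt passes through the other patterns again.
- CircuitBreaker records each attempt and fails fast while it is open.
- RateLimiter avoids hitting downstream too frequently.
- TimeLimiter bounds call latency (for async calls).
- Bulkhead is innermost and limits concurrency so your service doesn't exhaust resources.
If you need a different order, set the aspect-order properties (for example resilience4j.retry.retryAspectOrder and resilience4j.circuitbreaker.circuitBreakerAspectOrder) or use the functional decorator API instead of annotations. Example:
@RateLimiter(...)
@Bulkhead(...)
@CircuitBreaker(...)
@TimeLimiter(...)
@Retry(...)
public CompletableFuture<String> call() { ... }
Be careful: retries can amplify load, so combine them with circuit breakers and backoff.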
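To see why the nesting order matters, here is a small plain-Java sketch with two hand-written decorators. Nothing here is Resilience4j API: DecorationOrderSketch, retry, and countFailures are hypothetical helpers, and countFailures only stands in for the part of a circuit breaker that records failures.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical decorators used to compare nesting orders (illustration only). */
final class DecorationOrderSketch {

    /** Calls the inner supplier up to maxAttempts times and rethrows the last failure. */
    static <T> Supplier<T> retry(Supplier<T> inner, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return inner.get();
                } catch (RuntimeException e) {
                    last = e; // remember the failure and try again
                }
            }
            throw last;
        };
    }

    /** Stand-in for a circuit breaker: counts every failure that passes through it. */
    static <T> Supplier<T> countFailures(Supplier<T> inner, AtomicInteger recordedFailures) {
        return () -> {
            try {
                return inner.get();
            } catch (RuntimeException e) {
                recordedFailures.incrementAndGet();
                throw e;
            }
        };
    }
}
```

With a supplier that always fails, retry(countFailures(call, n), 3) records three failures, while countFailures(retry(call, 3), n) records one. A breaker inside the retry therefore fills its sliding window three times faster, which is worth keeping in mind when you pick minimumNumberOfCalls.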
7) Example controller wiring it all together
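import java.util.concurrent.CompletableFuture;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api")
public class DemoController {
    private final ResilientService resilientService;
    private final AsyncResilientService asyncResilientService;

    public DemoController(ResilientService r, AsyncResilientService ar) {
        this.resilientService = r;
        this.asyncResilientService = ar;
    }

    @GetMapping("/sync/{id}")
    public ResponseEntity<String> sync(@PathVariable String id) {
        return ResponseEntity.ok(resilientService.getData(id));
    }

    @GetMapping("/async/{id}")
    public CompletableFuture<ResponseEntity<String>> async(@PathVariable String id) {
        return asyncResilientService.getDataAsync(id)
                .thenApply(ResponseEntity::ok);
    }
}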
8) Customizing defaults programmatically
You can override the settings of a named circuit breaker instance programmatically via a @Configuration class:
@Configuration
public class Resilience4jConfig {
    // CircuitBreakerConfigCustomizer lives in io.github.resilience4j.common.circuitbreaker.configuration
    @Bean
    public CircuitBreakerConfigCustomizer externalServiceCbCustomizer() {
        // Applied on top of the application.yml settings for the instance with this name.
        return CircuitBreakerConfigCustomizer.of("externalServiceCB",
                builder -> builder.slidingWindowSize(50));
    }
}
(Resilience4j also supports configuring defaults via application.yml which is simpler for most teams.)
9) Metrics and observability
- Expose metrics:
  - Add micrometer-registry-prometheus and spring-boot-starter-actuator.
  - Resilience4j exposes meters that Micrometer picks up. Prometheus can scrape /actuator/prometheus.
- Key metrics to monitor:
  - CircuitBreaker state (OPEN/HALF_OPEN/CLOSED)
  - Failure rate, slow-call rate
  - Bulkhead queue sizes and rejected calls
  - Retry calls count and successes/failures
  - Timeouts
- Health indicators:
  - resilience4j.circuitbreaker.instances.*.registerHealthIndicator=true will register health details.
- Dashboards:
  - Use Grafana + Prometheus. Visualize CB states, failure rate trends, latency percentiles.
10) Tests
Unit test: mock the external client and simulate failures.
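import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.Mockito.when;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.mock.mockito.MockBean;

@SpringBootTest
class ResilientServiceTest {
    @MockBean
    ExternalClient externalClient;
    @Autowired
    ResilientService service;

    @Test
    void whenExternalFailsFallbackIsUsed() {
        // A single failure does not open the circuit (minimumNumberOfCalls is 10);
        // the exception is simply routed to a fallback method.
        when(externalClient.getRemoteData(anyString()))
                .thenThrow(new RuntimeException("down"));
        String result = service.getData("1");
        assertTrue(result.contains("fallback"));
    }
}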
Integration test: use WireMock to simulate downstream behavior (timeouts, slow responses, 500s) and test circuit transitions and metrics.
Load test: use Gatling or JMeter to inject faults and measure how the circuit breaker and bulkhead behave under load.
11) Tuning tips & best practices
- Start conservative: permit enough calls for early testing, then tighten thresholds with real telemetry.
- minimumNumberOfCalls: set it high enough that the circuit doesn't open on a tiny sample.
- Half-open trials: allow a small number of calls to probe downstream (permittedNumberOfCallsInHalfOpenState).
- Retries: use exponential backoff and avoid retrying non-idempotent operations by mistake.
- Bulkheads: prefer semaphore for low-latency operations and threadpool for blocking calls (DB, legacy blocking HTTP).
- TimeLimiter: don't rely solely on TimeLimiter; combine with proper threadpool management to avoid exhaustion.
- Fallbacks: return cached values or degrade gracefully. Avoid heavy logic in fallback methods.
- Metrics: instrument the system and use alerts (e.g., circuit open > X minutes, failure rate > Y%).
- Observability: trace distributed calls with OpenTelemetry/Zipkin and tag traces with circuit/bulkhead outcomes.
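For the exponential backoff tip, it helps to see the actual wait schedule. The sketch below is plain arithmetic in Java; BackoffSketch and waitsMillis are hypothetical names, and the parameters are assumptions chosen to line up with the externalServiceRetry config. In a real project you would more likely rely on Resilience4j's own IntervalFunction.ofExponentialBackoff, or the enableExponentialBackoff and exponentialBackoffMultiplier retry properties, than on hand-written code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the waits between retry attempts (illustration only). */
final class BackoffSketch {

    /**
     * Returns the wait before each retry. With maxAttempts attempts there are
     * maxAttempts - 1 waits, each one multiplier times longer than the previous.
     */
    static List<Long> waitsMillis(long initialIntervalMillis, double multiplier, int maxAttempts) {
        List<Long> waits = new ArrayList<>();
        double wait = initialIntervalMillis;
        for (int retry = 1; retry < maxAttempts; retry++) {
            waits.add((long) wait);
            wait *= multiplier;
        }
        return waits;
    }
}
```

For example, waitsMillis(500, 2.0, 4) yields 500, 1000, and 2000 ms, so a call that fails on every attempt takes at least 3.5 seconds of waiting on top of the four call durations. That total is the number to compare against your caller's own timeout.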
12) Common pitfalls
- Retry + Bulkhead: retries inside the same process can exhaust concurrency — be careful with combining retry and semaphore bulkhead.
- Retrying non-idempotent operations: can cause side effects (e.g., duplicate payments).
- Fallback signature mismatch: causes runtime exceptions; ensure parameter order and types are correct.
- Blocking calls on main server threads: if you use threadpool bulkhead but your fallback or calling code blocks the calling thread, you may still exhaust connectors.
- Overly aggressive thresholds: opening circuits too early causes unnecessary failures.
13) Example project layout (suggested)
src/main/java/com/example/resilience/
- config/
- Resilience4jConfig.java
- client/
- ExternalClient.java
- service/
- ResilientService.java
- AsyncResilientService.java
- web/
- DemoController.java
src/test/java/...
application.yml
pom.xml
14) Handy utilities & patterns
- Cache Fallback: keep a small cache (Caffeine) for last-known-good responses and return in fallbacks.
- Bulkhead metrics exporter: create a scheduled job to emit bulkhead queue metrics if you want fine-grained alerts.
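- Circuit breaker event listener: subscribe to events for logging/alerts:
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.springframework.stereotype.Component;

@Component
public class CircuitBreakerEventListener {
    public CircuitBreakerEventListener(CircuitBreakerRegistry registry) {
        // Breakers that already exist in the registry
        registry.getAllCircuitBreakers().forEach(this::register);
        // Breakers added to the registry later
        registry.getEventPublisher().onEntryAdded(event -> register(event.getAddedEntry()));
    }

    private void register(CircuitBreaker cb) {
        cb.getEventPublisher().onStateTransition(event ->
                // log or alert
                System.out.println("CB " + cb.getName() + " -> " + event.getStateTransition()));
    }
}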
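The cache fallback idea from the list above can be sketched in a few lines. This version uses a ConcurrentHashMap as a stand-in for Caffeine, so it has no size bound or expiry; the class LastKnownGoodCache and its method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical last-known-good store for use in fallback methods (illustration only). */
final class LastKnownGoodCache {
    // Unbounded and never expires; a real cache such as Caffeine would cap size and age.
    private final Map<String, String> lastGood = new ConcurrentHashMap<>();

    /** Call this after every successful downstream response. */
    void remember(String id, String value) {
        lastGood.put(id, value);
    }

    /** Call this from a fallback: returns the last good value, or the default if none exists. */
    String fallbackFor(String id, String defaultValue) {
        return lastGood.getOrDefault(id, defaultValue);
    }
}
```

In ResilientService this would mean calling remember(id, result) after a successful getRemoteData and returning fallbackFor(id, "circuit-fallback: cached-or-default") from circuitFallback. Serving stale data is a product decision, so it is worth documenting how old a cached value may be.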
15) Quick reference: annotation usage examples
- @CircuitBreaker(name = "myCb", fallbackMethod = "fallback")
- @Bulkhead(name = "b1", type = Bulkhead.Type.SEMAPHORE, fallbackMethod = "fb")
- @Bulkhead(name = "tpb", type = Bulkhead.Type.THREADPOOL, fallbackMethod = "fb"): method should be async (CompletableFuture)
- @Retry(name = "r1", fallbackMethod = "fb")
- @TimeLimiter(name = "tl1", fallbackMethod = "fb"): for async methods returning CompletionStage
- @RateLimiter(name = "rl1", fallbackMethod = "fb")
16) Example: common tags for Prometheus metrics (snippet)
@Bean
public MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
    // Tags every meter, including the Resilience4j ones, with the application name
    return registry -> registry.config().commonTags("application", "resilience-demo");
}
Add micrometer-registry-prometheus dependency and ensure /actuator/prometheus is exposed.
17) Final checklist before shipping to production
- ✅ Feature flags for toggling aggressive resilience settings
- ✅ End-to-end tests with injected downstream failures
- ✅ Metrics & dashboards set up (Prometheus + Grafana)
- ✅ Alerts on circuit open duration and failure rate thresholds
- ✅ Observability (tracing) to correlate client and server traces
- ✅ Documented fallback behaviors (what the system returns when degraded)
- ✅ Load testing to validate bulkhead and threadpool sizing