<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Athul Mukundan</title>
    <description>The latest articles on DEV Community by Athul Mukundan (@athulmr).</description>
    <link>https://dev.to/athulmr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2126292%2Fb41afeda-1795-4f8d-a1c8-a349b95de20f.jpg</url>
      <title>DEV Community: Athul Mukundan</title>
      <link>https://dev.to/athulmr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/athulmr"/>
    <language>en</language>
    <item>
      <title>From Resilient to Reactive: Why Your REST Service Needs Events</title>
      <dc:creator>Athul Mukundan</dc:creator>
      <pubDate>Mon, 04 Aug 2025 09:29:51 +0000</pubDate>
      <link>https://dev.to/athulmr/from-resilient-to-reactive-why-your-rest-service-needs-events-4222</link>
      <guid>https://dev.to/athulmr/from-resilient-to-reactive-why-your-rest-service-needs-events-4222</guid>
      <description>&lt;p&gt;Your resilient backend service survived a backend failure. Great! But what happens when 5 other services are waiting for its response and time is ticking?&lt;/p&gt;

&lt;p&gt;You have isolated faults and retries... but you are still blocking threads and cascading delays. What if your service could just... notify and move on?&lt;/p&gt;




&lt;h2&gt;
  
  
  🤔 1. Why REST Falls Short in Complex Systems?
&lt;/h2&gt;

&lt;p&gt;Using REST for service-to-service communication can be a practical choice in the early stages of an application, when system load is low and domain complexity is minimal. REST is straightforward to implement, easy to debug, and simple to reason about.&lt;/p&gt;

&lt;p&gt;However, as systems scale and interdependencies grow, REST-based synchronous communication introduces several challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tight coupling and synchronous waiting&lt;/strong&gt;: Even with resilience patterns like retries and circuit breakers in place, synchronous REST calls can lead to cascading delays. A slowdown in one service can ripple through the system, degrading overall responsiveness and throughput.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🧵 Thread exhaustion and backpressure&lt;/strong&gt;: REST calls tie up threads while waiting for responses. Under load, this can exhaust available threads, increasing latency and triggering timeouts or service degradation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;⛓️‍💥 Fragile service coordination&lt;/strong&gt;: When one dependent service becomes unavailable, it can prevent the entire flow from completing, even if other services are healthy and capable of progressing independently.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz607a7s26ce16uauxilk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz607a7s26ce16uauxilk.png" alt="REST Drawbacks" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 2. Introducing Asynchrones Messaging
&lt;/h2&gt;

&lt;p&gt;As systems grow in complexity and scale, REST-base communication starts to show its limits. To overcome this limitations, many modern architecture shifts to &lt;strong&gt;asynchronous messaging&lt;/strong&gt;, enabling services to &lt;strong&gt;communicate without waiting&lt;/strong&gt; for one another.&lt;/p&gt;

&lt;h3&gt;
  
  
  📣 What is Asynchronous Messaging?
&lt;/h3&gt;

&lt;p&gt;Instead of making a direct call and waiting for response (like in REST), the service &lt;strong&gt;emits and event or sends a message to the message broker&lt;/strong&gt;, and &lt;strong&gt;move on&lt;/strong&gt;. The receiving service can pick up the messages &lt;strong&gt;when it's ready&lt;/strong&gt;, instantly or later without the sender needing to be blocked or needing to retry.&lt;/p&gt;

&lt;p&gt;This decouples service in time, making systems more &lt;strong&gt;resilient, scalable&lt;/strong&gt; and &lt;strong&gt;easier to evolve&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehapcyhmhgn4qu5mjb57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fehapcyhmhgn4qu5mjb57.png" alt="Asynchronous Messaging" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  💡 Benefits of Async Messaging:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;🧩 Loose Coupling: Services don’t need to know about each other’s availability or even existence.&lt;/li&gt;
&lt;li&gt;🕒 Non-Blocking: The sender doesn’t wait for a response improving throughput and avoiding thread exhaustion.&lt;/li&gt;
&lt;li&gt;🔁 Automatic Retries: Many brokers provide built-in retry, dead-lettering, and ordering guarantees.&lt;/li&gt;
&lt;li&gt;📈 Elastic Scalability: Consumers can scale independently based on workload.&lt;/li&gt;
&lt;/ol&gt;




&lt;blockquote&gt;
&lt;p&gt;📚 &lt;em&gt;Missed the previous post?&lt;/em&gt;&lt;br&gt;&lt;br&gt;
👉 &lt;a href="https://dev.to/athulmr/making-rest-microservices-resilient-bulkhead-retry-circuit-breaker-in-practice-1mpl"&gt;Making REST Microservices Resilient: Bulkhead, Retry &amp;amp; Circuit Breaker in Practice&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;⏭️ &lt;em&gt;Next up in the series:&lt;/em&gt;&lt;br&gt;
We’ll dive into event choreography, where services don’t just talk, they dance. Stay tuned for real-world patterns, trade-offs, and code examples you can use right away.&lt;/p&gt;

&lt;p&gt;🧠 &lt;strong&gt;What’s your take?&lt;/strong&gt;&lt;br&gt;
Have you experienced the pain of tightly coupled REST services? Or made the shift to asynchronous messaging in your own systems?&lt;br&gt;
Drop your experience or questions in the comments. I’d love to hear how others are tackling these challenges. 👇&lt;/p&gt;

</description>
      <category>microservices</category>
      <category>programming</category>
      <category>systemdesign</category>
      <category>eventdriven</category>
    </item>
    <item>
      <title>Making REST Microservices Resilient: Bulkhead, Retry &amp; Circuit Breaker in Practice</title>
      <dc:creator>Athul Mukundan</dc:creator>
      <pubDate>Tue, 29 Jul 2025 07:18:10 +0000</pubDate>
      <link>https://dev.to/athulmr/making-rest-microservices-resilient-bulkhead-retry-circuit-breaker-in-practice-1mpl</link>
      <guid>https://dev.to/athulmr/making-rest-microservices-resilient-bulkhead-retry-circuit-breaker-in-practice-1mpl</guid>
      <description>&lt;h2&gt;
  
  
  Why REST is a good starting point?
&lt;/h2&gt;

&lt;p&gt;When starting with a microservice, REST is often the goto approach, and for a good reason.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity&lt;/strong&gt;: Its simplicity allows teams to set up communication between services easily without needing to set up complex infrastructures. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Faster debugging&lt;/strong&gt;: REST is also widely understood, making it easy for developers to jump in, read each other's code and debug using familiar tools like Postman.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Faster feature delivery&lt;/strong&gt;: For smaller team this mean they can deliver more features without having to have operational overload of maintaining messaging systems or brokers.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The hidden dangers of naive REST communication.
&lt;/h2&gt;

&lt;p&gt;At first glance, REST-based communication between service feels straight forward. But as the system grows in complexity and traffic, naive implementation can quickly become a bottleneck.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Synchronus dependency chain&lt;/strong&gt;: When services relays on each other through direct REST calls, a single request can fire a long chain of downstream calls. This increases the overall response time and tightly couples services together.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Failure Propogation&lt;/strong&gt;: If one service becomes slow or unavailable, all the services depending on it may also start failing or slowing down, leading to cascading failure through out the system. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6r6sm39vqx43s8inwyn9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6r6sm39vqx43s8inwyn9.gif" alt="Failure Propogation" width="500" height="355"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Long-tail latency&lt;/strong&gt;: With multiple request hops between the services, even a small delay in a service can accumulate and cause significant end-to-end latency, degrading the user experience.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Introducing Resilience Patterns.
&lt;/h2&gt;

&lt;p&gt;Introducing the resilience patterns can make your system more fault tolerant and stable.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔁 Retry:
&lt;/h3&gt;

&lt;p&gt;In distributed systems, transient failures are common, a service might briefly fail due to network issues, load spikes, or resource contention. A retry mechanism allows the calling service to attempt the failed request again after a short delay, increasing the chances of eventual success.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Exponential backoff&lt;/strong&gt;: Rather than retrying immediately or at fixed intervals, exponential backoff gradually increases the wait time between attempts. This reduces pressure on overloaded services and gives them a chance to recover.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Jitter&lt;/strong&gt;: Adding randomness (jitter) to retry intervals helps avoid the thundering herd problem, where multiple services retry at the same time and cause another spike in traffic.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu8yzwd3k2utkisix5q7p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu8yzwd3k2utkisix5q7p.png" alt="Retry" width="800" height="960"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;io.github.resilience4j.retry.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;io.github.resilience4j.retry.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.Callable&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PaymentService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Retry&lt;/span&gt; &lt;span class="n"&gt;retry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Retry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"paymentRetry"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;RetryConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;custom&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxAttempts&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;waitDuration&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofMillis&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;retryExceptions&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;processPayment&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Callable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;decorated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Retry&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decorateCallable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retry&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;callPaymentGateway&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorated&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"❌ All retries failed!"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;callPaymentGateway&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;random&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Gateway timeout"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"✅ Payment processed"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ⚡️ Circuit Breaker
&lt;/h3&gt;

&lt;p&gt;What if the downstream service isn’t recovering quickly? Constant retries could make things worse. A circuit breaker monitors the failure rate of requests and temporarily stops calls to a failing service once a threshold is breached. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Open State&lt;/strong&gt;: When failures cross a defined threshold, the circuit “opens,” blocking further calls and failing fast. This prevents overloading the unhealthy service.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Half Open State&lt;/strong&gt;: After a timeout, the circuit breaker allows a few test requests through to check if the service has recovered.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Closed stat&lt;/strong&gt;: If the test requests succeed, the circuit closes and normal traffic resumes.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;this allows the requests to fail faster&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1os8a3i6hawdcnb1iek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1os8a3i6hawdcnb1iek.png" alt="Circuit Breaker" width="800" height="808"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;io.github.resilience4j.circuitbreaker.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;io.github.resilience4j.circuitbreaker.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.Callable&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ExternalApiService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt; &lt;span class="n"&gt;circuitBreaker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"externalApiCB"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;CircuitBreakerConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;custom&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failureRateThreshold&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;               &lt;span class="c1"&gt;// % of failures to open the circuit&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;waitDurationInOpenState&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;slidingWindowSize&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;fetchData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Callable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;decorated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CircuitBreaker&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decorateCallable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;circuitBreaker&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;callExternalApi&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorated&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CallNotPermittedException&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"⛔ Circuit is open. Using fallback!"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"❌ API call failed!"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;callExternalApi&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Simulate flaky external service&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;random&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"External service error"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"✅ Data from external API"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🧵 Bulkhead
&lt;/h3&gt;

&lt;p&gt;As systems scale, a failure in one part of your application can unexpectedly affect other unrelated areas. The bulkhead pattern helps isolate such failures so they don’t take down the whole system.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What is it?&lt;/strong&gt; Inspired by ship design, where compartments are isolated to prevent flooding, the bulkhead pattern in software creates isolated resource pools, such as separate thread pools, for different parts of the system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Why it’s needed:&lt;/strong&gt; Without bulkheads, a spike in a heavy task (like report generation) can exhaust shared resources like threads or memory, causing even lightweight, critical requests (like health checks or profile lookups) to fail or slow down.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Example:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;/generate-report uses a lot of CPU and threads&lt;/li&gt;
&lt;li&gt;/get-profile is lightweight and fast&lt;/li&gt;
&lt;li&gt;Without bulkheads, if /generate-report overloads the system, it can delay or block /get-profile&lt;/li&gt;
&lt;li&gt;With bulkheads, they run in separate thread pools, so one won’t affect the other&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3am56i4g9buyc0nliwg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3am56i4g9buyc0nliwg.png" alt="Bulkhead" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Types of isolation:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Thread pool isolation&lt;/strong&gt;: Allocate different thread pools per endpoint or functionality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue isolation&lt;/strong&gt;: Use separate queues for different request types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instance isolation&lt;/strong&gt;: Scale different parts of the service independently if needed&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Benefit:
&lt;/h4&gt;

&lt;p&gt;The system degrades gracefully, even if one part is under pressure, the rest continues to work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;io.github.resilience4j.bulkhead.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;io.github.resilience4j.bulkhead.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.Duration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ReportService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Bulkhead&lt;/span&gt; &lt;span class="n"&gt;bulkhead&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Bulkhead&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"generateReportBulkhead"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;BulkheadConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;custom&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxConcurrentCalls&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxWaitDuration&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofMillis&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;generateReport&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Callable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;decorated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Bulkhead&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decorateCallable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bulkhead&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;simulateHeavyComputation&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"✅ Report generated!"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;});&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorated&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BulkheadFullException&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"🚫 System busy. Try again later."&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"❌ Unexpected error."&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;simulateHeavyComputation&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// simulate heavy work&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Trade-offs &amp;amp; When to Move On
&lt;/h2&gt;

&lt;p&gt;While resilience patterns like retry, circuit breaker, and bulkhead significantly improve fault tolerance, they don’t address one fundamental issue, &lt;strong&gt;tight coupling between services&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;REST still couples services tightly&lt;/strong&gt;: Each service needs to know exactly &lt;strong&gt;who&lt;/strong&gt; to call and &lt;strong&gt;when&lt;/strong&gt;. This direct dependency makes changes harder and increases coordination overhead between teams.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Error recovery is limited&lt;/strong&gt;: You can retry, back off, or fail fast, but you're still reacting to failures. There’s no natural way to &lt;strong&gt;defer processing&lt;/strong&gt; or &lt;strong&gt;recover asynchronously&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Still vulnerable to partial failures and latency&lt;/strong&gt;: A chain of REST calls means if one link slows down, the entire request suffers. You’re still bound by the slowest service in the chain.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Eventually, as the number of services and their interactions grow, the limitations of synchronous REST become more obvious. That’s when teams start looking for &lt;strong&gt;asynchronous, event-driven alternatives&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Up Next: Decoupling with Choreography
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;What if services didn’t need to know about each other directly?&lt;br&gt;&lt;br&gt;
What if failures didn’t ripple through the system immediately?&lt;br&gt;&lt;br&gt;
What if we could &lt;strong&gt;decouple communication and build more autonomous services&lt;/strong&gt;?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In Part 2, we’ll explore &lt;strong&gt;choreography&lt;/strong&gt;, where services react to events instead of calling each other directly, unlocking a new level of flexibility, resilience, and scalability.&lt;/p&gt;

</description>
      <category>microservices</category>
      <category>restapi</category>
      <category>springboot</category>
      <category>java</category>
    </item>
  </channel>
</rss>
