<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Onur Cinar</title>
    <description>The latest articles on DEV Community by Onur Cinar (@onurcinar).</description>
    <link>https://dev.to/onurcinar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2455507%2F3aa78de4-9412-4988-b03a-d64d419c7f0a.jpeg</url>
      <title>DEV Community: Onur Cinar</title>
      <link>https://dev.to/onurcinar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/onurcinar"/>
    <language>en</language>
    <item>
      <title>Bringing Claude's "Dispatch" Experience to Gemini and OpenCode</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Sun, 12 Apr 2026 22:19:26 +0000</pubDate>
      <link>https://dev.to/onurcinar/bringing-claudes-dispatch-experience-to-gemini-and-opencode-3pef</link>
      <guid>https://dev.to/onurcinar/bringing-claudes-dispatch-experience-to-gemini-and-opencode-3pef</guid>
      <description>&lt;p&gt;Claude’s "Dispatch" feature nailed the mobile-to-desktop UX. Being able to pull out your phone, delegate a heavy refactoring task to your local machine, and monitor its progress asynchronously is a massive quality-of-life upgrade. &lt;/p&gt;

&lt;p&gt;But if your daily drivers are CLI-native AI tools like Gemini or OpenCode, you might feel locked out of that seamless remote workflow. Because these tools run in your terminal rather than a proprietary desktop app, they lack a native mobile bridge. &lt;/p&gt;

&lt;p&gt;You don't have to abandon your favorite CLI tools to get that experience. By combining &lt;strong&gt;&lt;a href="https://tailscale.com/" rel="noopener noreferrer"&gt;Tailscale&lt;/a&gt;&lt;/strong&gt; and the modern terminal multiplexer &lt;strong&gt;&lt;a href="https://zellij.dev/" rel="noopener noreferrer"&gt;Zellij&lt;/a&gt;&lt;/strong&gt;, you can build a universal "Dispatch" layer. &lt;/p&gt;

&lt;p&gt;The best part? You aren't just sending a fire-and-forget command. You get the exact same interactive, conversational experience on your phone as you do sitting at your mechanical keyboard.&lt;/p&gt;

&lt;p&gt;Here is how to set up your own sovereign, mobile-to-local AI command center.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Missing Link: Zellij Web + Tailscale
&lt;/h2&gt;

&lt;p&gt;The core magic of Claude Dispatch is simply a secure, persistent, remotely accessible session. We can replicate this entirely using open-source infrastructure.&lt;/p&gt;

&lt;p&gt;To bridge the gap between your smartphone browser and your local desktop terminal, we need two components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Tailscale:&lt;/strong&gt; This creates a secure overlay network. We will use &lt;strong&gt;Tailscale&lt;/strong&gt; to safely pipe a local port out to your devices.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Zellij:&lt;/strong&gt; This is the crucial piece. Zellij is a Rust-based terminal multiplexer with a robust &lt;strong&gt;Web Client&lt;/strong&gt;. Unlike SSH apps, which can be clunky on mobile, Zellij renders a fully responsive terminal UI directly in your mobile browser.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Setting Up Your "Dispatch" Server
&lt;/h2&gt;

&lt;p&gt;On your primary development machine—where your code, compilers, and AI tools live—you need to prepare the web session. Zellij takes security seriously, so the web client requires an authentication token and binds strictly to &lt;code&gt;localhost&lt;/code&gt; by default.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create the Authentication Token
&lt;/h3&gt;

&lt;p&gt;Before starting the web UI, generate your secure login token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;zellij web &lt;span class="nt"&gt;--create-token&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Make sure to copy and save the token it prints. You will need it to authenticate when you connect from your smartphone.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Start the Zellij Web Server
&lt;/h3&gt;

&lt;p&gt;Next, start the web server and specify the port you want to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;zellij web &lt;span class="nt"&gt;--port&lt;/span&gt; 4000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Expose the Port via Tailscale
&lt;/h3&gt;

&lt;p&gt;Because Zellij is safely listening only on localhost, you cannot reach it from your phone yet. Instead of exposing this to the public internet, we use Tailscale Serve to proxy that local port exclusively to your private Tailnet.&lt;/p&gt;

&lt;p&gt;Run this in a new terminal tab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tailscale serve &lt;span class="nt"&gt;--bg&lt;/span&gt; &lt;span class="nt"&gt;--https&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4000 localhost:4000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Remote Workflow in Action
&lt;/h2&gt;

&lt;p&gt;The real power of this setup is the seamless handoff. You don't need to craft complex, single-shot prompt strings. The interaction is identical to typing directly into your desktop CLI.&lt;/p&gt;

&lt;p&gt;Imagine you are deep into building a Go project. You are sitting at your desk, iterating on some validation logic with OpenCode or Gemini open in your terminal. You realize you need to leave the house, but the task isn't done.&lt;/p&gt;

&lt;p&gt;Here is how the dispatch workflow plays out:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Mobile Handoff
&lt;/h3&gt;

&lt;p&gt;While waiting in line for coffee, you open Chrome or Safari on your phone and navigate to your Tailscale URL (&lt;code&gt;https://my-dev-box.domain.ts.net:4000&lt;/code&gt;). &lt;/p&gt;

&lt;p&gt;After pasting in your authentication token, you are instantly dropped right back into your active desktop terminal. You see the exact same interactive AI prompt you were looking at on your monitor moments ago.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Chat and Dispatch
&lt;/h3&gt;

&lt;p&gt;Because you are in a live, interactive session, you just talk to the CLI naturally. You type into your phone:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"I need to head out for a bit. Can you run the tests for the checker package, figure out why the struct validation is failing, and apply the fix?"&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The AI acknowledges the request and begins its loop—reading your local files, executing &lt;code&gt;go test&lt;/code&gt;, and analyzing the output.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Detach and Walk Away
&lt;/h3&gt;

&lt;p&gt;This is the "Dispatch" moment. You simply close your mobile browser tab and put your phone in your pocket. &lt;/p&gt;

&lt;p&gt;Because Zellij is managing the session natively on your local hardware, the AI continues to run uninterrupted. It has full access to your local environment to do the heavy lifting.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Asynchronous Monitoring
&lt;/h3&gt;

&lt;p&gt;Check back 20 minutes later. Reopen the URL on your phone, and your terminal state is exactly as you left it. &lt;/p&gt;

&lt;p&gt;If the AI successfully refactored the code and the tests are green, the output is waiting for you. If it ran into a file-permission error, or if OpenCode paused to ask, &lt;em&gt;"Do you want me to commit these changes?"&lt;/em&gt;, the interactive prompt is right there in your mobile browser, patiently waiting for your reply.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. The Seamless Return to Desktop
&lt;/h3&gt;

&lt;p&gt;When you finally get back home, the magic of Zellij really shines. You don't have to sync anything, pull down remote cloud changes, or wonder what the AI did while you were gone. You simply sit down at your physical monitor, attach to the running Zellij session, and pick up exactly where you left off. The AI's responses, the shell history, and the code changes are all right there waiting for you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Approach Scales
&lt;/h2&gt;

&lt;p&gt;Retrofitting your existing AI workflow with Zellij and Tailscale doesn't just mimic Claude Dispatch; it arguably surpasses it for power users.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agnostic Architecture:&lt;/strong&gt; You aren't locked into one provider's ecosystem. You can use this exact workflow for Gemini, OpenCode, or any future terminal-based AI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frictionless UI:&lt;/strong&gt; You don't need a dedicated mobile app or complex SSH key management on your phone. Any modern web browser becomes a window into your live terminal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unrestricted Environment:&lt;/strong&gt; Your AI operates natively. It has full, unrestricted access to your actual development environment—your local databases, Docker containers, and raw file system—without needing to sync cloud workspaces.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By adding this networking layer, you transform your standard interactive CLIs from desktop-bound tools into true asynchronous agents that travel with you, keeping you in the loop wherever you are.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>cli</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Prioritize Your Traffic: Priority-Aware Bulkheads in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Sun, 12 Apr 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/onurcinar/prioritize-your-traffic-priority-aware-bulkheads-in-go-2ain</link>
      <guid>https://dev.to/onurcinar/prioritize-your-traffic-priority-aware-bulkheads-in-go-2ain</guid>
      <description>&lt;p&gt;Not all traffic is created equal. When your system is under heavy load, should a background cleanup task compete for the same resources as a user's checkout request? &lt;/p&gt;

&lt;p&gt;In a standard bulkhead, the answer is often "yes"—the first 10 requests get in, and the 11th is rejected, regardless of its importance. This is where &lt;strong&gt;Priority-Aware Bulkheads&lt;/strong&gt; come in.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: The "Fairness" Trap
&lt;/h2&gt;

&lt;p&gt;Standard bulkheads are fair. They treat every request the same. But in a real-world system, fairness can be a liability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Critical Traffic&lt;/strong&gt;: User-facing requests (e.g., "Complete Purchase", "Login") that directly impact revenue or user experience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard Traffic&lt;/strong&gt;: Regular API calls (e.g., "View Profile", "Search") that are important but not immediately critical.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low-Priority Traffic&lt;/strong&gt;: Background tasks (e.g., "Generate Report", "Sync Analytics", "Cache Warming") that can be delayed or retried later.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When your system is at 90% capacity, you want to stop accepting "Generate Report" requests to ensure there's enough room for "Complete Purchase" calls. A standard bulkhead can't do this; it will fill up with whatever arrives first.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Priority-Aware Bulkheads
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Priority-Aware Bulkhead&lt;/strong&gt; uses &lt;strong&gt;Load Shedding&lt;/strong&gt; based on priority levels. It defines utilization thresholds for different types of traffic. &lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Low Priority&lt;/strong&gt;: Allowed only if the bulkhead is less than 50% full.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard Priority&lt;/strong&gt;: Allowed only if the bulkhead is less than 80% full.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Critical Priority&lt;/strong&gt;: Allowed until the bulkhead is 100% full.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures that your most important traffic always has a "buffer" of capacity reserved for it, even when the system is under significant pressure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Implementing with Resile
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt; provides a built-in &lt;code&gt;PriorityBulkhead&lt;/code&gt; that makes this pattern easy to implement.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Define Your Priorities
&lt;/h3&gt;

&lt;p&gt;Resile uses a simple &lt;code&gt;Priority&lt;/code&gt; type with three levels: &lt;code&gt;PriorityLow&lt;/code&gt;, &lt;code&gt;PriorityStandard&lt;/code&gt;, and &lt;code&gt;PriorityCritical&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;thresholds&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Priority&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PriorityLow&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="m"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// Shed at 50% utilization&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PriorityStandard&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// Shed at 80% utilization&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PriorityCritical&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// Shed only when 100% full&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Create a bulkhead with a capacity of 20&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewPriorityBulkhead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thresholds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Attach Priority to Context
&lt;/h3&gt;

&lt;p&gt;You communicate the importance of a request by attaching a priority to its &lt;code&gt;context.Context&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Create a context with Critical priority&lt;/span&gt;
&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithPriority&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PriorityCritical&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Execute the action within the priority bulkhead&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;processOrder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Handle Shed Load
&lt;/h3&gt;

&lt;p&gt;When a request is rejected because its priority threshold is exceeded, Resile returns &lt;code&gt;resile.ErrShedLoad&lt;/code&gt;. If the bulkhead is physically full (100% capacity), it returns &lt;code&gt;resile.ErrBulkheadFull&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ErrShedLoad&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// This low/standard priority request was shedded to save capacity &lt;/span&gt;
    &lt;span class="c"&gt;// for higher-priority traffic.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why Use Priority-Aware Bulkheads?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Protect the Critical Path&lt;/strong&gt;: Ensure that your most important business processes remain available even during traffic spikes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful Degradation&lt;/strong&gt;: Instead of a total system failure, your service gracefully degrades by dropping non-essential background work first.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better User Experience&lt;/strong&gt;: Users performing critical actions see no slowdown, while background "noise" is managed behind the scenes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Efficiency&lt;/strong&gt;: You don't need to over-provision your infrastructure to handle peak "background" load if you can simply shed it when necessary.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Comparison: Static vs. Priority vs. Adaptive
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Static Bulkhead&lt;/th&gt;
&lt;th&gt;Priority-Aware Bulkhead&lt;/th&gt;
&lt;th&gt;Adaptive Concurrency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Limit Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fixed (e.g., 20)&lt;/td&gt;
&lt;td&gt;Fixed + Thresholds&lt;/td&gt;
&lt;td&gt;Dynamic (Auto-tuned)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Traffic Awareness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None (All equal)&lt;/td&gt;
&lt;td&gt;High (Priority-based)&lt;/td&gt;
&lt;td&gt;None (All equal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple isolation&lt;/td&gt;
&lt;td&gt;Multi-tenant or Tiered apps&lt;/td&gt;
&lt;td&gt;Volatile environments&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://dev.to/onurcinar/stop-the-domino-effect-bulkhead-isolation-in-go-5cgl"&gt;Read more about Static Bulkheads&lt;/a&gt; or &lt;a href="https://dev.to/onurcinar/beyond-static-limits-adaptive-concurrency-with-tcp-vegas-in-go-3gne"&gt;Explore Adaptive Concurrency&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Resilience isn't just about keeping the lights on; it's about keeping the &lt;em&gt;right&lt;/em&gt; lights on. Priority-Aware Bulkheads give you the surgical precision needed to manage your system's resources effectively during times of stress.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check out the full example:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile/tree/main/examples/prioritybulkhead" rel="noopener noreferrer"&gt;Priority Bulkhead Example&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learn more about Resile:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>distributedsystems</category>
      <category>backend</category>
    </item>
    <item>
      <title>Stopping the Zombie Requests: Distributed Deadline Propagation in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Sat, 11 Apr 2026 15:43:15 +0000</pubDate>
      <link>https://dev.to/onurcinar/stopping-the-zombie-requests-distributed-deadline-propagation-in-go-3ccm</link>
      <guid>https://dev.to/onurcinar/stopping-the-zombie-requests-distributed-deadline-propagation-in-go-3ccm</guid>
      <description>&lt;p&gt;Imagine a common scenario in a microservice architecture: A user clicks a "Buy" button, triggering a request to &lt;strong&gt;Service A&lt;/strong&gt;. Service A calls &lt;strong&gt;Service B&lt;/strong&gt;, which in turn calls &lt;strong&gt;Service C&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Suddenly, Service A times out. The user sees an error message and refreshes the page. But &lt;strong&gt;Service B and Service C are still working&lt;/strong&gt; on the original request, consuming CPU, memory, and database connections for a result that will never be seen.&lt;/p&gt;

&lt;p&gt;These are &lt;strong&gt;Zombie Requests&lt;/strong&gt;. In a high-traffic system, they can lead to cascading failures and resource exhaustion, even if the underlying services are technically "healthy."&lt;/p&gt;

&lt;p&gt;To stop the zombies, you need &lt;strong&gt;Distributed Deadline Propagation&lt;/strong&gt;. Here is how to implement it effortlessly using &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Distributed Deadline Propagation?
&lt;/h2&gt;

&lt;p&gt;Deadlines are not just local timeouts. A deadline represents the &lt;strong&gt;absolute point in time&lt;/strong&gt; after which the entire request chain should be abandoned.&lt;/p&gt;

&lt;p&gt;Distributed Deadline Propagation is the process of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Tracking&lt;/strong&gt; the remaining time (the "budget") as a request moves through the system.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Communicating&lt;/strong&gt; that budget to downstream services via metadata (like HTTP headers).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Aborting early&lt;/strong&gt; if the remaining budget is too small to realistically complete the work.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The Resile Way: Smart Deadlines
&lt;/h2&gt;

&lt;p&gt;Resile provides two powerful mechanisms to handle distributed deadlines: &lt;strong&gt;Early Abort&lt;/strong&gt; and &lt;strong&gt;Header Injection&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Early Abort: &lt;code&gt;WithMinDeadlineThreshold&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Why start a request if you only have 2 milliseconds left? The network latency alone will likely exceed that, and you'll just be wasting resources.&lt;/p&gt;

&lt;p&gt;Resile's &lt;code&gt;WithMinDeadlineThreshold&lt;/code&gt; allows you to define a "safety buffer." If the remaining time in the &lt;code&gt;context.Context&lt;/code&gt; is less than this threshold, Resile will &lt;strong&gt;abort the execution immediately&lt;/strong&gt; with a &lt;code&gt;context.DeadlineExceeded&lt;/code&gt; error, before even attempting the work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/cinar/resile"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Define a policy with a 10ms "Early Abort" threshold.&lt;/span&gt;
&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithRetry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithMinDeadlineThreshold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// If ctx has only 5ms left, this returns context.DeadlineExceeded instantly.&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;apiClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FetchData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Header Injection: &lt;code&gt;InjectDeadlineHeader&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To propagate the deadline to downstream services, you need to "inject" the remaining time into your outgoing requests. Resile provides a transport-agnostic &lt;code&gt;InjectDeadlineHeader&lt;/code&gt; function that supports both standard HTTP and gRPC.&lt;/p&gt;

&lt;h4&gt;
  
  
  For REST/HTTP:
&lt;/h4&gt;

&lt;p&gt;You can inject the remaining milliseconds into a custom header (e.g., &lt;code&gt;X-Request-Timeout&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FetchData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRequestWithContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"GET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"http://service-b/data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// Inject the remaining milliseconds into the header.&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InjectDeadlineHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"X-Request-Timeout"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  For gRPC:
&lt;/h4&gt;

&lt;p&gt;Resile natively supports the standard &lt;code&gt;Grpc-Timeout&lt;/code&gt; header format, ensuring compatibility with the gRPC ecosystem.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FetchData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;md&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;

    &lt;span class="c"&gt;// Inject using the gRPC-specific format (e.g., "100m" for 100ms).&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InjectDeadlineHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Grpc-Timeout"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewOutgoingContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;md&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grpcClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why This Matters for Resilience
&lt;/h2&gt;

&lt;p&gt;Without distributed deadlines, your system is vulnerable to &lt;strong&gt;resource exhaustion&lt;/strong&gt;—caused not by malicious actors, but by your own retries and slow dependencies.&lt;/p&gt;

&lt;p&gt;By implementing propagation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;You save money:&lt;/strong&gt; You stop paying for cloud compute that produces "zombie" results: answers computed after the caller has already given up.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;You prevent meltdowns:&lt;/strong&gt; Downstream services are protected from "retry storms" that they can't possibly satisfy in time.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;You improve UX:&lt;/strong&gt; Failures happen faster (Fail-Fast), allowing the UI to react or switch to a fallback immediately.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://dev.to/onurcinar/preventing-microservice-meltdowns-adaptive-retries-and-circuit-breakers-in-go-30ho"&gt;Read more: Preventing Meltdowns: How Adaptive Retries Protect Your Downstream&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Comparison: Static vs. Distributed Deadlines
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Static Timeouts&lt;/th&gt;
&lt;th&gt;Distributed Deadlines (Resile)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single Service&lt;/td&gt;
&lt;td&gt;Entire Request Chain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Awareness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Blind to upstream delays&lt;/td&gt;
&lt;td&gt;Aware of the total "time budget"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Efficiency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High waste (Zombie requests)&lt;/td&gt;
&lt;td&gt;Minimal waste (Early Abort)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Protocol&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Internal only&lt;/td&gt;
&lt;td&gt;HTTP/gRPC compatible&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Resilience isn't just about making things "work"; it's about knowing when to &lt;strong&gt;stop working&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Distributed Deadline Propagation is the "social contract" of a microservice architecture. It ensures that every service in the chain is working towards a common goal—and respects the reality that sometimes, time simply runs out.&lt;/p&gt;

&lt;p&gt;With Resile, implementing this complex pattern becomes a matter of a few lines of configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explore Resile on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How are you handling request budgets in your distributed systems? Let's discuss!&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>distributedsystems</category>
      <category>performance</category>
    </item>
    <item>
      <title>Native Chaos Engineering: Testing Resilience with Fault &amp; Latency Injection</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Fri, 03 Apr 2026 14:51:48 +0000</pubDate>
      <link>https://dev.to/onurcinar/native-chaos-engineering-testing-resilience-with-fault-latency-injection-83</link>
      <guid>https://dev.to/onurcinar/native-chaos-engineering-testing-resilience-with-fault-latency-injection-83</guid>
      <description>&lt;p&gt;You’ve implemented retries, circuit breakers, and timeouts. Your application is now "resilient." But how do you know these policies actually work? Waiting for a production meltdown to verify your configuration is a high-stakes gamble. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Native Chaos Engineering&lt;/strong&gt; in Resile allows you to synthetically induce failure and latency directly into your application's execution path, ensuring your resilience policies are battle-tested before they're ever needed in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: "Dark Code" in Resilience Policies
&lt;/h2&gt;

&lt;p&gt;Resilience policies—like retries and circuit breakers—are often "dark code." These are execution paths that are rarely traversed under normal operating conditions. Because they only trigger during failure, they are notoriously difficult to test and prone to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Buggy Configurations&lt;/strong&gt;: A retry limit that is too high, or a circuit breaker threshold that never trips.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Unintended Side Effects&lt;/strong&gt;: A retry loop that accidentally consumes all available database connections.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Silent Failures&lt;/strong&gt;: A fallback path that panics the first time it actually runs because it hasn't been exercised in months.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Traditional chaos engineering tools often operate at the infrastructure layer (e.g., killing pods or dropping network packets). While powerful, these tools can be difficult to set up in local development or staging environments and often lack the granularity to test specific application-level logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Fault &amp;amp; Latency Injection
&lt;/h2&gt;

&lt;p&gt;Resile provides a &lt;strong&gt;Chaos Injector&lt;/strong&gt; middleware that can be integrated directly into any execution policy. By injecting synthetic faults (errors) and latency (delays) with configurable probabilities, you can simulate various failure scenarios without touching your infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Deterministic Randomness&lt;/strong&gt;: Uses Go 1.22's &lt;code&gt;math/rand/v2&lt;/code&gt; for efficient and predictable random number generation.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context-Aware&lt;/strong&gt;: Latency injection strictly respects &lt;code&gt;context.Context&lt;/code&gt; cancellation. If your request times out while Resile is injecting chaos latency, it exits immediately.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Zero Dependencies&lt;/strong&gt;: Just like the rest of the Resile core, the chaos package depends only on the Go standard library.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Granular Control&lt;/strong&gt;: Configure error and latency probabilities independently for fine-tuned simulation.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Practical Usage
&lt;/h2&gt;

&lt;p&gt;Integrating chaos into your existing Resile policies is as simple as adding the &lt;code&gt;WithChaos&lt;/code&gt; option.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Basic Chaos Configuration
&lt;/h3&gt;

&lt;p&gt;You can define a chaos configuration that injects a 10% error rate and adds 100ms of latency to 20% of requests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/cinar/resile"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/cinar/resile/chaos"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Configure chaos injection&lt;/span&gt;
&lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;chaos&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ErrorProbability&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="m"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="c"&gt;// 10% chance of failure&lt;/span&gt;
    &lt;span class="n"&gt;InjectedError&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"chaos!"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;   &lt;span class="c"&gt;// The error to return&lt;/span&gt;
    &lt;span class="n"&gt;LatencyProbability&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="c"&gt;// 20% chance of latency&lt;/span&gt;
    &lt;span class="n"&gt;LatencyDuration&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="m"&gt;100&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// Delay to inject&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Apply it to an execution&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithRetry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithChaos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Testing Your Circuit Breaker
&lt;/h3&gt;

&lt;p&gt;Chaos injection is exceptionally useful for verifying that your circuit breaker trips under pressure. By setting a high &lt;code&gt;ErrorProbability&lt;/code&gt;, you can force the breaker to transition from &lt;code&gt;Closed&lt;/code&gt; to &lt;code&gt;Open&lt;/code&gt; in a controlled environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;circuit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;circuit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;WindowSize&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;           &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;FailureRateThreshold&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c"&gt;// Force 80% error rate to trip the breaker quickly&lt;/span&gt;
&lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;chaos&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ErrorProbability&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;InjectedError&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"synthetic failure"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithCircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithChaos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Circuit Breaker State: %v&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c"&gt;// Should be Open&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
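&lt;p&gt;The trip condition being exercised here is just a failure rate over a sliding window of recent outcomes. A stdlib-only sketch of that bookkeeping (illustrative; the &lt;code&gt;circuit&lt;/code&gt; package's internals may differ):&lt;/p&gt;

```go
package main

import "fmt"

// failureWindow is a toy sliding window over the last N outcomes,
// mirroring the WindowSize / FailureRateThreshold idea (not the
// circuit package's actual implementation).
type failureWindow struct {
	size     int
	outcomes []bool // true = failure
}

func (w *failureWindow) record(failed bool) {
	w.outcomes = append(w.outcomes, failed)
	if len(w.outcomes) > w.size {
		w.outcomes = w.outcomes[1:] // drop the oldest outcome
	}
}

// failureRate returns the percentage of failures in the window.
func (w *failureWindow) failureRate() float64 {
	if len(w.outcomes) == 0 {
		return 0
	}
	failures := 0
	for _, f := range w.outcomes {
		if f {
			failures++
		}
	}
	return 100 * float64(failures) / float64(len(w.outcomes))
}

func main() {
	w := &failureWindow{size: 10}
	// An 80% synthetic error rate, like the chaos config above:
	for i := 0; i < 20; i++ {
		w.record(i%5 != 0) // 4 out of every 5 calls fail
	}
	fmt.Println(w.failureRate() >= 50) // true: a 50% threshold would trip
}
```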






&lt;h2&gt;
  
  
  Configuration Reference
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;chaos.Config&lt;/code&gt; struct provides the following options:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ErrorProbability&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;float64&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The probability of injecting an error (0.0 to 1.0).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;InjectedError&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;error&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The error to be returned when an error is injected.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LatencyProbability&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;float64&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The probability of injecting latency (0.0 to 1.0).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LatencyDuration&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;time.Duration&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The duration of the latency to be injected.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Environment Gating&lt;/strong&gt;: Never enable chaos injection in production unless you are performing a planned game day. Use environment variables to gate the configuration:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ENABLE_CHAOS"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;opts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithChaos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loadChaosCfg&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Observability&lt;/strong&gt;: Ensure your &lt;code&gt;Instrumenter&lt;/code&gt; (like &lt;code&gt;slog&lt;/code&gt; or &lt;code&gt;OTel&lt;/code&gt;) is active. This allows you to see the injected errors and latencies in your logs and traces, making it easier to verify how your application responds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start Small&lt;/strong&gt;: Begin with low probabilities (e.g., 1-2%) to identify subtle race conditions or timeout issues before increasing the "blast radius."&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Resilience is not a "set it and forget it" feature. It requires continuous verification. By bringing chaos engineering directly into your application's execution policies, Resile empowers you to build systems that aren't just theoretically resilient, but practically battle-hardened.&lt;/p&gt;

&lt;p&gt;For more information and advanced usage, visit the &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt; project.&lt;/p&gt;

</description>
      <category>go</category>
      <category>testing</category>
      <category>sre</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Beyond Static Limits: Adaptive Concurrency with TCP-Vegas in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Thu, 02 Apr 2026 19:11:51 +0000</pubDate>
      <link>https://dev.to/onurcinar/beyond-static-limits-adaptive-concurrency-with-tcp-vegas-in-go-3gne</link>
      <guid>https://dev.to/onurcinar/beyond-static-limits-adaptive-concurrency-with-tcp-vegas-in-go-3gne</guid>
      <description>&lt;p&gt;Traditional concurrency limits (like bulkheads) are static. You pick a number—say, 10 concurrent requests— and hope for the best. But in the dynamic world of cloud infrastructure, "10" might be too conservative when the network is fast, or dangerously high when a downstream service starts to queue.&lt;/p&gt;

&lt;p&gt;Static limits require manual tuning, which is often done &lt;em&gt;after&lt;/em&gt; an outage has already happened. To build truly resilient systems, we need &lt;strong&gt;Adaptive Concurrency Control&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here is how to implement dynamic concurrency limits in Go using &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt;, inspired by the TCP-Vegas congestion control algorithm.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: The "Fixed-Limit" Trap
&lt;/h2&gt;

&lt;p&gt;Imagine your service talks to a database. You've set a bulkhead limit of 50 concurrent connections. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scenario A (Normal):&lt;/strong&gt; Database latency is 10ms. 50 concurrent requests mean you're handling 5,000 RPS. Everything is fine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scenario B (Degraded):&lt;/strong&gt; Database latency spikes to 500ms due to a background maintenance task. Your 50 "slots" are now filled with slow requests. Your throughput drops to 100 RPS, and new incoming requests start to pile up in your own service's memory, eventually leading to a cascade of failures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Scenario B, 50 is &lt;strong&gt;too many&lt;/strong&gt;. You're holding onto resources that are essentially waiting on a bottleneck. You should have reduced your concurrency limit to prevent your own service from becoming part of the problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Little's Law &amp;amp; TCP-Vegas
&lt;/h2&gt;

&lt;p&gt;Adaptive Concurrency uses two core principles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Little's Law (&lt;em&gt;L = λW&lt;/em&gt;):&lt;/strong&gt; The number of items in a system (&lt;em&gt;L&lt;/em&gt;) is equal to the arrival rate (&lt;em&gt;λ&lt;/em&gt;) multiplied by the average time an item spends in the system (&lt;em&gt;W&lt;/em&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TCP-Vegas AIMD:&lt;/strong&gt; An Additive Increase, Multiplicative Decrease (AIMD) logic based on Round-Trip Time (RTT).&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  How it works:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Baseline:&lt;/strong&gt; The algorithm tracks the minimum RTT (the fastest the system can possibly go).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Additive Increase:&lt;/strong&gt; If current latency is close to the baseline (no queuing detected), it cautiously increases the concurrency limit by 1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiplicative Decrease:&lt;/strong&gt; If latency spikes above a threshold (e.g., 1.5× the baseline), it assumes queuing is happening downstream and immediately slashes the concurrency limit by 20%.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows your service to automatically "breathe" with the network. It expands to use available capacity when things are fast and contracts instantly to protect itself when things slow down.&lt;/p&gt;
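&lt;p&gt;The update rule described above fits in a few lines of plain Go. This &lt;code&gt;vegasLimiter&lt;/code&gt; is a toy sketch under stated assumptions (the 1.5× threshold and 20% cut from the bullets, plus a floor of 1), not Resile's implementation:&lt;/p&gt;

```go
package main

import "fmt"

// vegasLimiter is a toy sketch of the AIMD rule: +1 when latency is
// near the observed floor, -20% when it spikes past 1.5x the floor.
type vegasLimiter struct {
	limit  float64 // current concurrency limit
	minRTT float64 // lowest RTT seen so far, in ms
}

func (v *vegasLimiter) observe(rttMs float64) {
	if v.minRTT == 0 || rttMs < v.minRTT {
		v.minRTT = rttMs // new baseline: this is as fast as the path goes
	}
	if rttMs > 1.5*v.minRTT {
		v.limit *= 0.8 // multiplicative decrease: queuing detected downstream
		if v.limit < 1 {
			v.limit = 1 // never shut the service off entirely
		}
	} else {
		v.limit++ // additive increase: no queuing, probe for more capacity
	}
}

func main() {
	v := &vegasLimiter{limit: 10}
	for i := 0; i < 5; i++ {
		v.observe(10) // healthy 10ms RTTs: the limit climbs 11, 12, ... 15
	}
	fmt.Println(v.limit) // 15
	v.observe(500) // latency spike: the limit is slashed by 20%
	fmt.Println(v.limit)
}
```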




&lt;h2&gt;
  
  
  Implementing with Resile
&lt;/h2&gt;

&lt;p&gt;Resile makes it trivial to add adaptive concurrency to your Go services.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// 1. Create a shared AdaptiveLimiter.&lt;/span&gt;
&lt;span class="c"&gt;// This should be shared across multiple calls to the same resource.&lt;/span&gt;
&lt;span class="n"&gt;al&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewAdaptiveLimiter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// 2. Use it in your policy.&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithAdaptiveLimiterInstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;al&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// 3. Execute your action.&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;callDownstreamService&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ErrShedLoad&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// The limiter has dynamically reduced the limit and shed this request&lt;/span&gt;
    &lt;span class="c"&gt;// to protect the system.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why "TCP-Vegas"?
&lt;/h3&gt;

&lt;p&gt;Unlike other congestion control algorithms (like TCP-Reno) that wait for packet loss to react, TCP-Vegas reacts to &lt;strong&gt;latency changes&lt;/strong&gt;. This is perfect for microservices where "packet loss" usually means a timed-out request or a 503 error—both of which we want to avoid &lt;em&gt;before&lt;/em&gt; they happen.&lt;/p&gt;




&lt;h2&gt;
  
  
  Zero-Configuration Resilience
&lt;/h2&gt;

&lt;p&gt;One of the biggest benefits of Adaptive Concurrency is that it requires &lt;strong&gt;zero manual configuration&lt;/strong&gt;. You don't need to know if your database can handle 50 or 500 connections. The &lt;code&gt;AdaptiveLimiter&lt;/code&gt; will discover the optimal limit in real-time.&lt;/p&gt;

&lt;p&gt;It even handles "Network Drift." Over time, the minimum baseline RTT is gradually decayed, allowing the system to recalibrate if you migrate your database to a faster region or if the network topology changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Resilience isn't just about surviving failures; it's about &lt;strong&gt;adapting&lt;/strong&gt; to them. By moving from static bulkheads to adaptive concurrency, you're building a system that can intelligently protect itself from cascading failures while maximizing throughput during "peace time."&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href="https://github.com/cinar/resile/tree/main/examples/adaptiveconcurrency" rel="noopener noreferrer"&gt;Adaptive Concurrency Example&lt;/a&gt; in the Resile repository to see it in action.&lt;/p&gt;

</description>
      <category>go</category>
      <category>distributedsystems</category>
      <category>sre</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Respecting Boundaries: Precise Rate Limiting in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Tue, 24 Mar 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/onurcinar/respecting-boundaries-precise-rate-limiting-in-go-lca</link>
      <guid>https://dev.to/onurcinar/respecting-boundaries-precise-rate-limiting-in-go-lca</guid>
      <description>&lt;p&gt;Traffic spikes are a double-edged sword. On one hand, you’re busy! On the other, those spikes can overwhelm your services or exceed your downstream quotas. &lt;/p&gt;

&lt;p&gt;Whether you're protecting your own database from an unexpected burst or respecting a third-party API’s strict 100 requests-per-second (RPS) limit, you need a precise way to shape your traffic.&lt;/p&gt;

&lt;p&gt;Enter the &lt;strong&gt;Token Bucket Rate Limiter&lt;/strong&gt; in &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Unbounded Traffic
&lt;/h2&gt;

&lt;p&gt;In a distributed environment, your clients don't know about each other. If 50 different microservice instances all decide to call a downstream API at the same time, the aggregate traffic can easily exceed the capacity of the target system. &lt;/p&gt;

&lt;p&gt;When you exceed these limits, you'll often see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HTTP 429 (Too Many Requests)&lt;/strong&gt;: Downstream services start rejecting you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cascading Latency&lt;/strong&gt;: The target system slows down for &lt;em&gt;everyone&lt;/em&gt; because it's processing too many requests at once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Overruns&lt;/strong&gt;: Many cloud providers and SaaS APIs charge significant premiums for exceeding agreed-upon quotas.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Solution: The Token Bucket Algorithm
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Token Bucket&lt;/strong&gt; is a classic algorithm used for traffic shaping. &lt;/p&gt;

&lt;p&gt;Imagine a bucket that refills with "tokens" at a constant rate (e.g., 100 tokens per second). Every request must consume a token from the bucket. If the bucket is empty, the request is rejected immediately. This allows for short bursts (spending tokens that accumulated while traffic was idle) while maintaining a precise long-term average rate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Implementing with Resile:
&lt;/h3&gt;

&lt;p&gt;Resile makes adding rate limiting to your executions simple.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Allow 100 requests per second.&lt;/span&gt;
&lt;span class="c"&gt;// If the limit is exceeded, it fails fast with resile.ErrRateLimitExceeded.&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithRateLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Rate Limiting vs. Adaptive Retries
&lt;/h3&gt;

&lt;p&gt;Wait, doesn't Resile already have &lt;code&gt;AdaptiveBucket&lt;/code&gt;? What's the difference?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AdaptiveBucket&lt;/strong&gt; is &lt;em&gt;success-based&lt;/em&gt;. It tracks how many requests are succeeding vs. failing and throttles &lt;em&gt;retries&lt;/em&gt; accordingly. It's designed specifically to prevent "retry storms" when a service is failing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RateLimiter&lt;/strong&gt; is &lt;em&gt;time-based&lt;/em&gt;. It enforces a strict, constant quota of requests over a time interval. It’s designed for general traffic shaping and quota management.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For maximum protection, you can even use them together!&lt;/p&gt;




&lt;h2&gt;
  
  
  Shared Rate Limiters
&lt;/h2&gt;

&lt;p&gt;Often, you want to enforce a global rate limit across your entire service instance. You can create a shared &lt;code&gt;RateLimiter&lt;/code&gt; and pass it to multiple executions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Shared rate limiter for a specific API key or downstream service&lt;/span&gt;
&lt;span class="n"&gt;limiter&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRateLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Each call will consume tokens from the same shared bucket.&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;myAction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithRateLimiterInstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Observability: Seeing the Shaping
&lt;/h2&gt;

&lt;p&gt;Knowing &lt;em&gt;when&lt;/em&gt; and &lt;em&gt;why&lt;/em&gt; your traffic is being throttled is essential for operational visibility. &lt;/p&gt;

&lt;p&gt;If you use Resile's telemetry integrations (like &lt;code&gt;slog&lt;/code&gt; or &lt;code&gt;OpenTelemetry&lt;/code&gt;), you'll get automatic visibility into these events. The &lt;code&gt;OnRateLimitExceeded&lt;/code&gt; event is triggered whenever a request is rejected by the rate limiter, allowing you to monitor your quota utilization in real time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Rate limiting is not just about saying "no"; it's about being a good citizen in a distributed ecosystem. By respecting boundaries and shaping your traffic at the source, you protect both your own service and the systems you depend on.&lt;/p&gt;

&lt;p&gt;Resile provides a production-grade rate limiter that integrates seamlessly into your resilience policies, giving you fine-grained control over your traffic flow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learn more about Resile:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>sre</category>
      <category>devops</category>
    </item>
    <item>
      <title>Stop the Domino Effect: Bulkhead Isolation in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Sun, 22 Mar 2026 17:42:19 +0000</pubDate>
      <link>https://dev.to/onurcinar/stop-the-domino-effect-bulkhead-isolation-in-go-5cgl</link>
      <guid>https://dev.to/onurcinar/stop-the-domino-effect-bulkhead-isolation-in-go-5cgl</guid>
      <description>&lt;p&gt;In a distributed system, failure is inevitable. But a failure in one part of your system shouldn't bring down everything else. &lt;/p&gt;

&lt;p&gt;Imagine your Go service depends on three different downstream APIs: Payments, Inventory, and Recommendations. Suddenly, the Recommendations API starts taking 30 seconds to respond. If your service doesn't have isolation, your goroutines will start piling up waiting for Recommendations. Eventually, you'll hit your process limit, and even the critical Payments API calls will start failing because there are no resources left to handle them.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;Domino Effect&lt;/strong&gt;, and the &lt;strong&gt;Bulkhead Pattern&lt;/strong&gt; is how you stop it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Resource Exhaustion
&lt;/h2&gt;

&lt;p&gt;When one dependency slows down, it consumes resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Goroutines&lt;/strong&gt;: Blocked waiting for a response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: Each blocked goroutine carries a stack.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Descriptors/Sockets&lt;/strong&gt;: Open connections to the slow service.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without a bulkhead, a single slow dependency can "starve" the rest of your application, leading to a total system collapse.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: The Bulkhead Pattern
&lt;/h2&gt;

&lt;p&gt;Named after the partitioned sections of a ship's hull, a &lt;strong&gt;Bulkhead&lt;/strong&gt; isolates failures. If one section of the ship is flooded, the others remain buoyant. In software, we achieve this by limiting the number of concurrent executions allowed for a specific resource or dependency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Implementing with Resile
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt; makes it trivial to add bulkhead isolation to any operation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Allow only 10 concurrent calls to this specific operation.&lt;/span&gt;
&lt;span class="c"&gt;// If an 11th call comes in, it fails fast with resile.ErrBulkheadFull.&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithBulkhead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using a Shared Bulkhead
&lt;/h3&gt;

&lt;p&gt;Often, you want to limit concurrency across multiple different call sites that hit the same downstream service. You can create a shared &lt;code&gt;Bulkhead&lt;/code&gt; instance for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Create a shared bulkhead for the "Inventory Service"&lt;/span&gt;
&lt;span class="n"&gt;inventoryBulkhead&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewBulkhead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Call Site A&lt;/span&gt;
&lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fetchItem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithBulkheadInstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inventoryBulkhead&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;// Call Site B&lt;/span&gt;
&lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updateStock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithBulkheadInstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inventoryBulkhead&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By sharing the instance, you ensure that the &lt;em&gt;total&lt;/em&gt; concurrency hitting the Inventory Service never exceeds 20, regardless of which part of your code is making the call.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why "Fail-Fast" Matters
&lt;/h2&gt;

&lt;p&gt;When a bulkhead is full, Resile immediately returns &lt;code&gt;resile.ErrBulkheadFull&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;This is much better than waiting for a timeout. By failing fast, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Preserve Resources&lt;/strong&gt;: You don't spawn another goroutine or open another connection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provide Immediate Feedback&lt;/strong&gt;: Your upstream callers get an error instantly and can decide how to handle it (e.g., show a cached result or a "service busy" message).&lt;/li&gt;
&lt;/ol&gt;
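&lt;p&gt;Here is a sketch of what that graceful degradation can look like at a call site. The error sentinel and functions are hypothetical stand-ins so the snippet stays self-contained:&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// errBulkheadFull stands in for resile.ErrBulkheadFull so this
// sketch compiles without the library.
var errBulkheadFull = errors.New("bulkhead full")

// fetchRecommendations simulates a call guarded by a bulkhead;
// the boolean flag simulates whether the bulkhead is saturated.
func fetchRecommendations(full bool) ([]string, error) {
	if full {
		return nil, errBulkheadFull
	}
	return []string{"live-item"}, nil
}

// recommendationsWithFallback degrades gracefully: on a full
// bulkhead it serves a cached result instead of failing the page.
func recommendationsWithFallback(full bool, cached []string) []string {
	items, err := fetchRecommendations(full)
	if errors.Is(err, errBulkheadFull) {
		fmt.Println("bulkhead full, serving cached recommendations")
		return cached
	}
	if err != nil {
		return nil
	}
	return items
}
```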




&lt;h2&gt;
  
  
  Observability: Monitoring the Walls
&lt;/h2&gt;

&lt;p&gt;You need to know when your bulkheads are working. If a bulkhead is frequently full, it might mean your downstream service is struggling, or you need to re-evaluate your capacity limits.&lt;/p&gt;

&lt;p&gt;If you use Resile's telemetry integrations (like &lt;code&gt;slog&lt;/code&gt; or &lt;code&gt;OpenTelemetry&lt;/code&gt;), you'll get automatic alerts when a bulkhead saturates. The &lt;code&gt;OnBulkheadFull&lt;/code&gt; event is triggered every time a request is rejected due to capacity limits.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Bulkheads are a fundamental building block of resilient systems. By isolating your dependencies, you ensure that a local fire doesn't become a global conflagration.&lt;/p&gt;

&lt;p&gt;Resile provides a clean, "Go-native" way to implement bulkheads without complex boilerplate, allowing you to focus on your business logic while keeping your system stable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explore Resile on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>backend</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Infinite Data Processing in Go: Building Resilient Data Pipes with Channels</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Wed, 18 Mar 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/onurcinar/infinite-data-processing-in-go-building-resilient-data-pipes-with-channels-46d5</link>
      <guid>https://dev.to/onurcinar/infinite-data-processing-in-go-building-resilient-data-pipes-with-channels-46d5</guid>
      <description>&lt;p&gt;When building data-intensive applications, we usually start with the most obvious approach: loading data into a slice or array, iterating over it to process the data, and returning the result. This batch-processing mindset works great—until the data never stops coming.&lt;/p&gt;

&lt;p&gt;Whether you are dealing with live IoT telemetry, continuous log tailing, or real-time financial market feeds, you quickly run into the problem of "infinite" data. If you try to append an endless stream of stock ticks to a &lt;code&gt;[]float64&lt;/code&gt;, your application will inevitably consume all available memory and crash. &lt;/p&gt;

&lt;p&gt;To handle infinite data gracefully, you need to shift your architecture from batch processing to stream processing. In Go, we have the perfect built-in primitive for this: &lt;strong&gt;Channels&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Power of Channels as Data Pipes
&lt;/h2&gt;

&lt;p&gt;Go channels are often taught primarily as a way to synchronize goroutines, but they are also incredibly powerful as sequential data pipes. By treating channels as standard inputs and outputs, you can build decoupled, memory-efficient pipelines where data flows through a series of transformations continuously.&lt;/p&gt;

&lt;p&gt;When redesigning my open-source technical analysis library, &lt;strong&gt;&lt;a href="https://github.com/cinar/indicator" rel="noopener noreferrer"&gt;cinar/indicator&lt;/a&gt;&lt;/strong&gt;, for its v2 release, I faced exactly this challenge. In algorithmic trading, systems need to react instantly to live market feeds without accumulating massive memory overhead. Transitioning the library's core architecture from slice-based arrays to stream-based Go channels solved this elegantly.&lt;/p&gt;

&lt;p&gt;Let's look at how to build a continuous data pipe, and some of the tricky edge cases you'll encounter along the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a Pipeline Stage
&lt;/h2&gt;

&lt;p&gt;Imagine we want to calculate a Simple Moving Average (SMA) over a live stream of data. Instead of taking a slice, our function will accept a read-only channel as its input and return a read-only channel as its output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// SimpleMovingAverage acts as a pipe: it reads from 'input', processes, and writes to 'output'&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;SimpleMovingAverage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Ensure the output channel is closed when the input stream ends&lt;/span&gt;
        &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="nb"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

        &lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0.0&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;

            &lt;span class="c"&gt;// Keep the window size fixed&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="c"&gt;// Only emit a value once we have enough data points&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because the processing happens inside its own goroutine, the function returns the &lt;code&gt;output&lt;/code&gt; channel immediately. The goroutine stays alive, eagerly waiting for new data to arrive on the &lt;code&gt;input&lt;/code&gt; channel.&lt;/p&gt;
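&lt;p&gt;Driving the stage end to end takes only a few lines. This self-contained snippet pairs a compact version of the stage above with a tiny feeder:&lt;/p&gt;

```go
package main

// sma is a compact version of the SimpleMovingAverage stage above,
// repeated here so the snippet is self-contained.
func sma(input <-chan float64, period int) <-chan float64 {
	output := make(chan float64)
	go func() {
		defer close(output)
		window := make([]float64, 0, period)
		sum := 0.0
		for val := range input {
			window = append(window, val)
			sum += val
			if len(window) > period { // keep the window size fixed
				sum -= window[0]
				window = window[1:]
			}
			if len(window) == period {
				output <- sum / float64(period)
			}
		}
	}()
	return output
}

// feed streams a slice into a channel and closes it,
// simulating a finite live feed.
func feed(values []float64) <-chan float64 {
	ch := make(chan float64)
	go func() {
		defer close(ch)
		for _, v := range values {
			ch <- v
		}
	}()
	return ch
}
```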

&lt;h2&gt;
  
  
  Handling Stream Complexities with Helpers
&lt;/h2&gt;

&lt;p&gt;Once you start relying heavily on channels, you run into a few structural challenges. To make working with channels just as easy as working with slices, &lt;code&gt;cinar/indicator&lt;/code&gt; includes a robust &lt;code&gt;helper&lt;/code&gt; package. &lt;/p&gt;

&lt;p&gt;If you are building your own stream-based application, you can leverage these helpers directly from the library rather than reinventing the wheel.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Branching Problem
&lt;/h3&gt;

&lt;p&gt;A major gotcha with Go channels: once a value is read from a channel, it's gone. What if you want to calculate an SMA &lt;em&gt;and&lt;/em&gt; a Relative Strength Index (RSI) from the exact same live price ticker? You can't have two consumers read from one channel without them stealing data from each other.&lt;/p&gt;

&lt;p&gt;To solve this, the library provides &lt;strong&gt;&lt;code&gt;helper.Duplicate&lt;/code&gt;&lt;/strong&gt;. This function takes one input channel and "fans it out" into multiple identical output channels. This allows you to safely branch your data stream to multiple independent technical indicators simultaneously without race conditions or data loss.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Branching one price stream into three identical streams&lt;/span&gt;
&lt;span class="n"&gt;priceStreams&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duplicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;livePrices&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;smaStream&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;indicator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SMA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;priceStreams&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="m"&gt;14&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;rsiStream&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;indicator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RSI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;priceStreams&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="m"&gt;14&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;macdStream&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;indicator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MACD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;priceStreams&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;26&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
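&lt;p&gt;The fan-out idea behind &lt;code&gt;helper.Duplicate&lt;/code&gt; can be sketched in plain generic Go. This illustrates the pattern, not the library's exact code:&lt;/p&gt;

```go
package main

// duplicate fans one input channel out into n identical outputs.
// Note that it blocks on the slowest consumer, which is exactly
// how backpressure propagates through a channel pipeline.
func duplicate[T any](input <-chan T, n int) []<-chan T {
	outputs := make([]chan T, n)
	results := make([]<-chan T, n)
	for i := range outputs {
		outputs[i] = make(chan T)
		results[i] = outputs[i]
	}
	go func() {
		// Close every branch once the source stream ends.
		defer func() {
			for _, out := range outputs {
				close(out)
			}
		}()
		for val := range input {
			for _, out := range outputs {
				out <- val // one send per branch, in input order
			}
		}
	}()
	return results
}
```

&lt;p&gt;Each branch must be consumed concurrently; if one branch is never read, the whole fan-out stalls by design.&lt;/p&gt;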



&lt;h3&gt;
  
  
  2. Lookbacks and Sliding Windows
&lt;/h3&gt;

&lt;p&gt;Many data processing algorithms require looking back at the last N periods. Instead of managing a sliding window manually inside every single function (like we did in the basic SMA example above), the library uses &lt;strong&gt;&lt;code&gt;helper.Buffered&lt;/code&gt;&lt;/strong&gt;. This provides a clean abstraction to maintain a rolling state over a continuous channel, vastly simplifying the development of complex logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Bridging the Gap: Slices vs. Streams
&lt;/h3&gt;

&lt;p&gt;The rest of the world often still speaks in batches. You might be downloading historical CSV data for backtesting, or you might need to output an array for a charting UI. To bridge this gap, the &lt;code&gt;helper&lt;/code&gt; package includes utilities to fluidly move between paradigms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;helper.SliceToChan&lt;/code&gt;&lt;/strong&gt;: Converts a static historical array into a simulated live data stream. It spins up a goroutine, pushes every element from the slice into a channel, and closes it. It's perfect for feeding historical backtests into a live-stream architecture.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;helper.ChanToSlice&lt;/code&gt;&lt;/strong&gt;: The inverse operation. It drains a stream back into an array, which is incredibly useful for writing unit tests or rendering charts.&lt;/li&gt;
&lt;/ul&gt;
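&lt;p&gt;Both helpers are small enough to sketch with generics. These are illustrative versions; prefer the library's implementations in practice:&lt;/p&gt;

```go
package main

// sliceToChan streams a slice's elements into a channel and
// closes it, turning batch data into a simulated live feed.
func sliceToChan[T any](values []T) <-chan T {
	ch := make(chan T)
	go func() {
		defer close(ch)
		for _, v := range values {
			ch <- v
		}
	}()
	return ch
}

// chanToSlice drains a channel back into a slice; it returns
// only after the channel is closed.
func chanToSlice[T any](ch <-chan T) []T {
	var out []T
	for v := range ch {
		out = append(out, v)
	}
	return out
}
```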

&lt;h2&gt;
  
  
  Chaining the Pipes Together
&lt;/h2&gt;

&lt;p&gt;Because all the indicators and helpers in &lt;code&gt;cinar/indicator&lt;/code&gt; take channels and return channels, they are highly composable. We can chain them together like Unix command-line pipes (&lt;code&gt;|&lt;/code&gt;). &lt;/p&gt;

&lt;p&gt;Here is what it looks like to wire up an application using these concepts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"[github.com/cinar/indicator/v2/helper](https://github.com/cinar/indicator/v2/helper)"&lt;/span&gt;
    &lt;span class="s"&gt;"[github.com/cinar/indicator/v2/trend](https://github.com/cinar/indicator/v2/trend)"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// 1. We start with a static slice of historical data&lt;/span&gt;
    &lt;span class="n"&gt;historicalPrices&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="m"&gt;10.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;12.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;14.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;13.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;15.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;18.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;19.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;17.0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// 2. Bridge the gap: convert the slice to a live stream&lt;/span&gt;
    &lt;span class="n"&gt;marketTicks&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;helper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SliceToChan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;historicalPrices&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// 3. Pipe the ticks into a 3-period SMA processor from the library&lt;/span&gt;
    &lt;span class="n"&gt;smaStream&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;trend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sma&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marketTicks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// 4. Drain the output stream &lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;avg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;smaStream&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"New SMA tick processed: %.2f&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why This Architecture Wins
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Memory Efficiency:&lt;/strong&gt; We only store the exact amount of data needed at any given moment. The Go garbage collector easily cleans up the rest, meaning we can process a continuous websocket stream for months without memory leaks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpressure Handling:&lt;/strong&gt; Go channels are blocking by nature. If a complex compound strategy at the end of the pipeline is too slow, the channels will naturally fill up, pausing the producers further up the chain until it catches up. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decoupling:&lt;/strong&gt; Each pipeline stage is completely isolated. The indicator doesn't know if the data is coming from a historical Tiingo repository, an Alpaca websocket, or a mock unit test. It just reads from a &lt;code&gt;&amp;lt;-chan T&lt;/code&gt; and returns a new &lt;code&gt;&amp;lt;-chan T&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
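&lt;p&gt;The backpressure point is easy to verify: however fast the producer runs, it can never get more than the channel buffer (plus one in-flight value) ahead of the consumer. A small self-contained experiment:&lt;/p&gt;

```go
package main

import "sync/atomic"

// maxLead runs a producer and a consumer over a channel with the
// given buffer size and returns the largest observed gap between
// values produced and values consumed. Backpressure keeps this
// gap near the buffer size no matter how fast the producer is.
func maxLead(buffer, n int) int64 {
	ch := make(chan int, buffer)
	var produced, consumed, lead int64
	done := make(chan struct{})
	go func() {
		for i := 0; i < n; i++ {
			ch <- i // blocks whenever the buffer is full
			p := atomic.AddInt64(&produced, 1)
			if d := p - atomic.LoadInt64(&consumed); d > atomic.LoadInt64(&lead) {
				atomic.StoreInt64(&lead, d)
			}
		}
		close(ch)
	}()
	go func() {
		for range ch {
			atomic.AddInt64(&consumed, 1)
		}
		close(done)
	}()
	<-done
	return atomic.LoadInt64(&lead)
}
```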

&lt;h2&gt;
  
  
  Try It Out
&lt;/h2&gt;

&lt;p&gt;By treating channels as native data streams and relying on robust helper utilities, you can build highly resilient, concurrent pipelines capable of processing truly infinite data sets. &lt;/p&gt;

&lt;p&gt;If you are building financial tools, real-time dashboards, or are just looking to explore a Go codebase that relies heavily on generics and channel-based streaming, I highly recommend leveraging the &lt;strong&gt;&lt;a href="https://github.com/cinar/indicator" rel="noopener noreferrer"&gt;cinar/indicator&lt;/a&gt;&lt;/strong&gt; library. It comes batteries-included with all the helpers and technical indicators you need to get started with stream processing in Go.&lt;/p&gt;

&lt;p&gt;How are you handling continuous data streams in your applications? Let me know in the comments!&lt;/p&gt;

</description>
      <category>go</category>
      <category>datascience</category>
      <category>opensource</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Self-Healing State Machines: Resilient State Transitions in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Mon, 16 Mar 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/onurcinar/self-healing-state-machines-resilient-state-transitions-in-go-3e0</link>
      <guid>https://dev.to/onurcinar/self-healing-state-machines-resilient-state-transitions-in-go-3e0</guid>
      <description>&lt;p&gt;Distributed systems are inherently stateful. Whether you're managing a database connection pool, a multi-step payment workflow, or a complex IoT device lifecycle, you need to transition between states reliably.&lt;/p&gt;

&lt;p&gt;Standard state machines (FSMs) are great for logic, but they are often brittle. What happens if a transition involves a network call that fails? Most developers end up wrapping their &lt;code&gt;machine.Transition()&lt;/code&gt; calls in manual retry loops, cluttering their business logic and losing visibility into &lt;em&gt;why&lt;/em&gt; a transition failed.&lt;/p&gt;

&lt;p&gt;Inspired by Erlang's &lt;code&gt;gen_statem&lt;/code&gt; behavior, &lt;strong&gt;Resile&lt;/strong&gt; introduces &lt;code&gt;resile.StateMachine&lt;/code&gt;: a standardized, resilient state machine where every transition is inherently protected by resilience policies.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Resile Way: One-Line Resilience for Transitions
&lt;/h2&gt;

&lt;p&gt;With &lt;code&gt;resile.StateMachine&lt;/code&gt;, you don't just define how to move from State A to State B. You define a &lt;strong&gt;Resilient Transition&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here is how you implement a self-healing connection manager:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/cinar/resile"&lt;/span&gt;

&lt;span class="c"&gt;// 1. Define your State, Data, and Events&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Disconnected&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Disconnected"&lt;/span&gt;
    &lt;span class="n"&gt;Connected&lt;/span&gt;    &lt;span class="n"&gt;State&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Connected"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Connect&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Connect"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// 2. Define the Transition Logic&lt;/span&gt;
&lt;span class="n"&gt;transition&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="n"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RetryState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Disconnected&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Connect&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// This transition involves a network call.&lt;/span&gt;
        &lt;span class="c"&gt;// If it fails, Resile will automatically retry it &lt;/span&gt;
        &lt;span class="c"&gt;// using the configured backoff and jitter.&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;apiClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Endpoint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Connected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// 3. Initialize the Resilient State Machine&lt;/span&gt;
&lt;span class="n"&gt;sm&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewStateMachine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Disconnected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Endpoint&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api.example.com"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; 
    &lt;span class="n"&gt;transition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithMaxAttempts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithBaseDelay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// 4. Handle events safely&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Connect&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What happens under the hood?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;When you call &lt;code&gt;sm.Handle(ctx, event)&lt;/code&gt;, Resile enters an execution envelope.&lt;/li&gt;
&lt;li&gt;It executes your &lt;code&gt;transition&lt;/code&gt; function.&lt;/li&gt;
&lt;li&gt;If the &lt;code&gt;transition&lt;/code&gt; returns an error, Resile applies your retry policy (e.g., Exponential Backoff with Jitter).&lt;/li&gt;
&lt;li&gt;Only when the &lt;code&gt;transition&lt;/code&gt; succeeds does the &lt;code&gt;StateMachine&lt;/code&gt; update its internal state and data.&lt;/li&gt;
&lt;li&gt;If the retries are exhausted, the &lt;code&gt;StateMachine&lt;/code&gt; remains in its previous state, ensuring consistency.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Why "Self-Healing"?
&lt;/h2&gt;

&lt;p&gt;Most state machine implementations are "fire and forget" or "fail and stop." A &lt;strong&gt;Self-Healing&lt;/strong&gt; state machine assumes that transitions are risky and provides the infrastructure to recover from those risks automatically.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Automatic Retries&lt;/strong&gt;: No more manual loops inside your state logic.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Circuit Breakers&lt;/strong&gt;: If a specific transition (e.g., to a "Maintenance" state) is failing repeatedly, the circuit breaker can trip to prevent overwhelming the system.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context Awareness&lt;/strong&gt;: If the transition is part of a timed-out request, the state machine cancels the transition attempt immediately, preventing goroutine leaks.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Observability&lt;/strong&gt;: Every transition attempt—including retries—is tracked by Resile's telemetry hooks. You can see exactly how many times your machine "struggled" to reach the &lt;code&gt;Connected&lt;/code&gt; state.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Observability: Tracking State Success
&lt;/h2&gt;

&lt;p&gt;By using Resile's OpenTelemetry or &lt;code&gt;slog&lt;/code&gt; integrations, you get deep insights into your state machine's health:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Attempts per Transition&lt;/strong&gt;: See which events are causing the most retries.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Transition Latency&lt;/strong&gt;: Measure how long it takes to move from one state to another, including backoff time.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Failure Patterns&lt;/strong&gt;: Identify if a specific state is a "dead end" due to persistent errors.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Resilience isn't just for simple API calls. By bringing resilience to the core of your stateful logic, you build systems that are not only more robust but also significantly easier to debug and monitor.&lt;/p&gt;

&lt;p&gt;Stop writing manual retry loops around your state changes. Let &lt;code&gt;resile.StateMachine&lt;/code&gt; handle the complexity of the "unreliable world" while you focus on the logic of your application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Give Resile a star on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How are you managing state transitions in your Go microservices? Let's discuss!&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>programming</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Preventing Microservice Meltdowns: Adaptive Retries and Circuit Breakers in Go</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Sun, 15 Mar 2026 22:28:33 +0000</pubDate>
      <link>https://dev.to/onurcinar/preventing-microservice-meltdowns-adaptive-retries-and-circuit-breakers-in-go-30ho</link>
      <guid>https://dev.to/onurcinar/preventing-microservice-meltdowns-adaptive-retries-and-circuit-breakers-in-go-30ho</guid>
      <description>&lt;p&gt;We’ve all been there. A downstream database has a momentary blip. Your service instances, being "resilient," immediately start retrying their failed requests. &lt;/p&gt;

&lt;p&gt;Suddenly, the database isn't just "having a blip" anymore—it’s being hammered by a self-inflicted DDoS attack from its own clients. This is the &lt;strong&gt;Retry Storm&lt;/strong&gt; (or Thundering Herd), and it’s one of the most common ways distributed systems experience total meltdowns.&lt;/p&gt;

&lt;p&gt;Standard exponential backoff protects individual services, but it doesn't protect the &lt;em&gt;cluster&lt;/em&gt;. To do that, you need a layered defense-in-depth approach.&lt;/p&gt;

&lt;p&gt;Here is how to prevent microservice meltdowns in Go using &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Aggregate Load
&lt;/h2&gt;

&lt;p&gt;Imagine you have 100 instances of your API. Each instance is configured to retry 3 times. If the database slows down, you suddenly have &lt;strong&gt;300 extra requests&lt;/strong&gt; hitting it exactly when it's struggling to recover.&lt;/p&gt;

&lt;p&gt;Even with jitter, the aggregate load can be enough to keep the database in a "failed" state indefinitely. To solve this, we need two patterns working together: &lt;strong&gt;Adaptive Retries&lt;/strong&gt; and &lt;strong&gt;Circuit Breakers&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Adaptive Retries (The Token Bucket)
&lt;/h2&gt;

&lt;p&gt;Inspired by Google's SRE book and AWS SDKs, &lt;strong&gt;Adaptive Retries&lt;/strong&gt; use a client-side token bucket to "fail fast" locally.&lt;/p&gt;

&lt;p&gt;The logic is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every &lt;strong&gt;success&lt;/strong&gt; adds a small amount of "credit" to your bucket.&lt;/li&gt;
&lt;li&gt;Every &lt;strong&gt;retry&lt;/strong&gt; consumes a significant amount of credit.&lt;/li&gt;
&lt;li&gt;If the bucket is empty, Resile &lt;strong&gt;stops retrying immediately&lt;/strong&gt; and fails fast locally.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures that if a downstream service is fundamentally degraded, your fleet of clients will automatically throttle their retry pressure at the source, giving the service breathing room to recover.&lt;/p&gt;
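&lt;p&gt;The credit-and-spend logic is small enough to sketch in plain Go. The following illustrates the pattern only, not Resile's internals; the &lt;code&gt;adaptiveBucket&lt;/code&gt; type and its parameters are hypothetical:&lt;/p&gt;

```go
package main

import "fmt"

// adaptiveBucket is a minimal client-side retry budget:
// each success earns a small credit, each retry spends a large one.
type adaptiveBucket struct {
	tokens        float64
	max           float64
	successCredit float64
	retryCost     float64
}

// OnSuccess deposits a small credit, capped at the bucket's capacity.
func (b *adaptiveBucket) OnSuccess() {
	b.tokens += b.successCredit
	if b.tokens > b.max {
		b.tokens = b.max
	}
}

// TrySpend reports whether a retry is allowed right now.
func (b *adaptiveBucket) TrySpend() bool {
	if b.tokens < b.retryCost {
		return false // bucket empty: fail fast locally
	}
	b.tokens -= b.retryCost
	return true
}

func main() {
	b := &adaptiveBucket{tokens: 10, max: 10, successCredit: 0.5, retryCost: 5}
	fmt.Println(b.TrySpend()) // true  (10 -> 5)
	fmt.Println(b.TrySpend()) // true  (5 -> 0)
	fmt.Println(b.TrySpend()) // false: retry budget exhausted
}
```

&lt;p&gt;Because retries cost far more than successes earn, a sustained failure rate drains the bucket quickly, and the whole fleet backs off at the source.&lt;/p&gt;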

&lt;h3&gt;
  
  
  Implementing with Resile:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Share this bucket across multiple executions or even your entire service&lt;/span&gt;
&lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultAdaptiveBucket&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithAdaptiveBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  2. Circuit Breakers (The Kill Switch)
&lt;/h2&gt;

&lt;p&gt;While retries assume "eventual success," a &lt;strong&gt;Circuit Breaker&lt;/strong&gt; assumes "statistical failure." &lt;/p&gt;

&lt;p&gt;If a service fails 5 times in a row, the breaker "trips" (opens). For the next 30 seconds, every call to that service will fail &lt;strong&gt;instantly&lt;/strong&gt; without even trying to hit the network. This protects your downstream infrastructure from useless traffic and saves your local resources (threads, memory, sockets).&lt;/p&gt;

&lt;h3&gt;
  
  
  Layering it in Resile:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/cinar/resile/circuit"&lt;/span&gt;

&lt;span class="c"&gt;// Create a breaker: Trip after 5 failures, wait 30s to retry&lt;/span&gt;
&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;circuit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;circuit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;FailureThreshold&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ResetTimeout&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="m"&gt;30&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithCircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Ultimate Defense: Layered Resilience
&lt;/h2&gt;

&lt;p&gt;The real power of Resile comes from combining these patterns. You can layer Retries, Circuit Breakers, and Adaptive Buckets into a single execution strategy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoErr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithMaxAttempts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;           &lt;span class="c"&gt;// Layer 1: Handle random blips&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithCircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;      &lt;span class="c"&gt;// Layer 2: Stop hitting a dead service&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithAdaptiveBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c"&gt;// Layer 3: Prevent cluster-wide storms&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this setup:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Retries&lt;/strong&gt; handle the "one-off" network glitches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Circuit Breaker&lt;/strong&gt; stops you from wasting time on a service that is clearly down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Adaptive Bucket&lt;/strong&gt; ensures that even if the breaker hasn't tripped yet, you won't overwhelm the system with aggregate retry load.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Observability: Seeing the Shield in Action
&lt;/h2&gt;

&lt;p&gt;Protecting your system is great, but &lt;em&gt;knowing&lt;/em&gt; you’re being protected is better. &lt;/p&gt;

&lt;p&gt;If you use Resile's &lt;code&gt;slog&lt;/code&gt; or &lt;code&gt;OpenTelemetry&lt;/code&gt; integrations, you'll see exactly when these shields activate. Your logs will show &lt;code&gt;retry.throttled=true&lt;/code&gt; when the adaptive bucket kicks in, or your traces will show a &lt;code&gt;circuit.open&lt;/code&gt; error when the breaker prevents a call.&lt;/p&gt;

&lt;p&gt;This visibility is crucial for SREs to understand &lt;em&gt;why&lt;/em&gt; traffic is failing and &lt;em&gt;how&lt;/em&gt; the system is self-healing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building resilient microservices isn't just about making individual calls "smarter." It's about ensuring that your entire architecture can survive a storm without collapsing under its own weight.&lt;/p&gt;

&lt;p&gt;By combining opinionated retries, circuit breakers, and adaptive throttling, Resile gives you a production-grade resilience engine that scales with your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try Resile today:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How do you prevent "retry storms" in your Go clusters? Let's discuss in the comments!&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>sre</category>
      <category>devops</category>
    </item>
    <item>
      <title>Beating Tail Latency: A Guide to Request Hedging in Go Microservices</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Sat, 14 Mar 2026 03:03:35 +0000</pubDate>
      <link>https://dev.to/onurcinar/beating-tail-latency-a-guide-to-request-hedging-in-go-microservices-p81</link>
      <guid>https://dev.to/onurcinar/beating-tail-latency-a-guide-to-request-hedging-in-go-microservices-p81</guid>
      <description>&lt;p&gt;In distributed systems, we often talk about "The Long Tail." &lt;/p&gt;

&lt;p&gt;You might have a service where 99% of requests finish in under 100ms. But that last 1% (the P99 latency)? Those requests might take 2 seconds or more. In a microservice architecture where one user action triggers 10 different service calls, that one slow dependency will bottleneck the entire user experience.&lt;/p&gt;

&lt;p&gt;Standard retries don't help here. Why? Because a "Tail Latency" request hasn't failed yet—it’s just &lt;em&gt;slow&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Waiting for a 2-second timeout to trigger a retry is a waste of time. To beat the long tail, you need &lt;strong&gt;Request Hedging&lt;/strong&gt; (also known as Speculative Retries).&lt;/p&gt;

&lt;p&gt;Here is how to implement it safely in Go using &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Request Hedging?
&lt;/h2&gt;

&lt;p&gt;The concept is simple but powerful: If a request is taking longer than usual (say, longer than the P95 latency), don't kill it. Instead, &lt;strong&gt;start a second, identical request in parallel.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Whichever request finishes first, you take its result and cancel the other one.&lt;/p&gt;

&lt;p&gt;This "speculative" approach drastically reduces P99 latency because the mathematical probability of &lt;em&gt;two&lt;/em&gt; identical requests hitting the "long tail" simultaneously is extremely low.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Complexity of Manual Hedging
&lt;/h2&gt;

&lt;p&gt;Implementing hedging manually in Go is a nightmare of goroutine management:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You need a &lt;code&gt;select&lt;/code&gt; block with a timer.&lt;/li&gt;
&lt;li&gt;You need to coordinate between two (or more) goroutines.&lt;/li&gt;
&lt;li&gt;You must ensure that once one succeeds, the others are cancelled immediately to save resources.&lt;/li&gt;
&lt;li&gt;You have to handle race conditions where both might succeed at the exact same millisecond.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Most developers end up with dozens of lines of brittle boilerplate code to handle just one hedged call.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Resile Way: &lt;code&gt;DoHedged&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Resile&lt;/strong&gt; makes request hedging as simple as a single function call. It handles the goroutine lifecycle, context cancellation, and race conditions for you.&lt;/p&gt;

&lt;p&gt;Here is how you fetch data with a 100ms hedging delay:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/cinar/resile"&lt;/span&gt;

&lt;span class="c"&gt;// data is automatically inferred as *User&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DoHedged&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;apiClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; 
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithMaxAttempts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithHedgingDelay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What happens under the hood?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Resile starts the first request.&lt;/li&gt;
&lt;li&gt;It waits for 100ms (&lt;code&gt;HedgingDelay&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;If the first request hasn't finished, it starts a &lt;strong&gt;second&lt;/strong&gt; request.&lt;/li&gt;
&lt;li&gt;As soon as one returns a successful result, Resile &lt;strong&gt;cancels the context&lt;/strong&gt; of the other request and returns the data to you.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Picking the Right Hedging Delay
&lt;/h2&gt;

&lt;p&gt;The "magic" of hedging lies in the delay. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Too short:&lt;/strong&gt; You double your traffic unnecessarily, putting extra load on your downstream services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Too long:&lt;/strong&gt; You don't gain much latency benefit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pro-Tip:&lt;/strong&gt; A good rule of thumb is to set your &lt;code&gt;HedgingDelay&lt;/code&gt; to your &lt;strong&gt;P95 or P99 latency&lt;/strong&gt;. This ensures you only "hedge" the slowest 1-5% of requests, providing a massive latency win with minimal extra load.&lt;/p&gt;




&lt;h2&gt;
  
  
  Observability: Tracking the "Speculative" Wins
&lt;/h2&gt;

&lt;p&gt;If you're using Resile's OpenTelemetry integration (&lt;code&gt;telemetry/resileotel&lt;/code&gt;), you can actually see these wins in your distributed traces. &lt;/p&gt;

&lt;p&gt;Each hedged attempt is recorded as a sub-span. When a hedged request wins, you'll see the first span get cancelled and the second one succeed—providing clear proof that hedging saved your user from a 2-second wait.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Request hedging used to be a technique reserved for companies with massive infrastructure teams. With Resile, it’s a tool that every Go developer can use to build snappier, more resilient microservices.&lt;/p&gt;

&lt;p&gt;By moving from "Wait and Retry" to "Hedge and Win," you can turn your long-tail latency into a competitive advantage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Give Resile a star on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How are you handling tail latency in your Go services? Let's discuss in the comments!&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>performance</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Python's Stamina for Go: Bringing Ergonomic Resilience to Gophers</title>
      <dc:creator>Onur Cinar</dc:creator>
      <pubDate>Tue, 10 Mar 2026 02:04:41 +0000</pubDate>
      <link>https://dev.to/onurcinar/pythons-stamina-for-go-bringing-ergonomic-resilience-to-gophers-1lf2</link>
      <guid>https://dev.to/onurcinar/pythons-stamina-for-go-bringing-ergonomic-resilience-to-gophers-1lf2</guid>
      <description>&lt;p&gt;If you've ever worked in the Python ecosystem, you've likely encountered &lt;a href="https://github.com/jd/tenacity" rel="noopener noreferrer"&gt;tenacity&lt;/a&gt; or its opinionated wrapper, &lt;a href="https://github.com/hynek/stamina" rel="noopener noreferrer"&gt;stamina&lt;/a&gt;. They make retrying transient failures feel like magic: a single decorator, sensible production defaults (exponential backoff + jitter), and built-in observability.&lt;/p&gt;

&lt;p&gt;The Go ecosystem has powerful tools, but they often require a lot of boilerplate, use reflection, or lack the "Correct by Default" philosophy that makes &lt;code&gt;stamina&lt;/code&gt; so great.&lt;/p&gt;

&lt;p&gt;That's why I built &lt;strong&gt;&lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;Resile&lt;/a&gt;&lt;/strong&gt;. It’s a love letter to Python's ergonomics, written in idiomatic, type-safe Go 1.18+.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Ergonomic Gap
&lt;/h2&gt;

&lt;p&gt;In Python, retrying an API call with &lt;code&gt;stamina&lt;/code&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stamina&lt;/span&gt;

&lt;span class="nd"&gt;@stamina.retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_data&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before Go 1.18, achieving this level of simplicity was nearly impossible. You either had to write verbose &lt;code&gt;for&lt;/code&gt; loops or use libraries that relied on &lt;code&gt;interface{}&lt;/code&gt; and reflection—which meant losing type safety and slowing down your code.&lt;/p&gt;

&lt;p&gt;With the arrival of &lt;strong&gt;Generics&lt;/strong&gt;, the game changed. We can now have the best of both worlds: Python-level ergonomics with Go’s compile-time safety.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Resile Way: One-Line Resilience
&lt;/h2&gt;

&lt;p&gt;Resile uses Go 1.18+ Type Parameters to wrap your logic in a resilience envelope. Here is how you fetch data with Resile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/cinar/resile"&lt;/span&gt;

&lt;span class="c"&gt;// data is automatically inferred as *User&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;apiClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithMaxAttempts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It looks and feels like a simple function call, but under the hood, Resile is doing the heavy lifting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;AWS Full Jitter Backoff&lt;/strong&gt;: Spreading out retries to protect your database.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context-Awareness&lt;/strong&gt;: Cancelling retries immediately if the request times out.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Memory Safety&lt;/strong&gt;: Using managed timers to prevent goroutine leaks.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why "Correct by Default" Matters
&lt;/h2&gt;

&lt;p&gt;One of the best things about &lt;code&gt;stamina&lt;/code&gt; is that it makes the &lt;em&gt;right&lt;/em&gt; thing the &lt;em&gt;easy&lt;/em&gt; thing. Resile follows this philosophy strictly:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Exponential Backoff is the Default&lt;/strong&gt;: You don't have to configure it; it's there from attempt one.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Jitter is Not Optional&lt;/strong&gt;: Resile forces randomization to prevent "thundering herd" outages.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Zero Dependencies&lt;/strong&gt;: The core of Resile depends only on the Go standard library. No bloated dependency graphs.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;No Reflection&lt;/strong&gt;: Unlike many older Go retry libraries, Resile uses static type parameters. This means zero runtime overhead and zero chance of a "type mismatch" panic.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Batteries Included (But Removable)
&lt;/h2&gt;

&lt;p&gt;Just like &lt;code&gt;stamina&lt;/code&gt; integrates with &lt;code&gt;structlog&lt;/code&gt; and &lt;code&gt;Prometheus&lt;/code&gt;, Resile provides optional sub-packages for modern observability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;telemetry/resileslog&lt;/code&gt;&lt;/strong&gt;: High-performance structured logging with Go 1.21’s &lt;code&gt;slog&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;telemetry/resileotel&lt;/code&gt;&lt;/strong&gt;: Full OpenTelemetry tracing. See every retry attempt as a sub-span in your Jaeger or Honeycomb dashboard.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Adaptive Retries&lt;/strong&gt;: A client-side token bucket (inspired by Google's SRE book) to prevent your fleet from killing a degraded service.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Fast CI/CD: The "Stamina" Secret
&lt;/h2&gt;

&lt;p&gt;A hidden feature of &lt;code&gt;stamina&lt;/code&gt; that I absolutely loved was the ability to globally disable wait times in unit tests. &lt;/p&gt;

&lt;p&gt;Resile brings this to Go through &lt;strong&gt;Context Overrides&lt;/strong&gt;. You can make your 30-second retry loop execute in 1 millisecond during tests without changing a single line of business logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// This context tells Resile to skip all sleep timers&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;resile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTestingBypass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;myService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PerformTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// Retries instantly!&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We don't have to choose between Go's performance and Python's developer experience. By leveraging modern Go features like Generics and &lt;code&gt;slog&lt;/code&gt;, we can build tools that are both powerful and a joy to use.&lt;/p&gt;

&lt;p&gt;If you’ve been missing &lt;code&gt;stamina&lt;/code&gt; in your Go projects, give &lt;strong&gt;Resile&lt;/strong&gt; a try.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check it out on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/cinar/resile" rel="noopener noreferrer"&gt;github.com/cinar/resile&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>python</category>
      <category>backend</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
