izma anwar

Posted on Apr 5

Implementing the Retry Pattern Using Polly in .NET Applications

#architecture #csharp #dotnet #microservices

The Problem — Why Your Microservices Fail Under Pressure
Picture this: you have an email microservice responsible for retrieving emails from a database, processing them, and delivering them to users.
Suddenly, the processing service becomes temporarily unavailable — a high traffic spike, a fleeting network glitch. Without a resilience strategy, your users immediately receive error messages.
This is where the Retry Pattern comes in.

What Is the Retry Pattern?
The Retry pattern is a resilience strategy designed to handle transient failures by automatically re-attempting an unsuccessful operation a defined number of times — instead of instantly marking it as a failure.
It empowers your application to gracefully navigate:

🌐 Network timeouts
⚡ Sporadic service unavailability
🔄 Temporary database connection drops
📡 Intermittent external API failures
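
Before bringing in a library, it can help to see the idea in plain C#. The following is a minimal hand-rolled sketch, not Polly's implementation; the `Retry` helper and the simulated failure are hypothetical and exist only for illustration.

```csharp
using System;

// Hypothetical helper: runs the operation and retries up to maxRetries times on any exception.
T Retry<T>(Func<T> operation, int maxRetries)
{
    for (int attempt = 0; ; attempt++)
    {
        try
        {
            return operation();
        }
        catch (Exception) when (attempt < maxRetries)
        {
            // Swallow the failure and loop again; once retries run out, the exception propagates.
        }
    }
}

int calls = 0;
string result = Retry(() =>
{
    calls++;
    if (calls < 3) throw new TimeoutException("transient glitch"); // fails twice, then recovers
    return "delivered";
}, maxRetries: 3);

Console.WriteLine($"{result} after {calls} calls"); // delivered after 3 calls
```

The exception filter (`when (attempt < maxRetries)`) is what lets the last failure surface to the caller instead of being silently dropped.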

Introducing Polly
Polly is a .NET resilience and transient-fault-handling library that lets you express policies like Retry, Circuit Breaker, and Timeout in a fluent, thread-safe manner.

Let's explore different policies.

🔁 Wait and Retry Policy
The most commonly used policy. It retries a failed operation with an optional delay between attempts.

Basic Retry on Exception
RetryPolicy retryIfException =
    Policy.Handle<Exception>().Retry(3);

retryIfException.Execute(
    someBusinessLogic.DoSomethingThatMightThrowException);
Retry When Result Is Unexpected
RetryPolicy<bool> retryPolicyNeedsTrueResponse =
    Policy.HandleResult<bool>(b => b != true).Retry(3);

bool result = retryPolicyNeedsTrueResponse.Execute(
    () => someBusinessLogic.GetBool());

Re-authenticate Before Retrying
Useful for APIs that return 401 Unauthorized with expired tokens:

// Polly v7+: policies built with RetryAsync have the type AsyncRetryPolicy<T>
AsyncRetryPolicy<HttpResponseMessage> httpRetryWithReauthorizationPolicy =
    Policy.HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
        .RetryAsync(3, onRetry: (response, retryCount) =>
        {
            if (response.Result.StatusCode == HttpStatusCode.Unauthorized)
            {
                PerformReauthorization(); // Refresh your token here
            }
        });

Wait and Retry with Exponential Backoff ⭐
This is the one you'll use most in production. It introduces an exponential delay between retries — giving the failing service time to recover:

AsyncRetryPolicy<HttpResponseMessage> httpRetryPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .Or<HttpRequestException>()
    .WaitAndRetryAsync(3, retryAttempt =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt) / 2));

HttpResponseMessage httpResponseMessage = await
    httpRetryPolicy.ExecuteAsync(
        () => httpClient.GetAsync(remoteEndpoint));

💡 Pro tip: Exponential backoff (1s → 2s → 4s with the formula above) keeps retries from hammering an already struggling service.
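
To check the schedule, here is the same delay formula evaluated on its own, with no Polly involved. The `Delay` local function is just a name chosen for this sketch.

```csharp
using System;

// Same formula as the sleep-duration lambda in the policy: 2^attempt / 2 seconds.
TimeSpan Delay(int retryAttempt) =>
    TimeSpan.FromSeconds(Math.Pow(2, retryAttempt) / 2);

for (int attempt = 1; attempt <= 3; attempt++)
{
    Console.WriteLine($"Retry {attempt}: wait {Delay(attempt).TotalSeconds}s");
}
// Retry 1: wait 1s
// Retry 2: wait 2s
// Retry 3: wait 4s
```

Dropping the `/ 2` shifts the schedule to 2s, 4s, 8s.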

⚡ Circuit Breaker Policy
The Circuit Breaker is like an electrical fuse for your microservices.
When too many failures are detected, the circuit "opens" and immediately rejects all requests — preventing cascading failures across your entire system. After a defined break duration, it allows one test request through. If that succeeds, the circuit "closes" again.
Basic Circuit Breaker
Opens after 2 consecutive failures, stays open for 60 seconds:

AsyncCircuitBreakerPolicy<HttpResponseMessage> basicCircuitBreakerPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .CircuitBreakerAsync(2, TimeSpan.FromSeconds(60));

HttpResponseMessage response =
    await basicCircuitBreakerPolicy.ExecuteAsync(
        () => _httpClient.GetAsync(remoteEndpoint));
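
The closed → open → half-open cycle is easier to follow as a tiny state machine. What follows is a simplified, single-threaded sketch written for this article, not Polly's actual implementation; `Call` and its time parameter are hypothetical, and the clock is passed in so the behavior is deterministic.

```csharp
using System;

const int failuresBeforeBreaking = 2;
TimeSpan breakDuration = TimeSpan.FromSeconds(60);

int consecutiveFailures = 0;
DateTime? openedAtUtc = null; // null means the circuit is closed

// Runs the action through the breaker at the given point in time.
string Call(Func<bool> action, DateTime nowUtc)
{
    if (openedAtUtc is DateTime opened && nowUtc - opened < breakDuration)
    {
        return "rejected"; // open: fail fast without touching the service
    }

    // Either closed, or the break has elapsed and this is the single trial call (half-open).
    bool succeeded = action();
    if (succeeded)
    {
        consecutiveFailures = 0;
        openedAtUtc = null; // close the circuit
        return "success";
    }

    consecutiveFailures++;
    if (openedAtUtc != null || consecutiveFailures >= failuresBeforeBreaking)
    {
        openedAtUtc = nowUtc; // trip the breaker, or re-open after a failed trial
    }
    return "failure";
}

DateTime t0 = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
Console.WriteLine(Call(() => false, t0));                // failure
Console.WriteLine(Call(() => false, t0.AddSeconds(1)));  // failure (circuit opens)
Console.WriteLine(Call(() => true, t0.AddSeconds(2)));   // rejected
Console.WriteLine(Call(() => true, t0.AddSeconds(62)));  // success (trial call closes the circuit)
```

Note that the third call would have succeeded, but the open circuit never lets it reach the service. That is the trade-off: a few healthy requests are rejected in exchange for giving the upstream room to recover.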

Advanced Circuit Breaker ⭐ (Used in Production)
More intelligent — breaks based on a failure rate threshold within a time window:

AsyncCircuitBreakerPolicy<HttpResponseMessage> advancedCircuitBreakerPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .AdvancedCircuitBreakerAsync(
        0.01,                      // 1% failure rate threshold
        TimeSpan.FromSeconds(60),  // Sampling window
        1000,                      // Minimum throughput before evaluating
        TimeSpan.FromSeconds(10)   // How long to stay open
    );

HttpResponseMessage response =
    await advancedCircuitBreakerPolicy.ExecuteAsync(
        () => _httpClient.GetAsync(remoteEndpoint));

How it works: If at least 1% of requests fail within a 60-second sampling window, and at least 1,000 requests passed through the circuit in that window, the circuit opens for 10 seconds. After that, one test request is allowed through.
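
The two conditions can be written down as a small predicate. This is an illustrative sketch of the rule as described here, with a hypothetical `ShouldOpen` helper; Polly's internal bookkeeping of the sampling window is more involved.

```csharp
using System;

// Hypothetical helper: both the throughput floor and the failure ratio must be met.
bool ShouldOpen(int failures, int total, double failureThreshold, int minimumThroughput) =>
    total >= minimumThroughput
    && (double)failures / total >= failureThreshold;

Console.WriteLine(ShouldOpen(5, 500, 0.01, 1000));   // False: below minimum throughput
Console.WriteLine(ShouldOpen(9, 1000, 0.01, 1000));  // False: 0.9% is under the threshold
Console.WriteLine(ShouldOpen(10, 1000, 0.01, 1000)); // True: 1% reached with enough traffic
```

The minimum-throughput check is what stops a single failed request during a quiet period from tripping the breaker.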

✅ Real-world use case: At LexCheck, this Advanced Circuit Breaker prevented a cascade failure during a traffic spike — stopping one struggling upstream service from taking down our entire email processing pipeline.

🛡️ Fallback Policy
Sometimes retries simply won't fix the problem. The Fallback policy lets you define a graceful degradation path — returning a default value or executing an alternative action:

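FallbackPolicy fallback = Policy.Handle<Exception>()
    .Fallback(() => PageAnAdmin()); // Notify admin if all else fails

// A non-generic FallbackPolicy can only run a fallback action. To substitute
// a return value, build the policy from Policy<TResult> instead.
fallback.Execute(() => GetInventory());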
Fallback actions could include:

📧 Sending an alert to your on-call engineer
📊 Returning cached/stale data
🔃 Triggering a service restart
📝 Logging the failure for analysis
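
For the "return cached/stale data" case, the policy needs to be generic so it can supply a substitute value. A minimal sketch, assuming Polly v7 and using hypothetical stand-ins (`GetInventoryLive`, `cachedInventory`) for the real lookup and cache:

```csharp
using System;
using Polly;

// Hypothetical stand-in for a live lookup that is currently failing.
int GetInventoryLive() => throw new TimeoutException("inventory service unavailable");

// Hypothetical last known good value.
int cachedInventory = 42;

var fallback = Policy<int>
    .Handle<Exception>()
    .Fallback(cachedInventory); // substitute value returned when the call throws

int inventory = fallback.Execute(() => GetInventoryLive());
Console.WriteLine(inventory); // 42
```

The caller gets a usable number instead of an exception, so pair this with logging or an alert if you need to know that stale data was served.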

🔗 Policy Wraps — Combining Multiple Policies
The real power of Polly comes from combining policies. A Policy Wrap lets you stack them together:

var wrapPolicy = Policy.Wrap(
    fallbackPolicy,   // Outermost: catch everything
    retryPolicy,      // Middle: retry before giving up
    timeoutPolicy     // Innermost: enforce time limits
);

wrapPolicy.Execute(() => SomeMethod());

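Recommended wrap order for HTTP calls (outermost to innermost):
Fallback → Circuit Breaker → Retry → Timeout
With this order, the timeout applies to each individual attempt, and a timed-out attempt reaches the retry policy as a TimeoutRejectedException (the retry policy must be configured to handle it). The circuit breaker sees the final outcome once retries are exhausted, and if everything fails, the fallback kicks in.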
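
To see the outer-to-inner ordering in action, here is a small sketch assuming Polly v7. The always-failing delegate is hypothetical; it simply lets us count how many attempts the inner retry makes before the outer fallback takes over.

```csharp
using System;
using Polly;

int attempts = 0;

// Inner policy: 2 retries, so up to 3 attempts in total.
var retry = Policy<string>
    .Handle<InvalidOperationException>()
    .Retry(2);

// Outer policy: only sees the failure once the retries are used up.
var fallback = Policy<string>
    .Handle<InvalidOperationException>()
    .Fallback("fallback value");

var wrap = Policy.Wrap<string>(fallback, retry); // first argument is outermost

string result = wrap.Execute(() =>
{
    attempts++;
    throw new InvalidOperationException("always fails");
});

Console.WriteLine($"{result} after {attempts} attempts"); // fallback value after 3 attempts
```

Swapping the two arguments would put the fallback inside the retry: the first failure would be replaced by the fallback value straight away, and the retry would never fire.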

⏱️ Timeout Policy
Not all HTTP clients or services have built-in timeouts. The Timeout policy lets you define the maximum acceptable duration for any operation:

TimeoutPolicy timeoutPolicy =
    Policy.Timeout(1, TimeoutStrategy.Pessimistic, OnTimeout);

var result = timeoutPolicy.Execute(() => ComplexAndSlowCode());

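TimeoutStrategy.Pessimistic is meant for code that does not honor a cancellation token: when the timeout elapses, Polly returns control to the caller with a TimeoutRejectedException, but the abandoned operation is not terminated and may keep running in the background. TimeoutStrategy.Optimistic (the default) is the strategy that relies on a cancellation token for co-operative cancellation.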
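
When the code being called does accept a cancellation token, the optimistic strategy is usually the lighter option. A minimal sketch, assuming Polly v7; the 200 ms limit and the 5-second `Task.Delay` are arbitrary values standing in for a slow call:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;

// TimeoutAsync is optimistic by default: it signals cancellation and expects the delegate to stop.
var timeout = Policy.TimeoutAsync(TimeSpan.FromMilliseconds(200));

string outcome;
try
{
    // The delegate receives Polly's token and passes it on, so the delay is cancelled at the limit.
    await timeout.ExecuteAsync(
        ct => Task.Delay(TimeSpan.FromSeconds(5), ct),
        CancellationToken.None);
    outcome = "completed";
}
catch (TimeoutRejectedException)
{
    outcome = "timed out";
}

Console.WriteLine(outcome); // timed out
```

If the delegate ignored `ct`, the optimistic policy would have no way to interrupt it, which is the situation the pessimistic strategy exists for.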

💾 Cache Policy
Polly's Cache policy stores the results of previous requests — either in-memory or in a distributed cache (like Redis). On a duplicate request, Polly returns the cached result instead of making another call:

var memoryCache = new MemoryCache(new MemoryCacheOptions());
var memoryCacheProvider = new MemoryCacheProvider(memoryCache);

CachePolicy<int> cachePolicy =
    Policy.Cache<int>(memoryCacheProvider, TimeSpan.FromSeconds(10));

var result = cachePolicy.Execute(
    context => QueryRemoteService(id),
    new Context($"QRS-{id}"));

Polly supports multiple TTL strategies:

Relative — 10 seconds from when it was cached
Absolute — expires at a fixed point in time
Sliding — resets on each access
Result — TTL extracted from the result itself (perfect for auth tokens!)

🧱 Bulkhead Isolation
Named after the watertight compartments in ship hulls — Bulkhead Isolation limits how many concurrent requests a specific operation can consume, preventing one slow upstream service from starving your entire application:

// Max 3 concurrent executions, queue up to 6 waiting
BulkheadPolicy bulkheadPolicy = Policy.Bulkhead(3, 6);

var result = bulkheadPolicy.Execute(() => ResourceHeavyRequest());

Once the execution slots (3) and queue slots (6) are full, additional requests are rejected immediately with a BulkheadRejectedException, keeping your application responsive for everything else.
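
Rejection is easiest to observe with deliberately tiny limits. The sketch below assumes Polly v7 and uses hypothetical limits of one execution slot and no queue; calling the policy again from inside a running execution guarantees the slot is already taken.

```csharp
using System;
using Polly;
using Polly.Bulkhead;

var bulkhead = Policy.Bulkhead(1, 0); // 1 execution slot, 0 queue slots

string inner = "not run";
bulkhead.Execute(() =>
{
    // This execution holds the only slot, so a second call has nowhere to go.
    try
    {
        bulkhead.Execute(() => { });
        inner = "accepted";
    }
    catch (BulkheadRejectedException)
    {
        inner = "rejected";
    }
});

Console.WriteLine(inner); // rejected
```

In real code the competing calls come from other threads or requests, and the caller typically catches the rejection and returns a "busy, try again later" response.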

When Should You Use the Retry Pattern?
✅ Use Retry for:

HTTP calls to external APIs
Database connection attempts
Network socket operations
Message queue interactions
Any idempotent operation whose failures are transient

❌ Avoid Retry for:

Non-idempotent operations — inserting a payment record twice = duplicate charges
Permanent failures — a 404 Not Found won't be fixed by retrying
Validation errors — bad input data won't become valid after 3 retries
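
One way to apply these rules to HTTP is to retry only on status codes that suggest a temporary problem, instead of on every non-success response. The `IsTransient` helper below is a hypothetical heuristic for illustration; which codes count as transient depends on the API you are calling.

```csharp
using System;
using System.Net;

// Hypothetical heuristic: timeouts, throttling, and server-side errors are worth retrying.
bool IsTransient(HttpStatusCode status) =>
    status == HttpStatusCode.RequestTimeout // 408
    || (int)status == 429                   // Too Many Requests
    || (int)status >= 500;                  // server-side errors

Console.WriteLine(IsTransient(HttpStatusCode.ServiceUnavailable)); // True
Console.WriteLine(IsTransient(HttpStatusCode.NotFound));           // False
Console.WriteLine(IsTransient(HttpStatusCode.BadRequest));         // False
```

A predicate like this can replace `r => !r.IsSuccessStatusCode` in a `HandleResult` clause, so that a 404 or a validation error fails fast instead of being retried.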

Complete Resilience Pipeline Example
Here's a production-ready pipeline combining all key policies for an HTTP service:

using System.Net;
using System.Net.Http;
using System.Threading;
using Polly;
using Polly.CircuitBreaker;
using Polly.Timeout;

// 1. Timeout: abort any single attempt that takes more than 10 seconds.
//    Optimistic by default, so the delegate must honor the cancellation token.
var timeoutPolicy = Policy.TimeoutAsync<HttpResponseMessage>(10);

// 2. Retry: up to 3 retries with exponential backoff (2s, 4s, 8s)
var retryPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .Or<HttpRequestException>()
    .Or<TimeoutRejectedException>() // without this, a timed-out attempt is not retried
    .WaitAndRetryAsync(3, attempt =>
        TimeSpan.FromSeconds(Math.Pow(2, attempt)));

// 3. Circuit Breaker: open on 50% failure rate
var circuitBreakerPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .Or<HttpRequestException>()
    .Or<TimeoutRejectedException>()
    .AdvancedCircuitBreakerAsync(0.5,
        TimeSpan.FromSeconds(30), 20, TimeSpan.FromSeconds(15));

// 4. Fallback: return a default response on total failure
var fallbackPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .Or<BrokenCircuitException>()
    .Or<HttpRequestException>()
    .Or<TimeoutRejectedException>()
    .FallbackAsync(new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = new StringContent("{ \"fallback\": true }")
    });

// 5. Wrap them all together (first argument is outermost)
var resiliencePolicy = Policy.WrapAsync(
    fallbackPolicy,
    circuitBreakerPolicy,
    retryPolicy,
    timeoutPolicy
);

// Execute your HTTP call with full resilience.
// Passing ct to GetAsync is what lets the timeout actually cancel the request.
HttpResponseMessage response = await resiliencePolicy.ExecuteAsync(
    ct => httpClient.GetAsync("/api/resource", ct),
    CancellationToken.None);

Conclusion
Building resilient .NET applications doesn't have to be complicated. Polly gives you a clean, expressive API to handle the messy reality of distributed systems — where things will fail.
Here's a quick summary of when to use each policy:

Policy	Use When
Retry	Transient, self-healing failures
Circuit Breaker	Upstream service is consistently failing
Fallback	You need graceful degradation
Policy Wrap	Combining multiple strategies
Timeout	Operations have no built-in time limit
Cache	Repeated read requests to slow services
Bulkhead	Isolating resource-heavy operations

What's Next?

📖 Official Polly Documentation
🔗 Microsoft's guide on HTTP resilience with Polly

Have questions or want to share how you're using Polly in production? Drop a comment below! 👇
If this helped you, consider sharing it with your team — resilience patterns save production systems at the worst possible moments.

About the Author
I'm Izma Anwar, a Senior .NET & Full Stack Engineer with 6+ years building enterprise-grade microservices, cloud-native applications, and scalable backend systems. I write about .NET, C#, AWS, Azure, and software architecture.
🔗 LinkedIn | 🌐 izmaanwar.com | 💻 GitHub
