Preparing for the IETF RateLimit Header Standard in .NET

#dotnet #csharp #api #webdev

Few things are more frustrating than an integration test or production job that suddenly starts failing with 429 Too Many Requests. The typical response is to add a few Task.Delay calls, maybe exponential backoff, and hope for the best.

But what if APIs told you exactly when to slow down, before you hit the limit?

A new standard for rate limit headers

There's an IETF Internet-Draft (draft-ietf-httpapi-ratelimit-headers) that defines how APIs should communicate rate limits via HTTP headers. It's still early days and the details may change before publication, but adoption is starting: Cloudflare added support in September 2025, and more providers are likely to follow as the draft matures.

The headers look like this:

RateLimit-Policy: "default";q=100;w=60
RateLimit: "default";r=15;t=23

This tells you: "You get 100 requests per 60-second window. You have 15 remaining. The window resets in 23 seconds."
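
To make the format concrete, here is a minimal parsing sketch. It assumes a single item per header value, a quoted policy name, and integer-only parameters, exactly as in the example above; the draft allows more than that (for instance several policies in one header), which this does not handle. RateLimitHeaderSketch and TryParseItem are hypothetical names for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration only; not the API of any real package.
public static class RateLimitHeaderSketch
{
    // Parses one item such as "\"default\";r=15;t=23" into the policy name
    // and its integer parameters. Returns false for any other shape.
    public static bool TryParseItem(
        string headerValue,
        out string policyName,
        out Dictionary<string, long> parameters)
    {
        policyName = string.Empty;
        parameters = new Dictionary<string, long>(StringComparer.Ordinal);

        if (string.IsNullOrWhiteSpace(headerValue)) return false;

        string[] parts = headerValue.Split(';');

        // Assumption: the policy name is a quoted string without ';' inside it.
        string name = parts[0].Trim();
        if (name.Length < 2 || name[0] != '"' || name[name.Length - 1] != '"') return false;
        policyName = name.Substring(1, name.Length - 2);

        for (int i = 1; i < parts.Length; i++)
        {
            // Each parameter is expected to look like key=integer.
            string[] pair = parts[i].Split('=', 2);
            if (pair.Length != 2) return false;

            bool isNumber = long.TryParse(
                pair[1].Trim(), NumberStyles.None, CultureInfo.InvariantCulture, out long value);
            if (!isNumber) return false;

            parameters[pair[0].Trim()] = value;
        }

        return true;
    }
}
```

Feeding it the RateLimit value from the example yields the policy name default with r mapped to 15 and t mapped to 23. A production parser would follow the structured field rules in the draft rather than splitting strings.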

That's useful information. But .NET doesn't have built-in support for parsing these headers or acting on them. Most code just waits until it hits a 429, then reacts.

Proactive instead of reactive

RateLimitHeaders is a .NET library that parses these IETF draft headers and can automatically back off when quota gets low, before you hit a 429.

The idea is simple: if the API tells you that you have 5% of your quota left and the window resets in 30 seconds, don't fire off 50 more requests. Slow down. Spread them out.
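
One way to turn that idea into a number is to divide the time left in the window by the requests left. The sketch below does only that; it is an assumption about how pacing could work, not a description of the library's actual algorithm, and PacingSketch and ComputeDelay are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration; the library's real throttling logic may differ.
public static class PacingSketch
{
    // Returns how long to wait before the next request so that the remaining
    // requests are spread evenly over the time left in the window.
    public static TimeSpan ComputeDelay(
        long quota, long remaining, double resetSeconds, double lowThreshold)
    {
        // Without a usable quota or reset time there is nothing to pace against.
        if (quota <= 0 || resetSeconds <= 0) return TimeSpan.Zero;

        // Quota exhausted: wait for the window to reset.
        if (remaining <= 0) return TimeSpan.FromSeconds(resetSeconds);

        // Plenty of quota left: no delay needed.
        double fractionLeft = (double)remaining / quota;
        if (fractionLeft > lowThreshold) return TimeSpan.Zero;

        // Quota is low: space the remaining requests across the window.
        return TimeSpan.FromSeconds(resetSeconds / remaining);
    }
}
```

With a quota of 100, 5 requests remaining, 30 seconds until reset, and a threshold of 0.1, this returns a 6-second delay, so the last 5 requests are spread across the window instead of being spent at once.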

Getting started

Install the package:

dotnet add package RateLimitHeaders

The easiest way to use it is with IHttpClientFactory:

builder.Services.AddHttpClient("GitHubApi", client =>
{
    client.BaseAddress = new Uri("https://api.github.com");
})
.AddRateLimitAwareHandler(options =>
{
    options.EnableProactiveThrottling = true;
    options.QuotaLowThreshold = 0.1; // Start slowing down at 10% remaining
});

Your HTTP client will now:

Parse rate limit headers from every response
Track your remaining quota
Automatically introduce small delays when quota gets low
Help you avoid most 429s in the first place

Reading rate limit info directly

Sometimes you want to make decisions based on the rate limit info yourself:

var response = await httpClient.GetAsync("/api/resource");

if (response.TryGetRateLimitInfo(out var info))
{
    Console.WriteLine($"Remaining: {info.Remaining}/{info.Quota}");
    Console.WriteLine($"Resets in: {info.ResetSeconds}s");

    if (info.IsQuotaLow(0.1))
    {
        // Maybe queue this work for later instead
    }
}

Polly integration

If you're using Polly for resilience, there's a separate integration package:

dotnet add package RateLimitHeaders.Polly

It adds an AddRateLimitHeaders() extension to the resilience pipeline builder:

var pipeline = new ResiliencePipelineBuilder<HttpResponseMessage>()
    .AddRateLimitHeaders(options =>
    {
        options.EnableProactiveThrottling = true;
        options.OnQuotaLow = args =>
        {
            logger.LogWarning("Quota low: {Remaining}%", args.RemainingPercentage * 100);
            return ValueTask.CompletedTask;
        };
    })
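    .AddRetry(new RetryStrategyOptions<HttpResponseMessage>
    {
        // Polly's default predicate retries only on exceptions, so 429 responses
        // must be handled explicitly (HttpStatusCode needs `using System.Net;`).
        ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
            .HandleResult(r => r.StatusCode == HttpStatusCode.TooManyRequests)
    })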
    .Build();

Rate limit awareness sits alongside your retry and circuit breaker strategies. Proactive throttling prevents 429s; Polly handles recovery when things still go wrong.
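
For the recovery side, a 429 response often carries a standard Retry-After header saying how long to wait. The sketch below reads it using only built-in System.Net.Http types; RetryAfterSketch and GetRetryDelay are hypothetical names, and this is an illustration of the reactive fallback rather than something either package is known to do.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for illustration; not part of RateLimitHeaders or Polly.
public static class RetryAfterSketch
{
    // Returns the wait suggested by a 429 response's Retry-After header,
    // or null when the response is not a 429 or carries no such header.
    public static TimeSpan? GetRetryDelay(HttpResponseMessage response, DateTimeOffset now)
    {
        if (response.StatusCode != HttpStatusCode.TooManyRequests) return null;

        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter == null) return null;

        // Retry-After can be a number of seconds...
        if (retryAfter.Delta is TimeSpan delta) return delta;

        // ...or an absolute HTTP date, which is converted to a non-negative wait.
        if (retryAfter.Date is DateTimeOffset date)
        {
            TimeSpan wait = date - now;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }

        return null;
    }
}
```

Passing the current time in as a parameter keeps the helper easy to test. A caller could feed the result into whatever delay its retry strategy uses and fall back to exponential backoff when the method returns null.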

Why bother?

If you already have retry logic, why add this?

Retries cost time. Every 429 means waiting out a retry delay. Proactive throttling spreads that wait across many small delays, so requests keep flowing instead of stalling on a full backoff.
Some APIs penalize repeat offenders. Hit rate limits too often and some APIs will temporarily ban you or reduce your quota.
Better observability. The callbacks give you visibility into your rate limit consumption before it becomes a problem.
The standard is coming. As more APIs adopt the IETF headers, your code will already be prepared.