What is Rate Limiting?
Rate limiting controls how many requests a client can make to your API within a specific time window. ASP.NET Core 7+ ships with built-in rate limiting middleware, so you don't need any third-party packages. It protects your API from abuse (DoS attacks), ensures fair usage among clients, controls infrastructure costs, and keeps the service stable under load.
Setup
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
These two namespaces are essential. System.Threading.RateLimiting contains the core algorithms and options classes. Microsoft.AspNetCore.RateLimiting contains the middleware and the [EnableRateLimiting] / [DisableRateLimiting] attributes.
Registering the Middleware
builder.Services.AddRateLimiter(options => { ... });
This registers the rate limiter service with the DI container. Inside the lambda you define one or more policies — each policy is a named rule that you can apply to endpoints later.
app.UseRateLimiter();
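This adds the middleware to the request pipeline. Call it before app.MapControllers() so requests pass through the limiter before they reach your controllers. If you call app.UseRouting() explicitly, app.UseRateLimiter() must come after it, because endpoint-specific policies are resolved from the matched endpoint.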
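Putting the two calls together, a minimal Program.cs might look like the sketch below. It assumes a controller-based project that uses the Web SDK's implicit usings, and the single policy shown is only a placeholder.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

// Register the rate limiter and define named policies.
builder.Services.AddRateLimiter(options =>
{
    // Placeholder policy; values are illustrative only.
    options.AddFixedWindowLimiter("DefaultPolicy", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.PermitLimit = 100;
    });
});

var app = builder.Build();

// Add the middleware before the endpoints are mapped.
app.UseRateLimiter();

app.MapControllers();

app.Run();
```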
Policy 1 — Fixed Window Limiter ("DefaultPolicy")
options.AddFixedWindowLimiter("DefaultPolicy", limiterOptions =>
{
    limiterOptions.Window = TimeSpan.FromMinutes(1);
    limiterOptions.PermitLimit = 100;
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    limiterOptions.QueueLimit = 10;
});
The idea: Time is divided into fixed, non-overlapping windows (e.g. 0:00–1:00, 1:00–2:00, ...). Each window gets a fresh counter. The moment the window ends, the counter resets completely regardless of when requests arrived inside it.
Line by line:
- Window = TimeSpan.FromMinutes(1): each time window lasts 1 minute. When the minute ends, the counter resets to zero.
- PermitLimit = 100: at most 100 requests are allowed within each 1-minute window. Request #101 is queued if the queue has room; otherwise it is rejected. The middleware's default rejection status is 503 Service Unavailable, so set options.RejectionStatusCode to 429 if you want Too Many Requests.
- QueueProcessingOrder = QueueProcessingOrder.OldestFirst: when the limit is hit, excess requests can be queued. This says: serve the oldest waiting request first (FIFO). The alternative is NewestFirst (LIFO).
- QueueLimit = 10: at most 10 requests can wait in the queue. If the queue is also full, the request is rejected immediately without waiting.
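To see the counter in isolation, the limiter class behind this policy can be used directly in a console app, outside the middleware. This is a minimal sketch: the limit of 3 and the QueueLimit of 0 are chosen only to keep the output short and are not the values from the policy above.

```csharp
using System;
using System.Threading.RateLimiting;

// Same algorithm as "DefaultPolicy", with a tiny limit for demonstration.
using var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    Window = TimeSpan.FromMinutes(1),
    PermitLimit = 3,
    QueueLimit = 0,          // no queueing: excess requests fail immediately
    AutoReplenishment = true
});

for (int i = 1; i <= 4; i++)
{
    // AttemptAcquire never waits; it succeeds or fails on the spot.
    using RateLimitLease lease = limiter.AttemptAcquire();
    Console.WriteLine($"request {i}: {(lease.IsAcquired ? "allowed" : "rejected")}");
}
// request 1: allowed
// request 2: allowed
// request 3: allowed
// request 4: rejected
```

Disposing a fixed-window lease does not hand the permit back; permits only return when the window rolls over.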
The problem with Fixed Window: If 100 requests arrive in the last 5 seconds of minute 1 and another 100 arrive in the first 5 seconds of minute 2, all 200 are accepted within a 10-second span, because the counter resets at the boundary no matter how recently the previous permits were used. That is twice the intended limit in a fraction of the window, and it is the problem the Sliding Window limiter addresses.
Policy 2 — Sliding Window Limiter ("SlidingWindow")
options.AddSlidingWindowLimiter("SlidingWindow", limiterOptions =>
{
    limiterOptions.Window = TimeSpan.FromMinutes(1);
    limiterOptions.PermitLimit = 100;
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    limiterOptions.QueueLimit = 10;
    limiterOptions.SegmentsPerWindow = 6;
    limiterOptions.AutoReplenishment = true;
});
The idea: Instead of one big window, the window is split into smaller segments. As each segment expires, the requests that were counted in that segment become available again. The window "slides" forward segment by segment, so limits are enforced more smoothly and bursts at window boundaries are prevented.
Line by line:
- Window = TimeSpan.FromMinutes(1): the total window is still 1 minute.
- PermitLimit = 100: still 100 requests per window overall.
- SegmentsPerWindow = 6: the 1-minute window is split into 6 segments of 10 seconds each. Every 10 seconds, the oldest segment "falls off" and its request count is freed up.
- AutoReplenishment = true: the replenishment (freeing up expired segments) happens automatically in the background. If you set this to false, you'd have to call TryReplenish() manually, which is rare.
- QueueProcessingOrder and QueueLimit: same meaning as Fixed Window above.
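Concrete example: At second 0 you send 100 requests, all counted in the first 10-second segment. Nothing is freed at second 10, 20, and so on, because the segments leaving the window at those points are empty. The 100 permits come back at second 60, when the segment that recorded them slides out of the window. If you had instead sent 40 requests in the first segment and 60 in the second, 40 permits would return at second 60 and 60 at second 70. Permits return in the same pattern they were spent, one segment at a time, rather than all at once on a fixed boundary.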
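The segment bookkeeping can be traced with a few lines of plain C#. This is a simplified model written for illustration, not the library's implementation: it ignores queueing and timers and just advances one 10-second segment per loop iteration.

```csharp
using System;

// Simplified model of a 1-minute sliding window with 6 segments.
const int permitLimit = 100;
const int segmentsPerWindow = 6;

int[] usedPerSegment = new int[segmentsPerWindow];
int current = 0;            // index of the segment currently being filled
int available = permitLimit;

// Second 0: a burst of 100 requests lands in the current segment.
usedPerSegment[current] += 100;
available -= 100;

// Advance one segment (10 seconds) at a time.
for (int tick = 1; tick <= segmentsPerWindow; tick++)
{
    current = (current + 1) % segmentsPerWindow;

    // The slot being reused held the usage from one full window ago;
    // those permits become available again.
    available += usedPerSegment[current];
    usedPerSegment[current] = 0;

    Console.WriteLine($"t={tick * 10}s available={available}");
}
// t=10s available=0
// t=20s available=0
// t=30s available=0
// t=40s available=0
// t=50s available=0
// t=60s available=100
```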
Policy 3 — Concurrency Limiter ("Concurrency")
options.AddConcurrencyLimiter("Concurrency", limiterOptions =>
{
    limiterOptions.PermitLimit = 50;
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    limiterOptions.QueueLimit = 100;
});
The idea: This doesn't care about time windows at all. It limits how many requests are being processed simultaneously at any given moment. Think of it like a semaphore — a permit is acquired when a request enters and released when it finishes.
Line by line:
- PermitLimit = 50: at most 50 requests can be actively running at the same time. If a 51st request comes in while all 50 slots are busy, it either queues or gets rejected.
- QueueLimit = 100: up to 100 requests can wait in line for a slot to free up.
- QueueProcessingOrder = QueueProcessingOrder.OldestFirst: the oldest queued request gets the next freed slot.
When to use it: Ideal for protecting expensive operations like DB-heavy endpoints or file processing, where you care about CPU/connection pool exhaustion rather than request rate over time.
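The semaphore-like behavior is easy to observe by using the limiter class directly in a console app. In this minimal sketch the limit of 2 and the QueueLimit of 0 are illustrative values, not the ones from the policy above.

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,
    QueueLimit = 0, // no queueing: a busy limiter rejects immediately
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

// Two "requests" enter and hold their permits.
RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();

// A third request finds both slots busy.
RateLimitLease third = limiter.AttemptAcquire();
Console.WriteLine(third.IsAcquired);  // False

// The first request finishes; disposing its lease releases the slot.
first.Dispose();

RateLimitLease fourth = limiter.AttemptAcquire();
Console.WriteLine(fourth.IsAcquired); // True

second.Dispose();
third.Dispose();
fourth.Dispose();
```

Unlike the window-based limiters, no amount of waiting frees a slot here; only a finished request does.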
Policy 4 — Per-User Policy ("ApiUserPolicy")
options.AddPolicy("ApiUserPolicy", httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: httpContext.User.Identity?.Name ?? "anonymous",
        factory: _ => new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 1000,
            AutoReplenishment = true
        }));
The idea: This is a partitioned policy — meaning each user gets their own independent rate limit counter. User A's requests don't affect User B's quota. This is how real-world APIs (like GitHub's API) work: each authenticated user has their own limit.
Line by line:
- RateLimitPartition.GetFixedWindowLimiter(...): creates a Fixed Window limiter, but partitioned per key rather than global.
- partitionKey: httpContext.User.Identity?.Name ?? "anonymous" uses the authenticated username as the partition key. If the user is not authenticated, all unauthenticated users share the key "anonymous", meaning they share one limit together (a deliberate design to pressure them into authenticating).
- PermitLimit = 1000: authenticated users get a generous limit of 1000 requests/minute, suitable for API consumers with tokens.
- AutoReplenishment = true: the window resets automatically without manual intervention.
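Partitioning can be tried outside ASP.NET Core with PartitionedRateLimiter.Create, which takes the same kind of partition factory. In this sketch the resource is just a username string standing in for HttpContext, the names "alice" and "bob" are made up, and the limit of 2 is chosen only to keep the output short.

```csharp
using System;
using System.Threading.RateLimiting;

// One fixed-window counter per username.
using var limiter = PartitionedRateLimiter.Create<string, string>(user =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: user,
        factory: _ => new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 2,
            QueueLimit = 0
        }));

Console.WriteLine(limiter.AttemptAcquire("alice").IsAcquired); // True
Console.WriteLine(limiter.AttemptAcquire("alice").IsAcquired); // True
Console.WriteLine(limiter.AttemptAcquire("alice").IsAcquired); // False: alice's quota is used up
Console.WriteLine(limiter.AttemptAcquire("bob").IsAcquired);   // True: bob has his own counter
```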
Policy 5 — Per-IP Policy ("IpPolicy")
options.AddPolicy("IpPolicy", httpContext =>
    RateLimitPartition.GetSlidingWindowLimiter(
        partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        factory: _ => new SlidingWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 100,
            SegmentsPerWindow = 6,
            AutoReplenishment = true
        }));
The idea: Same partitioned concept as above, but the partition key is the client's IP address instead of the username. This is useful for public endpoints where users aren't authenticated — you limit by IP to prevent one machine from hammering your API.
Line by line:
- partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown" extracts the caller's IP address as a string (e.g. "192.168.1.5"). If the IP can't be determined, it falls back to "unknown". Behind a reverse proxy or load balancer, RemoteIpAddress is the proxy's address unless forwarded headers are processed, so all clients would end up in one partition.
- Uses a Sliding Window internally (same as Policy 2): smoother enforcement, no burst problem at boundaries.
- PermitLimit = 100 with SegmentsPerWindow = 6: each IP gets 100 requests/minute, tracked in 10-second segments.
Applying Policies to Endpoints
There are two ways:
Via attribute on a Controller action:
[HttpGet]
[EnableRateLimiting(policyName: "DefaultPolicy")]
public async Task<IActionResult> Get(...) { ... }
The [EnableRateLimiting] attribute on the Get action tells the middleware: apply "DefaultPolicy" to this specific endpoint. Other actions in the same controller (like GetById, Post, Put, Delete) have no attribute, so they are not rate-limited by default.
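The attribute can also be placed on the controller class, with individual actions overriding or opting out. The sketch below uses a hypothetical ProductsController and a hypothetical Health action, and assumes the "DefaultPolicy" and "SlidingWindow" policies are registered as described earlier.

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("DefaultPolicy")] // applies to every action in this controller
public class ProductsController : ControllerBase
{
    // Inherits "DefaultPolicy" from the controller.
    [HttpGet]
    public IActionResult Get() => Ok(new[] { "example" });

    // The action-level attribute takes precedence over the controller-level one.
    [HttpGet("{id:int}")]
    [EnableRateLimiting("SlidingWindow")]
    public IActionResult GetById(int id) => Ok(id);

    // Opts this action out of rate limiting entirely.
    [HttpGet("health")]
    [DisableRateLimiting]
    public IActionResult Health() => Ok();
}
```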
Via Minimal API:
app.MapGet("/api/products-mn", async (...) =>
{
    ...
}).RequireRateLimiting("DefaultPolicy");
For Minimal APIs, you chain .RequireRateLimiting("policyName") on the endpoint definition. The second Minimal API endpoint (/api/products-mn/{productId:int}) has no .RequireRateLimiting() call, so it's unrestricted.
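When several Minimal API endpoints should share a policy, one option is to apply it to a route group and opt individual endpoints out. This is a sketch under assumptions: the handlers are placeholders, and the policy values are illustrative.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("DefaultPolicy", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.PermitLimit = 100;
    });
});

var app = builder.Build();
app.UseRateLimiter();

// Every endpoint in the group gets "DefaultPolicy".
var products = app.MapGroup("/api/products-mn")
                  .RequireRateLimiting("DefaultPolicy");

// Placeholder handlers for illustration.
products.MapGet("/", () => Results.Ok(new[] { "example" }));

// This endpoint opts out of the group's policy.
products.MapGet("/{productId:int}", (int productId) => Results.Ok(productId))
        .DisableRateLimiting();

app.Run();
```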
What Happens When the Limit is Exceeded?
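When a request is rejected (limit hit and queue full), the middleware short-circuits it before it reaches your endpoint. By default the response is HTTP 503 Service Unavailable, not 429. To return 429 Too Many Requests, either set options.RejectionStatusCode = StatusCodes.Status429TooManyRequests or set the status code yourself in options.OnRejected, which also lets you customize the rejection response globally:
options.OnRejected = async (context, cancellationToken) =>
{
    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
    await context.HttpContext.Response.WriteAsync("Too many requests. Please slow down.", cancellationToken);
};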
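A rejection handler can also tell the client when to retry. The sketch below sets the rejection status code once and adds a Retry-After header when the limiter supplies that metadata; the root endpoint and the policy values are placeholders for illustration.

```csharp
using System.Globalization;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Applies to every rejection; the default would be 503.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("DefaultPolicy", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.PermitLimit = 100;
    });

    options.OnRejected = async (context, cancellationToken) =>
    {
        // Not every limiter provides RetryAfter, so check before using it.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo);
        }

        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please slow down.", cancellationToken);
    };
});

var app = builder.Build();
app.UseRateLimiter();

// Placeholder endpoint so the policy has something to protect.
app.MapGet("/", () => "ok").RequireRateLimiting("DefaultPolicy");

app.Run();
```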
Summary of All Policies
| Policy | Algorithm | Limit | Partition By |
|---|---|---|---|
| DefaultPolicy | Fixed Window | 100 req/min | Global |
| SlidingWindow | Sliding Window | 100 req/min | Global |
| Concurrency | Concurrency | 50 simultaneous | Global |
| ApiUserPolicy | Fixed Window | 1000 req/min | Per username |
| IpPolicy | Sliding Window | 100 req/min | Per IP address |
