What is Rate Limiting?
Rate limiting is a technique that controls how frequently an action (like an API request) can be performed within a specific time frame to prevent overuse or abuse of a system.
Different Terms, Different Meanings: Rate Limiting vs. Throttling
In practice, the terms “throttling” and “rate limiting” are often used interchangeably, but they focus on slightly different aspects of controlling traffic:
- Rate limiting typically sets a hard cap on how many requests or actions can occur in a given time window. If the limit is exceeded, additional requests are rejected or delayed until the next interval.
- Throttling is generally more dynamic: it slows down or adjusts the flow of requests (rather than outright blocking them) so that the system can stay within safe operating limits and maintain performance. Throttling often happens in real time, deciding how fast or slow requests should be allowed through.
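The distinction can be sketched in a few lines of Python: a fixed-window rate limiter rejects requests beyond a hard cap, while a throttler never rejects but delays callers to a steady pace. (The class names and numbers below are illustrative, not a production implementation.)

```python
import time

class RateLimiter:
    """Hard cap: allow at most `limit` requests per fixed window; reject the rest."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.window_start = time.monotonic()
        self.count = 0

    def allow(self):
        now = time.monotonic()
        if now - self.window_start >= self.window:
            # New window: reset the counter
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # rejected until the next window starts


class Throttler:
    """Throttle: never reject, but sleep as needed to cap the request rate."""
    def __init__(self, rate_per_second):
        self.min_interval = 1.0 / rate_per_second
        self.last = 0.0

    def acquire(self):
        now = time.monotonic()
        wait = self.min_interval - (now - self.last)
        if wait > 0:
            time.sleep(wait)  # slow the caller down instead of rejecting
        self.last = time.monotonic()


limiter = RateLimiter(limit=3, window_seconds=1)
print([limiter.allow() for _ in range(5)])  # [True, True, True, False, False]
```

The rate limiter gives callers an immediate yes/no (mapping naturally to an HTTP 429 or 503), while the throttler smooths traffic by making callers wait.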
Layers of Kubernetes for Rate Limiting
- VirtualService: A VirtualService is an Istio custom resource (CRD) used in Kubernetes environments to define how network traffic is routed to services within a service mesh. It allows you to configure rules for routing, traffic splitting, retries, timeouts, and more, providing fine-grained control over how requests reach and flow between the underlying services.
- Sidecar: A sidecar in Kubernetes is a supplementary container running alongside the main application container within the same Pod. It shares resources like storage and network with the main container and typically provides auxiliary features—such as logging, monitoring, or proxying—that extend or enhance the primary application's functionality without altering the core application code.
What to Choose and When for Rate Limiting?
- VirtualService
  - Limits requests at the Kubernetes cluster level
  - Serves a rate-limiting response before traffic even reaches your application environment
  - Centralized/global control: if you want to enforce traffic policies across multiple pods or services in a consistent way, configuring rate limits at the VirtualService level gives you higher-level, more global enforcement
- Sidecar
  - Requests are allowed through until they reach your sidecar container, which then decides their fate
  - Throttling of requests (slowing them down rather than rejecting them) can be done here
  - Granular/per-pod control: if you need more nuanced control (for instance, different rate limits per pod, or specialized logic that depends on local metrics), implementing rate limiting at the sidecar (e.g., using Envoy filters) gives you that fine-grained approach
  - Local enforcement: rate limiting at the sidecar level ensures that local spikes in traffic are handled directly at each pod, rather than routing every decision through a centralized policy. This can reduce latency and give more immediate control over local resources
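As a preview of the sidecar approach, Istio typically enables Envoy's local rate limit HTTP filter on a workload's inbound listener via an EnvoyFilter. Below is a minimal sketch assuming a workload labeled `app: my-service`; the name, label, and token-bucket numbers are illustrative placeholders, and the exact filter configuration should be checked against the Istio/Envoy versions you run:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: local-ratelimit-sidecar   # illustrative name
spec:
  workloadSelector:
    labels:
      app: my-service             # assumed workload label
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.local_ratelimit
        typed_config:
          "@type": type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
          value:
            stat_prefix: http_local_rate_limiter
            token_bucket:
              max_tokens: 100       # burst capacity
              tokens_per_fill: 100  # refill amount
              fill_interval: 60s    # i.e., ~100 requests per minute per pod
```

Because the token bucket lives in each sidecar, the limit is enforced per pod, which is exactly the granular/local behavior described above.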
How?
VirtualService
- Return an HTTP error code using fault injection
- You can configure a VirtualService rule that responds with an error (for example, HTTP 503) for 100% of requests, without forwarding them to a real destination. This approach "blackholes" the traffic by immediately aborting the request.
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: blackhole-example
spec:
  hosts:
  - my-service.default.svc.cluster.local
  http:
  - fault:
      abort:
        httpStatus: 503   # The HTTP status code to return
        percentage:
          value: 100      # 100% of traffic is aborted
    route:
    - destination:
        host: my-service  # Typically references a real service, but traffic won't reach it
```
- Omit the route destination altogether
- Another way is to define a rule that simply does not specify a valid route. Envoy (the underlying Istio data plane) will have no valid upstream cluster to forward requests to, resulting in blackhole behavior.
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: blackhole-example
spec:
  hosts:
  - my-service.default.svc.cluster.local
  http:
  - match:
    - uri:
        prefix: "/"
    route: []   # No routes => effectively blackholed
```
- Blackhole X% of Traffic
- Below is a simple example showing how you can “blackhole” (abort) a percentage of traffic at the VirtualService level in Istio, while allowing the remaining requests to be routed normally. This uses fault injection (with abort) to return an HTTP error for a certain percentage of requests, and routes the rest to the service.
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: partial-blackhole
spec:
  hosts:
  - my-service.default.svc.cluster.local
  http:
  - name: "partial-blackhole-rule"
    fault:
      abort:
        httpStatus: 503   # Return HTTP 503 for "blackholed" requests
        percentage:
          value: 10       # Blackhole 10% of requests
    route:
    - destination:
        host: my-service  # The remaining 90% of traffic goes here
```
- Path-based blackholing of X% of traffic
- Below is an example VirtualService that “blackholes” a certain percentage of requests only for a specific path (e.g., "/blackhole") and routes the rest of the traffic normally. In this example, half of the requests (50%) to "/blackhole" get an immediate 503 response, while all other requests go through without fault injection.
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: partial-blackhole-by-path
spec:
  hosts:
  - my-service.default.svc.cluster.local
  http:
  - name: blackhole-specific-path
    match:
    - uri:
        prefix: "/blackhole"
    fault:
      abort:
        httpStatus: 503
        percentage:
          value: 50       # 50% of requests to /blackhole are "blackholed"
    route:
    - destination:
        host: my-service
  - name: other-traffic
    route:
    - destination:
        host: my-service
```
Sidecar-based throttling and rate limiting will be discussed in a separate post.