DEV Community

James Lee

Traffic Management in Istio: Circuit Breaking & Rate Limiting with DestinationRule

One of Istio's most powerful features is the ability to configure circuit breaking and rate limiting declaratively, without touching a single line of application code. Both are controlled through a single CRD: DestinationRule.

In this article, I'll break down exactly how connectionPool (rate limiting) and outlierDetection (circuit breaking) work, with real YAML examples.


The Entry Point: TrafficPolicy in DestinationRule

Both circuit breaking and rate limiting are configured under the trafficPolicy field of a DestinationRule:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-service-dr
spec:
  host: my-service
  trafficPolicy:
    connectionPool:    # ← Rate limiting config
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    outlierDetection:  # ← Circuit breaking config
      consecutiveErrors: 5
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 10

Part 1: Rate Limiting with connectionPool

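connectionPool controls how many concurrent connections and requests are allowed to reach an upstream service. Strictly speaking, these are caps on concurrency rather than on requests per second: when a limit is exceeded, the sidecar rejects the request immediately (a 503 for HTTP traffic) instead of forwarding it. It has two sub-sections: tcp and http.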

TCP Settings

Field | Description | Default
maxConnections | Max number of HTTP/1.1 or TCP connections to the destination | 2^32-1 (effectively unlimited)
connectTimeout | TCP connection timeout | 10s

⚠️ Important: maxConnections only limits HTTP/1.1 and TCP connections. It does not meaningfully constrain HTTP/2 traffic, because HTTP/2 multiplexes many requests over a single connection. To cap HTTP/2 load, use http2MaxRequests.
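
To make the HTTP/2 point concrete, here is a minimal sketch for a gRPC-style upstream. The names grpc-backend and grpc-backend-dr and the value 200 are illustrative assumptions, not taken from the example repository. The request cap, not maxConnections, is what bounds the load here:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: grpc-backend-dr          # hypothetical name
spec:
  host: grpc-backend             # hypothetical HTTP/2 (gRPC) service
  trafficPolicy:
    connectionPool:
      http:
        # Requests are multiplexed over very few connections,
        # so cap concurrent requests rather than connections.
        http2MaxRequests: 200    # illustrative value, tune per workload
```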

HTTP Settings

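Field | Description | Default
http1MaxPendingRequests | Max requests queued while waiting for a ready connection | 1024 in older releases; the current Istio reference lists 2^32-1
http2MaxRequests | Max concurrent requests to the destination (despite the name, the Istio reference notes it also applies to HTTP/1.1) | 1024 in older releases; the current Istio reference lists 2^32-1
maxRequestsPerConnection | How many requests can reuse one connection. Set to 1 to disable keepalive | 0 (unlimited)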
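
As a small illustration of the last row, this fragment disables keepalive by allowing exactly one request per connection. It is a sketch of the setting in isolation; whether the extra connection setup cost is acceptable depends on your workload:

```yaml
trafficPolicy:
  connectionPool:
    http:
      # One request per connection: every request opens a fresh
      # connection, so keepalive is effectively turned off.
      maxRequestsPerConnection: 1
```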

Full connectionPool Example

trafficPolicy:
  connectionPool:
    tcp:
      maxConnections: 100
      connectTimeout: 3s
    http:
      http1MaxPendingRequests: 100
      http2MaxRequests: 1000
      maxRequestsPerConnection: 10

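What this does: Allows at most 100 concurrent TCP connections to the service, queues at most 100 pending HTTP/1.1 requests, and allows up to 1000 concurrent HTTP/2 requests. Each TCP connection serves up to 10 requests before being closed and replaced. Keep in mind that these limits are enforced independently by each calling sidecar proxy, not as a single global budget across the mesh.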
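
To see the limits trip, it can help to apply a deliberately tight pool in a test namespace and send a few concurrent requests. The sketch below assumes a test service called httpbin and a rule name httpbin-tight; both are illustrative. With these values, anything beyond one active connection and one queued request should be rejected with a 503, which Envoy's access log marks with the UO (upstream overflow) response flag:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin-tight            # hypothetical name, for testing only
spec:
  host: httpbin                  # assumed test service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1        # a single upstream connection
      http:
        http1MaxPendingRequests: 1    # a single queued request
        maxRequestsPerConnection: 1   # no connection reuse
```

These numbers are far too small for real traffic; they only exist to make the overflow behavior easy to observe.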


Part 2: Circuit Breaking with outlierDetection

outlierDetection implements the circuit breaker pattern at the Envoy level. When a service instance starts returning errors, Istio automatically ejects it from the load balancing pool.

How It Works

Request → Envoy → Load Balancer Pool
                   ├── Instance A (healthy) ✅
                   ├── Instance B (healthy) ✅
                   └── Instance C (5xx × 5) ❌ → Ejected for 30s

When Instance C is ejected, traffic is only routed to A and B. After baseEjectionTime, Instance C gets a chance to re-enter the pool.

Configuration Fields

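Field | Description | Default
consecutiveErrors | Number of consecutive errors before ejection; for HTTP, 502/503/504 responses count as errors. Deprecated in newer Istio releases in favor of consecutive5xxErrors and consecutiveGatewayErrors | 5
interval | Time between ejection sweep analyses | 10s
baseEjectionTime | Minimum ejection duration. Actual time = baseEjectionTime × ejection count | 30s
maxEjectionPercent | Max % of instances that can be ejected at once | 10%
minHealthPercent | If healthy instances drop below this %, circuit breaking is disabled entirely | 50% in older releases; the current Istio reference lists 0%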
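
On newer Istio releases, the error threshold is expressed with consecutive5xxErrors or consecutiveGatewayErrors instead of consecutiveErrors. A minimal sketch of the equivalent configuration, assuming your Istio version supports these fields (check the reference for the version you run):

```yaml
trafficPolicy:
  outlierDetection:
    consecutive5xxErrors: 5         # any 5xx response counts toward ejection
    # consecutiveGatewayErrors: 5   # alternative: only 502/503/504 count
    interval: 10s
    baseEjectionTime: 30s
    maxEjectionPercent: 10
```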

Full outlierDetection Example

trafficPolicy:
  outlierDetection:
    consecutiveErrors: 5
    interval: 10s
    baseEjectionTime: 30s
    maxEjectionPercent: 10
    minHealthPercent: 50

Key Design Decisions Worth Understanding

1. Progressive Ejection Time

The actual ejection time is not fixed; it grows with each ejection:

1st ejection: 30s × 1 = 30s
2nd ejection: 30s × 2 = 60s
3rd ejection: 30s × 3 = 90s

This is intentional: a repeatedly failing instance is kept out of the pool for longer and longer periods, giving it more time to recover.

2. The Safety Net: maxEjectionPercent + minHealthPercent

These two fields work together to prevent a cascade failure from taking down your entire service:

  • maxEjectionPercent: 10 caps ejections: even if 50 instances are all returning 5xx errors, at most 10% of them (5 instances) are ejected at any one time.
  • minHealthPercent: 50 is the cutoff: if healthy instances drop below 50%, Istio disables outlier detection entirely and load balances across all instances, including unhealthy ones.

Why disable circuit breaking when too many instances are unhealthy?
Because at that point, the problem is likely systemic (e.g., a bad deployment, downstream dependency failure). Ejecting more instances would only make things worse. It's better to let all instances handle traffic and surface the real error.
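
As a rough worked example of how the two percentages translate into instance counts, assume a hypothetical service with 20 replicas. The replica count and the 20% value are illustrative and not from the original configuration:

```yaml
trafficPolicy:
  outlierDetection:
    # 20 replicas x 20% = at most 4 instances ejected at the same time
    maxEjectionPercent: 20
    # 20 replicas x 50% = outlier detection stays active only while
    # at least 10 instances are healthy; below that, all 20 receive
    # traffic again
    minHealthPercent: 50
```

With small replica counts, low percentages round to very few instances, so it is worth doing this arithmetic for your own deployment size before settling on values.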


Putting It All Together

Here's a DestinationRule combining both features. Treat the values as a starting point and tune them for your own workload:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-dr
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
        connectTimeout: 3s
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutiveErrors: 5
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 10
      minHealthPercent: 50

Summary

Feature | Config Field | Mechanism
Rate Limiting | connectionPool.tcp | Limit TCP connections
Rate Limiting | connectionPool.http | Limit concurrent HTTP requests
Circuit Breaking | outlierDetection.consecutiveErrors | Eject on N consecutive errors
Safety Net | outlierDetection.maxEjectionPercent | Cap max ejected instances
Safety Net | outlierDetection.minHealthPercent | Disable CB when cluster is too unhealthy

The beauty of Istio's approach: all of this happens at the proxy layer. Your application code doesn't need to implement Hystrix, Resilience4j, or any circuit breaker library. The infrastructure handles it.


💻 Explore the full implementation:
github.com/muzinan123/servicemesh

📖 Next in this series: Full Observability in Istio: Metrics + Distributed Tracing
