Suave Bajaj

Kubernetes Autoscaling with KEDA: Event-Driven Scaling for Queues and Microservices - Part 1

KEDA & Event-Driven Autoscaling: Solving Queue Backlogs in Kubernetes

If you've ever run Kafka or RabbitMQ consumers in Kubernetes, you know the pain: your consumers are either underutilized or overloaded, and CPU-based HPA just doesn’t cut it.

Enter KEDA — Kubernetes-based Event-Driven Autoscaling (GitHub: https://github.com/kedacore/keda).


What is KEDA?

KEDA is a lightweight autoscaler that scales your Kubernetes workloads based on external events or metrics, such as:

  • Queue length in Kafka or RabbitMQ
  • Prometheus metrics
  • Cloud events (Azure, AWS, GCP)

Unlike the HPA, which scales on CPU/memory, KEDA scales your pods dynamically based on the actual workload.
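
For instance, a trigger is just a block in the ScaledObject spec. A minimal Prometheus trigger might look like this (a sketch; the server address and query here are assumptions, not from this post):

triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus.monitoring.svc:9090   # assumed Prometheus endpoint
    query: sum(rate(http_requests_total[2m]))              # assumed PromQL query
    threshold: "100"                                       # target value per replica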

Important: KEDA itself runs as a Deployment in your Kubernetes cluster.

  • It’s not a StatefulSet or a special Deployment of your app; it’s a standalone operator.
  • Like any other Deployment, it consumes CPU and memory, and its usage grows with the number of ScaledObjects and triggers you define.
  • This means that if you define many queues, KEDA’s own Deployment will need more resources.
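
You can inspect KEDA’s own footprint directly (assuming the default install into the keda namespace):

# KEDA's operator and metrics server run as ordinary Deployments
kubectl get deployments -n keda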

How KEDA Works

KEDA introduces a new Kubernetes object called a ScaledObject:

  • ScaledObject → Deployment mapping: tells KEDA which workload to scale.
  • Each ScaledObject can have one or more triggers.
  • Scaling logic:
    • OR logic across triggers (if any trigger exceeds threshold → scale).
    • Cannot independently scale subsets of pods within the same Deployment.

Example: RabbitMQ ScaledObject

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer-scaler
spec:
  scaleTargetRef:
    name: orders-consumer        # the Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: rabbitmq
    metadata:
      queueName: orders-queue
      host: amqp://guest:guest@rabbitmq:5672/
      queueLength: "100"         # target messages per replica

Explanation:

  • The orders-consumer Deployment scales between 1 and 10 pods depending on queue length.
  • KEDA targets roughly 100 messages per pod, so replicas are added as the queue grows past multiples of 100 (and removed as it drains).
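
Note: hardcoding credentials in the host field is fine for a demo, but KEDA also lets you reference a Secret through a TriggerAuthentication. A minimal sketch (the Secret name and key are assumptions):

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
spec:
  secretTargetRef:
  - parameter: host          # fills the trigger's host field
    name: rabbitmq-secret    # assumed Secret containing the AMQP URL
    key: host

The trigger then replaces the inline host with an authenticationRef:

  triggers:
  - type: rabbitmq
    metadata:
      queueName: orders-queue
      queueLength: "100"
    authenticationRef:
      name: rabbitmq-auth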


How KEDA Determines the Number of Pods

KEDA computes the desired replica count from the queue metric and the thresholds you define:

  1. Threshold value (queueLength)

    • The target metric value per pod (here, queue depth per replica).
  2. Min and max replicas

minReplicaCount: 1
maxReplicaCount: 10

  3. Desired replicas calculation (simplified):

desiredReplicas = ceil(currentQueueLength / queueLengthThreshold)

  4. Scaling applied via HPA

KEDA updates the Deployment’s replicas between minReplicaCount and maxReplicaCount.
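
Under the hood, KEDA creates and manages an HPA per ScaledObject (named keda-hpa-<scaledobject-name>), and that HPA performs the actual replica adjustment. For the example above:

kubectl get hpa keda-hpa-orders-consumer-scaler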


Visual Example

Imagine orders-queue has a threshold of 100 messages:

Queue Depth → Desired Replicas → Deployment Pods
------------------------------------------------
50          → 1                 → 1 pod (minReplicaCount)
120         → 2                 → 2 pods
250         → 3                 → 3 pods
1050        → 11                → 10 pods (maxReplicaCount)

Diagram: Queue Depth → Desired Replicas → Deployment Scaling

[Queue Depth: 250]
        │
        ▼
[Threshold: 100 per pod]
        │
        ▼
[Desired Replicas: ceil(250/100) = 3]
        │
        ▼
[orders-consumer Deployment scales to 3 pods]

Limitations of KEDA

While KEDA is powerful, it has some critical limitations:

N queues → N consumers is not supported natively

  • You cannot map multiple queues to different consumer Deployments in a single ScaledObject; each ScaledObject targets exactly one workload.
  • Scaling logic is OR across triggers, so all pods of the target Deployment scale together.

Not ideal for microservices with multiple queues and workflows

  • Example: if you have 10 queues with different processing logic/endpoints, KEDA alone cannot scale them independently.
  • You’d need multiple Deployments plus multiple ScaledObjects, which quickly becomes cumbersome (a sketch follows below).
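
For illustration, scaling just two queues independently already means two Deployments and two ScaledObjects (all names here are hypothetical):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-a-scaler
spec:
  scaleTargetRef:
    name: queue-a-consumer       # Deployment #1
  triggers:
  - type: rabbitmq
    metadata:
      queueName: queue-a
      host: amqp://guest:guest@rabbitmq:5672/
      queueLength: "100"
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-b-scaler
spec:
  scaleTargetRef:
    name: queue-b-consumer       # Deployment #2
  triggers:
  - type: rabbitmq
    metadata:
      queueName: queue-b
      host: amqp://guest:guest@rabbitmq:5672/
      queueLength: "100"
# ...and so on: 10 queues means 10 Deployments plus 10 ScaledObjects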


Resource consumption

  • KEDA itself is a Deployment; each ScaledObject and trigger adds to its CPU/memory usage.
  • Scaling many queues increases KEDA’s resource footprint on the cluster.
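
You can watch this grow as you add ScaledObjects (assuming metrics-server is installed and the default keda namespace):

kubectl top pods -n keda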

Kubectl Example

After creating multiple ScaledObjects:

kubectl get scaledobjects

Example output:

NAME                       SCALETARGETKIND      SCALETARGETNAME     MIN   MAX   TRIGGERS
orders-consumer-scaler     apps/v1.Deployment   orders-consumer      1    10    rabbitmq
payments-consumer-scaler   apps/v1.Deployment   payments-consumer    1     5    rabbitmq
kafka-consumer-scaler      apps/v1.Deployment   kafka-consumer       1    15    kafka

Each ScaledObject targets a single Deployment.

Multiple triggers for the same Deployment scale all pods together (OR logic).
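
To make the OR behavior concrete, here is a sketch of one ScaledObject with two triggers (names and thresholds are assumptions); if either source crosses its threshold, the entire worker Deployment scales:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker                 # one Deployment; all its pods scale together
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: rabbitmq               # trigger 1: RabbitMQ queue depth
    metadata:
      queueName: queue-a
      host: amqp://guest:guest@rabbitmq:5672/
      queueLength: "50"
  - type: kafka                  # trigger 2: Kafka consumer lag
    metadata:
      bootstrapServers: kafka:9092
      consumerGroup: worker-group
      topic: topic-b
      lagThreshold: "50"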


Takeaways

  • KEDA is great for simple event-driven scaling.
  • Each ScaledObject → one Deployment.
  • OR logic for multiple triggers → not suitable for microservices with multiple queues and workflows.
  • KEDA computes replicas automatically using the metric and threshold, scaling Deployment pods between min and max.
  • Resource footprint increases with more ScaledObjects and triggers.

Part 2 will show how to build a custom autoscaler that can handle multiple queues, multiple consumers, and independent scaling logic for complex microservices.


Pro Tip: Start with KEDA for simple queue scaling. For complex, multi-queue microservices, a custom HPA/KEDA solution is usually required.
