Suave Bajaj
Kubernetes Autoscaling with Custom Scaler: Event-Driven Scaling for Queues and Microservices – Part 2

In Part 1, we explored KEDA and how it scales your consumers based on queue depth.

But what if:

  • You have N queues → M consumers
  • Each queue has different thresholds, min, and max replicas
  • Each consumer has a different workflow / endpoint

KEDA alone can’t handle this. That’s where a custom autoscaler comes in.


Problem Statement

Example scenario:

| Queue     | Consumer Deployment | Threshold | Min Pods | Max Pods |
|-----------|---------------------|-----------|----------|----------|
| x-queue-1 | consumer-x-1        | 100       | 1        | 5        |
| x-queue-2 | consumer-x-2        | 50        | 2        | 6        |
| y-queue   | consumer-y          | 200       | 1        | 10       |

  • Each queue has its own processing logic.
  • Scaling decisions must be independent.
  • Min/Max replicas can be defined dynamically in a database for flexibility.
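To make the DB-driven part concrete, the per-consumer config could live in a simple table. A minimal sketch using SQLite (the `scaling_config` table and column names here are illustrative, chosen to match the example scenario above — swap in your real database):

```python
import sqlite3

# Illustrative schema: one row of scaling policy per consumer deployment.
conn = sqlite3.connect("consumer_scaling.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS scaling_config (
        consumer_name TEXT PRIMARY KEY,  -- Deployment name, e.g. consumer-x-1
        queue_name    TEXT NOT NULL,     -- queue this consumer drains
        threshold     INTEGER NOT NULL,  -- messages per replica
        min_pods      INTEGER NOT NULL,
        max_pods      INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT OR REPLACE INTO scaling_config VALUES (?, ?, ?, ?, ?)",
    [
        ("consumer-x-1", "x-queue-1", 100, 1, 5),
        ("consumer-x-2", "x-queue-2", 50, 2, 6),
        ("consumer-y",   "y-queue",  200, 1, 10),
    ],
)
conn.commit()
```

Because the policy is data, ops can tune thresholds or min/max at runtime with a single `UPDATE` — no redeploy of the autoscaler.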

Architecture Overview

┌───────────────────────┐
│      Producers        │
│  (Apps push messages  │
│   to queues)          │
└─────────┬─────────────┘
          │
          ▼
 ┌──────────────────┐
 │  Message Broker  │
 │  (Kafka/Rabbit)  │
 └─────────┬────────┘
           │
┌──────────┴──────────┐
▼                     ▼
┌─────────────────────┐       ┌─────────────────────┐
│  Python Exporter    │       │ Prometheus Metrics  │
│ - Reads queue depth │◄─────►│ storage             │
│ - Reads min/max     │       └─────────────────────┘
│   from DB           │
│ - Applies scaling   │
│   logic per queue   │
│ - Calls Kubernetes  │
│   API to scale      │
└─────────┬───────────┘
          │
          ▼
┌─────────────────────────────┐
│ Consumer Deployments        │
│ - pod-deployment-1          │
│ - pod-deployment-2          │
│ - pod-deployment-y          │
└─────────────────────────────┘
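The "Python Exporter" box does double duty: it publishes queue depth for Prometheus and drives the scaling loop. A minimal sketch of the metrics half using the `prometheus_client` library — note that `get_queue_depth` is a placeholder you would implement against your broker (Kafka admin API, RabbitMQ management API):

```python
import time
from prometheus_client import Gauge, start_http_server

# Per-queue depth gauge, labelled so PromQL can select by queue name,
# e.g. sum(queue_depth{queue="x-queue-1"})
QUEUE_DEPTH = Gauge("queue_depth", "Messages waiting in a queue", ["queue"])

def get_queue_depth(queue: str) -> int:
    """Placeholder: query your broker for the queue's backlog."""
    return 0

def run_exporter(queues, port=8000, interval=15):
    start_http_server(port)  # serves /metrics for Prometheus to scrape
    while True:
        for q in queues:
            QUEUE_DEPTH.labels(queue=q).set(get_queue_depth(q))
        time.sleep(interval)
```

Prometheus then scrapes this endpoint on its normal interval, and the scaling loop below reads the same metric back via the query API.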

Python Example: Custom Scaling with DB Min/Max

from kubernetes import client, config
import requests
import sqlite3  # Example DB; replace with your real DB

# Load in-cluster config (use config.load_kube_config() when running locally)
config.load_incluster_config()
apps_v1 = client.AppsV1Api()

# Connect to the database containing min/max per consumer
conn = sqlite3.connect('consumer_scaling.db')
cursor = conn.cursor()

# Read scaling config for all consumers
cursor.execute("SELECT consumer_name, queue_name, threshold, min_pods, max_pods FROM scaling_config")
scaling_rules = cursor.fetchall()
conn.close()

# Prometheus endpoint
prometheus_url = "http://prometheus:9090/api/v1/query"

for consumer_name, queue_name, threshold, min_pods, max_pods in scaling_rules:
    # Query current queue depth
    query = f'sum(queue_depth{{queue="{queue_name}"}})'
    resp = requests.get(prometheus_url, params={"query": query}, timeout=10).json()
    result = resp["data"]["result"]
    if not result:
        print(f"{consumer_name}: no metric found for {queue_name}, skipping")
        continue
    queue_depth = float(result[0]["value"][1])

    # Desired replicas: queue depth divided by threshold, clamped to [min, max]
    desired_replicas = max(min_pods, min(max_pods, int(queue_depth / threshold)))

    # Scale the Deployment via its scale subresource
    apps_v1.patch_namespaced_deployment_scale(
        name=consumer_name,
        namespace="default",
        body={"spec": {"replicas": desired_replicas}},
    )

    print(f"{consumer_name}: queue={queue_depth}, threshold={threshold}, scaled to {desired_replicas} pods")

Notes

  • threshold is per queue.
  • min_pods and max_pods are read from a database, making it dynamic.
  • You can extend logic to include weighted scaling, multiple metrics, or cooldown periods.
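As one example of the cooldown extension, here is a sketch (all names hypothetical) that lets scale-ups happen immediately but holds scale-downs until a quiet period has passed — this prevents replica flapping when queue depth oscillates around the threshold:

```python
import time

class CooldownScaler:
    """Allow scale-ups immediately; delay scale-downs by a cooldown window."""

    def __init__(self, cooldown_seconds=300, clock=time.monotonic):
        self.cooldown = cooldown_seconds
        self.clock = clock  # injectable for testing
        self.last_scale_up = {}  # consumer_name -> timestamp of last scale-up

    def decide(self, consumer, current, desired):
        now = self.clock()
        if desired > current:
            self.last_scale_up[consumer] = now
            return desired  # scale up right away
        if desired < current:
            last = self.last_scale_up.get(consumer, -float("inf"))
            if now - last < self.cooldown:
                return current  # still in cooldown: hold replicas steady
        return desired
```

In the main loop you would wrap the computed replica count: `desired_replicas = scaler.decide(consumer_name, current_replicas, desired_replicas)` before calling the Kubernetes API.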

Advantages of Custom Autoscaler

  • Independent scaling per queue
  • Dynamic min/max replicas from DB — no hardcoding
  • Multi-metric decisions (queue depth, CPU, DB lag, etc.)
  • Advanced logic (cooldowns, weighted scaling, prioritization)
  • Can handle N queues → M consumers mapping flexibly
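The multi-metric point can be made concrete with a small helper (purely illustrative): each metric suggests a replica count from its own per-replica capacity, and the most pressured metric wins, clamped to the DB-defined bounds:

```python
import math

def desired_from_metrics(metrics, min_pods, max_pods):
    """Combine several metrics into one replica count.

    metrics: list of (value, per_replica_capacity) pairs, e.g.
    [(queue_depth, threshold), (avg_cpu, target_cpu)].
    Each metric suggests ceil(value / capacity) replicas;
    the highest suggestion wins, clamped to [min_pods, max_pods].
    """
    suggestions = [math.ceil(value / capacity) for value, capacity in metrics]
    return max(min_pods, min(max_pods, max(suggestions, default=min_pods)))
```

For example, a backlog of 450 messages at threshold 100 suggests 5 replicas even if CPU alone would only justify 1 — mirroring how the HPA takes the max across its metrics.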

Trade-offs

| Feature                   | KEDA             | Custom Autoscaler          |
|---------------------------|------------------|----------------------------|
| Easy setup                | ✅               | ❌ (Python + DB + K8s API) |
| Independent queue scaling | Limited          | ✅                         |
| Multi-metric logic        | Limited          | ✅                         |
| DB-driven min/max         | ❌               | ✅                         |
| Reliability               | ✅ Battle-tested | ⚠️ Managed by you          |

✅ Takeaways

  1. KEDA is great for simple queue scaling.
  2. For complex microservices with multiple queues, custom autoscaler gives full control.
  3. Using a database for min/max replicas allows dynamic, production-ready scaling policies.
  4. Your custom autoscaler can evolve into a full HPA- or KEDA-style controller tailored to your architecture.

💡 Pro Tip:

Start with KEDA for simple cases. Move to a custom autoscaler with DB-defined min/max for multi-queue microservices with complex workflows.
