In Part 1, we explored KEDA and how it scales your consumers based on queue depth.
But what if:

- You have N queues feeding M consumers
- Each queue has its own threshold and min/max replica counts
- Each consumer runs a different workflow or endpoint

KEDA alone can't handle this cleanly. That's where a custom autoscaler comes in.
## Problem Statement
Example scenario:
| Queue | Consumer Deployment | Threshold | Min Pods | Max Pods |
|---|---|---|---|---|
| x-queue-1 | consumer-x-1 | 100 | 1 | 5 |
| x-queue-2 | consumer-x-2 | 50 | 2 | 6 |
| y-queue | consumer-y | 200 | 1 | 10 |
- Each queue has its own processing logic.
- Scaling decisions must be independent.
- Min/max replicas live in a database, so scaling policies can change without a redeploy (a sketch of that table follows below).
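To make the DB side concrete, here is one way a `scaling_config` table matching the query used later in this post could be created and seeded with the rows above. This is a minimal sketch assuming SQLite; any relational store works the same way.

```python
import sqlite3

# Minimal sketch: create and seed the scaling_config table that the
# autoscaler below queries. SQLite is assumed purely for illustration.
conn = sqlite3.connect("consumer_scaling.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS scaling_config (
        consumer_name TEXT PRIMARY KEY,  -- Deployment name to scale
        queue_name    TEXT NOT NULL,     -- queue whose depth drives scaling
        threshold     INTEGER NOT NULL,  -- messages per replica
        min_pods      INTEGER NOT NULL,
        max_pods      INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT OR REPLACE INTO scaling_config VALUES (?, ?, ?, ?, ?)",
    [
        ("consumer-x-1", "x-queue-1", 100, 1, 5),
        ("consumer-x-2", "x-queue-2", 50, 2, 6),
        ("consumer-y", "y-queue", 200, 1, 10),
    ],
)
conn.commit()
conn.close()
```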
## Architecture Overview
```
        ┌───────────────────────┐
        │       Producers       │
        │  (apps push messages  │
        │      to queues)       │
        └───────────┬───────────┘
                    │
                    ▼
        ┌───────────────────────┐
        │    Message Broker     │
        │   (Kafka / RabbitMQ)  │
        └───────────┬───────────┘
                    │
         ┌──────────┴──────────┐
         ▼                     ▼
┌─────────────────────┐   ┌─────────────────────┐
│   Python Exporter   │◄─►│ Prometheus metrics  │
│ - reads queue depth │   │       storage       │
│ - reads min/max     │   └─────────────────────┘
│   from DB           │
│ - applies scaling   │
│   logic per queue   │
│ - calls Kubernetes  │
│   API to scale      │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────────┐
│  Consumer Deployments   │
│  - consumer-x-1         │
│  - consumer-x-2         │
│  - consumer-y           │
└─────────────────────────┘
```
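The "Python Exporter" box does two jobs: it publishes queue depth as a Prometheus metric and runs the scaling loop. As a minimal sketch of the metrics half, here is how queue depth could be exposed with the prometheus_client library. `get_queue_depth` is a hypothetical placeholder for your broker's real API (RabbitMQ's management API, Kafka consumer-group lag, and so on).

```python
import time
from prometheus_client import Gauge, start_http_server

def get_queue_depth(queue_name: str) -> int:
    # Hypothetical placeholder: wire this to your broker
    # (e.g. RabbitMQ management API or Kafka consumer-group lag).
    return 0

# Gauge labelled by queue, matching the queue_depth{queue="..."}
# metric queried from Prometheus later in this post.
QUEUE_DEPTH = Gauge("queue_depth", "Messages waiting in a queue", ["queue"])

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://<pod>:8000/metrics
    queues = ["x-queue-1", "x-queue-2", "y-queue"]
    while True:
        for q in queues:
            QUEUE_DEPTH.labels(queue=q).set(get_queue_depth(q))
        time.sleep(15)  # align roughly with your Prometheus scrape interval
```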
## Python Example: Custom Scaling with DB Min/Max

```python
from kubernetes import client, config
import requests
import sqlite3  # Example DB; replace with your real DB

# Load in-cluster config (use config.load_kube_config() when testing locally)
config.load_incluster_config()
apps_v1 = client.AppsV1Api()

# Connect to the database containing min/max per consumer
conn = sqlite3.connect("consumer_scaling.db")
cursor = conn.cursor()

# Read the scaling config for all consumers
cursor.execute(
    "SELECT consumer_name, queue_name, threshold, min_pods, max_pods FROM scaling_config"
)
scaling_rules = cursor.fetchall()

# Prometheus endpoint
prometheus_url = "http://prometheus:9090/api/v1/query"

for consumer_name, queue_name, threshold, min_pods, max_pods in scaling_rules:
    # Query the current queue depth
    query = f'sum(queue_depth{{queue="{queue_name}"}})'
    resp = requests.get(prometheus_url, params={"query": query}).json()
    result = resp["data"]["result"]
    if not result:
        print(f"{consumer_name}: no metric for {queue_name}; skipping")
        continue
    queue_depth = float(result[0]["value"][1])

    # One pod per `threshold` messages, clamped to the DB-defined bounds
    desired_replicas = max(min_pods, min(max_pods, int(queue_depth / threshold)))

    # Scale the Deployment through the scale subresource
    apps_v1.patch_namespaced_deployment_scale(
        name=consumer_name,
        namespace="default",
        body={"spec": {"replicas": desired_replicas}},
    )
    print(f"{consumer_name}: queue={queue_depth}, threshold={threshold}, "
          f"scaled to {desired_replicas} pods")
```
## Notes
- `threshold` is per queue.
- `min_pods` and `max_pods` are read from a database, making them dynamic.
- You can extend the logic with weighted scaling, multiple metrics, or cooldown periods (see the cooldown sketch below).
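As one example of that extension, here is a minimal cooldown sketch: scale-ups apply immediately, but scale-downs are skipped until a cooldown has elapsed, which avoids flapping on bursty queues. The `last_scaled` dict and `COOLDOWN_SECONDS` value are assumptions for illustration.

```python
import time

COOLDOWN_SECONDS = 120  # assumption: minimum gap between scale-downs
last_scaled: dict[str, float] = {}  # consumer_name -> time of last change

def apply_cooldown(consumer_name: str, current: int, desired: int) -> int:
    """Allow scale-ups immediately; delay scale-downs until the cooldown
    has elapsed since this consumer's last scaling action."""
    now = time.time()
    if desired < current and now - last_scaled.get(consumer_name, 0) < COOLDOWN_SECONDS:
        return current  # too soon to shrink; keep the current replica count
    if desired != current:
        last_scaled[consumer_name] = now
    return desired
```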
## Advantages of a Custom Autoscaler
- Independent scaling per queue
- Dynamic min/max replicas from DB — no hardcoding
- Multi-metric decisions: queue depth, CPU, DB lag, etc. (see the sketch after this list)
- Advanced logic (cooldowns, weighted scaling, prioritization)
- Can handle N queues → M consumers mapping flexibly
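As a sketch of what a multi-metric decision could look like: each signal proposes a replica count, the hungriest one wins, and the result is clamped to the DB-defined bounds. All helper names here are hypothetical illustrations; the CPU rule borrows the standard HPA utilization formula.

```python
import math

# Hypothetical helpers: each signal proposes a replica count.
def replicas_for_queue_depth(queue_depth: float, threshold: int) -> int:
    # One pod per `threshold` messages, as in the main script.
    return int(queue_depth / threshold)

def replicas_for_cpu(current_replicas: int, avg_cpu: float,
                     target_cpu: float = 0.7) -> int:
    # HPA-style formula: scale replicas by observed/target utilization.
    return math.ceil(current_replicas * (avg_cpu / target_cpu))

def desired_replicas(queue_depth: float, threshold: int,
                     current_replicas: int, avg_cpu: float,
                     min_pods: int, max_pods: int) -> int:
    # The hungriest signal wins, then clamp to the DB-defined bounds.
    wanted = max(
        replicas_for_queue_depth(queue_depth, threshold),
        replicas_for_cpu(current_replicas, avg_cpu),
    )
    return max(min_pods, min(max_pods, wanted))
```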
## Trade-offs
| Feature | KEDA | Custom Autoscaler |
|---|---|---|
| Easy setup | ✅ | ❌ (Python + DB + K8s API) |
| Independent queue scaling | ❌ | ✅ |
| Multi-metric logic | Limited | ✅ |
| DB-driven min/max | ❌ | ✅ |
| Reliability | ✅ Battle-tested | ⚠️ Managed by you |
## ✅ Takeaways
- KEDA is great for simple queue-based scaling.
- For complex microservices with multiple queues, a custom autoscaler gives you full control.
- Reading min/max replicas from a database enables dynamic, production-ready scaling policies.
- Your custom autoscaler can evolve into a full controller: a custom HPA/KEDA tailored to your architecture.
💡 Pro Tip:
Start with KEDA for simple cases. Move to a custom autoscaler with DB-defined min/max for multi-queue microservices with complex workflows.