<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yoshik Karnawat</title>
    <description>The latest articles on DEV Community by Yoshik Karnawat (@yoshik_karnawat).</description>
    <link>https://dev.to/yoshik_karnawat</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3416078%2F1eb0d2c4-4320-41b7-a44c-35acb2dd2a0b.jpg</url>
      <title>DEV Community: Yoshik Karnawat</title>
      <link>https://dev.to/yoshik_karnawat</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yoshik_karnawat"/>
    <language>en</language>
    <item>
      <title>Kubernetes Autoscaling: Never Manually Scale Again</title>
      <dc:creator>Yoshik Karnawat</dc:creator>
      <pubDate>Sun, 17 Aug 2025 11:54:04 +0000</pubDate>
      <link>https://dev.to/yoshik_karnawat/kubernetes-autoscaling-never-manually-scale-again-5dfk</link>
      <guid>https://dev.to/yoshik_karnawat/kubernetes-autoscaling-never-manually-scale-again-5dfk</guid>
      <description>&lt;p&gt;As a Site Reliability Engineer who has managed countless production Kubernetes clusters, I've learned that one of the biggest challenges teams face is properly sizing their applications. Too little resources? Your app crashes under load. Too much? You're burning money unnecessarily. &lt;/p&gt;

&lt;p&gt;That's where Kubernetes autoscaling comes to the rescue.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Autoscaling and Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;Imagine you're running an online store. During normal hours, you might need 3 servers to handle traffic. But during Black Friday sales, you suddenly need 20 servers. Then afterward, you scale back down to save costs.&lt;/p&gt;

&lt;p&gt;Kubernetes autoscaling does exactly this - but automatically. No more 3 AM wake-up calls to manually add servers during traffic spikes.&lt;/p&gt;

&lt;p&gt;There are two main types of autoscaling in Kubernetes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal Pod Autoscaler (HPA)&lt;/strong&gt;: Adds more copies of your application&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vertical Pod Autoscaler (VPA)&lt;/strong&gt;: Gives more power (CPU/memory) to existing copies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of HPA as hiring more cashiers during busy hours, while VPA is like giving your existing cashiers faster computers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Horizontal Pod Autoscaler (HPA): Adding More Workers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How HPA Works
&lt;/h3&gt;

&lt;p&gt;HPA continuously monitors your application's resource usage. When CPU or memory usage gets too high, it automatically creates more pod copies to handle the load. When traffic decreases, it scales back down.&lt;/p&gt;

&lt;p&gt;Here's the process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Monitor&lt;/strong&gt;: HPA checks your app's metrics every 15 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calculate&lt;/strong&gt;: It determines if more or fewer pods are needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale&lt;/strong&gt;: It adds or removes pods accordingly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait&lt;/strong&gt;: It waits for the system to stabilize before making another decision&lt;/li&gt;
&lt;/ol&gt;
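
&lt;p&gt;The "Calculate" step boils down to one formula: desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). Here's that arithmetic as a quick shell sketch, with made-up numbers (3 replicas, 90% observed CPU, 70% target):&lt;br&gt;
&lt;/p&gt;

```shell
# HPA's core formula: desired = ceil(current * observed / target).
# The numbers below are illustrative; HPA reads them from live metrics.
current_replicas=3
observed_util=90   # average CPU utilization across pods (%)
target_util=70     # averageUtilization from the HPA spec
# integer ceiling division
desired=$(( (current_replicas * observed_util + target_util - 1) / target_util ))
echo "$desired"    # prints 4: HPA would scale from 3 to 4 pods
```

&lt;p&gt;The same formula scales down too: 8 pods at 30% CPU against a 70% target gives ceil(8 * 30 / 70) = 4 pods.&lt;/p&gt;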

&lt;h3&gt;
  
  
  Setting Up HPA
&lt;/h3&gt;

&lt;p&gt;Let's say you have a web application that should scale up when CPU usage exceeds 70%. Here's how you set it up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling/v2&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HorizontalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app-hpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scaleTargetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
  &lt;span class="na"&gt;minReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;      &lt;span class="c1"&gt;# Never go below 2 pods&lt;/span&gt;
  &lt;span class="na"&gt;maxReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;     &lt;span class="c1"&gt;# Never exceed 10 pods&lt;/span&gt;
  &lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Resource&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cpu&lt;/span&gt;
      &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Utilization&lt;/span&gt;
        &lt;span class="na"&gt;averageUtilization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;70&lt;/span&gt;  &lt;span class="c1"&gt;# Scale when CPU &amp;gt; 70%&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;I once worked with an e-commerce company that saw traffic spikes during lunch hours (12-2 PM). Without HPA, their app would slow down terribly during these periods, causing customers to abandon their shopping carts.&lt;/p&gt;

&lt;p&gt;After implementing HPA:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before lunch rush&lt;/strong&gt;: 3 pods running (normal load)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;During lunch rush&lt;/strong&gt;: Automatically scaled to 8 pods&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After lunch rush&lt;/strong&gt;: Gradually scaled back to 3 pods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result? Page load times stayed consistent, and they processed 40% more orders during peak hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  HPA Best Practices
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Set Resource Requests&lt;/strong&gt;&lt;br&gt;
Your pods MUST have CPU and memory requests defined. HPA computes utilization as a percentage of the requested amount, so it can't work without them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;        &lt;span class="c1"&gt;# Required for HPA&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;128Mi&lt;/span&gt;
  &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;500m&lt;/span&gt;
    &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;512Mi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Don't Set Min Replicas Too Low&lt;/strong&gt;&lt;br&gt;
Always keep at least 2 replicas running. If your single pod crashes, your entire service goes down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Monitor Scaling Events&lt;/strong&gt;&lt;br&gt;
Use these commands to see what HPA is doing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get hpa
kubectl describe hpa web-app-hpa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Avoid Flapping&lt;/strong&gt;&lt;br&gt;
If your app scales up and down too frequently, increase the stabilization window:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;behavior&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scaleDown&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;stabilizationWindowSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;  &lt;span class="c1"&gt;# Wait 5 minutes before scaling down&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Vertical Pod Autoscaler (VPA): Giving More Power
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How VPA Works
&lt;/h3&gt;

&lt;p&gt;While HPA adds more workers, VPA makes existing workers more powerful. It monitors your application's actual resource usage over time and automatically adjusts CPU and memory allocations.&lt;/p&gt;

&lt;p&gt;VPA has three modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Off&lt;/strong&gt;: Only provides recommendations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Initial&lt;/strong&gt;: Sets resources only when pods are created&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto&lt;/strong&gt;: Applies recommendations automatically by evicting and recreating pods&lt;/li&gt;
&lt;/ul&gt;
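
&lt;p&gt;A low-risk way to try VPA is "Off" mode: it only publishes recommendations and never touches running pods. A minimal sketch (the &lt;code&gt;web-app&lt;/code&gt; Deployment name is a placeholder):&lt;br&gt;
&lt;/p&gt;

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa-dry-run
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"   # recommend only; never restart pods
```

&lt;p&gt;Read the suggested requests with &lt;code&gt;kubectl describe vpa web-app-vpa-dry-run&lt;/code&gt; and apply them manually once you trust them.&lt;/p&gt;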

&lt;h3&gt;
  
  
  Setting Up VPA
&lt;/h3&gt;

&lt;p&gt;Here's a basic VPA configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VerticalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app-vpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;targetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
  &lt;span class="na"&gt;updatePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;updateMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Auto"&lt;/span&gt;
  &lt;span class="na"&gt;resourcePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;containerPolicies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
      &lt;span class="na"&gt;minAllowed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;128Mi&lt;/span&gt;
      &lt;span class="na"&gt;maxAllowed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2000m&lt;/span&gt;
        &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2Gi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  When to Use VPA
&lt;/h3&gt;

&lt;p&gt;VPA is perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database workloads&lt;/strong&gt;: Databases often need more memory rather than more instances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data processing applications&lt;/strong&gt;: These might need varying amounts of CPU and memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Legacy applications&lt;/strong&gt;: Apps that can't easily scale horizontally&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  VPA Limitations to Know
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Pod Restarts Required&lt;/strong&gt;&lt;br&gt;
VPA needs to restart pods to apply new resource settings. Plan for this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Be Careful with Stateful Apps&lt;/strong&gt;&lt;br&gt;
In &lt;code&gt;Auto&lt;/code&gt; mode, VPA's pod restarts can disrupt databases and other stateful workloads. For those, prefer the &lt;code&gt;Off&lt;/code&gt; or &lt;code&gt;Initial&lt;/code&gt; modes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Shipped as a Separate Add-on&lt;/strong&gt;&lt;br&gt;
VPA isn't part of core Kubernetes and is less mature than HPA. Test thoroughly before production use.&lt;/p&gt;
&lt;h2&gt;
  
  
  Can You Use HPA and VPA Together?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Short answer&lt;/strong&gt;: Generally no, not on the same metrics.&lt;/p&gt;

&lt;p&gt;If both HPA and VPA try to scale based on CPU usage, they'll fight each other:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HPA adds more pods because CPU is high&lt;/li&gt;
&lt;li&gt;VPA increases CPU allocation because usage is high&lt;/li&gt;
&lt;li&gt;This creates confusion and unpredictable behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Safe combination&lt;/strong&gt;: Use HPA for CPU scaling and VPA only for memory optimization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# HPA handles CPU scaling&lt;/span&gt;
&lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Resource&lt;/span&gt;
  &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cpu&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Utilization&lt;/span&gt;
      &lt;span class="na"&gt;averageUtilization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;70&lt;/span&gt;

&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="c1"&gt;# VPA handles only memory&lt;/span&gt;
&lt;span class="na"&gt;resourcePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containerPolicies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-app&lt;/span&gt;
    &lt;span class="na"&gt;controlledResources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Only memory, not CPU&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Choosing the Right Autoscaling Strategy
&lt;/h2&gt;

&lt;p&gt;Here's my decision framework after years of production experience:&lt;/p&gt;

&lt;h3&gt;
  
  
  Use HPA When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Your app is &lt;strong&gt;stateless&lt;/strong&gt; (web servers, APIs)&lt;/li&gt;
&lt;li&gt;You have &lt;strong&gt;variable traffic patterns&lt;/strong&gt; (daily/weekly spikes)&lt;/li&gt;
&lt;li&gt;Your app can handle &lt;strong&gt;multiple instances&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;fast scaling response&lt;/strong&gt; (seconds to minutes)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Use VPA When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Your app has &lt;strong&gt;unpredictable resource needs&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You're running &lt;strong&gt;batch jobs or data processing&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You have &lt;strong&gt;stateful applications&lt;/strong&gt; that can't scale horizontally&lt;/li&gt;
&lt;li&gt;You want to &lt;strong&gt;optimize resource costs&lt;/strong&gt; over time&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Use Neither When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Your app has &lt;strong&gt;steady, predictable load&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Resource requirements are &lt;strong&gt;well-known and stable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You prefer &lt;strong&gt;manual control&lt;/strong&gt; over scaling decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started: Your First Autoscaling Setup
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Install Metrics Server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
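
&lt;p&gt;Before creating an HPA, confirm the Metrics Server is actually serving data; without it, HPA targets show as &lt;code&gt;&amp;lt;unknown&amp;gt;&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

```shell
# Both commands should print usage numbers, not errors.
# Metrics can take a minute or two to appear after installation.
kubectl top nodes
kubectl top pods
```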



&lt;p&gt;&lt;strong&gt;Step 2: Create a Simple HPA&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl autoscale deployment web-app &lt;span class="nt"&gt;--cpu-percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;70 &lt;span class="nt"&gt;--min&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nt"&gt;--max&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Generate Some Load&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl run load-test &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;busybox &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--restart&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Never &lt;span class="nt"&gt;--&lt;/span&gt; /bin/sh
&lt;span class="c"&gt;# Inside the pod, run:&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;wget &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="nt"&gt;-O-&lt;/span&gt; http://web-app-service&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4: Watch It Scale&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get hpa &lt;span class="nt"&gt;--watch&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;--watch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Kubernetes autoscaling isn't just a nice-to-have feature - it's essential for running resilient, cost-effective applications at scale. HPA helps you handle traffic spikes automatically, while VPA ensures you're not wasting resources.&lt;/p&gt;

&lt;p&gt;Start simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Implement HPA for your stateless web applications&lt;/li&gt;
&lt;li&gt;Set conservative scaling thresholds initially
&lt;/li&gt;
&lt;li&gt;Monitor and adjust based on real usage patterns&lt;/li&gt;
&lt;li&gt;Consider VPA for resource optimization once you're comfortable&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Remember, autoscaling is as much about saving money as it is about handling load. Done right, it keeps your applications responsive while optimizing costs - letting you sleep better at night knowing your systems can handle whatever comes their way.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>learning</category>
    </item>
    <item>
      <title>DevOps Skills Alone Aren’t Enough</title>
      <dc:creator>Yoshik Karnawat</dc:creator>
      <pubDate>Sat, 09 Aug 2025 04:25:37 +0000</pubDate>
      <link>https://dev.to/yoshik_karnawat/devops-skills-alone-arent-enough-heres-why-4kb5</link>
      <guid>https://dev.to/yoshik_karnawat/devops-skills-alone-arent-enough-heres-why-4kb5</guid>
      <description>&lt;p&gt;I've been an SRE engineer for over two years now. SRE stands for Site Reliability Engineering. It's about keeping production systems reliable and efficient.&lt;/p&gt;

&lt;p&gt;But here's the thing - it's not just about learning fancy terms. You know, SLA, SLO, MTTR, service reliability. That stuff matters, but the real work is keeping systems stable and making processes better.&lt;/p&gt;

&lt;p&gt;Since college, I believed one thing: skills beat paper.&lt;/p&gt;

&lt;p&gt;I thought you didn't need good grades or certifications. Just be good at what you do. That's it.&lt;/p&gt;

&lt;p&gt;Spoiler alert: I was completely wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  My College Mindset
&lt;/h2&gt;

&lt;p&gt;Back then, I was all about side projects. I'd spin up my own VMs. Set up Caddy servers. Configure Cloudflare. Build stuff that actually worked.&lt;/p&gt;

&lt;p&gt;This felt real. This felt valuable.&lt;/p&gt;

&lt;p&gt;Why would I need a piece of paper when I could show actual results? My GitHub was full of projects. My servers were running smoothly. I could troubleshoot problems and fix them.&lt;/p&gt;

&lt;p&gt;That had to be enough. Right?&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhr7o3q27fi4m3vdvgf8q.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhr7o3q27fi4m3vdvgf8q.jpeg" alt="DevOps Skills Alone Aren’t Enough" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Reality Check Hit Different
&lt;/h2&gt;

&lt;p&gt;I was wrong. Dead wrong.&lt;/p&gt;

&lt;p&gt;And I learned this through a series of painful rejections and missed opportunities.&lt;/p&gt;

&lt;p&gt;Picture this: You're confident in your abilities. You apply for roles you know you can crush. Your technical skills are solid. Your projects speak for themselves. But somehow, you never make it past the initial screening.&lt;/p&gt;

&lt;p&gt;That was my reality for months.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Discovered (And Why It Matters to You)
&lt;/h2&gt;

&lt;p&gt;Skills give you the knowledge to work on real-world systems. But certifications? They give you something entirely different: credibility.&lt;/p&gt;

&lt;p&gt;Think about it from a hiring manager's perspective. They get hundreds of resumes. Everyone claims to know Kubernetes. Everyone says they understand monitoring. Everyone talks about incident response.&lt;/p&gt;

&lt;p&gt;How do they separate the real engineers from the wannabes?&lt;/p&gt;

&lt;p&gt;Certifications act as a filter. They validate claims. They prove you didn't just tinker with Docker on weekends - you understand the concepts well enough to pass rigorous exams.&lt;/p&gt;

&lt;p&gt;Here's a question for you: How many times have you been overlooked despite having the right skills? Drop a comment below - I bet your experience mirrors mine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The $400 Game-Changer
&lt;/h2&gt;

&lt;p&gt;Let me share something that'll probably surprise you. The CKAD certification costs around $400.&lt;/p&gt;

&lt;p&gt;My old thinking: "Four hundred dollars for paper that expires? That's ridiculous."&lt;/p&gt;

&lt;p&gt;My new reality: This "expensive paper" transformed my career trajectory.&lt;/p&gt;

&lt;p&gt;Here's what actually happens:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immediate salary impact:&lt;/strong&gt; Most engineers see 10–20% salary increases after cloud-native certifications. On an $80,000 salary, that's $8,000–16,000 annually. The math is simple - you're profitable from month one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Visibility explosion:&lt;/strong&gt; Recruiters hunt for specific certifications. Without CKAD, you're invisible in Kubernetes job searches. With it, your phone starts ringing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Promotion acceleration:&lt;/strong&gt; When two engineers compete for advancement, guess who wins? The one with verified credentials.&lt;/p&gt;

&lt;p&gt;The career boost from that $400 investment? It's exponentially higher than you'd imagine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop Overthinking the Investment
&lt;/h2&gt;

&lt;p&gt;Forget the $400 cost. Ignore the three-year expiry.&lt;/p&gt;

&lt;p&gt;Focus on this instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The role you'll land because of that certification&lt;/li&gt;
&lt;li&gt;The salary jump that follows&lt;/li&gt;
&lt;li&gt;The confidence surge in technical conversations&lt;/li&gt;
&lt;li&gt;The opportunities that suddenly appear&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That expiry date? It's actually brilliant. It keeps you current in a field that evolves at breakneck speed. An eternal certification would be worthless within two years anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Insurance Policy You Didn't Know You Needed
&lt;/h2&gt;

&lt;p&gt;Think of certifications as career insurance. You pay a small premium upfront to protect against massive potential losses.&lt;/p&gt;

&lt;p&gt;What losses? Missing opportunities because you lack credentials.&lt;/p&gt;

&lt;p&gt;Every month without certifications costs you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better job prospects&lt;/li&gt;
&lt;li&gt;Higher compensation&lt;/li&gt;
&lt;li&gt;Career advancement&lt;/li&gt;
&lt;li&gt;Professional recognition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The opportunity cost of staying uncertified far exceeds any certification fee.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Transformation Story
&lt;/h2&gt;

&lt;p&gt;Now I balance both worlds. Side projects remain crucial - they build real skills. But certifications validate those skills to decision-makers.&lt;/p&gt;

&lt;p&gt;My SRE focus areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS/GCP/Azure cloud platforms&lt;/li&gt;
&lt;li&gt;Kubernetes ecosystem (CKA, CKAD)&lt;/li&gt;
&lt;li&gt;Monitoring and observability tools&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Your Turn: What's Holding You Back?
&lt;/h2&gt;

&lt;p&gt;I'm curious - what's your biggest barrier to getting certified? Is it the cost? Time constraints? Impostor syndrome?&lt;/p&gt;

&lt;p&gt;Share in the comments. I've probably faced the same obstacle and can offer some perspective.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Truth About Our Industry
&lt;/h2&gt;

&lt;p&gt;We work in a field where HR departments filter resumes with keyword searches. Where hiring managers need quick validation methods. Where automated systems decide who gets interviews.&lt;/p&gt;

&lt;p&gt;You can fight the system or work within it. Fighting is noble but ineffective. Working within it gets results.&lt;/p&gt;

&lt;p&gt;Build skills through hands-on projects. Prove those skills through certifications. This combination is unbeatable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's What Changes When You Get Certified
&lt;/h2&gt;

&lt;p&gt;The external validation shifts everything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversations with senior engineers become peer discussions&lt;/li&gt;
&lt;li&gt;Salary negotiations start from higher baselines&lt;/li&gt;
&lt;li&gt;Career opportunities multiply exponentially&lt;/li&gt;
&lt;li&gt;Confidence permeates every technical interaction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That $400 investment doesn't just buy a certificate. It purchases career transformation.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Final Thoughts (And a Challenge for You)
&lt;/h2&gt;

&lt;p&gt;I started this journey believing skills trumped credentials. I was half right - skills matter enormously. But credentials amplify those skills in ways I never anticipated.&lt;/p&gt;

&lt;p&gt;Don't make my mistake. Don't let pride or cost concerns slow your career progression. The boost from strategic certifications will surprise you.&lt;/p&gt;

&lt;p&gt;Here's my challenge: Pick one certification relevant to your career goals. Budget for it this month. Schedule the exam within 90 days.&lt;/p&gt;

&lt;p&gt;Your future self will thank you for taking action today instead of waiting for the "perfect moment" that never comes.&lt;/p&gt;

&lt;p&gt;What certification will you tackle first? Drop your choice in the comments.&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>devops</category>
      <category>career</category>
    </item>
    <item>
      <title>Linux Observability: Troubleshooting Made Simple</title>
      <dc:creator>Yoshik Karnawat</dc:creator>
      <pubDate>Wed, 06 Aug 2025 05:02:11 +0000</pubDate>
      <link>https://dev.to/yoshik_karnawat/linux-observability-troubleshooting-made-simple-1k8e</link>
      <guid>https://dev.to/yoshik_karnawat/linux-observability-troubleshooting-made-simple-1k8e</guid>
      <description>&lt;p&gt;No jargon, no complexity, real command line solutions.&lt;/p&gt;

&lt;p&gt;Whether you're keeping systems running smoothly, managing deployments, or building backend services, these 10 commands will help you during 3 AM outages.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9cgazphx84b6adfk4y7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9cgazphx84b6adfk4y7.png" alt="Linux Debugging Commands" width="770" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. iostat
&lt;/h2&gt;

&lt;p&gt;Real-time disk performance statistics that show which storage devices are your performance bottlenecks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iostat &lt;span class="nt"&gt;-x&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key indicators:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;%util&lt;/code&gt; - If this is consistently above 80%, your disk is a bottleneck&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;await&lt;/code&gt; - Average wait time for I/O requests (milliseconds)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;%iowait&lt;/code&gt; - CPU time spent waiting for disk operations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. vmstat
&lt;/h2&gt;

&lt;p&gt;A comprehensive view of system resource usage including memory, CPU, and I/O activity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vmstat 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key indicators:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;si/so&lt;/code&gt; - Swap in/out activity (any consistent values here mean memory pressure)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wa&lt;/code&gt; - I/O wait percentage (high values indicate disk bottlenecks)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;r&lt;/code&gt; - Number of processes waiting for CPU time&lt;/li&gt;
&lt;/ul&gt;
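&lt;p&gt;The same indicators are easy to script. A minimal sketch on a hard-coded sample line, assuming the standard vmstat column layout (&lt;code&gt;si&lt;/code&gt;=$7, &lt;code&gt;so&lt;/code&gt;=$8, &lt;code&gt;wa&lt;/code&gt;=$16); live use would pipe &lt;code&gt;vmstat 1&lt;/code&gt; through the same awk with &lt;code&gt;NR&gt;2&lt;/code&gt; to skip the two header lines:&lt;/p&gt;

```shell
# Warn on swap activity or high I/O wait from a vmstat data line.
# Column positions assume the standard layout: si=$7, so=$8, wa=$16.
printf '3 0 2048 81320 3344 92608 12 40 11 22 150 233 4 2 60 34 0\n' |
  awk '{ if ($7+$8 > 0) print "swapping: si=" $7 " so=" $8
         if ($16+0 > 20) print "high iowait: " $16 "%" }'
```

&lt;p&gt;This sample line trips both alerts: the box is swapping and a third of CPU time is stuck waiting on disk.&lt;/p&gt;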

&lt;h2&gt;
  
  
  3. lsof
&lt;/h2&gt;

&lt;p&gt;Every open file, socket, and network connection on your system, along with which process owns it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsof &lt;span class="nt"&gt;-i&lt;/span&gt; :8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Port conflicts:&lt;/strong&gt; Find what's already using a port before your app starts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File descriptor leaks:&lt;/strong&gt; Track a process's open descriptors over time with &lt;code&gt;lsof -p &amp;lt;pid&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network debugging:&lt;/strong&gt; See all network connections with &lt;code&gt;lsof -i&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Find the biggest file handle users:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsof | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $2}'&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; | &lt;span class="nb"&gt;uniq&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-nr&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one-liner shows which processes are using the most file handles, crucial for debugging file descriptor exhaustion.&lt;/p&gt;
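&lt;p&gt;For a single suspect process, you can skip the full lsof scan and read the count straight from /proc. A Linux-only sketch; &lt;code&gt;$$&lt;/code&gt; here is just the current shell, substitute the PID you care about:&lt;/p&gt;

```shell
# Count one process's open file descriptors directly from /proc.
# Compare against the limit from "ulimit -n" to spot descriptor exhaustion.
ls /proc/$$/fd | wc -l
```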

&lt;h2&gt;
  
  
  4. sar
&lt;/h2&gt;

&lt;p&gt;Historical system performance data that helps you understand performance patterns over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sar &lt;span class="nt"&gt;-u&lt;/span&gt; 1 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Track different metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPU usage:&lt;/strong&gt; &lt;code&gt;sar -u&lt;/code&gt; shows user, system, and idle time percentages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory:&lt;/strong&gt; &lt;code&gt;sar -r&lt;/code&gt; displays memory utilization trends&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network:&lt;/strong&gt; &lt;code&gt;sar -n DEV&lt;/code&gt; shows network interface statistics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Review historical data:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sar &lt;span class="nt"&gt;-f&lt;/span&gt; /var/log/sysstat/sa...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unlike real-time tools, sar lets you see what happened during that 3 AM performance spike when nobody was watching.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. iotop
&lt;/h2&gt;

&lt;p&gt;Which specific processes are generating disk I/O, sorted by actual usage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iotop &lt;span class="nt"&gt;-o&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify I/O hogs instantly without guessing&lt;/li&gt;
&lt;li&gt;Track total read/write per process in real-time&lt;/li&gt;
&lt;li&gt;Find runaway processes that are thrashing your disks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;-o&lt;/code&gt; flag shows only processes actually doing I/O, filtering out the noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. strace
&lt;/h2&gt;

&lt;p&gt;Every system call your process makes - the ultimate debugging microscope.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;strace &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;file &amp;lt;command&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advanced use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Track file access:&lt;/strong&gt; &lt;code&gt;-e trace=file&lt;/code&gt; shows only file-related calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor network:&lt;/strong&gt; &lt;code&gt;-e trace=network&lt;/code&gt; for socket operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time analysis:&lt;/strong&gt; &lt;code&gt;-T&lt;/code&gt; shows time spent in each system call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For running processes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;strace &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;pid&amp;gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; /tmp/trace.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This captures the behavior of a running process and all its children, writing to a file for later analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. ss (the modern netstat)
&lt;/h2&gt;

&lt;p&gt;Detailed network socket information and connection states.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ss &lt;span class="nt"&gt;-tulpn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advanced use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Find connection states:&lt;/strong&gt; &lt;code&gt;ss -o state established&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory usage per socket:&lt;/strong&gt; &lt;code&gt;ss -m&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Process information:&lt;/strong&gt; &lt;code&gt;ss -p&lt;/code&gt; shows which process owns each connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Track connection problems:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ss &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This summary shows socket statistics including how many connections are in different states.&lt;/p&gt;
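&lt;p&gt;When ss isn't installed (minimal containers, rescue images), a similar state tally can be scraped from /proc. A Linux-only sketch; state codes are hex, e.g. 01=ESTABLISHED, 06=TIME_WAIT, 0A=LISTEN:&lt;/p&gt;

```shell
# Tally TCP socket states straight from the kernel's /proc/net/tcp table.
# Field 4 of each connection row is the state code in hex.
awk 'NR > 1 { states[$4]++ }
     END { for (s in states) print "state " s ": " states[s]; print "total " NR-1 }' /proc/net/tcp
```

&lt;p&gt;A sudden pile-up of one state (say, thousands of TIME_WAIT entries) is the same signal &lt;code&gt;ss -s&lt;/code&gt; would give you.&lt;/p&gt;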

&lt;h2&gt;
  
  
  8. dstat
&lt;/h2&gt;

&lt;p&gt;Combined CPU, disk, network, and memory statistics in a single, colorful display.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dstat &lt;span class="nt"&gt;-cdngy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Flag breakdown:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-c&lt;/code&gt; - CPU stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-d&lt;/code&gt; - Disk stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-n&lt;/code&gt; - Network stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-g&lt;/code&gt; - Page stats&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-y&lt;/code&gt; - System stats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Top consumers at a custom interval:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dstat &lt;span class="nt"&gt;--top-cpu&lt;/span&gt; &lt;span class="nt"&gt;--top-io&lt;/span&gt; &lt;span class="nt"&gt;--top-mem&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows the top processes consuming CPU, I/O, and memory every 5 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. pidstat
&lt;/h2&gt;

&lt;p&gt;Detailed resource usage for individual processes over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pidstat &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Track specific processes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pidstat &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;pid&amp;gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why it's better than ps:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shows trends over time, not just snapshots&lt;/li&gt;
&lt;li&gt;Per-thread statistics with &lt;code&gt;-t&lt;/code&gt; flag&lt;/li&gt;
&lt;li&gt;Historical data when combined with sar&lt;/li&gt;
&lt;/ul&gt;
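&lt;p&gt;The raw counters pidstat samples live in /proc, and you can read them yourself. A Linux-only sketch of accumulated CPU time for one process; fields 14 and 15 of /proc/PID/stat are utime and stime in clock ticks, and this field numbering assumes the process name (field 2) contains no spaces:&lt;/p&gt;

```shell
# Print a process's accumulated user and system CPU time in clock ticks.
# /proc/self/stat here refers to the awk process itself; substitute a PID.
awk '{ print "utime=" $14 " stime=" $15 " ticks" }' /proc/self/stat
```

&lt;p&gt;Sampling these twice and dividing the delta by the tick rate (&lt;code&gt;getconf CLK_TCK&lt;/code&gt;, usually 100) gives the same per-process CPU% that pidstat reports.&lt;/p&gt;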

&lt;h2&gt;
  
  
  10. perf
&lt;/h2&gt;

&lt;p&gt;Deep CPU performance analysis including cache misses, branch predictions, and instruction efficiency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf top
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advanced profiling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf record &lt;span class="nt"&gt;-g&lt;/span&gt; &amp;lt;command&amp;gt;
perf report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;System-wide analysis:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf &lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; &lt;span class="nb"&gt;sleep &lt;/span&gt;10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs system-wide performance counters for 10 seconds, showing you efficiency metrics like instructions per cycle and cache hit rates.&lt;/p&gt;

&lt;p&gt;Thanks for reading.&lt;/p&gt;

</description>
      <category>linux</category>
    </item>
  </channel>
</rss>
