Sumit Roy

Zero-Downtime VM to Kubernetes Migration with Istio: A Complete Production Guide

I was troubleshooting a failed migration for one of my previous projects, watching our legacy service crash as we tried moving it from VMs to Kubernetes. The traditional 'maintenance window and hope' approach wasn't working.

That's when I discovered something magical: hybrid deployments with Istio service mesh. The ability to run applications on both VMs and Kubernetes simultaneously, gradually shifting traffic with zero downtime.

So, I asked myself:
"Can I migrate legacy applications from VMs to Kubernetes without any service interruption, while having full control over traffic routing and the ability to instantly rollback?"

The answer: Absolutely. Using k8s + Istio + WorkloadEntry + Canary Deployments.

Here I have simulated the exact approach on my local machine.

🎯 What You'll Learn

  • Set up a hybrid VM + Kubernetes deployment using Istio service mesh
  • Register VM applications in Kubernetes service discovery with WorkloadEntry
  • Implement canary deployments with intelligent traffic splitting
  • Master production-grade migration strategies with instant rollback capabilities
  • Handle real-world migration challenges and troubleshooting

🛠️ Tech Stack

  • k3d - Lightweight Kubernetes cluster for local development
  • Istio - Service mesh for traffic management and observability
  • WorkloadEntry - Register VM workloads in Kubernetes service registry
  • ServiceEntry - Define external services in the mesh
  • VirtualService - Advanced traffic routing and canary deployments
  • Node.js - Sample application (easily replaceable with any tech stack)

📚 The Migration Challenge

Why Traditional Migration Fails

Most organizations attempt migrations like this:

  1. Maintenance Window → Schedule downtime (expensive!)
  2. Pray and Deploy → Deploy new version, hope nothing breaks
  3. All or Nothing → 100% traffic shift immediately
  4. Panic Mode → When things go wrong, scramble to rollback

Result: Sleepless nights, angry customers, and failed projects.

The Istio Solution

Instead of switching instantly, we create a hybrid architecture:

  • VM Application serves 80% of traffic initially
  • Kubernetes Application serves 20% of traffic (canary)
  • Gradual Migration → Shift from 80/20 → 50/50 → 20/80 → 0/100
  • Instant Rollback → One command reverts all traffic to VM

🧑‍💻 Building Our Migration Lab

Let's simulate a real-world scenario where we migrate a Node.js API from a VM to Kubernetes.

Step 1: Create the "Legacy" VM Application

First, let's build our legacy application that's currently running on a VM:

mkdir migration-demo
cd migration-demo

Create a simple Node.js API

cat > app.js << 'EOF'
const express = require('express');
const os = require('os');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
    res.json({
        message: 'Hello from Migration Demo!',
        hostname: os.hostname(),
        platform: process.env.PLATFORM || 'VM',  // overridden to "Kubernetes" via env in the K8s Deployment
        timestamp: new Date().toISOString(),
        version: 'v1.0'
    });
});

app.get('/health', (req, res) => {
    res.json({ status: 'healthy' });
});

app.listen(port, '0.0.0.0', () => {
    console.log(`App running on port ${port}`);
});
EOF

Create package.json

cat > package.json << 'EOF'
{
  "name": "migration-demo",
  "version": "1.0.0",
  "main": "app.js",
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}
EOF

Install and run our "VM" application:

# Install Node.js dependencies
npm install

# Start the VM application
npm start &

# Test it's working
curl http://localhost:3000

You should see:

{
  "message": "Hello from Migration Demo!",
  "hostname": "your-machine",
  "platform": "VM",
  "timestamp": "2024-01-27T...",
  "version": "v1.0"
}

This is our legacy application running on the "VM".
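
On a real VM you would normally keep this process alive with a process manager rather than `npm start &`. A minimal sketch assuming systemd and a hypothetical install path of /opt/migration-demo (adjust paths and users for your environment):

# Hypothetical systemd unit for the legacy API
sudo tee /etc/systemd/system/migration-demo.service << 'EOF'
[Unit]
Description=Migration demo legacy API
After=network.target

[Service]
WorkingDirectory=/opt/migration-demo
ExecStart=/usr/bin/node app.js
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now migration-demo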

Step 2: Containerize for Kubernetes

Now let's prepare the same application for Kubernetes:


Create Dockerfile

cat > Dockerfile << 'EOF'
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm install --only=production
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
EOF

# Build Docker image
docker build -t migration-demo:v1.0 .
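
Optionally, give the image a quick local smoke test before loading it into a cluster (port 3001 is an arbitrary choice so it doesn't clash with the VM app already listening on 3000):

# Run the container locally, hit it once, then clean up
docker run -d --name migration-demo-test -p 3001:3000 migration-demo:v1.0
curl http://localhost:3001
docker rm -f migration-demo-test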

Step 3: Set Up Kubernetes Cluster


Create k3d cluster with port mappings

k3d cluster create migration-cluster \
--port "8080:80@loadbalancer" \
--port "8443:443@loadbalancer" \
--agents 2

Load our image into the cluster

k3d image import migration-demo:v1.0 -c migration-cluster

Verify cluster

kubectl get nodes


Step 4: Install Istio Service Mesh

# Install Istio
istioctl install --set values.defaultRevision=default -y

# Enable automatic sidecar injection
kubectl label namespace default istio-injection=enabled

# Verify Istio is running
kubectl get pods -n istio-system

Wait until all Istio pods show Running status.
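
Rather than polling manually, you can block until the control plane is ready (a small convenience sketch; the 300s timeout is arbitrary):

kubectl wait --for=condition=ready pod --all -n istio-system --timeout=300s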

🚀 The Magic: Hybrid Deployment Setup

This is where it gets exciting! We're going to register our VM application with Istio so both VM and Kubernetes versions can coexist.

Step 5: Deploy Kubernetes Version


cat > k8s-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: migration-demo-k8s
  labels:
    app: migration-demo
    version: k8s
spec:
  replicas: 2
  selector:
    matchLabels:
      app: migration-demo
      version: k8s
  template:
    metadata:
      labels:
        app: migration-demo
        version: k8s
    spec:
      containers:
      - name: migration-demo
        image: migration-demo:v1.0
        ports:
        - containerPort: 3000
        env:
        - name: PLATFORM
          value: "Kubernetes"
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: migration-demo-service
spec:
  ports:
  - port: 3000
    name: http
  selector:
    app: migration-demo
EOF

kubectl apply -f k8s-deployment.yaml

Wait for pods to be ready:

kubectl get pods

You should see something like:

migration-demo-k8s-xxx   2/2   Running   (2/2 = app + istio-proxy)
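
To double-check that the sidecars are actually registered with the mesh (and not just present in the pods), istioctl can list every proxy known to istiod:

# Each migration-demo-k8s pod should appear with SYNCED entries
istioctl proxy-status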

Step 6: Register VM in Service Mesh

Here's the breakthrough - we register our VM application with Istio using WorkloadEntry:

cat > vm-workloadentry.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  name: migration-demo-vm
  namespace: default
spec:
  address: "host.k3d.internal"  # k3d's way to reach the host machine
  ports:
    http: 3000
  labels:
    app: migration-demo
    version: vm
EOF

cat > vm-serviceEntry.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: migration-demo-vm-service
  namespace: default
spec:
  hosts:
  - migration-demo-vm.local
  ports:
  - number: 3000
    name: http
    protocol: HTTP
  location: MESH_EXTERNAL
  resolution: STATIC
  endpoints:
  - address: "host.k3d.internal"
    ports:
      http: 3000
    labels:
      app: migration-demo
      version: vm
EOF

kubectl apply -f vm-workloadentry.yaml
kubectl apply -f vm-serviceEntry.yaml

What just happened?

  • Our VM application is now part of the Kubernetes service discovery!
  • Istio can route traffic to both VM and Kubernetes versions
  • Both applications share the same service name: migration-demo-service
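
Before splitting traffic, it's worth confirming that in-mesh workloads can actually reach the VM address. A quick check, assuming host.k3d.internal resolves from inside the cluster (k3d sets this up by default) and that busybox wget is available in the node:18-alpine image:

# Call the VM app from inside one of the Kubernetes pods
kubectl exec deploy/migration-demo-k8s -c migration-demo -- wget -qO- http://host.k3d.internal:3000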

🎛️ Canary Deployment: The Migration Control Panel

Now for the most powerful part - intelligent traffic routing:

Step 7: Configure Traffic Management


Create traffic routing rules

cat > destination-rule.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: migration-demo-destination
spec:
  host: migration-demo-service
  subsets:
  - name: vm
    labels:
      version: vm
  - name: k8s
    labels:
      version: k8s
EOF
cat > virtual-service.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: migration-demo-vs
spec:
  hosts:
  - migration-demo-service
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: migration-demo-service
        subset: k8s
  - route:
    - destination:
        host: migration-demo-service
        subset: vm
      weight: 80
    - destination:
        host: migration-demo-service
        subset: k8s
      weight: 20
EOF

kubectl apply -f destination-rule.yaml
kubectl apply -f virtual-service.yaml

What we just created:

  • 80% traffic goes to VM (safe, proven version)
  • 20% traffic goes to Kubernetes (canary testing)
  • Feature flag: canary: true header routes 100% to Kubernetes
  • Instant control: Change weights anytime without deployment
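
"Change weights anytime" really is a one-liner. A hedged sketch of shifting to 50/50 by patching the VirtualService in place (the JSON paths assume the rule order defined above: index 0 is the canary-header match, index 1 is the weighted route):

kubectl patch virtualservice migration-demo-vs --type='json' -p='[
  {"op": "replace", "path": "/spec/http/1/route/0/weight", "value": 50},
  {"op": "replace", "path": "/spec/http/1/route/1/weight", "value": 50}
]'

You can also just edit the weights in virtual-service.yaml and apply it again; either way the change takes effect within seconds and needs no redeployment.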

Step 8: Test the Migration in Action

Create a test client:

cat > test-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: test-client
spec:
  containers:
  - name: curl
    image: curlimages/curl:latest
    command: ["/bin/sh"]
    args: ["-c", "while true; do sleep 3600; done"]
EOF

kubectl apply -f test-pod.yaml
kubectl wait --for=condition=ready pod/test-client

Test normal traffic distribution:

echo "Testing normal traffic (80% VM, 20% K8s):"
for i in {1..10}; do
  kubectl exec -it test-client -- curl -s http://migration-demo-service:3000 | grep '"platform"'
done

Test canary routing:

echo "Testing canary header (100% K8s):"
for i in {1..5}; do
  kubectl exec -it test-client -- curl -s -H "canary: true" http://migration-demo-service:3000 | grep '"platform"'
done


🎉 You should see:

  • Normal requests: Mix of "platform": "VM" and "platform": "Kubernetes"
  • Canary requests: All show "platform": "Kubernetes"
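
With only 10 requests the split can look lopsided, so for a more convincing look at the 80/20 ratio you can tally platforms over a larger sample (a rough sketch, not a statistical test):

# Count which platform served each of 50 requests
for i in $(seq 1 50); do
  kubectl exec test-client -- curl -s http://migration-demo-service:3000
done | grep -o '"platform": *"[^"]*"' | sort | uniq -c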

📈 Production Migration Strategy

Now that we have the foundation, here's how you'd execute this in production:

Phase 1: Initial Deployment (Week 1)

# Start conservative: 95% VM, 5% K8s
weight: 95  # VM
weight: 5   # Kubernetes

Phase 2: Confidence Building (Week 2-3)

# Increase gradually as metrics look good
weight: 70  # VM  
weight: 30  # Kubernetes

Phase 3: Equal Split Testing (Week 4)

# Test at scale with equal traffic
weight: 50  # VM
weight: 50  # Kubernetes

Phase 4: Kubernetes Majority (Week 5)

# Shift majority to K8s
weight: 20  # VM
weight: 80  # Kubernetes  

Phase 5: Migration Complete (Week 6)

# Full migration
weight: 0   # VM (decommission)
weight: 100 # Kubernetes
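
Mechanically, every phase is the same change: update the two weight values in virtual-service.yaml and re-apply, then watch your dashboards, error rates, and latencies before moving on. A minimal sketch of one phase rollout:

# After editing the vm/k8s weights for the next phase
kubectl apply -f virtual-service.yaml

# Confirm the new weights are live
kubectl get virtualservice migration-demo-vs -o yaml | grep 'weight:'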

🐞 Production Challenges I've Solved

Problem 1: Database Connection Storms

Symptom: K8s pods open more database connections than the VM did.
Solution: add connection pool limits to the trafficPolicy of the existing DestinationRule (keep the vm/k8s subsets from Step 7):

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: migration-demo-destination
spec:
  host: migration-demo-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 50
        connectTimeout: 30s
  # the vm / k8s subsets from Step 7 stay as-is

Problem 2: Session Affinity Issues

Symptom: User sessions break during traffic shifts.
Solution: enable cookie-based session affinity on the same DestinationRule:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: migration-demo-destination
spec:
  host: migration-demo-service
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpCookie:
          name: "JSESSIONID"
          ttl: 0s

Problem 3: Instant Rollback Needed

Command for emergency rollback:

kubectl patch virtualservice migration-demo-vs --type='merge' -p='
{
  "spec": {
    "http": [{
      "route": [{
        "destination": {
          "host": "migration-demo-service", 
          "subset": "vm"
        },
        "weight": 100
      }]
    }]
  }
}'
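
After an emergency rollback, a quick check with the test client should show zero requests landing on the canary:

# Expect a count of 0 once all traffic is back on the VM
for i in $(seq 1 10); do
  kubectl exec test-client -- curl -s http://migration-demo-service:3000
done | grep -c '"platform": *"Kubernetes"'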

🎯 Key Takeaways

What we accomplished:

  • ✅ Zero downtime migration with traffic splitting
  • ✅ Hybrid VM + Kubernetes architecture
  • ✅ Instant rollback capability
  • ✅ Feature flagging with header-based routing
  • ✅ Production-ready traffic management

Similar gradual, traffic-shifted migration patterns have been used at scale by:

  • Netflix (microservices migration)
  • Spotify (platform modernization)
  • Airbnb (infrastructure consolidation)

💬 Let's Connect

If you implement this migration strategy or face any challenges, I'd love to hear about it!

GitHub → GitHub
LinkedIn → LinkedIn
X (Twitter) → Twitter

Drop a star ⭐ on the repo if it helped you; it keeps me motivated to write more experiments like this!


Questions about your specific migration scenario? Let's discuss in the comments below. Every legacy system has unique challenges, and I've probably faced something similar!

