Sumit Roy

Zero-Downtime VM to Kubernetes Migration with Istio: A Complete Production Guide

I was troubleshooting a failed migration for one of my previous projects, watching our legacy service crash as we tried moving it from VMs to Kubernetes. The traditional 'maintenance window and hope' approach wasn't working.

That's when I discovered something magical: hybrid deployments with Istio service mesh. The ability to run applications on both VMs and Kubernetes simultaneously, gradually shifting traffic with zero downtime.

So, I asked myself:
"Can I migrate legacy applications from VMs to Kubernetes without any service interruption, while having full control over traffic routing and the ability to instantly rollback?"

The answer: Absolutely. Using k8s + Istio + WorkloadEntry + Canary Deployments.

Here I have simulated the exact approach on my local machine.

🎯 What You'll Learn

  • Set up a hybrid VM + Kubernetes deployment using Istio service mesh
  • Register VM applications in Kubernetes service discovery with WorkloadEntry
  • Implement canary deployments with intelligent traffic splitting
  • Master production-grade migration strategies with instant rollback capabilities
  • Handle real-world migration challenges and troubleshooting

🛠️ Tech Stack

  • k3d - Lightweight Kubernetes cluster for local development
  • Istio - Service mesh for traffic management and observability
  • WorkloadEntry - Register VM workloads in Kubernetes service registry
  • ServiceEntry - Define external services in the mesh
  • VirtualService - Advanced traffic routing and canary deployments
  • Node.js - Sample application (easily replaceable with any tech stack)

📚 The Migration Challenge

Why Traditional Migration Fails

Most organizations attempt migrations like this:

  1. Maintenance Window → Schedule downtime (expensive!)
  2. Pray and Deploy → Deploy new version, hope nothing breaks
  3. All or Nothing → 100% traffic shift immediately
  4. Panic Mode → When things go wrong, scramble to rollback

Result: Sleepless nights, angry customers, and failed projects.

The Istio Solution

Instead of switching instantly, we create a hybrid architecture:

  • VM Application serves 80% of traffic initially
  • Kubernetes Application serves 20% of traffic (canary)
  • Gradual Migration → Shift from 80/20 → 50/50 → 20/80 → 0/100
  • Instant Rollback → One command reverts all traffic to VM

🧑‍💻 Building Our Migration Lab

Let's simulate a real-world scenario where we migrate a Node.js API from a VM to Kubernetes.

Step 1: Create the "Legacy" VM Application

First, let's build our legacy application that's currently running on a VM:

mkdir migration-demo
cd migration-demo

Create a simple Node.js API

cat > app.js << 'EOF'
const express = require('express');
const os = require('os');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
    res.json({
        message: 'Hello from Migration Demo!',
        hostname: os.hostname(),
        platform: process.env.PLATFORM || 'VM',  // overridden to "Kubernetes" via env in the K8s Deployment
        timestamp: new Date().toISOString(),
        version: 'v1.0'
    });
});

app.get('/health', (req, res) => {
    res.json({ status: 'healthy' });
});

app.listen(port, '0.0.0.0', () => {
    console.log(`App running on port ${port}`);
});
EOF

Create package.json

cat > package.json << 'EOF'
{
  "name": "migration-demo",
  "version": "1.0.0",
  "main": "app.js",
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}
EOF

Install and run our "VM" application:

# Install Node.js dependencies
npm install

# Start the VM application
npm start &

# Test it's working
curl http://localhost:3000

You should see:

{
  "message": "Hello from Migration Demo!",
  "hostname": "your-machine",
  "platform": "VM",
  "timestamp": "2024-01-27T...",
  "version": "v1.0"
}

This is our legacy application running on the "VM".
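
On a real VM you would normally keep this process alive with a process manager rather than `npm start &`. A minimal sketch assuming systemd and a hypothetical install path of /opt/migration-demo (adjust paths and users for your environment):

# Hypothetical systemd unit for the legacy API
sudo tee /etc/systemd/system/migration-demo.service << 'EOF'
[Unit]
Description=Migration demo legacy API
After=network.target

[Service]
WorkingDirectory=/opt/migration-demo
ExecStart=/usr/bin/node app.js
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now migration-demo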

Step 2: Containerize for Kubernetes

Now let's prepare the same application for Kubernetes:


Create Dockerfile

cat > Dockerfile << 'EOF'
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm install --only=production
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
EOF

# Build Docker image
docker build -t migration-demo:v1.0 .
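
Optionally, give the image a quick local smoke test before loading it into a cluster (port 3001 is an arbitrary choice so it doesn't clash with the VM app already listening on 3000):

# Run the container locally, hit it once, then clean up
docker run -d --name migration-demo-test -p 3001:3000 migration-demo:v1.0
curl http://localhost:3001
docker rm -f migration-demo-test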

Step 3: Set Up Kubernetes Cluster


Create k3d cluster with port mappings

k3d cluster create migration-cluster \
--port "8080:80@loadbalancer" \
--port "8443:443@loadbalancer" \
--agents 2

Load our image into the cluster

k3d image import migration-demo:v1.0 -c migration-cluster

Verify cluster

kubectl get nodes


Step 4: Install Istio Service Mesh

# Install Istio
istioctl install --set values.defaultRevision=default -y

# Enable automatic sidecar injection
kubectl label namespace default istio-injection=enabled

# Verify Istio is running
kubectl get pods -n istio-system

Wait until all Istio pods show Running status.
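
Rather than polling manually, you can block until the control plane is ready (a small convenience sketch; the 300s timeout is arbitrary):

kubectl wait --for=condition=ready pod --all -n istio-system --timeout=300s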

🚀 The Magic: Hybrid Deployment Setup

This is where it gets exciting! We're going to register our VM application with Istio so both VM and Kubernetes versions can coexist.

Step 5: Deploy Kubernetes Version


cat > k8s-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: migration-demo-k8s
  labels:
    app: migration-demo
    version: k8s
spec:
  replicas: 2
  selector:
    matchLabels:
      app: migration-demo
      version: k8s
  template:
    metadata:
      labels:
        app: migration-demo
        version: k8s
    spec:
      containers:
      - name: migration-demo
        image: migration-demo:v1.0
        ports:
        - containerPort: 3000
        env:
        - name: PLATFORM
          value: "Kubernetes"
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: migration-demo-service
spec:
  ports:
  - port: 3000
    name: http
  selector:
    app: migration-demo
EOF

kubectl apply -f k8s-deployment.yaml

Wait for pods to be ready:

kubectl get pods

You should see something like:

migration-demo-k8s-xxx   2/2   Running   (2/2 = app + istio-proxy)
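
To double-check that the sidecars are actually registered with the mesh (and not just present in the pods), istioctl can list every proxy known to istiod:

# Each migration-demo-k8s pod should appear with SYNCED entries
istioctl proxy-status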

Step 6: Register VM in Service Mesh

Here's the breakthrough - we register our VM application with Istio using WorkloadEntry:

cat > vm-workloadentry.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  name: migration-demo-vm
  namespace: default
spec:
  address: "host.k3d.internal"  # k3d's way to reach the host machine
  ports:
    http: 3000
  labels:
    app: migration-demo
    version: vm
EOF

cat > vm-serviceEntry.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: migration-demo-vm-service
  namespace: default
spec:
  hosts:
  - migration-demo-vm.local
  ports:
  - number: 3000
    name: http
    protocol: HTTP
  location: MESH_EXTERNAL
  resolution: STATIC
  endpoints:
  - address: "host.k3d.internal"
    ports:
      http: 3000
    labels:
      app: migration-demo
      version: vm
EOF

kubectl apply -f vm-workloadentry.yaml
kubectl apply -f vm-serviceEntry.yaml

What just happened?

  • Our VM application is now part of the Kubernetes service discovery!
  • Istio can route traffic to both VM and Kubernetes versions
  • Both applications share the same service name: migration-demo-service
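
Before splitting traffic, it's worth confirming that in-mesh workloads can actually reach the VM address. A quick check, assuming host.k3d.internal resolves from inside the cluster (k3d sets this up by default) and that busybox wget is available in the node:18-alpine image:

# Call the VM app from inside one of the Kubernetes pods
kubectl exec deploy/migration-demo-k8s -c migration-demo -- wget -qO- http://host.k3d.internal:3000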

🎛️ Canary Deployment: The Migration Control Panel

Now for the most powerful part - intelligent traffic routing:

Step 7: Configure Traffic Management


Create traffic routing rules

cat > destination-rule.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: migration-demo-destination
spec:
  host: migration-demo-service
  subsets:
  - name: vm
    labels:
      version: vm
  - name: k8s
    labels:
      version: k8s
EOF
cat > virtual-service.yaml << 'EOF'
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: migration-demo-vs
spec:
  hosts:
  - migration-demo-service
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: migration-demo-service
        subset: k8s
  - route:
    - destination:
        host: migration-demo-service
        subset: vm
      weight: 80
    - destination:
        host: migration-demo-service
        subset: k8s
      weight: 20
EOF

kubectl apply -f destination-rule.yaml
kubectl apply -f virtual-service.yaml

What we just created:

  • 80% traffic goes to VM (safe, proven version)
  • 20% traffic goes to Kubernetes (canary testing)
  • Feature flag: canary: true header routes 100% to Kubernetes
  • Instant control: Change weights anytime without deployment
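
"Change weights anytime" really is a one-liner. A hedged sketch of shifting to 50/50 by patching the VirtualService in place (the JSON paths assume the rule order defined above: index 0 is the canary-header match, index 1 is the weighted route):

kubectl patch virtualservice migration-demo-vs --type='json' -p='[
  {"op": "replace", "path": "/spec/http/1/route/0/weight", "value": 50},
  {"op": "replace", "path": "/spec/http/1/route/1/weight", "value": 50}
]'

You can also just edit the weights in virtual-service.yaml and apply it again; either way the change takes effect within seconds and needs no redeployment.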

Step 8: Test the Migration in Action

Create a test client:

cat > test-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: test-client
spec:
  containers:
  - name: curl
    image: curlimages/curl:latest
    command: ["/bin/sh"]
    args: ["-c", "while true; do sleep 3600; done"]
EOF

kubectl apply -f test-pod.yaml
kubectl wait --for=condition=ready pod/test-client

Test normal traffic distribution:

echo "Testing normal traffic (80% VM, 20% K8s):"
for i in {1..10}; do
  kubectl exec -it test-client -- curl -s http://migration-demo-service:3000 | grep '"platform"'
done

Test canary routing:

echo "Testing canary header (100% K8s):"
for i in {1..5}; do
  kubectl exec -it test-client -- curl -s -H "canary: true" http://migration-demo-service:3000 | grep '"platform"'
done


🎉 You should see:

  • Normal requests: Mix of "platform": "VM" and "platform": "Kubernetes"
  • Canary requests: All show "platform": "Kubernetes"
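
With only 10 requests the split can look lopsided, so for a more convincing look at the 80/20 ratio you can tally platforms over a larger sample (a rough sketch, not a statistical test):

# Count which platform served each of 50 requests
for i in $(seq 1 50); do
  kubectl exec test-client -- curl -s http://migration-demo-service:3000
done | grep -o '"platform": *"[^"]*"' | sort | uniq -c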

📈 Production Migration Strategy

Now that we have the foundation, here's how you'd execute this in production:

Phase 1: Initial Deployment (Week 1)

# Start conservative: 95% VM, 5% K8s
weight: 95  # VM
weight: 5   # Kubernetes

Phase 2: Confidence Building (Week 2-3)

# Increase gradually as metrics look good
weight: 70  # VM  
weight: 30  # Kubernetes

Phase 3: Equal Split Testing (Week 4)

# Test at scale with equal traffic
weight: 50  # VM
weight: 50  # Kubernetes

Phase 4: Kubernetes Majority (Week 5)

# Shift majority to K8s
weight: 20  # VM
weight: 80  # Kubernetes  

Phase 5: Migration Complete (Week 6)

# Full migration
weight: 0   # VM (decommission)
weight: 100 # Kubernetes
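
Mechanically, every phase is the same change: update the two weight values in virtual-service.yaml and re-apply, then watch your dashboards, error rates, and latencies before moving on. A minimal sketch of one phase rollout:

# After editing the vm/k8s weights for the next phase
kubectl apply -f virtual-service.yaml

# Confirm the new weights are live
kubectl get virtualservice migration-demo-vs -o yaml | grep 'weight:'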

🐞 Production Challenges I've Solved

Problem 1: Database Connection Storms

Symptom: K8s pods open more database connections than the VM did.
Solution: add connection pool limits to the trafficPolicy of the existing DestinationRule (keep the vm/k8s subsets from Step 7):

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: migration-demo-destination
spec:
  host: migration-demo-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 50
        connectTimeout: 30s
  # the vm / k8s subsets from Step 7 stay as-is

Problem 2: Session Affinity Issues

Symptom: User sessions break during traffic shifts.
Solution: enable cookie-based session affinity on the same DestinationRule:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: migration-demo-destination
spec:
  host: migration-demo-service
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpCookie:
          name: "JSESSIONID"
          ttl: 0s

Problem 3: Instant Rollback Needed

Command for emergency rollback:

kubectl patch virtualservice migration-demo-vs --type='merge' -p='
{
  "spec": {
    "http": [{
      "route": [{
        "destination": {
          "host": "migration-demo-service", 
          "subset": "vm"
        },
        "weight": 100
      }]
    }]
  }
}'
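
After an emergency rollback, a quick check with the test client should show zero requests landing on the canary:

# Expect a count of 0 once all traffic is back on the VM
for i in $(seq 1 10); do
  kubectl exec test-client -- curl -s http://migration-demo-service:3000
done | grep -c '"platform": *"Kubernetes"'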

🎯 Key Takeaways

What we accomplished:

  • ✅ Zero downtime migration with traffic splitting
  • ✅ Hybrid VM + Kubernetes architecture
  • ✅ Instant rollback capability
  • ✅ Feature flagging with header-based routing
  • ✅ Production-ready traffic management

Similar gradual, traffic-shifted migration patterns have been used at scale by:

  • Netflix (microservices migration)
  • Spotify (platform modernization)
  • Airbnb (infrastructure consolidation)

💬 Let's Connect

If you implement this migration strategy or face any challenges, I'd love to hear about it!

GitHub → GitHub
LinkedIn → LinkedIn
X (Twitter) → Twitter

Drop a star ⭐ on the repo if it helped you; it keeps me motivated to write more experiments like this!


Questions about your specific migration scenario? Let's discuss in the comments below. Every legacy system has unique challenges, and I've probably faced something similar!

