Alister Baroi for Tigera Inc

Posted on • Originally published at tigera.io on

KubeVirt Live Migration Done Right: What it Takes to Run VMs on Kubernetes

Running VMs in Kubernetes sounds like a crazy workaround for avoiding vendor lock-in and for standardizing legacy applications and newer containerized workloads on one control plane, with one set of security policies to govern them all. It is, however, a rapidly growing pattern, and KubeVirt live migration (moving running VMs between nodes without downtime) is increasingly central to platform engineering use cases that require full VMs, like on-demand CI/CD pipelines.

KubeVirt is gaining traction as a way to bring VMs into Kubernetes as first-class workloads, managed with the same tools and primitives that platform teams already use for containers. It has, however, introduced some unique challenges.

Here’s the uncomfortable truth about that migration: compute and storage are the easy parts. Networking is where migrations stall, roadblocks multiply, and platform teams start questioning whether KubeVirt was the right call in the first place.

If your VMs have no fixed IP dependencies, no VLAN memberships, and no upstream firewall rules scoped to specific subnets, you can migrate them into Kubernetes without losing sleep over the networking layer. If you’re running hundreds or thousands of VMs with IP addresses hardcoded into application configs, DNS entries, and firewall ACLs — and you need to move those VMs to Kubernetes without rewriting any of it — then your networking layer is about to become the most important decision in your migration.

What follows is a technical walk-through of the L2 plumbing that keeps KubeVirt VMs connected when they move between nodes in a production cluster, and of how that plumbing eliminates the need to update your existing network infrastructure.

Kubernetes Networking Wasn’t Built for VMs

In a traditional hypervisor environment — vSphere, Hyper-V, Nutanix — VMs sit on VLANs and have fixed IPs. Upstream firewalls, load balancers, and DNS records all reference those IPs. A security team owns the VLAN segmentation while the network team owns the routing. This network infrastructure is the accumulated work of many years and forms a static, and somewhat brittle, system of securing hosts and getting traffic to its destination. The Kubernetes networking model, with its dynamic allocation of IPs that are meaningful only inside a cluster, is at odds with this traditional approach. Therein lies the problem.

The upstream network has no direct visibility into the pod network. When a VM is migrated from your existing hypervisor into Kubernetes, its original network segment is not preserved. The VM gets a new IP from the pod CIDR, and every firewall rule, DNS entry, and load balancer config that referenced the old IP is now broken. For a handful of VMs, you can reconfigure your firewall rules and routing manually. For hundreds or thousands, reconfiguration is not only costly in engineering effort but also risks breaking critical functionality and introducing security blind spots.

Two Networking Modes, Two Different Problems

Before diving into solutions, it helps to understand how KubeVirt presents networking to VMs. There are two modes for the primary pod interface, and they solve different problems.

Masquerade mode decouples the pod IP from the VM IP. KubeVirt assigns a static IP to the VM internally and uses NAT rules to translate between the two. Live migration works out of the box because the pod IP can change without affecting the VM. The trade-off is that you need a service-level abstraction to reach the VM from outside the pod, which makes this mode impractical for production workloads that need stable, directly-addressable IPs.

Bridge mode is the production-grade option. The pod IP and the VM IP are identical. The VM is directly reachable on the network. No NAT, no service abstraction. But bridge mode introduces a hard problem: when a VM live-migrates to a new node, KubeVirt creates a new pod on the destination. That new pod gets a fresh IP from the CNI. The VM still thinks it has its original IP. The result is a routing mismatch — the network doesn’t know where to send traffic, and the VM’s connections break.
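To make the difference between the two modes concrete, here is a minimal sketch that builds a trimmed KubeVirt VirtualMachine manifest as a Python dictionary. The field layout follows the kubevirt.io/v1 API as commonly documented; the VM name, memory size, and the helper name `vm_manifest` are assumptions for illustration, and disks and volumes are omitted, so this is not a bootable definition.

```python
import json


def vm_manifest(name: str, binding: str) -> dict:
    """Build a trimmed VirtualMachine manifest for the primary pod interface.

    `binding` selects how the VM attaches to the pod network:
    "masquerade" (NAT between pod IP and VM IP) or "bridge" (pod IP == VM IP).
    Disks and volumes are intentionally left out of this sketch.
    """
    if binding not in ("bridge", "masquerade"):
        raise ValueError(f"unsupported binding: {binding}")
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachine",
        "metadata": {"name": name},
        "spec": {
            "runStrategy": "Always",
            "template": {
                "spec": {
                    "domain": {
                        "devices": {
                            # The only line that differs between the two modes.
                            "interfaces": [{"name": "default", binding: {}}],
                        },
                        "resources": {"requests": {"memory": "1Gi"}},
                    },
                    # The primary interface is backed by the pod network.
                    "networks": [{"name": "default", "pod": {}}],
                }
            },
        },
    }


if __name__ == "__main__":
    # "legacy-db" is a hypothetical VM name.
    print(json.dumps(vm_manifest("legacy-db", "bridge"), indent=2))
```

Switching the second argument to "masquerade" changes only the interface binding; everything else in the manifest stays the same, which is why the choice is easy to make and easy to overlook.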

KubeVirt itself handles only memory and disk migration; the network handover is left to the CNI. This does not matter much in masquerade mode, since the VM’s IP is decoupled from the pod’s IP via NAT, but it becomes a critical consideration in bridge mode. So the CNI has to do three things to ensure nothing breaks: preserve the IP across the pod transition, converge routes so the rest of the network knows the VM has moved, and ensure network policy is in place on the destination before the VM goes live.

Live Migration in Bridge Mode: What Happens Under the Hood

VMs need to move between nodes for a variety of reasons, such as maintenance, load balancing, or high availability. So what actually happens during a live migration in bridge mode, and why is it so hard to get right?

The 5-step network handover during live migration in bridge mode

The Core Challenge

When a migration is triggered using the KubeVirt command line utility, virtctl, KubeVirt creates a new pod on a destination node chosen by the Kubernetes scheduler in the usual way based on available resources, affinity rules, shared storage, etc. Next, KubeVirt copies the VM’s memory state using libvirt’s pre-copy and post-copy mechanisms.
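In KubeVirt, a migration request is represented in the Kubernetes API by a VirtualMachineInstanceMigration object, which is what `virtctl migrate` creates on your behalf. The sketch below builds such an object as a Python dictionary. The field names follow the kubevirt.io/v1 API as commonly documented and should be checked against your installed version; the VM name and the helper name `migration_manifest` are assumptions for illustration.

```python
import json


def migration_manifest(vmi_name: str, namespace: str = "default") -> dict:
    """Build the API object that asks KubeVirt to live-migrate a running VM.

    The destination node is not specified here: the scheduler picks it
    when KubeVirt creates the target pod.
    """
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachineInstanceMigration",
        "metadata": {
            # generateName lets the API server append a unique suffix.
            "generateName": f"{vmi_name}-migration-",
            "namespace": namespace,
        },
        "spec": {"vmiName": vmi_name},
    }


if __name__ == "__main__":
    # "legacy-db" is a hypothetical VM name.
    print(json.dumps(migration_manifest("legacy-db"), indent=2))
```

Because the request is an ordinary API object, any component that watches the API can see that a migration has started, well before the VM actually cuts over.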

Then things get a bit interesting.

The source pod continues running during the whole process. From a networking perspective, the same IP now needs to exist in two places temporarily — on the source node (where the VM is still running) and on the destination (where it’s about to go live).

The CNI has to solve three problems simultaneously: IP persistence across pod lifecycles, route convergence during the handover window, and policy continuity so the VM isn’t exposed during migration.

Let’s look at how Calico makes this happen.

IP Persistence: IPAM That Understands VMs

Traditionally, Calico IPAM allocates IPs to pods. The IPAM handle (the ownership ticket for an IP reservation) is derived from the pod’s identity. This works for containers because pods are ephemeral. But a KubeVirt VM is more like a Kubernetes Deployment: you define a VirtualMachine resource, and KubeVirt creates a randomly-named pod to run it. Every time you restart or migrate the VM, the pod changes, but the VM stays the same with the same identity, memory state and the same IP.

Since IPAM assigns the IP to the pod, every migration means a new IP, which defeats the purpose of preserving the VM’s IP and breaks any firewall rules, load balancer configurations or DNS records pointed at this IP.

To fix this, Calico constructs the IPAM handle from the VM’s name instead of the pod’s name, ensuring that the reservation persists across pod lifecycles. When a VM migrates and its old pod is destroyed, the IPAM handle survives because it’s tied to the VM identity. When the new pod starts, the IPAM finds the existing handle and reuses the same IP. During migration, the IPAM transiently tracks dual ownership, with an active owner on the source node and an alternate owner on the destination, then converges to a single owner once the source pod is cleaned up.
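The effect of keying the handle on the VM rather than the pod can be illustrated with a toy allocator. This is a simplified sketch, not Calico's implementation: the class `ToyIPAM`, the handle string formats, and the pod names are all hypothetical.

```python
import ipaddress
from typing import Dict, List


class ToyIPAM:
    """Toy allocator: one IP per handle, with a list of owning pods per handle."""

    def __init__(self, cidr: str) -> None:
        self._free: List[str] = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self._ip_by_handle: Dict[str, str] = {}
        self._owners: Dict[str, List[str]] = {}

    def assign(self, handle: str, pod: str) -> str:
        """Return the IP for this handle, allocating one only if the handle is new."""
        if handle not in self._ip_by_handle:
            self._ip_by_handle[handle] = self._free.pop(0)
            self._owners[handle] = []
        if pod not in self._owners[handle]:
            self._owners[handle].append(pod)  # first = active, second = alternate
        return self._ip_by_handle[handle]

    def release(self, handle: str, pod: str) -> None:
        """Drop one owner; free the IP only when no owner is left."""
        owners = self._owners.get(handle, [])
        if pod in owners:
            owners.remove(pod)
        if handle in self._ip_by_handle and not owners:
            self._free.insert(0, self._ip_by_handle.pop(handle))
            del self._owners[handle]

    def owners(self, handle: str) -> List[str]:
        return list(self._owners.get(handle, []))


if __name__ == "__main__":
    ipam = ToyIPAM("10.0.0.0/29")
    vm_handle = "vm/default/legacy-db"  # keyed on the VM, not the pod

    src_ip = ipam.assign(vm_handle, "virt-launcher-legacy-db-abcde")
    dst_ip = ipam.assign(vm_handle, "virt-launcher-legacy-db-fghij")
    print(src_ip == dst_ip, ipam.owners(vm_handle))  # dual ownership, one IP

    ipam.release(vm_handle, "virt-launcher-legacy-db-abcde")
    print(ipam.owners(vm_handle))  # converged to the destination pod

    # Pod-keyed handles, by contrast, yield a different IP per pod.
    a = ipam.assign("pod/default/web-a", "web-a")
    b = ipam.assign("pod/default/web-b", "web-b")
    print(a != b)
```

The design point is that the reservation's lifetime follows whatever identity the handle is derived from: tie it to the pod and the IP dies with the pod, tie it to the VM and the IP outlives any single pod.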

Route Convergence: The GARP Handover

IP persistence ensures the VM keeps its address. Route convergence ensures the rest of the network knows where to find it. Here’s the sequence:

  1. Migration initiated. The CNI watches for migration events in the Kubernetes API. As soon as one is created, it starts preparing the destination node’s networking — policies, routes, interface configuration — so that everything is in place before the VM actually moves.
  2. Memory pre-copy. KubeVirt and libvirt handle the iterative memory copy. The VM continues running on the source node. Traffic continues routing to the source at standard priority.
  3. VM goes live on destination. The VM broadcasts a Gratuitous ARP (GARP) packet announcing “I own this IP now, and I’m on this node.” Felix picks up this GARP and immediately advertises a high-priority route for the VM’s IP via the destination node. The rest of the network then starts steering traffic for the VM’s IP toward the new node, overriding the old route.
  4. Route priority override. This is a critical engineering detail. Normal routing uses a standard metric (1024). During migration, the destination node advertises the VM’s route with a lower metric (512), and a lower metric means a higher priority. Because the source pod still exists briefly after the cutover, both nodes momentarily have routes for the same IP. The higher-priority route ensures all traffic is forwarded to the destination, even before the source pod is fully cleaned up.
  5. Cleanup and steady state. Once the source pod terminates, the high-priority route is replaced with a standard-priority route. The source node’s route is removed. The network converges to its normal state with the VM on its new node at the same IP.
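The handover sequence above can be sketched as a small route-selection model. This is an illustration under stated assumptions, not Calico or Felix code: the `Route` type, the node names, and the VM address are hypothetical, and only the two metric values (1024 and 512) come from the text. As in Linux routing, the route with the lowest metric wins.

```python
from dataclasses import dataclass
from typing import List, Optional

STANDARD_METRIC = 1024   # normal routes, as quoted in the text
MIGRATION_METRIC = 512   # destination route during migration, as quoted in the text


@dataclass(frozen=True)
class Route:
    dest: str       # destination prefix, here the VM's /32
    via_node: str   # node that traffic is forwarded to
    metric: int     # lower metric = higher priority


def best_route(table: List[Route], dest: str) -> Optional[Route]:
    """Pick the route for `dest` with the lowest metric, or None if there is none."""
    candidates = [r for r in table if r.dest == dest]
    return min(candidates, key=lambda r: r.metric) if candidates else None


VM_IP = "10.0.0.1/32"  # hypothetical VM address

# Steps 1-2: only the source node has a route, at the standard metric.
table = [Route(VM_IP, "node-a", STANDARD_METRIC)]
before = best_route(table, VM_IP)

# Steps 3-4: after the GARP, the destination adds a lower-metric route.
# Both routes coexist, but the destination wins.
table.append(Route(VM_IP, "node-b", MIGRATION_METRIC))
during = best_route(table, VM_IP)

# Step 5: the source route is removed and the destination route
# returns to the standard metric.
table = [Route(VM_IP, "node-b", STANDARD_METRIC)]
after = best_route(table, VM_IP)

print(before.via_node, during.via_node, after.via_node)  # node-a node-b node-b
```

The overlap in the middle stage is the point of the metric trick: traffic moves to the destination without waiting for the source route to be withdrawn.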

Policy Continuity

The CNI watches for migration events and uses the lead time to pre-program network policies on the destination node while the memory copy is still in progress. By the time the VM cuts over, its security posture is already in place, leaving no gap for unsanctioned traffic to slip through.

This works because Kubernetes network policies use label selectors, not IP addresses. The policies follow the VM’s identity (its labels, namespace, and network membership), not its physical location. When the VM appears on the destination node with the same labels, the same policies apply automatically. One nuance worth noting: while the policy rules carry over, stateful connection tracking (conntrack) does not currently replicate between nodes. Established connections survive because the routes converge, but the destination node evaluates them as new flows. Full conntrack replication is a planned future enhancement.
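A minimal sketch of why label-based selection is location-independent: the matching function below looks only at labels, so two pods for the same VM on different nodes resolve to the same policy set. The policy names, labels, and pod records are hypothetical, and the selector is reduced to simple equality matching rather than the full Kubernetes selector syntax.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True when every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def policies_for(labels: Dict[str, str], policies: List[dict]) -> List[str]:
    """Names of the policies whose selector matches the given labels."""
    return [p["name"] for p in policies if selector_matches(p["selector"], labels)]


# Hypothetical policies, selected by label rather than by IP or node.
POLICIES = [
    {"name": "allow-app-to-db", "selector": {"app": "legacy-db"}},
    {"name": "allow-ingress-to-web", "selector": {"app": "web"}},
]

# The source and destination pods differ in name and node, not in labels.
vm_labels = {"app": "legacy-db", "tier": "data"}
source_pod = {"name": "virt-launcher-legacy-db-abcde", "node": "node-a", "labels": vm_labels}
dest_pod = {"name": "virt-launcher-legacy-db-fghij", "node": "node-b", "labels": vm_labels}

src_policies = policies_for(source_pod["labels"], POLICIES)
dst_policies = policies_for(dest_pod["labels"], POLICIES)

print(src_policies, dst_policies, src_policies == dst_policies)
```

Because the destination pod's policy set can be computed from labels alone, it can be programmed on the destination node as soon as the pod is known, before any traffic arrives.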

Portability and Standardization for VMs

If you’re familiar with vSphere, you know that vMotion, paired with the vSphere distributed switch, handles live migration networking seamlessly. However, this transparency relies on a vertically integrated stack that is not portable to other cloud environments.

In Kubernetes, the stack is disaggregated. Components like KubeVirt (VM lifecycle), CNI (networking), policy engines (security), and storage operators (disks) each manage their own part. For live migration, the CNI must coordinate with KubeVirt’s migration state machine to manage the VM’s temporary dual-existence across two nodes and converge routing without a centralized controller.

The Kubernetes approach is fundamentally different. It uses open standards: CRI, CNI, CSI, and NetworkPolicy. KubeVirt extends this: VMs are custom resources, managed with kubectl and scheduled by the same control plane. This approach demands a CNI that understands the unique lifecycle, identity, and networking requirements of a pod running a VM, but it also makes VMs portable.

It also means that your containers and VMs can be managed and monitored using the same policies and tools, which brings not only operational efficiency but also better security and more reliable auditing.

Live migration is one piece of a larger networking story. If your KubeVirt rollout involves bridge mode at scale, multi-cluster topologies, BGP peering, or policy parity across VMs and containers, those decisions compound quickly. We pulled the full picture into The Complete Guide to VM Networking for Kubernetes, a practitioner’s reference covering the architectural choices, networking modes, and operational patterns that determine whether a migration ships or stalls.

The post KubeVirt Live Migration Done Right: What it Takes to Run VMs on Kubernetes appeared first on Tigera – Creator of Calico.