
Improving

Originally published at improving.com

Backup and Restore Kubernetes Resources Across vCluster using Velero

In Kubernetes environments, teams are constantly looking for ways to move faster without sacrificing security or efficiency. Managing multiple environments like development, testing, and staging often leads to cluster sprawl, higher costs, and complex maintenance. This is where virtual clusters come in.

Virtual clusters make it possible to create isolated, on-demand Kubernetes environments that share the same underlying infrastructure. They give developers the freedom to spin up their own clusters quickly for testing new features, running experiments, or deploying temporary workloads — all without waiting on cluster admins or consuming extra resources. Each virtual cluster runs its own control plane, offering stronger isolation and flexibility than namespace-based setups. We'll be using vCluster, an implementation of virtual clusters by Loft, to illustrate the concept in practice.

Managing workloads across multiple virtual clusters is a common pattern in multi-tenant environments. However, while virtual clusters make isolation easy, moving workloads between them is not straightforward. That's where Velero comes in: it is a Kubernetes backup and restore tool that can also be used to migrate workloads from one virtual cluster to another.

In this blog post, we'll look at why backups matter, explain how Velero works, and walk through a practical migration of resources using Velero, from backing up one virtual cluster to restoring it in another.


What is Velero?

Velero is an open source tool to back up and restore your Kubernetes cluster resources and persistent volumes. You can run Velero with a cloud provider or on-premises.

Velero lets you:

  • Take backups of your cluster and restore in case of loss
  • Migrate cluster resources to other clusters
  • Replicate your production cluster to development and testing clusters

Velero consists of:

  1. Velero CLI

    • Runs on your local machine.
    • Used to create, schedule, and manage backups and restores.
  2. Kubernetes API Server

    • Receives backup requests from the Velero CLI.
    • Stores Velero custom resources (like Backup) in etcd.
  3. Velero Server (BackupController)

    • Runs inside the Kubernetes cluster.
    • Watches the Kubernetes API for Velero backup requests.
    • Collects Kubernetes resource data and triggers backups.
  4. Cloud Provider / Object Storage

    • Stores backup data and metadata.
    • Creates volume snapshots using the cloud provider's API (e.g., Azure Disk Snapshots).

How it works:

  1. User runs a Velero backup command using the CLI: velero backup create my-backup
  2. CLI creates a backup request in Kubernetes
  3. The Velero server detects the request and gathers cluster resources
  4. Backup data is uploaded to cloud object storage
  5. Persistent volumes are backed up using cloud snapshots (if enabled)
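
The request created in step 2 is an ordinary custom resource, so it can be inspected with kubectl as well as with the Velero CLI. The following is a minimal sketch, assuming Velero is installed in the velero namespace; the helper name backup_phase is hypothetical and only used for illustration:

```shell
# Print the phase (for example InProgress, Completed, PartiallyFailed) that
# the Velero server records on a Backup custom resource.
# Assumption: Velero is installed in the "velero" namespace.
backup_phase() {
  backup_name="$1"
  kubectl get backups.velero.io "$backup_name" \
    --namespace velero \
    --output jsonpath='{.status.phase}'
}

# Usage (requires a cluster with Velero installed):
#   velero backup create my-backup
#   backup_phase my-backup
```

Reading the phase from the custom resource is handy in scripts, while velero backup describe remains the more readable option for interactive use.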

Velero supports a variety of storage providers for different backup and snapshot operations. In this blog post, we will focus on the Azure provider.


What is vCluster?

vCluster lets you build virtual clusters: certified Kubernetes distributions that run as isolated, virtual environments within a physical host cluster. They enhance isolation and flexibility in multi-tenant Kubernetes setups. Multiple teams can work independently on shared infrastructure, helping minimize conflicts, increase team autonomy, and reduce infrastructure costs.

A virtual cluster:

  • Runs inside a namespace of the host cluster
  • Has an API server, control plane, and syncer
  • Maintains its own set of Kubernetes resources, operating like a full cluster

Why Backup and Migrate Workloads Using vCluster?

Common reasons to back up or migrate workloads between vClusters:

  • Promoting apps from dev to staging or prod: Backing up and restoring workloads between vClusters allows smooth promotion of applications across environments, ensuring consistent configurations and deployments without manual rework.
  • Replicating test environments: It helps recreate identical test setups quickly, enabling developers to reproduce issues, validate fixes, or test new features in isolated environments.
  • Disaster recovery (DR) setup: Regular backups across vClusters ensure business continuity by allowing workloads to be restored rapidly in another cluster if the primary one fails.
  • Tenant migration in multi-tenant environments: vClusters make it easier to move tenants between isolated environments without affecting others, maintaining data security and minimizing downtime.
  • Cluster version upgrades or deprecations: When upgrading or decommissioning a cluster, backing up workloads to another vCluster ensures a seamless transition without losing data or configurations.

Why Use Velero with vCluster?

Virtual clusters built with vCluster are lightweight and isolated, but they don't provide built-in mechanisms for backing up workloads, restoring them, or moving applications between clusters. Without a backup solution, recovery and migration can be risky.

Using Velero with vCluster fills this gap by enabling simple backup, restore, and migration workflows directly inside virtual clusters. It allows you to move applications between clusters with minimal setup and perform migrations with little to no downtime, especially for stateless workloads.


How to Backup and Migrate Workloads Between vClusters

Let's see how to use Velero to back up workloads from one vCluster and restore them into another. Think of it as moving your app from dev to staging across two vClusters, each running on a different Azure-hosted Kubernetes cluster.


Prerequisites

Before starting, make sure you have the following:

  • Two clusters up and running on Azure (any cloud offering works)
  • Two running vClusters (source and destination)
  • Velero CLI installed on your machine, along with the Azure CLI (az), kubectl, and Helm, which the steps below rely on

Step-by-step Guide

In both the source and destination vClusters, we will install Velero with the same configuration. We will then deploy a sample MySQL Pod in the source vCluster, back it up there, and restore it in the destination vCluster. We will be using the Azure provider to run Velero.

To set up Velero on Azure, you have to:

  • Create an Azure storage account and blob container
  • Get the resource group details
  • Set permissions for Velero

Velero needs access to your Azure storage account to upload and retrieve backups. You'll need to assign the "Storage Blob Data Contributor" role (or equivalent) to the identity or service principal Velero uses, ensuring it can read, write, and manage backup data in the blob container.

1. Create Azure Resources

Create a resource group:

AZURE_RESOURCE_GROUP=<YOUR_RESOURCE_GROUP>
az group create --name $AZURE_RESOURCE_GROUP --location <YOUR_LOCATION>

Create the storage account:

AZURE_STORAGE_ACCOUNT=<YOUR_STORAGE_ACCOUNT>
az storage account create \
  --name $AZURE_STORAGE_ACCOUNT \
  --resource-group $AZURE_RESOURCE_GROUP \
  --sku Standard_GRS \
  --encryption-services blob \
  --https-only true \
  --kind BlobStorage \
  --access-tier Hot

Create a blob container:

BLOB_CONTAINER=velero
az storage container create \
  --name $BLOB_CONTAINER \
  --public-access off \
  --account-name $AZURE_STORAGE_ACCOUNT

2. Create a Service Principal with Contributor Privileges

AZURE_SUBSCRIPTION_ID=$(az account list --query '[?isDefault].id' -o tsv)
AZURE_TENANT_ID=$(az account list --query '[?isDefault].tenantId' -o tsv)

az ad sp create-for-rbac \
  --name "velero" \
  --role "Contributor" \
  --scopes /subscriptions/$AZURE_SUBSCRIPTION_ID \
  --query '{clientId: appId, clientSecret: password, tenantId: tenant}'

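This outputs clientId, clientSecret, and tenantId. Store these values; the client secret cannot be retrieved again later. The subscription ID is already available in AZURE_SUBSCRIPTION_ID.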

Get the Client ID and store it in a variable:

AZURE_CLIENT_ID=$(az ad sp list --display-name "velero" --query '[0].appId' -o tsv)

Grant the service principal blob data access on the storage account:

az role assignment create \
  --assignee $AZURE_CLIENT_ID \
  --role "Storage Blob Data Contributor" \
  --scope /subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$AZURE_RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$AZURE_STORAGE_ACCOUNT
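
To confirm that the role assignment landed on the intended storage account, the scope string can be built once and reused for a lookup. This is a sketch under the assumption that the AZURE_* variables from the previous steps are still set in the shell; the function names storage_account_scope and list_velero_roles are hypothetical:

```shell
# Build the ARM resource ID of the storage account. This is the same string
# that is passed to --scope in the role assignment above.
storage_account_scope() {
  subscription_id="$1"
  resource_group="$2"
  storage_account="$3"
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$subscription_id" "$resource_group" "$storage_account"
}

# List the role names assigned to the service principal at that scope.
# Assumption: AZURE_CLIENT_ID, AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP and
# AZURE_STORAGE_ACCOUNT are set as in the earlier steps.
list_velero_roles() {
  az role assignment list \
    --assignee "$AZURE_CLIENT_ID" \
    --scope "$(storage_account_scope "$AZURE_SUBSCRIPTION_ID" "$AZURE_RESOURCE_GROUP" "$AZURE_STORAGE_ACCOUNT")" \
    --query '[].roleDefinitionName' \
    --output tsv
}
```

Running list_velero_roles should print Storage Blob Data Contributor once the assignment has propagated, which can take a short while.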

3. Prepare Credentials

With the output received above, create two secrets for the Velero setup: bsl-creds and cloud-creds.

  • bsl-creds: credentials for the BSL (Backup Storage Location), the blob container where Velero stores backups. Velero needs this secret to access the storage location.
  • cloud-creds: the service principal credentials Velero uses to call Azure APIs, for example when creating volume snapshots.

You will need the following values:

AZURE_SUBSCRIPTION_ID=<YOUR_SUBSCRIPTION_ID>
AZURE_TENANT_ID=<YOUR_TENANT_ID>
AZURE_CLIENT_ID=<YOUR_CLIENT_ID>
AZURE_CLIENT_SECRET=<YOUR_CLIENT_SECRET>
AZURE_RESOURCE_GROUP=<YOUR_RESOURCE_GROUP>
AZURE_CLOUD_NAME=AzurePublicCloud
AZURE_ENVIRONMENT=AzurePublicCloud

4. Log in to vCluster and Create Velero Namespace

kubectl create namespace velero
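
The login itself is not shown above. One way to do it is with the vCluster CLI, sketched below. The vCluster name and host namespace passed to the function are hypothetical placeholders, and the function name prepare_velero_namespace is made up for this example. Depending on the vCluster CLI version, vcluster connect may keep a port-forward running in the foreground; in that case run the kubectl step from a second terminal.

```shell
# Switch the current kubeconfig context to a virtual cluster, then create the
# namespace Velero will be installed into.
# Assumptions: $1 is the name of your vCluster and $2 is the host-cluster
# namespace it runs in (both are placeholders you must supply).
prepare_velero_namespace() {
  vcluster_name="$1"
  host_namespace="$2"
  vcluster connect "$vcluster_name" --namespace "$host_namespace" || return 1
  # Rendering the namespace with --dry-run=client and piping it to apply
  # makes this step safe to re-run.
  kubectl create namespace velero --dry-run=client --output yaml | kubectl apply -f -
}

# Usage:
#   prepare_velero_namespace <YOUR_VCLUSTER_NAME> <YOUR_HOST_NAMESPACE>
```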

5. Create BSL and Cloud Credentials

bsl-creds.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: bsl-creds
  namespace: velero
type: Opaque
data:
  cloud: <BASE64_ENCODED_VALUE>
  # Encode the following as base64:
  # [default]
  # storageAccount: <YOUR_STORAGE_ACCOUNT>
  # storageAccountKey: <YOUR_STORAGE_ACCOUNT_KEY>
  # subscriptionId: <YOUR_SUBSCRIPTION_ID>
  # resourceGroup: <YOUR_RESOURCE_GROUP>

cloud-creds.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: cloud-creds
  namespace: velero
type: Opaque
data:
  cloud: <BASE64_ENCODED_VALUE>
  # Encode the following as base64:
  # AZURE_SUBSCRIPTION_ID=<YOUR_SUBSCRIPTION_ID>
  # AZURE_TENANT_ID=<YOUR_TENANT_ID>
  # AZURE_CLIENT_ID=<YOUR_CLIENT_ID>
  # AZURE_CLIENT_SECRET=<YOUR_CLIENT_SECRET>
  # AZURE_RESOURCE_GROUP=<YOUR_RESOURCE_GROUP>
  # AZURE_CLOUD_NAME=AzurePublicCloud
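
The base64 value for the cloud key can be generated from the shell rather than by hand. The sketch below assumes the AZURE_* variables from step 3 are set; the function names and the credentials-velero file name are illustrative choices, not something the chart requires:

```shell
# Write the service principal settings in KEY=VALUE form.
# Assumption: the AZURE_* variables from step 3 are set in this shell.
write_cloud_creds() {
  creds_file="$1"
  cat > "$creds_file" <<EOF
AZURE_SUBSCRIPTION_ID=${AZURE_SUBSCRIPTION_ID}
AZURE_TENANT_ID=${AZURE_TENANT_ID}
AZURE_CLIENT_ID=${AZURE_CLIENT_ID}
AZURE_CLIENT_SECRET=${AZURE_CLIENT_SECRET}
AZURE_RESOURCE_GROUP=${AZURE_RESOURCE_GROUP}
AZURE_CLOUD_NAME=${AZURE_CLOUD_NAME:-AzurePublicCloud}
EOF
}

# Base64-encode a file on a single line, ready to paste into data.cloud.
encode_creds() {
  # tr strips the line wrapping that some base64 implementations add.
  base64 < "$1" | tr -d '\n'
}

# Usage:
#   write_cloud_creds credentials-velero
#   encode_creds credentials-velero
```

As an alternative to pasting the encoded value into a manifest, kubectl create secret generic cloud-creds --namespace velero --from-file=cloud=credentials-velero builds the same Secret and handles the encoding itself.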

Apply the secrets:

kubectl apply -f bsl-creds.yaml -n velero
kubectl apply -f cloud-creds.yaml -n velero

6. Install Velero Using Helm

Use the following values.yaml. Both the source and destination vClusters use the same file:

configuration:
  backupStorageLocation:
    - name: default
      provider: azure
      bucket: velero
      config:
        resourceGroup: <YOUR_RESOURCE_GROUP>
        storageAccount: <YOUR_STORAGE_ACCOUNT>
        subscriptionId: <YOUR_SUBSCRIPTION_ID>
      credential:
        name: bsl-creds
        key: cloud

  volumeSnapshotLocation:
    - name: default
      provider: azure
      config:
        resourceGroup: <YOUR_RESOURCE_GROUP>
        subscriptionId: <YOUR_SUBSCRIPTION_ID>
      credential:
        name: cloud-creds
        key: cloud

credentials:
  useSecret: true
  existingSecret: cloud-creds

deployNodeAgent: true

nodeAgent:
  podVolumePath: /var/lib/kubelet/pods
  privileged: true
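
The values file above does not list the Azure provider plugin. With the vmware-tanzu chart, provider plugins are normally added through an initContainers entry, so a setup like this one will likely need something along the following lines. This is a sketch, not the author's configuration: the file name plugin-values.yaml is arbitrary and PLUGIN_TAG is a placeholder for the plugin release that matches your Velero version.

```shell
# Write an extra Helm values file that adds the Azure plugin as an init
# container. PLUGIN_TAG is a placeholder (assumption): pick the plugin release
# that matches your Velero version from the plugin's compatibility table.
PLUGIN_TAG="${PLUGIN_TAG:-<PLUGIN_TAG>}"

cat > plugin-values.yaml <<EOF
initContainers:
  - name: velero-plugin-for-microsoft-azure
    image: velero/velero-plugin-for-microsoft-azure:${PLUGIN_TAG}
    volumeMounts:
      - mountPath: /target
        name: plugins
EOF

# Pass it alongside the main values file, for example:
#   helm install velero vmware-tanzu/velero --namespace velero \
#     -f values.yaml -f plugin-values.yaml
```

Keeping the plugin in a separate values file leaves the shared values.yaml untouched; the same block can equally be merged into values.yaml.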

Add the chart repository (if you have not already) and install the Helm chart:

helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update
helm install velero vmware-tanzu/velero \
  --namespace velero \
  -f values.yaml

Once installed, you will see velero and node-agent pods running in the velero namespace:

kubectl get pods -n velero
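
Running pods do not guarantee that Velero can reach the blob container, so it is worth checking the backup storage location as well. A minimal sketch, assuming the location is named default as in values.yaml; the helper names are hypothetical:

```shell
# Print the phase Velero reports for a backup storage location.
# Assumption: Velero runs in the "velero" namespace.
bsl_phase() {
  bsl_name="${1:-default}"
  kubectl get backupstoragelocations.velero.io "$bsl_name" \
    --namespace velero \
    --output jsonpath='{.status.phase}'
}

# Succeed only when the storage location is reported as Available.
require_bsl_available() {
  phase="$(bsl_phase "$1")"
  if [ "$phase" = "Available" ]; then
    echo "backup storage location is Available"
  else
    echo "backup storage location phase: ${phase:-unknown}" >&2
    return 1
  fi
}

# Usage:
#   require_bsl_available default
```

The same information is shown interactively by velero backup-location get.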

Repeat the same Velero installation steps in the destination vCluster.


Backup and Restore a Sample MySQL Pod

Deploy MySQL in Source vCluster

mysql-pod.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
  namespace: default
  labels:
    app: mysql
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: rootpassword
        - name: MYSQL_DATABASE
          value: testdb
      volumeMounts:
        - name: mysql-storage
          mountPath: /var/lib/mysql
  volumes:
    - name: mysql-storage
      persistentVolumeClaim:
        claimName: mysql-pvc

Apply the manifest:

kubectl apply -f mysql-pod.yaml

Add Test Data

Exec into the pod:

kubectl exec -it mysql-pod -- /bin/bash

Run the following commands inside the pod to add test files:

echo "test data 1" > /var/lib/mysql/test1.txt
echo "test data 2" > /var/lib/mysql/test2.txt

This creates test1.txt and test2.txt.

Take a Backup

velero backup create mysql-backup \
  --include-namespaces default \
  --default-volumes-to-fs-backup \
  --wait

Check backup status:

velero backup get

The backup status should show Completed.


Restore in Destination vCluster

Update values.yaml for Destination

Make sure the Velero config is the same as the source. Use the same values.yaml, but set the backup storage location's access mode to read-only so the destination cannot modify the source's backups:

# Change these in values.yaml for destination cluster
configuration:
  backupStorageLocation:
    - name: default
      # Keep all values the same as source — point to the same blob container
      accessMode: ReadOnly   # Destination reads from source's storage

After Velero is installed at the destination vCluster, verify you can see the source backups:

velero backup get

You will see the same backup list as the source vCluster.

Create a Restore

restore.yaml:

apiVersion: velero.io/v1
kind: Restore
metadata:
  name: mysql-restore
  namespace: velero
spec:
  backupName: mysql-backup
  includedNamespaces:
    - default
  restorePVs: true
  itemOperationTimeout: 4h

Apply the restore:

kubectl apply -f restore.yaml -n velero

Check restore status:

velero restore get
velero restore describe mysql-restore --details

To verify the restore, attach the PVC (created after restore completes) to a pod, exec into it, and confirm the data (test1.txt and test2.txt) is present.
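
Because the backup covered the whole default namespace, the restore should also recreate mysql-pod with the volume mounted, in which case the check can be run against that pod directly. A sketch under that assumption; the function name verify_restored_file is hypothetical:

```shell
# Compare the content of a restored file with the value written before the
# backup was taken.
# Assumption: the restored mysql-pod is Running in the "default" namespace and
# mounts the restored PVC at /var/lib/mysql.
verify_restored_file() {
  file_name="$1"
  expected="$2"
  actual="$(kubectl exec mysql-pod --namespace default -- cat "/var/lib/mysql/${file_name}")" || return 1
  [ "$actual" = "$expected" ]
}

# Usage:
#   verify_restored_file test1.txt "test data 1" && \
#   verify_restored_file test2.txt "test data 2" && echo "restore verified"
```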


Troubleshooting Tips

Issue 1: Backup status is PartiallyFailed or FailedValidation

Solution: Describe the backup for details:

velero backup describe mysql-backup --details

Check the backup logs:

velero backup logs mysql-backup

If nothing useful appears, check the Velero pod logs:

kubectl logs -n velero deployment/velero | grep mysql-backup

After running the above three commands, you'll likely find the root cause. Common causes include permission issues or incorrect credentials. Sometimes partial failures occur because the node-agent pod isn't running on a node — in that case, manually schedule a pod on that node.


Issue 2: Node Agent Pod is Not Running

node-agent-xxxxx   0/1   Pending   0   5m

Solution: There is a node with no pods running on it, so the node-agent DaemonSet pod is also not scheduled. Manually schedule a sample pod on that node to trigger scheduling. Once a sample pod is running, the node-agent pod will also be scheduled and start running.
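
To see whether every node-agent pod is up before retrying a backup, the DaemonSet status can be compared directly. A minimal sketch, assuming the DaemonSet is named node-agent in the velero namespace (which matches the pod name shown above); the helper names are hypothetical:

```shell
# Print "<desired> <ready>" pod counts for the node-agent DaemonSet.
node_agent_status() {
  kubectl get daemonset node-agent --namespace velero \
    --output jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}'
}

# Succeed only when every desired node-agent pod is ready.
node_agent_ready() {
  status="$(node_agent_status)" || return 1
  desired="${status%% *}"   # text before the first space
  ready="${status##* }"     # text after the last space
  [ -n "$desired" ] && [ "$desired" = "$ready" ]
}

# Usage:
#   node_agent_ready || kubectl get pods --namespace velero --output wide
```

For a pod stuck in Pending, kubectl describe pod on that pod shows the scheduler events that explain why it has not been placed.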


Issue 3: Restore Fails Without Specific Errors

Solution: Restart the restore process from scratch:

  1. Delete all resources created by the restore job (pods, statefulsets, deployments, PVCs, etc.). If you restored a whole namespace, you can delete the entire restored namespace instead.

  2. Delete the restore job:

velero restore delete mysql-restore

  3. After the restore job is deleted, ArgoCD (if used) will automatically sync and recreate the restore job, triggering the Velero restoration. If you are not using ArgoCD, re-apply restore.yaml yourself to trigger the restore again.

Conclusion

Using Velero to back up and restore workloads across vClusters provides a robust and flexible approach for managing multi-tenant Kubernetes environments. Whether you're migrating applications between development and production, setting up disaster recovery, or replicating environments for testing, Velero simplifies the process significantly.

In this blog post, we explored how to back up and restore Kubernetes clusters using Velero. While the process is straightforward in principle, production environments can introduce added complexity — factors like cluster size, workloads, and configurations often make a difference.


