by Fabian Kramm
Virtual Kubernetes clusters are fully functional Kubernetes clusters that run within another Kubernetes cluster. The difference between a regular Kubernetes namespace and a virtual cluster is that a virtual cluster has its own separate Kubernetes control plane and storage backend. Only a handful of core resources, such as pods and services, are actually shared between the virtual cluster and the host cluster. All other resources, such as CRDs, statefulsets, deployments, webhooks, and jobs, exist only in the virtual cluster.
This provides much better isolation than a regular Kubernetes namespace and reduces the pressure on the host cluster, because most API requests to the virtual cluster never reach the host cluster at all. In addition, all resources created by the virtual cluster are tied to a single namespace in the host cluster, no matter which virtual cluster namespace you create them in.
With version v0.3.0, vcluster, an open source implementation of the virtual Kubernetes cluster pattern that builds on the lightweight Kubernetes distribution k3s, became a certified Kubernetes distribution and is 100% Kubernetes API compatible. This makes virtual clusters even more interesting to use.
In essence, virtual Kubernetes clusters are a trade-off between namespaces and separate Kubernetes clusters. They are easier and cheaper to create than full-blown clusters, but they are not as well isolated as completely separate clusters, since they still interact with the host Kubernetes cluster and create the actual workloads in it. On the other hand, they provide much better isolation than namespaces: a virtual cluster uses a completely separate control plane, and within it you have full cluster-wide access. Nonetheless, a single namespace is still cheaper and easier to create.
The table below summarizes the differences between namespaces, virtual Kubernetes clusters, and fully separate clusters.
The important takeaway is that virtual Kubernetes clusters provide a new alternative to both namespaces and separate clusters. They offer an excellent opportunity to replace separate clusters and drastically reduce your infrastructure and management costs, especially in scenarios where you have at least basic trust in your tenants (say, separate teams across your company, CI/CD pipelines, or even several trusted customers).
Let's say you are a company that provides a SaaS product, and you have around 100 developers distributed across 20 teams that implement different parts of it. For each of those 20 teams, you provisioned a separate Kubernetes cluster to test and develop the application, as this was the easiest and most flexible approach. Each team’s cluster has at least three nodes to guarantee availability and then automatically scales up and down based on usage.
Your minimum infrastructure bill in Google Cloud might look like this over 12 months (according to the Google Cloud Pricing Calculator):
20 Clusters * 3 Nodes (n1-standard-1) = 12 * 20 * $72.82 = $17,476.80
GKE Management Cost:
20 Clusters (Zonal) = 12 * 20 * $71.60 = $17,184
Total Cost Per Year (Without Traffic etc.):
$17,476.80 + $17,184 = $34,660.80
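The arithmetic above can be reproduced with a short Python sketch. The monthly prices are the ones quoted in this article, not live values from the pricing calculator, and the constant and function names are made up for illustration. `Decimal` is used so the dollar amounts stay exact.

```python
from decimal import Decimal

MONTHS = 12
CLUSTERS = 20

# Monthly prices in USD, copied from the estimate above (not fetched live).
NODES_PER_CLUSTER_MONTHLY = Decimal("72.82")    # three n1-standard-1 nodes
GKE_FEE_PER_CLUSTER_MONTHLY = Decimal("71.60")  # zonal cluster management fee


def yearly_cost(monthly_price_per_cluster, clusters=CLUSTERS, months=MONTHS):
    """Return the yearly cost of one line item across all clusters."""
    return months * clusters * monthly_price_per_cluster


node_cost = yearly_cost(NODES_PER_CLUSTER_MONTHLY)
management_cost = yearly_cost(GKE_FEE_PER_CLUSTER_MONTHLY)
total_cost = node_cost + management_cost

print(f"Node cost:       ${node_cost:,.2f}")
print(f"Management cost: ${management_cost:,.2f}")
print(f"Total per year:  ${total_cost:,.2f}")
```

Changing `CLUSTERS` or the monthly prices shows how quickly the fixed per-cluster costs add up as the number of teams grows.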
In total, you are looking at a minimum estimated raw node and management cost of about $35,000 per year. Obviously, you could still fine-tune certain aspects here, for example by reducing the minimum node pool size or using preemptible nodes instead of regular nodes.
The advantages of this setup are clear. Each team has its own separate Kubernetes cluster and endpoint to work with and can install cluster-wide resources and dependencies (such as a custom service mesh, ingress controller, or monitoring solution). On the other hand, the cost is quite high, resource sharing across teams is rather difficult, and there is a huge overhead if certain clusters are not used at all.
This is a perfect example where virtual Kubernetes clusters could come in handy. Instead of 20 different GKE clusters, you would create a single GKE Kubernetes cluster and then deploy 20 virtual Kubernetes clusters within it. Now each team gets access to only a single virtual Kubernetes cluster endpoint that essentially maps to a single namespace in the underlying GKE cluster.
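As a rough sketch of that mapping, the Python snippet below generates one `vcluster create` command per team, each targeting its own host namespace. The `team-01` to `team-20` naming scheme and the helper function are hypothetical, and the snippet only prints the commands rather than running them. Depending on the vcluster CLI version, the host namespace may need to exist before the command is run.

```python
def vcluster_create_commands(team_count):
    """Build one 'vcluster create' command per team.

    Each virtual cluster gets its own host namespace with the same
    (hypothetical) name, e.g. team-01, team-02, ...
    """
    commands = []
    for index in range(1, team_count + 1):
        name = f"team-{index:02d}"
        # Form: vcluster create <vcluster-name> -n <host-namespace>
        commands.append(f"vcluster create {name} -n {name}")
    return commands


# Print the commands for the 20 teams in the example.
for command in vcluster_create_commands(20):
    print(command)
```

Giving each virtual cluster its own host namespace keeps the one-to-one relationship described above, so quotas and network policies on the host can be applied per team.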
The really great part about this is that from the developers’ perspective, nothing has changed. Each team can still install all the cluster services they want within their own virtual cluster, such as their own Istio service mesh, a custom cert-manager version, a Prometheus stack, or a Kafka operator, without affecting the host cluster services. They can essentially use it the same way they used the separate cluster before.
Another benefit is that the setup is now much more resource-efficient. Since the virtual Kubernetes clusters and all of their workloads are also just simple pods in the host GKE cluster, you can leverage the full power of the Kubernetes scheduler. So, for example, if a team is on vacation or is not using the virtual Kubernetes cluster at all, there will be no pods scheduled in the host cluster consuming any resources. In general, this means the node resource utilization of the GKE cluster should now be much better than before.
Another significant advantage of a single GKE cluster with multiple virtual clusters in it is that you, as the infrastructure team, can centralize certain services in the host cluster, such as a central ingress controller, service mesh, metrics, or logging solution, instead of installing them into every separate cluster. The virtual clusters will be able to consume those services, or the teams can still add their own if they prefer. Teams will also be able to access each other’s services if needed, which would be very difficult with completely separate clusters.
Furthermore, you will save on the cloud provider's Kubernetes management fees by sharing resources better. And vcluster is open source, so it’s self-managed and completely free. If you reserved one node for each team and added three extra nodes as a high-availability buffer, the new cost estimate would look more like this:
1 Cluster * 23 Nodes (n1-standard-1) = 12 * $558.26 = $6,699.12
GKE Management Cost:
1 Cluster (Zonal) = 12 * 1 * $71.60 = $859.20
Total Cost Per Year (Without Traffic etc.):
$6,699.12 + $859.20 = $7,558.32
Total Cost Savings:
$34,660.80 - $7,558.32 = $27,102.48 (78.2% savings)
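The savings figure can be checked with a few lines of Python. This is a minimal sketch that takes the monthly prices and the yearly total for the 20-cluster setup directly from this article; the variable names are illustrative only.

```python
from decimal import Decimal

MONTHS = 12

# Figures in USD, copied from the estimates in this article.
SHARED_NODES_MONTHLY = Decimal("558.26")        # 23 n1-standard-1 nodes, one cluster
GKE_FEE_MONTHLY = Decimal("71.60")              # management fee, one zonal cluster
SEPARATE_CLUSTERS_YEARLY = Decimal("34660.80")  # yearly total for 20 separate clusters

# Yearly cost of the single shared cluster: nodes plus management fee.
shared_yearly = MONTHS * (SHARED_NODES_MONTHLY + GKE_FEE_MONTHLY)

# Absolute and relative savings compared to the 20-cluster setup.
savings = SEPARATE_CLUSTERS_YEARLY - shared_yearly
savings_percent = (savings / SEPARATE_CLUSTERS_YEARLY * 100).quantize(Decimal("0.1"))

print(f"Shared cluster per year: ${shared_yearly:,.2f}")
print(f"Savings per year:        ${savings:,.2f} ({savings_percent}%)")
```

Most of the savings come from two places: the node count drops from 60 to 23, and the management fee is paid once instead of 20 times.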
In this case, your minimum bill for nodes and GKE management fees would be cut by 78.2%. This is a somewhat contrived example, but it shows the considerable potential of virtual clusters. For the teams using the virtual clusters, essentially nothing would change because they would still have access to a fully functional Kubernetes cluster where they could deploy their workloads and cluster services freely.
Virtual Kubernetes clusters are a third option if you have to decide between namespaces and separate clusters. Virtual clusters will probably never completely replace the need for separate clusters. Still, if your use case fits, they can save you substantial infrastructure and management costs.