Lukas Gentele for LoftLabs

Posted on Jul 7, 2022 • Originally published at loft.sh

10 Essentials For Kubernetes Multi-Tenancy


Kubernetes’s popularity continues to grow as increasing numbers of companies adopt it to manage their containerized workloads. According to the 2021 annual CNCF survey, ninety-six percent of organizations surveyed are either using or evaluating Kubernetes, the highest share since the survey began in 2016.

While Kubernetes adoption has enabled many businesses to optimize resource usage, it’s still a challenge for multiple entities, or tenants, to share resources securely in a Kubernetes cluster. The resource-sharing practice in which each user’s workload and data are isolated from one another is known as multi-tenancy.

Kubernetes does not offer multi-tenancy by default. Enabling a Kubernetes multi-tenant architecture comes with significant challenges, especially in regard to achieving true cluster isolation and fair resource allocation. In this article, you’ll learn about ten essential considerations when opting to use Kubernetes multi-tenancy.

Introduction to Kubernetes Multi-Tenancy

Kubernetes multi-tenancy is an architectural model where multiple tenants share a cluster’s resources, including storage, networking, compute (CPU and memory), and the control plane. This model also includes resource quotas and limits that act as boundaries for resource usage. Sharing Kubernetes cluster resources helps to reduce costs and enhance productivity through faster and more scalable container-based deployments.

While it’s possible to give every tenant its own single-tenant cluster, this approach is inefficient and hard to manage, particularly at scale. For instance, imagine having to spin up and manage hundreds or thousands of clusters. In a multi-tenancy model, each tenant’s resources are organized into separate groups within a shared control plane, and access to or visibility of resources outside a tenant’s own group can be restricted. This helps you reduce operating costs, increase accessibility and productivity, and boost your return on investment.

Considerations for Kubernetes Multi-Tenancy

A multi-tenant Kubernetes cluster is not without challenges. For example, by implementing multi-tenancy, you automatically impose limitations on the tenants. While the tenants still have access to many resources within their namespaces, they don’t retain the complete control and capabilities they had with single-tenant clusters.

Another potential issue is the noisy neighbor problem. This happens when a tenant takes up an excessive amount of shared resources, leading to less-than-optimal performance for other tenants. This problem can slow down the latter’s workloads, and sometimes even prevent them from running altogether.

The following list covers ten of the most critical considerations when it comes to effective enterprise-grade Kubernetes multi-tenancy.

Resource Limits

Tenants should share resources fairly. A tenant using a disproportionate share of cluster resources can impair performance for other tenants. To prevent this, set resource usage limits for every tenant.

In Kubernetes, this is managed by the ResourceQuota object, which limits the total resource consumption per namespace. You can set quotas on compute resources such as CPU and memory, or on object counts. If a quota on CPU or memory is enabled for a namespace, users must specify requests or limits for those resources on every pod, or the pod creation request will be rejected. Alternatively, administrators can set default values via a LimitRange.

Here’s a sample resource quota file that caps the total CPU requests in a namespace at five cores and the total CPU limits at ten, along with 1 GiB of memory requests and 2 GiB of memory limits.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: user1
spec:
  hard:
    requests.cpu: "5"
    requests.memory: 1Gi
    limits.cpu: "10"
    limits.memory: 2Gi

Quotas like this can be applied to a namespace via the kubectl apply command with the --namespace flag.
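
The LimitRange defaults mentioned above could look roughly like the following. This is a minimal sketch, not a recommended sizing: the name user1-defaults and all CPU and memory values are assumptions chosen for illustration. Containers created in the namespace without their own requests or limits would receive these defaults, so they still satisfy a ResourceQuota that requires them.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: user1-defaults      # hypothetical name
spec:
  limits:
    - type: Container
      # Applied when a container specifies no resource requests
      defaultRequest:
        cpu: 250m
        memory: 128Mi
      # Applied when a container specifies no resource limits
      default:
        cpu: 500m
        memory: 256Mi
```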

Cost Tracking

It can be surprisingly easy to incur unnecessary cloud costs in Kubernetes, particularly in multi-tenant environments. To put this in perspective, over thirty-five percent of respondents in a FinOps for Kubernetes survey have seen their Kubernetes-related expenses rise by more than twenty percent within the space of a year.

The survey report also highlights how difficult it is to monitor Kubernetes costs and to establish practices that curb them. One of the best ways to reduce costs is to allocate them according to the amount of resources each workload consumes. For multi-tenant Kubernetes clusters, it is important to implement a chargeback model that allocates cloud and infrastructure costs among tenants, developer teams, and organizations. This promotes shared responsibility and accountability among tenants in a cluster and makes it easier to bill tenants for the resources they consume. Prometheus, Kubecost, and Grafana are popular tools for monitoring Kubernetes costs.

Audit Logging

Another essential element of Kubernetes multi-tenancy is auditing. A Kubernetes audit log contains a chronological set of records describing the sequence of events in a cluster. These records help you quickly identify and resolve issues such as unusual cluster activity, attacks, slow API requests, and failed authentications. As part of its auditing process, the Kubernetes cluster tracks user, application, and control plane activities.

Logged details include the requests made to the Kubernetes API server, who made each request, the time of the request, its origin and destination, and the reasons for approval or rejection. In Loft, audit logging can be configured via the Loft config in the Loft UI or through the secret loft/loft-config. Other tools to consider alongside audit logging in Kubernetes include Prometheus and Grafana.
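
For plain Kubernetes, what gets recorded is controlled by an audit policy file passed to the kube-apiserver with the --audit-policy-file flag. The following is a minimal sketch under assumptions: the namespace team-a is hypothetical, and the choice of levels is only one reasonable starting point. Rules are evaluated in order, and the first matching rule decides the level.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the early RequestReceived stage to reduce log volume
omitStages:
  - "RequestReceived"
rules:
  # Record only metadata for secrets and configmaps so that
  # their contents never end up in the audit log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Record request bodies for changes made in a tenant namespace
  # (team-a is a hypothetical namespace name)
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    namespaces: ["team-a"]
  # Record metadata (user, timestamp, resource, verb) for everything else
  - level: Metadata
```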

Network Policies

By default, Kubernetes allows any pod to communicate with any other pod or service in any namespace. As a result, pods are open to all traffic. Network policies allow you to specify how pods can interact with various entities on the network.

In essence, network policies allow you to control which ports are accessible to the pods on the network. Network policies help you create default blocking rules. When you’re defining a network policy based on a namespace or pod, you can use a selector to set the traffic that can pass to or from the pod corresponding to that selector.
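
The default blocking rule and selector-based allowance described above can be sketched as two policies. This is an illustration under assumptions: the namespace team-a and both policy names are hypothetical, and the policies only take effect if the cluster’s network plugin enforces NetworkPolicy.

```yaml
# Deny all ingress traffic to every pod in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a          # hypothetical tenant namespace
spec:
  podSelector: {}            # empty selector matches all pods in the namespace
  policyTypes:
    - Ingress
---
# Then allow ingress only from pods in the same namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}    # any pod in team-a
```

Because network policies are additive, the second policy opens traffic between the tenant’s own pods while traffic from other namespaces stays blocked.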

RBAC

Kubernetes Role-Based Access Control (RBAC) is a form of authorization that helps control tenant access to Kubernetes clusters and network resources. What makes RBAC valuable is its flexibility in adding or modifying Kubernetes access control and permissions. It enables you to specify the Kubernetes components available to each user, as well as what that user can do with each component.

The four objects of the RBAC API are Role, ClusterRole, RoleBinding, and ClusterRoleBinding. A Role defines a set of permissions within a given namespace, and a ClusterRole does the same thing for cluster-wide permissions.

RoleBinding and ClusterRoleBinding grant the permissions defined in a role to a set of subjects (users, groups, or service accounts). A RoleBinding grants permissions within a specific namespace and can reference either a Role in that namespace or a ClusterRole, while a ClusterRoleBinding grants the permissions of a ClusterRole across all namespaces in the cluster.
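
As a minimal sketch of how a role and a binding fit together, the manifests below give one user read-only access to pods in a single namespace. The namespace team-a, the user jane, and the object names are all hypothetical.

```yaml
# Namespace-scoped permissions: read-only access to pods and their logs
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: team-a          # hypothetical tenant namespace
rules:
  - apiGroups: [""]          # "" is the core API group
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
# Grant the Role above to a single user within team-a
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: team-a
subjects:
  - kind: User
    name: jane               # hypothetical user name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```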

Virtual Clusters

Sharing Kubernetes clusters among team members and organizations can be so complex that many companies have given up on multi-tenancy in favor of provisioning individual Kubernetes clusters. Namespaces can offer a solution to this problem by running isolated environments in a single Kubernetes cluster—but they’re not without limitations.

Teams or engineers that leverage dedicated namespaces are limited to the RBAC permissions assigned to them. When you factor in network policies and admission control, this often means that engineers have less flexibility and control than they would have with a single dedicated cluster.

This can be limiting if, for example, an application requires cluster-scoped resources that are not namespaced, since the developer can only access them via the host cluster. If you want a truly isolated environment, namespaces are not the best option. Virtual clusters help you create a virtual representation of a Kubernetes cluster inside an existing one.

As the name implies, virtual clusters mimic the concept of virtual machines. They provide you with many of the resources present in a standard Kubernetes cluster. A virtual cluster can contain its own API server, controller manager, and storage (etcd). One notable solution for creating and running virtual clusters is vcluster by Loft Labs.

Pod Security

Pod Security addresses a number of problems with Kubernetes’ deprecated PodSecurityPolicy (PSP) approach. PSPs were cluster-level resources that governed the creation and updating of Kubernetes pods. In practice, however, pods are rarely created directly. They are usually created by a controller such as a Deployment, and working out which policy was authorized for which pod proved confusing, which made PSPs easy to misconfigure.

While PSPs are automatically applied when a pod creation request is made, there is no way to apply them to pods that already exist in the cluster. These shortcomings led to the creation of Pod Security, which reached beta in Kubernetes v1.23 and has since become stable in v1.25. Pod Security is a built-in admission controller that can also run as a standalone webhook, intercepting requests to the kube-apiserver before they are persisted to storage. It evaluates pod specifications against three policy levels defined by the Pod Security Standards:

Privileged: No restrictions, broadest permission level
Baseline: Minimal restrictions that guard against known privilege escalations
Restricted: Heavy restrictions that follow current pod hardening best practices

The policy levels are assigned to namespaces through labels, allowing for fine-grained policy control per namespace. Using the API server’s AdmissionConfiguration resource, you can define cluster-wide admission policies and exemptions. While Pod Security doesn’t offer all the features of the deprecated PSP, you can leverage other solutions such as Kyverno, OPA Gatekeeper, and Kubewarden for more granular policy control.
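
Assigning a policy level through namespace labels can be sketched as follows. The namespace name team-a and the particular mix of levels are assumptions for illustration. The enforce mode rejects violating pods, while the audit and warn modes only record or report violations.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a               # hypothetical tenant namespace
  labels:
    # Reject pods that violate the Baseline level
    pod-security.kubernetes.io/enforce: baseline
    # Add an audit annotation for pods that would violate Restricted
    pod-security.kubernetes.io/audit: restricted
    # Show a warning to the user for pods that would violate Restricted
    pod-security.kubernetes.io/warn: restricted
```

Enforcing a looser level while auditing and warning on a stricter one is a common way to see what would break before tightening the policy.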

Usage Metrics

Understanding how an application behaves is crucial for scalability and reliability. You can observe the performance of your application in a cluster by monitoring and inspecting the pods, services, namespaces, and other resources in your cluster. The Metrics API exposes current CPU and memory usage for nodes and pods, enabling you to assess your application’s performance.

Kubernetes provides detailed data on how an application consumes resources. Using Kubernetes’ access control mechanisms, you can configure permissions to allow Kubernetes API clients to access your cluster’s metrics information.
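
In a multi-tenant cluster, access to metrics can be limited to a tenant’s own namespace with an ordinary RBAC role. The sketch below assumes that a Metrics API implementation such as metrics-server is installed; the namespace team-a and the role name are hypothetical, and the role still has to be bound to a user or service account with a RoleBinding.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metrics-reader       # hypothetical name
  namespace: team-a          # hypothetical tenant namespace
rules:
  # Pod metrics are served under the metrics.k8s.io API group
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods"]
    verbs: ["get", "list"]
```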

Secrets Encryption at Rest

Kubernetes uses secrets to store and manage sensitive cluster data such as passwords, usernames, SSH keys, OAuth tokens, and other credentials. By default, the information in secrets is stored unencrypted in etcd, but encryption is supported for etcd data at rest.

All tenants of a multi-tenant cluster share a single etcd datastore, so anyone who gains access to etcd or its backups can read every tenant’s secrets. It is therefore advisable to encrypt secret data at rest. This adds an extra layer of security to your cluster, protecting sensitive information from breaches and facilitating compliance. There are also many external solutions for storing sensitive data, such as HashiCorp Vault and AWS Secrets Manager.
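
Encryption at rest is switched on by passing an encryption configuration file to the kube-apiserver with the --encryption-provider-config flag. The following is a minimal sketch: the key name key1 is arbitrary, the key itself is deliberately left as a placeholder, and aescbc is just one of the available providers.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # The first provider is used to encrypt newly written secrets
      - aescbc:
          keys:
            - name: key1
              # Placeholder: supply your own base64-encoded 32-byte key
              secret: <BASE64-ENCODED-32-BYTE-KEY>
      # Fallback so secrets written before encryption was enabled stay readable
      - identity: {}
```

Secrets that already exist are only encrypted once they are written again, so existing secrets need to be rewritten after the configuration is enabled.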

Policy Engines

Many organizations have applications that cater to a global audience. As a result, they must comply with a variety of regulatory standards such as GDPR, HIPAA, and PCI DSS, as well as any other regulations or policies in the regions and countries they serve. Policy engines play an important role in Kubernetes compliance, security, and configuration management.

They enable you to establish the policies and regulations that govern cluster deployments and applications. Using predefined policies, policy engines can validate incoming configurations and dynamically modify or create them. Policy engines such as Gatekeeper and Kyverno can be leveraged to meet legal and compliance requirements while maintaining operational flexibility and development speed.
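
As an illustration of what such a policy can look like, here is a sketch of a Kyverno validation policy that rejects pods without a tenant label. The policy name, rule name, and label key are hypothetical, and the exact field names (in particular validationFailureAction) vary between Kyverno versions, so treat this as a starting point to check against the version you run.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-tenant-label     # hypothetical policy name
spec:
  # Block non-compliant resources instead of only reporting them
  # (field name and casing differ across Kyverno versions)
  validationFailureAction: Enforce
  rules:
    - name: check-tenant-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label `tenant` is required."
        pattern:
          metadata:
            labels:
              tenant: "?*"       # any non-empty value
```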

Final Thoughts On Kubernetes Multi-Tenancy

Kubernetes multi-tenancy can be incredibly challenging for high-traffic enterprises with security and reliability requirements, particularly considering the limitations and cost of single-tenant solutions. Despite this complexity, there are several features available to assist you.

Knowledge of the key essentials of Kubernetes multi-tenancy—such as virtual clusters, resource limits, cost tracking, RBAC, and audit logging—can get you started. These core elements will help you achieve a high degree of resource sharing and true cluster isolation, regardless of the number of applications running on your Kubernetes cluster.

Photo by Teng Yuhong on Unsplash