In the stone age, almost everybody had the same answer regarding running stateful applications on top of Kubernetes: "Dude, just don't do it!". The main reasons behind this philosophy were that persistent storage in the ecosystem was taking its first baby steps at the time, and nobody had experience with long-running storage systems in Kubernetes. As a storage engineer, I can tell you that data loss is not a joke. The storage system is one of the most critical parts of any environment and needs lots of care before we can call it stable. Just remember that BTRFS was in "testing" for almost a decade. But I think it is time to shift the paradigm (if you haven't done it yet) and start using Kubernetes with its full power. Why do I say that? Because there are nice options on the market for creating replicated persistent volumes or permanent and reusable cloud disks. Ondat.io is one of these solutions; for more info, I suggest watching this nice video. But Ondat isn't the topic of this article :)
Please let me introduce our new open-source declarative disk configuration system: Discoblocks. It is in an early alpha state at the moment, but we are putting significant effort into making it production-ready soon. So USE IT AT YOUR OWN RISK, please!
What is Discoblocks good for?
Discoblocks is a disk and volume manager for Kubernetes helping to automate CRUD (Create, Read, Update, Delete) operations for cloud disk device resources attached to Kubernetes cluster nodes.
Why use Discoblocks?
Discoblocks can be leveraged by a cloud-native data management platform to manage the backend disks in the cloud.
When using such a data management platform to overcome the block device limitations of the hyperscalers, a new set of manual operational tasks needs to be considered.
At the current stage, Discoblocks leverages the available hyperscaler CSI (Container Storage Interface) within the Kubernetes cluster to:
- introduce a CRD (Custom Resource Definition) per workload with:
  - capacity
  - mount path within the Pod
  - nodeSelector
  - podSelector
  - upscale policy
- provision the relevant disk device using the CSI when the workload deployment happens
- monitor the volume(s)
- automatically resize the volume based on the upscale policy
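All of these settings live in a DiskConfig custom resource per workload; a full example comes later in this article. Once Discoblocks is installed, the CRD behaves like any other Kubernetes API object, so plain kubectl is enough to explore it (the plural resource name diskconfigs is an assumption on my part):

# Show the documented fields of the DiskConfig spec
kubectl explain diskconfig.spec
# List the disk configurations in the cluster
kubectl get diskconfigs --all-namespaces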
Currently, the only supported cloud driver is AWS EBS, but more are coming. The CSI driver has a really simple interface in Discoblocks:
// Driver public functionality of a driver
type Driver interface {
    // IsStorageClassValid validates StorageClass
    IsStorageClassValid(*storagev1.StorageClass) error
    // GetPVCStub creates a PersistentVolumeClaim for driver
    GetPVCStub(string, string, string) (*corev1.PersistentVolumeClaim, error)
    // GetCSIDriverPodLabels returns the labels of CSI driver Pod
    GetCSIDriverPodLabels() (string, map[string]string)
}
Can you tell me a real-world use case?
Imagine you have a website and you would like to store media files on a managed persistent volume. In the usual way, you have to create a PersistentVolumeClaim, then open the manifests of the service, look for the workload part, and edit the workload in two places to attach the PVC. Not a big deal, but it requires some manual work, and the question is: how many times do you have to do it? If the answer is exactly 1, the rest of this article is not for you, otherwise please continue reading :)
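To make the comparison concrete, here is roughly what that manual wiring looks like without Discoblocks; the resource and volume names below are purely illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:           # edit no. 1: mount the volume in the container
            - name: media
              mountPath: /usr/share/nginx/html/media
      volumes:                    # edit no. 2: reference the PVC in the Pod spec
        - name: media
          persistentVolumeClaim:
            claimName: media-pvc

Discoblocks removes exactly these edits from your day-to-day workflow.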
How does it work in action?
The first step is to create the cluster:
: ${NAME:?required}
: ${AWS_ACCOUNT_ID:?required}
# Create EKS cluster with EBS
eksctl create cluster --version 1.21 --name $NAME
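Provisioning takes a while. Once eksctl finishes, a quick sanity check is to confirm that the nodes are Ready:

kubectl get nodes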
Enable the EBS addon on the cluster:
eksctl utils associate-iam-oidc-provider --cluster $NAME --approve
issuer=$(aws eks describe-cluster --name $NAME --query 'cluster.identity.oidc.issuer' --output text | sed 's|https://||')
now=$(date +'%s')
cat > aws-ebs-csi-driver-trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$AWS_ACCOUNT_ID:oidc-provider/$issuer"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "$issuer:aud": "sts.amazonaws.com",
          "$issuer:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa"
        }
      }
    }
  ]
}
EOF
aws iam create-role --role-name $NAME-EKS_EBS_CSI_DriverRole-$now --assume-role-policy-document file://'aws-ebs-csi-driver-trust-policy.json'
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy --role-name $NAME-EKS_EBS_CSI_DriverRole-$now
aws eks create-addon --cluster-name $NAME --addon-name aws-ebs-csi-driver --service-account-role-arn arn:aws:iam::$AWS_ACCOUNT_ID:role/$NAME-EKS_EBS_CSI_DriverRole-$now
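The addon needs a little time to become active. As an optional check you can verify both the addon status and the CSI controller Pods (the label selector below is the one the upstream EBS CSI driver ships with, assumed here):

aws eks describe-addon --cluster-name $NAME --addon-name aws-ebs-csi-driver --query 'addon.status'
kubectl get pods -n kube-system -l app=ebs-csi-controller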
Create a StorageClass:
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
EOF
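You can confirm the class was registered with plain kubectl:

kubectl get storageclass ebs-sc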
And finally, install Cert Manager:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml
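cert-manager needs a moment to come up; waiting for its Deployments to become Available is an easy way to make sure of it (an extra step on my part, using only standard kubectl):

kubectl -n cert-manager wait --for=condition=Available deployment --all --timeout=120s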
The cluster is up and running, so it is time for the magic:
kubectl apply -f https://github.com/ondat/discoblocks/releases/download/v0.0.3/discoblocks_v0.0.3.yaml
Please make sure Discoblocks is up and running:
kubectl get deploy -n kube-system discoblocks-controller-manager
NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
discoblocks-controller-manager   1/1     1            1           37s
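Discoblocks also registers its CRD; assuming the usual <plural>.<group> naming, it should show up like this:

kubectl get crd diskconfigs.discoblocks.ondat.io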
Create a disk configuration custom resource:
cat <<EOF | kubectl apply -f -
apiVersion: discoblocks.ondat.io/v1
kind: DiskConfig
metadata:
  name: diskconfig-nginx
  namespace: default
spec:
  # Name of the underlying StorageClass
  storageClassName: ebs-sc
  # Initial capacity of the disk
  capacity: 1Gi
  # Where to mount the disk
  mountPointPattern: /usr/share/nginx/html/media-%d
  # What kind of workloads require the volume
  podSelector:
    app: nginx
  # Resize policies
  policy:
    upscaleTriggerPercentage: 50
    maximumCapacityOfDisk: 2Gi
EOF
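A quick check that the resource was accepted (standard kubectl, nothing special):

kubectl get diskconfig diskconfig-nginx -n default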
As the title mentions, Discoblocks turns volume management upside-down. That means you don't have to change your manifests to attach a volume to the workload; the tool does the magic in the background. So use a plain kubectl command to create a deployment:
kubectl create deployment --image=nginx nginx
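Give the rollout a moment to finish before inspecting the Pod; waiting on it explicitly is an easy way to do that:

kubectl rollout status deployment/nginx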
Check the result:
kubectl describe po -l app=nginx
Name:         nginx-6799fc88d8-dkjtm
Namespace:    default
Containers:
  nginx:
    Mounts:
      /usr/share/nginx/html/media-0 from discoblocks-2064555022-2155331895-2470140894 (rw)
  discoblocks-metrics:
    Image:    bitnami/node-exporter:1.3.1
    Port:     9100/TCP
    Command:
      /opt/bitnami/node-exporter/bin/node_exporter
      --collector.disable-defaults
      --collector.filesystem
It seems Discoblocks has created the disk, attached it to the workload, and added a sidecar for monitoring volume usage.
Well done mate, but how does automatic resize work?
Here is the current state of the underlying PVC:
kubectl get pvc discoblocks-2064555022-2155331895-2470140894
NAME                                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
discoblocks-2064555022-2155331895-2470140894   Bound    pvc-636102fd-be8e-4861-88d2-63f51474f427   1Gi        RWO            ebs-sc         7m11s
So let's generate some data on the volume:
kubectl exec nginx-6799fc88d8-dkjtm -c nginx -- dd if=/dev/zero of=/usr/share/nginx/html/media-0/data count=1000000
Check the mount point:
kubectl exec nginx-6799fc88d8-dkjtm -c nginx -- df -h
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 976M 491M 470M 52% /usr/share/nginx/html/media-0
As you can see, more than 50% of the disk is used, so Discoblocks should resize the volume:
kubectl get pvc discoblocks-2064555022-2155331895-2470140894
NAME                                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
discoblocks-2064555022-2155331895-2470140894   Bound    pvc-636102fd-be8e-4861-88d2-63f51474f427   2Gi        RWO            ebs-sc         10m
It did. And here is how it looks on the application side:
kubectl exec nginx-6799fc88d8-dkjtm -c nginx -- df -h
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 2.0G 492M 1.5G 25% /usr/share/nginx/html/media-0
Hurray!!!
Please feel free to give it a try. Don't hesitate to share your experiences and thoughts with us. And finally: enjoy the new world of volume configuration!