
Discussion on: Using Rook / Ceph with PVCs on Azure Kubernetes Service

porrascarlos802018 • Edited

I think this method is great, but it needs some updates to get the pods running.

A peer, Sergio Turrent, updated the cluster.yaml to get it working on a best-effort basis.

As of 11/06/2020, our proposed cluster.yaml is as follows:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
    volumeClaimTemplate:
      spec:
        storageClassName: managed-premium
        resources:
          requests:
            storage: 10Gi
  cephVersion:
    image: ceph/ceph:v15.2.4
    allowUnsupported: false
  dashboard:
    enabled: true
    ssl: true
  network:
    hostNetwork: false
  placement:
    mon:
      tolerations:
      - key: storage-node
        operator: Exists
  storage:
    storageClassDeviceSets:
    - name: set1
      # The number of OSDs to create from this device set
      count: 4
      # IMPORTANT: If volumes specified by the storageClassName are not portable across nodes
      # this needs to be set to false. For example, if using the local storage provisioner
      # this should be false.
      portable: true
      # Since the OSDs could end up on any node, an effort needs to be made to spread the OSDs
      # across nodes as much as possible. Unfortunately the pod anti-affinity breaks down
      # as soon as you have more than one OSD per node. If you have more OSDs than nodes, K8s may
      # choose to schedule many of them on the same node. What we need is the Pod Topology
      # Spread Constraints, which is alpha in K8s 1.16. This means that a feature gate must be
      # enabled for this feature, and Rook also still needs to add support for this feature.
      # Another approach for a small number of OSDs is to create a separate device set for each
      # zone (or other set of nodes with a common label) so that the OSDs will end up on different
      # nodes. This would require adding nodeAffinity to the placement here.
      placement:
        tolerations:
        - key: storage-node
          operator: Exists
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: agentpool
                operator: In
                values:
                - npstorage
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd-prepare
              topologyKey: kubernetes.io/hostname
      resources:
        limits:
          cpu: "500m"
          memory: "4Gi"
        requests:
          cpu: "500m"
          memory: "2Gi"
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: managed-premium
          volumeMode: Block
          accessModes:
          - ReadWriteOnce
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
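Once this spec is applied, the mon and OSD volumeClaimTemplates should materialize as PVCs bound to managed-premium. A quick way to confirm, assuming the rook-ceph namespace used above:

kubectl -n rook-ceph get pvc
# Expect one PVC per mon (10Gi) and one per OSD in set1 (100Gi), all on managed-premium.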

porrascarlos802018

To summarize, the step-by-step procedure I used to get the pods running is:

Step 1: create a node pool in AKS:
az aks nodepool add --cluster-name aks3del --name npstorage --node-count 2 --resource-group aks3del --node-taints storage-node=true:NoSchedule
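To double-check that the new pool carries the expected taint before going further, something like the following should work (a sketch; the nodeTaints query field is an assumption about the az CLI output):

az aks nodepool show --cluster-name aks3del --name npstorage --resource-group aks3del --query nodeTaints
# Should return ["storage-node=true:NoSchedule"], matching the tolerations in cluster.yaml.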

Step 2: get the cluster credentials:
az aks get-credentials --resource-group aks3del --name aks3del
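A small sanity check that kubectl now points at the right cluster (assuming the merged context is named after the cluster):

kubectl config current-context
# Expect something like "aks3del"; kubectl cluster-info should also answer without errors.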

Step 3: verify that the nodes are visible:
kubectl get nodes
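Beyond listing the nodes, it is worth confirming the label and taint that the cluster.yaml placement relies on. A sketch, assuming AKS applies the agentpool=npstorage label to the new pool:

kubectl get nodes -l agentpool=npstorage
# The npstorage nodes should also show the storage-node taint:
kubectl describe nodes -l agentpool=npstorage | grep -i taint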

Step 4: clone the Rook repository:
git clone https://github.com/rook/rook.git
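Cloning master may drift from the manifests this comment was written against; a hypothetical pin to a release branch (assuming release-1.4 is the line shipping Ceph v15.2.x) would look like:

git -C rook checkout release-1.4
# Optional; only if you want manifests that match the ceph/ceph:v15.2.4 image above.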

Step 5: if you are in /home/user, a new folder called rook should now be there. Change into it:
cd rook

Step 6: switch to the cluster/examples/kubernetes/ceph directory and follow the steps below.
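A quick listing to confirm the manifests referenced in the next steps are present (file names can differ slightly between Rook releases; this assumes a 1.4-era checkout):

cd cluster/examples/kubernetes/ceph
ls common.yaml operator.yaml cluster-on-pvc.yaml toolbox.yaml
# All four should exist; cluster-on-pvc.yaml is the upstream starting point for the cluster2.yaml used later.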

Step 7: from the cluster/examples/kubernetes/ceph directory, run:
kubectl apply -f common.yaml
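To verify that step 7 worked before moving on (assuming this Rook version bundles the namespace, CRDs and RBAC in common.yaml):

kubectl get namespace rook-ceph
kubectl get crd | grep ceph.rook.io
# Both should return results; if not, re-apply common.yaml.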

Step 8: create a new operator file; do not use the operator.yaml from the directory. Create a new one and apply it:
vi operator2.yaml
kubectl apply -f operator2.yaml
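Before creating the cluster, it helps to confirm the operator came up; a sketch, assuming the standard app=rook-ceph-operator label:

kubectl -n rook-ceph get pods -l app=rook-ceph-operator
# Wait until the operator pod is Running before applying cluster2.yaml.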

Step 9: validate that you have a storage class for premium SSD disks:
kubectl get storageclass
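On the AKS versions we used, managed-premium exists out of the box. If it is missing on your cluster, a hedged example of an equivalent class using the in-tree Azure disk provisioner (name and parameters are assumptions; adjust to your environment):

kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed
EOF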

Step 10: create a new cluster2.yaml file, copy and paste the proposal above, and apply it:
kubectl apply -f cluster2.yaml
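After applying, the CephCluster resource itself reports progress; a quick check (column names vary a bit by Rook version):

kubectl -n rook-ceph get cephcluster rook-ceph
# Watch the PHASE/HEALTH columns move toward Ready / HEALTH_OK as the OSDs come up.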

Step 11: validate that the OSD pods were created:
kubectl get pods -n rook-ceph
You will see some pods in Init state; wait a while and they will eventually start.

Each rook-ceph-osd pod mounts its own disk through a PVC, and one OSD runs per node.
If a node goes down and its disk is lost, the other nodes hold enough data to rebuild on a new node.
The rook-ceph-mon-X pods control the replication logic, deciding where data must be copied for redundancy.
When data is accessed, the mons are the ones that tell the "client" where to retrieve it from.
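To actually see the mons and OSDs described above, the Rook toolbox is the easiest route. A sketch, assuming toolbox.yaml from the same examples directory and the usual rook-ceph-tools deployment name:

kubectl apply -f toolbox.yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
# Once the cluster settles, ceph status should report 3 mons in quorum and 4 OSDs up/in.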