I think this method is great, but it needed some updates to get the pods running. A peer, Sergio Turrent, updated the cluster.yaml to get it working on a best-effort basis. As of 11/06/2020, our cluster.yaml proposal is as follows:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
    volumeClaimTemplate:
      spec:
        storageClassName: managed-premium
        resources:
          requests:
            storage: 10Gi
  cephVersion:
    image: ceph/ceph:v15.2.4
    allowUnsupported: false
  dashboard:
    enabled: true
    ssl: true
  network:
    hostNetwork: false
  placement:
    mon:
      tolerations:
      - key: storage-node
        operator: Exists
  storage:
    storageClassDeviceSets:
    - name: set1
      # The number of OSDs to create from this device set
      count: 4
      # IMPORTANT: If volumes specified by the storageClassName are not portable across nodes
      # this needs to be set to false. For example, if using the local storage provisioner
      # this should be false.
      portable: true
      # Since the OSDs could end up on any node, an effort needs to be made to spread the OSDs
      # across nodes as much as possible. Unfortunately the pod anti-affinity breaks down
      # as soon as you have more than one OSD per node. If you have more OSDs than nodes, K8s may
      # choose to schedule many of them on the same node. What we need is the Pod Topology
      # Spread Constraints, which is alpha in K8s 1.16. This means that a feature gate must be
      # enabled for this feature, and Rook also still needs to add support for this feature.
      # Another approach for a small number of OSDs is to create a separate device set for each
      # zone (or other set of nodes with a common label) so that the OSDs will end up on different
      # nodes. This would require adding nodeAffinity to the placement here.
      placement:
        tolerations:
        - key: storage-node
          operator: Exists
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: agentpool
                operator: In
                values:
                - npstorage
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd-prepare
              topologyKey: kubernetes.io/hostname
      resources:
        limits:
          cpu: "500m"
          memory: "4Gi"
        requests:
          cpu: "500m"
          memory: "2Gi"
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: managed-premium
          volumeMode: Block
          accessModes:
          - ReadWriteOnce
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
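Before applying it, you can check that the manifest at least parses cleanly. A client-side dry-run does this without creating anything (assuming kubectl 1.18+, where the `--dry-run=client` form exists):

```shell
# Parse and client-validate the manifest without creating anything
kubectl apply -f cluster2.yaml --dry-run=client -o name
```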
And just to summarize, the step-by-step procedure I used to get the pods running is:
Step 1: Create a node pool in AKS:
az aks nodepool add --cluster-name aks3del --name npstorage --node-count 2 --resource-group aks3del --node-taints storage-node=true:NoSchedule
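Once the pool's nodes have joined the cluster, you can confirm the taint is in place (AKS applies the `agentpool=npstorage` label automatically; this check is a convenience, not part of the original procedure):

```shell
# List the storage-pool nodes and show the taint keys applied to them
kubectl get nodes -l agentpool=npstorage \
  -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
```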
Step 2: Fetch the cluster credentials:
az aks get-credentials --resource-group aks3del --name aks3del
Step 3: Verify the nodes are ready:
kubectl get nodes
Step 4: Clone the Rook repository:
git clone https://github.com/rook/rook.git
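The cluster/examples path used below matches the Rook 1.x repository layout, so it may help to pin a release branch instead of cloning master (release-1.3 was roughly current around the post's date; adjust the branch as needed):

```shell
# Clone only the release branch to get a layout matching this walkthrough
git clone --single-branch --branch release-1.3 https://github.com/rook/rook.git
```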
Step 5: If you cloned from /home/user, a new folder called rook should now exist there. Enter it:
cd rook
Step 6: Switch to the cluster/examples/kubernetes/ceph directory and follow the steps below:
cd cluster/examples/kubernetes/ceph
Step 7: From the cluster/examples/kubernetes/ceph directory, run:
kubectl apply -f common.yaml
Step 8: Create a new operator file; do not use the operator.yaml from the directory. Create a new one and apply it:
vi operator2.yaml
kubectl apply -f operator2.yaml
Step 9: Validate that you have a storage class for premium SSD disks:
kubectl get storageclass
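AKS clusters of that era normally ship a managed-premium class by default. If yours does not have one, a minimal sketch of an equivalent class backed by Premium SSD managed disks would look like this (assumption: the in-tree azure-disk provisioner, which was standard in 2020; current clusters use the CSI driver instead):

```yaml
# Hypothetical fallback StorageClass for Premium SSD managed disks
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed
```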
Step 10: Create a new cluster2.yaml file, copy and paste the YAML shown above, then apply it:
kubectl apply -f cluster2.yaml
Step 11: Validate that the OSD pods were created:
kubectl get pods -n rook-ceph
You will see some pods in the Init state; wait a while and they will eventually start.
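Instead of polling, you can follow the rollout by watching the OSD pods through their label (assuming the standard `app=rook-ceph-osd` label Rook applies):

```shell
# Stream status changes for the OSD pods; press Ctrl-C to stop
kubectl get pods -n rook-ceph -l app=rook-ceph-osd --watch
```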
Each rook-ceph-osd pod mounts a disk through its own PVC, and one OSD runs per node. If a node goes down and its disk is lost, the remaining nodes hold enough replicated data to rebuild the OSD on a new node. The rook-ceph-mon-X pods control the logic of where the data has to be replicated to maintain redundancy, and when data is accessed, the mons are the ones that tell the "client" where to retrieve it from.
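To see the mon quorum and OSD state the paragraph above describes, one option is to deploy the Rook toolbox (toolbox.yaml sits in the same examples directory) and query Ceph from inside it, roughly like this:

```shell
# Deploy the toolbox pod, then run ceph status inside it
kubectl apply -f toolbox.yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
```

The `ceph status` output lists the mon quorum, the number of OSDs up/in, and overall cluster health.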