1. Backup
For this part, refer to my previous post, here.
2. Restore
The official procedure [1] states:
"If any API servers are running in your cluster, you should not attempt to restore instances of etcd."
Therefore, to restore an etcd backup, we need to stop all API server instances, restore the etcd state, and then restart the API servers:
- stop all API server instances
- restore state in all etcd instances
- restart all API server instances
2.1 Stop all API server instances
- Check the API server:
# check the api server
$ k get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-94fb6bc47-wr56s 1/1 Running 5 (14m ago) 21d
canal-cgrhr 2/2 Running 2 (60m ago) 21d
canal-jb5rr 2/2 Running 2 (60m ago) 21d
coredns-57888bfdc7-895dj 1/1 Running 1 (60m ago) 21d
coredns-57888bfdc7-9rjt5 1/1 Running 1 (60m ago) 21d
etcd-controlplane 1/1 Running 2 (60m ago) 21d
kube-apiserver-controlplane 1/1 Running 0 21d
kube-controller-manager-controlplane 1/1 Running 3 (17m ago) 21d
kube-proxy-5xtp7 1/1 Running 1 (60m ago) 21d
kube-proxy-bt2pv 1/1 Running 2 (60m ago) 21d
kube-scheduler-controlplane 1/1 Running 3 (17m ago) 21d
We can see that kube-apiserver-controlplane is running, so we need to stop it first.
- Move the kube-apiserver manifest file to a temporary location to stop the API server:
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
Note that this causes a temporary loss of kubectl functionality, so plan for it.
# temporary loss of kubectl functionality.
controlplane $ k get pods -n kube-system
The connection to the server 172.30.1.2:6443 was refused - did you specify the right host or port?
As the output above shows, stopping the API server makes kubectl unusable: kubectl sends every request to the API server, so once the API server is stopped nothing is left to handle those requests.
Moving the manifest stops the API server because the kubelet watches the /etc/kubernetes/manifests directory for static pod definitions and removes a pod when its manifest file is removed or moved. In a kubeadm-based Kubernetes cluster, moving the kube-apiserver.yaml manifest to a temporary location is therefore enough to stop the API server.
Use crictl to interact with containers
If the cluster uses containerd, we can still use crictl to interact with the containers while kubectl is unavailable:
crictl ps | grep kube-apiserver
crictl stop <container-id>
crictl rm <container-id>
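Before restoring etcd, it can help to confirm that the API server container is really gone. The following is a minimal sketch, not part of the official procedure: `wait_for_apiserver_stop` is a hypothetical helper name, and the 60-second default timeout is an assumption. It relies only on `crictl ps` with its `--name` filter and `-q` (IDs only) flag.

```shell
#!/bin/sh
# Sketch: block until no running kube-apiserver container remains.
# wait_for_apiserver_stop is a hypothetical helper, not part of crictl.
wait_for_apiserver_stop() {
  timeout_s="${1:-60}"   # assumed default: wait up to 60 seconds
  waited=0
  # crictl ps lists running containers; --name filters by container name,
  # -q prints only container IDs, so empty output means "not running".
  while [ -n "$(crictl ps --name kube-apiserver -q)" ]; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "kube-apiserver is still running after ${timeout_s}s" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "kube-apiserver stopped"
}
```

For example, `wait_for_apiserver_stop 120` waits up to two minutes after the manifest has been moved. Depending on the host, crictl may need to run as root.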
2.2 Restore state in all etcd instances
sudo ETCDCTL_API=3 etcdctl snapshot restore /path/to/snapshot.db \
--data-dir=/var/lib/etcd_restore
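A quick sanity check of the restored directory can catch a failed restore before etcd is pointed at it. This is only a sketch under the assumption that the restore produced the usual etcd v3 layout (`member/snap` and `member/wal` under the data directory); `check_restored_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: verify that a restored etcd data directory has the expected layout.
# check_restored_dir is a hypothetical helper name.
check_restored_dir() {
  dir="${1:-/var/lib/etcd_restore}"   # default matches the --data-dir used above
  # Assumption: a restored data directory keeps its state under
  # member/snap and member/wal.
  for sub in member/snap member/wal; do
    if [ ! -d "$dir/$sub" ]; then
      echo "missing $dir/$sub" >&2
      return 1
    fi
  done
  echo "restored data directory looks complete: $dir"
}
```

Calling `check_restored_dir /var/lib/etcd_restore` (with sudo if the directory is only readable by root) returns a non-zero status when either subdirectory is missing.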
Then edit /etc/kubernetes/manifests/etcd.yaml so that the etcd-data volume points to the restored directory:
volumes:
- hostPath:
    path: /etc/kubernetes/pki/etcd
    type: DirectoryOrCreate
  name: etcd-certs
- hostPath:
    path: /var/lib/etcd_restore # change this to the restored etcd data directory on the host
    type: DirectoryOrCreate
  name: etcd-data
This may be confusing, but we will discuss it later (here).
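The manifest edit can also be scripted instead of done by hand. The sketch below assumes the etcd-data hostPath currently reads `/var/lib/etcd` (the usual kubeadm default, but check your own manifest) and that neither path contains a `|` character or regex metacharacters. `point_etcd_data_dir` is a hypothetical helper name. The backup copy is deliberately written outside the manifests directory, since the kubelet treats files in that directory as static pod definitions.

```shell
#!/bin/sh
# Sketch: point the etcd-data hostPath at the restored directory.
# point_etcd_data_dir is a hypothetical helper name.
# Usage: point_etcd_data_dir <manifest> <old-path> <new-path> [backup-dir]
point_etcd_data_dir() {
  manifest="$1"
  old="$2"
  new="$3"
  backup_dir="${4:-/tmp}"   # assumed default; keep it outside the manifests directory
  mkdir -p "$backup_dir" || return 1
  cp "$manifest" "$backup_dir/$(basename "$manifest").bak" || return 1
  tmp="$(mktemp)" || return 1
  # Rewrite only lines that consist of "path: <old-path>", so other
  # hostPath entries (for example the certificate directory) stay untouched.
  sed "s|^\([[:space:]]*path:[[:space:]]*\)${old}[[:space:]]*\$|\1${new}|" \
    "$manifest" > "$tmp" || return 1
  # Overwrite in place so the file keeps its owner and permissions.
  cat "$tmp" > "$manifest"
  rm -f "$tmp"
}
```

A typical call would be `sudo sh -c '. ./patch.sh; point_etcd_data_dir /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd /var/lib/etcd_restore'`, where `patch.sh` is a hypothetical file holding the function.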
2.3 Restart all API server instances
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
Verify cluster health:
Use kubectl to check the state of the cluster after the API server is back up.
# check API server running
kubectl get pods -n kube-system
# check API server health
kubectl get --raw /healthz
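The API server usually needs a little time to come back after the manifest is restored, so the health check may fail on the first try. A small retry loop, sketched below, avoids re-running it by hand. `wait_for_apiserver_ready` is a hypothetical helper name, and the attempt count and delay are assumed defaults.

```shell
#!/bin/sh
# Sketch: retry the health check until the API server answers.
# wait_for_apiserver_ready is a hypothetical helper name.
# Usage: wait_for_apiserver_ready [attempts] [delay-seconds]
wait_for_apiserver_ready() {
  attempts="${1:-30}"   # assumed default
  delay="${2:-5}"       # assumed default, in seconds
  i=1
  while [ "$i" -le "$attempts" ]; do
    # /healthz answers with the literal string "ok" when the server is healthy.
    if [ "$(kubectl get --raw /healthz 2>/dev/null)" = "ok" ]; then
      echo "API server is healthy"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "API server not healthy after $attempts attempts" >&2
  return 1
}
```

Running `wait_for_apiserver_ready 30 5` polls for up to roughly two and a half minutes before giving up.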