DEV Community

How to create a Kubernetes cluster on Alpine Linux

Dave
K8s, Infra, Backend, and Distributed Systems!
・4 min read

This post will help you understand kubeadm, kubelet flags, and the nuances of Alpine.

Creating a production-ready K8s cluster is almost a breeze nowadays on most cloud platforms, so I was curious to see how hard it would be to create a cluster from scratch on my own set of VMs. It turns out: not very hard.

You can either do everything by hand (the hard way) or use some automation. For automation, there are two main options:

  • kubespray, which uses Ansible under the hood
  • kubeadm, which is the official way to do it, is part of k/k, and is supported by the amazing k8s team at VMware

Since the kubeadm binary already ships with the kubernetes package on Alpine, I decided to go with that option.

I already had KVM installed on my machine and an Alpine Linux 3.9 VM ready to go. If you haven't provisioned your VMs yet, pause here and do that before you proceed.

Once you have your VM, add the community and testing repositories so you can get the packages needed for Kubernetes and Docker:

# echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing/" >> /etc/apk/repositories
# echo "@community http://dl-cdn.alpinelinux.org/alpine/edge/community/" >> /etc/apk/repositories
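Appending with >> adds a duplicate line every time the command is re-run. A minimal sketch of an idempotent variant follows; the helper name add_repo is hypothetical, and the repositories file is passed in explicitly so the function can be tried on a scratch file first:

```shell
# add_repo is a hypothetical helper: append "@tag url" to an apk
# repositories file only if that exact line is not already present.
add_repo() {
    tag="$1"; url="$2"; file="$3"
    line="@${tag} ${url}"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Intended usage (as root):
# add_repo testing   http://dl-cdn.alpinelinux.org/alpine/edge/testing/   /etc/apk/repositories
# add_repo community http://dl-cdn.alpinelinux.org/alpine/edge/community/ /etc/apk/repositories
```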

Then install the required packages with:

# apk add kubernetes@testing
# apk add docker@community
# apk add cni-plugins@testing

At this point, when I tried to start the docker service, I got an error:

# service docker start
supervise-daemon: --pidfile must be specified
failed to start Docker Daemon
ERROR: docker failed to start

This is apparently a bug in supervise-daemon. I created a merge request against alpine/aports for it, but the issue appears to have been solved in newer versions of Alpine. In case you still run into it, edit your /etc/init.d/docker file: add pidfile="/run/docker/docker.pid", and inside the start_pre block add mkdir -p /run/docker.
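For reference, the two additions would look roughly like this inside /etc/init.d/docker. This is a sketch of only the added lines, not the packaged script, and it assumes the script does not already define start_pre (if it does, add the mkdir line to the existing function instead):

```shell
# Excerpt only: lines added to /etc/init.d/docker (an OpenRC service script).
# Tell supervise-daemon where the PID file lives.
pidfile="/run/docker/docker.pid"

# Runs before the daemon starts; make sure the PID directory exists.
start_pre() {
    mkdir -p /run/docker
}
```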

Now you can duplicate your VM in KVM, and name the new one worker-1:

# hostname worker-1
# echo "worker-1" > /etc/hostname

Make sure to do the same on the master node, but with the name master-1.

You're now ready to create your control plane. On the master node, run:

# kubeadm init --apiserver-advertise-address=[ Master Node's IP Here ] --kubernetes-version=1.17.5
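The bracketed placeholder has to be replaced with a real address before the command will run. As a small sketch, the helper below (kubeadm_init_cmd, a hypothetical name) only prints the command for review and does not execute kubeadm; the address used in the example is the master IP that appears in the join command later in this post:

```shell
# kubeadm_init_cmd is a hypothetical helper that prints (does not run)
# the init command for a given advertise address and Kubernetes version.
kubeadm_init_cmd() {
    addr="$1"; version="$2"
    if [ -z "$addr" ] || [ -z "$version" ]; then
        echo "usage: kubeadm_init_cmd ADDRESS VERSION" >&2
        return 1
    fi
    printf 'kubeadm init --apiserver-advertise-address=%s --kubernetes-version=%s\n' \
        "$addr" "$version"
}

kubeadm_init_cmd 192.168.122.139 1.17.5
```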

Kubeadm runs in phases, and it was crashing when reaching:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

Opening another terminal (over SSH) and restarting the kubelet service fixed this issue. It turns out kubeadm starts the kubelet service first and only then writes the config files the kubelet needs to start properly. On other OSes such as Ubuntu, systemd (the init system) keeps restarting the crashing service until the config files are there and the kubelet can run.

Alpine, on the other hand, uses OpenRC as its init system, which doesn't restart services that crash. For that, the Gentoo community has introduced supervise-daemon, which is experimental at the moment. To make this work on Alpine, we fixed the issue directly in kubeadm with this PR.
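If you are on a kubeadm version without that fix, the manual workaround (restart the kubelet once its config exists) can be sketched as a small polling loop run in a second terminal. The helper name restart_when_ready is hypothetical, and watching /var/lib/kubelet/config.yaml is an assumption based on the path kubeadm reports writing in its kubelet-start phase:

```shell
# restart_when_ready is a hypothetical helper: poll until a file exists,
# then run the given command once. Gives up after a timeout in seconds.
restart_when_ready() {
    file="$1"; timeout="$2"; shift 2
    waited=0
    while [ ! -e "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $file" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    "$@"
}

# Intended usage in a second terminal while `kubeadm init` is running:
# restart_when_ready /var/lib/kubelet/config.yaml 240 service kubelet restart
```

This only imitates the one piece of systemd behaviour that matters here, which is retrying until the prerequisites exist; it is not a general service supervisor.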

Once kubeadm runs its course, it gives you two notes. One is the location of your kubeconfig file. This is the file that kubectl uses to authenticate to the API server on every call. You need to copy this file to any machine that needs to interact with the cluster using kubectl.
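A minimal sketch of that copy step, assuming the admin kubeconfig is at /etc/kubernetes/admin.conf (the path kubeadm reports writing) and that kubectl reads its default location, $HOME/.kube/config. The helper name install_kubeconfig is hypothetical:

```shell
# install_kubeconfig is a hypothetical helper: copy the admin kubeconfig
# to the location kubectl reads by default and restrict its permissions.
install_kubeconfig() {
    src="${1:-/etc/kubernetes/admin.conf}"
    dest="${2:-$HOME/.kube/config}"
    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
    chmod 600 "$dest"   # the file holds cluster-admin credentials
}

# Intended usage on the master node (as root):
# install_kubeconfig
```

For another machine, copy the file over a secure channel such as scp first and pass its local path as the first argument.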

The other is a join command like the one below, which is how you'll add your worker nodes to the cluster. First add your CNI on the master node, and then join from the worker node:

# on master node
master-1 # kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

# on worker node
worker-1 # kubeadm join 192.168.122.139:6443 --token hcexp0.qiaxub64z17up9rn --discovery-token-ca-cert-hash sha256:05653259a076769faa952024249faa9c9457b4abf265914ba58f002f08834006
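Because the token and hash are long and easy to truncate when copying between terminals, a quick shape check can save a failed join. The sketch below uses a hypothetical helper, valid_join_args, and checks only the format (a 6.16 lowercase alphanumeric token and a sha256: prefix followed by 64 hex digits), not whether the values are valid for your cluster:

```shell
# valid_join_args is a hypothetical helper: check that a bootstrap token and
# CA cert hash have the shape kubeadm prints, before pasting them on a worker.
valid_join_args() {
    token="$1"; hash="$2"
    # Token: 6 characters, a dot, then 16 characters, all lowercase alphanumerics
    printf '%s\n' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' || return 1
    # Hash: "sha256:" followed by 64 hexadecimal digits
    printf '%s\n' "$hash" | grep -Eq '^sha256:[0-9a-f]{64}$' || return 1
}

valid_join_args hcexp0.qiaxub64z17up9rn \
    sha256:05653259a076769faa952024249faa9c9457b4abf265914ba58f002f08834006 \
    && echo "join arguments look well-formed"
```

Bootstrap tokens expire after 24 hours by default; running kubeadm token create --print-join-command on the master node prints a fresh join command.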

Note:
Your join command should succeed now, but when I initially tried it, my kubelet service would again fail to start because config files were missing and, surprisingly, restarting the kubelet service didn't help this time. (Shocking, I know!)

After some investigation I found another mismatch between systemd and OpenRC: --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf was missing from /etc/conf.d/kubelet. Adding it fixed the startup failure, but the kubelet still wasn't configured to use CNI, so my pods would get Docker IPs. You guessed it right: another kubelet argument was missing. (See the full changes that were necessary here)
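To spot this kind of gap quickly, the following sketch lists which of the flags you expect are absent from the kubelet's OpenRC config. The helper name missing_kubelet_flags is hypothetical, and it does a plain substring search, so it does not parse the file or know which flags your kubelet version actually requires:

```shell
# missing_kubelet_flags is a hypothetical helper: print every flag given on
# the command line that does not appear anywhere in the config file.
missing_kubelet_flags() {
    conf="$1"; shift
    for flag in "$@"; do
        # -F: fixed string, -e: pattern may start with a dash
        grep -qF -e "$flag" "$conf" || echo "$flag"
    done
}

# Intended usage on a worker node:
# missing_kubelet_flags /etc/conf.d/kubelet \
#     --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf
```

Empty output means every listed flag was found in the file.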

At this point you can deploy your workloads. If you come from the Ubuntu world, one subtle difference is that you need to make sure your apps are compatible with musl as opposed to glibc. For example, if you're deploying Go binaries, compile with CGO_ENABLED=0 to create a statically linked binary, and if you're deploying Node.js apps, make sure npm install runs inside an Alpine container.
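If you are unsure which libc a given build or runtime environment uses, one rough heuristic is to look for the musl dynamic loader. This is a sketch with a hypothetical helper name, libc_flavor; it assumes the loader is installed as /lib/ld-musl-<arch>.so.1, and it reports only "musl" or "other" rather than positively identifying glibc:

```shell
# libc_flavor is a hypothetical helper: report "musl" if a musl dynamic
# loader is present under the given root directory, otherwise "other".
libc_flavor() {
    root="${1:-}"
    for loader in "$root"/lib/ld-musl-*.so.1; do
        # If the glob matched nothing, $loader is the literal pattern
        if [ -e "$loader" ]; then
            echo musl
            return 0
        fi
    done
    echo other
}

libc_flavor

# Building a Go binary that depends on neither libc:
# CGO_ENABLED=0 go build -o app .
```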

That's it! Feel free to reach out to me if you need help with your k8s clusters.

Discussion (7)

Trung Lê • Edited

Firstly, thanks for writing this up.

I haven't had much luck getting kubelet up and running on Alpine 3.12 (ppc64le).

$ /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --resolv-conf=/run/systemd/resolve/resolv.conf

would fail with the following output:

F0627 15:23:40.684395   24145 kubelet.go:1383] Failed to start ContainerManager failed to initialize top level QOS containers: failed to update top level Burstable QOS cgroup : failed to set supported cgroup subsystems for cgroup [kubepods burstable]: failed to find subsystem mount for required subsystem: pids

I would really appreciate it if you could give me some guidance on what the cause of the failure might be.

Many thanks in advance

P/S: I have no issue with setting up kubelet running under Ubuntu.

Trung Lê • Edited

I managed to resolve this issue by adding cgroup_enable=pids to the GRUB boot command line:

# file: /etc/default/grub
- GRUB_CMDLINE_LINUX_DEFAULT="modules=sd-mod,usb-storage,ext4 nomodeset quiet rootfstype=ext4"
+ GRUB_CMDLINE_LINUX_DEFAULT="modules=sd-mod,usb-storage,ext4 nomodeset quiet rootfstype=ext4 cgroup_enable=pids"

This issue does not occur on x86_64 and seems to affect only ppc64le. I am investigating whether it is related to CONFIG_CGROUP_PIDS not being enabled in the linux-lts kernel.

Trung Lê • Edited

FYI, I've lodged a new merge request to enable CONFIG_CGROUP_PIDS for linux-lts on ppc64le (3.12-stable). This would address the issue at its root, and users won't have to explicitly declare cgroup_enable=pids in /etc/default/grub anymore.

$ cat /proc/cgroups 
#subsys_name    hierarchy   num_cgroups enabled
cpuset  2   1   1
cpu 3   1   1
cpuacct 4   1   1
memory  5   1   1
devices 6   1   1
freezer 7   1   1
net_cls 8   1   1
perf_event  9   1   1
pids    10  1   1 # <-- HERE IT IS :D
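
The same check can be scripted rather than read by eye. A minimal sketch, with a hypothetical helper name cgroup_enabled, that inspects the fourth ("enabled") column of a /proc/cgroups-style table:

```shell
# cgroup_enabled is a hypothetical helper: succeed if the named cgroup
# subsystem is listed with enabled=1 in a /proc/cgroups-style file.
cgroup_enabled() {
    name="$1"; file="${2:-/proc/cgroups}"
    # Columns: subsys_name hierarchy num_cgroups enabled
    awk -v n="$name" '$1 == n && $4 == 1 { found = 1 } END { exit !found }' "$file"
}

# Intended usage:
# cgroup_enabled pids && echo "pids controller is enabled"
```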
Joshua Kaldon

Dave, thanks for this simple walkthrough.

I've gotten stuck on the kubeadm init... step and I'm hoping you can help me understand where I've gone wrong. I tried your advice on restarting kubelet and checking the parameters being passed in to no avail.

Here's the output from kubeadm init:

k8s-node-01:~# kubeadm init --node-name k8s-node-01 --token=${TOKEN}
[init] Using Kubernetes version: v1.21.0
[preflight] Running pre-flight checks
    [WARNING SystemVerification]: missing optional cgroups: hugetlb
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-node-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.201]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-node-01 localhost] and IPs [192.168.1.201 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-node-01 localhost] and IPs [192.168.1.201 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

    Unfortunately, an error has occurred:
        timed out waiting for the condition

    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.

    Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Here's some output from /var/log/kubelet/kubelet.log:

I0429 20:48:52.250906    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:48:52.546014    3237 kubelet.go:2218] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
I0429 20:48:54.073913    3237 kubelet_node_status.go:71] "Attempting to register node" node="k8s-node-01"
E0429 20:48:54.075479    3237 kubelet_node_status.go:93] "Unable to register node with API server" err="Post \"https://192.168.1.201:6443/api/v1/nodes\": dial tcp 192.168.1.201:6443: connect: connection refused" node="k8s-node-01"
I0429 20:48:54.251246    3237 kubelet.go:461] "Kubelet nodes not sync"
I0429 20:48:54.637281    3237 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
E0429 20:48:56.301131    3237 certificate_manager.go:437] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post "https://192.168.1.201:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp 192.168.1.201:6443: connect: connection refused
I0429 20:48:57.250927    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:48:57.602935    3237 kubelet.go:2218] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
E0429 20:48:57.723059    3237 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://192.168.1.201:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k8s-node-01?timeout=10s": dial tcp 192.168.1.201:6443: connect: connection refused
E0429 20:48:59.251706    3237 kubelet.go:2298] "Error getting node" err="nodes have not yet been read at least once, cannot construct node object"
I0429 20:48:59.638137    3237 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
E0429 20:49:01.020298    3237 event.go:273] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"k8s-node-01.167a6d02c5fc74e6", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"k8s-node-01", UID:"k8s-node-01", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node k8s-node-01 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"k8s-node-01"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xc01ae266a80ff0e6, ext:4600481839, loc:(*time.Location)(0x3b69380)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xc01ae26a11c222cb, ext:18226284051, loc:(*time.Location)(0x3b69380)}}, Count:8, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Patch "https://192.168.1.201:6443/api/v1/namespaces/default/events/k8s-node-01.167a6d02c5fc74e6": dial tcp 192.168.1.201:6443: connect: connection refused'(may retry after sleeping)
I0429 20:49:01.244928    3237 kubelet_node_status.go:71] "Attempting to register node" node="k8s-node-01"
E0429 20:49:01.247474    3237 kubelet_node_status.go:93] "Unable to register node with API server" err="Post \"https://192.168.1.201:6443/api/v1/nodes\": dial tcp 192.168.1.201:6443: connect: connection refused" node="k8s-node-01"
I0429 20:49:02.352656    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:49:02.658584    3237 kubelet.go:2218] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
I0429 20:49:04.353118    3237 kubelet.go:461] "Kubelet nodes not sync"
I0429 20:49:04.639286    3237 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
E0429 20:49:04.725136    3237 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://192.168.1.201:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k8s-node-01?timeout=10s": dial tcp 192.168.1.201:6443: connect: connection refused
I0429 20:49:07.352911    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:49:07.714725    3237 kubelet.go:2218] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
I0429 20:49:08.416803    3237 kubelet_node_status.go:71] "Attempting to register node" node="k8s-node-01"
E0429 20:49:08.419563    3237 kubelet_node_status.go:93] "Unable to register node with API server" err="Post \"https://192.168.1.201:6443/api/v1/nodes\": dial tcp 192.168.1.201:6443: connect: connection refused" node="k8s-node-01"
I0429 20:49:09.353151    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:49:09.353408    3237 kubelet.go:2298] "Error getting node" err="nodes have not yet been read at least once, cannot construct node object"
E0429 20:49:09.561414    3237 reflector.go:138] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:45: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://192.168.1.201:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-node-01&limit=500&resourceVersion=0": dial tcp 192.168.1.201:6443: connect: connection refused
I0429 20:49:09.640110    3237 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
E0429 20:49:09.977853    3237 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://192.168.1.201:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 192.168.1.201:6443: connect: connection refused
I0429 20:49:10.454221    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:49:11.022600    3237 event.go:273] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"k8s-node-01.167a6d02c5fc74e6", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"k8s-node-01", UID:"k8s-node-01", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node k8s-node-01 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"k8s-node-01"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xc01ae266a80ff0e6, ext:4600481839, loc:(*time.Location)(0x3b69380)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xc01ae26a11c222cb, ext:18226284051, loc:(*time.Location)(0x3b69380)}}, Count:8, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Patch "https://192.168.1.201:6443/api/v1/namespaces/default/events/k8s-node-01.167a6d02c5fc74e6": dial tcp 192.168.1.201:6443: connect: connection refused'(may retry after sleeping)
I0429 20:49:11.455253    3237 kubelet.go:461] "Kubelet nodes not sync"
E0429 20:49:11.727673    3237 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://192.168.1.201:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k8s-node-01?timeout=10s": dial tcp 192.168.1.201:6443: connect: connection refused
E0429 20:49:12.770559    3237 kubelet.go:2218] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

And here are a few details on my Raspberry Pi 4 Alpine installation:

k8s-node-01:~# uname -a
Linux k8s-node-01 5.10.29-0-rpi4 #1-Alpine SMP PREEMPT Mon Apr 12 15:55:08 UTC 2021 aarch64 Linux

k8s-node-01:~# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"archive", BuildDate:"2021-04-12T14:02:47Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"linux/arm64"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?

And a list of (some) installed packages / versions in case that is interesting:

k8s-node-01:~# apk list -I | sort | cut -f1 -d' '
acct-6.6.4-r0
alpine-base-3.13.5-r0
alpine-baselayout-3.2.0-r8
alpine-conf-3.11.0-r2
alpine-keys-2.2-r0
cni-plugins-0.9.1-r0
containerd-1.4.4-r0
docker-20.10.3-r1
docker-cli-20.10.3-r1
docker-engine-20.10.3-r1
docker-openrc-20.10.3-r1
iptables-1.8.6-r0
iptables-openrc-1.8.6-r0
kubeadm-1.21.0-r1
kubectl-1.21.0-r1
kubelet-1.21.0-r1
kubelet-openrc-1.21.0-r1
kubernetes-1.21.0-r1
libnetfilter_conntrack-1.0.8-r0
libnetfilter_cthelper-1.0.0-r1
libnetfilter_cttimeout-1.0.0-r1
libnetfilter_queue-1.0.5-r0
linux-firmware-brcm-20201218-r0
linux-rpi4-5.10.29-r0
openrc-0.42.1-r19
Francesco • Edited

Hi Dave.
I recently upgraded kubernetes to 1.18.3 on Alpine. It is still in testing, so it is not production-ready.
I would like to go ahead and make kubernetes fully usable on Alpine, and move it to community.
I have zero experience with Kubernetes, so if you could give me feedback on whether the packages are working (they have been split now, so there is no need to install the monolithic kubernetes package), I would appreciate it.
If you want, contact me directly.
Have a great day.
.: Francesco Colista

Dave Author • Edited

Hi Francesco,

There are still issues getting k8s DNS to work in Alpine. Probably some library issue.

Will take a look when I get a chance.

Trung Lê

I've created a new package to orchestrate a cluster with 1 master and 2 workers on Vagrant. The source code can be found at github.com/runlevel5/kubernetes-cl...

P/S: upstream Alpine is ever-changing; I will follow it closely and update the source code accordingly. Let's hope Alpine 3.13 will have k8s and related packages in the stable branch.