So we have built a control plane from scratch. That was kinda useful, and also kinda hard to make, which is fitting for this series. But what we really want is a proper, functioning, multi-node cluster that can actually run our containerized applications. A control plane by itself is just the brain; we need the muscle (the worker nodes) to do the actual work.
Setting up the control plane required us to manually configure every single detail of the seed VM, line by line. That was an important exercise in understanding the components, but repeating those steps for 10, 50, or 100 worker nodes would quickly become a nightmare. This part of the series is about making the process scalable: we'll create a "Golden Image" from our pre-configured machine, rapidly provision identical workers from it, and efficiently join them to our existing cluster.
Creating a Golden Image
The core principle here is to stop doing repetitive manual configuration. We already spent the time tuning the operating system exactly right when we set up the control node. Remember all those steps? We loaded the overlay and br_netfilter kernel modules, tweaked the network bridge settings for iptables, and, critically, disabled swap entirely because the kubelet won't run otherwise. We also installed containerd and configured it to use systemd as the cgroup driver.
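As a quick refresher, here's a sketch of the kind of commands that configuration involved (the exact steps live in the control node post; treat this as a memory-jogger, not a copy-paste script):
# Load the kernel modules containerd and kube-proxy rely on
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Let iptables see bridged traffic and enable IP forwarding
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
# The kubelet refuses to run with swap enabled
sudo swapoff -a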
That entire stack of prerequisites, the OS configuration, and the necessary Kubernetes binaries (kubeadm, kubelet, kubectl) are now perfectly baked into our k8s-seed VM. A Golden Image is simply a frozen, ready-to-use snapshot of that perfect setup. Instead of repeating those steps on every new node, we use this image as the template to rapidly spin up as many identical worker nodes as we need. This moves us from a manual, configuration-based process to an automated, image-based provisioning process, which is the foundational step toward true cloud-native infrastructure.
Step 1: Generalize the Seed Instance
Before we can use our k8s-seed VM as a template, we need to "seal" it. Every running Linux machine has unique identifiers: a machine ID plus accumulated cloud-init state and logs. If we duplicate these across multiple nodes, the Kubernetes Control Plane will be confused, seeing multiple machines with the same identity. We need to clear these unique details so that when a new VM boots from the image, it generates its own fresh identity and network configuration.
1. SSH into the seed machine:
gcloud compute ssh k8s-seed --zone=us-central1-a
2. Clean up unique identifiers: Run the following command to remove the machine ID and reset cloud-init logs. This ensures that when new VMs boot from this image, they generate their own fresh identity.
sudo cloud-init clean --seed --logs --machine-id
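# Optional sanity check (assumption: on systemd-based images cloud-init
# sets the ID to "uninitialized"; exact behavior can vary by version):
cat /etc/machine-id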
exit
3. Stop the instance: Back on your local machine, stop the VM so we can snapshot its disk.
gcloud compute instances stop k8s-seed --zone=us-central1-a
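To double-check the VM is actually stopped before we image its disk, you can query its status:
gcloud compute instances describe k8s-seed --zone=us-central1-a --format='value(status)'
This should print TERMINATED, which is GCE's way of saying "stopped".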
Step 2: Create the Custom Image
With the seed machine prepped and stopped, we now create a Google Compute Engine image from that disk. This image, k8s-node-image-v1, will serve as the foundation for every node in our cluster, whether they eventually become a control plane member or a worker.
Run the following command to create the image:
gcloud compute images create k8s-node-image-v1 \
--source-disk=k8s-seed \
--source-disk-zone=us-central1-a \
--family=k8s-node-family
Note: We added the --family flag so we can easily reference the "latest" version of this image later without needing to know the exact image name. This is a great cloud-native practice for image management!
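If you're curious which image the family currently resolves to (useful once you have a v2, v3, and so on), gcloud can tell you:
gcloud compute images describe-from-family k8s-node-family
This prints the metadata of the newest non-deprecated image in the family, which is exactly what --image-family will pick up in the next step.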
Step 3: Provision Worker Nodes
Now for the easy part. Because the heavy lifting of configuration is done, spinning up new workers is almost trivial. We can create two identical worker nodes, pre-loaded with all the necessary Kubernetes prerequisites. This process takes minutes, a stark contrast to the hours spent manually configuring the seed VM when we set up the control node.
Run these commands to create worker-1 and worker-2 using the image family we just created:
gcloud compute instances create worker-1 \
--zone=us-central1-a \
--image-family=k8s-node-family \
--tags=k8s-worker
gcloud compute instances create worker-2 \
--zone=us-central1-a \
--image-family=k8s-node-family \
--tags=k8s-worker
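Since we tagged both instances with k8s-worker, confirming they're up is a one-liner:
gcloud compute instances list --filter='tags.items=k8s-worker'
Both workers should show a STATUS of RUNNING.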
Step 4: Join the Cluster
Your worker VMs are running, but they are just generic machines with Kubernetes components installed; they don’t know they belong to a cluster yet. This is where the magic of the kubeadm join command comes in. The worker’s kubelet agent needs to securely authenticate and register with the Control Plane’s API server. The join command provides the token and the hash that validates the worker’s identity, effectively making it a true node.
1. Locate your join command: Find the kubeadm join command you saved from the output of kubeadm init when we set up the control node. (If you've lost it, see the tip after these steps.) It looks something like this:
kubeadm join 10.128.0.x:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
2. Join the workers: SSH into each worker and run that command with sudo. For worker-1:
gcloud compute ssh worker-1 --zone=us-central1-a
sudo kubeadm join ... # Paste your command here
exit
3. For Worker 2: Repeat the same steps for worker-2.
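Tip: if you no longer have the original kubeadm init output, don't panic. The bootstrap token expires after 24 hours anyway, so it's common to mint a fresh one. From the control plane node:
sudo kubeadm token create --print-join-command
This prints a complete, ready-to-paste join command with a new token and the correct CA cert hash.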
Step 5: Verify the Cluster
Head back to your Control Plane to verify that everyone has checked in. This is the satisfying moment where all that hard work pays off and you see your multi-node cluster come alive.
kubectl get nodes
You should see output similar to this, with all nodes showing a Ready status:
NAME            STATUS   ROLES           AGE   VERSION
control-plane   Ready    control-plane   10m   v1.35.0
worker-1        Ready    <none>          2m    v1.35.0
worker-2        Ready    <none>          1m    v1.35.0
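One cosmetic note: the <none> under ROLES is normal, since kubeadm doesn't label worker nodes by default. If it bugs you, you can add the conventional role label yourself (purely optional; the label key below is a widespread convention, not a kubeadm requirement):
kubectl label node worker-1 node-role.kubernetes.io/worker=
kubectl label node worker-2 node-role.kubernetes.io/worker=
After that, kubectl get nodes will show worker in the ROLES column.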
The Easy Way (GKE)
It’s always worth pausing to appreciate just how much effort we put into this. Throughout this series, we’ve manually provisioned VMs, installed binaries, configured systemd services, and joined nodes — all to get a basic Kubernetes cluster running.
If you were using Google Kubernetes Engine (GKE), this entire process (for both the control plane and worker nodes) could have been replaced by a single command:
gcloud container clusters create k8s-easy-cluster \
--zone us-central1-a \
--num-nodes 3 \
--machine-type e2-medium
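One more command wires your local kubectl up to that cluster:
gcloud container clusters get-credentials k8s-easy-cluster --zone us-central1-a
No kubeadm, no tokens, no SSH. That's the trade-off: convenience in exchange for the understanding we just earned the hard way.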
What’s Next?
You now have a multi-node cluster, which is a huge win. But there’s a big problem: your cluster is still static. If worker-1 crashes, it’s gone forever and your capacity is reduced until you manually replace it. If your application load increases, you have to manually provision worker-3, SSH in, and run the kubeadm join command. Future-proofing your infrastructure means moving beyond these manual interventions.
To achieve a truly "cloud-native" and resilient cluster, we need to automate the lifecycle of our worker nodes, ensuring they can be recreated and scaled without manual effort. That's a topic worth chewing on, and one we may well tackle later in this series.
Go Deeper
The journey through “Kubernetes the Kinda Hard Way” is fundamentally about building foundational knowledge. Ready to solidify your understanding and tackle advanced concepts? Check out these resources:
- The Original Blueprint: This entire series is inspired by Kelsey Hightower’s foundational guide, Kubernetes The Hard Way.
- Official Documentation: Get familiar with the source of truth for all components: Kubernetes.io Documentation.