Introduction
Kubernetes is an orchestrator for containerized, cloud-native, microservices applications. But what does that mean exactly? A bunch of jumbled words, right? Let's break it down step by step.
Orchestration
An orchestrator is like an operating system: it has to respond dynamically to change. In our case, Kubernetes watches the applications you deploy, scales them up or down as needed, self-heals when things break, and performs rolling updates and rollbacks with zero downtime — meaning the app should never go offline and users should never be frustrated. All of this happens automatically, without human intervention, except for the initial setup of course.
Containerization
Containerization is packaging your application into an image along with its dependencies so it can run anywhere. If you are familiar with Docker, this will come naturally. I recommend taking a moment to read a couple of blogs on Docker and Docker Compose, including some lower-level concepts, because these fundamentals are crucial for this course.
Cloud Native
We call an application cloud-native if it leverages cloud features like auto-scaling, self-healing, automated updates, and rollbacks — basically, the things Kubernetes is capable of managing.
Microservices
Microservices is an architectural style in which your application is composed of small, independent services. This means that if one service goes down, the others keep working. The approach is also excellent for managing projects: teams can build, deploy, and track their individual services independently.
A Bit of History
Kubernetes was developed by a group of engineers at Google in response to AWS's growing popularity. Google already had internal tools for orchestrating containers, built to handle the massive scale of the applications it ran. Kubernetes was open-sourced in 2014 and later donated to the Cloud Native Computing Foundation (CNCF).
Kubernetes and Docker
Early versions of Kubernetes shipped with Docker as the default runtime, handling tasks like creating, starting, and stopping containers. Over time, Docker proved heavier than Kubernetes needed, and lighter alternatives emerged. To address this, Kubernetes introduced the Container Runtime Interface (CRI), allowing users to choose the runtime that best fits their needs. As of Kubernetes 1.24, the built-in Docker support (the dockershim) was removed, so Docker Engine is no longer supported as a runtime out of the box. Most clusters now use containerd, a lightweight runtime optimized for Kubernetes that runs Docker-built container images without any changes. Multiple runtimes can run on the same cluster, offering flexibility for performance and isolation requirements.
Kubernetes: the operating system of the cloud
Kubernetes is often called the operating system of the cloud because it transforms the chaos of distributed infrastructure into a seamless platform for developers. Just as Linux or Windows hides the complexity of CPUs, memory, and storage from application processes, Kubernetes abstracts the sprawling resources of clouds and datacenters, letting you deploy microservices without worrying about which node, storage volume, or failure zone they run on. Whether your cluster lives on AWS, Azure, GCP, or a mix of clouds, Kubernetes schedules, scales, and heals your applications automatically, making hybrid deployments, multi-cloud setups, and cloud migrations effortless. From a developer’s perspective, you simply declare what your application needs — replicas, resources, dependencies — and Kubernetes handles the rest, turning the cloud into a reliable, self-managing operating environment.
Kubernetes: cluster
A cluster is just what the word suggests: a group of one or more nodes that provide CPU, memory, and other resources for your applications.
A Kubernetes cluster has two types of nodes: control plane nodes and worker nodes. One thing to note is that control plane nodes must run Linux, while worker nodes can run Linux or Windows. In a good setup you'll run multiple control plane nodes for HA (high availability). The control plane nodes manage the worker nodes, which in turn run your applications. People sometimes run applications on control plane nodes for testing, but this is avoided in production so the control plane can focus on managing the worker nodes properly.
Control Plane
As we mentioned, a Kubernetes cluster is a combination of a control plane and worker nodes. The control plane is the brain of the cluster — a collection of services that manage the cluster’s state, schedule tasks, handle auto-scaling, and orchestrate updates. It also exposes the API server, which is the entry point for all interactions with the cluster. A simple setup might involve one control plane managing multiple worker nodes, but what really makes the control plane tick? Let’s take a closer look.
The API Server
The API server acts as the front door for Kubernetes, handling all requests and communication within the cluster. Every action — from deploying an application to communicating between controllers and the cluster store — goes through the API server. It exposes a RESTful API over HTTPS, and every request must be authenticated and authorized.
For example, deploying an application involves these steps (a minimal manifest sketch follows the list):
- Define the desired state of the application in a YAML configuration file.
- Submit the file to the API server.
- The API server authenticates and authorizes the request.
- The desired state is stored in the cluster database.
- The control plane schedules and executes the necessary changes on the worker nodes.
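As a rough sketch of what that YAML file might look like (the application name, labels, and image below are placeholders, not anything Kubernetes requires), a minimal Deployment manifest declares the desired state in a few lines and is typically submitted with kubectl apply -f:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web              # hypothetical application name
spec:
  replicas: 3                  # desired state: three identical copies
  selector:
    matchLabels:
      app: hello-web
  template:                    # Pod template the controllers stamp out
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
        - name: web
          image: nginx:1.25    # placeholder image
          ports:
            - containerPort: 80
```

Once the API server stores this record of intent, the rest of the control plane works to make the cluster match it.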
The Cluster Store
The cluster store is essentially a database that records the desired state of the cluster and all object definitions. Kubernetes uses etcd, a distributed key-value store, as its backend. Each control plane node contains an etcd replica to ensure high availability (HA). In larger setups, architects may run a separate etcd cluster connected to all control planes to handle high-demand workloads.
One challenge in distributed databases like etcd is the split-brain scenario, which occurs when network partitions isolate some nodes from the rest, potentially causing multiple nodes to believe they are the leader and accept conflicting writes. For example, imagine a three-node etcd cluster: if one node gets disconnected from the other two, it might try to process updates independently. Etcd prevents this by requiring a majority quorum for any write operation — only the majority of nodes can commit a change, ensuring consistency.
Etcd also handles concurrent writes to the same key using version numbers. If two clients try to update the same value at the same time, etcd will accept the first write and reject the second with a conflict error. The client must then retry using the latest version of the key, ensuring that no updates are accidentally overwritten.
Controllers and the controller manager
Kubernetes relies on controllers to handle much of the cluster’s “intelligence” and automation. These controllers run on the control plane and continuously monitor the cluster, comparing the observed state with the desired state you define. Common examples include the Deployment controller, StatefulSet controller, and ReplicaSet controller, each responsible for different types of workloads. Essentially, controllers act like caretakers: if you ask for three replicas of an application, the controller ensures that exactly three healthy instances are running and will create, delete, or restart pods as needed to maintain that state. To keep everything organized, Kubernetes runs a controller manager, which spawns and supervises these individual controllers, ensuring they operate reliably and maintain the overall health and consistency of the cluster.
The scheduler
The Kubernetes scheduler is responsible for assigning new workloads to healthy worker nodes. It continuously watches the API server for new tasks and evaluates which nodes are capable of running them. This involves filtering nodes based on factors like taints, affinity and anti-affinity rules, network port availability, and available CPU and memory. The scheduler then ranks the suitable nodes using a scoring system, considering criteria such as whether the required container image is already present, how much CPU and memory are free, and how many tasks the node is currently running. The nodes with the highest scores are chosen to execute the tasks. If no suitable node is available, the task is marked as pending. In clusters configured with autoscaling, a pending task can trigger a node autoscaling event, adding a new node to the cluster so the task can be scheduled and run.
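To make those filtering criteria concrete, here is a hedged sketch of a Pod spec that feeds the scheduler's decision; the resource requests, the "dedicated" taint toleration, and the "disktype" node label are all made-up examples rather than required names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo          # hypothetical Pod name
spec:
  containers:
    - name: app
      image: nginx:1.25          # placeholder image
      resources:
        requests:
          cpu: "500m"            # node must have this much unreserved CPU
          memory: "256Mi"        # ...and this much unreserved memory
  tolerations:                   # allows placement on nodes carrying this (hypothetical) taint
    - key: "dedicated"
      operator: "Equal"
      value: "web"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype    # hypothetical node label
                operator: In
                values: ["ssd"]
```

Nodes that fail any of these filters are ruled out; the remaining ones are scored, and the highest-scoring node gets the Pod.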
The cloud controller manager
If your Kubernetes cluster runs on a public cloud like AWS, Azure, GCP, or Civo Cloud, it uses a cloud controller manager to integrate with cloud services. This component handles tasks such as provisioning instances, storage, and load balancers. For example, if an application requests a load balancer, the cloud controller manager automatically creates one in the cloud and connects it to your app, making cloud resources seamless to use.
Worker nodes
The architecture of a worker node is built from three main components: the kubelet, a container runtime, and kube-proxy.
kubelet
The kubelet is the main Kubernetes agent running on every worker node, acting as the bridge between the node and the control plane. It watches the API server for new tasks, instructs the appropriate container runtime to execute them, and continuously reports the status of these tasks back to the API server. If a task fails to run, the kubelet reports the problem so the control plane can take the necessary actions to maintain the desired state of the cluster.
Runtime
Each worker node also has one or more container runtimes responsible for executing tasks. Most modern Kubernetes clusters use containerd, which handles pulling container images and managing container lifecycle operations like starting and stopping containers. Older clusters shipped with Docker, which is now deprecated as a runtime, while platforms like Red Hat OpenShift often use CRI-O. Each runtime has its strengths and trade-offs, and Kubernetes can work with any runtime that implements the Container Runtime Interface (CRI).
Kube-proxy
Finally, every worker node runs a kube-proxy service that manages cluster networking. Kube-proxy ensures that network traffic is correctly routed to the tasks running on the node and handles load balancing, making communication between services and pods seamless. With the kubelet, container runtime, and kube-proxy in place, each worker node becomes a self-managing unit capable of running applications reliably within the Kubernetes cluster.
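As a small sketch of what kube-proxy is implementing (the Service name, label, and ports are illustrative), a ClusterIP Service provides a stable virtual IP, and kube-proxy on every node programs the routing so that traffic to that IP is load-balanced across the Pods matching the selector:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-web              # hypothetical Service name
spec:
  type: ClusterIP              # stable internal virtual IP for the cluster
  selector:
    app: hello-web             # traffic is routed to Pods carrying this label
  ports:
    - port: 80                 # port clients connect to on the Service
      targetPort: 80           # port the container actually listens on
```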
Packaging apps for Kubernetes
In Kubernetes, all workloads — whether containers, VMs, or Wasm apps — must be wrapped in Pods to run on the cluster. Think of Pods like standardized packages for a courier service: just as couriers can ship books, clothes, or electronics only if they’re properly packaged and labeled, Kubernetes can run any workload only when it’s packaged in a Pod. Once wrapped, Kubernetes handles the logistics of running the app — choosing the right nodes, connecting networks, attaching storage volumes, and monitoring its health. Typically, Pods are managed by higher-level controllers like Deployments, which add extra value, similar to courier services offering insurance, express delivery, or tracking. Controllers ensure your applications stay healthy, scale automatically when needed, and maintain the desired state, letting you focus on building apps rather than managing the infrastructure.
The important thing to understand is that each layer of wrapping adds something (the Pod sketch after this list makes the layering concrete):
• The container wraps the app and provides dependencies
• The Pod wraps the container so it can run on Kubernetes
• The Deployment wraps the Pod and adds self-healing, scaling, and more
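The layering is easiest to see in the manifests themselves. Below is a minimal sketch of a bare Pod (the names and image are placeholders); the Deployment shown earlier wraps exactly this kind of Pod definition inside its spec.template and layers replicas, self-healing, and rollouts on top:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-web-pod          # hypothetical Pod name
  labels:
    app: hello-web
spec:
  containers:                  # the container wraps the app and its dependencies
    - name: web
      image: nginx:1.25        # placeholder image
      ports:
        - containerPort: 80
```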
The declarative model and desired state
At the heart of Kubernetes lies the declarative model, a powerful system that revolves around three key concepts — observed state, desired state, and reconciliation. The observed state represents the current reality of your cluster, while the desired state defines how you want it to look or behave. The reconciliation process acts as the bridge between the two, continuously ensuring that what exists matches what you’ve declared. In practice, this model starts when you define your application’s configuration in a YAML manifest — specifying details like container images, replicas, and ports — and submit it to the Kubernetes API server. Once authenticated and stored in the cluster’s key-value store, this configuration becomes a “record of intent.”

From there, controllers constantly monitor the system, detecting any drift between the observed and desired states. If a discrepancy appears — say, a pod crashes or a node fails — the controller automatically takes corrective actions, such as rescheduling pods or pulling new images, to restore harmony. This self-correcting loop keeps your system stable and resilient.

Unlike the imperative model, which relies on manually executed, platform-specific commands and scripts, the declarative model is clean, consistent, and version-controllable — making it easier to roll out changes, recover from failures, and scale seamlessly. For example, if you declare ten replicas of an app and two nodes fail, Kubernetes automatically creates two new replicas to maintain the declared count. Similarly, updating a deployment is as simple as changing one line in your YAML file — Kubernetes handles the rollout, monitoring, and recovery automatically. This model’s elegance lies in its simplicity and automation: you tell Kubernetes what you want, and it continuously works to make that true.
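As an illustrative sketch (reusing the hypothetical hello-web Deployment from earlier, so every name and image here is a placeholder), an update is just an edit to the record of intent; you re-apply the manifest and the controllers reconcile the rest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web              # hypothetical app from the earlier sketch
spec:
  replicas: 10                 # was 3; the Deployment controller converges on ten Pods
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
        - name: web
          image: nginx:1.26    # placeholder version bump; rolled out gradually
          ports:
            - containerPort: 80
```

If two of those Pods later disappear because a node fails, the observed state drops to eight while the desired state is still ten, and the reconciliation loop schedules two replacements without you touching the manifest again.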