vertisystem-global-ltd

Mastering Kubernetes: Unveiling Its Architecture

What does Kubernetes offer beyond Docker if both build on containerization, and why did it evolve?

As containerization became a game-changer in software deployment, Docker emerged as a popular choice for its simplicity and efficiency. However, with the increasing scale and complexity of modern applications, new challenges surfaced, prompting the evolution of Kubernetes. In this article, we will explore the additional features that Kubernetes brings to the table, addressing the limitations of traditional container platforms like Docker. By empowering organizations with container orchestration and management across clusters of hosts, self-healing mechanisms, automatic scaling, and robust enterprise-level capabilities, Kubernetes has revolutionized how applications are deployed and managed in the ever-evolving landscape of technology. Let’s delve into the critical reasons behind the rise of Kubernetes as the go-to solution for modern infrastructure management.

Kubernetes evolved to address these scale and complexity problems, which a single-host container platform does not solve on its own.

How does Kubernetes Solve these problems?

  1. Kubernetes overcomes the single-host scope of a standalone Docker engine by orchestrating and managing containers across a cluster of multiple hosts.

  2. Kubernetes is self-healing: ReplicationControllers (superseded in newer Kubernetes versions by ReplicaSets, which are usually managed through Deployments) ensure that the desired number of pods is always running. If a pod fails, the controller creates a replacement pod to maintain the desired level of availability.

  3. Kubernetes provides the Horizontal Pod Autoscaler (HPA), which automatically scales the number of pod replicas based on CPU utilization, memory usage, or custom metrics. The HPA periodically compares the observed metrics with the configured targets and adjusts the replica count accordingly, so the application scales out or in with workload demand.
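The scaling rule behind the HPA can be sketched in a few lines. This is a simplified illustration of the documented formula (desired replicas = ceil(current replicas × current metric / target metric)) with a tolerance band; the function name, the parameter names, and the default bounds are assumptions made for this example, and the real autoscaler also applies stabilization windows and Pod readiness handling that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Return the replica count an HPA-style rule would aim for (sketch)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Inside the tolerance band, keep the current count to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# 4 replicas at 90% CPU with a 60% target: 4 * 1.5 = 6 replicas.
print(desired_replicas(4, current_metric=90, target_metric=60))
```

Note that the result is always clamped, so a sudden metric spike cannot push the replica count past the configured maximum.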

Kubernetes provides a robust and feature-rich container orchestration platform that addresses enterprise-level requirements for scalability, high availability, security, extensibility, and integration. It has become the de facto standard for container orchestration and has gained wide adoption in the enterprise community.

Kubernetes Architecture
The Kubernetes architecture consists of several components that work together to manage and orchestrate containers within a cluster. Here’s an overview of the key components and their roles, along with an architecture diagram:

[Figure: Kubernetes Architecture]

A. CONTROL PLANE
The control plane in Kubernetes is responsible for managing and maintaining the desired state of the cluster. It consists of several components that work together to orchestrate and control the cluster’s operations. Here’s an overview of the architecture of the control plane in Kubernetes:

[Figure: Kubernetes Architecture - Control Plane]

1. API SERVER:
The API Server runs on the control plane node (historically called the master node) and acts as the “control center” for the entire cluster. Its primary job is to handle incoming requests and provide a way for users, administrators, and other components to interact with the cluster.

⚙️ The API Server serves as an interface or entry point that allows users to communicate with the cluster and perform various actions such as creating, updating, and deleting resources like Pods, Services, and Deployments. It exposes the Kubernetes API, which clients can use to interact with the cluster programmatically or through tools like kubectl.

⚙️ The API Server also performs important tasks like authentication and authorization, ensuring that only authorized users or applications can access and modify the cluster’s resources. It verifies the identity of the requester and checks whether they have the necessary permissions to perform the requested actions.

⚙️ Additionally, the API Server is the gateway to the cluster’s state and configuration. It does not hold this state itself; it reads and writes resource definitions and their current status in the etcd database, which acts as the persistent store, and it is the only component that talks to etcd directly.
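The request path described above (authenticate, authorize, then read or write the persistent store) can be illustrated with a toy model. Everything here is hypothetical and heavily simplified: `ToyApiServer`, its token table, and its rule tuples are invented for this sketch and are not part of any Kubernetes API, and a plain dictionary stands in for etcd.

```python
from dataclasses import dataclass, field


@dataclass
class ToyApiServer:
    """Illustrative only: authenticate, authorize, then touch the store."""
    tokens: dict   # bearer token -> user name (hypothetical)
    rules: set     # allowed (user, verb, resource) tuples (hypothetical)
    store: dict = field(default_factory=dict)  # stands in for etcd

    def handle(self, token, verb, resource, name, obj=None):
        user = self.tokens.get(token)
        if user is None:
            return 401, "Unauthorized"   # authentication failed
        if (user, verb, resource) not in self.rules:
            return 403, "Forbidden"      # authorization failed
        key = f"/{resource}/{name}"
        if verb == "create":
            if key in self.store:
                return 409, "AlreadyExists"
            self.store[key] = obj
            return 201, obj
        if verb == "get":
            if key not in self.store:
                return 404, "NotFound"
            return 200, self.store[key]
        if verb == "delete":
            if key not in self.store:
                return 404, "NotFound"
            return 200, self.store.pop(key)
        return 405, "MethodNotAllowed"


api = ToyApiServer(tokens={"t-alice": "alice"},
                   rules={("alice", "create", "pods"), ("alice", "get", "pods")})
print(api.handle("t-alice", "create", "pods", "web", {"image": "nginx"}))
print(api.handle("t-alice", "delete", "pods", "web"))  # not permitted for alice
```

The ordering is the point of the sketch: a request that fails authentication or authorization never reaches the store.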

2. ETCD STORE:
The etcd store, which typically runs on the control plane nodes, is a distributed key-value database that stores and maintains the cluster’s configuration data and state information. It acts as the source of truth for the cluster; the control plane components read and update this information through the API Server.

The primary job of the etcd store can be summarized as follows:

⚙️ Data Storage: Etcd stores the desired state of the cluster’s resources, such as Pods, Services, Deployments, and ReplicaSets. It keeps track of the configurations and specifications for these resources, including their metadata, labels, and relationships.

⚙️ Consistency and Replication: Etcd keeps the stored data consistent across its members by using the Raft consensus algorithm. Every write is replicated to multiple etcd members, which provides redundancy and fault tolerance: the store keeps functioning as long as a majority of members is available, so a 3-member cluster tolerates one failure and a 5-member cluster tolerates two.

⚙️ Cluster State Management: The etcd store also holds the current state of the cluster as reported through the API Server, including the status of nodes and Pods. This stored state is what enables control plane components to make informed decisions and perform necessary actions.

⚙️ Watch and Notification System: Etcd supports a watch mechanism that reports changes to the stored data as they happen. A client can watch a specific key or a key prefix and receive a notification for each change. In Kubernetes, the API Server builds on this to offer watches on resources, which is how the other components stay informed about updates and trigger appropriate actions.
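To make the watch idea concrete, here is a minimal in-memory sketch of a key-value store with prefix watches and a global revision counter. It is not the etcd client API: the `ToyKV` class, its method names, and the event dictionary layout are assumptions made for illustration, and the `/registry/...` keys are only example paths.

```python
class ToyKV:
    """In-memory stand-in for a watchable key-value store (sketch)."""

    def __init__(self):
        self.data = {}
        self.revision = 0      # incremented on every successful change
        self.watchers = []     # list of (prefix, callback) pairs

    def watch(self, prefix, callback):
        """Register a callback for every change under the given key prefix."""
        self.watchers.append((prefix, callback))

    def _notify(self, event_type, key, value):
        for prefix, callback in self.watchers:
            if key.startswith(prefix):
                callback({"type": event_type, "key": key,
                          "value": value, "revision": self.revision})

    def put(self, key, value):
        self.revision += 1
        self.data[key] = value
        self._notify("PUT", key, value)

    def delete(self, key):
        if key in self.data:   # deleting a missing key changes nothing
            self.revision += 1
            del self.data[key]
            self._notify("DELETE", key, None)


kv = ToyKV()
kv.watch("/registry/pods/", print)          # print each matching event
kv.put("/registry/pods/web", "spec-v1")     # triggers the watcher
kv.put("/registry/services/web", "svc")     # different prefix, no event
kv.delete("/registry/pods/web")             # triggers the watcher
```

Because every event carries the revision at which it happened, a watcher that reconnects can tell which changes it has already seen.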

3. KUBE CONTROLLER MANAGER:
The Kube Controller Manager on the control plane node acts as the “brain” or “manager” of the cluster. It runs the built-in controllers, whose job is to watch the state of the cluster, keep it at the desired state, and respond to any changes or events that occur.

Here’s a simplified explanation of the Kube Controller Manager’s job:

⚙️ Resource Monitoring: The Kube Controller Manager continuously monitors the state of resources in the cluster through the API Server. It keeps an eye on Kubernetes objects such as Pods, Services, Deployments, and ReplicaSets, checking whether they exist, whether they are running as expected, and whether any changes or failures have occurred.

⚙️ Desired State Enforcement: The Kube Controller Manager ensures that the cluster’s resources match the desired state specified by users or administrators. It compares the actual state of resources with the desired state and takes action to reconcile any discrepancies. For example, if a Pod fails or gets deleted, the Controller Manager will initiate the creation of a new Pod to maintain the desired number of replicas.

⚙️ Automatic Healing: If a resource fails or becomes unhealthy, the controllers take corrective action. They create replacement Pods for failed ones and remove Pods from nodes that have become unreachable; the scheduler then places the replacements on healthy nodes. Restarting a failed container in place is the kubelet’s job. Together this maintains the overall health and availability of the cluster’s resources.

⚙️ Scaling and Auto-scaling: The Kube Controller Manager handles scaling operations. It scales resources like Deployments and ReplicaSets by creating or terminating Pods to match the requested replica count, and it runs the Horizontal Pod Autoscaler controller, which adjusts that count based on metrics. For example, it can add more Pods to handle an increased workload or remove Pods during periods of low demand.

⚙️ Event-driven Actions: The controllers react to events such as the creation, deletion, or modification of objects. Based on these events, they perform tasks like updating the endpoints behind a Service, carrying out rolling updates, or cleaning up dependent objects.

Some types of these controllers are:

· Replication controller: Maintains the correct number of pods for every ReplicationController object in the cluster.

· Node controller: Monitors the health of each node and responds when nodes come online or become unresponsive.

· Endpoints controller: Populates the Endpoints object, that is, joins Services and Pods.

· Service Account and Token controllers: Create default accounts and API access tokens for new namespaces in the cluster.
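All of these controllers follow the same pattern: compare the desired state with the actual state and act on the difference. A minimal sketch of one reconciliation pass for a replica count might look like the following; the `reconcile` function, the Pod dictionaries, and the naming scheme are assumptions made for this example rather than real Kubernetes types.

```python
import itertools

_ids = itertools.count(1)  # source of unique suffixes for new Pod names


def reconcile(desired: int, pods: list, prefix: str = "web") -> list:
    """Run one reconciliation pass and return the resulting Pod list."""
    # Keep only healthy Pods; failed ones are replaced, not repaired.
    healthy = [p for p in pods if p["phase"] == "Running"]
    # Too few: create replacements until the count matches the desired state.
    while len(healthy) < desired:
        healthy.append({"name": f"{prefix}-{next(_ids)}", "phase": "Running"})
    # Too many: drop the surplus.
    return healthy[:desired]


pods = [{"name": "web-a", "phase": "Running"},
        {"name": "web-b", "phase": "Failed"}]
pods = reconcile(3, pods)
print([p["name"] for p in pods])
```

Running the same pass again on a state that already matches changes nothing, which is what lets a controller run the loop continuously without side effects.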

4. KUBE SCHEDULER:
The Kube Scheduler on the control plane node acts as the “matchmaker” for the cluster. Its primary job is to decide which worker node in the cluster should run each newly created Pod based on various factors and constraints.

Here’s a simplified explanation of the Kube Scheduler’s job:

⚙️ Pod Scheduling: When a new Pod is created in Kubernetes, the Kube Scheduler determines the most suitable worker node to run it. It takes into account factors like resource availability, node capacity, and other scheduling preferences to make an optimal decision.

⚙️ Resource Optimization: The Kube Scheduler looks at the resource requirements of the Pod, such as CPU and memory, and checks the availability of these resources on the worker nodes. It aims to distribute the workload evenly across the cluster to ensure efficient resource utilization.

⚙️ Node Affinity/Anti-affinity: The Kube Scheduler considers any affinity or anti-affinity rules specified in the Pod’s configuration. These rules define preferences or constraints regarding the placement of the Pod. For example, a Pod may be required to run on a node with specific labels or avoid running on nodes with certain labels.

⚙️ Load Balancing: The Kube Scheduler aims to spread the workload across worker nodes to prevent any single node from becoming overloaded. By default it scores nodes on the resources already requested by the Pods placed there rather than on live usage, and it distributes new Pods accordingly, promoting efficient utilization and preventing resource bottlenecks.

⚙️ High Availability: The Kube Scheduler supports high availability by considering fault tolerance. It prefers to spread the replicas of a workload across different nodes, and Pod anti-affinity or topology spread constraints can make this a hard requirement. This way, if a node goes down, only some replicas are affected, and the replacement Pods created by the workload’s controller can be scheduled on healthy nodes.

⚙️ Custom Scheduling Policies: The Kube Scheduler can also take into account custom scheduling policies defined by administrators. These policies may prioritize certain Pods or enforce specific placement rules based on business requirements or application characteristics.
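The scheduling decision described above can be thought of as two steps: filter out nodes that cannot run the Pod, then score the remaining ones and pick the best. The sketch below illustrates that idea only; the `schedule` function, the dictionary fields, and the scoring formula (average fraction of CPU and memory left free after placement) are assumptions for this example and not the scheduler's actual plugins.

```python
def schedule(pod: dict, nodes: list):
    """Return the name of the chosen node, or None if no node fits."""
    feasible = []
    for node in nodes:
        free_cpu = node["cpu"] - node["cpu_requested"]
        free_mem = node["mem"] - node["mem_requested"]
        # Filter step: enough unrequested resources and matching labels.
        selector = pod.get("node_selector", {})
        labels_ok = all(node["labels"].get(k) == v for k, v in selector.items())
        if free_cpu >= pod["cpu"] and free_mem >= pod["mem"] and labels_ok:
            feasible.append((node, free_cpu, free_mem))
    if not feasible:
        return None

    def score(item):
        node, free_cpu, free_mem = item
        # Score step: prefer the node left least loaded after placement.
        cpu_left = (free_cpu - pod["cpu"]) / node["cpu"]
        mem_left = (free_mem - pod["mem"]) / node["mem"]
        return (cpu_left + mem_left) / 2

    return max(feasible, key=score)[0]["name"]


nodes = [
    {"name": "n1", "cpu": 4000, "cpu_requested": 3500,
     "mem": 8192, "mem_requested": 1024, "labels": {"disk": "ssd"}},
    {"name": "n2", "cpu": 4000, "cpu_requested": 1000,
     "mem": 8192, "mem_requested": 4096, "labels": {"disk": "hdd"}},
    {"name": "n3", "cpu": 2000, "cpu_requested": 0,
     "mem": 4096, "mem_requested": 0, "labels": {"disk": "ssd"}},
]
print(schedule({"cpu": 1000, "mem": 1024}, nodes))
```

In this example `n1` is filtered out because it lacks free CPU, and the score then decides between the two remaining nodes.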

5. CLOUD CONTROLLER MANAGER:
The Cloud Controller Manager (CCM) on the control plane node acts as a bridge between the Kubernetes cluster and the underlying cloud provider’s services. Its main job is to manage and interact with the cloud infrastructure on behalf of the cluster, enabling Kubernetes to leverage the cloud provider’s capabilities.

Here’s a simplified explanation of the Cloud Controller Manager’s job:

⚙️ Cloud Provider Integration: The Cloud Controller Manager integrates Kubernetes with the services and features provided by the underlying cloud provider. It understands the cloud provider’s APIs, protocols, and mechanisms for interacting with the infrastructure.

⚙️ Resource Management: The Cloud Controller Manager manages the cloud resources that Kubernetes objects depend on, mainly load balancers and network routes, and keeps them in line with the cluster’s needs and user-defined configurations. It does this through a small set of controllers: a node controller, a route controller, and a service controller.

⚙️ Node Management: The node controller keeps the cluster’s Node objects in sync with the cloud provider’s VM instances. It annotates and labels each Node with cloud-specific information such as region, zone, and instance type, and it removes the Node object once the underlying instance has been deleted in the cloud. Provisioning and scaling the VMs themselves is normally left to other tooling, such as a cluster autoscaler or the provider’s managed node groups.

⚙️ Load Balancing: The service controller configures and manages load balancers provided by the cloud provider. When a Service of type LoadBalancer is created, it provisions and configures a cloud load balancer to distribute incoming network traffic across the nodes and Pods backing the Service. This helps to ensure high availability, scalability, and efficient traffic routing.

⚙️ Storage Provisioning: Persistent storage from the cloud provider, such as the volumes behind Persistent Volumes, is dynamically created and attached so that applications can store data persistently. In current Kubernetes versions this is generally handled by the provider’s Container Storage Interface (CSI) driver rather than by the Cloud Controller Manager itself.

⚙️ Networking: The route controller configures routes in the cloud network so that Pods on different nodes can communicate with each other. Network policies, by contrast, are enforced by the cluster’s network plugin, not by the Cloud Controller Manager.

B. DATA PLANE

[Figure: Kubernetes Architecture - Data Plane]

Data Plane consists of three main components:

1. Container Runtime:
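It is the software responsible for managing the execution and lifecycle of containers on worker nodes. It interacts with the operating system’s kernel to create, start, stop, and manage the containers. Kubernetes talks to the runtime through the Container Runtime Interface (CRI); the most commonly used runtimes today are containerd and CRI-O. Docker Engine was long the most common choice, but dockershim, the built-in adapter that let the kubelet drive Docker, was removed in Kubernetes 1.24 (Docker Engine can still be used through the external cri-dockerd adapter), and rkt has been discontinued. The runtime pulls container images, creates containers, mounts volumes, manages networking, and enforces resource constraints.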

2. Kubelet:
It is an essential component that runs on each worker node in the cluster. Its primary responsibility is to manage the state of the nodes and ensure that the containers running on the node are running as expected. Here’s a closer look at the role and functions of the kubelet:

2.A. POD MANAGEMENT:

⚙️ Pod Creation: The kubelet receives Pod specifications from the API server and is responsible for creating and managing the containers that make up the Pod on the node. It communicates with the container runtime (e.g., containerd) through the Container Runtime Interface to pull container images and create containers based on the Pod specifications.

⚙️ Pod Monitoring: The kubelet continuously monitors the health of the containers within the assigned Pods. It regularly checks the container status, resource usage, and health probes defined in the Pod specifications. If a container fails or becomes unresponsive, the kubelet restarts it according to the Pod’s restart policy.

⚙️ Resource Management: The kubelet manages the resources allocated to each Pod and enforces resource constraints defined in the Pod specifications. It monitors CPU, memory, and other resource usage of containers and ensures they stay within the specified limits.
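The health-probe behaviour can be sketched as a counter of consecutive failures. The function below is a simplified illustration, not kubelet code: `run_liveness_checks` and its input format are invented for this example, and the default of 3 mirrors the default `failureThreshold` of a Kubernetes probe.

```python
def run_liveness_checks(probe_results, failure_threshold: int = 3):
    """Return the probe indices at which the container would be restarted."""
    restarts = []
    consecutive_failures = 0
    for index, healthy in enumerate(probe_results):
        if healthy:
            consecutive_failures = 0   # any success resets the streak
            continue
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            restarts.append(index)
            consecutive_failures = 0   # the new container starts clean
    return restarts


# Two failures followed by a success do not trigger a restart;
# three failures in a row do.
print(run_liveness_checks([True, False, False, True, False, False, False, True]))
```

Requiring several consecutive failures keeps a single slow response from causing an unnecessary restart.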

2.B. NODE STATUS AND REPORTING:

⚙️ Node Heartbeat: The kubelet sends periodic heartbeats to the cluster’s control plane, indicating that the node is alive and functioning properly. This heartbeat includes information about the node’s resources, availability, and any changes in the status of its assigned Pods.

⚙️ Node Registration: When a node joins the cluster, the kubelet registers itself with the API server, providing information about the node, its capacity, and available resources.

⚙️ Node Eviction: Pods can be evicted from a node in two ways. If the node itself runs short of resources such as memory or disk space, the kubelet terminates Pods on the node to reclaim them and reports the pressure condition to the control plane. If the control plane stops receiving heartbeats from a node, it marks the node as unreachable and, after a timeout, evicts its Pods so that their controllers can create replacements on other nodes.
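How the control plane uses heartbeats can be sketched as a staleness check. This is an illustration only: `node_conditions` and its arguments are hypothetical, and the 40-second grace period is an assumed value for the example (the real setting is configurable and its default has varied between Kubernetes versions).

```python
def node_conditions(last_heartbeat: dict, now: float,
                    grace_period: float = 40.0) -> dict:
    """Map each node name to "Ready" or "Unknown" based on heartbeat age."""
    conditions = {}
    for name, timestamp in last_heartbeat.items():
        age = now - timestamp
        # A node that has been silent for longer than the grace period
        # can no longer be assumed healthy.
        conditions[name] = "Ready" if age <= grace_period else "Unknown"
    return conditions


heartbeats = {"node-a": 100.0, "node-b": 50.0}   # seconds, illustrative
print(node_conditions(heartbeats, now=120.0))
```

The control plane cannot tell a crashed node from a network partition, which is why a silent node is reported as unknown rather than as failed.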

2.C. CONTAINER LIFECYCLE:

⚙️ Container Start and Stop: The kubelet starts and stops containers based on Pod specifications. It ensures that the required containers are running and, if necessary, pulls the container images from the registry.

⚙️ Container Cleanup: When a Pod is removed or its containers are terminated, the kubelet ensures the proper cleanup of containers, volumes, and other associated resources on the node.

2.D. VOLUME MANAGEMENT:

⚙️ Volume Attach and Mount: The kubelet manages the node-side lifecycle of volumes used by Pods. Once a volume has been attached to the node (for many volume types this is done by a controller in the control plane), the kubelet mounts it into the Pod’s containers as specified.

⚙️ Volume Cleanup: When a Pod is removed or its volumes are no longer needed, the kubelet unmounts the associated volumes and cleans up their mount points so they can be detached from the node.

3. Kube-proxy:
It is a component that runs on each worker node and is responsible for network proxying and load balancing within the cluster. Its main role is to enable communication between services and manage networking on the worker nodes. Here’s a closer look at the functions and responsibilities of kube-proxy:

3.A. SERVICE DISCOVERY:

⚙️ Service Endpoint Discovery: kube-proxy monitors the Kubernetes API server for changes in the service configuration. It discovers services and their associated Pods, retrieving their IP addresses and endpoints.

⚙️ Endpoints Update: Whenever a Pod is added or removed or a service is created, updated, or deleted, kube-proxy updates the local network configuration on the worker node to reflect the changes.

3.B. LOAD BALANCING:

⚙️ Service Load Balancing: kube-proxy provides load balancing functionality for services that have multiple Pod replicas. It distributes incoming traffic across the available Pods, ensuring even distribution and efficient utilization of resources.

⚙️ IP Virtual Server: kube-proxy can use IPVS (IP Virtual Server) as the underlying mechanism for load balancing. On Linux the default mode is iptables, and IPVS is an optional mode. IPVS is a kernel-level feature that allows for high-performance load balancing by distributing network traffic based on various algorithms like round-robin, least connections, or source IP hash.
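Round-robin, the simplest of the algorithms mentioned above, can be sketched in a few lines. This is a user-space illustration of the selection logic only; `RoundRobinBalancer` is a hypothetical class, the IP addresses are examples, and the real balancing happens in the kernel rather than in application code.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend endpoints in a fixed rotating order (sketch)."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)

    def pick(self) -> str:
        """Return the endpoint that should receive the next connection."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([balancer.pick() for _ in range(5)])
```

Each endpoint receives every third connection here, which spreads load evenly as long as requests cost roughly the same.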

3.C. NETWORK PROXYING:

⚙️ Service Cluster IP: Each Service in the cluster gets a virtual IP address, known as the Cluster IP, which is allocated by the API server when the Service is created. kube-proxy programs rules on every node so that requests made to the Cluster IP are routed to the Pods that back the service.

⚙️ External Traffic: kube-proxy also facilitates external access to services within the cluster. For NodePort services it sets up network address translation (NAT) rules that forward traffic arriving on the node’s port to the backing Pods. For LoadBalancer services, the cloud provider’s load balancer (provisioned by the Cloud Controller Manager) sends traffic to the nodes, where kube-proxy’s rules take over.

3.D. HIGH AVAILABILITY and FAILOVER:

⚙️ Endpoint Health Checks: kube-proxy does not probe Pods itself. The kubelet runs each Pod’s readiness probes, and Pods that are not ready are marked as such in the Service’s endpoints. kube-proxy watches these endpoints and excludes unhealthy or unresponsive ones from the load balancing rotation until they become ready again.

⚙️ Endpoint Failover: In case a Pod or endpoint fails, kube-proxy dynamically adjusts the load balancing configuration, removing the failed endpoint from the rotation and redirecting traffic to the remaining healthy endpoints.

⚙️ IPv6 Support: kube-proxy supports IPv6 in addition to IPv4, allowing services and Pods to be addressed and accessed using IPv6 addresses.

PODS AND SERVICES:
In Kubernetes, pods and services are fundamental building blocks that work together to enable the deployment and networking of applications. Here’s an explanation of pods and services in Kubernetes:

[Figure: Kubernetes Architecture - Pods and Services]

PODS:
A pod is the smallest deployable unit in the Kubernetes ecosystem. It represents a group of one or more containers that are deployed together on the same host and share the same network namespace.

Containers within a pod are tightly coupled and typically work together to form a cohesive application or microservice. They share the same IP address and port space, making it easy for them to communicate with each other using localhost.

Pods are ephemeral, meaning they can be created, stopped, and replaced as needed. They are often used as the deployment target for applications and encapsulate the application’s code, dependencies, and resources.
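As an illustration of two containers sharing one Pod, the sketch below builds a Pod manifest as a Python dictionary and prints it as JSON, a format `kubectl` accepts alongside YAML. The Pod name, labels, image tags, and the sidecar command are assumptions chosen for the example; the second container reaches the first over `localhost` because both share the Pod's network namespace.

```python
import json

# Illustrative manifest: names, labels, and images are example values.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-with-sidecar", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            },
            {
                # Sidecar that polls the web container over localhost.
                "name": "poller",
                "image": "busybox:1.36",
                "command": [
                    "sh", "-c",
                    "while true; do wget -q -O- http://localhost:80 "
                    ">/dev/null; sleep 10; done",
                ],
            },
        ]
    },
}

print(json.dumps(pod_manifest, indent=2))
```

Saved to a file, output like this could be submitted with `kubectl apply -f`, although in practice Pods are usually created indirectly through a Deployment.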

SERVICES:
A service is an abstraction that defines a logical set of pods and provides a consistent way to access them. It acts as a stable network endpoint that enables communication with the pods regardless of their dynamic nature.

Services provide a higher-level networking mechanism for pods. They assign a unique IP address and DNS name to a group of pods, allowing other components or services within or outside the cluster to communicate with the pods using these identifiers.

Services can be of different types, such as ClusterIP (accessible only within the cluster), NodePort (exposes the service on a static port on each worker node), or LoadBalancer (provisions a cloud provider’s load balancer to distribute traffic to the service).

When pods are created or removed, the control plane automatically updates the set of endpoints behind each service to include the new pods and drop the outdated ones, ensuring seamless and uninterrupted communication.
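The way a service finds its pods can be sketched as label matching. In this simplified model a service has a selector, and every ready pod whose labels contain all of the selector's key-value pairs becomes an endpoint; `select_endpoints`, the pod dictionaries, and the IP addresses are assumptions made for the example.

```python
def select_endpoints(selector: dict, pods: list) -> list:
    """Return the IPs of ready pods whose labels match the selector."""
    endpoints = []
    for pod in pods:
        matches = all(pod["labels"].get(key) == value
                      for key, value in selector.items())
        if matches and pod["ready"]:
            endpoints.append(pod["ip"])
    return endpoints


pods = [
    {"ip": "10.1.0.4", "labels": {"app": "web", "tier": "frontend"}, "ready": True},
    {"ip": "10.1.0.5", "labels": {"app": "web", "tier": "frontend"}, "ready": False},
    {"ip": "10.1.0.6", "labels": {"app": "db"}, "ready": True},
]
print(select_endpoints({"app": "web"}, pods))
```

Re-running the selection whenever pods change is what keeps the service pointing at the current set of healthy pods.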

✍️Sashi Akula
