Kenta Takeuchi

Posted on Mar 15 • Originally published at bmf-tech.com

Reading Kubernetes Documentation - Summary of Concepts

#kubernetes #docker #container

This article was originally published on bmf-tech.com.

Overview

To get properly up to speed on Kubernetes, I read through the documentation and took personal notes. Because the documentation is long, these notes cover only the Concepts section.

kubernetes.io

What is Kubernetes?

cf. What is Kubernetes?

What is Kubernetes?

Declarative configuration management
Promotion of automation
A platform for managing containerized workloads and services

Looking Back

Deployment before virtualization (Traditional deployment)
- No way to define resource boundaries for applications on a physical server
- Resource allocation issues
- Difficult to scale
- Maintenance costs
Deployment using virtualization (Virtualized deployment)
- Applications can be isolated per VM
- Restriction on data access between applications
- Improved resource utilization within physical servers through virtualization
- Easy to add or update applications
- Reduced hardware costs, improved scalability
Deployment using containers (Container deployment)
- OS can be shared between applications
- Lightweight
- Containers have their own filesystem, CPU, memory, process space, etc.
- Independent of cloud or OS distribution
- Benefits of containers
  - Easier and more efficient to create container images than VM images
  - Continuous build and deployment of container images
  - Separation of concerns between development and operations
    - Application container images are created at build/release time rather than at deployment time
  - High observability
    - Surfaces application health and other signals in addition to OS-level information and metrics
  - Consistency across environments
    - Runs the same way in development, testing, and production
  - Portability across cloud and OS distributions
    - Runs in any environment, whether on-premises or public cloud
  - Application-centric management
    - Raises the level of abstraction from running an OS on virtual hardware to running an application on an OS using logical resources
  - High affinity with microservices
    - Suits loosely coupled, distributed, scalable, and flexible microservices
  - Resource isolation
    - Predictable application performance
  - Efficient, high-density use of resources

Why Kubernetes is Needed and Its Features

Service discovery and load balancing
- Containers can be exposed using DNS names or IP addresses
- Network traffic can be distributed
Storage orchestration
- Freedom to choose storage to mount
Automated rollouts and rollbacks
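- Can describe the desired state of deployed containers; Kubernetes changes the actual state to the desired state at a controlled rate
Automatic bin packing
- Can declare CPU and memory (RAM) required by containers
- Kubernetes fits containers onto nodes to make the best use of resources
Self-healing
- Restarts containers that fail, replaces containers, and kills containers that do not respond to user-defined health checks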
Secrets and configuration management
- Can update application configuration without recreating container images
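
To make the automatic bin packing item above more concrete, here is a minimal sketch: each container declares the CPU and memory it needs, and it is fitted onto a node that still has room. The `Node` class, the `place` function, and the first-fit strategy are assumptions made for illustration; the real scheduler is considerably more sophisticated.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: int       # free CPU in millicores
    free_memory: int    # free memory in MiB
    pods: list = field(default_factory=list)


def place(pod_name, cpu_request, memory_request, nodes):
    """Place a pod on the first node with enough free CPU and memory."""
    for node in nodes:
        if node.free_cpu >= cpu_request and node.free_memory >= memory_request:
            node.free_cpu -= cpu_request
            node.free_memory -= memory_request
            node.pods.append(pod_name)
            return node.name
    return None  # nothing fits; in this sketch the pod is simply not placed


nodes = [Node("node-a", 1000, 2048), Node("node-b", 2000, 4096)]
requests = [("web", 500, 1024), ("db", 1000, 2048), ("cache", 500, 1024)]
placements = {name: place(name, cpu, mem, nodes) for name, cpu, mem in requests}
print(placements)
```

Here `web` and `cache` together fill `node-a` exactly, while `db` does not fit next to `web` and lands on `node-b`.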

What Kubernetes Does Not Include

Kubernetes does not...
- Limit the types of applications it supports
- Deploy source code or build applications
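- Provide application-level services (middleware, databases, caches, etc.) as built-in features
- Dictate logging, monitoring, or alerting solutions
- Provide or mandate a configuration language
- Provide or adopt systems for machine configuration, maintenance, management, or self-healing
- Act as a mere orchestration system: instead of executing a fixed workflow (first A, then B, then C), independent control processes continuously drive the current state toward the desired state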

Kubernetes Components

cf. Kubernetes Components

Deploying Kubernetes results in a cluster
- A cluster is a set of nodes that run containerized applications
- Every cluster has at least one worker node
- Worker nodes host Pods, which are components of applications
- The control plane (formerly the master nodes) manages worker nodes and Pods within the cluster
- In production environments, the control plane usually runs across multiple machines and the cluster runs multiple nodes, providing failover, fault tolerance, and high availability
- Diagram of a Kubernetes Cluster

Control Plane Components

Makes overall decisions about the cluster (e.g., scheduling)

kube-apiserver

Component that exposes the Kubernetes API; it is the front end of the control plane
Designed to scale horizontally

etcd

Consistent, highly available key-value store
Stores all cluster information for Kubernetes
When using etcd as a data store for Kubernetes, always create a backup plan

kube-scheduler

Monitors newly created Pods that have no node assigned, and selects a node for them to run on
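
A rough sketch of that behaviour: find Pods with no node assigned, filter out nodes that cannot fit them, then pick one of the remaining nodes. The dictionary layout, the `schedule` function, and the "most free CPU wins" scoring rule are hypothetical choices for illustration and are not taken from kube-scheduler itself.

```python
def schedule(pods, free_cpu):
    """Bind every Pod that has no node yet to a node with enough free CPU.

    pods: list of dicts with "name", "cpu" (millicores) and optional "nodeName".
    free_cpu: dict mapping node name to free CPU in millicores (mutated in place).
    """
    for pod in pods:
        if pod.get("nodeName"):
            continue  # already assigned to a node
        # Filter: keep only the nodes that can fit the Pod.
        feasible = sorted(n for n, cpu in free_cpu.items() if cpu >= pod["cpu"])
        if not feasible:
            continue  # no feasible node; the Pod stays unassigned
        # Score: prefer the node with the most free CPU (hypothetical policy).
        best = max(feasible, key=lambda n: free_cpu[n])
        pod["nodeName"] = best
        free_cpu[best] -= pod["cpu"]
    return pods


pods = [
    {"name": "a", "cpu": 500, "nodeName": "node-1"},
    {"name": "b", "cpu": 300},
    {"name": "c", "cpu": 600},
    {"name": "d", "cpu": 5000},
]
free_cpu = {"node-1": 500, "node-2": 1000}
schedule(pods, free_cpu)
assignments = {p["name"]: p.get("nodeName") for p in pods}
print(assignments)
```

Pod `a` is left alone because it already has a node, and `d` stays unassigned because no node can fit it.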

kube-controller-manager

Runs multiple controller processes
Logically, each controller is a separate process, but to reduce complexity they are all compiled into a single binary and run as a single process
Includes the following controllers:
- Node Controller
  - Notices and responds when nodes go down
- Replication Controller
  - Maintains the correct number of Pods for every replication controller object
- Endpoint Controller
  - Populates Endpoints objects, that is, links Services and Pods
- Service Account and Token Controller
  - Creates default accounts and API access tokens for new namespaces

cloud-controller-manager

Runs controllers that interact with the underlying cloud provider
The following controllers have dependencies on the cloud provider:
- Node Controller
  - Checks with the cloud provider to determine whether a node has been deleted after it stops responding
- Route Controller
  - Sets up routing in the underlying cloud infrastructure
- Service Controller
  - Creates, updates, and deletes cloud provider load balancers
- Volume Controller
  - Creates, attaches, and mounts volumes, and coordinates with the cloud provider

Node Components

Run on every node, maintaining running Pods and providing the Kubernetes runtime environment

kubelet

Agent that runs on each node in the cluster
Ensures that containers are running in a Pod

kube-proxy

Network proxy that runs on each node in the cluster, implementing part of the Kubernetes Service concept

Container Runtime

Software responsible for running containers
e.g., Docker, containerd, CRI-O

Add-ons

Uses Kubernetes resources (DaemonSet, Deployment, etc.) to implement cluster features
Provides cluster-level features
- Namespaced resources for add-ons belong to the kube-system namespace
Some add-ons include:
- DNS
- Web UI
- Container Resource Monitoring
- Cluster-level Logging

Kubernetes API

cf. Kubernetes API

Refer to API Reference

About Kubernetes Objects

cf. About Kubernetes Objects

Object spec and status

Kubernetes objects have two nested object fields that manage the object's configuration
- spec
- Describes the desired state and characteristics of the object
- status
- Indicates the current state of the object
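
The relationship between the two fields can be sketched as a small loop that keeps nudging `status` toward `spec`. The `replicas` field and the `reconcile` function are assumptions chosen for illustration; real controllers act on the cluster and then observe the result rather than writing `status` directly.

```python
def reconcile(obj):
    """Take one step that moves the observed state toward the desired state."""
    desired = obj["spec"]["replicas"]
    observed = obj["status"]["replicas"]
    if observed < desired:
        obj["status"]["replicas"] = observed + 1
        return "started one replica"
    if observed > desired:
        obj["status"]["replicas"] = observed - 1
        return "stopped one replica"
    return "in sync"


deployment = {"spec": {"replicas": 3}, "status": {"replicas": 1}}
actions = []
for _ in range(10):  # bounded loop for the example instead of running forever
    action = reconcile(deployment)
    if action == "in sync":
        break
    actions.append(action)
print(actions, deployment["status"])
```

The user only ever edits `spec`; the loop is what makes `status` catch up.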

Kubernetes Object Management

cf. Kubernetes Object Management

Management Methods

Imperative Commands
- Operate on live objects
- Recommended for development projects
Imperative Object Configuration
- Operates on individual files
- Recommended for production projects
Declarative Object Configuration
- Operates on directories of files
- Recommended for production projects

Imperative Commands

Users operate directly on live objects in the cluster

Imperative Object Configuration

Specify the operation, optional flags, and one or more filenames with the kubectl command

Declarative Object Configuration

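Users operate on object configuration files stored locally, but do not specify the operations to perform on them
Create, update, and delete operations are detected automatically per object by kubectl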
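
The idea that the operation is derived rather than written down can be sketched by comparing the local configuration with the live objects. The `plan_operations` function and the dictionary shapes are hypothetical; this is a simplification and not how kubectl is implemented.

```python
def plan_operations(local, live):
    """Derive the operation per object by comparing local config with live state.

    local: dict of object name -> desired configuration (from local files).
    live:  dict of object name -> configuration currently in the cluster.
    """
    operations = {}
    for name, config in local.items():
        if name not in live:
            operations[name] = "create"
        elif live[name] != config:
            operations[name] = "update"
    for name in live:
        if name not in local:
            operations[name] = "delete"
    return operations


local = {"web": {"replicas": 3}, "api": {"replicas": 2}}
live = {"web": {"replicas": 2}, "old-worker": {"replicas": 1}}
operations = plan_operations(local, live)
print(operations)
```

Objects that already match the local configuration do not appear in the result at all.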

Object Names and IDs

cf. Object Names and IDs

Namespace

cf. Namespace

Supports running multiple virtual clusters on the same physical cluster
- These virtual clusters are called Namespaces
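
One way to picture a Namespace is as a scope for object names: the same name can be reused in different namespaces but not twice within one. The `ObjectStore` class below is a hypothetical in-memory stand-in written for this note, not a Kubernetes API.

```python
class ObjectStore:
    """Toy object store in which names are unique per (namespace, kind)."""

    def __init__(self):
        self._objects = {}  # (namespace, kind, name) -> spec

    def create(self, namespace, kind, name, spec):
        key = (namespace, kind, name)
        if key in self._objects:
            raise ValueError(f"{kind} {name!r} already exists in namespace {namespace!r}")
        self._objects[key] = spec

    def names(self, namespace, kind):
        return sorted(n for (ns, k, n) in self._objects if ns == namespace and k == kind)


store = ObjectStore()
store.create("team-a", "Pod", "web", {"image": "nginx"})
store.create("team-b", "Pod", "web", {"image": "nginx"})  # same name, other namespace
print(store.names("team-a", "Pod"), store.names("team-b", "Pod"))
```

Creating a second `web` Pod inside `team-a` would raise `ValueError` in this sketch.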

Nodes

cf. Nodes

Worker machines
A node can be a VM or a physical machine, depending on the cluster
Each node includes the services necessary to run Pods and is managed by the control plane

Pod Overview

cf. Pod Overview

The smallest deployable unit in the Kubernetes object model

Understanding Pods

Pods are the basic execution units of Kubernetes applications
A Pod encapsulates application containers, storage resources, a unique network IP, and options that govern how the containers run

ReplicaSet

cf. ReplicaSet

Aims to maintain a stable set of replica Pods at all times

When to Use ReplicaSet

Ensures that a specified number of Pod replicas are running at all times
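
As a sketch of what "a specified number of replicas" means in practice: count the Pods that match the selector, then add or remove Pods until the count is right. The function name, the Pod dictionaries, and the generated Pod names are assumptions for illustration only.

```python
def reconcile_replicaset(replicas, selector, pods, prefix="web"):
    """Return a Pod list in which exactly `replicas` Pods match `selector`."""

    def matches(pod):
        return all(pod["labels"].get(k) == v for k, v in selector.items())

    matching = [p for p in pods if matches(p)]
    others = [p for p in pods if not matches(p)]
    created = 0
    while len(matching) < replicas:
        created += 1
        # Hypothetical naming scheme for newly created Pods.
        matching.append({"name": f"{prefix}-new-{created}", "labels": dict(selector)})
    return others + matching[:replicas]  # surplus matching Pods are dropped


pods = [
    {"name": "web-a", "labels": {"app": "web"}},
    {"name": "db-a", "labels": {"app": "db"}},
]
scaled_up = reconcile_replicaset(3, {"app": "web"}, pods)
scaled_down = reconcile_replicaset(1, {"app": "web"}, scaled_up)
print([p["name"] for p in scaled_up])
print([p["name"] for p in scaled_down])
```

Pods that do not match the selector, such as `db-a`, are never touched.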

Deployment

cf. Deployment

Provides declarative updates for Pods and ReplicaSets
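
One common form of such an update is a rolling update, where Pods move gradually from an old ReplicaSet to a new one. The sketch below is a simplified model under stated assumptions: new Pods become ready instantly, the total never drops below the desired count, and `max_surge` only loosely mirrors the Deployment setting of the same name. The `rolling_update` function itself is hypothetical.

```python
def rolling_update(old, desired, max_surge=1):
    """Return the (old, new) replica counts after each step of a rollout.

    Simplifications: new Pods become ready instantly, and the number of Pods
    never drops below `desired` (as if no Pod were allowed to be unavailable).
    """
    if max_surge < 1:
        raise ValueError("this sketch needs max_surge >= 1 to make progress")
    new = 0
    steps = [(old, new)]
    while new < desired or old > 0:
        # Scale up the new ReplicaSet as far as the surge budget allows.
        room = desired + max_surge - (old + new)
        new += max(0, min(room, desired - new))
        # Scale down the old ReplicaSet without dropping below `desired` Pods.
        old -= min(old, old + new - desired)
        steps.append((old, new))
    return steps


steps = rolling_update(old=3, desired=3, max_surge=1)
print(steps)
```

With three replicas and a surge of one, the rollout replaces one Pod per step, and at no point are fewer than three Pods running.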

StatefulSet

cf. StatefulSet

A workload API for managing stateful applications
Manages scaling of a set of Pods and ensures order and uniqueness
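
The ordering and uniqueness guarantees can be sketched with ordinal-based names: each Pod gets a stable name of the form `<name>-<ordinal>`, Pods are created in increasing ordinal order, and removed in decreasing order. The `scale_steps` helper is hypothetical and only models the default ordered behaviour.

```python
def scale_steps(name, current, desired):
    """Return the ordered (action, pod name) steps to go from current to desired replicas."""
    if desired >= current:
        # Scale up: create the lowest missing ordinal first.
        return [("create", f"{name}-{i}") for i in range(current, desired)]
    # Scale down: delete the highest ordinal first.
    return [("delete", f"{name}-{i}") for i in range(current - 1, desired - 1, -1)]


up = scale_steps("web", 1, 3)
down = scale_steps("web", 3, 1)
print(up)
print(down)
```

Because names derive from the ordinal, `web-0` remains `web-0` across rescheduling, which is what lets stateful applications keep a stable identity.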

DaemonSet

cf. DaemonSet

Ensures that all (or some) nodes run a copy of a Pod
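
A minimal sketch of "all (or some) nodes": one Pod per node, optionally restricted to nodes whose labels match a selector. The `daemonset_pods` function and the Pod naming scheme are assumptions for illustration.

```python
def daemonset_pods(name, nodes, node_selector=None):
    """Return one Pod per eligible node as a dict of node name -> Pod name.

    nodes: dict of node name -> labels. With no selector, every node is eligible.
    """
    selector = node_selector or {}
    return {
        node: f"{name}-{node}"  # hypothetical naming scheme
        for node, labels in nodes.items()
        if all(labels.get(k) == v for k, v in selector.items())
    }


nodes = {
    "node-1": {"disk": "ssd"},
    "node-2": {"disk": "hdd"},
    "node-3": {"disk": "ssd"},
}
everywhere = daemonset_pods("log-agent", nodes)
ssd_only = daemonset_pods("log-agent", nodes, {"disk": "ssd"})
print(everywhere)
print(ssd_only)
```

Adding a node to `nodes` and calling the function again yields a Pod for the new node as well, which mirrors how the set follows cluster membership.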

Job

cf. Job

Creates one or more Pods and ensures that a specified number of them terminate successfully
Tracks successful completion of Pods
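
The completion tracking can be sketched as a loop that keeps starting Pods until enough of them succeed, giving up after too many failures. The `run_job` function is hypothetical; `completions` and `backoff_limit` are modelled loosely on the Job settings of similar names, and the retry accounting here is an assumption.

```python
def run_job(task, completions, backoff_limit):
    """Run `task` repeatedly until `completions` runs succeed or failures exceed the limit.

    task: callable returning True on success and False on failure (one call = one Pod).
    """
    succeeded = failed = 0
    while succeeded < completions:
        if task():
            succeeded += 1
        else:
            failed += 1
            if failed > backoff_limit:
                return {"succeeded": succeeded, "failed": failed, "status": "Failed"}
    return {"succeeded": succeeded, "failed": failed, "status": "Complete"}


outcomes = iter([True, False, True, True])  # scripted results for the example
result = run_job(lambda: next(outcomes), completions=3, backoff_limit=2)
print(result)
```

In this run one Pod fails, a replacement is started, and the Job completes once three Pods have succeeded.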

Service

cf. Service

An abstract way to expose an application running on a set of Pods as a network service

Motivation for Using Service

Pods are ephemeral: they are created and destroyed, and once a Pod dies it is not resurrected
- When a Deployment runs the application, Pods are created and deleted dynamically
Each Pod has its own IP address, so the set of IP addresses behind an application changes over time and clients need a stable way to reach it

Service Resources

In Kubernetes, a Service defines a logical set of Pods and a policy for accessing them
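
The "logical set of Pods" is typically chosen by a label selector. A minimal sketch, assuming Pods are plain dictionaries with `ip`, `labels`, and `ready` keys (a hypothetical shape) and a made-up `select_endpoints` helper:

```python
def select_endpoints(selector, pods):
    """Return the IPs of ready Pods whose labels contain every selector pair."""
    return sorted(
        pod["ip"]
        for pod in pods
        if pod["ready"] and all(pod["labels"].get(k) == v for k, v in selector.items())
    )


selector = {"app": "web"}
pods = [
    {"ip": "10.0.0.4", "labels": {"app": "web", "tier": "frontend"}, "ready": True},
    {"ip": "10.0.0.5", "labels": {"app": "web"}, "ready": False},
    {"ip": "10.0.0.6", "labels": {"app": "db"}, "ready": True},
]
before = select_endpoints(selector, pods)

# A Pod is replaced and comes back with a different IP; the selector is unchanged.
pods[1] = {"ip": "10.0.0.9", "labels": {"app": "web"}, "ready": True}
after = select_endpoints(selector, pods)
print(before, after)
```

Clients keep addressing the Service while the set of backing IPs changes underneath it.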

Service Exposure (Service Types)

ClusterIP
- Exposes the Service on a cluster-internal IP
NodePort
- Exposes the Service on a static port on each Node's IP
LoadBalancer
- Uses a cloud provider's load balancer to expose the Service externally
ExternalName
- Maps the Service to the contents specified in the externalName field by returning a CNAME record

Configuration

cf. Configuration

Best Practices for Configuration

General configuration tips
- Use the latest stable API version
- Configuration files should be stored in a version control system
- Use YAML instead of JSON. The two formats are interchangeable in almost all scenarios, but YAML tends to be more user-friendly.
- Group related objects into a single file when meaningful
- Remember that many kubectl commands can be called on directories
- Avoid specifying default values unnecessarily. Simplicity and minimalism reduce errors.
- Add annotations to describe objects

ConfigMap

An API object used to store non-confidential data in key-value pairs.
- ConfigMap does not provide confidentiality or encryption. For sensitive data, use a Secret or additional third-party tools.
Pods can use ConfigMap as environment variables, command-line arguments, or configuration files within volumes
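
Two of those consumption styles can be sketched as simple projections of the key-value data: selected keys become environment variables, or every key becomes a file under a mount path. The `as_env` and `as_files` helpers and the example keys are hypothetical.

```python
def as_env(configmap, keys):
    """Project selected ConfigMap keys into environment variables."""
    return {key: configmap["data"][key] for key in keys}


def as_files(configmap, mount_path):
    """Project every ConfigMap key into a file path under mount_path."""
    return {f"{mount_path}/{key}": value for key, value in configmap["data"].items()}


configmap = {
    "metadata": {"name": "app-config"},
    "data": {"LOG_LEVEL": "debug", "app.properties": "mode=dev\n"},
}
env = as_env(configmap, ["LOG_LEVEL"])
files = as_files(configmap, "/etc/config")
print(env)
print(sorted(files))
```

Either way the container image stays the same; only the ConfigMap changes between environments.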

Secrets

Allows storage and management of sensitive information like passwords, OAuth tokens, and SSH keys
Safer and more flexible than putting the information verbatim in a Pod definition or a container image
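
One detail worth remembering: values in a Secret's `data` field are base64-encoded, which is an encoding and not encryption. The helper names below are hypothetical; the sketch only shows the round trip.

```python
import base64


def encode_secret_data(plain):
    """Base64-encode each value, as expected in a Secret's `data` field."""
    return {k: base64.b64encode(v.encode("utf-8")).decode("ascii") for k, v in plain.items()}


def decode_secret_data(data):
    """Reverse the encoding; anyone who can read the object can do this."""
    return {k: base64.b64decode(v).decode("utf-8") for k, v in data.items()}


data = encode_secret_data({"username": "admin"})
print(data)                      # {'username': 'YWRtaW4='}
print(decode_secret_data(data))  # {'username': 'admin'}
```

Because decoding needs no key, access control and encryption at rest still matter for Secrets.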

Security

cf. https://kubernetes.io/ja/docs/concepts/security/

Overview of Cloud Native Security

The 4Cs of Cloud Native Security

Security can be thought of in layers.
The 4Cs of Cloud Native
- Cloud
- Cluster
- Container
- Code

Infrastructure Security

Concerns related to Kubernetes infrastructure
- Network access to the API Server (control plane)
- Network access to Nodes
- Access to cloud provider APIs from Kubernetes
- Access to etcd
- Encryption of etcd

Components Within the Cluster (Applications)

Concerns related to workload security
- RBAC authorization (access to Kubernetes API)
- Authentication
- Secret management for applications (and encryption when stored in etcd)
- PodSecurityPolicy
- Quality of Service (and cluster resource management)
- NetworkPolicy
- TLS for Kubernetes Ingress

Containers

Concerns related to containers
- Vulnerability scanning and security of OS dependencies
- Image signing and enforcement
- Disallow privileged users

Code

Concerns related to code
- Access only via TLS
- Restrict communication port ranges
- Security of third-party dependencies
- Static code analysis
- Dynamic probing attacks

Impressions

The notes are quite abbreviated. It took a fair amount of time to read through the documentation...

References

While working through the Kubernetes documentation, I also consulted external materials that I found helpful.