DEV Community

James Lee

Kubernetes Resource Orchestration: How kubelet Prepares Storage, Network & Compute for Every Pod

In the previous article we covered how kube-scheduler selects the optimal node for a Pod through its three-phase pipeline (Filter → Score → Preemption). Once that decision is written to etcd, the baton passes to kubelet. This article covers what happens next: resource orchestration.


1. What Is Resource Orchestration?

Resource orchestration is the process by which a worker node's kubelet, upon receiving a scheduling result, organizes and prepares all the resources a workload needs before its containers can start.

These resources fall into three categories:

| Category | What gets prepared |
| --- | --- |
| Storage | Persistent volumes (claimed through PVCs) and ephemeral (temporary) storage |
| Network | One shared Linux network namespace for the Pod, joined by every container |
| Compute | CPU and memory allocation and management via cgroups |

2. Where Orchestration Fits in the Control Flow

Resource orchestration is step ⑤ of the resource creation pipeline. It begins the moment kubelet, which watches kube-apiserver rather than reading etcd directly, sees a Pod that has been bound to its node:

┌─────────────────────────────────────────────────────────────────────┐
│   kubectl / REST Request                                            │
│          │ ①                                                        │
│          ▼                                                          │
│   ┌──────────────────────────────────────────────────────────────┐  │
│   │                    kube-apiserver                            │  │
│   └──┬───────────────────────┬──────────────────────────────┬───┘  │
│    ② │                     ③ │                            ④ │      │
│      ▼                       ▼                              ▼      │
│    etcd          kube-controller-manager            kube-scheduler  │
│                                                    (binding → etcd) │
│                                                           │ ⑤       │
│                                                           ▼         │
│                                           ┌──────────────────────┐  │
│                                           │       kubelet        │  │
│                                           │  RESOURCE            │  │
│                                           │  ORCHESTRATION  ←    │  │
│                                           │        │ ⑥           │  │
│                                           │  ┌─────▼──────────┐  │  │
│                                           │  │ Pod [C]...[C]  │  │  │
│                                           │  └────────────────┘  │  │
│                                           └──────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Trigger: kubelet watches kube-apiserver for Pods whose spec.nodeName matches its own node; it never reads etcd directly. Once the scheduler's binding sets that field, the Pod appears in the kubelet's watch and orchestration begins.
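To make the trigger concrete, here is a minimal Python sketch of the filtering idea. PodEvent and pods_to_orchestrate are hypothetical names invented for this illustration, not kubelet APIs; the real kubelet gets the same effect by watching Pods through kube-apiserver with a field selector on spec.nodeName.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class PodEvent:
    """Hypothetical, simplified stand-in for a Pod watch event."""
    event_type: str   # "ADDED", "MODIFIED", or "DELETED"
    pod_name: str
    node_name: str    # empty string while the Pod is still unscheduled


def pods_to_orchestrate(events: Iterable[PodEvent], my_node: str) -> Iterator[str]:
    """Yield each Pod name the first time it shows up bound to `my_node`."""
    seen: set[str] = set()
    for ev in events:
        if ev.event_type == "DELETED":
            seen.discard(ev.pod_name)
            continue
        if ev.node_name == my_node and ev.pod_name not in seen:
            seen.add(ev.pod_name)
            yield ev.pod_name


events = [
    PodEvent("ADDED", "web-1", ""),            # created, not yet scheduled
    PodEvent("MODIFIED", "web-1", "node-d"),   # scheduler bound it to node-d
    PodEvent("ADDED", "db-1", "node-a"),       # bound to a different node
    PodEvent("MODIFIED", "web-1", "node-d"),   # later update, not a new binding
]
result = list(pods_to_orchestrate(events, "node-d"))
print(result)
```

Only the first event that carries the matching node name starts orchestration; later updates to the same Pod are handled by the regular sync loop rather than treated as new work.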


3. The Resource Orchestration Pipeline

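The three resource categories are presented here in sequence, each phase building on the previous one. Treat this as a conceptual ordering: in the kubelet's actual sync loop the Pod-level cgroup is created before volumes are mounted, and per-container limits are applied when the runtime creates each container.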

Scheduling result received by kubelet
          │
          ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 1: STORAGE                                           │
│                                                             │
│  ┌──────────────────────┐  ┌──────────────────────────┐    │
│  │  Persistent Storage  │  │   Ephemeral Storage      │    │
│  │  (PVC → PV mount)    │  │   (emptyDir, configMap,  │    │
│  │                      │  │    secret, etc.)          │    │
│  └──────────────────────┘  └──────────────────────────┘    │
└─────────────────────────────┬───────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 2: NETWORK                                           │
│                                                             │
│  Step A: Create shared Linux network stack for the Pod      │
│          (network namespace) + connect to host network      │
│                    │                                        │
│                    ▼                                        │
│  Step B: Join each container to the shared Pod network      │
│          namespace (no per-container network devices)       │
└─────────────────────────────┬───────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 3: COMPUTE                                           │
│                                                             │
│  ┌──────────────────────┐  ┌──────────────────────────┐    │
│  │  Memory Management   │  │    CPU Management        │    │
│  │  (cgroup limits,     │  │    (cgroup CPU shares,   │    │
│  │   OOM priority)      │  │     CPU pinning for      │    │
│  │                      │  │     guaranteed QoS)      │    │
│  └──────────────────────┘  └──────────────────────────┘    │
└─────────────────────────────┬───────────────────────────────┘
                              │
                              ▼
                All resources ready → containers start  ✅
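The "each phase depends on the previous one" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not kubelet code: run_phases and the phase callables are hypothetical, and each callable simply reports whether its resources are ready.

```python
from typing import Callable

# Each phase is a (name, prepare) pair; `prepare` is a hypothetical callable
# that returns True once that phase's resources are ready.
Phase = tuple[str, Callable[[], bool]]


def run_phases(phases: list[Phase]) -> list[str]:
    """Run phases in order and stop at the first one that is not ready.

    Returns the names of the phases that completed.
    """
    completed: list[str] = []
    for name, prepare in phases:
        if not prepare():
            break  # later phases never start until this one succeeds
        completed.append(name)
    return completed


ok = run_phases([
    ("storage", lambda: True),
    ("network", lambda: True),
    ("compute", lambda: True),
])
blocked = run_phases([
    ("storage", lambda: False),   # e.g. a volume that has not been mounted yet
    ("network", lambda: True),
    ("compute", lambda: True),
])
print(ok, blocked)
```

When storage is not ready, nothing downstream runs, which matches the familiar symptom of a Pod stuck in ContainerCreating while a volume fails to mount.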

4. Deep Dive: Storage Orchestration

Storage is prepared first because containers may need to mount volumes before their processes start.

Persistent Storage

Persistent storage survives Pod restarts and rescheduling. The orchestration steps are:

PersistentVolumeClaim (PVC) in Pod spec
      │
      ▼
Is the PVC already bound to a PersistentVolume (PV)?
      │
      ├── Not bound → PersistentVolume controller in
      │               kube-controller-manager binds PVC → PV;
      │               kubelet waits and retries
      │
      └── Bound → volume is attached to the node if the type needs it
                  (e.g. attach an EBS disk; usually done by the
                  attach/detach controller), then kubelet mounts it
                  (e.g. mount NFS)
                        │
                        ▼
                  Volume mounted into each container's filesystem
                  at the path specified in volumeMounts  ✅
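The decision flow above can be written as a small state check. The sketch below is a simplification with hypothetical names (Claim, next_storage_step); it assumes a volume type that needs an attach step, which is not true for every type (NFS, for example, is only mounted).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    """Hypothetical, pared-down view of a PersistentVolumeClaim."""
    name: str
    bound_volume: Optional[str] = None  # name of the bound PV, if any


def next_storage_step(claim: Claim, attached: set[str], mounted: set[str]) -> str:
    """Return the next action needed before the claim's volume is usable."""
    if claim.bound_volume is None:
        return "wait-for-binding"   # the PVC has not been bound to a PV yet
    if claim.bound_volume not in attached:
        return "attach-to-node"     # e.g. attach a cloud disk to this node
    if claim.bound_volume not in mounted:
        return "mount-into-pod"     # expose it at the volumeMounts path
    return "ready"


claim = Claim("data", bound_volume="pv-1")
print(next_storage_step(claim, attached=set(), mounted=set()))
```

Calling the function repeatedly as the attached and mounted sets fill in walks the claim through each step until it reports "ready".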

Common persistent volume types:

| Type | Use case |
| --- | --- |
| PersistentVolumeClaim | Dynamic provisioning (cloud disks, NFS, Ceph) |
| hostPath | Direct mount from node filesystem (dev/testing only) |
| nfs | Shared network filesystem across Pods |

Ephemeral Storage

Temporary storage exists only for the lifetime of the Pod:

| Type | Description |
| --- | --- |
| emptyDir | Empty directory created when the Pod starts, deleted when the Pod is removed from the node |
| configMap / secret | Configuration and credentials injected as files |
| Container writable layer | Each container's own writable overlay filesystem |
| downwardAPI | Exposes Pod metadata (labels, annotations) as files |
5. Deep Dive: Network Orchestration

Network setup is a two-step process, coordinated between kubelet, the container runtime, and the CNI (Container Network Interface) plugin.

Step A — Create the Pod Network Namespace

kubelet asks the container runtime (via CRI) to create the Pod sandbox
      │
      ▼
Runtime creates a new Linux network namespace for the Pod
(all containers in the Pod share this namespace)
and invokes the CNI plugin (e.g. Flannel, Calico, Cilium)
      │
      ▼
CNI plugin wires the namespace to the host, typically with a
virtual ethernet pair (veth pair):
  - One end: inside the Pod network namespace (eth0)
  - One end: on the host (a bridge or routed interface, depending on the plugin)
      │
      ▼
Pod gets an IP address from the cluster's Pod CIDR
Routing rules updated so other Pods can reach this IP  ✅
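As an illustration of the IP assignment step, here is a rough Python sketch of picking the next free address from a node's slice of the Pod CIDR. The function name, the example 10.244.1.0/24 range, and the convention of reserving the first host address for a gateway are all assumptions made for this example; real IPAM plugins keep their own state and may choose addresses differently.

```python
import ipaddress
from typing import Optional


def allocate_pod_ip(node_cidr: str, in_use: set[str]) -> Optional[str]:
    """Return the first free host address in `node_cidr`, or None if exhausted.

    Assumption: the first host address is reserved for the node-side gateway.
    """
    network = ipaddress.ip_network(node_cidr)
    hosts = list(network.hosts())
    candidates = hosts[1:]  # skip the assumed gateway address
    for addr in candidates:
        if str(addr) not in in_use:
            return str(addr)
    return None


pod_ip = allocate_pod_ip("10.244.1.0/24", in_use={"10.244.1.2", "10.244.1.3"})
print(pod_ip)
```

With .2 and .3 already taken, the sketch hands out 10.244.1.4; on a tiny range where every candidate is in use it returns None, the equivalent of the "no IP addresses available" failure a Pod can hit on a crowded node.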

Step B — Connect Each Container

For each container in the Pod:
      │
      ▼
Container runtime creates the container
with the Pod's existing network namespace
(NOT a new namespace — shared with all siblings)
      │
      ▼
Container sees the same eth0, same IP, same ports
as all other containers in the Pod  ✅

Key insight: All containers in a Pod share one IP address and one network namespace. They communicate with each other via localhost. This is why a Pod is the atomic unit of networking in Kubernetes, not the individual container.

kube-proxy's role

While kubelet handles Pod-level networking, kube-proxy handles Service-level networking:

kube-proxy watches Service and EndpointSlice objects via kube-apiserver
      │
      ▼
Maintains iptables / IPVS rules on every node
      │
      ▼
Traffic to Service ClusterIP → load-balanced to healthy Pod IPs  ✅
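One detail worth seeing in numbers: in iptables mode, rules are evaluated top to bottom, so giving every backend an equal share means each rule must match a growing fraction of the traffic that reaches it. The Python sketch below works through that arithmetic. The function names are hypothetical, and it is meant as an illustration of the idea rather than a description of kube-proxy's exact rule generation.

```python
from fractions import Fraction


def rule_probabilities(endpoint_count: int) -> list[Fraction]:
    """Per-rule match probability for a top-to-bottom rule chain.

    Rule i only sees traffic that earlier rules did not match, so it must
    match 1 / (number of endpoints still remaining) of what it sees.
    """
    return [Fraction(1, endpoint_count - i) for i in range(endpoint_count)]


def overall_shares(probabilities: list[Fraction]) -> list[Fraction]:
    """Fraction of total traffic that ends up at each endpoint."""
    shares = []
    remaining = Fraction(1)
    for p in probabilities:
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


probs = rule_probabilities(3)
shares = overall_shares(probs)
print(probs, shares)
```

For three endpoints the per-rule probabilities are 1/3, 1/2, and 1, and every endpoint still receives exactly one third of the traffic.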

6. Deep Dive: Compute Orchestration

Compute resources (CPU and memory) are managed via Linux cgroups, enforcing the requests and limits defined in the Pod spec.

Memory Management

resources:
  requests:
    memory: "256Mi"    # used for scheduling: the node's allocatable memory must cover the sum of requests
  limits:
    memory: "512Mi"    # hard cap: exceed this → OOMKilled

kubelet and the container runtime configure cgroups for the Pod and its containers
      │
      ├── memory.limit_in_bytes = limits.memory (512Mi)
      │   (cgroup v1 name; memory.max on cgroup v2)
      │   Container exceeds this → Linux OOM killer terminates it
      │
      └── requests.memory (256Mi) does not set a cgroup v1 limit
          It feeds scheduling, the QoS class, and oom_score_adj
          (which processes the OOM killer targets first under pressure)
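The cgroup file takes a plain byte count, so a quantity such as "512Mi" has to be converted first. Here is a minimal Python sketch of that conversion. memory_to_bytes is a hypothetical helper that handles only whole-number quantities with the common suffixes; the real Kubernetes quantity parser accepts more forms.

```python
# Binary (power-of-two) and decimal (power-of-ten) suffixes.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def memory_to_bytes(quantity: str) -> int:
    """Convert a quantity such as "512Mi" or "1G" to a number of bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


limit_bytes = memory_to_bytes("512Mi")
request_bytes = memory_to_bytes("256Mi")
print(limit_bytes, request_bytes)
```

Note the difference between "512Mi" (536870912 bytes) and "512M" (512000000 bytes); mixing the two up is an easy way to end up with a limit about 5% lower than intended.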

CPU Management

resources:
  requests:
    cpu: "500m"        # 0.5 CPU: used for scheduling and as the cgroup CPU weight
  limits:
    cpu: "1000m"       # 1.0 CPU: hard cap via CFS bandwidth control

kubelet and the container runtime configure cgroups for the Pod and its containers
      │
      ├── cpu.shares = requests.cpu × 1024 (500m → 512)
      │   (relative weight when CPU is contested)
      │
      └── cpu.cfs_quota_us = limits.cpu × cpu.cfs_period_us
          (1000m × 100000 µs default period → 100000 µs;
           container is throttled once it uses up its quota within a period)
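The arithmetic behind those two cgroup values is short enough to show directly. The Python sketch below uses hypothetical helper names, and its constants (1024 shares per CPU, a minimum of 2 shares, a 100 ms default period, a 1 ms minimum quota) are cgroup v1 conventions as I understand the kubelet to apply them; treat them as assumptions and check them against your kubelet version.

```python
from typing import Optional

SHARES_PER_CPU = 1024        # cgroup v1 weight for one full CPU
MIN_SHARES = 2               # smallest weight the kernel accepts
DEFAULT_PERIOD_US = 100_000  # 100 ms CFS period
MIN_QUOTA_US = 1_000         # assumed floor of 1 ms per period


def parse_millicores(cpu: str) -> int:
    """Convert "500m" or "0.5" to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(round(float(cpu) * 1000))


def cpu_shares(request_millicores: int) -> int:
    """Relative weight derived from requests.cpu."""
    return max(request_millicores * SHARES_PER_CPU // 1000, MIN_SHARES)


def cfs_quota_us(limit_millicores: int, period_us: int = DEFAULT_PERIOD_US) -> Optional[int]:
    """Runtime allowed per period derived from limits.cpu; None means no cap."""
    if limit_millicores == 0:
        return None
    return max(limit_millicores * period_us // 1000, MIN_QUOTA_US)


shares = cpu_shares(parse_millicores("500m"))
quota = cfs_quota_us(parse_millicores("1000m"))
print(shares, quota)
```

A request of 500m becomes 512 shares, and a limit of 1000m becomes a quota of 100000 µs per 100000 µs period, i.e. one full CPU's worth of runtime.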

QoS Classes

Kubernetes assigns a QoS class based on resource spec, which determines eviction priority under node pressure:

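| QoS Class | Condition | Eviction priority |
| --- | --- | --- |
| Guaranteed | Every container sets CPU and memory limits, and requests == limits | Last to be evicted |
| Burstable | At least one container sets a request or limit, but the Pod does not meet the Guaranteed criteria | Middle priority |
| BestEffort | No container sets any requests or limits | First to be evicted |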
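The classification rules can be captured in one small function. The following Python sketch is a simplification: qos_class is a hypothetical helper, containers are plain dicts, and quantities are compared as strings, so "500m" and "0.5" would wrongly count as different here even though Kubernetes treats them as equal.

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a Pod from its containers' resources sections.

    Each container is a dict with optional "requests" and "limits" dicts,
    e.g. {"requests": {"cpu": "500m"}, "limits": {"cpu": "500m"}}.
    """
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


full = {"requests": {"cpu": "500m", "memory": "256Mi"},
        "limits": {"cpu": "500m", "memory": "256Mi"}}
print(qos_class([full]), qos_class([{"requests": {"memory": "256Mi"}}]), qos_class([{}]))
```

One consequence worth noticing: a single container without limits is enough to drop an otherwise Guaranteed Pod to Burstable, because the rule applies to every container in the Pod.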

7. The Complete Orchestration Flow

kubelet sees a new Pod bound to its node (via kube-apiserver)
      │
      ▼
① Admit Pod (check node-level admission, resource availability)
      │
      ▼
② Pull container images (if not cached locally)
      │
      ▼
③ Prepare STORAGE
   ├── Attach/mount persistent volumes (PVC → PV)
   └── Create ephemeral volumes (emptyDir, secrets, configMaps)
      │
      ▼
④ Prepare NETWORK (via CNI plugin)
   ├── Create Pod network namespace
   ├── Assign Pod IP from cluster CIDR
   └── Connect containers to shared network namespace
      │
      ▼
⑤ Prepare COMPUTE (via cgroups)
   ├── Set memory limits (requests feed QoS and OOM scoring)
   └── Set CPU shares and CFS quota
      │
      ▼
⑥ Start containers via container runtime (containerd/CRI-O)
      │
      ▼
⑦ Monitor Pod health (liveness/readiness probes)
   Report status back to kube-apiserver  ✅

8. Summary

| Phase | Managed by | Key mechanism | Purpose |
| --- | --- | --- | --- |
| Storage | kubelet + volume plugins | PVC/PV binding, mount | Provide data persistence and config injection |
| Network | kubelet + CNI plugin | Linux netns, veth pairs, iptables | Give Pod a unique IP, enable cluster-wide connectivity |
| Compute | kubelet + cgroups | cpu.shares, memory.limit_in_bytes | Enforce resource requests/limits, determine QoS class |

Resource orchestration is the bridge between the scheduler's abstract decision ("run this Pod on Node D") and the concrete reality of a running container. It's where Kubernetes turns YAML into a live process — with its own IP address, mounted volumes, and guaranteed CPU and memory.


Next in this series: Kubernetes Data Access Flow: How Pods Read and Write Persistent Storage (Part 6)


Follow the series for more deep dives into Kubernetes internals.