<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pendela BhargavaSai</title>
    <description>The latest articles on DEV Community by Pendela BhargavaSai (@pendelabhargavasai).</description>
    <link>https://dev.to/pendelabhargavasai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F862755%2Fdccb0f1a-a7eb-46c5-a5c7-c0d4514eaae6.png</url>
      <title>DEV Community: Pendela BhargavaSai</title>
      <link>https://dev.to/pendelabhargavasai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pendelabhargavasai"/>
    <language>en</language>
    <item>
      <title>K3s vs Kubernetes: A Deep Dive into Control Plane Architecture</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Tue, 09 Jun 2026 03:30:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/k3s-vs-kubernetes-a-deep-dive-into-control-plane-architecture-489k</link>
      <guid>https://dev.to/pendelabhargavasai/k3s-vs-kubernetes-a-deep-dive-into-control-plane-architecture-489k</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Not just "what's different" — but WHY it's different, HOW each component works under the hood, and WHEN to choose which.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧠 Why This Post Exists
&lt;/h2&gt;

&lt;p&gt;Every "K3s vs K8s" article you've read probably gave you a table with checkmarks and said "K3s is lightweight." That's true — but &lt;em&gt;why&lt;/em&gt; is it lightweight? What did Rancher actually strip out, merge, or replace? What are the architectural trade-offs you inherit when you deploy K3s in production?&lt;/p&gt;

&lt;p&gt;This post tears open both control planes component by component. We'll go deep into what each piece actually &lt;em&gt;does&lt;/em&gt; at the byte level, then see how K3s reimagines it.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏗️ The Kubernetes Control Plane: A Ground-Up Look
&lt;/h2&gt;

&lt;p&gt;Before comparing, let's build a mental model of each standard Kubernetes control plane component. Not the 30-second version — the real one.&lt;/p&gt;




&lt;h3&gt;
  
  
  1. 🔵 kube-apiserver — The Brain's Frontal Lobe
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What It Actually Does
&lt;/h4&gt;

&lt;p&gt;The API server is not just a REST endpoint. It is the &lt;strong&gt;only component in Kubernetes that talks directly to etcd&lt;/strong&gt;. Every other component — scheduler, controller-manager, kubelet — communicates exclusively through the API server. This is a deliberate architectural decision called the &lt;strong&gt;hub-and-spoke pattern&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you run &lt;code&gt;kubectl apply -f deployment.yaml&lt;/code&gt;, here's what actually happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl → HTTPS → kube-apiserver
                    │
                    ├── 1. Authentication (Who are you?)
                    │       └── x509 certs / Bearer tokens / OIDC / Webhook
                    │
                    ├── 2. Authorization (Can you do this?)
                    │       └── RBAC / ABAC / Node / Webhook evaluators
                    │
                    ├── 3. Admission Control (Should this be allowed?)
                    │       ├── Mutating Webhooks  ← can MODIFY the object
                    │       └── Validating Webhooks ← can REJECT the object
                    │
                    ├── 4. Schema Validation
                    │       └── OpenAPI v3 schema enforcement per GVK
                    │
                    └── 5. Persist to etcd
                            └── /registry/deployments/default/my-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
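&lt;p&gt;To make the pipeline concrete, here is a minimal sketch of the stages as plain Python functions. Everything in it (the function names, the token table, the policy rule) is hypothetical and not a real Kubernetes API. One assumption worth calling out: the sketch runs schema validation between the mutating and validating admission steps, which is the order the upstream admission documentation describes.&lt;/p&gt;

```python
# Hypothetical, simplified model of the API server request pipeline.
# None of these names are real Kubernetes APIs.

class Rejected(Exception):
    """Raised when any stage refuses the request."""


def authenticate(request):
    # Authentication: map a credential to a user (stand-in for certs/tokens/OIDC).
    users = {"token-alice": "alice"}
    user = users.get(request["token"])
    if user is None:
        raise Rejected("401 Unauthorized")
    return user


def authorize(user, verb, resource):
    # Authorization: RBAC-like lookup of (user, verb, resource).
    allowed = {("alice", "create", "deployments")}
    if (user, verb, resource) not in allowed:
        raise Rejected("403 Forbidden")


def mutate(obj):
    # Mutating admission: may modify the object (here: inject a label).
    obj = dict(obj)
    obj["labels"] = {**obj.get("labels", {}), "injected": "true"}
    return obj


def validate_schema(obj):
    # Schema validation: reject objects missing required fields.
    for field in ("kind", "name", "namespace"):
        if field not in obj:
            raise Rejected("422 missing field: " + field)


def validate_policy(obj):
    # Validating admission: can only accept or reject, never modify.
    if obj.get("replicas", 1) > 100:
        raise Rejected("403 denied by policy: too many replicas")


def handle(request, store):
    user = authenticate(request)
    authorize(user, request["verb"], request["resource"])
    obj = mutate(request["object"])
    validate_schema(obj)
    validate_policy(obj)
    key = "/registry/{}/{}/{}".format(request["resource"], obj["namespace"], obj["name"])
    store[key] = obj  # Persist: a dict stands in for etcd
    return key


store = {}
request = {
    "token": "token-alice",
    "verb": "create",
    "resource": "deployments",
    "object": {"kind": "Deployment", "name": "my-app", "namespace": "default"},
}
key = handle(request, store)
print(key, store[key]["labels"])
```

&lt;p&gt;Each stage either passes the request along or raises, so a failure at any point means nothing is written to the store.&lt;/p&gt;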



&lt;h4&gt;
  
  
  The Watch Mechanism — The Heartbeat of Kubernetes
&lt;/h4&gt;

&lt;p&gt;The API server implements a &lt;strong&gt;streaming watch mechanism&lt;/strong&gt; over HTTP: a watch request stays open and the server pushes change events down the connection as they happen. This is what makes Kubernetes reactive rather than polling-based.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# You can see this yourself&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;--watch&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;9
&lt;span class="c"&gt;# Watch the raw HTTP stream — it's a chunked HTTP response that stays open&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every controller, scheduler, and kubelet maintains a persistent &lt;strong&gt;informer&lt;/strong&gt; — a cached watch stream from the API server. The informer pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Does an initial &lt;code&gt;LIST&lt;/code&gt; to populate local cache&lt;/li&gt;
&lt;li&gt;Starts a &lt;code&gt;WATCH&lt;/code&gt; from the resource version of that LIST&lt;/li&gt;
&lt;li&gt;On disconnect, re-watches from the last known &lt;code&gt;resourceVersion&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The API server buffers events in a &lt;strong&gt;watchCache&lt;/strong&gt; in memory (configurable with &lt;code&gt;--watch-cache-sizes&lt;/code&gt;)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌─────────────────────────┐
                    │      kube-apiserver     │
                    │                         │
                    │  ┌─────────────────┐    │
                    │  │   etcd watch    │    │
                    │  └────────┬────────┘    │
                    │           │             │
                    │  ┌────────▼────────┐    │
                    │  │   watchCache    │    │  ← In-memory ring buffer
                    │  └────────┬────────┘    │
                    │           │             │
                    └───────────┼─────────────┘
                                │
              ┌─────────────────┼──────────────────┐
              │                 │                  │
         ┌────▼────┐      ┌─────▼─────┐     ┌─────▼─────┐
         │Scheduler│      │Controller │     │  kubelet  │
         │Informer │      │  Informer │     │  Informer │
         └─────────┘      └───────────┘     └───────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
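&lt;p&gt;The LIST-then-WATCH handshake is easier to see in code. This is a toy model, not client-go: &lt;code&gt;FakeAPIServer&lt;/code&gt; and &lt;code&gt;Informer&lt;/code&gt; are hypothetical names, the "watch" is a plain function call instead of an open stream, and the server keeps its full event log rather than a bounded ring buffer.&lt;/p&gt;

```python
# Toy model of LIST + WATCH with resourceVersion.
# FakeAPIServer and Informer are hypothetical names, not client-go APIs.

class FakeAPIServer:
    def __init__(self):
        self.objects = {}           # name -> value (current state)
        self.events = []            # (resource_version, type, name, value)
        self.resource_version = 0

    def write(self, name, value):
        self.resource_version += 1
        event_type = "MODIFIED" if name in self.objects else "ADDED"
        self.objects[name] = value
        self.events.append((self.resource_version, event_type, name, value))

    def list(self):
        # Snapshot of current state plus the version it is valid at.
        return dict(self.objects), self.resource_version

    def watch(self, since):
        # Every event newer than `since`, in order.
        return [e for e in self.events if e[0] > since]


class Informer:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.last_seen = 0

    def start(self):
        # Initial LIST populates the local cache.
        self.cache, self.last_seen = self.server.list()

    def sync(self):
        # WATCH from the last known resourceVersion; the same call also
        # resumes correctly after a disconnect.
        for version, _type, name, value in self.server.watch(self.last_seen):
            self.cache[name] = value
            self.last_seen = version


server = FakeAPIServer()
server.write("pod-a", "Pending")
informer = Informer(server)
informer.start()
server.write("pod-a", "Running")
server.write("pod-b", "Pending")
informer.sync()
print(informer.cache, informer.last_seen)
```

&lt;p&gt;Because the informer remembers the last version it processed, it never needs to re-LIST as long as the server still holds events newer than that version.&lt;/p&gt;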



&lt;h4&gt;
  
  
  Aggregation Layer &amp;amp; CRDs
&lt;/h4&gt;

&lt;p&gt;The API server can extend itself via two mechanisms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CRDs (Custom Resource Definitions)&lt;/strong&gt;: The schema is stored in etcd, and the resource is served natively by the API server itself&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aggregation Layer (AA)&lt;/strong&gt;: Proxies traffic to an &lt;em&gt;external&lt;/em&gt; API server (used by metrics-server, KEDA, etc.)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# CRD — API server owns the storage&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apiextensions.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CustomResourceDefinition&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;widgets.example.com&lt;/span&gt;

&lt;span class="c1"&gt;# AA — API server proxies to external server&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apiregistration.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;APIService&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1beta1.metrics.k8s.io&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;metrics-server&lt;/span&gt;
    &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Production Tuning Knobs
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# --max-requests-inflight:          max concurrent non-mutating requests&lt;/span&gt;
&lt;span class="c"&gt;# --max-mutating-requests-inflight: max concurrent mutating requests&lt;/span&gt;
&lt;span class="c"&gt;# --watch-cache-sizes:              per-resource watch cache sizes&lt;/span&gt;
kube-apiserver &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-requests-inflight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;400 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-mutating-requests-inflight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;200 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--watch-cache-sizes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;pods#1000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-admission-plugins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;NodeRestriction,PodSecurity &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--audit-log-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/log/audit.log &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--audit-policy-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/k8s/audit-policy.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2. 🟣 etcd — The Distributed Brain's Memory
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What etcd Actually Is
&lt;/h4&gt;

&lt;p&gt;etcd is a &lt;strong&gt;distributed key-value store&lt;/strong&gt; built on the &lt;strong&gt;Raft consensus algorithm&lt;/strong&gt;. It's not a database in the traditional sense — it's a fault-tolerant state machine where every write must be agreed upon by a quorum of nodes before it's committed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   etcd-0    │     │   etcd-1    │     │   etcd-2    │
│  (LEADER)   │◄────│  (FOLLOWER) │     │  (FOLLOWER) │
│             │────►│             │     │             │
└──────┬──────┘     └─────────────┘     └──────▲──────┘
       │                                       │
       └───────────────────────────────────────┘
                    Raft Heartbeats
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Raft in Plain English
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Leader Election&lt;/strong&gt;: One node becomes leader and sends periodic heartbeats. If a follower hears no heartbeat within its election timeout, it becomes a candidate and calls an election; it wins if a majority votes for it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log Replication&lt;/strong&gt;: Every write goes to the leader. Leader appends it to its log and replicates it to followers. Once a majority acknowledges, the write is &lt;strong&gt;committed&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quorum Math&lt;/strong&gt;: &lt;code&gt;floor(n/2) + 1&lt;/code&gt; nodes must agree. For 3 nodes: 2. For 5 nodes: 3.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;etcd write path:
Client → Leader APPENDS entry to its log
         Leader SENDS AppendEntries RPC to all followers
         Followers ACKNOWLEDGE
         Leader COMMITS once a majority has acknowledged
         Leader RESPONDS to client
         Leader NOTIFIES followers of the commit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
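&lt;p&gt;The quorum arithmetic is worth running for a few cluster sizes, because it explains why etcd clusters are sized with odd numbers. A small sketch (the function names are just for illustration):&lt;/p&gt;

```python
def quorum(n):
    # Majority of n voting members: floor(n / 2) + 1.
    return n // 2 + 1


def fault_tolerance(n):
    # Members that can fail while a quorum is still reachable.
    return n - quorum(n)


for n in (1, 2, 3, 4, 5):
    print("members:", n, "quorum:", quorum(n), "tolerates:", fault_tolerance(n))
```

&lt;p&gt;Going from 3 to 4 members raises the quorum from 2 to 3 but still tolerates only one failure, so the extra node adds replication cost without adding resilience.&lt;/p&gt;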



&lt;h4&gt;
  
  
  How Kubernetes Data Lives in etcd
&lt;/h4&gt;

&lt;p&gt;All Kubernetes objects are stored under &lt;code&gt;/registry/&lt;/code&gt; with the structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/registry/{resource-type}/{namespace}/{name}

Examples:
/registry/pods/default/nginx-7d8b9f-xyz
/registry/deployments/kube-system/coredns
/registry/secrets/default/my-secret
/registry/events/default/pod-scheduled-event
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The data is serialized using &lt;strong&gt;protobuf&lt;/strong&gt; (not JSON!) for efficiency. You can inspect it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Decode an etcd value&lt;/span&gt;
etcdctl get /registry/pods/default/nginx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--endpoints&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://127.0.0.1:2379 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cacert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/ca.crt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/server.crt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/server.key &lt;span class="se"&gt;\&lt;/span&gt;
  | auger decode  &lt;span class="c"&gt;# github.com/jpbetz/auger&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  MVCC — Multi-Version Concurrency Control
&lt;/h4&gt;

&lt;p&gt;etcd uses MVCC, meaning it keeps &lt;strong&gt;multiple historical versions&lt;/strong&gt; of every key. Each write increments a global &lt;code&gt;revision&lt;/code&gt; counter. The API server exposes this revision as each object's &lt;code&gt;resourceVersion&lt;/code&gt; and uses it for watch ordering and conflict detection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# See the revision&lt;/span&gt;
etcdctl get /registry/pods/default/nginx &lt;span class="nt"&gt;-w&lt;/span&gt; json | jq .header.revision
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



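&lt;p&gt;etcd does not keep this history forever. Older revisions are periodically &lt;strong&gt;compacted&lt;/strong&gt; (deleted); by default kube-apiserver requests a compaction every 5 minutes (&lt;code&gt;--etcd-compaction-interval&lt;/code&gt;). This is why very old watches can fail with "compacted" errors. Separately, etcd enforces a backend storage quota (2GB by default, &lt;code&gt;--quota-backend-bytes&lt;/code&gt;); once it is exceeded, etcd raises an alarm and rejects further writes until space is reclaimed.&lt;/p&gt;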
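&lt;p&gt;Revisions and compaction can be modeled in a few lines. This is a toy in-memory store under simplifying assumptions (a flat list instead of etcd's on-disk index, and hypothetical class and method names); it only aims to show why a watch that starts from a compacted revision has to fail.&lt;/p&gt;

```python
class CompactedError(Exception):
    pass


class MVCCStore:
    def __init__(self):
        self.revision = 0       # global counter, bumped on every write
        self.history = []       # (revision, key, value), append-only
        self.compacted = 0      # history older than this revision is gone

    def put(self, key, value):
        self.revision += 1
        self.history.append((self.revision, key, value))
        return self.revision

    def get(self, key, at=None):
        # Latest value of `key` as of revision `at` (default: now).
        at = self.revision if at is None else at
        if self.compacted > at:
            raise CompactedError("required revision has been compacted")
        value = None
        for rev, k, v in self.history:
            if k == key and at >= rev:
                value = v
        return value

    def watch(self, since):
        # Events with a revision greater than `since`.
        if self.compacted > since:
            raise CompactedError("required revision has been compacted")
        return [e for e in self.history if e[0] > since]

    def compact(self, revision):
        # Keep the newest entry per key at or below `revision`, plus everything newer.
        newest = {}
        for rev, key, _value in self.history:
            if revision >= rev:
                newest[key] = rev
        self.history = [
            e for e in self.history
            if e[0] > revision or newest.get(e[1]) == e[0]
        ]
        self.compacted = revision


store = MVCCStore()
store.put("/registry/pods/default/nginx", "v1")   # revision 1
store.put("/registry/pods/default/nginx", "v2")   # revision 2
store.put("/registry/pods/default/redis", "v1")   # revision 3
old_value = store.get("/registry/pods/default/nginx", at=1)
store.compact(2)
print(old_value, store.get("/registry/pods/default/nginx"), store.watch(2))
```

&lt;p&gt;After &lt;code&gt;compact(2)&lt;/code&gt;, the current value is still readable, but a client asking for history from revision 1 gets a &lt;code&gt;CompactedError&lt;/code&gt; and has to re-list.&lt;/p&gt;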

&lt;h4&gt;
  
  
  etcd Failure Modes You Must Know
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;What Happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 node fails (3-node cluster)&lt;/td&gt;
&lt;td&gt;Cluster continues. Writes still work.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 nodes fail (3-node cluster)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;CLUSTER STOPS ACCEPTING WRITES&lt;/strong&gt;. API server returns 503.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Leader fails&lt;/td&gt;
&lt;td&gt;Election happens. Writes stall briefly while a new leader is elected, typically on the order of the election timeout (1s by default in etcd).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network partition&lt;/td&gt;
&lt;td&gt;Minority partition cannot commit writes and serves at most stale reads. Majority continues.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;etcd OOM&lt;/td&gt;
&lt;td&gt;API server loses state store. Catastrophic.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;This is the critical difference with K3s.&lt;/strong&gt; If you're running K3s with embedded SQLite, you get &lt;em&gt;zero&lt;/em&gt; HA for the datastore by default.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  3. 🟡 kube-scheduler — The CPU-Time Auctioneer
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What It Actually Does
&lt;/h4&gt;

&lt;p&gt;The scheduler watches for &lt;strong&gt;Pods in Pending state&lt;/strong&gt; (no &lt;code&gt;nodeName&lt;/code&gt; assigned) and decides which Node they should run on. It does NOT place the pod — it simply writes the chosen &lt;code&gt;nodeName&lt;/code&gt; to the Pod spec in etcd via the API server. The kubelet on that node then sees its name and starts the pod.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod created (nodeName: "")  →  Scheduler sees it via watch
                            →  Runs filtering + scoring
                            →  Writes nodeName to Pod
                            →  kubelet on that node sees the Pod
                            →  kubelet pulls image + starts container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  The Scheduling Framework — Two-Phase Deep Dive
&lt;/h4&gt;

&lt;p&gt;Scheduling happens in two phases: &lt;strong&gt;Filtering&lt;/strong&gt; and &lt;strong&gt;Scoring&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1: Filtering (Hard Constraints — binary pass/fail)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;All Nodes
    │
    ▼
┌─────────────────────────────────────────────────────┐
│  Filter Plugins (run in parallel, any fail = remove) │
│                                                     │
│  • NodeUnschedulable  — node.spec.unschedulable?    │
│  • NodeAffinity       — matchLabels on node?        │
│  • TaintToleration    — pod tolerates node taints?  │
│  • PodTopologySpread  — spread constraints met?     │
│  • VolumeBinding      — PVC can bind to this node?  │
│  • NodeResourcesFit   — enough CPU/mem/GPU?         │
│  • NodePorts          — hostPort conflicts?         │
└─────────────────────────────────────────────────────┘
    │
    ▼
Feasible Nodes (subset)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Phase 2: Scoring (Soft Preferences — 0-100 score)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Feasible Nodes
    │
    ▼
┌─────────────────────────────────────────────────────┐
│  Score Plugins (weighted sum)                        │
│                                                     │
│  • LeastAllocated       — prefer less loaded nodes  │
│  • NodeAffinity         — preferred affinities      │
│  • InterPodAffinity     — co-locate or spread pods  │
│  • ImageLocality        — prefer nodes with image   │
│  • TaintToleration      — fewer preferred taints    │
│  • TopologySpreadConstraint — balance spread        │
└─────────────────────────────────────────────────────┘
    │
    ▼
Highest Score Node → Binding (nodeName written)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
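&lt;p&gt;Here is a rough sketch of the two phases as code. The plugin functions, the node and pod dictionaries, and the equal weights are all made up for illustration; the real scheduler framework has many more extension points and a different scoring formula.&lt;/p&gt;

```python
# Hypothetical two-phase scheduler: filter on hard constraints, then score.

def schedulable(pod, node):
    return not node["unschedulable"]


def tolerates_taints(pod, node):
    return set(node["taints"]).issubset(pod["tolerations"])


def fits_resources(pod, node):
    return node["free_cpu"] >= pod["cpu"] and node["free_mem"] >= pod["mem"]


FILTERS = [schedulable, tolerates_taints, fits_resources]


def least_allocated(pod, node):
    # 0-100: more free CPU left after placement scores higher.
    return 100 * (node["free_cpu"] - pod["cpu"]) // node["total_cpu"]


def image_locality(pod, node):
    return 100 if pod["image"] in node["images"] else 0


SCORERS = [(least_allocated, 1), (image_locality, 1)]  # (plugin, weight)


def schedule(pod, nodes):
    # Phase 1: any failing filter removes the node.
    feasible = [n for n in nodes if all(f(pod, n) for f in FILTERS)]
    if not feasible:
        return None  # the real scheduler would try preemption here

    # Phase 2: weighted sum of plugin scores, highest wins.
    def total(node):
        return sum(weight * plugin(pod, node) for plugin, weight in SCORERS)

    return max(feasible, key=total)["name"]


nodes = [
    {"name": "node-a", "free_cpu": 2, "total_cpu": 4, "free_mem": 4,
     "taints": [], "unschedulable": False, "images": ["nginx"]},
    {"name": "node-b", "free_cpu": 3, "total_cpu": 4, "free_mem": 4,
     "taints": [], "unschedulable": False, "images": []},
    {"name": "node-c", "free_cpu": 4, "total_cpu": 4, "free_mem": 4,
     "taints": ["gpu"], "unschedulable": False, "images": ["nginx"]},
]
pod = {"cpu": 1, "mem": 1, "tolerations": set(), "image": "nginx"}
chosen = schedule(pod, nodes)
print(chosen)
```

&lt;p&gt;In this example &lt;code&gt;node-c&lt;/code&gt; has the most free CPU but never reaches scoring because the pod does not tolerate its taint. &lt;code&gt;node-a&lt;/code&gt; then beats &lt;code&gt;node-b&lt;/code&gt; because the cached image outweighs the slightly lower free capacity.&lt;/p&gt;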



&lt;h4&gt;
  
  
  Preemption — What Happens When No Node Passes Filtering
&lt;/h4&gt;

&lt;p&gt;If no node can fit the Pod, the scheduler checks if &lt;strong&gt;lower priority pods&lt;/strong&gt; can be evicted to make room:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find nodes where evicting lower-priority pods creates enough room&lt;/li&gt;
&lt;li&gt;Pick the node that requires evicting the fewest/lowest-priority pods&lt;/li&gt;
&lt;li&gt;Send eviction requests → evicted pods are deleted → pending pod is scheduled
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Priority classes matter here&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scheduling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PriorityClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;high-priority&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1000000&lt;/span&gt;
&lt;span class="na"&gt;globalDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="c1"&gt;# system-node-critical pods have value: 2000001000&lt;/span&gt;
&lt;span class="c1"&gt;# They will preempt your workloads if nodes are tight&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
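&lt;p&gt;The victim selection steps can be sketched as follows. This is a deliberately simplified model: it only looks at CPU and at the number of victims, while the real scheduler also weighs things like PodDisruptionBudgets and victim priorities. All names are hypothetical.&lt;/p&gt;

```python
def victims_for(pod, node):
    # Evict lowest-priority pods first until the pending pod fits.
    # Returns None if even evicting every lower-priority pod is not enough.
    free = node["free_cpu"]
    victims = []
    candidates = sorted(
        (p for p in node["pods"] if pod["priority"] > p["priority"]),
        key=lambda p: p["priority"],
    )
    for candidate in candidates:
        if free >= pod["cpu"]:
            break
        victims.append(candidate["name"])
        free += candidate["cpu"]
    return victims if free >= pod["cpu"] else None


def pick_preemption_target(pod, nodes):
    options = []
    for node in nodes:
        victims = victims_for(pod, node)
        if victims is not None:
            options.append((len(victims), node["name"], victims))
    if not options:
        return None
    _count, name, victims = min(options)  # fewest evictions wins
    return name, victims


nodes = [
    {"name": "node-a", "free_cpu": 0, "pods": [
        {"name": "low-1", "priority": 10, "cpu": 1},
        {"name": "low-2", "priority": 10, "cpu": 1},
    ]},
    {"name": "node-b", "free_cpu": 0, "pods": [
        {"name": "batch", "priority": 5, "cpu": 2},
        {"name": "critical", "priority": 2000, "cpu": 2},
    ]},
]
pending = {"name": "web", "priority": 1000, "cpu": 2}
target = pick_preemption_target(pending, nodes)
print(target)
```

&lt;p&gt;Both nodes could make room, but &lt;code&gt;node-b&lt;/code&gt; needs only one eviction, and the higher-priority &lt;code&gt;critical&lt;/code&gt; pod is never considered as a victim.&lt;/p&gt;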



&lt;h4&gt;
  
  
  The Binding Cache — Optimistic Concurrency
&lt;/h4&gt;

&lt;p&gt;The scheduler maintains an &lt;strong&gt;assumed pod cache&lt;/strong&gt;. After scoring but &lt;em&gt;before&lt;/em&gt; the API server confirms the bind, the scheduler optimistically assumes the pod is placed and accounts for that node's capacity. This prevents scheduling thrash in high-throughput clusters.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. 🟢 kube-controller-manager — The Reconciliation Engine
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What It Actually Is
&lt;/h4&gt;

&lt;p&gt;The controller manager is a &lt;strong&gt;single binary that runs 30+ independent control loops&lt;/strong&gt; as goroutines. Each controller watches specific resource types and reconciles desired state vs actual state.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# The reconciliation loop in pseudocode (every controller)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    desired :&lt;span class="o"&gt;=&lt;/span&gt; get_desired_state_from_api_server&lt;span class="o"&gt;()&lt;/span&gt;
    actual  :&lt;span class="o"&gt;=&lt;/span&gt; get_actual_state_from_world&lt;span class="o"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;desired &lt;span class="o"&gt;!=&lt;/span&gt; actual &lt;span class="o"&gt;{&lt;/span&gt;
        take_action_to_make_actual_match_desired&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nb"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;resync_period&lt;span class="o"&gt;)&lt;/span&gt;  // default: 10min
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Key Controllers and What They Actually Do
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;ReplicaSet Controller&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Watches: ReplicaSets, Pods
Loop:
  current_pods = list pods with matching selector
  delta = replicaset.spec.replicas - len(current_pods)
  if delta &amp;gt; 0: create `delta` pods
  if delta &amp;lt; 0: delete abs(delta) pods (by priority: unscheduled first)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
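&lt;p&gt;The ReplicaSet loop above translates almost directly into code. A minimal sketch, assuming pods are plain dictionaries and that "unscheduled first" is the only deletion rule (the real controller ranks pods on several more criteria):&lt;/p&gt;

```python
def reconcile_replicaset(desired_replicas, pods):
    """Return (number_to_create, names_to_delete) for one reconcile pass."""
    delta = desired_replicas - len(pods)
    if delta > 0:
        return delta, []
    # Too many (or exactly enough) pods: delete unscheduled pods before scheduled ones.
    ordered = sorted(pods, key=lambda p: p["node"] is not None)
    return 0, [p["name"] for p in ordered[:-delta]]


pods = [
    {"name": "web-a", "node": "node-1"},
    {"name": "web-b", "node": None},      # still Pending, no node assigned
    {"name": "web-c", "node": "node-2"},
]
scale_up = reconcile_replicaset(5, pods)
scale_down = reconcile_replicaset(1, pods)
print(scale_up, scale_down)
```

&lt;p&gt;The function is stateless: it looks only at the current pods and the desired count, so running it twice on an already converged ReplicaSet does nothing.&lt;/p&gt;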



&lt;p&gt;&lt;strong&gt;Deployment Controller&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Watches: Deployments, ReplicaSets
Loop:
  desired_rs = compute_hash(deployment.spec.template)
  if no RS with that hash: create new RS
  scale up new RS, scale down old RS (by strategy: RollingUpdate or Recreate)
  update deployment.status (readyReplicas, conditions, etc.)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Node Controller&lt;/strong&gt; — This one is critical to understand&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Watches: Nodes
Loop:
  for each node:
    if no heartbeat for node-monitor-grace-period (default 40s):
      set NodeReady=Unknown
      taint node with node.kubernetes.io/unreachable:NoExecute
    pods tolerate that taint for tolerationSeconds (default 300s = 5min)
      (after that, the taint manager evicts them)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;EndpointSlice Controller&lt;/strong&gt; — How Services actually work&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Watches: Services, Pods
Loop:
  for each service:
    pods = list pods matching service.spec.selector where pod.status.ready=true
    build EndpointSlices (groups of 100 endpoints each)
    write EndpointSlices to API server
    (kube-proxy watches EndpointSlices and updates iptables/ipvs rules)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
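&lt;p&gt;The slicing step is simple enough to show directly. A small sketch with hypothetical dictionaries standing in for Service and Pod objects, using the 100-endpoint group size mentioned above:&lt;/p&gt;

```python
def build_endpoint_slices(service, pods, max_per_slice=100):
    # Keep only ready pods whose labels match every selector entry.
    ready_ips = [
        p["ip"]
        for p in pods
        if p["ready"]
        and all(p["labels"].get(k) == v for k, v in service["selector"].items())
    ]
    # Chunk the endpoints into groups of at most max_per_slice.
    return [
        ready_ips[i:i + max_per_slice]
        for i in range(0, len(ready_ips), max_per_slice)
    ]


service = {"selector": {"app": "web"}}
pods = [
    {"ip": "10.244.0.{}".format(i), "ready": True, "labels": {"app": "web"}}
    for i in range(250)
]
pods.append({"ip": "10.244.9.9", "ready": False, "labels": {"app": "web"}})
pods.append({"ip": "10.244.9.10", "ready": True, "labels": {"app": "db"}})
slices = build_endpoint_slices(service, pods)
print([len(s) for s in slices])
```

&lt;p&gt;Splitting endpoints this way means a single pod change rewrites one small slice instead of one huge object, which keeps the updates that kube-proxy has to process small.&lt;/p&gt;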



&lt;h4&gt;
  
  
  Informer + WorkQueue Architecture
&lt;/h4&gt;

&lt;p&gt;Every controller is built on the same pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;API Server Watch
      │
      ▼
   Informer
   (local cache)
      │
      ▼  (on change event)
  WorkQueue  ←──── rate-limited, deduplicated
      │
      ▼
  Worker goroutines (usually 1-5)
      │
      ▼
  Reconcile function
      │
      ├── Success → remove from queue
      └── Failure → re-queue with exponential backoff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
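&lt;p&gt;The two properties called out in the diagram, deduplication and exponential backoff, fit in a short sketch. This is not the client-go workqueue: the class below is a hypothetical single-threaded stand-in, and instead of actually sleeping it just returns the delay a real queue would wait before retrying.&lt;/p&gt;

```python
from collections import deque


class WorkQueue:
    def __init__(self, base_delay=1.0, max_delay=60.0):
        self.items = deque()
        self.queued = set()     # keys currently waiting, for deduplication
        self.failures = {}      # key -> consecutive failure count
        self.base_delay = base_delay
        self.max_delay = max_delay

    def add(self, key):
        # Many events for the same key collapse into one queued item.
        if key not in self.queued:
            self.queued.add(key)
            self.items.append(key)

    def get(self):
        key = self.items.popleft()
        self.queued.discard(key)
        return key

    def forget(self, key):
        self.failures.pop(key, None)

    def backoff(self, key):
        # Delay doubles with each consecutive failure, capped at max_delay.
        n = self.failures.get(key, 0)
        self.failures[key] = n + 1
        return min(self.base_delay * 2 ** n, self.max_delay)


def process_next(queue, reconcile):
    key = queue.get()
    try:
        reconcile(key)
    except Exception:
        delay = queue.backoff(key)
        queue.add(key)  # a real queue would re-add only after `delay` seconds
        return delay
    queue.forget(key)
    return None


queue = WorkQueue()
for _ in range(3):
    queue.add("default/web")    # three change events for the same object
pending = len(queue.items)

attempts = []


def flaky_reconcile(key):
    attempts.append(key)
    if 3 > len(attempts):
        raise RuntimeError("transient API error")


delays = [process_next(queue, flaky_reconcile) for _ in range(3)]
print(pending, delays)
```

&lt;p&gt;Note that the queue holds keys, not events. The reconcile function always re-reads current state from the informer cache, which is why dropping duplicate events is safe.&lt;/p&gt;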



&lt;p&gt;This pattern means controllers are &lt;strong&gt;eventually consistent&lt;/strong&gt; — they don't act on every single event, they converge to the desired state over time.&lt;/p&gt;




&lt;h3&gt;
  
  
  5. 🔴 cloud-controller-manager — The Cloud API Bridge
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What It Actually Does
&lt;/h4&gt;

&lt;p&gt;The CCM was split out of kube-controller-manager (alpha in Kubernetes 1.6, beta in 1.11) specifically to decouple Kubernetes from cloud provider APIs. It runs cloud-specific control loops:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Node Controller (cloud variant)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;On new Node joining:
  1. Fetch instance metadata from cloud API
     (AWS EC2 DescribeInstances / GCP ComputeInstances)
  2. Apply cloud provider labels:
     - topology.kubernetes.io/zone = us-east-1a
     - node.kubernetes.io/instance-type = m5.xlarge
  3. Set node addresses (internal/external IP from cloud metadata)
  4. Check if instance still exists periodically
     → If terminated in cloud: delete the Node object
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Route Controller (AWS/GCP specific)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;For each node:
  ensure cloud routing table has route:
  pod-cidr (e.g., 10.244.1.0/24) → node instance-id

This is how pod-to-pod routing works across nodes
WITHOUT an overlay network on supported clouds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Service Controller — The LoadBalancer Magic&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Watch Services with type=LoadBalancer:
  on CREATE: call cloud API → create load balancer
             update service.status.loadBalancer.ingress with external IP
  on UPDATE: update LB listener rules / health checks
  on DELETE: delete cloud load balancer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is why &lt;code&gt;kubectl get svc&lt;/code&gt; shows &lt;code&gt;&amp;lt;pending&amp;gt;&lt;/code&gt; for LoadBalancer services until the cloud LB is provisioned.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ The K3s Control Plane: Architectural Reimagination
&lt;/h2&gt;

&lt;p&gt;Now let's look at what K3s does differently — not just "it's smaller" but &lt;em&gt;architecturally&lt;/em&gt; why.&lt;/p&gt;




&lt;h3&gt;
  
  
  K3s Single Binary Philosophy
&lt;/h3&gt;

&lt;p&gt;K3s ships as a &lt;strong&gt;single ~70MB binary&lt;/strong&gt; (&lt;code&gt;k3s&lt;/code&gt;) that embeds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;k3s binary
├── k3s-server (control plane)
│   ├── kube-apiserver
│   ├── kube-controller-manager
│   ├── kube-scheduler
│   ├── kubelet
│   ├── kube-proxy
│   ├── embedded containerd
│   ├── embedded CoreDNS
│   ├── embedded Flannel (CNI)
│   ├── embedded Traefik (ingress)
│   ├── embedded ServiceLB (load balancer)
│   └── embedded local-path-provisioner (storage)
└── k3s-agent (worker)
    ├── kubelet
    ├── kube-proxy
    └── embedded containerd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



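&lt;p&gt;The core Kubernetes components are not containerized: they are linked as Go packages into a single binary and run in one process. Add-ons such as CoreDNS, Traefik, and local-path-provisioner ship inside the binary as bundled manifests and are deployed as pods at startup. Startup goes from &lt;strong&gt;~3 minutes&lt;/strong&gt; (typical K8s) to &lt;strong&gt;under 30 seconds&lt;/strong&gt;.&lt;/p&gt;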




&lt;h3&gt;
  
  
  1. K3s API Server — Same Core, Slimmer Defaults
&lt;/h3&gt;

&lt;p&gt;The K3s API server is still the upstream &lt;code&gt;kube-apiserver&lt;/code&gt; — but K3s wraps it with:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Removed/Disabled by Default:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Alpha feature gates are disabled&lt;/li&gt;
&lt;li&gt;Cloud provider plugins: &lt;code&gt;--cloud-provider=external&lt;/code&gt; not set (no CCM)&lt;/li&gt;
&lt;li&gt;Several admission plugins that assume cloud infra&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The K3s Tunnel Proxy — Replacing the CCM Node Controller&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;K3s introduces a &lt;strong&gt;reverse tunnel&lt;/strong&gt; from agent → server. In standard K8s, the API server connects &lt;em&gt;to&lt;/em&gt; the kubelet for exec/logs/port-forward. In K3s:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Standard K8s:
  kube-apiserver → kubelet:10250  (API server initiates)
  Requires API server to reach all nodes directly

K3s:
  k3s-agent → k3s-server:6443 (agent initiates)
  ┌────────────────────────────────────────────────┐
  │  WebSocket tunnel maintained by agent          │
  │  All kubelet traffic flows THROUGH this tunnel │
  └────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is why K3s works &lt;strong&gt;behind NAT&lt;/strong&gt; without special networking — agents reach out, not the server. This is a fundamental architectural shift that enables edge/IoT deployments.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. SQLite / Kine — The etcd Abstraction Layer
&lt;/h3&gt;

&lt;p&gt;This is the most significant architectural difference.&lt;/p&gt;

&lt;p&gt;K3s introduces &lt;strong&gt;Kine&lt;/strong&gt; (Kubernetes Is Not Etcd) — a &lt;strong&gt;shim that translates etcd's gRPC API into SQL queries&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kube-apiserver
      │
      │  etcd gRPC v3 protocol (Range, Watch, Txn, etc.)
      ▼
┌──────────────┐
│     Kine     │  ← translation layer
│  (etcd shim) │
└──────┬───────┘
       │  SQL queries
       ▼
┌──────────────┐
│   SQLite /   │  ← actual datastore
│  PostgreSQL  │
│    MySQL     │
│   DQLite     │
└──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How Kine Implements the etcd Watch API:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;etcd's watch is event-driven via gRPC streams. SQL databases don't natively support this. Kine implements it via:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Kine's core table&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;kine&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;      &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTOINCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- acts as etcd revision&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;    &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                               &lt;span class="c1"&gt;-- the key (/registry/pods/...)&lt;/span&gt;
  &lt;span class="n"&gt;created&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;deleted&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;create_revision&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;prev_revision&lt;/span&gt;   &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;lease&lt;/span&gt;   &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;value&lt;/span&gt;   &lt;span class="nb"&gt;BLOB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                              &lt;span class="c1"&gt;-- the protobuf-encoded object&lt;/span&gt;
  &lt;span class="n"&gt;old_value&lt;/span&gt; &lt;span class="nb"&gt;BLOB&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Watch is implemented as polling:&lt;/span&gt;
&lt;span class="c1"&gt;-- SELECT * FROM kine WHERE id &amp;gt; last_seen_id ORDER BY id&lt;/span&gt;
&lt;span class="c1"&gt;-- Run every ~100ms — NOT event-driven like real etcd&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
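
&lt;p&gt;To make the polling idea concrete, here is a minimal Python sketch against an in-memory SQLite database. It is an illustration, not Kine's actual code: the table keeps only a subset of the columns above, and the helper names (&lt;code&gt;open_store&lt;/code&gt;, &lt;code&gt;put&lt;/code&gt;, &lt;code&gt;poll&lt;/code&gt;) are made up for this example.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import sqlite3


def open_store():
    """Create an in-memory table with a subset of the kine columns."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE kine ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "  # doubles as the revision
        "name TEXT, deleted INTEGER, value BLOB)"
    )
    return db


def put(db, name, value, deleted=0):
    """Append a row for every change (this sketch never updates in place)."""
    cur = db.execute(
        "INSERT INTO kine (name, deleted, value) VALUES (?, ?, ?)",
        (name, deleted, value),
    )
    db.commit()
    return cur.lastrowid


def poll(db, last_seen_id):
    """Return (events, new_cursor) for rows written after last_seen_id."""
    # Read the current head revision, then fetch everything up to it.
    head = db.execute("SELECT COALESCE(MAX(id), 0) FROM kine").fetchone()[0]
    rows = db.execute(
        "SELECT id, name, deleted FROM kine "
        "WHERE id BETWEEN ? AND ? ORDER BY id",
        (last_seen_id + 1, head),
    ).fetchall()
    events = [
        ("DELETE" if deleted else "PUT", name, rev)
        for rev, name, deleted in rows
    ]
    return events, head


db = open_store()
put(db, "/registry/pods/default/web", b"v1")
put(db, "/registry/pods/default/web", b"v2")
events, cursor = poll(db, 0)
print(events)   # two PUT events, revisions 1 and 2

put(db, "/registry/pods/default/web", b"", deleted=1)
events, cursor = poll(db, cursor)
print(events)   # one DELETE event, revision 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A watcher would call &lt;code&gt;poll&lt;/code&gt; on a timer and carry the returned cursor into the next call, which is why a change can sit unnoticed for up to one polling interval.&lt;/p&gt;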



&lt;p&gt;&lt;strong&gt;The Implications:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For small clusters: unnoticeable&lt;/li&gt;
&lt;li&gt;For large clusters: polling adds latency to watch events&lt;/li&gt;
&lt;li&gt;SQLite: single-writer, so it supports only one server node (agents can still join, but the control plane has no HA)&lt;/li&gt;
&lt;li&gt;PostgreSQL/MySQL with Kine: HA possible but watch latency higher than etcd&lt;/li&gt;
&lt;/ul&gt;

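&lt;p&gt;&lt;strong&gt;Embedded HA: DQLite (Removed) and Embedded etcd&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Early K3s releases offered experimental HA through &lt;strong&gt;DQLite&lt;/strong&gt;, a distributed SQLite implementation that uses Raft (similar to etcd but built on SQLite). As of K3s v1.19, DQLite was replaced by an &lt;strong&gt;embedded etcd&lt;/strong&gt; that runs inside the K3s binary, so HA still doesn't require an external DB. In that mode the API server talks to etcd directly rather than going through Kine's SQL translation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# K3s with embedded HA (embedded etcd on v1.19+; DQLite only on older experimental releases)&lt;/span&gt;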
k3s server &lt;span class="nt"&gt;--cluster-init&lt;/span&gt;   &lt;span class="c"&gt;# First server (bootstrap)&lt;/span&gt;
k3s server &lt;span class="nt"&gt;--server&lt;/span&gt; https://first-server:6443 &lt;span class="nt"&gt;--token&lt;/span&gt; &amp;lt;token&amp;gt;  &lt;span class="c"&gt;# Join as HA peer&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  3. K3s Controller Manager — Pruned and Extended
&lt;/h3&gt;

&lt;p&gt;K3s runs the upstream &lt;code&gt;kube-controller-manager&lt;/code&gt; with several modifications:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Removed Controllers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cloud-node&lt;/code&gt; controller (no cloud metadata fetching)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cloud-node-lifecycle&lt;/code&gt; controller&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;route&lt;/code&gt; controller (no cloud routes)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;service&lt;/code&gt; controller (replaced by ServiceLB)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Added: ServiceLB (a.k.a. Klipper LoadBalancer)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of calling a cloud API to provision a load balancer, K3s runs a &lt;strong&gt;DaemonSet-based solution&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Service type=LoadBalancer created
        │
        ▼
ServiceLB Controller watches for it
        │
        ▼
Creates a DaemonSet:
  - Runs a pod on every node with hostPort matching service ports
  - The pod does iptables DNAT → service ClusterIP
        │
        ▼
Every node's IP becomes a valid entry point
(no external LB needed)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# What ServiceLB actually deploys under the hood&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DaemonSet&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;svclb-my-service&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;hostNetwork&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lb-port-80&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rancher/klipper-lb:latest&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;hostPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;       &lt;span class="c1"&gt;# binds on every node&lt;/span&gt;
          &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SRC_PORT&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;80"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DEST_PROTO&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DEST_IP&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10.43.100.50"&lt;/span&gt;  &lt;span class="c1"&gt;# ClusterIP (K3s default service CIDR: 10.43.0.0/16)&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DEST_PORT&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;80"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
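
&lt;p&gt;The mapping from Service ports to containers in that manifest is mechanical. The following Python sketch reproduces the shape of the manifest for any list of ports. It is not ServiceLB's real code: the function name is invented for this example, and the ClusterIP is a placeholder.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def svclb_containers(cluster_ip, ports):
    """Build one container entry per Service port.

    ports is a list of (port, protocol) tuples taken from the Service spec.
    """
    containers = []
    for port, protocol in ports:
        containers.append({
            "name": "lb-port-%d" % port,
            # hostPort binds the port on every node that runs the DaemonSet pod
            "ports": [{"hostPort": port, "containerPort": port}],
            # these variables tell the pod where to forward the traffic
            "env": [
                {"name": "SRC_PORT", "value": str(port)},
                {"name": "DEST_PROTO", "value": protocol},
                {"name": "DEST_IP", "value": cluster_ip},
                {"name": "DEST_PORT", "value": str(port)},
            ],
        })
    return containers


for container in svclb_containers("10.43.100.50", [(80, "TCP"), (443, "TCP")]):
    print(container["name"], container["ports"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A Service that exposes two ports therefore ends up with two containers in the same pod, each claiming its own host port on the node.&lt;/p&gt;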






&lt;h3&gt;
  
  
  4. K3s Scheduler — Unchanged but Co-located
&lt;/h3&gt;

&lt;p&gt;The scheduler in K3s is the &lt;strong&gt;unmodified upstream kube-scheduler&lt;/strong&gt;. However, it runs as a goroutine &lt;strong&gt;inside the k3s-server binary&lt;/strong&gt; rather than as a separate process.&lt;/p&gt;

&lt;p&gt;The key difference is operational:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In K8s: scheduler can be independently scaled, upgraded, or replaced (e.g., with Volcano, YuniKorn)&lt;/li&gt;
&lt;li&gt;In K3s: scheduler is embedded, so replacing it means starting the server with &lt;code&gt;--disable-scheduler&lt;/code&gt; and deploying your own, or running a second scheduler alongside the built-in one and selecting it per pod via &lt;code&gt;schedulerName&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  5. The Flannel CNI — Embedded Networking
&lt;/h3&gt;

&lt;p&gt;Standard K8s requires you to install a CNI (Calico, Cilium, Flannel, Weave) separately. K3s embeds &lt;strong&gt;Flannel&lt;/strong&gt; with VXLAN as the default backend.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod on Node 1 (10.42.1.5) → Pod on Node 2 (10.42.2.7)

Standard K8s + Calico:
  10.42.1.5 → BGP route → 10.42.2.7 (no encapsulation on supported networks)

K3s + Flannel VXLAN:
  10.42.1.5 → VXLAN encapsulate → eth0:8472 → Node 2 → decapsulate → 10.42.2.7
  (works everywhere, slight overhead from encapsulation)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
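
&lt;p&gt;The "slight overhead" can be put in numbers. A rough sketch, assuming an IPv4 underlay with a standard 1500-byte MTU and no VLAN tags; the constant and function names are only for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Bytes that sit between the node's IP MTU and the pod's own IP packet
# when a frame is wrapped in VXLAN over IPv4.
OUTER_IPV4 = 20
OUTER_UDP = 8        # Flannel's VXLAN backend uses UDP port 8472 on Linux
VXLAN_HEADER = 8
INNER_ETHERNET = 14  # the pod's original Ethernet header travels inside


def vxlan_overhead():
    return OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET


def pod_mtu(node_mtu):
    """Largest pod IP packet that still fits in one packet on the underlay."""
    return node_mtu - vxlan_overhead()


print(vxlan_overhead())  # 50
print(pod_mtu(1500))     # 1450
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That 50-byte difference is why a VXLAN overlay interface typically runs with an MTU of 1450 on a 1500-byte network.&lt;/p&gt;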



&lt;p&gt;K3s also supports swapping Flannel for Cilium or Calico if you disable the built-in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k3s server &lt;span class="nt"&gt;--flannel-backend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="nt"&gt;--disable-network-policy&lt;/span&gt;
&lt;span class="c"&gt;# Then install Cilium/Calico manually&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  📊 Side-by-Side Deep Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Standard Kubernetes&lt;/th&gt;
&lt;th&gt;K3s&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Deployment model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Separate processes (+ etcd cluster)&lt;/td&gt;
&lt;td&gt;Single binary, all-in-one&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API server&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full upstream, all features&lt;/td&gt;
&lt;td&gt;Full upstream, conservative defaults&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Datastore&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;etcd (Raft, event-driven watch)&lt;/td&gt;
&lt;td&gt;SQLite/Kine (SQL polling) or embedded etcd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Watch latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~10ms (event-driven)&lt;/td&gt;
&lt;td&gt;~100ms (polling on SQL backends)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HA datastore&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;etcd cluster (3/5 nodes)&lt;/td&gt;
&lt;td&gt;External DB + Kine OR embedded etcd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Control plane HA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multiple API server replicas&lt;/td&gt;
&lt;td&gt;Multiple k3s-server nodes possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cloud integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;cloud-controller-manager&lt;/td&gt;
&lt;td&gt;Embedded stub CCM (no cloud API calls), ServiceLB + node-ip flags&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LoadBalancer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cloud LB (AWS ELB, GCP GLB)&lt;/td&gt;
&lt;td&gt;ServiceLB DaemonSet (hostPort)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ingress&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bring your own (nginx, traefik)&lt;/td&gt;
&lt;td&gt;Traefik v2 embedded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CNI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bring your own&lt;/td&gt;
&lt;td&gt;Flannel (VXLAN) embedded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DNS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bring your own CoreDNS&lt;/td&gt;
&lt;td&gt;CoreDNS embedded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bring your own CSI&lt;/td&gt;
&lt;td&gt;local-path-provisioner embedded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kubelet location&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Separate binary on worker&lt;/td&gt;
&lt;td&gt;Embedded in k3s binary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API server → kubelet&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Direct connection (port 10250)&lt;/td&gt;
&lt;td&gt;Reverse WebSocket tunnel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory (control plane)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~2GB+ (separate processes)&lt;/td&gt;
&lt;td&gt;~512MB (single process)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Startup time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2-5 minutes&lt;/td&gt;
&lt;td&gt;20-30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Alpha feature gates&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Available&lt;/td&gt;
&lt;td&gt;Disabled by default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Admission webhooks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CRDs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RBAC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;td&gt;Full support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audit logging&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Configurable&lt;/td&gt;
&lt;td&gt;Configurable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scheduler extensibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Scheduler profiles, plugins&lt;/td&gt;
&lt;td&gt;Embedded; replace with external&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Controller extensibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Separate binary, hot-swap&lt;/td&gt;
&lt;td&gt;Embedded goroutine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Upgrades&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Independent component upgrades&lt;/td&gt;
&lt;td&gt;Single binary upgrade&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Edge/NAT traversal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires direct reachability&lt;/td&gt;
&lt;td&gt;Native via reverse tunnel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ARM support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Separate builds&lt;/td&gt;
&lt;td&gt;Native multi-arch in single release&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🔑 When to Choose What
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Choose Standard Kubernetes When:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✅ 100+ node clusters
✅ Financial / regulated workloads requiring etcd for compliance
✅ You need independent control plane component upgrades
✅ You're using cloud-managed control planes (EKS, GKE, AKS)
✅ You need custom scheduler profiles (ML batch, GPU scheduling)
✅ Multi-tenancy with strong isolation requirements
✅ You need external etcd for ultra-high availability
✅ Team has K8s expertise and infra budget
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Choose K3s When:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✅ Edge computing (retail, industrial, remote sites)
✅ IoT / ARM devices (Raspberry Pi clusters)
✅ CI/CD ephemeral clusters (fast startup is critical)
✅ Development environments (minimal resource usage)
✅ Single-node homelab or small on-prem clusters
✅ Clusters behind NAT (reverse tunnel is a killer feature)
✅ Teams that want "it just works" with less Ops overhead
✅ Bare metal without a cloud provider
✅ Air-gapped environments (single binary, easy to ship)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔭 The Architecture Decision Tree
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Do you need &amp;gt;50 nodes?
├── YES → Standard K8s (EKS/GKE/AKS or kubeadm)
└── NO
    ├── Are you on edge/IoT/ARM?
    │   └── YES → K3s (purpose-built for this)
    ├── Do you need cloud LoadBalancer integration?
    │   └── YES → Standard K8s with CCM
    ├── Is startup speed critical? (CI/CD, dev envs)
    │   └── YES → K3s
    ├── Do you need etcd for compliance/audit?
    │   └── YES → Standard K8s
    └── Default recommendation for &amp;lt;20 nodes on-prem?
        └── K3s (less to manage, same K8s API)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
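
&lt;p&gt;One way to read the tree is as an ordered list of checks where the first match wins. The sketch below encodes that reading in Python; the field names are invented for this example, and the ordering is one possible interpretation of the diagram rather than a hard rule.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass


@dataclass
class Requirements:
    more_than_50_nodes: bool = False
    edge_iot_or_arm: bool = False
    cloud_load_balancer: bool = False
    startup_speed_critical: bool = False
    etcd_for_compliance: bool = False


def recommend(req):
    """Walk the branches top to bottom; the first match wins."""
    if req.more_than_50_nodes:
        return "Standard K8s"
    if req.edge_iot_or_arm:
        return "K3s"
    if req.cloud_load_balancer:
        return "Standard K8s"
    if req.startup_speed_critical:
        return "K3s"
    if req.etcd_for_compliance:
        return "Standard K8s"
    # Default branch: small on-prem cluster with no special constraints
    return "K3s"


print(recommend(Requirements(edge_iot_or_arm=True)))      # K3s
print(recommend(Requirements(cloud_load_balancer=True)))  # Standard K8s
print(recommend(Requirements()))                          # K3s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the checks are ordered, a requirement higher in the tree overrides one below it: an edge site that also wants a cloud load balancer still lands on K3s in this reading.&lt;/p&gt;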






&lt;h2&gt;
  
  
  🎯 Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;K3s isn't "Kubernetes with stuff removed." It's a &lt;strong&gt;purpose-built reimagining of the control plane&lt;/strong&gt; for constrained environments. Rancher made deliberate trade-offs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;etcd → Kine/SQLite&lt;/strong&gt;: Sacrificed watch latency and native HA for operational simplicity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separate binaries → Single binary&lt;/strong&gt;: Sacrificed independent upgradeability for atomic deployments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CCM → ServiceLB&lt;/strong&gt;: Sacrificed cloud-native LB for zero-dependency load balancing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct kubelet access → Reverse tunnel&lt;/strong&gt;: Sacrificed simplicity for NAT traversal capability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a distribution that runs the &lt;strong&gt;full Kubernetes API&lt;/strong&gt; on a Raspberry Pi with 512MB of RAM, starts in 30 seconds, and works behind NAT — things standard K8s simply wasn't designed for.&lt;/p&gt;

&lt;p&gt;Both are Kubernetes. Both run your workloads. The control plane is where the real difference lives.&lt;/p&gt;




</description>
      <category>kubernetes</category>
      <category>k3s</category>
      <category>devops</category>
      <category>docker</category>
    </item>
    <item>
      <title>I Broke My Proxmox Home Lab with a GPU Passthrough - Here's How I Fixed It</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Tue, 19 May 2026 03:30:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/i-broke-my-proxmox-home-lab-with-a-gpu-passthrough-heres-how-i-fixed-it-45na</link>
      <guid>https://dev.to/pendelabhargavasai/i-broke-my-proxmox-home-lab-with-a-gpu-passthrough-heres-how-i-fixed-it-45na</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;How a Kubernetes worker VM with a passed-through AMD GPU sent my entire home lab into an infinite crash loop — and the GRUB-level trick that saved it.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I Was Trying to Do
&lt;/h2&gt;

&lt;p&gt;My home lab runs on a Mini PC with Proxmox as the hypervisor. The setup hosts a mix of LXC containers and KVM virtual machines — Jellyfin for media, a Pi-hole, a Kubernetes cluster (k3s), Cloudflare tunnels, and a bunch of other self-hosted services.&lt;/p&gt;

&lt;p&gt;I was upgrading the hardware configuration of &lt;strong&gt;VM 104 (k3s-vm-worker)&lt;/strong&gt; — one of the worker nodes in my k3s Kubernetes cluster. The goal was straightforward: pass through the host's &lt;strong&gt;AMD GPU directly into the VM&lt;/strong&gt; so the Kubernetes worker could handle GPU-accelerated workloads.&lt;/p&gt;

&lt;p&gt;In Proxmox, GPU passthrough (PCIe passthrough) requires a few things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IOMMU enabled in BIOS (&lt;code&gt;AMD-Vi&lt;/code&gt; for AMD platforms)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;amd_iommu=on iommu=pt&lt;/code&gt; kernel parameters in GRUB&lt;/li&gt;
&lt;li&gt;Machine type set to &lt;code&gt;q35&lt;/code&gt; (required for PCIe passthrough)&lt;/li&gt;
&lt;li&gt;The PCI device added as &lt;code&gt;hostpci0&lt;/code&gt; in the VM's hardware config&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I configured all of this. The hardware tab showed the PCI device (&lt;code&gt;0000:e6:00&lt;/code&gt;, highlighted in orange — a warning sign I probably should have paid more attention to). I saved the config, felt good about it, and rebooted.&lt;/p&gt;

&lt;p&gt;That was my mistake.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fei9tmbglk5q65ok1zmxc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fei9tmbglk5q65ok1zmxc.png" alt=" " width="800" height="339"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Happened — The Crash Loop
&lt;/h2&gt;

&lt;p&gt;The Mini PC came back up, Proxmox started loading, and then — nothing. The host became completely unresponsive. No web UI. No SSH. Nothing.&lt;/p&gt;

&lt;p&gt;I power-cycled it. Same thing. It would power on, start booting, and crash before I could even get to the Proxmox web interface.&lt;/p&gt;

&lt;p&gt;Here's what was actually happening, which I only understood after I could finally get in and check the logs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VM 104 had "Start at boot" set to Yes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The moment Proxmox finished loading, it triggered a &lt;strong&gt;"Bulk Start VMs"&lt;/strong&gt; task — its automatic process for starting all VMs flagged to auto-start. VM 104 was in that queue. The second that task reached VM 104, Proxmox tried to initialise the PCIe passthrough, the VM grabbed the GPU from the host, and the entire system crashed — hard.&lt;/p&gt;

&lt;p&gt;In a panic, I got back into the web UI during a narrow window and changed "Start at boot" to &lt;strong&gt;No&lt;/strong&gt;. I thought that would fix it.&lt;/p&gt;

&lt;p&gt;It didn't.&lt;/p&gt;

&lt;p&gt;Proxmox had &lt;strong&gt;already queued the bulk start task&lt;/strong&gt; the moment it turned on. Changing the setting mid-flight did nothing. The next boot, same crash. The boot after that, same crash.&lt;/p&gt;

&lt;p&gt;I was locked in an &lt;strong&gt;auto-start crash loop&lt;/strong&gt; with no way out through the web UI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Understanding the Problem — Why It Was So Hard to Escape
&lt;/h2&gt;

&lt;p&gt;This is the part that makes this specific failure mode so nasty:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Boot → Proxmox loads → "Bulk Start VMs" fires instantly →
VM 104 grabs GPU → Host kernel panics → System crashes →
Reboot → repeat forever
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The window between "Proxmox is up enough to serve the web UI" and "VM 104 has already been queued and grabbed the GPU" was too small to intervene through normal means. Every time I managed to get in and change something, the crash had already been scheduled.&lt;/p&gt;

&lt;p&gt;The only way out was to &lt;strong&gt;break the loop before Proxmox had any chance to load its VM management services at all&lt;/strong&gt; — which means intervening at the GRUB level, before the operating system even finishes booting.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution — Masking the VM Start Service at Boot
&lt;/h2&gt;

&lt;p&gt;This required plugging a physical keyboard and monitor into the Mini PC. No SSH, no remote — physical access only.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Stop the autoboot timer at GRUB
&lt;/h3&gt;

&lt;p&gt;Power on the machine and watch the screen. The moment the &lt;strong&gt;blue Proxmox GRUB menu&lt;/strong&gt; appears, press the &lt;strong&gt;Down Arrow&lt;/strong&gt; key immediately. This stops the autoboot countdown timer and lets you stay at the GRUB menu.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Edit the boot command
&lt;/h3&gt;

&lt;p&gt;Highlight the top &lt;strong&gt;"Proxmox VE"&lt;/strong&gt; entry and press &lt;strong&gt;&lt;code&gt;e&lt;/code&gt;&lt;/strong&gt; to edit the boot command. You'll see the full kernel command line. Navigate to the end of the &lt;code&gt;linux&lt;/code&gt; line.&lt;/p&gt;

&lt;p&gt;Remove &lt;code&gt;amd_iommu=on iommu=pt&lt;/code&gt; (these were the IOMMU parameters I'd added earlier that enabled the passthrough).&lt;/p&gt;

&lt;p&gt;Then, at the very end of the &lt;code&gt;linux&lt;/code&gt; line, add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemd.mask&lt;span class="o"&gt;=&lt;/span&gt;pve-guests.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the line ends with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;ro quiet systemd.mask=pve-guests.service
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Press &lt;strong&gt;&lt;code&gt;Ctrl + X&lt;/code&gt;&lt;/strong&gt; to boot with this modified command.&lt;/p&gt;
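
&lt;p&gt;The edit in Step 2 boils down to a small string transformation. The Python sketch below shows it on a hypothetical command line: the &lt;code&gt;root=&lt;/code&gt; value is a placeholder and the function name is made up for illustration. At the GRUB prompt you make the same change by hand.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def rescue_cmdline(cmdline, drop=("amd_iommu=on", "iommu=pt"),
                   mask="pve-guests.service"):
    """Drop the IOMMU flags and append a systemd.mask for one unit."""
    kept = [arg for arg in cmdline.split() if arg not in drop]
    kept.append("systemd.mask=" + mask)
    return " ".join(kept)


# Hypothetical starting line; the root= value is a placeholder.
before = "root=/dev/mapper/pve-root ro quiet amd_iommu=on iommu=pt"
print(rescue_cmdline(before))
# root=/dev/mapper/pve-root ro quiet systemd.mask=pve-guests.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Nothing is written to disk by the GRUB edit, so the original command line comes back on the next normal boot.&lt;/p&gt;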

&lt;h3&gt;
  
  
  What &lt;code&gt;systemd.mask=pve-guests.service&lt;/code&gt; actually does
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;pve-guests.service&lt;/code&gt; is the systemd unit responsible for the &lt;strong&gt;"Bulk Start VMs"&lt;/strong&gt; behaviour in Proxmox. Masking it via the kernel command line at boot tells systemd to completely skip that service for this boot session only — it doesn't persist across reboots.&lt;/p&gt;

&lt;p&gt;The result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Proxmox boots normally&lt;/li&gt;
&lt;li&gt;✅ Web UI comes up&lt;/li&gt;
&lt;li&gt;✅ Network is available&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Zero VMs start automatically&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;✅ Host does not crash&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 3: Fix the VM config via web UI
&lt;/h3&gt;

&lt;p&gt;Once safely in the web UI with no VMs running:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;VM 104 → Options → "Start at boot" → No&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VM 104 → Hardware → PCI Device (hostpci0) → Remove&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VM 104 → Hardware → Machine → change from &lt;code&gt;q35&lt;/code&gt; back to &lt;code&gt;i440fx&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Reboot normally — don't touch the GRUB menu this time&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Plot Twist — The GPU Renamed Itself
&lt;/h2&gt;

&lt;p&gt;After fixing the crash loop and getting the host stable, I started my LXC containers — &lt;strong&gt;CT 204 (jellyfin-arr)&lt;/strong&gt; and &lt;strong&gt;CT 208 (linkwarden)&lt;/strong&gt; — and hit a completely different error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error: Device /dev/dri/card0 ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both containers refused to start. I checked the DRI devices on the host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /dev/dri/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crw-rw---- 1 root video  226,   1 May 13 13:51 card1
crw-rw---- 1 root render 226, 128 May 13 13:51 renderD128
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AMD GPU loaded perfectly (&lt;code&gt;Kernel driver in use: amdgpu&lt;/code&gt;). But Linux had decided to name it &lt;strong&gt;&lt;code&gt;card1&lt;/code&gt;&lt;/strong&gt; instead of &lt;strong&gt;&lt;code&gt;card0&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn87t2mdot73ljsp51lu2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn87t2mdot73ljsp51lu2.png" alt=" " width="800" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a surprisingly common Linux behaviour: when you change BIOS IOMMU settings, or reboot without an HDMI cable connected to a GPU, the kernel can enumerate DRM devices in a different order and assign a different &lt;code&gt;cardN&lt;/code&gt; number. The GPU is fine. The driver is fine. The device node just got a different name.&lt;/p&gt;

&lt;p&gt;My LXC containers were hardcoded to pass through &lt;code&gt;/dev/dri/card0&lt;/code&gt;. The fix was trivial once I understood it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;CT 204 → Resources → Device (dev1)&lt;/strong&gt; — change &lt;code&gt;/dev/dri/card0&lt;/code&gt; to &lt;code&gt;/dev/dri/card1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CT 208 → Resources → Device (dev1)&lt;/strong&gt; — same change&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both containers started instantly. Hardware transcoding in Jellyfin came back up immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  Root Cause Analysis
&lt;/h2&gt;

&lt;p&gt;Looking back, there were actually &lt;strong&gt;three separate issues&lt;/strong&gt; stacked on top of each other:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;th&gt;Root Cause&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Host crash loop&lt;/td&gt;
&lt;td&gt;VM with PCIe passthrough set to "Start at boot" — GPU grabbed before host could stabilise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Can't escape via web UI&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pve-guests.service&lt;/code&gt; fires before web UI is interactive enough to intervene&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;LXC containers broken&lt;/td&gt;
&lt;td&gt;GPU renamed from &lt;code&gt;card0&lt;/code&gt; to &lt;code&gt;card1&lt;/code&gt; after IOMMU BIOS change&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The orange &lt;code&gt;0000:e6:00&lt;/code&gt; text in the VM Hardware tab was the visual cue I missed. In Proxmox, orange on a hardware entry generally marks a pending change that has not yet been applied to the VM, not a verdict on IOMMU group isolation, so it was the moment to double-check the passthrough setup before saving and booting.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Never enable "Start at boot" on a VM with PCIe passthrough until you've verified that the passthrough works with a manual start.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Test the VM with a manual start. Confirm the host stays stable. Confirm the VM boots correctly with the passed-through device. Only then enable auto-start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. &lt;code&gt;systemd.mask=&amp;lt;service&amp;gt;&lt;/code&gt; at GRUB is your emergency brake.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any time Proxmox is in a state where it crashes before you can intervene through the web UI, this technique gives you a clean boot with surgical control over what services start. &lt;code&gt;pve-guests.service&lt;/code&gt; is the specific one to mask for VM auto-start loops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. GPU device node names are not stable across reboots in Linux.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/dev/dri/card0&lt;/code&gt; is not guaranteed to be your GPU after every reboot, especially when BIOS settings change. Instead of hardcoding &lt;code&gt;card0&lt;/code&gt;, consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using &lt;code&gt;udev&lt;/code&gt; rules to create a stable symlink to your specific GPU by PCI ID&lt;/li&gt;
&lt;li&gt;Or checking &lt;code&gt;ls /dev/dri/&lt;/code&gt; after any BIOS/IOMMU change before starting dependent containers
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find which card corresponds to your GPU by PCI ID&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /dev/dri/by-path/
&lt;span class="c"&gt;# or&lt;/span&gt;
lspci &lt;span class="nt"&gt;-k&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; 3 VGA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
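&lt;p&gt;As a rough sketch of a lookup that does not depend on the card number, the helper below resolves the current card node from a PCI address through the &lt;code&gt;by-path&lt;/code&gt; symlinks. The function name &lt;code&gt;card_for_pci&lt;/code&gt; and the full address &lt;code&gt;0000:e6:00.0&lt;/code&gt; (function &lt;code&gt;.0&lt;/code&gt;) are assumptions for illustration; substitute the address that &lt;code&gt;lspci&lt;/code&gt; reports on your host.&lt;/p&gt;

```shell
# Resolve the DRM card node for a PCI address via the by-path symlinks.
# $1 = PCI address, e.g. 0000:e6:00.0 (assumed example)
# $2 = by-path directory (optional, defaults to /dev/dri/by-path)
card_for_pci() {
  dir=${2:-/dev/dri/by-path}
  link="$dir/pci-$1-card"
  # Fail quietly when no DRM device is bound at that address
  [ -e "$link" ] || return 1
  # Follow the symlink and print only the node name (card0, card1, ...)
  basename "$(readlink -f "$link")"
}
```

&lt;p&gt;Running &lt;code&gt;card_for_pci 0000:e6:00.0&lt;/code&gt; after a BIOS or IOMMU change shows which &lt;code&gt;cardN&lt;/code&gt; the GPU currently owns, so container configs can be checked before anything that depends on them is started.&lt;/p&gt;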



&lt;p&gt;&lt;strong&gt;4. PCIe passthrough machine type matters — and &lt;code&gt;q35&lt;/code&gt; has implications.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In Proxmox, the &lt;code&gt;q35&lt;/code&gt; machine type is required to pass a device through as PCIe (the &lt;code&gt;pcie=1&lt;/code&gt; option); &lt;code&gt;i440fx&lt;/code&gt; can only expose it as a legacy PCI device. The two machine types also differ in their emulated chipset and device layout, so switching mid-configuration without a working baseline is risky. Start with a known-good VM on &lt;code&gt;i440fx&lt;/code&gt;, verify it's stable, then migrate to &lt;code&gt;q35&lt;/code&gt; and add the PCI device.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Physical access to your home lab server is non-negotiable.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're running a home lab with GPU passthrough or any kind of hardware-level config, have a keyboard and monitor you can plug in. SSH and the web UI are great until they're not. The GRUB console saved this entire setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Final State
&lt;/h2&gt;

&lt;p&gt;After all of this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Proxmox host boots cleanly and stays stable&lt;/li&gt;
&lt;li&gt;✅ All LXC containers (jellyfin-arr, linkwarden) running with GPU access on &lt;code&gt;card1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;✅ k3s Kubernetes cluster (VMs 103, 104, 105) running normally&lt;/li&gt;
&lt;li&gt;✅ All other services (Plex, Pi-hole, Cloudflare, Pulse, Prowlarr, Pendela) unaffected&lt;/li&gt;
&lt;li&gt;⚠️ GPU passthrough into k3s-vm-worker: parked for now, to be re-approached correctly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The GPU passthrough into the Kubernetes worker is still on the roadmap — I'll approach it differently: test in a throwaway VM first, verify IOMMU grouping, confirm stability with manual start, and only then wire it into the k3s node.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Reference — Commands Used
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check GPU device assignment on host&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /dev/dri/
lspci &lt;span class="nt"&gt;-k&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; 3 &lt;span class="nt"&gt;-i&lt;/span&gt; vga

&lt;span class="c"&gt;# Check IOMMU groups (run as root on Proxmox host)&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;d &lt;span class="k"&gt;in&lt;/span&gt; /sys/kernel/iommu_groups/&lt;span class="k"&gt;*&lt;/span&gt;/devices/&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;d&lt;/span&gt;&lt;span class="p"&gt;#*/iommu_groups/*&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;%%/*&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
  &lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'IOMMU Group %s '&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  lspci &lt;span class="nt"&gt;-nns&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;d&lt;/span&gt;&lt;span class="p"&gt;##*/&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# GRUB emergency boot (type this at the end of linux line in GRUB edit)&lt;/span&gt;
systemd.mask&lt;span class="o"&gt;=&lt;/span&gt;pve-guests.service

&lt;span class="c"&gt;# Find stable GPU device path by PCI address&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /dev/dri/by-path/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
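&lt;p&gt;The IOMMU loop above prints every group. When only one device matters, a narrower check can be sketched as below: it reports which group a PCI address belongs to and which other devices share that group. The function names and the example address are assumptions for illustration; the first argument would normally be &lt;code&gt;/sys/kernel/iommu_groups&lt;/code&gt;.&lt;/p&gt;

```shell
# Print the IOMMU group number that contains a PCI address.
# $1 = groups root (normally /sys/kernel/iommu_groups), $2 = PCI address
iommu_group_of() {
  for d in "$1"/*/devices/"$2"; do
    # An unmatched glob stays literal, so skip paths that do not exist
    [ -e "$d" ] || continue
    g=${d%/devices/*}
    printf '%s\n' "${g##*/}"
    return 0
  done
  return 1
}

# Print every other device that shares the group with the given address.
# No output means the device is alone in its group.
iommu_peers() {
  group=$(iommu_group_of "$1" "$2") || return 1
  for d in "$1/$group"/devices/*; do
    peer=${d##*/}
    [ "$peer" = "$2" ] || printf '%s\n' "$peer"
  done
}
```

&lt;p&gt;For example, &lt;code&gt;iommu_peers /sys/kernel/iommu_groups 0000:e6:00.0&lt;/code&gt; lists the devices that would have to be passed through together with the GPU. An empty result suggests the device is isolated in its own group.&lt;/p&gt;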






&lt;p&gt;&lt;em&gt;Running a home lab means breaking things and learning from them. This one taught me more about Proxmox internals, systemd masking, and Linux DRM device enumeration than any documentation would have. Hope it saves someone else the same panic.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Feel free to drop questions in the comments — happy to help if you're stuck in a similar loop.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>linux</category>
      <category>devops</category>
      <category>proxmox</category>
    </item>
    <item>
      <title>The Ultimate Guide to Kubernetes Load Balancers in 2026 (K3s Edition)</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Fri, 15 May 2026 03:30:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/the-ultimate-guide-to-kubernetes-load-balancers-in-2026-k3s-edition-18me</link>
      <guid>https://dev.to/pendelabhargavasai/the-ultimate-guide-to-kubernetes-load-balancers-in-2026-k3s-edition-18me</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — Running K3s on bare metal or edge? This guide dissects every major Kubernetes load balancer — NGINX, Traefik, MetalLB, HAProxy, Envoy, Cilium, Istio, Linkerd, and K3s's own Klipper — across architecture, performance, K3s compatibility, and real-world use cases. Pick the right one for your stack, once and for all.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqqbi4mywkwq9g4om2qzs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqqbi4mywkwq9g4om2qzs.png" alt=" " width="800" height="610"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧭 Why This Guide Exists
&lt;/h2&gt;

&lt;p&gt;Kubernetes load balancers are one of the most confusing corners of the cloud-native ecosystem. Search for "best Kubernetes load balancer" and you'll find a dozen blog posts each recommending something different, often without context. When you throw &lt;strong&gt;K3s&lt;/strong&gt; — the lightweight, single-binary Kubernetes distribution from Rancher — into the mix, the confusion compounds further.&lt;/p&gt;

&lt;p&gt;K3s ships with its own built-in load balancer (Klipper/ServiceLB) and its own ingress controller (Traefik). But is that the right choice for your production workload? What if you need BGP routing, service mesh capabilities, or sub-millisecond latency?&lt;/p&gt;

&lt;p&gt;This guide covers every serious option in the market today, with real benchmarks, architecture diagrams, and clear K3s-specific guidance.&lt;/p&gt;




&lt;h2&gt;
  
  
  🗺️ The Landscape: What Are We Even Comparing?
&lt;/h2&gt;

&lt;p&gt;Before diving in, let's clarify the terminology. "Load balancer" in Kubernetes refers to multiple layers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;th&gt;Example Tools&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;L4 LoadBalancer (IP/TCP)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Assigns external IPs to Services&lt;/td&gt;
&lt;td&gt;MetalLB, Klipper, Kube-VIP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;L7 Ingress Controller&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Routes HTTP/HTTPS traffic by host/path&lt;/td&gt;
&lt;td&gt;NGINX, Traefik, HAProxy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reverse Proxy / Edge Proxy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Advanced traffic shaping, retries, circuit breaking&lt;/td&gt;
&lt;td&gt;Envoy, HAProxy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Service Mesh&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;East-west (pod-to-pod) traffic management + security&lt;/td&gt;
&lt;td&gt;Istio, Linkerd, Cilium&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most real deployments combine tools from multiple layers. For K3s, a typical production stack might be: &lt;strong&gt;MetalLB&lt;/strong&gt; (L4) + &lt;strong&gt;Traefik&lt;/strong&gt; (L7 Ingress) + optionally &lt;strong&gt;Linkerd&lt;/strong&gt; (mesh).&lt;/p&gt;




&lt;h2&gt;
  
  
  🔬 Competitor Deep-Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. 🏠 Klipper ServiceLB (K3s Built-In)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: K3s's embedded load balancer, enabled by default. Uses host ports and iptables rules to forward traffic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;External Traffic
      │
      ▼
[Node HostPort] ──iptables──► [ClusterIP] ──► [Pod]
      ▲
[DaemonSet: svclb-* pods on each node]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;: For each &lt;code&gt;LoadBalancer&lt;/code&gt; Service, Klipper creates a DaemonSet with &lt;code&gt;svclb-&lt;/code&gt; prefixed pods that bind to the host port. The node's external IP (or its internal IP when no external IP is set) is reported as the &lt;code&gt;EXTERNAL-IP&lt;/code&gt;. There is no IP announcement to the network; it simply binds ports.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;K3s-specific note&lt;/strong&gt;: Klipper is enabled by default. To run MetalLB or any other LB controller, you must disable it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# During K3s install&lt;/span&gt;
curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - &lt;span class="nt"&gt;--disable&lt;/span&gt; servicelb

&lt;span class="c"&gt;# Or in K3s config file&lt;/span&gt;
disable:
  - servicelb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Zero config&lt;/td&gt;
&lt;td&gt;✅ Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;True IP announcement&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BGP support&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-node HA&lt;/td&gt;
&lt;td&gt;⚠️ Failover only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production-readiness&lt;/td&gt;
&lt;td&gt;⚠️ Dev/small clusters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource usage&lt;/td&gt;
&lt;td&gt;✅ Minimal&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Local dev, single-node K3s, homelab, quick demos.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. 🟢 NGINX Ingress Controller
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: The most widely deployed Kubernetes Ingress controller, based on the battle-tested NGINX reverse proxy. Two major variants exist: the community-maintained &lt;code&gt;ingress-nginx&lt;/code&gt; (a Kubernetes project) and F5 NGINX's own controller (&lt;code&gt;nginx-ingress&lt;/code&gt;), which is open source and also has a commercial NGINX Plus edition. The two use different annotations and are not interchangeable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Internet
   │
   ▼
[NGINX Pod]
   │  Reads Ingress rules + Annotations
   ├──► /app-a  ──► Service A ──► Pods
   ├──► /app-b  ──► Service B ──► Pods
   └──► /api    ──► Service C ──► Pods
        │
   [ConfigMap / Annotations drive nginx.conf]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Annotation-driven configuration (granular control via &lt;code&gt;nginx.ingress.kubernetes.io/*&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;SSL termination, wildcard certs, HSTS&lt;/li&gt;
&lt;li&gt;Rate limiting, IP allowlisting, custom error pages&lt;/li&gt;
&lt;li&gt;WebSocket support, gRPC proxying&lt;/li&gt;
&lt;li&gt;Prometheus metrics out of the box&lt;/li&gt;
&lt;li&gt;ModSecurity WAF support (community build)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;K3s installation&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First, disable K3s's default Traefik if you want NGINX instead&lt;/span&gt;
curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - &lt;span class="nt"&gt;--disable&lt;/span&gt; traefik

&lt;span class="c"&gt;# Install NGINX Ingress via Helm&lt;/span&gt;
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm &lt;span class="nb"&gt;install &lt;/span&gt;ingress-nginx ingress-nginx/ingress-nginx &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; ingress-nginx &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sample Ingress resource&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/rewrite-target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt;
    &lt;span class="na"&gt;nginx.ingress.kubernetes.io/rate-limit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;100"&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ingressClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp.example.com&lt;/span&gt;
    &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt;
        &lt;span class="na"&gt;pathType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Prefix&lt;/span&gt;
        &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app-svc&lt;/span&gt;
            &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



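&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Figures of roughly 30,000–40,000 RPS per instance are commonly quoted for NGINX in Kubernetes ingress scenarios; treat them as indicative, since results depend heavily on hardware and workload. Endpoint changes (pods scaling up or down) are applied without a reload, but creating or editing an Ingress or its annotations regenerates &lt;code&gt;nginx.conf&lt;/code&gt; and triggers a reload, which can briefly disrupt traffic on busy clusters.&lt;/p&gt;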

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Community &amp;amp; docs&lt;/td&gt;
&lt;td&gt;✅ Massive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Annotation flexibility&lt;/td&gt;
&lt;td&gt;✅ Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto TLS (Let's Encrypt)&lt;/td&gt;
&lt;td&gt;⚠️ Needs cert-manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic config (no reload)&lt;/td&gt;
&lt;td&gt;❌ Requires reload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;✅ Very good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s compatibility&lt;/td&gt;
&lt;td&gt;✅ Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Learning curve&lt;/td&gt;
&lt;td&gt;✅ Low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Teams migrating from traditional NGINX setups, production HTTP/HTTPS workloads, teams needing extensive annotation-based customization.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. 🐹 Traefik (K3s Default)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: A cloud-native reverse proxy and ingress controller written in Go. K3s ships Traefik v2 by default (upgraded to v3 in recent K3s releases). It auto-discovers services via Kubernetes CRDs and annotations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Internet
   │
   ▼
[Traefik Proxy]
   │  Watches: IngressRoutes, Ingress, Services
   │  Providers: Kubernetes CRD, Kubernetes Ingress
   │
   ├─[Routers]──[Middlewares]──[Services]──► Pods
   │     │            │
   │  Host/Path    RateLimit
   │  rules        Auth
   │               Retry
   │
   └─[Dashboard: :8080]  [Metrics: Prometheus]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero-config service discovery&lt;/strong&gt;: create an Ingress or IngressRoute and Traefik picks it up through the Kubernetes API within moments, with no config file reloads&lt;/li&gt;
&lt;li&gt;Automatic Let's Encrypt TLS with ACME challenge support&lt;/li&gt;
&lt;li&gt;Middleware system: auth, rate limiting, headers, circuit breakers, retry&lt;/li&gt;
&lt;li&gt;Native IngressRoute CRDs for full power&lt;/li&gt;
&lt;li&gt;Built-in dashboard and Prometheus metrics&lt;/li&gt;
&lt;li&gt;TCP/UDP routing support (not just HTTP)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;K3s-specific note&lt;/strong&gt;: Traefik is bundled and managed by K3s. To customize it, use a HelmChartConfig:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# /var/lib/rancher/k3s/server/manifests/traefik-config.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;helm.cattle.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HelmChartConfig&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;traefik&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;valuesContent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|-&lt;/span&gt;
    &lt;span class="s"&gt;dashboard:&lt;/span&gt;
      &lt;span class="s"&gt;enabled: true&lt;/span&gt;
    &lt;span class="s"&gt;additionalArguments:&lt;/span&gt;
      &lt;span class="s"&gt;- "--entrypoints.websecure.http.tls"&lt;/span&gt;
    &lt;span class="s"&gt;ports:&lt;/span&gt;
      &lt;span class="s"&gt;web:&lt;/span&gt;
        &lt;span class="s"&gt;redirectTo: websecure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sample IngressRoute&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;traefik.io/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;IngressRoute&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;entryPoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;websecure&lt;/span&gt;
  &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Host(`myapp.example.com`)&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Rule&lt;/span&gt;
    &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app-svc&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
    &lt;span class="na"&gt;middlewares&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rate-limit&lt;/span&gt;
  &lt;span class="na"&gt;tls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;certResolver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;letsencrypt&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



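&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Traefik is quoted at ~19,000 RPS in the benchmark figures used in this guide, lower than NGINX, but with very stable resource consumption. Configuration changes are applied dynamically without a reload, which is a key advantage over NGINX for fast-moving microservices.&lt;/p&gt;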

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;K3s integration&lt;/td&gt;
&lt;td&gt;✅ Native, bundled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto TLS (Let's Encrypt)&lt;/td&gt;
&lt;td&gt;✅ Built-in ACME&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic config (no reload)&lt;/td&gt;
&lt;td&gt;✅ Real-time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;td&gt;✅ Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TCP/UDP routing&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance vs NGINX&lt;/td&gt;
&lt;td&gt;⚠️ Slightly lower RPS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise features&lt;/td&gt;
&lt;td&gt;⚠️ Enterprise version needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: K3s default stack, teams wanting zero-touch TLS, GitOps-friendly pipelines, dev-friendly environments.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. 🔷 MetalLB
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: A bare-metal L4 load balancer for Kubernetes. It gives &lt;code&gt;LoadBalancer&lt;/code&gt; type Services an actual external IP from a pool you define, using either &lt;strong&gt;Layer 2 (ARP)&lt;/strong&gt; or &lt;strong&gt;BGP&lt;/strong&gt; protocols.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture (Layer 2 mode)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;External Network
      │
      │  ARP: "Who has 192.168.1.100?" → Leader Node replies
      ▼
[Leader Node] ──► kube-proxy ──► Service Pods (all nodes)
      │
[MetalLB Speaker DaemonSet] on every node
[MetalLB Controller] handles IP assignment
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Architecture (BGP mode)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Router/Switch]
      │  BGP peering
      ▼
[MetalLB Speaker] on each K3s node
      │  Announces /32 routes per service IP
      ▼
[Direct packet routing to node]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;K3s installation&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: Disable Klipper&lt;/span&gt;
curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - &lt;span class="nt"&gt;--disable&lt;/span&gt; servicelb

&lt;span class="c"&gt;# Step 2: Install MetalLB&lt;/span&gt;
helm repo add metallb https://metallb.github.io/metallb
helm &lt;span class="nb"&gt;install &lt;/span&gt;metallb metallb/metallb &lt;span class="nt"&gt;-n&lt;/span&gt; metallb-system &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;

&lt;span class="c"&gt;# Step 3: Configure IP pool&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: k3s-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.200-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: k3s-l2
  namespace: metallb-system
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
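&lt;p&gt;Before applying a pool like the one above, it can help to confirm that no node, router, or DHCP-assigned address falls inside the range, because MetalLB will answer ARP for every IP it hands out. The sketch below is one way to check a single IPv4 address against a &lt;code&gt;start-end&lt;/code&gt; range; the function names are assumptions for illustration and are not part of MetalLB.&lt;/p&gt;

```shell
# Convert a dotted IPv4 address to a single integer for comparison.
ip_to_int() {
  # Split the four octets into positional parameters
  set -- $(printf '%s' "$1" | tr '.' ' ')
  printf '%s\n' $(( (($1 * 256 + $2) * 256 + $3) * 256 + $4 ))
}

# Succeed when an address lies inside a "start-end" range.
# $1 = address to test, $2 = range such as 192.168.1.200-192.168.1.220
ip_in_pool() {
  ip=$(ip_to_int "$1")
  lo=$(ip_to_int "${2%-*}")
  hi=$(ip_to_int "${2#*-}")
  [ "$ip" -ge "$lo" ] || return 1
  [ "$ip" -le "$hi" ]
}
```

&lt;p&gt;A call such as &lt;code&gt;ip_in_pool 192.168.1.10 192.168.1.200-192.168.1.220&lt;/code&gt; returns non-zero, meaning that node address is safely outside the pool. It only handles the range notation used here, not CIDR pools.&lt;/p&gt;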



&lt;p&gt;&lt;strong&gt;Important caveat&lt;/strong&gt;: In L2 mode, MetalLB doesn't truly load-balance at L4 — it elects a leader node that handles ARP for a given IP, and kube-proxy does the actual pod distribution. It's more of a &lt;strong&gt;failover mechanism&lt;/strong&gt; than a true LB. BGP mode provides real per-node distribution but requires BGP-capable routers.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bare-metal IP assignment&lt;/td&gt;
&lt;td&gt;✅ Core purpose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BGP mode&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layer 2 mode&lt;/td&gt;
&lt;td&gt;✅ Yes (ARP/NDP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;True L4 load balancing&lt;/td&gt;
&lt;td&gt;⚠️ BGP only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s compatibility&lt;/td&gt;
&lt;td&gt;✅ Excellent (disable Klipper first)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource usage&lt;/td&gt;
&lt;td&gt;✅ Very lightweight&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Requires routers&lt;/td&gt;
&lt;td&gt;⚠️ BGP mode does&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Bare-metal K3s clusters that need proper external IPs, homelab with a VLAN IP pool, edge deployments without cloud LB.&lt;/p&gt;




&lt;h3&gt;
  
  
  5. ⚡ HAProxy Ingress Controller
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: The Kubernetes ingress controller backed by HAProxy — historically the gold standard for raw TCP/HTTP load balancing performance. HAProxy Technologies' own benchmarks show their ingress controller handling &lt;strong&gt;42,000 RPS&lt;/strong&gt; with the lowest CPU among all competitors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Internet
   │
   ▼
[HAProxy Pod]
   │  Config generated from Ingress/CRDs by controller
   │
   ├─[Frontend: bind *:80]
   │       │
   │  [ACL rules: path_beg, hdr_dom]
   │       │
   └─[Backend pools] ──► Pod endpoints (health-checked)
         │
   [Stats: :1936]  [Prometheus metrics]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Best-in-class raw throughput and lowest latency at scale&lt;/li&gt;
&lt;li&gt;Native support for HTTP/3, QUIC, gRPC&lt;/li&gt;
&lt;li&gt;Fine-grained connection control (timeouts, retries, stick tables)&lt;/li&gt;
&lt;li&gt;Advanced Layer 7 routing: headers, cookies, ACLs&lt;/li&gt;
&lt;li&gt;TCP mode for non-HTTP workloads&lt;/li&gt;
&lt;li&gt;Gateway API support (HAProxy Ingress Controller v3.1+)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;K3s installation&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add haproxytech https://haproxytech.github.io/helm-charts
helm &lt;span class="nb"&gt;install &lt;/span&gt;haproxy-ingress haproxytech/kubernetes-ingress &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; haproxy-controller &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; controller.service.type&lt;span class="o"&gt;=&lt;/span&gt;LoadBalancer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Performance edge&lt;/strong&gt;: In HAProxy Technologies' published benchmark against NGINX, Traefik, and Envoy (vendor-run, so treat the figures as indicative rather than definitive):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HAProxy&lt;/strong&gt;: 42,000 RPS, 50% CPU&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NGINX&lt;/strong&gt;: ~35,000 RPS, ~65% CPU
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traefik&lt;/strong&gt;: ~19,000 RPS, ~45% CPU (more consistent)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Envoy&lt;/strong&gt;: ~38,000 RPS, 73% CPU&lt;/li&gt;
&lt;/ul&gt;
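&lt;p&gt;One rough way to compare these figures is throughput per percentage point of CPU. The helper below is only a back-of-the-envelope normalisation of the numbers listed above; the function name is an assumption, and it treats CPU cost as linear, which real proxies are not.&lt;/p&gt;

```shell
# Requests per second delivered per percentage point of CPU.
# $1 = requests per second, $2 = CPU utilisation in percent
rps_per_cpu() {
  awk -v rps="$1" -v cpu="$2" 'BEGIN { printf "%.0f\n", rps / cpu }'
}

rps_per_cpu 42000 50   # HAProxy
rps_per_cpu 35000 65   # NGINX
rps_per_cpu 19000 45   # Traefik
rps_per_cpu 38000 73   # Envoy
```

&lt;p&gt;With the quoted numbers this works out to roughly 840 for HAProxy, 538 for NGINX, 422 for Traefik, and 521 for Envoy, which matches the ordering the benchmark claims while showing that the gap between NGINX and Envoy is small.&lt;/p&gt;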

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Raw throughput&lt;/td&gt;
&lt;td&gt;✅ Best-in-class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP/3 &amp;amp; gRPC&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Advanced ACLs&lt;/td&gt;
&lt;td&gt;✅ Very powerful&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto TLS&lt;/td&gt;
&lt;td&gt;⚠️ Needs cert-manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic config&lt;/td&gt;
&lt;td&gt;✅ v2.4+ hitless reload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s compatibility&lt;/td&gt;
&lt;td&gt;✅ Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complexity&lt;/td&gt;
&lt;td&gt;⚠️ Steeper learning curve&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: High-throughput production clusters, financial services, teams needing ultra-low p99 latency, TCP-heavy workloads.&lt;/p&gt;




&lt;h3&gt;
  
  
  6. 🌊 Envoy Proxy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: Originally built at Lyft, Envoy is a high-performance C++ proxy that has become the &lt;strong&gt;de facto data plane&lt;/strong&gt; of the cloud-native ecosystem. It powers Istio, Consul Connect, and AWS App Mesh, and it underpins several Kubernetes Gateway API implementations, including Envoy Gateway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[xDS Control Plane] (e.g., Istio's istiod)
       │  gRPC streaming: LDS, RDS, CDS, EDS
       ▼
[Envoy Proxy Instance]
   │
   ├─ Listeners (ports/protocols)
   │       │
   │  Filter Chains (HTTP, TCP, gRPC filters)
   │       │
   └─ Clusters (upstream endpoints)
         │
      [Circuit Breaker] [Retry] [Outlier Detection]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dynamic configuration via xDS API (zero-downtime updates)&lt;/li&gt;
&lt;li&gt;Built-in circuit breaking, retries, outlier detection&lt;/li&gt;
&lt;li&gt;Excellent observability: detailed stats, tracing (Zipkin/Jaeger/OTLP), access logs&lt;/li&gt;
&lt;li&gt;gRPC-first with HTTP/1.1 and HTTP/2 support&lt;/li&gt;
&lt;li&gt;Mutual TLS (mTLS) between services&lt;/li&gt;
&lt;li&gt;WebAssembly (Wasm) plugin extensibility&lt;/li&gt;
&lt;li&gt;Rate limiting via external services (Ratelimit service)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Standalone on K3s&lt;/strong&gt; (without Istio):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Envoy Gateway — standalone Gateway API implementation&lt;/span&gt;
helm &lt;span class="nb"&gt;install &lt;/span&gt;eg oci://docker.io/envoyproxy/gateway-helm &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--version&lt;/span&gt; v1.2.0 &lt;span class="nt"&gt;-n&lt;/span&gt; envoy-gateway-system &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;: In the benchmark figures quoted above, Envoy reaches ~38,000 RPS. Its main strength is handling dynamic service churn: xDS pushes endpoint changes without a reload, so latency stays stable (reported as sub-10ms) while pods scale up and down. That makes it a good fit for large microservice fleets that scale frequently.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic config (xDS)&lt;/td&gt;
&lt;td&gt;✅ Best-in-class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;✅ Exceptional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gRPC support&lt;/td&gt;
&lt;td&gt;✅ Native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Circuit breaking&lt;/td&gt;
&lt;td&gt;✅ Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wasm extensibility&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standalone complexity&lt;/td&gt;
&lt;td&gt;⚠️ High (needs control plane)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s standalone use&lt;/td&gt;
&lt;td&gt;⚠️ Via Envoy Gateway&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Microservices architectures with dynamic service discovery, service mesh data plane, teams that need xDS-compatible control plane integration.&lt;/p&gt;




&lt;h3&gt;
  
  
  7. 🕸️ Istio (Service Mesh)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: The most feature-complete service mesh for Kubernetes. Istio injects Envoy sidecars into every pod and manages the entire service-to-service communication layer via a centralized control plane (&lt;code&gt;istiod&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[istiod - Control Plane] (one binary since Istio 1.5; roles below)
   ├── Pilot (traffic management)
   ├── Citadel (certificate authority)
   └── Galley (config validation)
         │  xDS API
         ▼
[Pod A]                    [Pod B]
  App Container              App Container
  Envoy Sidecar ◄──mTLS──► Envoy Sidecar
  (intercepts all traffic)   (intercepts all traffic)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Istio Ambient Mode&lt;/strong&gt; (generally available since Istio 1.24): The sidecar-free mode uses a per-node "ztunnel" proxy for L4 (mTLS and identity) plus optional waypoint proxies for L7 policy. Removing the per-pod sidecar cuts the memory cost described below and takes the sidecar hops out of the request path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fine-grained traffic management: canary, A/B, weighted routing, fault injection&lt;/li&gt;
&lt;li&gt;Automatic mTLS between all services&lt;/li&gt;
&lt;li&gt;Authorization policies at L7 (RBAC per HTTP path/method)&lt;/li&gt;
&lt;li&gt;Distributed tracing, Kiali topology visualization&lt;/li&gt;
&lt;li&gt;Multi-cluster and VM support&lt;/li&gt;
&lt;li&gt;Gateway API support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;K3s resource requirements&lt;/strong&gt; (important!):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;istiod: ~500MB RAM&lt;/li&gt;
&lt;li&gt;Per-pod Envoy sidecar: ~50MB RAM each&lt;/li&gt;
&lt;li&gt;At 500 meshed pods: &lt;strong&gt;about 25GB for Envoy sidecars alone&lt;/strong&gt; (500 × ~50MB, more under load), versus roughly 5GB for Linkerd's ~10MB proxies; plan accordingly
&lt;/li&gt;
&lt;/ul&gt;
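&lt;p&gt;The sidecar budget is a simple product, so it is worth working out for your own pod count. The sketch below uses the per-proxy figures quoted in this article (~50MB per Envoy sidecar, ~10MB per linkerd2-proxy) and an assumed count of 500 meshed pods. The helper name &lt;code&gt;sidecar_ram_mb&lt;/code&gt; is hypothetical, and the totals are a rough lower bound because proxy memory grows with configuration size and traffic.&lt;/p&gt;

```shell
# Hypothetical helper: total proxy memory in MB.
# Usage: sidecar_ram_mb PODS PER_PROXY_MB
sidecar_ram_mb() {
  echo $(( $1 * $2 ))
}

pods=500                                   # assumed number of meshed pods
istio_mb=$(sidecar_ram_mb "$pods" 50)      # ~50MB per Envoy sidecar (figure from this article)
linkerd_mb=$(sidecar_ram_mb "$pods" 10)    # ~10MB per linkerd2-proxy (figure from this article)

echo "Istio sidecars:  ${istio_mb} MB"
echo "Linkerd proxies: ${linkerd_mb} MB"
echo "Difference:      $(( istio_mb - linkerd_mb )) MB"
```

&lt;p&gt;With these inputs the sidecars account for about 25GB under Istio and 5GB under Linkerd, before counting istiod itself.&lt;/p&gt;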

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Istio on K3s&lt;/span&gt;
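curl &lt;span class="nt"&gt;-L&lt;/span&gt; https://istio.io/downloadIstio | sh -
&lt;span class="c"&gt;# The script unpacks into ./istio-VERSION; add its bin/ directory to PATH before running istioctl&lt;/span&gt;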
istioctl &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;profile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;minimal &lt;span class="nt"&gt;-y&lt;/span&gt;
kubectl label namespace default istio-injection&lt;span class="o"&gt;=&lt;/span&gt;enabled
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Traffic management&lt;/td&gt;
&lt;td&gt;✅ Most advanced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mTLS&lt;/td&gt;
&lt;td&gt;✅ Automatic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;✅ Full stack (Kiali, Jaeger)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authorization policies&lt;/td&gt;
&lt;td&gt;✅ L7 RBAC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource usage&lt;/td&gt;
&lt;td&gt;❌ Heavy (per-pod sidecar)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complexity&lt;/td&gt;
&lt;td&gt;❌ High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s (small cluster)&lt;/td&gt;
&lt;td&gt;⚠️ Feasible, watch RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Enterprise Kubernetes, SOC 2/PCI-DSS compliance requirements, teams needing canary deployments and fault injection, hybrid VM+K8s environments.&lt;/p&gt;




&lt;h3&gt;
  
  
  8. 🔗 Linkerd (Service Mesh)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: The original service mesh (coined the term in 2016). Linkerd uses a Rust-based "microproxy" instead of Envoy — dramatically lighter weight, making it the &lt;strong&gt;fastest and most resource-efficient&lt;/strong&gt; service mesh available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Linkerd Control Plane]
  ├── destination (service discovery)
  ├── identity (certificate authority)
  └── proxy-injector (sidecar injection)
         │
[Pod A]                    [Pod B]
  App Container              App Container
  linkerd2-proxy ◄──mTLS──► linkerd2-proxy
  (Rust, ~10MB RAM each)     (tiny overhead!)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Performance benchmarks&lt;/strong&gt; (vs other meshes):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Linkerd: ~5–10% slower than baseline (no mesh) — &lt;strong&gt;best among all meshes&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Istio: ~25–35% slower than baseline&lt;/li&gt;
&lt;li&gt;Cilium Mesh: ~20–30% slower than baseline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic mTLS (on by default, zero config)&lt;/li&gt;
&lt;li&gt;Golden metrics dashboard (success rate, request rate, latency) via the &lt;code&gt;viz&lt;/code&gt; extension&lt;/li&gt;
&lt;li&gt;Per-route metrics&lt;/li&gt;
&lt;li&gt;Traffic splitting (canary, A/B)&lt;/li&gt;
&lt;li&gt;Multi-cluster support&lt;/li&gt;
&lt;li&gt;FIPS-compliant builds available&lt;/li&gt;
&lt;li&gt;Graduated CNCF project (since 2021)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;K3s installation&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
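&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Linkerd CLI and put it on PATH (the installer places it in ~/.linkerd2/bin)&lt;/span&gt;
curl &lt;span class="nt"&gt;--proto&lt;/span&gt; &lt;span class="s1"&gt;'=https'&lt;/span&gt; &lt;span class="nt"&gt;--tlsv1&lt;/span&gt;.2 &lt;span class="nt"&gt;-sSfL&lt;/span&gt; https://run.linkerd.io/install | sh
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;/.linkerd2/bin:&lt;span class="nv"&gt;$PATH&lt;/span&gt;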

&lt;span class="c"&gt;# Pre-flight check&lt;/span&gt;
linkerd check &lt;span class="nt"&gt;--pre&lt;/span&gt;

&lt;span class="c"&gt;# Install on K3s&lt;/span&gt;
linkerd &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--crds&lt;/span&gt; | kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; -
linkerd &lt;span class="nb"&gt;install&lt;/span&gt; | kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; -
linkerd check

&lt;span class="c"&gt;# Inject into a namespace&lt;/span&gt;
kubectl annotate namespace default linkerd.io/inject&lt;span class="o"&gt;=&lt;/span&gt;enabled
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Resource efficiency&lt;/td&gt;
&lt;td&gt;✅ Best among meshes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance overhead&lt;/td&gt;
&lt;td&gt;✅ Minimal (5–10%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mTLS&lt;/td&gt;
&lt;td&gt;✅ Auto, zero-config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Simplicity&lt;/td&gt;
&lt;td&gt;✅ Easiest mesh&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;td&gt;✅ Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Advanced traffic routing&lt;/td&gt;
&lt;td&gt;⚠️ Less than Istio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s compatibility&lt;/td&gt;
&lt;td&gt;✅ Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: Teams wanting mesh capabilities without Istio's complexity, K3s clusters with limited RAM, security-first teams, anyone who wants to "just turn it on and have it work."&lt;/p&gt;




&lt;h3&gt;
  
  
  9. 🧬 Cilium (eBPF-based CNI + Service Mesh)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is&lt;/strong&gt;: Cilium is fundamentally different from all others — it operates at the &lt;strong&gt;Linux kernel level using eBPF&lt;/strong&gt; (extended Berkeley Packet Filter), replacing traditional iptables networking entirely. It serves as both a CNI (network plugin) and optionally a service mesh.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Cilium Operator] + [Cilium Agent DaemonSet]
         │  Programs eBPF maps
         ▼
[Linux Kernel - eBPF programs]
   ├── XDP (eXpress Data Path): packet filtering at NIC level
   ├── TC (Traffic Control): L3/L4 policy enforcement
   └── Socket: L7 visibility (HTTP, gRPC, Kafka, DNS)
         │
[Hubble Observability Layer]
   ├── hubble-relay
   └── hubble-ui (real-time network flow visualization)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;eBPF-powered networking&lt;/strong&gt;: skips iptables rule traversal in the kernel, giving near line-rate L4 forwarding&lt;/li&gt;
&lt;li&gt;No iptables — replaces kube-proxy entirely&lt;/li&gt;
&lt;li&gt;Deep observability via Hubble (DNS, HTTP, gRPC, Kafka at kernel level)&lt;/li&gt;
&lt;li&gt;Network policies at L3/L4/L7 in a single CRD&lt;/li&gt;
&lt;li&gt;WireGuard/IPsec transparent encryption&lt;/li&gt;
&lt;li&gt;Service mesh in &lt;strong&gt;per-node Envoy&lt;/strong&gt; model (not sidecar-per-pod)&lt;/li&gt;
&lt;li&gt;Excellent for multi-cluster with Cluster Mesh&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;K3s installation&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Disable K3s's default flannel and network policy controller (Cilium replaces them);&lt;/span&gt;
&lt;span class="c"&gt;# kube-proxy is disabled too because kubeProxyReplacement=true is set below&lt;/span&gt;
curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--flannel-backend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;none &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable-network-policy&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable-kube-proxy&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable&lt;/span&gt; servicelb

&lt;span class="c"&gt;# Install Cilium&lt;/span&gt;
helm repo add cilium https://helm.cilium.io/
helm &lt;span class="nb"&gt;install &lt;/span&gt;cilium cilium/cilium &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; kube-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; operator.replicas&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;kubeProxyReplacement&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;k8sServiceHost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;K3S_SERVER_IP&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;k8sServicePort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;6443

&lt;span class="c"&gt;# Enable Hubble&lt;/span&gt;
cilium hubble &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--ui&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;L4 performance&lt;/strong&gt;: Cilium's eBPF datapath is unrivaled for L4 (TCP/UDP) — limited only by hardware NIC speed. For L7 (HTTP), it offloads to per-node Envoy, which introduces some trade-offs vs. per-pod sidecar isolation.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L4 throughput&lt;/td&gt;
&lt;td&gt;✅ Best (eBPF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network observability&lt;/td&gt;
&lt;td&gt;✅ Exceptional (Hubble)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No iptables&lt;/td&gt;
&lt;td&gt;✅ kube-proxy replacement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network policies&lt;/td&gt;
&lt;td&gt;✅ L3/L4/L7 unified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service mesh&lt;/td&gt;
&lt;td&gt;⚠️ Per-node (not per-pod)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complexity&lt;/td&gt;
&lt;td&gt;⚠️ eBPF expertise needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s integration&lt;/td&gt;
&lt;td&gt;✅ Good (replaces flannel)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best for&lt;/strong&gt;: High-performance bare-metal clusters, security-intensive environments, teams already investing in eBPF, multi-cluster deployments with Cluster Mesh.&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 The Big Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;OSI Layer&lt;/th&gt;
&lt;th&gt;K3s Default&lt;/th&gt;
&lt;th&gt;Auto TLS&lt;/th&gt;
&lt;th&gt;Performance&lt;/th&gt;
&lt;th&gt;Resource Usage&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Klipper/ServiceLB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;L4 LB&lt;/td&gt;
&lt;td&gt;L4&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NGINX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ingress&lt;/td&gt;
&lt;td&gt;L7&lt;/td&gt;
&lt;td&gt;❌ (disable Traefik first)&lt;/td&gt;
&lt;td&gt;⚠️ (cert-manager)&lt;/td&gt;
&lt;td&gt;Very High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Traefik&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ingress&lt;/td&gt;
&lt;td&gt;L7&lt;/td&gt;
&lt;td&gt;✅ Yes (bundled)&lt;/td&gt;
&lt;td&gt;✅ Built-in&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MetalLB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;L4 LB&lt;/td&gt;
&lt;td&gt;L4&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HAProxy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ingress&lt;/td&gt;
&lt;td&gt;L4+L7&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;⚠️ (cert-manager)&lt;/td&gt;
&lt;td&gt;Highest&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Envoy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Proxy/Mesh DP&lt;/td&gt;
&lt;td&gt;L4+L7&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ (with CP)&lt;/td&gt;
&lt;td&gt;Very High&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Istio&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Service Mesh&lt;/td&gt;
&lt;td&gt;L4+L7&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ Auto mTLS&lt;/td&gt;
&lt;td&gt;Medium (overhead)&lt;/td&gt;
&lt;td&gt;Very High&lt;/td&gt;
&lt;td&gt;Very High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Linkerd&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Service Mesh&lt;/td&gt;
&lt;td&gt;L4+L7&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ Auto mTLS&lt;/td&gt;
&lt;td&gt;High (least overhead)&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cilium&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CNI+Mesh&lt;/td&gt;
&lt;td&gt;L3+L4+L7&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ (WireGuard)&lt;/td&gt;
&lt;td&gt;Highest L4&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🏗️ Architecture Patterns for K3s
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Pattern 1: Minimal (Single Node / Homelab)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[K3s: Traefik + Klipper built-in]
   │
   └── Just works. Zero extra config needed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use when: Local dev, single-node homelab, learning Kubernetes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 2: Bare-Metal Production (Most Common)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[MetalLB] ──► External IP ──► [Traefik] ──► [Your Services]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use when: Multiple K3s nodes, need proper external IPs, keep Traefik for simplicity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 3: High-Performance Production
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[MetalLB] ──► External IP ──► [HAProxy Ingress] ──► [Services]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use when: High RPS requirements, latency-sensitive APIs, financial/gaming workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 4: Secure Microservices (Security-First)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[MetalLB] ──► [NGINX/Traefik] ──► [Linkerd Mesh] ──► [Services]
                                      (mTLS, observability)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use when: Multi-service architecture, compliance requirements, need service-to-service encryption.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 5: Maximum Performance + Security (Advanced)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Cilium CNI + kube-proxy replacement]
   └──► [Cilium Ingress / Envoy Gateway] ──► [Services]
        + Hubble for observability
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use when: eBPF expertise available, need kernel-level performance, security-intensive platform.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏎️ Performance Benchmarks at a Glance
&lt;/h2&gt;

&lt;p&gt;Based on published benchmarks and production data (2024–2026):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Requests per Second (RPS) at typical K8s ingress workload:

HAProxy    ████████████████████████████  42,000 RPS  (50% CPU)
Envoy      ███████████████████████████   38,000 RPS  (73% CPU)
NGINX      ██████████████████████████    35,000 RPS  (65% CPU)
Traefik    █████████████                 19,000 RPS  (45% CPU)

Service Mesh Overhead (vs no mesh):
Linkerd    ██  5–10% slower   ← Best
Cilium     ████  20–30% slower
Istio      █████  25–35% slower

L4 Raw Throughput:
Cilium (eBPF)  ████████████████████  Hardware-limited ← Best
MetalLB (BGP)  ██████████████████    Near line-rate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎯 Decision Framework: Which One for Your K3s Cluster?
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;START HERE
    │
    ▼
Are you running a single node / homelab?
  YES ──► Use Klipper + Traefik (K3s defaults). You're done.
  NO
    │
    ▼
Do you need external IPs on bare metal?
  YES ──► Add MetalLB (disable Klipper first)
  NO (cloud) ──► Your cloud CCM handles this
    │
    ▼
Replace default Traefik ingress?
  Need max performance ──► HAProxy Ingress
  Need NGINX ecosystem ──► NGINX Ingress
  Happy with defaults   ──► Keep Traefik
    │
    ▼
Do you have multiple microservices needing service-to-service security?
  YES, want simplicity ──► Add Linkerd
  YES, need full features ──► Add Istio (check your RAM budget!)
  YES, eBPF expertise ──► Use Cilium as CNI + mesh
  NO ──► Skip the mesh for now
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
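&lt;p&gt;The flow above can also be read as a small function. The sketch below is one possible encoding. The function name, the argument order, and the keywords (&lt;code&gt;performance&lt;/code&gt;, &lt;code&gt;nginx&lt;/code&gt;, &lt;code&gt;simple&lt;/code&gt;, &lt;code&gt;full&lt;/code&gt;, &lt;code&gt;ebpf&lt;/code&gt;) are assumptions of this example and not part of K3s or any other tool.&lt;/p&gt;

```shell
# Hypothetical helper that walks the decision flow top to bottom.
# Usage: recommend_stack SINGLE_NODE BARE_METAL INGRESS_NEED MESH_NEED
#   SINGLE_NODE, BARE_METAL: yes or no
#   INGRESS_NEED: performance, nginx, or anything else for the default
#   MESH_NEED: simple, full, ebpf, or anything else for no mesh
recommend_stack() {
  single_node="$1"
  bare_metal="$2"
  ingress_need="$3"
  mesh_need="$4"

  # Step 1: single node or homelab keeps the K3s defaults
  if [ "$single_node" = "yes" ]; then
    echo "Klipper + Traefik"
    return 0
  fi

  # Step 2: external IPs on bare metal need MetalLB
  if [ "$bare_metal" = "yes" ]; then
    lb="MetalLB"
  else
    lb="cloud load balancer"
  fi

  # Step 3: pick the ingress controller
  case "$ingress_need" in
    performance) ingress="HAProxy Ingress" ;;
    nginx)       ingress="NGINX Ingress" ;;
    *)           ingress="Traefik" ;;
  esac

  # Step 4: optionally add a mesh
  case "$mesh_need" in
    simple) mesh=" + Linkerd" ;;
    full)   mesh=" + Istio" ;;
    ebpf)   mesh=" + Cilium" ;;
    *)      mesh="" ;;
  esac

  echo "$lb + $ingress$mesh"
}

recommend_stack no yes default simple
```

&lt;p&gt;The final call models a multi-node bare-metal cluster that keeps the default ingress and wants a simple mesh, and it prints &lt;code&gt;MetalLB + Traefik + Linkerd&lt;/code&gt;.&lt;/p&gt;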






&lt;h2&gt;
  
  
  🔧 K3s-Specific Tips &amp;amp; Gotchas
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Traefik version&lt;/strong&gt;: K3s bundles Traefik. Pin the version in your HelmChartConfig if stability matters.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;MetalLB + Traefik&lt;/strong&gt;: A very common combo. MetalLB gives Traefik a real external IP. After MetalLB assigns an IP, Traefik's LoadBalancer service gets &lt;code&gt;EXTERNAL-IP&lt;/code&gt; populated and starts serving traffic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cilium on K3s&lt;/strong&gt;: You must disable flannel (&lt;code&gt;--flannel-backend=none&lt;/code&gt;) and network policy (&lt;code&gt;--disable-network-policy&lt;/code&gt;). Cilium replaces both. If you also want to replace kube-proxy, add &lt;code&gt;--disable-kube-proxy&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Linkerd on K3s&lt;/strong&gt;: Works out of the box. K3s's bundled components (Traefik, CoreDNS) can be meshed too — annotate the &lt;code&gt;kube-system&lt;/code&gt; namespace carefully.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Resource planning&lt;/strong&gt;: A 3-node K3s cluster with Linkerd can run comfortably on 3× Raspberry Pi 4 (4GB). Istio needs significantly more — budget at least 8GB per node.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Gateway API&lt;/strong&gt;: The Kubernetes Gateway API is the successor to Ingress. The Ingress API is frozen, so new routing features land in Gateway API only. Traefik v3, HAProxy v3.1+, Envoy Gateway, and Cilium all support it. Consider Gateway API for new deployments.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  🏁 Final Recommendations
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your Situation&lt;/th&gt;
&lt;th&gt;Recommended Stack&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Homelab / learning&lt;/td&gt;
&lt;td&gt;K3s defaults (Traefik + Klipper)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bare-metal small team&lt;/td&gt;
&lt;td&gt;MetalLB + Traefik&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bare-metal high traffic&lt;/td&gt;
&lt;td&gt;MetalLB + HAProxy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NGINX ecosystem familiarity&lt;/td&gt;
&lt;td&gt;MetalLB + NGINX Ingress&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need service mesh (simple)&lt;/td&gt;
&lt;td&gt;MetalLB + Traefik + Linkerd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need service mesh (full features)&lt;/td&gt;
&lt;td&gt;MetalLB + Traefik + Istio (Ambient mode)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max performance + security&lt;/td&gt;
&lt;td&gt;Cilium CNI + Envoy Gateway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge/IoT K3s&lt;/td&gt;
&lt;td&gt;Klipper + Traefik (minimal resources)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  📚 Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.k3s.io/networking/networking-services" rel="noopener noreferrer"&gt;K3s Networking Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://documentation.suse.com/suse-edge" rel="noopener noreferrer"&gt;MetalLB on K3s (SUSE Edge)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://doc.traefik.io/traefik/providers/kubernetes-crd/" rel="noopener noreferrer"&gt;Traefik K3s Configuration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://linkerd.io/2.15/getting-started/" rel="noopener noreferrer"&gt;Linkerd Getting Started&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cilium.io/en/stable/installation/k3s/" rel="noopener noreferrer"&gt;Cilium K3s Setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.haproxy.com/documentation/kubernetes-ingress/" rel="noopener noreferrer"&gt;HAProxy Kubernetes Ingress&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gateway-api.sigs.k8s.io/" rel="noopener noreferrer"&gt;Kubernetes Gateway API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Have questions about your specific K3s setup? Drop them in the comments. Running an unusual configuration (Raspberry Pi cluster, edge IoT, air-gapped)? I'd love to hear about it.&lt;/em&gt;&lt;/p&gt;


</description>
      <category>devops</category>
      <category>infrastructure</category>
      <category>kubernetes</category>
      <category>networking</category>
    </item>
    <item>
      <title>Kubernetes CNI Complete Guide: Flannel vs Cilium vs Calico + Cloud Provider CNIs</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Tue, 12 May 2026 03:30:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/kubernetes-cni-complete-guide-flannel-vs-cilium-vs-calico-cloud-provider-cnis-5c6c</link>
      <guid>https://dev.to/pendelabhargavasai/kubernetes-cni-complete-guide-flannel-vs-cilium-vs-calico-cloud-provider-cnis-5c6c</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwiu2b1dngo2gmr0srlif.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwiu2b1dngo2gmr0srlif.png" alt=" " width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;K3s&lt;/strong&gt; v1.29+ &amp;nbsp;|&amp;nbsp; &lt;strong&gt;Flannel&lt;/strong&gt; v0.24+ &amp;nbsp;|&amp;nbsp; &lt;strong&gt;Cilium&lt;/strong&gt; v1.15+ &amp;nbsp;|&amp;nbsp; &lt;strong&gt;Calico&lt;/strong&gt; v3.27+ &amp;nbsp;|&amp;nbsp; &lt;strong&gt;AWS VPC CNI&lt;/strong&gt; v1.18+ &amp;nbsp;|&amp;nbsp; &lt;strong&gt;Azure CNI&lt;/strong&gt; v1.5+ &amp;nbsp;|&amp;nbsp; &lt;strong&gt;GKE Dataplane V2&lt;/strong&gt; (Cilium-based)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A comparison of the major Kubernetes CNIs, covering open-source plugins (Flannel, Calico, Cilium, Weave, Antrea, Multus) and cloud-managed defaults (AWS VPC CNI on EKS, Azure CNI on AKS, and Dataplane V2 on GKE), across architecture, performance, network policy, observability, encryption, and when to choose each.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CNI&lt;/th&gt;
&lt;th&gt;Identity&lt;/th&gt;
&lt;th&gt;Core Approach&lt;/th&gt;
&lt;th&gt;Default On&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🟢 &lt;strong&gt;Flannel&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Simple Overlay&lt;/td&gt;
&lt;td&gt;VXLAN tunnel, zero policy&lt;/td&gt;
&lt;td&gt;K3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟠 &lt;strong&gt;Calico&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Policy Powerhouse&lt;/td&gt;
&lt;td&gt;BGP routing, iptables/eBPF&lt;/td&gt;
&lt;td&gt;Self-managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔵 &lt;strong&gt;Cilium&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;eBPF Native&lt;/td&gt;
&lt;td&gt;Kernel eBPF, replaces kube-proxy&lt;/td&gt;
&lt;td&gt;GKE (Dataplane V2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 &lt;strong&gt;Weave Net&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Mesh Overlay&lt;/td&gt;
&lt;td&gt;Gossip-based mesh routing&lt;/td&gt;
&lt;td&gt;Self-managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟣 &lt;strong&gt;Antrea&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;VMware-backed&lt;/td&gt;
&lt;td&gt;OVS dataplane, Antrea policies&lt;/td&gt;
&lt;td&gt;Self-managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔶 &lt;strong&gt;AWS VPC CNI&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Cloud-native&lt;/td&gt;
&lt;td&gt;Native VPC IP assignment&lt;/td&gt;
&lt;td&gt;EKS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔷 &lt;strong&gt;Azure CNI&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Cloud-native&lt;/td&gt;
&lt;td&gt;Azure VNET IP assignment&lt;/td&gt;
&lt;td&gt;AKS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;♦️ &lt;strong&gt;GKE CNI / Dataplane V2&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Cloud-native + eBPF&lt;/td&gt;
&lt;td&gt;Cilium-based eBPF on GKE&lt;/td&gt;
&lt;td&gt;GKE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What Is a CNI?&lt;/li&gt;
&lt;li&gt;
Open Source CNIs

&lt;ul&gt;
&lt;li&gt;2.1 Flannel — Simple Overlay
&lt;/li&gt;
&lt;li&gt;2.2 Cilium — eBPF Native
&lt;/li&gt;
&lt;li&gt;2.3 Calico — BGP + Flexible Dataplane
&lt;/li&gt;
&lt;li&gt;2.4 Weave Net — Mesh Overlay
&lt;/li&gt;
&lt;li&gt;2.5 Antrea — OVS-based CNI
&lt;/li&gt;
&lt;li&gt;2.6 Multus — Meta CNI
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
Cloud Provider CNIs

&lt;ul&gt;
&lt;li&gt;3.1 AWS VPC CNI — EKS Default
&lt;/li&gt;
&lt;li&gt;3.2 Azure CNI — AKS Default
&lt;/li&gt;
&lt;li&gt;3.3 GKE Dataplane V2 — GKE Default
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Data Plane Comparison&lt;/li&gt;
&lt;li&gt;Network Policy&lt;/li&gt;
&lt;li&gt;Observability&lt;/li&gt;
&lt;li&gt;Performance Benchmarks&lt;/li&gt;
&lt;li&gt;Encryption&lt;/li&gt;
&lt;li&gt;Multi-Cluster&lt;/li&gt;
&lt;li&gt;Resource Usage&lt;/li&gt;
&lt;li&gt;Full Feature Comparison&lt;/li&gt;
&lt;li&gt;When to Choose Each&lt;/li&gt;
&lt;li&gt;K3s-Specific Setup&lt;/li&gt;
&lt;li&gt;Migration Guide on K3s&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. What Is a CNI and Why Does It Matter?
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Container Network Interface&lt;/strong&gt; (CNI) is the plugin layer every Kubernetes cluster depends on for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assigning IP addresses to pods from a defined CIDR range&lt;/li&gt;
&lt;li&gt;Creating virtual Ethernet (veth) pairs between pod namespaces and the host&lt;/li&gt;
&lt;li&gt;Programming cross-node routing so pods on Node A can reach pods on Node B&lt;/li&gt;
&lt;li&gt;Optionally enforcing &lt;code&gt;NetworkPolicy&lt;/code&gt; resources to control traffic flow&lt;/li&gt;
&lt;/ul&gt;
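&lt;p&gt;As a rough illustration of the first responsibility in the list above, here is a minimal Python sketch of per-node IP assignment. It is not taken from any real plugin: the &lt;code&gt;PodIpam&lt;/code&gt; class and the &lt;code&gt;10.244.1.0/24&lt;/code&gt; node range are hypothetical, and real IPAM plugins persist their leases rather than keeping them in memory.&lt;/p&gt;

```python
import ipaddress


class PodIpam:
    """Toy per-node IPAM: hands out pod IPs from one node's pod CIDR."""

    def __init__(self, node_cidr):
        self.network = ipaddress.ip_network(node_cidr)
        hosts = list(self.network.hosts())
        # Reserve the first usable address for the bridge/gateway.
        self.gateway = hosts[0]
        self.free = hosts[1:]
        self.leases = {}

    def allocate(self, pod_name):
        # Idempotent: a pod that already has a lease keeps its address.
        if pod_name in self.leases:
            return self.leases[pod_name]
        if not self.free:
            raise RuntimeError("pod CIDR exhausted")
        ip = self.free.pop(0)
        self.leases[pod_name] = ip
        return ip

    def release(self, pod_name):
        # Return the address to the pool when the pod is deleted.
        ip = self.leases.pop(pod_name)
        self.free.append(ip)


ipam = PodIpam("10.244.1.0/24")  # hypothetical node pod CIDR
print(ipam.gateway)              # 10.244.1.1
print(ipam.allocate("pod-a"))    # 10.244.1.2
print(ipam.allocate("pod-b"))    # 10.244.1.3
```

&lt;p&gt;Reserving the first host address for the gateway mirrors a common bridge-style layout in which the bridge takes &lt;code&gt;.1&lt;/code&gt; and pods start at &lt;code&gt;.2&lt;/code&gt;.&lt;/p&gt;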

&lt;p&gt;Cloud providers like AWS, Azure, and GCP ship their own CNI plugins that integrate deeply with the underlying VPC/VNET networking primitives, providing native IP assignment, cloud-aware routing, and tight integration with cloud IAM, load balancers, and security groups.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;K3s Key Flag&lt;/strong&gt;&lt;br&gt;
To replace the default CNI on K3s, install with &lt;code&gt;--flannel-backend=none --disable-network-policy&lt;/code&gt;. This leaves the CNI slot open for Calico or Cilium to fill.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. Open Source CNIs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Flannel — Simple Overlay
&lt;/h3&gt;

&lt;p&gt;Flannel's design philosophy: do one thing well. A user-space daemon (&lt;code&gt;flanneld&lt;/code&gt;) manages subnet allocation, while the kernel's own VXLAN and bridge code handles all actual forwarding. No policy, no observability — just connectivity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod A (eth0: 10.244.0.2)          Pod B (eth0: 10.244.0.5)
        │                                  │
        │ veth pair                        │ veth pair
        ▼                                  ▼
           cni0 Linux bridge (kernel)
                    │
      iptables PREROUTING / FORWARD / POSTROUTING
                    │
         VXLAN encapsulation — UDP 8472
                    │
     flanneld (user-space) ← etcd / K8s API
                    │
          Physical NIC → Node B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
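&lt;p&gt;The subnet allocation and forwarding decision in the diagram can be approximated as follows. This is a simplified sketch, assuming one /24 lease per node carved from &lt;code&gt;10.244.0.0/16&lt;/code&gt;. The node names and the &lt;code&gt;owner_of&lt;/code&gt; and &lt;code&gt;next_hop&lt;/code&gt; helpers are invented for illustration; in a real cluster the kernel makes this decision from the routes that &lt;code&gt;flanneld&lt;/code&gt; programs.&lt;/p&gt;

```python
import ipaddress

CLUSTER_CIDR = ipaddress.ip_network("10.244.0.0/16")  # from the diagram above

# Assumption: one /24 lease per node, handed out in node-join order.
node_names = ["node-a", "node-b", "node-c"]  # hypothetical nodes
leases = dict(zip(node_names, CLUSTER_CIDR.subnets(new_prefix=24)))


def owner_of(pod_ip):
    """Return the node whose lease contains pod_ip."""
    addr = ipaddress.ip_address(pod_ip)
    for node, subnet in leases.items():
        if addr in subnet:
            return node
    raise LookupError(f"{pod_ip} is outside every node lease")


def next_hop(src_node, dst_pod_ip):
    """Same node: stay on the cni0 bridge. Other node: VXLAN to that node."""
    dst_node = owner_of(dst_pod_ip)
    if dst_node == src_node:
        return "cni0 bridge"
    return f"vxlan to {dst_node}"


print(leases["node-b"])                  # 10.244.1.0/24
print(next_hop("node-a", "10.244.0.5"))  # cni0 bridge
print(next_hop("node-a", "10.244.1.7"))  # vxlan to node-b
```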



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwyp31b0v2tfpvqia6vgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwyp31b0v2tfpvqia6vgb.png" alt="Flannel Architecture" width="800" height="525"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Available backends:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;Transport&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vxlan&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;UDP encap (default)&lt;/td&gt;
&lt;td&gt;Works across any IP network, including routed (L3) hops between nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;host-gw&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Direct routing&lt;/td&gt;
&lt;td&gt;Fastest, requires L2 adjacency between nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wireguard-native&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Encrypted WireGuard tunnel&lt;/td&gt;
&lt;td&gt;When you need encryption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;udp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Legacy user-space&lt;/td&gt;
&lt;td&gt;Fallback only — very slow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
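&lt;p&gt;One practical consequence of the backend choice is the pod MTU. The arithmetic below is a sketch for IPv4 underlays only; the &lt;code&gt;pod_mtu&lt;/code&gt; helper is hypothetical and the 9001-byte value is just an example of a jumbo-frame NIC.&lt;/p&gt;

```python
# Bytes that VXLAN over IPv4 adds around each pod packet.
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14  # the encapsulated frame carries its own Ethernet header

ENCAP_OVERHEAD = {
    "vxlan": OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET,
    "host-gw": 0,  # plain routing, nothing is wrapped
}


def pod_mtu(nic_mtu, backend):
    """Largest pod-side MTU that still fits in one underlay packet."""
    return nic_mtu - ENCAP_OVERHEAD[backend]


print(pod_mtu(1500, "vxlan"))    # 1450
print(pod_mtu(1500, "host-gw"))  # 1500
print(pod_mtu(9001, "vxlan"))    # 8951
```

&lt;p&gt;This is why pod interfaces on a VXLAN overlay typically show an MTU of 1450 on a standard 1500-byte network, while &lt;code&gt;host-gw&lt;/code&gt; keeps the full 1500.&lt;/p&gt;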

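&lt;p&gt;&lt;strong&gt;Network Policy:&lt;/strong&gt; Flannel itself enforces no NetworkPolicy; with Flannel alone, policy resources are accepted by the API server and silently ignored. To get enforcement you must pair it with a separate policy engine, such as Calico (the Canal combination) or the embedded network policy controller that K3s runs by default. With Canal, that means a second DaemonSet, version-compatibility risk, and ownership split between two projects.&lt;/p&gt;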

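&lt;p&gt;&lt;strong&gt;Flannel Encryption:&lt;/strong&gt; Encryption is available only through the WireGuard backend (&lt;code&gt;wireguard-native&lt;/code&gt; as a K3s flag, &lt;code&gt;wireguard&lt;/code&gt; as the backend type in Flannel's &lt;code&gt;net-conf.json&lt;/code&gt;). It covers cross-node traffic only: pod-to-pod traffic on the same node crosses the &lt;code&gt;cni0&lt;/code&gt; bridge unencrypted. There is no automatic key rotation; restart &lt;code&gt;flanneld&lt;/code&gt; to rotate keys.&lt;br&gt;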
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Network"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10.244.0.0/16"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Backend"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wireguard"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Dev/CI clusters, Raspberry Pi, edge nodes, K3s defaults.&lt;/p&gt;




&lt;h3&gt;
  
  
  2.2 Cilium — eBPF Native
&lt;/h3&gt;

&lt;p&gt;Cilium compiles eBPF programs and attaches them to the Linux kernel at TC and XDP hook points. Pod traffic bypasses the Linux bridge and, with kube-proxy replacement enabled, iptables as well: packets are forwarded with &lt;code&gt;bpf_redirect()&lt;/code&gt;, and policy is enforced through constant-time (O(1)) BPF hash-map lookups.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod A (eth0)                         Pod B (eth0)
       │                                  │
       │ veth pair                        │
       ▼                                  ▼
TC eBPF hook ──── bpf_redirect() ──── TC eBPF hook
                  │
BPF maps: identity · policy · NAT · LB
                  │
cilium-agent — compiles eBPF, watches K8s API
                  │
  Physical NIC — VXLAN / GENEVE / native routing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
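&lt;p&gt;To make the "O(1) map lookup" idea concrete, the sketch below uses a Python dict as a stand-in for a per-endpoint BPF hash map. The identity numbers, the key layout, and the &lt;code&gt;verdict&lt;/code&gt; helper are illustrative assumptions, not Cilium's actual map format.&lt;/p&gt;

```python
# Hypothetical numeric identities; Cilium derives identities from pod labels.
FRONTEND = 1001
BACKEND = 1002
ATTACKER = 1003

# Stand-in for the policy map attached to the backend pod:
# key = (source identity, destination port, protocol), value = verdict.
backend_policy_map = {
    (FRONTEND, 8080, "TCP"): "ALLOW",
}


def verdict(policy_map, src_identity, dst_port, proto):
    """One hash lookup per packet; a missing key means default deny."""
    return policy_map.get((src_identity, dst_port, proto), "DROP")


print(verdict(backend_policy_map, FRONTEND, 8080, "TCP"))  # ALLOW
print(verdict(backend_policy_map, ATTACKER, 8080, "TCP"))  # DROP
```

&lt;p&gt;The cost of the lookup does not grow with the number of rules, which is the key contrast with walking a linear iptables chain.&lt;/p&gt;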



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftei3oiyqewc7l9s5tk6p.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftei3oiyqewc7l9s5tk6p.webp" alt="K8S Network vs Cilium" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Datapath modes:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Encapsulation&lt;/th&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tunnel: vxlan | geneve&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;VXLAN (default) or GENEVE&lt;/td&gt;
&lt;td&gt;Any network topology&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;native-routing&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;L2 adjacency or BGP underlay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wireguard&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WireGuard transparent&lt;/td&gt;
&lt;td&gt;Kernel ≥ 5.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ipsec&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;IPsec&lt;/td&gt;
&lt;td&gt;FIPS-regulated environments&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Network Policy:&lt;/strong&gt; L3 through L7, no sidecar&lt;/p&gt;

&lt;p&gt;Cilium enforces standard NetworkPolicy and extends it with &lt;code&gt;CiliumNetworkPolicy&lt;/code&gt; (CNP) for Layer 7 rules — no sidecar required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# CiliumNetworkPolicy — L7 HTTP rule&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cilium.io/v2&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CiliumNetworkPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;allow-get-only&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;endpointSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
  &lt;span class="na"&gt;ingress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;fromEndpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;frontend&lt;/span&gt;
    &lt;span class="na"&gt;toPorts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080"&lt;/span&gt;
        &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
      &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GET&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/api/v1/.*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
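&lt;p&gt;The effect of the HTTP rule above can be approximated in a few lines of Python. This is a sketch under one assumption: that &lt;code&gt;method&lt;/code&gt; and &lt;code&gt;path&lt;/code&gt; are regular expressions matched against the whole value. The &lt;code&gt;http_allowed&lt;/code&gt; helper is hypothetical and ignores the L3/L4 part of the policy.&lt;/p&gt;

```python
import re

# Mirrors the rule in the policy above: GET on paths matching /api/v1/.*
RULES = [{"method": "GET", "path": "/api/v1/.*"}]


def http_allowed(method, path, rules=RULES):
    """Allow a request only if some rule matches both method and full path."""
    for rule in rules:
        if re.fullmatch(rule["method"], method) and re.fullmatch(rule["path"], path):
            return True
    return False


print(http_allowed("GET", "/api/v1/users"))   # True
print(http_allowed("POST", "/api/v1/users"))  # False
print(http_allowed("GET", "/admin"))          # False
```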



&lt;p&gt;&lt;strong&gt;Observability:&lt;/strong&gt; 🔭 Cilium + Hubble&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Per-flow visibility on every packet&lt;/li&gt;
&lt;li&gt;✅ Live service dependency map (Hubble UI)&lt;/li&gt;
&lt;li&gt;✅ L7 HTTP / DNS / Kafka / gRPC flows&lt;/li&gt;
&lt;li&gt;✅ Drop reason per endpoint&lt;/li&gt;
&lt;li&gt;✅ Rich Prometheus metrics
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enable Hubble and UI&lt;/span&gt;
cilium hubble &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--ui&lt;/span&gt;

&lt;span class="c"&gt;# Watch live flows in a namespace&lt;/span&gt;
hubble observe &lt;span class="nt"&gt;--namespace&lt;/span&gt; production &lt;span class="nt"&gt;--follow&lt;/span&gt;

&lt;span class="c"&gt;# Show only policy drops with reason&lt;/span&gt;
hubble observe &lt;span class="nt"&gt;--verdict&lt;/span&gt; DROPPED &lt;span class="nt"&gt;--follow&lt;/span&gt;

&lt;span class="c"&gt;# Sample output:&lt;/span&gt;
&lt;span class="c"&gt;# 12:34:01: default/frontend → default/backend  FORWARDED  TCP:SYN&lt;/span&gt;
&lt;span class="c"&gt;# 12:34:02: default/attacker → default/backend  DROPPED    Policy denied&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cilium Encryption:&lt;/strong&gt; WireGuard or IPsec, both transparent to pods&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
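&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# WireGuard with strict mode (drops unencrypted pod-to-pod packets).&lt;/span&gt;
&lt;span class="c"&gt;# Set strictMode.cidr to your pod CIDR; 10.0.0.0/8 is only an example.&lt;/span&gt;
cilium &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; encryption.enabled=true &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; encryption.type=wireguard &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; encryption.strictMode.enabled=true &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; encryption.strictMode.cidr=10.0.0.0/8

&lt;span class="c"&gt;# IPsec for FIPS-regulated environments (create the cilium-ipsec-keys Secret first)&lt;/span&gt;
cilium &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; encryption.enabled=true &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; encryption.type=ipsec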
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Large-scale production, L7 policy, observability (Hubble), zero-trust, multi-cluster.&lt;/p&gt;




&lt;h3&gt;
  
  
  2.3 Calico — BGP + Flexible Dataplane
&lt;/h3&gt;

&lt;p&gt;Calico uses &lt;strong&gt;BGP&lt;/strong&gt; (Border Gateway Protocol) to distribute pod routes across nodes, so it can run with no encapsulation at all when the underlay is able to route pod IPs; IP-in-IP and VXLAN overlays are available for networks that cannot. Each node acts as a BGP peer, advertising its pod CIDR blocks to other nodes and, optionally, to upstream routers. Calico's data plane is pluggable: iptables, eBPF, or Windows HNS.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod A (eth0: 192.168.0.2)          Pod B (eth0: 192.168.1.2)
        │                                  │
        │ veth pair                        │ veth pair
        ▼                                  ▼
      Host routing table (no bridge needed)
                    │
      iptables / eBPF policy enforcement
                    │
     Felix (per-node agent) ← Typha (fan-out)
                    │
     BIRD (BGP daemon) — peers with other nodes
                    │
    Physical NIC — direct IP routing (no encap)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
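&lt;p&gt;Because there is no bridge or tunnel in this mode, forwarding reduces to an ordinary routing-table lookup on each node. The sketch below imitates that lookup with longest-prefix matching. The route entries and node IPs are hypothetical; the /26 size reflects Calico's default IPAM block size, and the pod addresses are taken from the diagram above.&lt;/p&gt;

```python
import ipaddress

# Hypothetical routing table on node A after BGP has converged:
# each remote pod block points at the node that advertised it.
ROUTES = [
    ("192.168.0.0/26", "local veth"),     # pods on this node
    ("192.168.1.0/26", "via 10.0.0.12"),  # learned from node B
    ("0.0.0.0/0", "via 10.0.0.1"),        # default gateway
]


def lookup(dst_ip, routes=ROUTES):
    """Longest-prefix match, the rule the kernel routing table applies."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(cidr), hop)
        for cidr, hop in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    best = max(matches, key=lambda m: m[0].prefixlen)
    return best[1]


print(lookup("192.168.1.2"))  # via 10.0.0.12
print(lookup("192.168.0.2"))  # local veth
print(lookup("8.8.8.8"))      # via 10.0.0.1
```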



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2yz0fuiiezgssjwfb85.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2yz0fuiiezgssjwfb85.png" alt="Calico Architecture" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Calico components:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Felix&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per-node agent; programs iptables/eBPF rules and routes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BIRD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open-source BGP daemon; advertises pod subnets to peers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Typha&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fan-out proxy for the K8s datastore; recommended at 50+ nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;calico-kube-controllers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Garbage-collects stale Calico resources&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Network Policy:&lt;/strong&gt; L3/L4 policy leader&lt;/p&gt;

&lt;p&gt;Calico is widely regarded as the gold standard for L3/L4 NetworkPolicy. It supports standard &lt;code&gt;NetworkPolicy&lt;/code&gt; resources plus its own &lt;code&gt;GlobalNetworkPolicy&lt;/code&gt; and &lt;code&gt;NetworkSet&lt;/code&gt; CRDs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Calico GlobalNetworkPolicy — cluster-wide deny-all&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GlobalNetworkPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default-deny-all&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;all()&lt;/span&gt;
  &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Egress&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Calico NetworkSet — group external CIDRs&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NetworkSet&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trusted-external&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;nets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;203.0.113.0/24&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;198.51.100.0/24&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ Calico does &lt;strong&gt;not&lt;/strong&gt; support L7 HTTP/gRPC policy natively in OSS. For that you need its optional Envoy-based Application Layer Policy (ALP), which adds a sidecar and complexity.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Calico Encryption:&lt;/strong&gt; Calico supports WireGuard for node-to-node encryption, enabled with a single patch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl patch felixconfiguration default &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--type&lt;/span&gt; merge &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--patch&lt;/span&gt; &lt;span class="s1"&gt;'{"spec":{"wireguardEnabled":true}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Starting in Calico v3.26, same-node pod traffic encryption is also supported via host-to-pod WireGuard options.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; BGP-integrated DCs, Windows node support, bare-metal L3, robust L3/L4 policy.&lt;/p&gt;




&lt;h3&gt;
  
  
  2.4 Weave Net — Mesh Overlay
&lt;/h3&gt;

&lt;p&gt;Weave Net uses a gossip protocol to build a full mesh between all cluster nodes without any central store. It forwards packets through Fast Datapath (in-kernel VXLAN) and falls back to the slower user-space sleeve tunnel when that is unavailable, and it can optionally encrypt all traffic. Weave is simpler to operate than Calico or Cilium, but it is no longer under active development and the upstream project is archived.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod A (eth0)
       │
    weave bridge
       │
  weave daemon (gossip mesh peer discovery)
       │
  Fast Datapath (kernel VXLAN) / Sleeve (user space)
       │
    Node B weave daemon
       │
    Pod B (eth0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
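&lt;p&gt;The gossip-based discovery in the diagram can be illustrated with a small simulation. It is deliberately simplified: every peer exchanges views with all peers it knows in each round, whereas real gossip protocols pick peers randomly and tolerate message loss. The four node names and both helper functions are hypothetical.&lt;/p&gt;

```python
def gossip_round(views):
    """Every peer merges the peer lists of the peers it already knows."""
    updated = {}
    for peer, known in views.items():
        merged = set(known)
        for other in known:
            merged |= views[other]
        updated[peer] = merged
    return updated


def converge(views):
    """Repeat rounds until no view changes; return final views and round count."""
    rounds = 0
    while True:
        nxt = gossip_round(views)
        if nxt == views:
            return views, rounds
        views, rounds = nxt, rounds + 1


# Hypothetical 4-node cluster: each node starts knowing itself and one neighbour.
initial = {
    "n1": {"n1", "n2"},
    "n2": {"n2", "n3"},
    "n3": {"n3", "n4"},
    "n4": {"n4", "n1"},
}
final, rounds = converge(initial)
print(rounds)               # 2
print(sorted(final["n1"]))  # ['n1', 'n2', 'n3', 'n4']
```

&lt;p&gt;No node ever consults a central store; full membership emerges purely from pairwise exchanges, which is the property the "no external etcd" claim relies on.&lt;/p&gt;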



&lt;p&gt;&lt;strong&gt;Key characteristics:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Discovery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gossip — no external etcd needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Datapath&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sleeve (user-space) or Fast Datapath (kernel VXLAN)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;NaCl for sleeve traffic, IPsec ESP for Fast Datapath; enabled by setting a cluster password&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NetworkPolicy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Standard K8s policy supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Status&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Archived/maintenance mode (use Cilium or Calico for new clusters)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important:&lt;/strong&gt; Weaveworks ceased active development in 2023. Weave Net is community-maintained but no longer receives feature updates. It is &lt;strong&gt;not recommended&lt;/strong&gt; for new clusters — migrate to Cilium or Calico.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Legacy clusters already running Weave with migration on the roadmap.&lt;/p&gt;




&lt;h3&gt;
  
  
  2.5 Antrea — OVS-based CNI
&lt;/h3&gt;

&lt;p&gt;Antrea is a CNI backed by VMware (now Broadcom) that uses &lt;strong&gt;Open vSwitch (OVS)&lt;/strong&gt; as its dataplane. It supports both Linux and Windows nodes and provides its own &lt;code&gt;AntreaNetworkPolicy&lt;/code&gt; and &lt;code&gt;ClusterNetworkPolicy&lt;/code&gt; CRDs with tiered policy enforcement. Antrea integrates with NSX-T for enterprise software-defined networking (SDN) environments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod A (eth0)
       │
   OVS (Open vSwitch) bridge
       │
   antrea-agent (per-node DaemonSet)
       │
   antrea-controller (centralized)
       │
   Encap: Geneve / VXLAN / GRE (configurable)
       │
   Node B OVS bridge → Pod B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Antrea&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dataplane&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open vSwitch (OVS)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windows support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full (OVS on Windows)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NetworkPolicy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ K8s standard + AntreaNetworkPolicy CRDs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tiered policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (Emergency / Security / Application tiers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ IPsec / WireGuard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Observability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Antrea Octant plugin, Prometheus metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NSX-T integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Enterprise add-on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;eBPF support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ AntreaProxy (partial eBPF)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
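&lt;p&gt;Tiered enforcement is easiest to see as an ordering problem: rules in a higher tier are evaluated before rules in a lower one, and the first match decides. The sketch below uses the tier names from the table; the numeric priorities, the sample rules, and the &lt;code&gt;evaluate&lt;/code&gt; helper are illustrative assumptions rather than Antrea's real data model.&lt;/p&gt;

```python
# Illustrative tier order (lower number = evaluated first).
TIER_PRIORITY = {"Emergency": 50, "Security": 100, "Application": 250}

# Hypothetical rules: (tier, rule priority, source label, action).
RULES = [
    ("Application", 10, "frontend", "Allow"),
    ("Security", 5, "untrusted", "Drop"),
    ("Emergency", 1, "quarantined", "Drop"),
    ("Application", 20, "untrusted", "Allow"),
]


def evaluate(src_label, rules=RULES):
    """First matching rule wins after sorting by (tier, rule priority)."""
    ordered = sorted(rules, key=lambda r: (TIER_PRIORITY[r[0]], r[1]))
    for tier, _prio, label, action in ordered:
        if label == src_label:
            return f"{action} ({tier})"
    return "no Antrea rule matched"


print(evaluate("untrusted"))  # Drop (Security)
print(evaluate("frontend"))   # Allow (Application)
print(evaluate("batch"))      # no Antrea rule matched
```

&lt;p&gt;The &lt;code&gt;untrusted&lt;/code&gt; case shows the point of tiers: an application team's Allow rule cannot override a Drop placed in the Security tier.&lt;/p&gt;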

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; VMware/NSX-T environments, Windows-heavy clusters, tiered network policy.&lt;/p&gt;




&lt;h3&gt;
  
  
  2.6 Multus — Meta CNI
&lt;/h3&gt;

&lt;p&gt;Multus is not a standalone CNI — it is a &lt;strong&gt;meta CNI&lt;/strong&gt; that allows pods to attach multiple network interfaces simultaneously. A pod can have its primary network (managed by Flannel/Calico/Cilium) and secondary interfaces (SR-IOV, DPDK, Macvlan) for specialized workloads like telco NFV or HPC.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pod with Multiple NICs:
  eth0 (primary) ← Flannel/Calico/Cilium (cluster network)
  net1 (secondary) ← SR-IOV (high-throughput direct NIC)
  net2 (secondary) ← Macvlan (storage network)

Multus reads NetworkAttachmentDefinition CRDs and delegates
to the correct CNI for each interface.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# NetworkAttachmentDefinition for secondary interface&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.cni.cncf.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NetworkAttachmentDefinition&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sriov-net&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;{&lt;/span&gt;
      &lt;span class="s"&gt;"cniVersion": "0.3.1",&lt;/span&gt;
      &lt;span class="s"&gt;"type": "sriov",&lt;/span&gt;
      &lt;span class="s"&gt;"name": "sriov-net",&lt;/span&gt;
      &lt;span class="s"&gt;"ipam": { "type": "static" }&lt;/span&gt;
    &lt;span class="s"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
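&lt;p&gt;A pod opts in to a secondary interface by naming the NetworkAttachmentDefinition in the &lt;code&gt;k8s.v1.cni.cncf.io/networks&lt;/code&gt; annotation. The Python sketch below builds such a manifest; the &lt;code&gt;pod_with_networks&lt;/code&gt; helper, the pod name, and the image reference are hypothetical placeholders.&lt;/p&gt;

```python
import json

NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def pod_with_networks(name, image, attachments):
    """Build a Pod manifest that asks Multus for extra interfaces."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Comma-separated NetworkAttachmentDefinition names, in order.
            "annotations": {NETWORKS_ANNOTATION: ",".join(attachments)},
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


# Hypothetical pod that attaches the sriov-net definition shown above.
pod = pod_with_networks("dpdk-app", "registry.example/dpdk-app:1.0", ["sriov-net"])
print(json.dumps(pod["metadata"], indent=2))
```

&lt;p&gt;The primary interface (&lt;code&gt;eth0&lt;/code&gt;) still comes from the cluster's default CNI; the annotation only adds interfaces on top of it.&lt;/p&gt;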



&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Telco/NFV workloads, HPC, pods that need to straddle multiple network segments.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Cloud Provider CNIs
&lt;/h2&gt;

&lt;p&gt;Cloud-managed Kubernetes services ship their own CNI plugins that are deeply integrated with the underlying cloud networking fabric. These provide first-class VPC routing, cloud IAM integration, and managed lifecycle — but are typically locked to their respective cloud.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 AWS VPC CNI — EKS Default
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Amazon EKS&lt;/strong&gt; uses the &lt;strong&gt;Amazon VPC CNI plugin&lt;/strong&gt; (&lt;code&gt;aws-node&lt;/code&gt; DaemonSet) by default. Instead of an overlay, it assigns &lt;strong&gt;real VPC secondary IP addresses&lt;/strong&gt; directly to pods from Elastic Network Interfaces (ENIs) attached to the worker node.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Worker Node (EC2 instance)
    │
    ├── Primary ENI (node IP: 10.0.1.10)
    │      └── eth0
    │
    ├── Secondary ENI (attached by vpc-cni)
    │      ├── 10.0.1.20 → Pod A (eth0 via veth)
    │      ├── 10.0.1.21 → Pod B (eth0 via veth)
    │      └── 10.0.1.22 → Pod C (eth0 via veth)
    │
    └── vpc-cni (aws-node DaemonSet)
           manages ENI lifecycle via EC2 API
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How pod IPs work:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each EC2 instance can attach multiple ENIs; each ENI holds multiple secondary IPs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vpc-cni&lt;/code&gt; pre-warms a pool of secondary IPs per node via EC2 API calls&lt;/li&gt;
&lt;li&gt;Pods receive a real VPC IP — &lt;strong&gt;routable natively&lt;/strong&gt; across the VPC, peered VPCs, VPNs, and Direct Connect — with no overlay&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pod density limits per node (examples):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Instance Type&lt;/th&gt;
&lt;th&gt;Max ENIs&lt;/th&gt;
&lt;th&gt;Max pods (default)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;t3.medium&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;m5.large&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;m5.xlarge&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;58&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;m5.4xlarge&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;234&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;c5.18xlarge&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;737&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important:&lt;/strong&gt; Default pod density is capped by the ENI/IP limit per instance type. For IP-constrained environments, use &lt;strong&gt;VPC CNI with prefix delegation&lt;/strong&gt; (&lt;code&gt;ENABLE_PREFIX_DELEGATION=true&lt;/code&gt;) to assign /28 prefixes instead of individual IPs, dramatically increasing pod density.&lt;/p&gt;
&lt;/blockquote&gt;
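&lt;p&gt;The per-instance pod limits follow from a simple formula, sketched below alongside the effect of prefix delegation. The ENI and per-ENI address counts are taken from AWS's published instance limits and should be re-checked for your instance family; the function names are hypothetical.&lt;/p&gt;

```python
# (max ENIs, IPv4 addresses per ENI) per instance type, per AWS's published limits.
ENI_LIMITS = {
    "t3.medium": (3, 6),
    "m5.large": (3, 10),
    "m5.xlarge": (4, 15),
    "m5.4xlarge": (8, 30),
}

PREFIX_SIZE = 16  # a /28 prefix holds 16 addresses


def max_pods(instance_type):
    """Secondary-IP mode: each ENI loses one address to its own primary IP.

    The +2 accounts for host-network pods (aws-node, kube-proxy)."""
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    return enis * (ips_per_eni - 1) + 2


def prefix_mode_addresses(instance_type):
    """Prefix delegation: every secondary slot carries a /28 instead of one IP."""
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    return enis * (ips_per_eni - 1) * PREFIX_SIZE


for itype in ENI_LIMITS:
    print(itype, max_pods(itype), prefix_mode_addresses(itype))
```

&lt;p&gt;With prefix delegation the address pool stops being the bottleneck on most instance types; the kubelet's &lt;code&gt;max-pods&lt;/code&gt; setting and node CPU and memory then decide how many pods actually fit.&lt;/p&gt;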

&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;AWS VPC CNI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IP assignment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native VPC secondary IPs from ENIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Overlay&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗ None — native VPC routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NetworkPolicy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Off by default: enable the native eBPF policy agent, or add Calico / Cilium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security Groups&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Security Groups for Pods (SGP) — per-pod AWS SGs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IPv6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prefix delegation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ /28 prefix per ENI (more pods per node)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windows nodes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom networking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Pods in different subnet than node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;eBPF acceleration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ via Cilium add-on (EKS + Cilium mode)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Enabling Network Policy on EKS:&lt;/strong&gt;&lt;br&gt;
AWS VPC CNI does not enforce NetworkPolicy out of the box. Enable one of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Calico&lt;/strong&gt; (most common) — install as an add-on alongside vpc-cni&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cilium in chained mode&lt;/strong&gt; — replaces policy enforcement, keeps VPC IP routing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon VPC CNI Network Policy&lt;/strong&gt; (AWS-native, available since VPC CNI v1.14 in 2023): uses eBPF for policy enforcement
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enable the AWS-native network policy agent through the vpc-cni EKS add-on&lt;/span&gt;
&lt;span class="c"&gt;# (if vpc-cni is already installed, pass the same values to `aws eks update-addon`)&lt;/span&gt;
aws eks create-addon &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; my-cluster &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--addon-name&lt;/span&gt; vpc-cni &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--configuration-values&lt;/span&gt; &lt;span class="s1"&gt;'{"enableNetworkPolicy":"true","nodeAgent":{"enablePolicyEventLogs":"true"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;When to choose AWS VPC CNI:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Running EKS — it is the default and AWS-managed&lt;/li&gt;
&lt;li&gt;✅ Need pods directly reachable from on-premises via Direct Connect / VPN&lt;/li&gt;
&lt;li&gt;✅ Need per-pod AWS Security Groups (SGP feature)&lt;/li&gt;
&lt;li&gt;✅ Compliance requires no overlay network&lt;/li&gt;
&lt;li&gt;⚠️ Watch instance type ENI limits for large pod densities&lt;/li&gt;
&lt;/ul&gt;
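&lt;p&gt;The ENI limit mentioned above can be estimated before you pick an instance type. The sketch below uses the max-pods formula commonly cited for EKS, &lt;code&gt;ENIs * (IPv4 addresses per ENI - 1) + 2&lt;/code&gt;. The function name, the cap of 110 for prefix delegation, and the example figures (3 ENIs with 10 addresses each) are illustrative assumptions; verify them against current AWS documentation for your instance type.&lt;/p&gt;

```python
def eni_max_pods(enis: int, ips_per_eni: int, prefix_delegation: bool = False,
                 cap: int = 110) -> int:
    """Estimate schedulable pods per node for AWS VPC CNI (hypothetical helper).

    Each ENI keeps one address for itself; the "+ 2" accounts for
    host-network pods (aws-node and kube-proxy) that need no pod IP.
    """
    slots = enis * (ips_per_eni - 1)
    if prefix_delegation:
        # Every slot holds a /28 prefix (16 addresses) instead of a single IP.
        # The cap is an assumed per-node ceiling, not an ENI limit.
        return min(slots * 16 + 2, cap)
    return slots + 2


if __name__ == "__main__":
    print(eni_max_pods(3, 10))                          # secondary-IP mode
    print(eni_max_pods(3, 10, prefix_delegation=True))  # prefix delegation
```

&lt;p&gt;With prefix delegation the address supply stops being the bottleneck on small instances, which is why a separate per-node cap is modelled here.&lt;/p&gt;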


&lt;h3&gt;
  
  
  3.2 Azure CNI — AKS Default
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Azure Kubernetes Service (AKS)&lt;/strong&gt; offers multiple CNI modes. The default for most production clusters is &lt;strong&gt;Azure CNI&lt;/strong&gt;, which assigns pod IPs directly from the Azure Virtual Network (VNET) subnet — similar in concept to AWS VPC CNI but using Azure's networking primitives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AKS CNI Modes:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Default?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;kubenet&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Basic overlay; nodes get VNET IPs, pods get private overlay IPs (NAT)&lt;/td&gt;
&lt;td&gt;Legacy default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Azure CNI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pods get real VNET IPs from a pre-allocated subnet&lt;/td&gt;
&lt;td&gt;Current recommended default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Azure CNI Overlay&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pods get overlay IPs (larger scale, fewer VNET IPs needed)&lt;/td&gt;
&lt;td&gt;Recommended for large clusters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Azure CNI + Cilium&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Azure CNI routing + Cilium eBPF dataplane + Hubble&lt;/td&gt;
&lt;td&gt;Recommended for policy/observability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bring Your Own CNI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Disable Azure CNI; install Calico, Flannel, etc.&lt;/td&gt;
&lt;td&gt;Advanced&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Azure CNI (traditional):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AKS Worker Node (Azure VM)
    │
    ├── Primary NIC (node IP: 10.240.0.4)
    │      └── VNET: 10.240.0.0/16
    │
    └── Pod IPs pre-allocated from subnet:
           ├── 10.240.0.10 → Pod A
           ├── 10.240.0.11 → Pod B
           └── 10.240.0.12 → Pod C

azure-vnet (CNI plugin) programs routes in Azure SDN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Azure CNI Overlay (recommended for scale):&lt;/strong&gt;&lt;br&gt;
Introduced to solve VNET IP exhaustion. Pods get IPs from a private overlay CIDR (e.g., 10.244.0.0/16) that is separate from the VNET, while nodes get real VNET IPs. Azure's SDN routes pod traffic between nodes, so the nodes run no tunnel encapsulation of their own (no VXLAN or IP-in-IP on the VM).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create AKS cluster with Azure CNI Overlay + Cilium dataplane&lt;/span&gt;
az aks create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; myRG &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; myAKS &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network-plugin&lt;/span&gt; azure &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network-plugin-mode&lt;/span&gt; overlay &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--network-dataplane&lt;/span&gt; cilium &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--pod-cidr&lt;/span&gt; 192.168.0.0/16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
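&lt;p&gt;A quick way to sanity-check the &lt;code&gt;--pod-cidr&lt;/code&gt; size: the sketch below assumes each node receives a /24 block carved from the overlay pod CIDR, which is how AKS documents Azure CNI Overlay. The helper name is hypothetical, and the /24 is exposed as a parameter in case that assumption does not hold for your cluster.&lt;/p&gt;

```python
import ipaddress


def overlay_node_capacity(pod_cidr: str, per_node_prefix: int = 24) -> int:
    """Count how many per-node blocks fit in an overlay pod CIDR (hypothetical helper)."""
    network = ipaddress.ip_network(pod_cidr)
    bits = per_node_prefix - network.prefixlen
    # A pod CIDR smaller than one per-node block cannot host any node.
    return 2 ** bits if bits >= 0 else 0


if __name__ == "__main__":
    print(overlay_node_capacity("192.168.0.0/16"))  # CIDR from the az aks create example
```

&lt;p&gt;Because these addresses never come out of the VNET, growing the pod CIDR costs nothing in VNET address space.&lt;/p&gt;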



&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;kubenet&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;Azure CNI Overlay&lt;/th&gt;
&lt;th&gt;Azure CNI + Cilium&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pod IPs&lt;/td&gt;
&lt;td&gt;Overlay (NAT)&lt;/td&gt;
&lt;td&gt;Real VNET IPs&lt;/td&gt;
&lt;td&gt;Overlay (Azure SDN)&lt;/td&gt;
&lt;td&gt;Overlay (Azure SDN)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IP exhaustion risk&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Direct pod routing&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (via Azure SDN)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NetworkPolicy&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;Azure Network Policy / Calico&lt;/td&gt;
&lt;td&gt;Azure NP / Calico&lt;/td&gt;
&lt;td&gt;✅ Cilium (eBPF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows nodes&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;⚠️ Partial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hubble observability&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max pods/node&lt;/td&gt;
&lt;td&gt;250 (default 110)&lt;/td&gt;
&lt;td&gt;250 (default 30)&lt;/td&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Network Policy options on AKS:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Azure Network Policy Manager (NPM)&lt;/strong&gt; — iptables-based, Azure-native, limited feature set&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calico&lt;/strong&gt; — add-on, full L3/L4 policy, most commonly used&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cilium&lt;/strong&gt; — available with Azure CNI Overlay mode, eBPF enforcement + Hubble&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to choose Azure CNI:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Running AKS — Azure CNI Overlay is the modern recommended choice&lt;/li&gt;
&lt;li&gt;✅ Need pods directly reachable from on-premises via ExpressRoute&lt;/li&gt;
&lt;li&gt;✅ Want Hubble observability → use Azure CNI Overlay + Cilium dataplane&lt;/li&gt;
&lt;li&gt;✅ Large clusters (100+ nodes) → use Overlay mode to avoid VNET IP exhaustion&lt;/li&gt;
&lt;li&gt;⚠️ Traditional Azure CNI requires pre-allocating pod IPs per node — plan subnet size carefully&lt;/li&gt;
&lt;/ul&gt;
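&lt;p&gt;To make the subnet-planning warning concrete, here is a minimal sketch of the sizing arithmetic for traditional Azure CNI. It assumes the formula from the AKS planning guidance, &lt;code&gt;(nodes + 1) + (nodes + 1) * max_pods&lt;/code&gt;, where the extra node covers an upgrade surge, plus five addresses that Azure reserves in every subnet. The function name and defaults are illustrative.&lt;/p&gt;

```python
def azure_cni_subnet_prefix(nodes: int, max_pods: int = 30, surge_nodes: int = 1,
                            azure_reserved: int = 5) -> tuple[int, int]:
    """Return (required_ips, smallest_prefix_length) for a traditional Azure CNI subnet."""
    total_nodes = nodes + surge_nodes
    # One IP per node plus one pre-allocated IP per potential pod on that node.
    required = total_nodes + total_nodes * max_pods
    # Azure keeps a few addresses in every subnet for its own use (assumed: 5).
    needed = required + azure_reserved
    host_bits = (needed - 1).bit_length()  # smallest power of two that fits
    return required, 32 - host_bits


if __name__ == "__main__":
    ips, prefix = azure_cni_subnet_prefix(nodes=50)
    print(f"50 nodes need {ips} IPs, so plan for at least a /{prefix} subnet")
```

&lt;p&gt;Raising &lt;code&gt;max_pods&lt;/code&gt; multiplies the requirement, which is the main reason large clusters move to Overlay mode.&lt;/p&gt;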




&lt;h3&gt;
  
  
  3.3 GKE Dataplane V2 — GKE Default
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Google Kubernetes Engine (GKE)&lt;/strong&gt; introduced &lt;strong&gt;Dataplane V2&lt;/strong&gt; in 2021, which is based on &lt;strong&gt;Cilium's eBPF engine&lt;/strong&gt;. It is enabled by default on Autopilot clusters and opt-in on Standard clusters via &lt;code&gt;--enable-dataplane-v2&lt;/code&gt;. It brings production-grade eBPF networking, built-in NetworkPolicy enforcement, and a subset of Hubble observability, all managed by Google.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GKE networking modes:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Default?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Legacy (iptables)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;kube-proxy + iptables, no Dataplane V2&lt;/td&gt;
&lt;td&gt;Older clusters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dataplane V2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cilium eBPF, managed by GKE, no full Cilium control plane&lt;/td&gt;
&lt;td&gt;Default on Autopilot; opt-in on Standard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dataplane V2 + Hubble&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same + network telemetry via Hubble&lt;/td&gt;
&lt;td&gt;Optional add-on&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Architecture:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GKE Node (GCE VM)
    │
    ├── Alias IP range (VPC-native pod CIDRs)
    │     Pods get real VPC IPs, routed via Google SDN
    │
    └── Dataplane V2 (Cilium eBPF engine)
           ├── TC eBPF hooks on veth interfaces
           ├── BPF maps for policy, NAT, LB
           ├── kube-proxy replaced by eBPF
           └── Hubble telemetry (if enabled)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GKE uses &lt;strong&gt;VPC-native networking&lt;/strong&gt; (alias IP ranges) — pods get real VPC CIDRs routed natively through Google's Andromeda SDN. Dataplane V2 sits on top, adding eBPF policy enforcement and observability.&lt;/p&gt;
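&lt;p&gt;How large each node's alias range is depends on the max-pods setting. The sketch below assumes GKE's documented sizing rule (at least two addresses per possible pod, rounded up to a power of two), which yields a /24 per node at the default of 110 pods. Function names are hypothetical.&lt;/p&gt;

```python
def gke_node_pod_prefix(max_pods_per_node: int = 110) -> int:
    """Prefix length of the alias range one node receives (hypothetical helper)."""
    addresses = 2 * max_pods_per_node
    # Round up to the next power of two, then convert to a prefix length.
    return 32 - (addresses - 1).bit_length()


def gke_max_nodes(pod_range_prefix: int, max_pods_per_node: int = 110) -> int:
    """How many per-node ranges fit in the cluster's secondary pod range."""
    bits = gke_node_pod_prefix(max_pods_per_node) - pod_range_prefix
    return 2 ** bits if bits >= 0 else 0


if __name__ == "__main__":
    print(gke_node_pod_prefix(110))  # per-node prefix at the default max pods
    print(gke_max_nodes(14))         # nodes that fit in a /14 pod range
```

&lt;p&gt;Lowering max pods per node shrinks each node's range, so the same secondary range can hold more nodes.&lt;/p&gt;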

&lt;p&gt;&lt;strong&gt;Enabling Dataplane V2 on GKE:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a GKE Standard cluster with Dataplane V2 (Autopilot clusters enable it by default)&lt;/span&gt;
gcloud container clusters create my-cluster &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-dataplane-v2&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-ip-alias&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; us-central1

&lt;span class="c"&gt;# Enable Hubble observability add-on&lt;/span&gt;
gcloud container clusters update my-cluster &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-dataplane-v2-flow-observability&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; us-central1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;GKE Dataplane V2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dataplane&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cilium eBPF (managed subset)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;kube-proxy replacement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ eBPF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NetworkPolicy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ eBPF-enforced (L3/L4)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FQDN policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (GKE 1.28+)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hubble observability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Optional add-on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;L7 policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Not exposed (managed limitations)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pod IPs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real VPC IPs (alias ranges)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windows nodes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-cluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ via GKE Fleet / Anthos&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Managed lifecycle&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Google manages upgrades&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Dataplane V2 vs self-managed Cilium on GKE:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;GKE Dataplane V2&lt;/th&gt;
&lt;th&gt;Self-managed Cilium on GKE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Management&lt;/td&gt;
&lt;td&gt;Google-managed&lt;/td&gt;
&lt;td&gt;You manage Helm values/upgrades&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feature exposure&lt;/td&gt;
&lt;td&gt;Subset of Cilium&lt;/td&gt;
&lt;td&gt;Full Cilium feature set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hubble&lt;/td&gt;
&lt;td&gt;Basic (add-on)&lt;/td&gt;
&lt;td&gt;Full Hubble UI + Relay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cluster Mesh&lt;/td&gt;
&lt;td&gt;✗ (use GKE Fleet)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L7 CNP&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;GKE SLA&lt;/td&gt;
&lt;td&gt;Community / Isovalent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;GKE Recommendation:&lt;/strong&gt; For most workloads, &lt;strong&gt;Dataplane V2 is the right choice&lt;/strong&gt;: Google manages it, it's eBPF-based, and it covers L3/L4 policy. If you need full CiliumNetworkPolicy L7 rules or Cluster Mesh, consider self-managed Cilium on a GKE Standard cluster created without Dataplane V2, and follow Cilium's GKE installation guide for the required cluster settings.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;When to choose GKE Dataplane V2:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Running GKE: it is Google-managed and the default on Autopilot clusters&lt;/li&gt;
&lt;li&gt;✅ Want eBPF performance without managing Cilium yourself&lt;/li&gt;
&lt;li&gt;✅ NetworkPolicy enforcement at scale (eBPF O(1) lookups)&lt;/li&gt;
&lt;li&gt;✅ Need basic Hubble network telemetry&lt;/li&gt;
&lt;li&gt;⚠️ For full L7 policy or Cluster Mesh, self-manage Cilium on GKE instead&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4. Data Plane Comparison
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Service Scalability — All CNIs
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number of Services&lt;/th&gt;
&lt;th&gt;Flannel (iptables)&lt;/th&gt;
&lt;th&gt;Calico (iptables)&lt;/th&gt;
&lt;th&gt;Calico (eBPF)&lt;/th&gt;
&lt;th&gt;Cilium (eBPF)&lt;/th&gt;
&lt;th&gt;AWS VPC CNI&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;GKE DPv2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;~10 ms&lt;/td&gt;
&lt;td&gt;~10 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;~10 ms&lt;/td&gt;
&lt;td&gt;~10 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;~80 ms&lt;/td&gt;
&lt;td&gt;~80 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;~80 ms&lt;/td&gt;
&lt;td&gt;~80 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~800 ms&lt;/td&gt;
&lt;td&gt;~800 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;~800 ms&lt;/td&gt;
&lt;td&gt;~800 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50,000&lt;/td&gt;
&lt;td&gt;⚠️ drops&lt;/td&gt;
&lt;td&gt;⚠️ drops&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;td&gt;⚠️ drops&lt;/td&gt;
&lt;td&gt;⚠️ drops&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
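&lt;p&gt;The pattern in this table (cost growing with Service count for iptables-based data planes, flat for eBPF) is usually explained by the lookup structure: iptables rules are evaluated sequentially, while eBPF data planes keep Services in hash maps. The toy model below only counts comparisons to illustrate that difference. It is not a benchmark, and the function names and addresses are made up for the example.&lt;/p&gt;

```python
from typing import Optional


def iptables_style_lookup(rules: list[tuple[str, str]], dest: str) -> tuple[Optional[str], int]:
    """Scan rules in order, returning (backend, comparisons made)."""
    comparisons = 0
    for service_ip, backend in rules:
        comparisons += 1
        if service_ip == dest:
            return backend, comparisons
    return None, comparisons


def ebpf_style_lookup(table: dict[str, str], dest: str) -> tuple[Optional[str], int]:
    """Hash-map lookup: modelled as a single probe regardless of table size."""
    return table.get(dest), 1


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        rules = [(f"10.96.{i // 256}.{i % 256}", f"backend-{i}") for i in range(n)]
        table = dict(rules)
        target = rules[-1][0]  # worst case for the sequential scan
        _, linear = iptables_style_lookup(rules, target)
        _, hashed = ebpf_style_lookup(table, target)
        print(f"{n:>6} services: sequential={linear} comparisons, hash={hashed}")
```

&lt;p&gt;Real kube-proxy rule sets are more structured than a flat list, so treat the model as intuition for the trend rather than a prediction of the numbers above.&lt;/p&gt;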




&lt;h2&gt;
  
  
  5. Network Policy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Policy Feature Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy Feature&lt;/th&gt;
&lt;th&gt;Flannel&lt;/th&gt;
&lt;th&gt;Calico&lt;/th&gt;
&lt;th&gt;Cilium&lt;/th&gt;
&lt;th&gt;Weave&lt;/th&gt;
&lt;th&gt;Antrea&lt;/th&gt;
&lt;th&gt;AWS VPC CNI&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;GKE DPv2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Standard NetworkPolicy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (add-on)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Egress Policy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GlobalNetworkPolicy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ CCNP&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ ClusterNetworkPolicy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FQDN / DNS policy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗ (OSS)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (1.28+)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L7 HTTP method/path&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;⚠️ ALP&lt;/td&gt;
&lt;td&gt;✅ no sidecar&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kafka / gRPC policy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tiered policy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security Groups (cloud)&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ SGP&lt;/td&gt;
&lt;td&gt;✅ NSG&lt;/td&gt;
&lt;td&gt;✅ Firewall rules&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  6. Observability
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Flannel&lt;/th&gt;
&lt;th&gt;Calico&lt;/th&gt;
&lt;th&gt;Cilium&lt;/th&gt;
&lt;th&gt;Weave&lt;/th&gt;
&lt;th&gt;Antrea&lt;/th&gt;
&lt;th&gt;AWS VPC CNI&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;GKE DPv2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L3/L4 flow logs&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ VPC Flow Logs&lt;/td&gt;
&lt;td&gt;✅ NSG Flow Logs&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L7 HTTP flows&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗ (OSS)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Live service map&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ Hubble UI&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ Octant&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (add-on)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drop reason&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prometheus metrics&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ Rich&lt;/td&gt;
&lt;td&gt;✅ Basic&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ CloudWatch&lt;/td&gt;
&lt;td&gt;✅ Azure Monitor&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Built-in UI&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗ (OSS)&lt;/td&gt;
&lt;td&gt;✅ Hubble UI&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ Octant&lt;/td&gt;
&lt;td&gt;✅ CloudWatch&lt;/td&gt;
&lt;td&gt;✅ Azure Monitor&lt;/td&gt;
&lt;td&gt;✅ Cloud Console&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  7. Performance Benchmarks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  TCP Throughput — iperf3, Pod-to-Pod Same Node
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CNI&lt;/th&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Flannel&lt;/td&gt;
&lt;td&gt;VXLAN&lt;/td&gt;
&lt;td&gt;~8 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flannel&lt;/td&gt;
&lt;td&gt;host-gw&lt;/td&gt;
&lt;td&gt;~9.5 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calico&lt;/td&gt;
&lt;td&gt;BGP direct (iptables)&lt;/td&gt;
&lt;td&gt;~9.3 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calico&lt;/td&gt;
&lt;td&gt;BGP direct (eBPF)&lt;/td&gt;
&lt;td&gt;~9.7 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cilium&lt;/td&gt;
&lt;td&gt;GENEVE tunnel&lt;/td&gt;
&lt;td&gt;~8.5 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cilium&lt;/td&gt;
&lt;td&gt;native-routing&lt;/td&gt;
&lt;td&gt;~9.8 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cilium&lt;/td&gt;
&lt;td&gt;XDP&lt;/td&gt;
&lt;td&gt;line rate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS VPC CNI&lt;/td&gt;
&lt;td&gt;Native VPC routing&lt;/td&gt;
&lt;td&gt;~9.5 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure CNI&lt;/td&gt;
&lt;td&gt;Native VNET routing&lt;/td&gt;
&lt;td&gt;~9.4 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GKE Dataplane V2&lt;/td&gt;
&lt;td&gt;Alias IP + eBPF&lt;/td&gt;
&lt;td&gt;~9.7 Gbps&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ Results are representative — hardware, kernel version, and NIC driver all affect real-world numbers.&lt;/p&gt;
&lt;/blockquote&gt;
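&lt;p&gt;One reason tunnel modes trail native routing in tables like this is per-packet encapsulation overhead, which shrinks the MTU available to pods. The sketch below works through that arithmetic. The header sizes are the commonly quoted values for an IPv4 underlay and are assumptions here, not measurements from this comparison.&lt;/p&gt;

```python
# Per-packet encapsulation overhead in bytes over an IPv4 underlay (assumed values).
ENCAP_OVERHEAD = {
    "none": 0,        # native routing / host-gw
    "ipip": 20,       # one extra IPv4 header
    "vxlan": 50,      # outer IPv4 20 + UDP 8 + VXLAN 8 + inner Ethernet 14
    "geneve": 50,     # same as VXLAN when no Geneve options are carried
    "wireguard": 60,  # outer IPv4 20 + UDP 8 + WireGuard framing 32
}


def pod_mtu(underlay_mtu: int, encap: str) -> int:
    """MTU left for pod traffic after subtracting encapsulation headers."""
    return underlay_mtu - ENCAP_OVERHEAD[encap]


def payload_efficiency(underlay_mtu: int, encap: str) -> float:
    """Fraction of each full-size underlay packet left for the pod's IP packet."""
    return pod_mtu(underlay_mtu, encap) / underlay_mtu


if __name__ == "__main__":
    for mode in ENCAP_OVERHEAD:
        print(f"{mode:>9}: pod MTU {pod_mtu(1500, mode)}, "
              f"efficiency {payload_efficiency(1500, mode):.1%}")
```

&lt;p&gt;Header bytes are only part of the cost, since encapsulation also adds CPU work per packet, so the throughput gap can be larger than this ratio alone suggests.&lt;/p&gt;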

&lt;h3&gt;
  
  
  p99 Latency — Same Node
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CNI&lt;/th&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;p99 Latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Flannel&lt;/td&gt;
&lt;td&gt;VXLAN&lt;/td&gt;
&lt;td&gt;~0.35 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flannel&lt;/td&gt;
&lt;td&gt;host-gw&lt;/td&gt;
&lt;td&gt;~0.18 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calico&lt;/td&gt;
&lt;td&gt;BGP direct (eBPF)&lt;/td&gt;
&lt;td&gt;~0.15 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cilium&lt;/td&gt;
&lt;td&gt;native-routing&lt;/td&gt;
&lt;td&gt;~0.16 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS VPC CNI&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;td&gt;~0.17 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure CNI&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;td&gt;~0.18 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GKE Dataplane V2&lt;/td&gt;
&lt;td&gt;eBPF&lt;/td&gt;
&lt;td&gt;~0.15 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
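&lt;p&gt;For readers reproducing these numbers: p99 is the latency that 99% of samples stay at or below. A minimal sketch using the nearest-rank method follows. The sample values are synthetic and the function name is illustrative; benchmarking tools may interpolate differently.&lt;/p&gt;

```python
import math


def percentile_nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic latencies in ms: mostly fast, one slow request, one outlier.
    latencies = [0.10] * 98 + [0.30, 5.0]
    print("p50:", percentile_nearest_rank(latencies, 50))
    print("p99:", percentile_nearest_rank(latencies, 99))
```

&lt;p&gt;Tail percentiles are reported instead of averages because a single outlier shifts the mean noticeably while leaving p99 untouched.&lt;/p&gt;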




&lt;h2&gt;
  
  
  8. Encryption
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Flannel WG&lt;/th&gt;
&lt;th&gt;Calico WG&lt;/th&gt;
&lt;th&gt;Cilium WG&lt;/th&gt;
&lt;th&gt;Cilium IPsec&lt;/th&gt;
&lt;th&gt;Antrea WG/IPsec&lt;/th&gt;
&lt;th&gt;AWS CNI&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;GKE DPv2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cross-node encryption&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;⚠️ Cloud-layer (e.g., Nitro in-transit encryption), not the CNI&lt;/td&gt;
&lt;td&gt;⚠️ Cloud-layer (e.g., VNet encryption), not the CNI&lt;/td&gt;
&lt;td&gt;✅ (WireGuard, beta)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same-node encryption&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (v3.26+)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strict drop mode&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto key rotation&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FIPS compliance&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ IPsec&lt;/td&gt;
&lt;td&gt;✅ (AWS FIPS)&lt;/td&gt;
&lt;td&gt;✅ (Azure FIPS)&lt;/td&gt;
&lt;td&gt;✅ (Google FIPS)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  9. Multi-Cluster
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Flannel&lt;/th&gt;
&lt;th&gt;Calico&lt;/th&gt;
&lt;th&gt;Cilium&lt;/th&gt;
&lt;th&gt;Antrea&lt;/th&gt;
&lt;th&gt;AWS EKS&lt;/th&gt;
&lt;th&gt;Azure AKS&lt;/th&gt;
&lt;th&gt;GKE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Native multi-cluster&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ BGP&lt;/td&gt;
&lt;td&gt;✅ Cluster Mesh&lt;/td&gt;
&lt;td&gt;✅ Antrea Multi-cluster&lt;/td&gt;
&lt;td&gt;✅ EKS Connector&lt;/td&gt;
&lt;td&gt;✅ AKS Fleet&lt;/td&gt;
&lt;td&gt;✅ GKE Fleet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unified service DNS&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;⚠️ (manual)&lt;/td&gt;
&lt;td&gt;⚠️ (manual)&lt;/td&gt;
&lt;td&gt;✅ (Anthos)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-cluster NetworkPolicy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗ (OSS)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (Anthos)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-cluster observability&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ Hubble&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ CloudWatch&lt;/td&gt;
&lt;td&gt;✅ Azure Monitor&lt;/td&gt;
&lt;td&gt;✅ Cloud Ops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max clusters&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;td&gt;255&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  10. Resource Usage
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Flannel&lt;/th&gt;
&lt;th&gt;Calico&lt;/th&gt;
&lt;th&gt;Cilium&lt;/th&gt;
&lt;th&gt;Weave&lt;/th&gt;
&lt;th&gt;Antrea&lt;/th&gt;
&lt;th&gt;AWS VPC CNI&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;GKE DPv2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DaemonSet CPU (idle)&lt;/td&gt;
&lt;td&gt;~5 mCPU&lt;/td&gt;
&lt;td&gt;~20–60 mCPU&lt;/td&gt;
&lt;td&gt;~30–80 mCPU&lt;/td&gt;
&lt;td&gt;~10–30 mCPU&lt;/td&gt;
&lt;td&gt;~20–50 mCPU&lt;/td&gt;
&lt;td&gt;~10–25 mCPU&lt;/td&gt;
&lt;td&gt;~10–30 mCPU&lt;/td&gt;
&lt;td&gt;~30–80 mCPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DaemonSet RAM (idle)&lt;/td&gt;
&lt;td&gt;~30 MB&lt;/td&gt;
&lt;td&gt;~60–150 MB&lt;/td&gt;
&lt;td&gt;~100–300 MB&lt;/td&gt;
&lt;td&gt;~50–100 MB&lt;/td&gt;
&lt;td&gt;~50–100 MB&lt;/td&gt;
&lt;td&gt;~30–80 MB&lt;/td&gt;
&lt;td&gt;~40–80 MB&lt;/td&gt;
&lt;td&gt;~100–300 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Startup time&lt;/td&gt;
&lt;td&gt;~5s&lt;/td&gt;
&lt;td&gt;~10–20s&lt;/td&gt;
&lt;td&gt;~30–60s&lt;/td&gt;
&lt;td&gt;~10s&lt;/td&gt;
&lt;td&gt;~10–15s&lt;/td&gt;
&lt;td&gt;~5–10s&lt;/td&gt;
&lt;td&gt;~5–10s&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Additional CRDs&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;~8&lt;/td&gt;
&lt;td&gt;~15&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;~10&lt;/td&gt;
&lt;td&gt;0–2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Minimum kernel&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;td&gt;Any / ≥5.3 (eBPF)&lt;/td&gt;
&lt;td&gt;≥5.4 or newer, depending on Cilium release&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;td&gt;GKE-managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operator required&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ tigera-operator&lt;/td&gt;
&lt;td&gt;✅ cilium-operator&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ antrea-controller&lt;/td&gt;
&lt;td&gt;AWS-managed&lt;/td&gt;
&lt;td&gt;Azure-managed&lt;/td&gt;
&lt;td&gt;GKE-managed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  11. Full Feature Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Flannel&lt;/th&gt;
&lt;th&gt;Calico&lt;/th&gt;
&lt;th&gt;Cilium&lt;/th&gt;
&lt;th&gt;Weave&lt;/th&gt;
&lt;th&gt;Antrea&lt;/th&gt;
&lt;th&gt;AWS VPC CNI&lt;/th&gt;
&lt;th&gt;Azure CNI&lt;/th&gt;
&lt;th&gt;GKE DPv2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data plane&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bridge + iptables&lt;/td&gt;
&lt;td&gt;BGP + iptables/eBPF&lt;/td&gt;
&lt;td&gt;eBPF kernel-native&lt;/td&gt;
&lt;td&gt;Mesh sleeve/VXLAN&lt;/td&gt;
&lt;td&gt;OVS&lt;/td&gt;
&lt;td&gt;VPC native&lt;/td&gt;
&lt;td&gt;VNET native&lt;/td&gt;
&lt;td&gt;eBPF (Cilium)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;kube-proxy replacement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (eBPF)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ AntreaProxy&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Encapsulation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;VXLAN&lt;/td&gt;
&lt;td&gt;None/IPIP/VXLAN&lt;/td&gt;
&lt;td&gt;None/VXLAN/Geneve&lt;/td&gt;
&lt;td&gt;Sleeve/VXLAN&lt;/td&gt;
&lt;td&gt;Geneve/VXLAN&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BGP routing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ native&lt;/td&gt;
&lt;td&gt;✅ optional&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;L3/L4 NetworkPolicy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (add-on)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;L7 HTTP/gRPC policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;⚠️ ALP&lt;/td&gt;
&lt;td&gt;✅ no sidecar&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FQDN-based policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗ (OSS)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (1.28+)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GlobalNetworkPolicy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ CCNP&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ ACNP&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flow observability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ flow logs&lt;/td&gt;
&lt;td&gt;✅ Hubble&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ Octant&lt;/td&gt;
&lt;td&gt;✅ VPC Flow&lt;/td&gt;
&lt;td&gt;✅ NSG Flow&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;L7 flow visibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗ (OSS)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-node encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ WG&lt;/td&gt;
&lt;td&gt;✅ WG&lt;/td&gt;
&lt;td&gt;✅ WG/IPsec&lt;/td&gt;
&lt;td&gt;✅ NaCl&lt;/td&gt;
&lt;td&gt;✅ WG/IPsec&lt;/td&gt;
&lt;td&gt;Cloud-layer&lt;/td&gt;
&lt;td&gt;Cloud-layer&lt;/td&gt;
&lt;td&gt;✅ WG (beta)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Same-node encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ (v3.26+)&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FIPS encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ IPsec&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ IPsec&lt;/td&gt;
&lt;td&gt;✅ (AWS)&lt;/td&gt;
&lt;td&gt;✅ (Azure)&lt;/td&gt;
&lt;td&gt;✅ (GCP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-cluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅ BGP&lt;/td&gt;
&lt;td&gt;✅ Cluster Mesh&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;EKS Fleet&lt;/td&gt;
&lt;td&gt;AKS Fleet&lt;/td&gt;
&lt;td&gt;GKE Fleet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windows nodes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;✅ HNS&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cloud default&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K3s&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;GKE&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;EKS&lt;/td&gt;
&lt;td&gt;AKS&lt;/td&gt;
&lt;td&gt;GKE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RAM per node (idle)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~30 MB&lt;/td&gt;
&lt;td&gt;~60–150 MB&lt;/td&gt;
&lt;td&gt;~100–300 MB&lt;/td&gt;
&lt;td&gt;~50–100 MB&lt;/td&gt;
&lt;td&gt;~50–100 MB&lt;/td&gt;
&lt;td&gt;~30–80 MB&lt;/td&gt;
&lt;td&gt;~40–80 MB&lt;/td&gt;
&lt;td&gt;~100–300 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Operational complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Very low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Medium–High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Low (managed)&lt;/td&gt;
&lt;td&gt;Low (managed)&lt;/td&gt;
&lt;td&gt;Low (managed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Active development&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;⚠️ Archived&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  12. When to Choose Each
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🟢 Choose Flannel when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Dev, CI, or home lab cluster with no production traffic&lt;/li&gt;
&lt;li&gt;✅ No NetworkPolicy requirement whatsoever&lt;/li&gt;
&lt;li&gt;✅ RAM-constrained nodes (Raspberry Pi, 1 GB edge devices)&lt;/li&gt;
&lt;li&gt;✅ You want the absolute lowest operational overhead&lt;/li&gt;
&lt;li&gt;✅ Running a legacy kernel (RHEL 7 / CentOS 7)&lt;/li&gt;
&lt;li&gt;✅ Already using a service mesh (Istio, Linkerd) for policy and observability&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟠 Choose Calico when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ NetworkPolicy is required and Cilium feels like overkill&lt;/li&gt;
&lt;li&gt;✅ You need BGP peering with upstream physical routers&lt;/li&gt;
&lt;li&gt;✅ Windows nodes exist in your cluster&lt;/li&gt;
&lt;li&gt;✅ No-encap direct routing is preferred for performance&lt;/li&gt;
&lt;li&gt;✅ Your team already has Calico expertise&lt;/li&gt;
&lt;li&gt;✅ Medium cluster size (10–200 nodes) with moderate policy complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔵 Choose Cilium when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ L7 HTTP/gRPC/Kafka policy without a service mesh sidecar&lt;/li&gt;
&lt;li&gt;✅ Hubble observability and a live service map are needed&lt;/li&gt;
&lt;li&gt;✅ 100+ services with high service churn (eBPF O(1) matters)&lt;/li&gt;
&lt;li&gt;✅ Transparent pod-to-pod encryption between nodes (WireGuard or IPsec)&lt;/li&gt;
&lt;li&gt;✅ Multi-cluster federation with unified DNS and policy&lt;/li&gt;
&lt;li&gt;✅ Building toward zero-trust networking inside the cluster&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟡 Choose Weave when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;⚠️ &lt;strong&gt;Generally not recommended for new clusters&lt;/strong&gt;: Weaveworks shut down in 2024 and the Weave Net repository is archived&lt;/li&gt;
&lt;li&gt;✅ Only if migrating from an existing Weave deployment with no immediate migration path&lt;/li&gt;
&lt;li&gt;✅ Simple overlay needed with built-in NaCl encryption (short term)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟣 Choose Antrea when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ VMware NSX-T / Tanzu environment requiring deep SDN integration&lt;/li&gt;
&lt;li&gt;✅ Tiered network policy enforcement (Emergency / Security / Application tiers)&lt;/li&gt;
&lt;li&gt;✅ Windows and Linux mixed clusters in an enterprise VMware stack&lt;/li&gt;
&lt;li&gt;✅ OVS dataplane is a hard requirement (telco, NFV)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔶 Choose AWS VPC CNI (EKS) when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Running EKS — it is the default AWS-recommended CNI&lt;/li&gt;
&lt;li&gt;✅ Pods must be natively routable across VPC, VPN, or Direct Connect&lt;/li&gt;
&lt;li&gt;✅ Per-pod AWS Security Groups are required (SGP feature)&lt;/li&gt;
&lt;li&gt;✅ Compliance mandates no overlay network&lt;/li&gt;
&lt;li&gt;✅ Integrate with AWS services that need pod-level VPC routing&lt;/li&gt;
&lt;/ul&gt;
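
&lt;p&gt;As a rough sketch of how two of these options are switched on, the commands below enable Security Groups for Pods and the VPC CNI's built-in NetworkPolicy support. The cluster name &lt;code&gt;my-cluster&lt;/code&gt; is a placeholder, and the second command assumes the VPC CNI is installed as an EKS managed add-on, so check both against your own setup before running them.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Security Groups for Pods: let the aws-node DaemonSet attach branch ENIs to pods
kubectl set env daemonset aws-node -n kube-system ENABLE_POD_ENI=true

# NetworkPolicy via the managed add-on ("my-cluster" is a placeholder name)
aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni \
  --configuration-values '{"enableNetworkPolicy": "true"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;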

&lt;h3&gt;
  
  
  🔷 Choose Azure CNI (AKS) when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Running AKS — use Azure CNI Overlay mode for most production workloads&lt;/li&gt;
&lt;li&gt;✅ Pods need to be reachable from on-prem via ExpressRoute&lt;/li&gt;
&lt;li&gt;✅ Want eBPF performance + Hubble → choose Azure CNI Overlay + Cilium dataplane&lt;/li&gt;
&lt;li&gt;✅ Large clusters → Azure CNI Overlay avoids VNET IP exhaustion&lt;/li&gt;
&lt;li&gt;✅ Windows node support is required (all Azure CNI modes support it)&lt;/li&gt;
&lt;/ul&gt;
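
&lt;p&gt;For reference, a minimal sketch of creating an AKS cluster with Azure CNI Overlay and the Cilium dataplane. The resource group, cluster name, and pod CIDR are placeholder values chosen for illustration, not recommendations.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Placeholders: my-rg, my-aks, and the pod CIDR are example values
az aks create \
  --resource-group my-rg \
  --name my-aks \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --pod-cidr 192.168.0.0/16 \
  --generate-ssh-keys
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One caveat worth verifying in the current Azure documentation: the Cilium dataplane has been Linux-only, so clusters that need Windows nodes may have to stay on Azure CNI without it.&lt;/p&gt;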

&lt;h3&gt;
  
  
  ♦️ Choose GKE Dataplane V2 (GKE) when…
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Running GKE: it is enabled by default on Autopilot clusters and is an opt-in at creation time on Standard clusters&lt;/li&gt;
&lt;li&gt;✅ Want eBPF-based policy without managing Cilium yourself&lt;/li&gt;
&lt;li&gt;✅ Need Hubble network telemetry (enable as add-on)&lt;/li&gt;
&lt;li&gt;✅ FQDN-based NetworkPolicy (GKE 1.28+)&lt;/li&gt;
&lt;li&gt;✅ Google-managed lifecycle and upgrades are preferred&lt;/li&gt;
&lt;li&gt;⚠️ For L7 CNP or Cluster Mesh, self-manage Cilium on GKE instead&lt;/li&gt;
&lt;/ul&gt;
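
&lt;p&gt;A minimal sketch of opting in on a Standard cluster. The cluster name and region are placeholders, and Dataplane V2 has to be chosen when the cluster is created.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Placeholders: my-cluster and us-central1 are example values
gcloud container clusters create my-cluster \
  --region us-central1 \
  --enable-dataplane-v2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;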




&lt;h2&gt;
  
  
  13. K3s-Specific Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Flannel — Built-In, Nothing to Do
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Flannel ships with K3s — just install&lt;/span&gt;
curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh -

&lt;span class="c"&gt;# Change backend in /etc/rancher/k3s/config.yaml&lt;/span&gt;
flannel-backend: host-gw   &lt;span class="c"&gt;# vxlan | host-gw | wireguard-native | none&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
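
&lt;p&gt;To confirm which backend is actually in use after a change, one option is to read the annotation Flannel writes on each node. This is a small sketch that assumes the K3s node name matches the machine's hostname, which is the default unless you set a node name explicitly.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Restart K3s so the new backend setting is picked up
sudo systemctl restart k3s

# Print the backend Flannel reports for this node (assumes node name equals hostname)
kubectl get node "$(hostname)" \
  -o jsonpath='{.metadata.annotations.flannel\.alpha\.coreos\.com/backend-type}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;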



&lt;h3&gt;
  
  
  Installing Calico on K3s
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Install K3s without Flannel:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | &lt;span class="nv"&gt;INSTALL_K3S_EXEC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"--flannel-backend=none &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  --disable-network-policy &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  --cluster-cidr=192.168.0.0/16"&lt;/span&gt; sh -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2 — Install Calico operator:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create &lt;span class="nt"&gt;-f&lt;/span&gt; https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3 — Apply Installation CR:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Installation&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;calicoNetwork&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ipPools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cidr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;192.168.0.0/16&lt;/span&gt;
      &lt;span class="na"&gt;encapsulation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VXLANCrossSubnet&lt;/span&gt;
      &lt;span class="na"&gt;natOutgoing&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
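
&lt;p&gt;One way to apply the resource and watch it converge is sketched below. The filename &lt;code&gt;calico-installation.yaml&lt;/code&gt; is only an example and assumes you saved the manifest above under that name.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Apply the Installation CR (example filename)
kubectl create -f calico-installation.yaml

# The operator reports component health through TigeraStatus resources
kubectl get tigerastatus

# calico-node should reach Running on every node
kubectl get pods -n calico-system -o wide
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;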



&lt;h3&gt;
  
  
  Installing Cilium on K3s
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Install K3s without Flannel:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | &lt;span class="nv"&gt;INSTALL_K3S_EXEC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"--flannel-backend=none &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  --disable-network-policy &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  --disable-kube-proxy &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  --disable=servicelb"&lt;/span&gt; sh -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2 — Install Cilium via Helm:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add cilium https://helm.cilium.io/
helm &lt;span class="nb"&gt;install &lt;/span&gt;cilium cilium/cilium &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; kube-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; operator.replicas&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;kubeProxyReplacement&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;k8sServiceHost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;YOUR_K3S_API_IP&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;k8sServicePort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;6443 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; bpf.masquerade&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; ipam.mode&lt;span class="o"&gt;=&lt;/span&gt;kubernetes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
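
&lt;p&gt;Two practical details around the Helm step, sketched here as a suggestion rather than a required procedure: Helm needs to find the K3s kubeconfig before the install, and it helps to wait for the agent to roll out afterwards. The last command assumes the optional &lt;code&gt;cilium&lt;/code&gt; CLI is installed.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Before running helm: point Helm and kubectl at the kubeconfig K3s generates
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

# After the helm install: wait for the Cilium agent DaemonSet to become ready
kubectl -n kube-system rollout status daemonset/cilium --timeout=300s

# Optional, requires the cilium CLI: summarize agent and operator health
cilium status --wait
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;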



&lt;h3&gt;
  
  
  Minimum Kernel Requirements
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Cilium&lt;/th&gt;
&lt;th&gt;Calico eBPF&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Basic CNI&lt;/td&gt;
&lt;td&gt;≥ 4.19.57&lt;/td&gt;
&lt;td&gt;Any&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kube-proxy replacement&lt;/td&gt;
&lt;td&gt;≥ 5.2&lt;/td&gt;
&lt;td&gt;≥ 5.3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WireGuard encryption&lt;/td&gt;
&lt;td&gt;≥ 5.6&lt;/td&gt;
&lt;td&gt;≥ 5.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XDP acceleration&lt;/td&gt;
&lt;td&gt;≥ 5.10&lt;/td&gt;
&lt;td&gt;≥ 5.10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ Ubuntu 22.04 ships kernel 5.15, Debian 12 ships 6.1, Raspberry Pi OS Bookworm ships 6.1 — all satisfy every requirement.&lt;/p&gt;
&lt;/blockquote&gt;
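
&lt;p&gt;If you want to check a node against these thresholds before installing, a small helper like the one below can do it. This is a sketch: &lt;code&gt;kernel_at_least&lt;/code&gt; is a made-up function name, and it relies on &lt;code&gt;sort -V&lt;/code&gt; (version sort), which GNU coreutils provides.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# kernel_at_least REQUIRED [CURRENT]
# Succeeds when CURRENT (default: the running kernel) is at least REQUIRED.
kernel_at_least() {
  required="$1"
  current="${2:-$(uname -r)}"
  # Drop distro suffixes such as "-91-generic" before comparing
  lowest="$(printf '%s\n%s\n' "$required" "${current%%-*}" | sort -V | head -n 1)"
  [ "$lowest" = "$required" ]
}

# Check the running kernel against thresholds from the table above
for required in 5.3 5.6 5.10; do
  if kernel_at_least "$required"; then
    echo "kernel $(uname -r) meets $required"
  else
    echo "kernel $(uname -r) is older than $required"
  fi
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;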




&lt;h2&gt;
  
  
  14. Migration Guide on K3s
&lt;/h2&gt;

&lt;p&gt;All migrations follow the same pattern:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;drain → clean CNI state → restart K3s with &lt;code&gt;--flannel-backend=none&lt;/code&gt; → install new CNI → uncordon&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Flannel → Calico
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: Drain the node&lt;/span&gt;
kubectl drain &amp;lt;node&amp;gt; &lt;span class="nt"&gt;--ignore-daemonsets&lt;/span&gt; &lt;span class="nt"&gt;--delete-emptydir-data&lt;/span&gt;

&lt;span class="c"&gt;# Step 2: Remove Flannel state on the node&lt;/span&gt;
systemctl stop k3s
ip &lt;span class="nb"&gt;link &lt;/span&gt;delete flannel.1 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
&lt;/span&gt;ip &lt;span class="nb"&gt;link &lt;/span&gt;delete cni0 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/cni /etc/cni/net.d

&lt;span class="c"&gt;# Step 3: Set flannel-backend: none in /etc/rancher/k3s/config.yaml, then restart&lt;/span&gt;
systemctl start k3s

&lt;span class="c"&gt;# Step 4: Install Calico operator&lt;/span&gt;
kubectl create &lt;span class="nt"&gt;-f&lt;/span&gt; https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml

&lt;span class="c"&gt;# Step 5: Uncordon&lt;/span&gt;
kubectl uncordon &amp;lt;node&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
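
&lt;p&gt;Before draining anything, it can be worth recording the pod CIDR each node currently uses, since the new CNI's IP pool generally needs to match what the cluster was created with. A minimal sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Show the pod CIDR assigned to every node (K3s allocates these from 10.42.0.0/16 by default)
kubectl get nodes -o custom-columns=NODE:.metadata.name,POD_CIDR:.spec.podCIDR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;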



&lt;h3&gt;
  
  
  Flannel → Cilium
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Steps 1–3 same as above (drain, clean, restart with flannel-backend=none)&lt;/span&gt;

&lt;span class="c"&gt;# Step 4: Install Cilium&lt;/span&gt;
helm repo add cilium https://helm.cilium.io/
helm &lt;span class="nb"&gt;install &lt;/span&gt;cilium cilium/cilium &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; kube-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;kubeProxyReplacement&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;k8sServiceHost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;API_IP&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;k8sServicePort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;6443

&lt;span class="c"&gt;# Step 5: Uncordon&lt;/span&gt;
kubectl uncordon &amp;lt;node&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
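
&lt;p&gt;After uncordoning, a quick sanity check is to see which pods Cilium is actually managing. The sketch below lists Cilium endpoints; a pod that is running but missing from the list was probably started under the old CNI and may need a restart. The connectivity test at the end assumes the optional &lt;code&gt;cilium&lt;/code&gt; CLI is installed.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Pods managed by Cilium appear as CiliumEndpoint resources
kubectl get ciliumendpoints --all-namespaces

# Compare against every running pod to spot ones still on the old network
kubectl get pods --all-namespaces -o wide

# Optional, requires the cilium CLI: run the built-in end-to-end checks
cilium connectivity test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;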



&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Pro Tip:&lt;/strong&gt; For single-node K3s lab environments, a clean reinstall is usually faster and safer than a live migration. Run &lt;code&gt;k3s-uninstall.sh&lt;/code&gt; (this wipes the cluster, so export any manifests or data you need first), reinstall with the correct flags, then install your chosen CNI. Total time is about 10 minutes.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  15. Conclusion
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Open-Source CNIs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🟢 Flannel&lt;/strong&gt; — A masterpiece of minimalism. One job, done perfectly, with near-zero operational overhead. The right choice when simplicity and RAM constraints matter more than policy or observability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🟠 Calico&lt;/strong&gt; — The policy-first CNI. BGP-native routing, mature L3/L4 NetworkPolicy, Windows node support, and a pluggable data plane. The right choice when you need robust policy enforcement, prefer no-encap routing, or operate in an environment with existing BGP infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🔵 Cilium&lt;/strong&gt;: The platform CNI. eBPF-native with O(1) service lookup, L7-aware policy with no sidecar, Hubble observability, transparent WireGuard/IPsec encryption between nodes, and Cluster Mesh multi-cluster. The most feature-rich of the CNIs compared here.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🟡 Weave Net&lt;/strong&gt; — Once a popular choice for simplicity and built-in encryption. Now archived — migrate to Cilium or Calico for any new or long-running cluster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🟣 Antrea&lt;/strong&gt; — The VMware-native CNI. OVS dataplane, tiered policy, Windows support, and NSX-T integration. The right choice in Tanzu or NSX environments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🔷 Multus&lt;/strong&gt; — Not a CNI replacement but a CNI multiplier. Essential for telco/NFV workloads needing multiple pod network interfaces.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cloud Provider CNIs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🔶 AWS VPC CNI (EKS)&lt;/strong&gt; — Native VPC IP assignment with no overlay. Pods are first-class VPC citizens. Add Calico or the AWS-native policy controller for NetworkPolicy. Choose prefix delegation for high pod density.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;🔷 Azure CNI (AKS)&lt;/strong&gt; — Use &lt;strong&gt;Azure CNI Overlay&lt;/strong&gt; for most production workloads to avoid IP exhaustion, and add the &lt;strong&gt;Cilium dataplane&lt;/strong&gt; for eBPF policy + Hubble observability. Azure CNI traditional still works, but requires careful subnet pre-planning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;♦️ GKE Dataplane V2 (GKE)&lt;/strong&gt;: Google's managed Cilium eBPF layer. Enabled by default on Autopilot clusters and available as an opt-in on Standard clusters. Handles NetworkPolicy at scale with eBPF O(1) lookups. Add the Hubble observability add-on for network telemetry. Self-manage Cilium on GKE only if you need L7 CNP or Cluster Mesh.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bottom line:&lt;/strong&gt; If you run a managed Kubernetes service, use the cloud-default CNI and layer policy/observability on top. If you run self-managed clusters, Cilium is the most capable long-term investment, with Calico as the pragmatic choice if BGP integration or Windows nodes are required.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The networking layer of your cluster is not where you want to cut corners at scale.&lt;br&gt;
&lt;strong&gt;Choose based on where your cluster is going — not just where it is today.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Further Reading
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cilium.io/en/stable/installation/k3s/" rel="noopener noreferrer"&gt;Cilium K3s Installation Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cilium.io/blog/2021/05/11/cni-benchmark/" rel="noopener noreferrer"&gt;Cilium Network Performance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.tigera.io/calico/latest/getting-started/kubernetes/k3s/" rel="noopener noreferrer"&gt;Calico on K3s — Official Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/flannel-io/flannel" rel="noopener noreferrer"&gt;Flannel GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mvallim.github.io/kubernetes-under-the-hood/documentation/kube-flannel.html" rel="noopener noreferrer"&gt;Flannel Networking&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/antrea-io/antrea" rel="noopener noreferrer"&gt;Antrea GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/aws/amazon-vpc-cni-k8s" rel="noopener noreferrer"&gt;AWS VPC CNI GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/azure/aks/azure-cni-overlay" rel="noopener noreferrer"&gt;Azure CNI Overlay Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/dataplane-v2" rel="noopener noreferrer"&gt;GKE Dataplane V2 Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cilium.io/en/stable/network/clustermesh/" rel="noopener noreferrer"&gt;Cilium Cluster Mesh Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.tigera.io/calico/latest/reference/resources/globalnetworkpolicy" rel="noopener noreferrer"&gt;Calico GlobalNetworkPolicy Reference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://antrea.io/docs/main/docs/antrea-network-policy/" rel="noopener noreferrer"&gt;Antrea Network Policy Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Written for K3s v1.29+, Cilium v1.15+, Calico v3.27+, Flannel v0.24+, AWS VPC CNI v1.18+, Azure CNI v1.5+, GKE 1.28+. Benchmark figures are representative — always test with your own hardware and workload before production decisions.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>networking</category>
      <category>cni</category>
      <category>devops</category>
    </item>
    <item>
      <title>🔴 Supply Chain Attacks Are Breaking the Internet in 2026 — Every Major Hack Explained</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Tue, 05 May 2026 04:00:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/supply-chain-attacks-are-breaking-the-internet-in-2026-every-major-hack-explained-3bln</link>
      <guid>https://dev.to/pendelabhargavasai/supply-chain-attacks-are-breaking-the-internet-in-2026-every-major-hack-explained-3bln</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Your vulnerability scanner is hacking you. Your password manager got weaponized. Your AI coding tool is the new attack surface. Welcome to 2026.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkeo5gx8i2y64sy3z9z3i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkeo5gx8i2y64sy3z9z3i.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Year Everything Became a Weapon
&lt;/h2&gt;

&lt;p&gt;In 2025, supply chain attacks were a concern. In 2026, they became the dominant threat vector in software security.&lt;/p&gt;

&lt;p&gt;The numbers are staggering: a single compromised maintainer account poisoned a library with &lt;strong&gt;100 million weekly downloads&lt;/strong&gt;. A misconfigured CI/CD workflow cascaded into &lt;strong&gt;five separate tool compromises&lt;/strong&gt; within days. A developer downloaded Roblox exploit scripts, and that mistake eventually exposed &lt;strong&gt;Vercel's internal database&lt;/strong&gt; — which was listed for sale at $2 million on BreachForums.&lt;/p&gt;

&lt;p&gt;This isn't theoretical risk. This is what happened between January and April 2026.&lt;/p&gt;

&lt;p&gt;In this post, I'm going to break down every major supply chain attack that hit the IT and software ecosystem this year — what got compromised, how the attackers did it, what the real blast radius looked like, and most importantly, &lt;strong&gt;what you need to do right now&lt;/strong&gt; to protect your pipelines.&lt;/p&gt;

&lt;p&gt;Let's start with what a supply chain attack actually is, because most explanations bury the lede.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is a Software Supply Chain Attack?
&lt;/h2&gt;

&lt;p&gt;Here's the mental model that matters:&lt;/p&gt;

&lt;p&gt;Instead of breaking into your house, the attacker bribes your locksmith.&lt;/p&gt;

&lt;p&gt;When you run &lt;code&gt;npm install&lt;/code&gt; or &lt;code&gt;pip install&lt;/code&gt;, you're implicitly trusting thousands of strangers who maintain open-source packages. You're trusting their accounts, their CI/CD pipelines, their GitHub credentials, and their judgment. Every single one of those trust relationships is an attack surface.&lt;/p&gt;

&lt;p&gt;A supply chain attack exploits that trust. Instead of targeting you directly — which requires defeating your firewall, your endpoint detection, your access controls — attackers target the &lt;strong&gt;supplier&lt;/strong&gt;. Compromise one maintainer account, and you've just compromised every developer who installs that package.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The attack chain looks like this:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. `Identify` a maintainer of a widely-used package
2. Phish their npm/GitHub credentials, or exploit a misconfigured CI/CD workflow
3. Push backdoored versions — the malware runs at install time or on startup
4. Harvest: cloud credentials, SSH keys, API tokens, Kubernetes configs
5. Cascade: use stolen tokens to compromise more repos, more pipelines, more packages
6. Monetize: ransomware, data sale on BreachForums, cryptomining
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The asymmetry is what makes this so devastating. The attacker breaks in once, at one point in the supply chain, and inherits access to thousands of downstream organizations simultaneously.&lt;/p&gt;

&lt;p&gt;Now let's talk about what actually happened in 2026.&lt;/p&gt;




&lt;h2&gt;
  
  
  January 2026 — Cisco Unified Communications Zero-Day (CVE-2026-20045)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What got compromised
&lt;/h3&gt;

&lt;p&gt;Cisco's entire enterprise voice stack: Unified Communications Manager, IM &amp;amp; Presence Service, Unity Connection, and Webex Calling Dedicated Instance.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it happened
&lt;/h3&gt;

&lt;p&gt;A critical zero-day in the web-based management interface allowed unauthenticated remote attackers to send crafted HTTP requests and execute arbitrary commands on the underlying OS — then escalate straight to &lt;strong&gt;root&lt;/strong&gt;. No credentials needed. No user interaction required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it's a supply chain risk (not just a vulnerability)
&lt;/h3&gt;

&lt;p&gt;This one is subtler than the package ecosystem attacks below, but it's a textbook supply chain risk: &lt;strong&gt;managed service providers&lt;/strong&gt;. Thousands of organizations outsource their voice and UC infrastructure to third parties. If your managed service provider is running vulnerable Cisco UC components, your business communications become a pivot point into your environment — even if your own perimeter is airtight.&lt;/p&gt;

&lt;p&gt;This is the definition of inherited risk. You didn't deploy the vulnerable software. You didn't configure it. But you're exposed because you trusted someone who did.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to protect yourself
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Apply Cisco's emergency patch immediately (see &lt;a href="https://tools.cisco.com/security/center/publicationListing.x" rel="noopener noreferrer"&gt;Cisco Security Advisory cisco-sa-20260115-uc&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Implement continuous vendor monitoring — when a critical advisory drops, you need instant visibility into which of your vendors is exposed&lt;/li&gt;
&lt;li&gt;Restrict management interface access to known IP ranges only&lt;/li&gt;
&lt;li&gt;Map which applications and data flows depend on your vendors' UC components so you can assess blast radius before an attack, not after&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  February 2026 — GitHub Actions: The Misconfiguration That Started Everything
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What got compromised
&lt;/h3&gt;

&lt;p&gt;This is the origin point of the largest multi-tool supply chain campaign of 2026. A threat actor operating under the GitHub handle &lt;strong&gt;hackerbot-claw&lt;/strong&gt; (account created February 20, 2026) ran an automated campaign scanning public repositories for a specific GitHub Actions misconfiguration: the &lt;code&gt;pull_request_target&lt;/code&gt; event trigger with excessive token permissions.&lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;February 27–28&lt;/strong&gt;, the attacker successfully exploited this misconfiguration in Aqua Security's Trivy repository, exfiltrating the &lt;code&gt;aqua-bot&lt;/code&gt; service account's Personal Access Token (PAT). This PAT had write access to release automation — which is everything the attacker needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it happened
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;pull_request_target&lt;/code&gt; event is a GitHub Actions trigger that runs a workflow in the context of the base repository when a pull request arrives, including pull requests from forks. That context is what makes it risky: the run can read repository secrets and receives a token that can be granted write access. If such a workflow also checks out and executes code from the pull request, untrusted code runs with those credentials. The workflow essentially hands an external contributor the keys to your pipeline.&lt;/p&gt;

&lt;p&gt;Aqua detected the intrusion and attempted credential rotation. But here's the critical failure: &lt;strong&gt;the rotation was not atomic&lt;/strong&gt;. Sequential token replacement left a window during which newly issued tokens may have been captured. As Aqua's VP of Open Source, Itay Shakury, later confirmed:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"We rotated secrets and tokens, but the process wasn't atomic, and attackers may have been privy to refreshed tokens."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This residual access enabled everything that followed in March.&lt;/p&gt;

&lt;h3&gt;
  
  
  The lesson about &lt;code&gt;pull_request_target&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is a well-documented dangerous pattern, but it keeps getting deployed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ⚠️ DANGEROUS — external PRs can access your secrets&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request_target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;opened&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ci&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;  &lt;span class="c1"&gt;# ← This is the mistake&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ✅ SAFER: pull_request runs fork PRs without secrets; keep permissions read-only&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ci&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;  &lt;span class="c1"&gt;# minimum required&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  How to protect yourself
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Never&lt;/strong&gt; use &lt;code&gt;pull_request_target&lt;/code&gt; with write permissions for workflows triggered by external contributors&lt;/li&gt;
&lt;li&gt;Pin all GitHub Actions to full 40-character commit SHAs — not version tags (more on why this matters below)&lt;/li&gt;
&lt;li&gt;Rotate credentials atomically — revoke all, reissue all, in a single synchronized operation&lt;/li&gt;
&lt;li&gt;Limit service account tokens to minimum required permissions and scope&lt;/li&gt;
&lt;/ul&gt;
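&lt;p&gt;The first rule above is easy to check mechanically. Here's a minimal sketch in Python. It's a text-level heuristic, not a YAML parser and not an official GitHub tool: the function name &lt;code&gt;audit_workflow&lt;/code&gt; and the three patterns are my own assumptions for illustration. It flags any workflow that uses &lt;code&gt;pull_request_target&lt;/code&gt;, then reports write permissions and references to the pull request head.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re
import sys
from pathlib import Path

# Heuristic patterns (assumptions for illustration, not an official linter).
TRIGGER = re.compile(r"\bpull_request_target\b")
WRITE_SCOPE = re.compile(r"^\s*[\w-]+\s*:\s*write\b", re.MULTILINE)
PR_HEAD = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def audit_workflow(text):
    """Return a list of findings for the contents of one workflow file."""
    text = re.sub(r"#.*", "", text)  # crude comment stripping
    if not TRIGGER.search(text):
        return []
    findings = ["uses pull_request_target"]
    if WRITE_SCOPE.search(text):
        findings.append("grants a write permission")
    if PR_HEAD.search(text):
        findings.append("references the pull request head (untrusted code)")
    return findings


if __name__ == "__main__":
    # Usage: python audit_workflows.py .github/workflows/*.yml
    for name in sys.argv[1:]:
        for finding in audit_workflow(Path(name).read_text()):
            print(f"{name}: {finding}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A finding is a prompt for review, not proof of a vulnerability. Some workflows use &lt;code&gt;pull_request_target&lt;/code&gt; safely because they never run code from the pull request.&lt;/p&gt;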




&lt;h2&gt;
  
  
  March 2026 — The Month Everything Went Wrong
&lt;/h2&gt;

&lt;p&gt;March 2026 will go down as the most significant month in software supply chain history. Five major incidents. One threat group behind most of them. A cascade that went from a misconfigured GitHub workflow to a ransomware operation targeting 1,000+ enterprise SaaS environments.&lt;/p&gt;

&lt;p&gt;Let me break each one down.&lt;/p&gt;




&lt;h3&gt;
  
  
  🔍 Trivy (Aqua Security) — March 19–20, 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;CVE-2026-33634 | Severity: CRITICAL&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What happened
&lt;/h4&gt;

&lt;p&gt;At approximately 17:43 UTC on March 19, 2026, an attacker with residual access from the February compromise force-pushed malicious code to &lt;strong&gt;75 of 77 version tags&lt;/strong&gt; in &lt;code&gt;aquasecurity/trivy-action&lt;/code&gt; — the official GitHub Action for Trivy, one of the most widely deployed open-source vulnerability scanners in the world.&lt;/p&gt;

&lt;p&gt;Simultaneously, all 7 tags in &lt;code&gt;aquasecurity/setup-trivy&lt;/code&gt; were poisoned, and a weaponized Trivy binary (v0.69.4) was published to GitHub Releases, Docker Hub, GHCR, ECR Public, and deb/rpm repositories.&lt;/p&gt;

&lt;p&gt;Safe versions: only &lt;code&gt;trivy-action v0.35.0&lt;/code&gt;, &lt;code&gt;setup-trivy v0.2.6&lt;/code&gt;, and &lt;code&gt;trivy v0.69.3&lt;/code&gt; were unaffected.&lt;/p&gt;

&lt;h4&gt;
  
  
  The attack was elegant and terrifying
&lt;/h4&gt;

&lt;p&gt;The malicious &lt;code&gt;entrypoint.sh&lt;/code&gt; ran the credential-harvesting payload &lt;strong&gt;first&lt;/strong&gt;, then ran the legitimate Trivy scan. Workflows completed normally. No errors. No indication of compromise. Developers watching their CI logs saw a clean vulnerability scan — while their secrets were being exfiltrated in the background.&lt;/p&gt;

&lt;p&gt;The malware (named "TeamPCP Cloud Stealer") performed three operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Dumped &lt;code&gt;Runner.Worker&lt;/code&gt; process memory to extract GitHub PATs and CI secrets&lt;/li&gt;
&lt;li&gt;Swept SSH keys, cloud credentials (AWS, GCP, Azure), Kubernetes tokens, Docker configs, Git credentials&lt;/li&gt;
&lt;li&gt;Encrypted the bundle with AES-256 + RSA-4096 and exfiltrated to attacker-controlled servers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the primary C2 channel failed, the malware fell back to &lt;strong&gt;creating a repository called &lt;code&gt;tpcp-docs&lt;/code&gt; inside the victim's own GitHub organization&lt;/strong&gt; to store stolen secrets. Check your org for that repo right now.&lt;/p&gt;

&lt;h4&gt;
  
  
  The forensic tells (that most teams missed)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Each malicious commit had an impossible timestamp:&lt;/span&gt;
&lt;span class="gh"&gt;# - Claimed to be from 2021/2022&lt;/span&gt;
&lt;span class="gh"&gt;# - But parent commit was dated March 2026&lt;/span&gt;

&lt;span class="gh"&gt;# Additionally:&lt;/span&gt;
&lt;span class="gh"&gt;# - Only entrypoint.sh was modified per commit&lt;/span&gt;
&lt;span class="gh"&gt;# - Original commits touched multiple files&lt;/span&gt;
&lt;span class="gh"&gt;# - GitHub's "Immutable" release badge was present (but meaningless)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
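&lt;p&gt;The first tell, a commit that claims to be older than its own parent, can be checked with a few lines of Python. This is a sketch under assumptions: it compares committer timestamps (&lt;code&gt;%ct&lt;/code&gt;) only, and the helper names &lt;code&gt;parse_log&lt;/code&gt;, &lt;code&gt;find_backdated&lt;/code&gt;, and &lt;code&gt;scan_repo&lt;/code&gt; are mine. Clock skew and rewritten history can produce the same pattern in honest repositories, so treat hits as leads.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import operator
import subprocess


def parse_log(output):
    """Parse `git log --format=%H|%ct|%P` output into {sha: (timestamp, [parents])}."""
    commits = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, timestamp, parents = line.split("|")
        commits[sha] = (int(timestamp), parents.split())
    return commits


def find_backdated(commits):
    """Return SHAs whose committer date is earlier than one of their parents' dates."""
    flagged = []
    for sha, (timestamp, parents) in commits.items():
        for parent in parents:
            # operator.lt(a, b) is True when a is strictly earlier than b.
            if parent in commits and operator.lt(timestamp, commits[parent][0]):
                flagged.append(sha)
                break
    return flagged


def scan_repo(path="."):
    """Run git in `path` and return the SHAs of suspicious commits."""
    output = subprocess.run(
        ["git", "-C", path, "log", "--all", "--format=%H|%ct|%P"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_backdated(parse_log(output))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run &lt;code&gt;scan_repo()&lt;/code&gt; inside a clone of the action you depend on and review whatever it returns by hand.&lt;/p&gt;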



&lt;h4&gt;
  
  
  How to protect yourself
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ❌ VULNERABLE — tag can be rewritten silently&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@v0.34.2&lt;/span&gt;

&lt;span class="c1"&gt;# ✅ SECURE — commit SHA is immutable&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@f781cce5aab226378d021711787766a7d423d18d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;If you ran Trivy between 17:43 and 23:13 UTC on March 19, 2026:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search your GitHub org for any repo named &lt;code&gt;tpcp-docs&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Check DNS/network logs for connections to &lt;code&gt;scan.aquasecurtiy[.]org&lt;/code&gt; (note the typo — deliberate)&lt;/li&gt;
&lt;li&gt;Check for connections to &lt;code&gt;45.148.10.212&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Treat all CI/CD secrets from that window as fully compromised — rotate everything&lt;/li&gt;
&lt;/ul&gt;
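&lt;p&gt;If your DNS or proxy logs are plain text, a small Python sketch like this one can sweep them for the two indicators above. The function name &lt;code&gt;find_ioc_hits&lt;/code&gt; and the log format are assumptions; it's a plain substring match, so adapt it to however your logs are actually stored.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re
import sys

# Indicators from the list above; the domain is written without the
# defanging brackets so it matches real log entries.
IOCS = ["scan.aquasecurtiy.org", "45.148.10.212"]
PATTERN = re.compile("|".join(re.escape(ioc) for ioc in IOCS), re.IGNORECASE)


def find_ioc_hits(lines):
    """Return (line_number, indicator, line) for each indicator found in a log."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in PATTERN.finditer(line):
            hits.append((number, match.group(0).lower(), line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Usage: python ioc_sweep.py dns.log proxy.log
    for name in sys.argv[1:]:
        with open(name, errors="replace") as handle:
            for number, ioc, line in find_ioc_hits(handle):
                print(f"{name}:{number}: {ioc}: {line}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;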




&lt;h3&gt;
  
  
  🧠 LiteLLM — March 24, 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Severity: CRITICAL | ~3.4M daily downloads | 40-minute exposure window&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What happened
&lt;/h4&gt;

&lt;p&gt;LiteLLM is a Python package providing a unified interface for 100+ LLM APIs — OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI. Because it sits between your applications and multiple AI providers, it has access to API keys and cloud credentials for all of them. That's exactly why it was targeted.&lt;/p&gt;

&lt;p&gt;The compromise was a cascade from Trivy: LiteLLM's CI/CD pipeline used Trivy for security scanning. When Trivy was poisoned on March 19, the malware in LiteLLM's pipeline exfiltrated its &lt;strong&gt;PyPI publish token&lt;/strong&gt; to TeamPCP. Five days later, attackers used that token to upload two malicious versions directly to PyPI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;litellm==1.82.7&lt;/code&gt; — published 10:39 UTC&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;litellm==1.82.8&lt;/code&gt; — published 10:52 UTC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both were live for approximately &lt;strong&gt;40 minutes&lt;/strong&gt; before PyPI quarantined them. During that window, they accumulated tens of thousands of downloads.&lt;/p&gt;

&lt;h4&gt;
  
  
  The 3-stage payload
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Stage 1: Credential Harvesting
# Exfiltrates to models.litellm.cloud (attacker-controlled, not official BerriAI domain)
&lt;/span&gt;&lt;span class="nf"&gt;collect&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LLM API keys (OpenAI, Anthropic, Google...)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cloud credentials (AWS, GCP, Azure)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SSH keys, shell history, .env files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Crypto wallets&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Kubernetes configs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Stage 2: Kubernetes Lateral Movement
# Deploys privileged DaemonSets → full cluster access
&lt;/span&gt;
&lt;span class="c1"&gt;# Stage 3: Persistence
# Installs ~/.config/systemd/user/sysmon.service
# Polls attacker server for additional payloads
# Survives package removal
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;.pth&lt;/code&gt; file mechanism in &lt;code&gt;v1.82.8&lt;/code&gt; was particularly nasty: it placed a &lt;code&gt;litellm_init.pth&lt;/code&gt; file that executed on &lt;strong&gt;every Python interpreter startup&lt;/strong&gt; — meaning the payload fired even when LiteLLM wasn't explicitly imported.&lt;/p&gt;
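&lt;p&gt;This works because Python's &lt;code&gt;site&lt;/code&gt; module executes any line in a &lt;code&gt;.pth&lt;/code&gt; file that starts with &lt;code&gt;import&lt;/code&gt; followed by a space or tab. Here's a rough Python sketch that lists such lines across your site-packages directories. The function names are my own. Expect some legitimate hits too, since packaging tools ship &lt;code&gt;.pth&lt;/code&gt; files with import lines, so this is a review list and not a verdict.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import site
from pathlib import Path


def executable_pth_lines(pth_path):
    """Return the lines of a .pth file that the site module would execute."""
    lines = Path(pth_path).read_text(errors="replace").splitlines()
    return [line for line in lines if line.startswith(("import ", "import\t"))]


def scan_site_packages(directories=None):
    """Map each .pth file containing executable lines to those lines."""
    if directories is None:
        directories = set(site.getsitepackages() + [site.getusersitepackages()])
    report = {}
    for directory in directories:
        if not Path(directory).is_dir():
            continue
        for pth in sorted(Path(directory).glob("*.pth")):
            hits = executable_pth_lines(pth)
            if hits:
                report[str(pth)] = hits
    return report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Call &lt;code&gt;scan_site_packages()&lt;/code&gt; with no arguments to check the current interpreter, and look for a &lt;code&gt;litellm_init.pth&lt;/code&gt; entry in particular.&lt;/p&gt;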

&lt;h4&gt;
  
  
  Disclosure suppression
&lt;/h4&gt;

&lt;p&gt;When the community opened GitHub issue #24512 to report the compromise, TeamPCP posted &lt;strong&gt;88 bot comments from 73 unique compromised developer accounts in a 102-second window&lt;/strong&gt; to spam the thread. They then used a compromised maintainer account to close the issue as "not planned." This is one of the first documented uses of AI-assisted bot networks to suppress disclosure of a supply chain attack.&lt;/p&gt;

&lt;h4&gt;
  
  
  Immediate action
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check if you're affected&lt;/span&gt;
pip show litellm | &lt;span class="nb"&gt;grep &lt;/span&gt;Version
&lt;span class="c"&gt;# v1.82.7 or v1.82.8 = COMPROMISED&lt;/span&gt;

&lt;span class="c"&gt;# Check for persistence&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; ~/.config/systemd/user/sysmon.service
&lt;span class="nb"&gt;ls&lt;/span&gt; ~/.config/sysmon/sysmon.py

&lt;span class="c"&gt;# In Kubernetes&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"node-setup"&lt;/span&gt;

&lt;span class="c"&gt;# Purge cache&lt;/span&gt;
pip cache purge
&lt;span class="c"&gt;# or&lt;/span&gt;
&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ~/.cache/uv

&lt;span class="c"&gt;# Safe version&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;&lt;span class="nv"&gt;litellm&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;1.82.6
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  📦 Axios (npm) — March 30–31, 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Severity: CRITICAL | ~100M weekly downloads | Attributed: UNC1069 (North Korea)&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What happened
&lt;/h4&gt;

&lt;p&gt;Axios is one of the most depended-upon libraries in the JavaScript ecosystem. At the time of the attack, it was present in approximately 80% of cloud and code environments. The attack didn't exploit any code vulnerability — it was a straightforward account takeover.&lt;/p&gt;

&lt;p&gt;Attackers took over the npm account of &lt;strong&gt;jasonsaayman&lt;/strong&gt;, Axios's primary maintainer, and changed its associated email from &lt;code&gt;jasonsaayman@gmail.com&lt;/code&gt; to &lt;code&gt;ifstap@proton.me&lt;/code&gt;. Publishing directly from the hijacked account bypassed the project's GitHub Actions OIDC publish flow entirely.&lt;/p&gt;

&lt;p&gt;The attack timeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2026-03-30 05:57 UTC — plain-crypto-js@4.2.0 published (clean decoy, builds registry history)
2026-03-30 23:59 UTC — plain-crypto-js@4.2.1 published (malicious postinstall backdoor)
2026-03-31 00:21 UTC — axios@1.14.1 published (MALICIOUS, tagged: latest)
2026-03-31 01:00 UTC — axios@0.30.4 published (MALICIOUS, tagged: legacy)
2026-03-31 03:29 UTC — Detected and removed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Two malicious versions, 39 minutes apart. One tagged &lt;code&gt;latest&lt;/code&gt;, one tagged &lt;code&gt;legacy&lt;/code&gt;. Roughly three hours before removal.&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  The payload
&lt;/h4&gt;

&lt;p&gt;The malicious dependency &lt;code&gt;plain-crypto-js&lt;/code&gt; contained a &lt;code&gt;postinstall&lt;/code&gt; hook that silently downloaded and executed platform-specific stage-2 RAT implants from &lt;code&gt;sfrclak[.]com:8000&lt;/code&gt;. Cross-platform: macOS, Windows, Linux.&lt;/p&gt;

&lt;p&gt;Google's Threat Intelligence Group attributed this to &lt;strong&gt;UNC1069&lt;/strong&gt;, a financially motivated North Korean threat actor. OpenAI was sufficiently exposed via Axios's dependency chain that it revoked its macOS code-signing certificate on March 31, 2026 as a precaution.&lt;/p&gt;

&lt;h4&gt;
  
  
  Check your lockfiles now
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check for compromised versions&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"axios.*(1&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;14&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;1|0&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;30&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;4)"&lt;/span&gt; package-lock.json
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"plain-crypto-js"&lt;/span&gt; package-lock.json yarn.lock bun.lockb

&lt;span class="c"&gt;# Safe versions&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;axios@1.14.0  &lt;span class="c"&gt;# Last legitimate 1.x with SLSA provenance&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
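&lt;p&gt;A text search can miss nested copies, so here's a sketch in Python that reads the lockfile as JSON instead. It assumes a &lt;code&gt;package-lock.json&lt;/code&gt; with a top-level &lt;code&gt;packages&lt;/code&gt; map (lockfileVersion 2 or 3). The function names are mine, and treating every version of &lt;code&gt;plain-crypto-js&lt;/code&gt; as suspect is an assumption that mirrors the search above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import sys

# Versions named in this section as malicious. None means "flag any version"
# (an assumption for plain-crypto-js, matching the search above).
BAD = {"axios": {"1.14.1", "0.30.4"}, "plain-crypto-js": None}


def package_name(lock_key):
    """Return the package name for a key such as "node_modules/a/node_modules/@s/b"."""
    return lock_key.rpartition("node_modules/")[2]


def find_compromised(lock):
    """Return sorted (name, version, lock_key) tuples for known-bad entries."""
    hits = []
    for key, meta in lock.get("packages", {}).items():
        name = package_name(key)
        if name not in BAD:
            continue
        version = meta.get("version", "")
        if BAD[name] is None or version in BAD[name]:
            hits.append((name, version, key))
    return sorted(hits)


if __name__ == "__main__":
    # Usage: python check_lock.py package-lock.json
    for name in sys.argv[1:]:
        with open(name) as handle:
            for package, version, key in find_compromised(json.load(handle)):
                print(f"{name}: {package}@{version} at {key}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;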



&lt;h4&gt;
  
  
  How to protect yourself
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Enable phishing-resistant MFA on npm, GitHub, and all cloud platforms — no exceptions&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;npm ci&lt;/code&gt; with strict lockfiles instead of &lt;code&gt;npm install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Monitor npm for maintainer email changes on critical dependencies&lt;/li&gt;
&lt;li&gt;Audit and block postinstall scripts in CI environments where possible&lt;/li&gt;
&lt;li&gt;Never run &lt;code&gt;npm install&lt;/code&gt; without a pinned lockfile on production systems or ephemeral CI runners&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🤖 Anthropic Claude Code — March 31, 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Severity: HIGH | ~512,000 lines of proprietary source code | Root cause: Human error&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What happened
&lt;/h4&gt;

&lt;p&gt;This one is different from the others — it wasn't a malicious actor compromising a third party. Anthropic accidentally shipped the &lt;strong&gt;entire source code of Claude Code&lt;/strong&gt; to the public npm registry.&lt;/p&gt;

&lt;p&gt;When Anthropic published &lt;code&gt;@anthropic-ai/claude-code&lt;/code&gt; version 2.1.88, a missing exclusion rule in the build configuration caused a 59.8 MB JavaScript source map file (&lt;code&gt;cli.js.map&lt;/code&gt;) to be bundled into the package. That source map pointed to a zip archive on Anthropic's Cloudflare R2 storage containing the full, unobfuscated TypeScript source — &lt;strong&gt;512,000 lines across 1,906 files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Security researcher Chaofan Shou spotted it on X within hours. By the time Anthropic pulled the package at ~08:00 UTC, the code had been downloaded from their own cloud storage, mirrored to GitHub, and forked tens of thousands of times.&lt;/p&gt;

&lt;h4&gt;
  
  
  What was exposed
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Complete multi-agent orchestration architecture&lt;/li&gt;
&lt;li&gt;Self-healing memory system (&lt;code&gt;MEMORY.md&lt;/code&gt; architecture with lazy-load topic files)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Undercover Mode"&lt;/strong&gt; — suppresses Anthropic-internal metadata in commits to public repos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-distillation controls&lt;/strong&gt; — injects fake tool definitions into API responses to poison competitor training data&lt;/li&gt;
&lt;li&gt;44 feature flags, including an unreleased Tamagotchi easter egg planned for April 1–7&lt;/li&gt;
&lt;li&gt;Bidirectional CLI-to-IDE communication layer&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The cascading danger
&lt;/h4&gt;

&lt;p&gt;By pure coincidence, the leak overlapped with the Axios RAT attack. Anyone who updated Claude Code via npm between &lt;strong&gt;00:21 and 03:29 UTC on March 31&lt;/strong&gt; may have simultaneously pulled a trojanized version of Axios.&lt;/p&gt;

&lt;p&gt;Additionally, attackers immediately registered npm packages mimicking Anthropic's internal tooling (&lt;code&gt;audio-capture-napi&lt;/code&gt;, &lt;code&gt;color-diff-napi&lt;/code&gt;, &lt;code&gt;image-processor-napi&lt;/code&gt;) to stage dependency confusion attacks against developers trying to compile the leaked source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do not download, fork, build, or run any GitHub repository claiming to be "leaked Claude Code."&lt;/strong&gt; Many of these repositories are active malware lures delivering Vidar Stealer and GhostSocks.&lt;/p&gt;

&lt;h4&gt;
  
  
  Anthropic's official statement
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Earlier today, a Claude Code release included some internal source code. No sensitive customer data or credentials were involved or exposed. This was a release packaging issue caused by human error, not a security breach. We're rolling out measures to prevent this from happening again."&lt;/em&gt;&lt;br&gt;
— Anthropic Spokesperson&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  What this means for your build pipeline
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# The failure point: Bun generates source maps by default.&lt;/span&gt;
&lt;span class="gh"&gt;# A single missing line in build config exposed 512K lines of IP.&lt;/span&gt;

&lt;span class="gh"&gt;# Lesson: Add this to your CI/CD pre-publish checklist:&lt;/span&gt;
✓ If you rely on .npmignore (a denylist), verify it excludes &lt;span class="err"&gt;*&lt;/span&gt;.map files
✓ Prefer the &lt;span class="sb"&gt;`files`&lt;/span&gt; field in package.json (an allowlist) over a denylist
✓ Run &lt;span class="sb"&gt;`npm pack --dry-run`&lt;/span&gt; and inspect the manifest before every publish
✓ Set up automated secret/source scanning on all npm publish workflows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
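&lt;p&gt;The &lt;code&gt;npm pack --dry-run&lt;/code&gt; step is the one most worth automating. Here's a minimal Python sketch of a pre-publish gate. It assumes the &lt;code&gt;--json&lt;/code&gt; report is a list of objects that each carry a &lt;code&gt;files&lt;/code&gt; list of entries with a &lt;code&gt;path&lt;/code&gt; key, which is what recent npm versions print. The blocked suffix list and the function names are my own choices, so adjust them for your project.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import subprocess

# File suffixes that should rarely ship in a published package
# (an assumption; adjust for your project).
BLOCKED_SUFFIXES = (".map", ".env", ".pem")


def blocked_files(pack_report):
    """Return the paths in an npm pack JSON report that match BLOCKED_SUFFIXES."""
    paths = [
        entry["path"]
        for package in pack_report
        for entry in package.get("files", [])
    ]
    return sorted(path for path in paths if path.endswith(BLOCKED_SUFFIXES))


def check_package(directory="."):
    """Ask npm what it would publish from `directory` and return blocked paths."""
    result = subprocess.run(
        ["npm", "pack", "--dry-run", "--json"],
        cwd=directory, capture_output=True, text=True, check=True,
    )
    return blocked_files(json.loads(result.stdout))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In CI, fail the publish job whenever &lt;code&gt;check_package()&lt;/code&gt; returns a non-empty list.&lt;/p&gt;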






&lt;h2&gt;
  
  
  April 2026 — The Attacks Keep Coming
&lt;/h2&gt;




&lt;h3&gt;
  
  
  ▲ Vercel — April 19, 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Severity: CRITICAL | Entry point: AI productivity tool | Dwell time: ~2 months&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What happened
&lt;/h4&gt;

&lt;p&gt;This attack is a masterclass in how OAuth trust relationships create invisible lateral movement paths.&lt;/p&gt;

&lt;p&gt;The chain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;February 2026&lt;/strong&gt;: A Context.ai employee downloaded Roblox game exploit scripts. Those scripts installed &lt;strong&gt;Lumma Stealer&lt;/strong&gt; malware.&lt;/li&gt;
&lt;li&gt;Lumma Stealer exfiltrated the employee's Google Workspace OAuth tokens.&lt;/li&gt;
&lt;li&gt;Context.ai's Chrome Extension had been granted full Google Drive read access by users during onboarding.&lt;/li&gt;
&lt;li&gt;A Vercel enterprise employee had used Context.ai and connected their Vercel Google account.&lt;/li&gt;
&lt;li&gt;Attackers pivoted from the stolen tokens → Context.ai's AWS environment → OAuth tokens for their product → the Vercel employee's workspace → Vercel's internal systems.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Vercel disclosed the breach on &lt;strong&gt;April 19, 2026&lt;/strong&gt;. By then, the attacker had approximately 2 months of dwell time. Vercel's CEO Guillermo Rauch confirmed the attack chain publicly on X and named Context.ai as the compromised third party.&lt;/p&gt;

&lt;p&gt;The stolen Vercel internal database was listed for sale at &lt;strong&gt;$2 million on BreachForums&lt;/strong&gt; by ShinyHunters.&lt;/p&gt;

&lt;h4&gt;
  
  
  The env variable problem
&lt;/h4&gt;

&lt;p&gt;Vercel's environment variable model left variables not explicitly marked as "sensitive" unencrypted at rest. Once an attacker had team-scoped OAuth access, they could read all non-sensitive environment variables — connection strings, API keys, third-party service credentials — stored by developers who assumed they were protected.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key takeaway for developers
&lt;/h4&gt;

&lt;p&gt;You can have perfect security in your own systems and still get breached because an AI productivity tool you gave full Drive access to got compromised via an employee who downloaded Roblox scripts.&lt;/p&gt;

&lt;p&gt;This is the supply chain threat model in its purest form. The attack surface is no longer just your code — it's every OAuth permission you've ever granted.&lt;/p&gt;

&lt;h4&gt;
  
  
  How to protect yourself
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Immediate actions:
✓ Audit all OAuth app permissions in your Google Workspace — revoke apps with excessive access
✓ Mark ALL Vercel environment variables as "sensitive" explicitly (not just secrets)
✓ Query database connection logs for IPs outside known egress ranges, Feb–Apr 2026 window
✓ Rotate all API keys and secrets stored in Vercel project environment variables

Systemic changes:
✓ Never grant AI tools full-read workspace access — use scoped permissions
✓ Implement OAuth token monitoring to detect abnormal access patterns
✓ Treat third-party AI tools with the same vendor risk assessment as any SaaS platform
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🔐 Bitwarden CLI — April 22, 2026
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Severity: CRITICAL | Window: 90 minutes | Notable: First supply chain attack targeting AI coding tools&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What happened
&lt;/h4&gt;

&lt;p&gt;The Shai-Hulud worm's "Third Coming." At 5:57 PM ET on April 22, 2026, attackers published &lt;code&gt;@bitwarden/cli@2026.4.0&lt;/code&gt; — a malicious version of the CLI tool for the world's most popular open-source password manager (10M+ users, 50,000 business customers). By 7:30 PM ET, it was gone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;90 minutes.&lt;/strong&gt; That's the entire attack window.&lt;/p&gt;

&lt;p&gt;The attack vector: Bitwarden's repository uses &lt;code&gt;checkmarx/ast-github-action&lt;/code&gt; — one of the GitHub Actions compromised in the ongoing Checkmarx supply chain campaign (also attributed to TeamPCP). Attackers hijacked Bitwarden's CI/CD pipeline, editing the &lt;code&gt;publish-cli.yml&lt;/code&gt; workflow five consecutive times to inject a prebuilt malicious tarball containing the payload &lt;code&gt;bw1.js&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Bitwarden confirmed: no user vault data was accessed. The web extension, desktop apps, and all other clients were unaffected. Only the CLI npm package was compromised.&lt;/p&gt;

&lt;h4&gt;
  
  
  The payload was remarkable
&lt;/h4&gt;

&lt;p&gt;The malware targeted &lt;strong&gt;six distinct credential surfaces&lt;/strong&gt; and introduced two novel capabilities:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Credential targets:&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;AWS access keys + SSM/Secrets Manager&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Azure credentials + Key Vault&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;GCP service account keys + Secret Manager&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;GitHub PATs + npm publish tokens&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SSH keys + shell history + .env files&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;AI coding assistant configurations&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;// ← NEW in 2026&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;// Novel capability 1: AI tool targeting&lt;/span&gt;
&lt;span class="c1"&gt;// Explicitly probed for: Claude, Cursor, Codex CLI, Aider&lt;/span&gt;
&lt;span class="c1"&gt;// If authenticated session found → extract credentials + inject persistence&lt;/span&gt;

&lt;span class="c1"&gt;// Novel capability 2: Self-propagating worm&lt;/span&gt;
&lt;span class="c1"&gt;// Uses victim's npm publish tokens to backdoor ALL packages they can publish to&lt;/span&gt;
&lt;span class="c1"&gt;// Exfiltrates to public GitHub repos (RSA-encrypted) as dead-drop C2&lt;/span&gt;
&lt;span class="c1"&gt;// GitHub traffic not flagged by security tools → effective evasion&lt;/span&gt;

&lt;span class="c1"&gt;// Kill switch: skips if Russian locale detected&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  This changes the threat model for AI coding tools
&lt;/h4&gt;

&lt;p&gt;The Bitwarden CLI attack — combined with the Vercel breach via Context.ai — confirms a clear pattern that security teams need to internalize:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AI coding tools (Claude, Cursor, Copilot, Aider) sit at the intersection of everything attackers want: source code access, command execution, API credentials, and cloud service connections.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These tools are now explicitly named in supply chain attack malware. Your AI coding assistant's authentication state is a credential worth stealing.&lt;/p&gt;

&lt;h4&gt;
  
  
  Immediate response
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check if affected (installed between 5:57–7:30 PM ET, April 22)&lt;/span&gt;
npm list &lt;span class="nt"&gt;-g&lt;/span&gt; @bitwarden/cli  &lt;span class="c"&gt;# 2026.4.0 = COMPROMISED (drop -g for a project-local install)&lt;/span&gt;

&lt;span class="c"&gt;# Clean install&lt;/span&gt;
npm uninstall &lt;span class="nt"&gt;-g&lt;/span&gt; @bitwarden/cli
npm cache clean &lt;span class="nt"&gt;--force&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @bitwarden/cli@2026.4.1  &lt;span class="c"&gt;# verified clean&lt;/span&gt;

&lt;span class="c"&gt;# Find dropped payload files&lt;/span&gt;
find / &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"bw1.js"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"bw_setup.js"&lt;/span&gt; 2&amp;gt;/dev/null

&lt;span class="c"&gt;# Search for data exfil repos&lt;/span&gt;
&lt;span class="c"&gt;# Check public GitHub for repos containing: "Shai-Hulud: The Third Coming"&lt;/span&gt;

&lt;span class="c"&gt;# Rotate if affected:&lt;/span&gt;
&lt;span class="c"&gt;# → GitHub PATs&lt;/span&gt;
&lt;span class="c"&gt;# → npm tokens  &lt;/span&gt;
&lt;span class="c"&gt;# → AWS access keys&lt;/span&gt;
&lt;span class="c"&gt;# → GCP service account keys&lt;/span&gt;
&lt;span class="c"&gt;# → Azure credentials&lt;/span&gt;
&lt;span class="c"&gt;# → SSH keys&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Big Picture: TeamPCP and the Campaign Architecture
&lt;/h2&gt;

&lt;p&gt;Most of the March–April attacks trace back to a single threat group: &lt;strong&gt;TeamPCP&lt;/strong&gt; (also operating as DeadCatx3, PCPcat, Persy_PCP, ShellForce, and CipherForce).&lt;/p&gt;

&lt;p&gt;TeamPCP first appeared in late December 2025 as a group focused on cloud-native infrastructure exploitation. Their 2026 campaign was methodical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Phase 1 (Feb 27–28):  Exploit pull_request_target in Trivy → steal aqua-bot PAT
Phase 2 (Mar 1):      Aqua rotates credentials → incomplete rotation
Phase 3 (Mar 19):     Use residual access → poison 75 Trivy tags + Docker images
Phase 4 (Mar 21):     Use stolen PATs from Trivy → poison KICS GitHub Actions
Phase 5 (Mar 24):     Use LiteLLM CI's Trivy → steal PyPI token → poison LiteLLM
Phase 6 (Mar 27):     Telnyx Python SDK compromised
Phase 7 (Mar 30–31):  Axios npm package poisoned (separate North Korean actor)
Phase 8 (Apr 15):     Vect ransomware lists first victim from Trivy campaign
Phase 9 (Apr 22):     Bitwarden CLI poisoned via Checkmarx GitHub Action
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The campaign spanned &lt;strong&gt;PyPI, npm, Docker Hub, GitHub Actions, and OpenVSX&lt;/strong&gt; in a single coordinated multi-ecosystem operation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Systemic Defenses: What Actually Works
&lt;/h2&gt;

&lt;p&gt;After cataloging all of this, here's what the evidence shows actually works:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Pin to commit SHAs, not version tags
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# This is the single highest-impact change you can make:&lt;/span&gt;

&lt;span class="c1"&gt;# ❌ VULNERABLE (both of these)&lt;/span&gt;
&lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;some-org/some-action@v2.0&lt;/span&gt;
&lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;some-org/some-action@main&lt;/span&gt;

&lt;span class="c1"&gt;# ✅ PINNED: a full commit SHA cannot be silently repointed (placeholder SHA shown)&lt;/span&gt;
&lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;some-org/some-action@a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



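&lt;p&gt;The 2025 tj-actions attack and the 2026 Trivy attack both succeeded because workflows referenced actions by mutable tag. A workflow pinned to a known-good commit SHA would not have picked up the rewritten tags in either case. It's a one-line change per action reference. Keep in mind that pinning protects the action reference only: it does not help if you pull a poisoned binary or container image some other way.&lt;/p&gt;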
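&lt;p&gt;Finding the references that still use tags or branches is easy to script. This Python sketch is a regex heuristic, not a YAML parser, and the name &lt;code&gt;unpinned_actions&lt;/code&gt; is my own. It returns every &lt;code&gt;uses:&lt;/code&gt; target that is not pinned to a full 40-character commit SHA, skipping local actions and Docker image references.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re
import sys
from pathlib import Path

USES = re.compile(r"^\s*-?\s*uses\s*:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text):
    """Return the `uses:` targets that are not pinned to a full commit SHA."""
    unpinned = []
    for target in USES.findall(workflow_text):
        target = target.strip("\"'")
        if target.startswith(("./", "docker://")):
            continue  # local actions and container images are out of scope
        _action, _, ref = target.partition("@")
        if not FULL_SHA.fullmatch(ref):
            unpinned.append(target)
    return unpinned


if __name__ == "__main__":
    # Usage: python check_pins.py .github/workflows/*.yml
    for name in sys.argv[1:]:
        for target in unpinned_actions(Path(name).read_text()):
            print(f"{name}: not pinned to a SHA: {target}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;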

&lt;h3&gt;
  
  
  2. Use lockfiles strictly
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# In CI/CD pipelines:&lt;/span&gt;
npm ci          &lt;span class="c"&gt;# NOT npm install&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--require-hashes&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Never allow unpinned transitive dependencies in production&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Atomic credential rotation
&lt;/h3&gt;

&lt;p&gt;When you detect a compromise and rotate credentials, the rotation must be a single synchronized operation — revoke all active tokens, generate new ones, update all consumers simultaneously. Sequential rotation leaves a window. TeamPCP exploited exactly this window in the Trivy incident.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Principle of least privilege for service accounts
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Your CI service account should not have:&lt;/span&gt;
&lt;span class="c1"&gt;# - write access to multiple repositories&lt;/span&gt;
&lt;span class="c1"&gt;# - admin access to package registries&lt;/span&gt;
&lt;span class="c1"&gt;# - broad cloud IAM roles&lt;/span&gt;

&lt;span class="c1"&gt;# It should have exactly:&lt;/span&gt;
&lt;span class="c1"&gt;# - read access to the specific repos needed for this job&lt;/span&gt;
&lt;span class="c1"&gt;# - publish access to the specific package this job publishes&lt;/span&gt;
&lt;span class="c1"&gt;# - no persistent credentials (use OIDC/short-lived tokens)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
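
&lt;p&gt;In GitHub Actions, part of this can be expressed with the &lt;code&gt;permissions&lt;/code&gt; key. The following is a minimal sketch rather than a complete publishing workflow: the workflow name, the trigger, and the placeholder step are assumptions, and the registry you publish to must separately support OIDC-based publishing for the short-lived token to be useful.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Hypothetical workflow file: .github/workflows/publish.yml
name: publish
on:
  release:
    types: [published]

# Default every job in this workflow to a read-only token
permissions:
  contents: read

jobs:
  publish:
    runs-on: ubuntu-latest
    # Job-level permissions replace the workflow-level defaults for this job
    permissions:
      contents: read    # read this repository only
      id-token: write   # allow requesting a short-lived OIDC token
    steps:
      - run: echo "build and publish steps go here"   # placeholder step
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Any scope that is not listed is set to no access, which keeps a compromised step from writing to the repository or its packages.&lt;/p&gt;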



&lt;h3&gt;
  
  
  5. Behavior-based CI monitoring
&lt;/h3&gt;

&lt;p&gt;The LiteLLM incident was first caught by a developer whose machine started stuttering: the malware's fork-bomb behavior pegged the CPU and crashed the system. That's not monitoring; that's luck.&lt;/p&gt;

&lt;p&gt;What you actually need: alerts for Python processes making &lt;strong&gt;outbound POST requests at install time&lt;/strong&gt;. Package installation should pull from PyPI; it should never POST encrypted binary payloads to external endpoints.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Alert rule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
  &lt;span class="na"&gt;process&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python (via pip subprocess)&lt;/span&gt;
  &lt;span class="na"&gt;direction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;outbound&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
  &lt;span class="na"&gt;payload&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;encrypted binary&lt;/span&gt;
  &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ALERT + BLOCK&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Audit OAuth permissions regularly
&lt;/h3&gt;

&lt;p&gt;The Vercel breach started with a productivity tool that was granted full Google Drive read access. Every OAuth integration in your organization is a potential pivot point. Audit them. Scope them to minimum required permissions. Revoke anything that hasn't been used recently.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Treat your AI coding tools as high-privilege systems
&lt;/h3&gt;

&lt;p&gt;Given that the Bitwarden CLI attack explicitly targeted Claude, Cursor, Codex, and Aider credentials, it's time to protect AI coding assistant authentication state as carefully as cloud access keys:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Don't leave AI tools authenticated in unmonitored environments&lt;/li&gt;
&lt;li&gt;Rotate AI tool API keys on the same schedule as cloud credentials&lt;/li&gt;
&lt;li&gt;Monitor for abnormal AI tool usage patterns (large data transfers, unusual API calls)&lt;/li&gt;
&lt;li&gt;Be aware that your AI coding assistant may have access to your entire codebase, your git credentials, and your cloud service connections simultaneously&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Developer's Quick Reference Checklist
&lt;/h2&gt;

&lt;p&gt;Here's a condensed action list you can use right now:&lt;/p&gt;

&lt;h3&gt;
  
  
  For your CI/CD pipelines
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☐ Pin all GitHub Actions to full commit SHAs
☐ Use npm ci / pip install --require-hashes (not npm install / pip install)  
☐ Audit pull_request_target workflows for excessive permissions
☐ Limit service account tokens to minimum required scope
☐ Enable GitHub's SHA pinning organizational policy
☐ Set up behavior-based alerts for install-time network requests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  For your dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☐ Check for plain-crypto-js in any lockfile (Axios RAT indicator)
☐ Check for litellm==1.82.7 or 1.82.8 in any Python environment
☐ Check @bitwarden/cli for version 2026.4.0 (rotate credentials if found)
☐ Search your GitHub org for repos named "tpcp-docs" (Trivy compromise indicator)
☐ Audit all GitHub Actions for recent unexpected workflow edits
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
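
&lt;p&gt;The first two checks above can be automated. Below is a rough sketch of a GitHub Actions job that fails when either indicator string appears in a lockfile or requirements file. The workflow name, the trigger, and &lt;code&gt;FULL_COMMIT_SHA&lt;/code&gt; are placeholders, and the job only inspects files committed to the repository, not installed environments.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Hypothetical workflow file: .github/workflows/ioc-scan.yml
name: ioc-scan
on: [push]

permissions:
  contents: read

jobs:
  scan-lockfiles:
    runs-on: ubuntu-latest
    steps:
      # Placeholder: replace FULL_COMMIT_SHA with a real 40-character commit SHA
      - uses: actions/checkout@FULL_COMMIT_SHA
      - name: Fail if a known indicator appears in a lockfile
        run: |
          found=0
          # Axios RAT indicator from the checklist
          if grep -rn --include="package-lock.json" --include="yarn.lock" "plain-crypto-js" .; then found=1; fi
          # LiteLLM versions from the checklist
          if grep -rnE --include="requirements*.txt" "litellm==1\.82\.[78]" .; then found=1; fi
          exit "$found"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Extend the file patterns to whatever lockfile formats your projects use. A clean result only means the strings are absent from those files.&lt;/p&gt;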



&lt;h3&gt;
  
  
  For your organization
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;☐ Audit all OAuth app permissions — revoke excessive access
☐ Mark all Vercel environment variables as "sensitive"
☐ Rotate credentials from any CI pipeline that ran Trivy on March 19, 2026
☐ Implement vendor monitoring with automated CVE-to-vendor mapping
☐ Document your complete dependency tree (can you answer: what packages did production install in the last 30 days?)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;The pattern across all of these attacks is the same: &lt;strong&gt;attackers are targeting trust, not systems.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They're not breaking through your firewall. They're getting invited through the front door — via a trusted package, a trusted OAuth app, a trusted GitHub Action, a trusted vulnerability scanner.&lt;/p&gt;

&lt;p&gt;The question isn't whether your perimeter is secure. The question is whether you know every entity you trust, what access you've granted them, and what happens to your environment if any one of them is compromised.&lt;/p&gt;

&lt;p&gt;Supply chain security in 2026 isn't a specialized discipline anymore. It's table stakes for any team that ships software.&lt;/p&gt;

&lt;p&gt;The next compromised package is already on its way to your CI pipeline. The question is whether you'll see it land.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources and Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.aquasec.com/blog/trivy-supply-chain-attack-what-you-need-to-know/" rel="noopener noreferrer"&gt;Aqua Security Trivy Incident Report&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.litellm.ai/blog/security-update-march-2026" rel="noopener noreferrer"&gt;LiteLLM Official Security Update&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.elastic.co/security-labs/axios-one-rat-to-rule-them-all" rel="noopener noreferrer"&gt;Elastic Security Labs: Axios Supply Chain Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vercel.com/kb/bulletin/vercel-april-2026-security-incident" rel="noopener noreferrer"&gt;Vercel April 2026 Security Bulletin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.bitwarden.com/t/bitwarden-statement-on-checkmarx-supply-chain-incident/96127" rel="noopener noreferrer"&gt;Bitwarden Statement on Checkmarx Supply Chain Incident&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://snyk.io/articles/trivy-github-actions-supply-chain-compromise/" rel="noopener noreferrer"&gt;Snyk: Trivy GitHub Actions Supply Chain Compromise&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ccunpacked.dev/" rel="noopener noreferrer"&gt;Claude Code Unpacked&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.endorlabs.com/learn/shai-hulud-the-third-coming" rel="noopener noreferrer"&gt;Endor Labs: Shai-Hulud Third Coming Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.group-ib.com/blog/supply-chain-attack-groups-2026/" rel="noopener noreferrer"&gt;Group-IB: Supply Chain Attack Groups 2026&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;If you're responsible for a CI/CD pipeline, share this with your team: the SHA pinning point alone is worth the read.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;All technical details sourced from public security disclosures, vendor incident reports, and independent researcher analysis.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;#security&lt;/code&gt; &lt;code&gt;#cybersecurity&lt;/code&gt; &lt;code&gt;#devops&lt;/code&gt; &lt;code&gt;#opensource&lt;/code&gt; &lt;code&gt;#supplychain&lt;/code&gt; &lt;code&gt;#javascript&lt;/code&gt; &lt;code&gt;#python&lt;/code&gt; &lt;code&gt;#npm&lt;/code&gt; &lt;code&gt;#github&lt;/code&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>The Definitive Guide to Lightweight Kubernetes: KIND, Minikube, MicroK8s, K3s, Vcluster, k0s, and RKE2 Compared</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:18:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/the-definitive-guide-to-lightweight-kubernetes-kind-minikube-microk8s-k3s-vcluster-k0s-and-3be1</link>
      <guid>https://dev.to/pendelabhargavasai/the-definitive-guide-to-lightweight-kubernetes-kind-minikube-microk8s-k3s-vcluster-k0s-and-3be1</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — There is no single "best" lightweight Kubernetes. KIND wins CI/CD, Minikube wins local dev UX, MicroK8s wins on Ubuntu, K3s wins edge and production, Vcluster wins multi-tenancy, k0s wins zero-dependency ops, and RKE2 wins enterprise compliance. This post explains why — with architecture diagrams, feature tables, and real-world guidance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ixolu1jgi9xokd9cw9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ixolu1jgi9xokd9cw9k.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Why Lightweight Kubernetes Matters&lt;/li&gt;
&lt;li&gt;The Contenders at a Glance&lt;/li&gt;
&lt;li&gt;KIND — Kubernetes IN Docker&lt;/li&gt;
&lt;li&gt;Minikube - The Developer's Workhorse&lt;/li&gt;
&lt;li&gt;MicroK8s - Zero-Ops by Canonical&lt;/li&gt;
&lt;li&gt;K3s - Production-Grade at the Edge&lt;/li&gt;
&lt;li&gt;Vcluster — Kubernetes Inside Kubernetes&lt;/li&gt;
&lt;li&gt;k0s - Zero Dependencies, Zero Friction&lt;/li&gt;
&lt;li&gt;RKE2 - Security-First Enterprise K8s&lt;/li&gt;
&lt;li&gt;Scoring Across 8 Dimensions&lt;/li&gt;
&lt;li&gt;Use Case Decision Guide&lt;/li&gt;
&lt;li&gt;Final Verdict&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Why Lightweight Kubernetes Matters
&lt;/h2&gt;

&lt;p&gt;Full-fat Kubernetes — the kind you run on a 3-master, 6-worker production cluster — is extraordinary infrastructure. It is also deeply impractical when you need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spin up a throwaway cluster in a GitHub Actions runner in under 30 seconds&lt;/li&gt;
&lt;li&gt;Run Kubernetes on a Raspberry Pi with 1 GB of RAM&lt;/li&gt;
&lt;li&gt;Give every developer on your team their own isolated cluster without buying new hardware&lt;/li&gt;
&lt;li&gt;Deploy to a factory floor where the "server" is an ARM SBC with no internet access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Kubernetes ecosystem responded by producing a rich family of lightweight distributions, each making different trade-offs. By 2025, the major players are &lt;strong&gt;KIND&lt;/strong&gt;, &lt;strong&gt;Minikube&lt;/strong&gt;, &lt;strong&gt;MicroK8s&lt;/strong&gt;, &lt;strong&gt;K3s&lt;/strong&gt;, &lt;strong&gt;Vcluster&lt;/strong&gt;, &lt;strong&gt;k0s&lt;/strong&gt;, and &lt;strong&gt;RKE2&lt;/strong&gt; — and choosing between them is genuinely consequential.&lt;/p&gt;

&lt;p&gt;This guide gives you the full picture: architecture, components, features, limitations, scoring, and concrete use-case guidance, all in one place.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Contenders at a Glance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Creator&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Primary Use Case&lt;/th&gt;
&lt;th&gt;Min RAM&lt;/th&gt;
&lt;th&gt;Binary Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;KIND&lt;/td&gt;
&lt;td&gt;Kubernetes SIG Testing&lt;/td&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;CI/CD testing&lt;/td&gt;
&lt;td&gt;2 GB&lt;/td&gt;
&lt;td&gt;N/A (uses Docker)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Minikube&lt;/td&gt;
&lt;td&gt;Kubernetes Community&lt;/td&gt;
&lt;td&gt;2016&lt;/td&gt;
&lt;td&gt;Local development&lt;/td&gt;
&lt;td&gt;2 GB&lt;/td&gt;
&lt;td&gt;~100 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MicroK8s&lt;/td&gt;
&lt;td&gt;Canonical (Ubuntu)&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;Ubuntu / Edge&lt;/td&gt;
&lt;td&gt;540 MB&lt;/td&gt;
&lt;td&gt;Snap package&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K3s&lt;/td&gt;
&lt;td&gt;Rancher Labs (SUSE)&lt;/td&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;Edge / Production&lt;/td&gt;
&lt;td&gt;512 MB&lt;/td&gt;
&lt;td&gt;&amp;lt; 100 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vcluster&lt;/td&gt;
&lt;td&gt;Loft Labs&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;Multi-tenancy&lt;/td&gt;
&lt;td&gt;Host-dependent&lt;/td&gt;
&lt;td&gt;Helm chart&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;k0s&lt;/td&gt;
&lt;td&gt;Mirantis&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;Zero-dependency ops&lt;/td&gt;
&lt;td&gt;1 GB&lt;/td&gt;
&lt;td&gt;~230 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RKE2&lt;/td&gt;
&lt;td&gt;Rancher (SUSE)&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;Enterprise / Compliance&lt;/td&gt;
&lt;td&gt;4 GB&lt;/td&gt;
&lt;td&gt;~300 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each of these is CNCF-compatible and capable of running real Kubernetes workloads. The differences are in &lt;em&gt;where&lt;/em&gt;, &lt;em&gt;how&lt;/em&gt;, and &lt;em&gt;at what cost&lt;/em&gt; they do it.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. KIND — Kubernetes IN Docker
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;KIND (Kubernetes IN Docker) was built by the Kubernetes SIG Testing team for one purpose: to test Kubernetes itself. Every node in a KIND cluster is a Docker container. The control plane runs in one container, worker nodes in others, and they communicate over a Docker bridge network named &lt;code&gt;kind&lt;/code&gt;, with &lt;code&gt;kindnet&lt;/code&gt; as the default CNI plugin for pod networking.&lt;/p&gt;

&lt;p&gt;There is no VM, no hypervisor, and no separate guest OS. &lt;code&gt;kindnet&lt;/code&gt; is purpose-built for this container-as-node topology. The practical effect is that &lt;strong&gt;KIND&lt;/strong&gt; clusters are fast and disposable: perfect for testing, but not meant to hold state that must outlive the cluster.&lt;/p&gt;

&lt;p&gt;Because there is no VM involved, KIND clusters start in about 30 seconds and use only Docker's existing networking and storage. You can run a dozen isolated clusters on a single laptop.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;
┌─────────────────────────────────────────────────────---┐
│                   Docker Host                          │
│                                                        │
│  ┌─────────────────────┐   ┌──────────────────────┐    │
│  │   Control Plane     │   │     Worker 1         │    │
│  │   (container)       │──▶│     (container)      │    │
│  │                     │   │                      │    │
│  │  • API Server       │   │  • kubelet           │    │
│  │  • etcd             │   │  • kube-proxy        │    │
│  │  • Scheduler        │──▶│  • Pod A  • Pod B    │    │
│  │  • Controller Mgr   │   └──────────────────────┘    │
│  │  • kindnet CNI      │     ┌──────────────────────┐  │
│  └─────────────────────┘     │     Worker 2         │  │
│                              │     (container)      │  │
│  ┌──────────────────┐        │  • kubelet + pods    │  │
│  │  Port-forwarding │        └──────────────────────┘  │
│  │  localhost:6443  │                                  │
│  └──────────────────┘                                  │
└─────────────────────────────────────────────────────---┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;kindnet&lt;/strong&gt;: Lightweight default CNI, purpose-built for KIND's container-as-node model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;etcd&lt;/strong&gt;: Full etcd running inside the control-plane container&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;containerd&lt;/strong&gt;: Container runtime inside each node container (containerd nested in Docker, not a second Docker daemon)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;kubeadm&lt;/strong&gt;: Used internally by KIND to bootstrap the cluster&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;True multi-node clusters (control plane + N workers) on a single host&lt;/li&gt;
&lt;li&gt;Custom node images — test against any Kubernetes version&lt;/li&gt;
&lt;li&gt;Rootless mode via rootless Docker/Podman&lt;/li&gt;
&lt;li&gt;IPv6 and dual-stack support&lt;/li&gt;
&lt;li&gt;Create multiple isolated clusters simultaneously&lt;/li&gt;
&lt;li&gt;Parallel cluster creation&lt;/li&gt;
&lt;li&gt;KUBECONFIG auto-export&lt;/li&gt;
&lt;li&gt;Optimised for GitHub Actions, GitLab CI, and Jenkins&lt;/li&gt;
&lt;/ul&gt;
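
&lt;p&gt;Version selection and host access are both set in the cluster config file. Here is a minimal sketch, assuming a hypothetical file named &lt;code&gt;kind-config.yaml&lt;/code&gt;. The image tag and port numbers are example values, not recommendations.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# kind-config.yaml (hypothetical file name)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    # Example tag: pick the node image for the Kubernetes version under test
    image: kindest/node:v1.29.2
    extraPortMappings:
      # Forward host port 8080 to NodePort 30080 on this node
      - containerPort: 30080
        hostPort: 8080
        protocol: TCP
  - role: worker
    image: kindest/node:v1.29.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Pass the file with &lt;code&gt;kind create cluster --config kind-config.yaml&lt;/code&gt;. A Service exposed on NodePort 30080 should then be reachable at &lt;code&gt;localhost:8080&lt;/code&gt; on the host.&lt;/p&gt;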

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install (Linux amd64; adjust the version and architecture as needed)&lt;/span&gt;
curl &lt;span class="nt"&gt;-Lo&lt;/span&gt; ./kind https://kind.sigs.k8s.io/dl/v0.22.0/kind-linux-amd64
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x ./kind &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo mv&lt;/span&gt; ./kind /usr/local/bin/kind

&lt;span class="c"&gt;# Create a single-node cluster&lt;/span&gt;
kind create cluster

&lt;span class="c"&gt;# Create a multi-node cluster from an inline config&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt; | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;
&lt;span class="c"&gt;# Delete the cluster&lt;/span&gt;
kind delete cluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Blazing fast — 30-second cluster creation, no hypervisor boot time&lt;/li&gt;
&lt;li&gt;Zero VM overhead — runs entirely inside Docker containers&lt;/li&gt;
&lt;li&gt;True multi-node topology on one host&lt;/li&gt;
&lt;li&gt;Exact Kubernetes version control via node images&lt;/li&gt;
&lt;li&gt;Perfect for ephemeral CI environments&lt;/li&gt;
&lt;li&gt;NodePort services cover most test scenarios, so no LoadBalancer setup is required&lt;/li&gt;
&lt;li&gt;Widely supported in CI platforms&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Requires Docker or Podman to be running&lt;/li&gt;
&lt;li&gt;Not production-ready under any circumstances&lt;/li&gt;
&lt;li&gt;No GPU passthrough&lt;/li&gt;
&lt;li&gt;LoadBalancer type services need MetalLB or similar&lt;/li&gt;
&lt;li&gt;Volumes are lost when the cluster is deleted&lt;/li&gt;
&lt;li&gt;No addon ecosystem&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;CI/CD pipelines&lt;/strong&gt; — specifically integration testing that needs a real multi-node Kubernetes topology without the boot time of a VM-based solution.&lt;/p&gt;
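
&lt;p&gt;As an illustration of that use case, here is a hedged sketch of a GitHub Actions job that creates a cluster, waits for the nodes, and tears the cluster down. It assumes &lt;code&gt;kind&lt;/code&gt; and &lt;code&gt;kubectl&lt;/code&gt; are already installed on the runner. The workflow and job names are made up for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Hypothetical workflow file: .github/workflows/integration.yml
name: integration
on: [pull_request]

permissions:
  contents: read

jobs:
  kind-smoke-test:
    runs-on: ubuntu-latest
    steps:
      # Assumes kind and kubectl are already on the runner's PATH
      - name: Create a throwaway cluster
        run: kind create cluster --wait 120s
      - name: Check that every node is Ready
        run: kubectl wait --for=condition=Ready nodes --all --timeout=120s
      - name: Delete the cluster
        if: always()   # clean up even when an earlier step fails
        run: kind delete cluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Real test steps would go between the readiness check and the cleanup step.&lt;/p&gt;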




&lt;h2&gt;
  
  
  2. Minikube - The Developer's Workhorse
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;Minikube is the original local Kubernetes project, released in 2016 and still the most feature-rich local development option. It runs a Kubernetes cluster inside a VM, a container, or directly on the host, and brings an unmatched addon ecosystem of 30+ pre-packaged integrations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minikube&lt;/strong&gt; is the only distribution that abstracts over &lt;em&gt;drivers&lt;/em&gt; — it runs identically whether the underlying host is a VM (VirtualBox, HyperKit, KVM), a container (Docker, Podman), or bare metal. This flexibility comes at the cost of startup time and memory, but it means Minikube works for every developer on every operating system.&lt;/p&gt;

&lt;p&gt;If you've ever run &lt;code&gt;kubectl apply -f&lt;/code&gt; on your laptop, you've probably used Minikube.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────-┐
│             VM / Docker / Podman Driver                   │
│                                                           │
│  ┌─────────────────────────────────────────────────────┐  │
│  │            Single Node (All-in-One)                 │  │
│  │                                                     │  │
│  │  Control Plane                  Data Plane          │  │
│  │  ┌──────────┐  ┌────────────┐   ┌─────────────────┐ │  │
│  │  │API Server│  │etcd        │   │kubelet          │ │  │
│  │  └──────────┘  └────────────┘   │kube-proxy       │ │  │
│  │  ┌──────────┐  ┌────────────┐   │Pod A • Pod B    │ │  │
│  │  │Scheduler │  │Ctrl Manager│   └─────────────────┘ │  │
│  │  └──────────┘  └────────────┘                       │  │
│  └─────────────────────────────────────────────────────┘  │
│                                                           │
│  ┌─────────────────────────────────────────────────────┐  │
│  │                   Addons Layer                      │  │
│  │  Dashboard │ Ingress │ Metrics │ Registry │ Istio   │  │
│  └─────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────-┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multiple drivers&lt;/strong&gt;: HyperKit, VirtualBox, KVM2, Docker, Podman, SSH&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;etcd&lt;/strong&gt;: Full etcd as the backing store&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calico or Flannel&lt;/strong&gt;: Optional CNIs selected with the &lt;code&gt;--cni&lt;/code&gt; flag; Minikube otherwise picks a default for the chosen driver&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Addon controller&lt;/strong&gt;: Manages the 30+ available addon services&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;30+ addons including Istio, Knative, Linkerd, GPU operator, registry, and more&lt;/li&gt;
&lt;li&gt;Built-in Kubernetes dashboard (&lt;code&gt;minikube dashboard&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;GPU passthrough in VM mode&lt;/li&gt;
&lt;li&gt;LoadBalancer via &lt;code&gt;minikube tunnel&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Multiple profile management (run several clusters simultaneously)&lt;/li&gt;
&lt;li&gt;Image caching to speed up repeated pulls&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;minikube service&lt;/code&gt; command for easy port access&lt;/li&gt;
&lt;li&gt;Built-in image loading (&lt;code&gt;minikube image load&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
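
&lt;p&gt;To see &lt;code&gt;minikube tunnel&lt;/code&gt; do something, the cluster needs a Service of type LoadBalancer. Here is a minimal manifest sketch. The name &lt;code&gt;hello-lb&lt;/code&gt;, the &lt;code&gt;app: hello&lt;/code&gt; selector, and the port numbers are hypothetical and assume a matching Deployment already exists.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# hello-lb.yaml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  name: hello-lb
spec:
  type: LoadBalancer
  selector:
    app: hello          # must match the labels on your pods
  ports:
    - port: 80          # port exposed by the Service
      targetPort: 8080  # port the container listens on
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After &lt;code&gt;kubectl apply -f hello-lb.yaml&lt;/code&gt;, and with &lt;code&gt;minikube tunnel&lt;/code&gt; running in another terminal, &lt;code&gt;kubectl get service hello-lb&lt;/code&gt; should show an external IP rather than staying pending.&lt;/p&gt;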

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install (Linux amd64)&lt;/span&gt;
curl &lt;span class="nt"&gt;-LO&lt;/span&gt; https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
&lt;span class="nb"&gt;sudo install &lt;/span&gt;minikube-linux-amd64 /usr/local/bin/minikube

&lt;span class="c"&gt;# Start with the Docker driver&lt;/span&gt;
minikube start &lt;span class="nt"&gt;--driver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docker

&lt;span class="c"&gt;# Enable addons&lt;/span&gt;
minikube addons &lt;span class="nb"&gt;enable &lt;/span&gt;ingress
minikube addons &lt;span class="nb"&gt;enable &lt;/span&gt;metrics-server
minikube addons &lt;span class="nb"&gt;enable &lt;/span&gt;dashboard

&lt;span class="c"&gt;# Open the dashboard&lt;/span&gt;
minikube dashboard

&lt;span class="c"&gt;# LoadBalancer support&lt;/span&gt;
minikube tunnel  &lt;span class="c"&gt;# Run in a separate terminal&lt;/span&gt;

&lt;span class="c"&gt;# Delete the cluster&lt;/span&gt;
minikube delete
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Easiest getting-started experience of any K8s tool&lt;/li&gt;
&lt;li&gt;Unmatched addon ecosystem (30+ addons)&lt;/li&gt;
&lt;li&gt;GPU passthrough support (VM drivers such as KVM2)&lt;/li&gt;
&lt;li&gt;Built-in dashboard requires zero configuration&lt;/li&gt;
&lt;li&gt;Works on macOS, Linux, and Windows&lt;/li&gt;
&lt;li&gt;Multiple profiles = multiple clusters&lt;/li&gt;
&lt;li&gt;Best documentation and community support&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Slow startup in VM mode (~2 minutes)&lt;/li&gt;
&lt;li&gt;High memory consumption, especially with VM driver&lt;/li&gt;
&lt;li&gt;Primarily a single-node environment&lt;/li&gt;
&lt;li&gt;Not production-ready&lt;/li&gt;
&lt;li&gt;LoadBalancer requires keeping &lt;code&gt;minikube tunnel&lt;/code&gt; running separately&lt;/li&gt;
&lt;li&gt;Battery-intensive on laptops&lt;/li&gt;
&lt;li&gt;Multi-node support exists but is limited and buggy&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Local development&lt;/strong&gt; — especially developers who want a full Kubernetes experience with addons, dashboards, and GPU support without deep infrastructure expertise.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. MicroK8s - Zero-Ops by Canonical
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;MicroK8s is Canonical's packaging of Kubernetes as a snap. It installs with a single command, self-heals via systemd, updates automatically through snap channels, and has one of the lowest memory footprints of any full-featured Kubernetes distribution, with a stated minimum of 540 MB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MicroK8s&lt;/strong&gt; is unique in using &lt;strong&gt;dqlite&lt;/strong&gt; — a distributed SQLite engine developed by Canonical — as an alternative to etcd for HA mode. This dramatically simplifies the operational burden of running a multi-master cluster: no external etcd cluster needed, just &lt;code&gt;microk8s add-node&lt;/code&gt; on each machine.&lt;/p&gt;

&lt;p&gt;Unlike KIND and Minikube, MicroK8s is designed for both development &lt;em&gt;and&lt;/em&gt; light production workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;
┌──────────────────────────────────────────────────────────-┐
│                  Snap Package (systemd)                   │
│                                                           │
│  ┌───────────────────────┐   ┌────────────────────────┐   │
│  │    Node 1 (Master)    │   │      Node 2            │   │
│  │                       │   │                        │   │
│  │  • API Server         │──▶│  • kubelet             │   │
│  │  • dqlite (HA store)  │   │  • kube-proxy          │   │
│  │  • Scheduler          │   │  • Calico CNI          │   │
│  │  • Controller Manager │   │  • Pods                │   │
│  │  • Calico CNI         │   └────────────────────────┘   │
│  │  • Auto-updater       │                                │
│  └───────────────────────┘                                │
│                                                           │
│  ┌──────────────────────────────────────────────────────┐ │
│  │         Addon Engine (microk8s enable &lt;span class="nt"&gt;&amp;lt;addon&amp;gt;&lt;/span&gt;)       │ │
│  │  Istio │ Knative │ GPU │ Registry │ Dashboard │ More │ │
│  └──────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────-┘

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;dqlite&lt;/strong&gt; — Distributed SQLite for HA without the operational burden of etcd&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calico CNI&lt;/strong&gt; — Production-grade networking with network policy support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snap daemon&lt;/strong&gt; — Manages the entire lifecycle including automatic updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Addon engine&lt;/strong&gt; — &lt;code&gt;microk8s enable &amp;lt;name&amp;gt;&lt;/code&gt; installs curated addons&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Low memory footprint: roughly 540 MB minimum&lt;/li&gt;
&lt;li&gt;HA clustering via &lt;code&gt;microk8s add-node&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Automatic channel-based updates with rollback&lt;/li&gt;
&lt;li&gt;GPU operator addon for ML/AI workloads&lt;/li&gt;
&lt;li&gt;Strict snap confinement for security&lt;/li&gt;
&lt;li&gt;ARM64 and x86 native support&lt;/li&gt;
&lt;li&gt;Observability stack addon (Prometheus, Grafana)&lt;/li&gt;
&lt;li&gt;Built-in image registry&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Install via snap&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;snap  &lt;span class="nb"&gt;install  &lt;/span&gt;microk8s  &lt;span class="nt"&gt;--classic&lt;/span&gt;

&lt;span class="c"&gt;# Add your user to the microk8s group&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;usermod  &lt;span class="nt"&gt;-aG&lt;/span&gt;  microk8s  &lt;span class="nv"&gt;$USER&lt;/span&gt;

newgrp  microk8s

&lt;span class="c"&gt;# Check status&lt;/span&gt;

microk8s  status  &lt;span class="nt"&gt;--wait-ready&lt;/span&gt;

&lt;span class="c"&gt;# Enable core addons&lt;/span&gt;

microk8s  &lt;span class="nb"&gt;enable  &lt;/span&gt;dns  ingress  metrics-server  dashboard

&lt;span class="c"&gt;# Use kubectl&lt;/span&gt;

microk8s  kubectl  get  nodes

&lt;span class="c"&gt;# Add worker node (run on master, then copy join command to worker)&lt;/span&gt;

microk8s  add-node

&lt;span class="c"&gt;# Uninstall&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;snap  remove  microk8s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Among the lowest RAM usage of the full-featured distributions (roughly 540 MB)&lt;/li&gt;
&lt;li&gt;Best Ubuntu and Linux integration through the snap ecosystem&lt;/li&gt;
&lt;li&gt;Self-healing via systemd — restarts automatically on failure&lt;/li&gt;
&lt;li&gt;HA multi-node with a simple &lt;code&gt;add-node&lt;/code&gt; workflow&lt;/li&gt;
&lt;li&gt;Automatic updates through snap channels (stable, candidate, beta)&lt;/li&gt;
&lt;li&gt;Production-capable for light workloads&lt;/li&gt;
&lt;li&gt;ARM64 support for Raspberry Pi and ARM servers&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Snap packaging limits portability: the design is Ubuntu-centric, and snap is not available on every Linux distribution&lt;/li&gt;
&lt;li&gt;Addon conflicts can occur (Istio + other service meshes, for example)&lt;/li&gt;
&lt;li&gt;Strict snap confinement can block some host filesystem operations&lt;/li&gt;
&lt;li&gt;dqlite is still maturing compared to battle-tested etcd&lt;/li&gt;
&lt;li&gt;Automatic updates can cause unplanned restarts without configuration&lt;/li&gt;
&lt;/ul&gt;
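
&lt;p&gt;One way to reduce the unplanned-restart risk is to pin the snap to a release track and hold automatic refreshes. The sketch below is a minimal illustration, not an official procedure: the &lt;code&gt;run&lt;/code&gt; helper is hypothetical, the &lt;code&gt;1.30/stable&lt;/code&gt; channel is only an example track, and &lt;code&gt;snap refresh --hold&lt;/code&gt; assumes a reasonably recent snapd. By default it only prints the commands.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
# Prints the commands by default; set DRY_RUN=0 to actually run them.
# The run helper and the 1.30/stable channel are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Follow a specific Kubernetes minor release instead of the default channel
run sudo snap refresh microk8s --channel=1.30/stable

# Postpone automatic refreshes of this snap until the hold is removed
run sudo snap refresh --hold microk8s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a hold in place, upgrades become a deliberate step you schedule during a maintenance window rather than something that happens in the background.&lt;/p&gt;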

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Ubuntu workstations and edge servers&lt;/strong&gt; — if you're on Ubuntu, MicroK8s is the most native Kubernetes experience available.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. K3s - Production-Grade at the Edge
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;K3s is arguably the most influential lightweight Kubernetes project of recent years. Released by Rancher Labs (now part of SUSE) in 2019, it packs a complete, CNCF-certified Kubernetes distribution into a single binary under 100 MB. It needs as little as 512 MB of RAM, starts in about 30 seconds, and runs the same way on a Raspberry Pi, a factory-floor ARM controller, and a cloud VM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;K3s&lt;/strong&gt; achieves its sub-100 MB size by bundling everything into a single Go binary with no external dependencies, using SQLite as the default backing store (which requires no cluster management), and removing upstream Kubernetes features that aren't needed in its target environments (Windows nodes, cloud-provider integrations, certain alpha features).&lt;/p&gt;

&lt;p&gt;K3s is not a toy. It is used in production by thousands of organisations worldwide.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;
┌────────────────────────────────────────────────────────────────┐
│                      k3s binary (&amp;lt; 100 MB)                     │
│                                                                │
│  ┌─────────────────────────────────┐                           │
│  │          k3s Server             │                           │
│  │  (Control Plane + Optional DP)  │──────────┐                │
│  │                                 │          │                │
│  │  • API Server                   │          ▼                │
│  │  • SQLite (default) / etcd / PG │   ┌─────────────────┐     │
│  │  • Scheduler                    │   │   k3s Agent 1   │     │
│  │  • Controller Manager           │   │   (Worker Node) │     │
│  │  • Flannel CNI (built-in)       │   │  • kubelet      │     │
│  │  • Traefik Ingress              │   │  • kube-proxy   │     │
│  │  • CoreDNS                      │──▶│  • Flannel      │     │
│  │  • local-path-provisioner       │   │  • Pods         │     │
│  │  • Helm controller              │   └─────────────────┘     │
│  └─────────────────────────────────┘          │                │
│                                               ▼                │
│                                        ┌─────────────────┐     │
│                                        │ k3s Agent 2     │     │
│                                        │ (ARM / IoT)     │     │
│                                        └─────────────────┘     │
└────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single binary&lt;/strong&gt; — Packages containerd, CNI plugins, CoreDNS, Traefik, and more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite&lt;/strong&gt; — Default data store, ideal for single-server or small clusters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedded etcd&lt;/strong&gt; — Available for HA clusters (3+ servers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External DB&lt;/strong&gt; — PostgreSQL, MySQL, or etcd for larger deployments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flannel CNI&lt;/strong&gt; — Built-in overlay networking, zero extra configuration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traefik&lt;/strong&gt; — Ingress controller included out of the box&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helm controller&lt;/strong&gt; — Manage Helm charts via CRDs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;local-path-provisioner&lt;/strong&gt; — Dynamic PVC provisioning on local disk&lt;/li&gt;
&lt;/ul&gt;
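
&lt;p&gt;The three storage options map to different &lt;code&gt;k3s server&lt;/code&gt; flags. As a rough sketch, the hypothetical helper below (it is not part of K3s) prints the flag that matches a given server count and optional external database. The PostgreSQL user, password, host, and database name are placeholder assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
# k3s_datastore_args is a hypothetical helper, not part of K3s.
# Usage: k3s_datastore_args SERVER_COUNT [EXTERNAL_DB_ENDPOINT]
k3s_datastore_args() {
  servers=$1
  endpoint=${2:-}
  if [ -n "$endpoint" ]; then
    # External PostgreSQL, MySQL, or etcd
    echo "--datastore-endpoint=$endpoint"
  elif [ "$servers" -ge 3 ]; then
    # Embedded etcd: this flag goes on the first server only
    echo "--cluster-init"
  else
    # Fewer than three servers: SQLite is the default, no flag needed
    :
  fi
}

k3s_datastore_args 1
k3s_datastore_args 3
k3s_datastore_args 3 "postgres://k3s:changeme@db.example.internal:5432/k3s"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first call prints nothing, because a single server simply uses SQLite. The other two print &lt;code&gt;--cluster-init&lt;/code&gt; and a &lt;code&gt;--datastore-endpoint&lt;/code&gt; flag respectively.&lt;/p&gt;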

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;CNCF-certified — passes full Kubernetes conformance tests&lt;/li&gt;
&lt;li&gt;Single binary &amp;lt; 100 MB with everything bundled&lt;/li&gt;
&lt;li&gt;Multiple storage backends: SQLite, etcd, PostgreSQL, MySQL&lt;/li&gt;
&lt;li&gt;ARM64 and ARMv7 first-class support&lt;/li&gt;
&lt;li&gt;Air-gap / offline install support (critical for edge deployments)&lt;/li&gt;
&lt;li&gt;Auto TLS with Let's Encrypt for Traefik&lt;/li&gt;
&lt;li&gt;Server + Agent role split for control/data plane separation&lt;/li&gt;
&lt;li&gt;Automatic certificate rotation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Install server (master) — one command&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh -

&lt;span class="c"&gt;# Check status&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;systemctl  status  k3s

&lt;span class="nb"&gt;sudo  &lt;/span&gt;kubectl  get  nodes

&lt;span class="c"&gt;# Get the node join token&lt;/span&gt;

&lt;span class="nb"&gt;sudo  cat&lt;/span&gt;  /var/lib/rancher/k3s/server/node-token

&lt;span class="c"&gt;# Join a worker node (run on the worker)&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;K3S_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://&amp;lt;SERVER_IP&amp;gt;:6443 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;K3S_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;NODE_TOKEN&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  sh -

&lt;span class="c"&gt;# Use kubectl without sudo&lt;/span&gt;

&lt;span class="nb"&gt;mkdir&lt;/span&gt;  &lt;span class="nt"&gt;-p&lt;/span&gt;  ~/.kube

&lt;span class="nb"&gt;sudo  cp&lt;/span&gt;  /etc/rancher/k3s/k3s.yaml  ~/.kube/config

&lt;span class="nb"&gt;sudo  chown&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;  &lt;span class="nt"&gt;-u&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;  &lt;span class="nt"&gt;-g&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; ~/.kube/config

&lt;span class="c"&gt;# Uninstall&lt;/span&gt;

/usr/local/bin/k3s-uninstall.sh  &lt;span class="c"&gt;# server&lt;/span&gt;

/usr/local/bin/k3s-agent-uninstall.sh  &lt;span class="c"&gt;# agent&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  HA Setup (Embedded etcd)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# First server node&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cluster-init&lt;/span&gt;

&lt;span class="c"&gt;# Additional server nodes&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--server&lt;/span&gt; https://&amp;lt;FIRST_SERVER_IP&amp;gt;:6443 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--token&lt;/span&gt; &amp;lt;NODE_TOKEN&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
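
&lt;p&gt;Embedded etcd needs a majority (quorum) of servers to be reachable before it accepts writes, which is why HA starts at three servers and why odd counts are preferred. A minimal sketch of the arithmetic, using hypothetical helper names:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
# Quorum arithmetic for a Raft-based store such as embedded etcd.
# etcd_quorum and etcd_fault_tolerance are hypothetical helper names.

# Smallest number of members that must agree: floor(n / 2) + 1
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Members that can be lost while the cluster still accepts writes
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for servers in 1 2 3 4 5; do
  echo "servers=$servers quorum=$(etcd_quorum "$servers") tolerates=$(etcd_fault_tolerance "$servers")"
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Four servers tolerate the same single failure as three, so the extra node adds cost without adding resilience. Going from three to five is what buys tolerance for a second failure.&lt;/p&gt;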



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;CNCF-certified — genuine, conformant Kubernetes, not a cut-down imitation&lt;/li&gt;
&lt;li&gt;Single binary under 100 MB — deploy to anything&lt;/li&gt;
&lt;li&gt;512 MB RAM minimum — runs on Raspberry Pi 3&lt;/li&gt;
&lt;li&gt;30-second cold start&lt;/li&gt;
&lt;li&gt;SQLite for small clusters, etcd for HA — right tool for every scale&lt;/li&gt;
&lt;li&gt;Traefik ingress out of the box — production workloads with zero extra config&lt;/li&gt;
&lt;li&gt;ARM64 and ARMv7 native — best IoT Kubernetes support in the market&lt;/li&gt;
&lt;li&gt;Air-gap install — works in completely offline environments&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;SQLite backend not suitable for clusters exceeding ~50 nodes&lt;/li&gt;
&lt;li&gt;Some upstream Kubernetes features are stripped (Alpha features, some cloud integrations)&lt;/li&gt;
&lt;li&gt;Default CNI is Flannel only (using Calico requires additional configuration)&lt;/li&gt;
&lt;li&gt;No built-in dashboard&lt;/li&gt;
&lt;li&gt;Less rich addon ecosystem than Minikube or MicroK8s&lt;/li&gt;
&lt;li&gt;Limited Windows node support&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Edge computing, IoT, production on resource-constrained hardware, and any environment where the binary size and startup time of a traditional Kubernetes distribution is prohibitive.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Vcluster — Kubernetes Inside Kubernetes
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;Vcluster takes a completely different approach to "lightweight Kubernetes." Rather than running alongside a host operating system, it runs &lt;em&gt;inside&lt;/em&gt; an existing Kubernetes cluster. Each virtual cluster is a set of pods in a namespace, but from the user's perspective it is a completely isolated Kubernetes cluster with its own API server, etcd, and full Kubernetes API.&lt;/p&gt;

&lt;p&gt;This makes Vcluster the definitive answer to the multi-tenancy problem: instead of giving teams namespace isolation (which shares the API server and exposes blast radius), you give each team their own cluster for the cost of a few pods.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vcluster&lt;/strong&gt; is architecturally unique in the field. Its virtual control plane (API server, controller manager, and a backing data store, plus an optional scheduler) runs as pods &lt;em&gt;inside&lt;/em&gt; a host cluster namespace. A component called the &lt;strong&gt;Syncer&lt;/strong&gt; watches the virtual cluster's API and translates virtual resources into real host resources, so a virtual Pod becomes a real Pod in the host namespace with a remapped name. Unless a virtual scheduler is enabled, the host cluster's scheduler places those Pods on nodes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────────┐
│               Host Kubernetes Cluster (any provider)         │
│                                                              │
│  ┌────────────────────┐  ┌────────────────────┐              │
│  │    vcluster 1      │  │    vcluster 2      │              │
│  │  (Team A ns)       │  │  (Team B ns)       │              │
│  │                    │  │                    │              │
│  │  Virtual API Srv   │  │  Virtual API Srv   │              │
│  │  In-process etcd   │  │  In-process etcd   │              │
│  │  Syncer pod        │  │  Syncer pod        │              │
│  │  ┌────┐  ┌────┐    │  │  ┌────┐  ┌────┐    │              │
│  │  │PodA│  │PodB│    │  │  │PodC│  │PodD│    │              │
│  │  └─┬──┘  └─┬──┘    │  │  └─┬──┘  └─┬──┘    │              │
│  │    │sync   │sync   │  │    │sync   │sync   │              │
│  └────┼───────┼───────┘  └────┼───────┼───────┘              │
│       ▼       ▼               ▼       ▼                      │
│  ┌──────────────────────────────────────────────────────┐    │
│  │  Shared Worker Nodes — Host CNI, Storage, Hardware   │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;Syncer&lt;/strong&gt; is the key innovation: it translates virtual cluster resources into real host cluster resources. A Pod created in vcluster 1 becomes a real Pod in the host cluster's namespace, but with a remapped name that prevents conflicts.&lt;/p&gt;
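
&lt;p&gt;To make the remapping concrete, here is a small sketch. It assumes the Syncer builds host names by joining the virtual Pod name, the virtual namespace, and the vcluster name with &lt;code&gt;-x-&lt;/code&gt;. Treat that pattern as an assumption: the exact scheme, and how over-long names are shortened, can differ between vcluster versions. &lt;code&gt;host_pod_name&lt;/code&gt; is a hypothetical helper, not a vcluster command.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
# host_pod_name is a hypothetical helper that mimics the remapping.
# The NAME-x-NAMESPACE-x-VCLUSTER pattern is an assumption.
# Usage: host_pod_name POD_NAME VIRTUAL_NAMESPACE VCLUSTER_NAME
host_pod_name() {
  printf '%s-x-%s-x-%s\n' "$1" "$2" "$3"
}

# Virtual Pod "nginx" in virtual namespace "default" of "my-vcluster"
host_pod_name nginx default my-vcluster

# The same Pod name in another virtual namespace maps to a different host name
host_pod_name nginx staging my-vcluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the virtual namespace is folded into the host name, many virtual namespaces can share one host namespace without their Pods colliding.&lt;/p&gt;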

&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Virtual API Server&lt;/strong&gt; — Full Kubernetes API, runs as a pod in the host cluster&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In-process etcd&lt;/strong&gt; — Embedded etcd for the virtual cluster's state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Syncer&lt;/strong&gt; — Reconciles virtual resources to host cluster resources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vcluster CLI&lt;/strong&gt; — Manages lifecycle: create, connect, delete, list&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Full Kubernetes API isolation per virtual cluster&lt;/li&gt;
&lt;li&gt;Works on top of any Kubernetes (EKS, GKE, AKS, K3s, RKE2, etc.)&lt;/li&gt;
&lt;li&gt;~10 second spin-up time — fastest of all solutions&lt;/li&gt;
&lt;li&gt;No extra hardware — uses existing cluster nodes&lt;/li&gt;
&lt;li&gt;CRD isolation — each vcluster has its own CRDs&lt;/li&gt;
&lt;li&gt;RBAC isolation — separate RBAC per vcluster&lt;/li&gt;
&lt;li&gt;Helm chart deployment — deploy via standard Helm&lt;/li&gt;
&lt;li&gt;On-demand creation and deletion&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Install vcluster CLI&lt;/span&gt;

curl &lt;span class="nt"&gt;-L&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; vcluster &lt;span class="s2"&gt;"https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64"&lt;/span&gt;

&lt;span class="nb"&gt;chmod&lt;/span&gt;  +x  vcluster &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo  mv  &lt;/span&gt;vcluster  /usr/local/bin

&lt;span class="c"&gt;# Create a virtual cluster&lt;/span&gt;

vcluster  create  my-vcluster  &lt;span class="nt"&gt;--namespace&lt;/span&gt;  team-a

&lt;span class="c"&gt;# Connect to it (sets KUBECONFIG automatically)&lt;/span&gt;

vcluster  connect  my-vcluster  &lt;span class="nt"&gt;--namespace&lt;/span&gt;  team-a

&lt;span class="c"&gt;# Now kubectl talks to the vcluster&lt;/span&gt;

kubectl  get  nodes

kubectl  create  deployment  nginx  &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nginx

&lt;span class="c"&gt;# Disconnect&lt;/span&gt;

vcluster  disconnect

&lt;span class="c"&gt;# Delete&lt;/span&gt;

vcluster  delete  my-vcluster  &lt;span class="nt"&gt;--namespace&lt;/span&gt;  team-a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Full Kubernetes API isolation per tenant — no shared API server blast radius&lt;/li&gt;
&lt;li&gt;10-second spin-up — fastest cluster creation of all solutions reviewed&lt;/li&gt;
&lt;li&gt;No extra hardware — reuses host cluster's nodes entirely&lt;/li&gt;
&lt;li&gt;Works on any cloud or on-premises Kubernetes&lt;/li&gt;
&lt;li&gt;Cost-efficient multi-tenancy at scale&lt;/li&gt;
&lt;li&gt;Each team gets the full &lt;code&gt;kubectl&lt;/code&gt; experience&lt;/li&gt;
&lt;li&gt;Easy to create and delete on demand for short-lived environments&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Not standalone — requires a host Kubernetes cluster to exist first&lt;/li&gt;
&lt;li&gt;Cannot create real nodes — virtual only&lt;/li&gt;
&lt;li&gt;Advanced networking between vclusters is complex&lt;/li&gt;
&lt;li&gt;Some cluster-scoped resources (like ClusterRoles and CRDs) are not fully isolated&lt;/li&gt;
&lt;li&gt;Requires privileged pod access on the host cluster&lt;/li&gt;
&lt;li&gt;Newer project — less battle-tested than K3s or Minikube&lt;/li&gt;
&lt;li&gt;Node-level debugging is limited&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Multi-tenant development environments, per-team isolated clusters, and CI/CD environments where many short-lived clusters need to be spun up and torn down rapidly on existing infrastructure.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  6. k0s — Zero Dependencies, Zero Friction
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;k0s (pronounced "kay-zeros") from Mirantis lives up to its name: zero host OS dependencies. It is a single binary that includes everything needed to run Kubernetes without requiring any specific kernel modules, swap configuration, or package manager. It works on any Linux distribution out of the box.&lt;/p&gt;

&lt;p&gt;k0s uses kube-router as its default CNI, includes Autopilot for automated upgrades, and offers FIPS 140-2 compliance, a feature set that appeals strongly to regulated industries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;k0s&lt;/strong&gt; prioritises deployment universality. By bundling containerd and all CNI plugins into the binary itself and requiring no kernel module configuration from the host OS, it can be dropped onto virtually any Linux system and run. Note that kube-router, the default CNI, is not eBPF-based: it builds on standard Linux networking, using BGP for pod routing and iptables with ipset for network policy. An eBPF data plane means installing a different CNI, such as Cilium.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
┌──────────────────────────────────────────────────────┐
│             k0s binary (systemd / OpenRC)            │
│                                                      │
│  ┌──────────────────────────┐                        │
│  │     k0s controller       │                        │
│  │   (Control Plane)        │───────────────┐        │
│  │                          │               │        │
│  │  • API Server            │               ▼        │
│  │  • etcd (embedded)       │   ┌─────────────────┐  │
│  │  • Scheduler             │   │  k0s worker 1   │  │
│  │  • Controller Manager    │   │                 │  │
│  │  • containerd            │   │  • kubelet      │  │
│  │  • kube-router (CNI)     │──▶│  • kube-router  │  │
│  │  • Autopilot updater     │   │  • containerd   │  │
│  └──────────────────────────┘   │  • Pods         │  │
│                                 └─────────────────┘  │
│  k0sctl tool → manages cluster lifecycle             │
└──────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Truly zero host OS dependencies — no kernel module requirements&lt;/li&gt;
&lt;li&gt;FIPS 140-2 compliance mode available&lt;/li&gt;
&lt;li&gt;Built-in networking via kube-router, with Calico available as an alternative&lt;/li&gt;
&lt;li&gt;Autopilot automated upgrades&lt;/li&gt;
&lt;li&gt;k0sctl for full cluster lifecycle management&lt;/li&gt;
&lt;li&gt;ARM64 native support&lt;/li&gt;
&lt;li&gt;Air-gap install support&lt;/li&gt;
&lt;li&gt;Works on any Linux OS (Debian, RHEL, Alpine, CoreOS, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Download k0s&lt;/span&gt;

curl &lt;span class="nt"&gt;-sSLf&lt;/span&gt; https://get.k0s.sh | &lt;span class="nb"&gt;sudo &lt;/span&gt;sh

&lt;span class="c"&gt;# Install and start as a service&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;k0s  &lt;span class="nb"&gt;install  &lt;/span&gt;controller  &lt;span class="nt"&gt;--single&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;k0s  start

&lt;span class="c"&gt;# Get kubeconfig&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;k0s  kubeconfig  admin &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; ~/.kube/config

&lt;span class="c"&gt;# Check cluster&lt;/span&gt;

kubectl  get  nodes

&lt;span class="c"&gt;# Add a worker node — generate join token on controller&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;k0s  token  create  &lt;span class="nt"&gt;--role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;worker

&lt;span class="c"&gt;# On the worker node&lt;/span&gt;

&lt;span class="nb"&gt;sudo  &lt;/span&gt;k0s  &lt;span class="nb"&gt;install  &lt;/span&gt;worker  &lt;span class="nt"&gt;--token-file&lt;/span&gt;  /path/to/token

&lt;span class="nb"&gt;sudo  &lt;/span&gt;k0s  start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Truly zero host OS dependencies — works on any Linux, no special kernel configuration&lt;/li&gt;
&lt;li&gt;FIPS 140-2 compliance for regulated industries&lt;/li&gt;
&lt;li&gt;kube-router networking works out of the box and supports network policies&lt;/li&gt;
&lt;li&gt;Autopilot handles automated upgrades safely&lt;/li&gt;
&lt;li&gt;k0sctl provides a proper cluster lifecycle management tool&lt;/li&gt;
&lt;li&gt;No swap or kernel module pre-requirements&lt;/li&gt;
&lt;li&gt;Air-gap support&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Smaller community than K3s or MicroK8s&lt;/li&gt;
&lt;li&gt;Less rich addon ecosystem&lt;/li&gt;
&lt;li&gt;k0sctl adds an additional tool to the workflow&lt;/li&gt;
&lt;li&gt;Some CNI plugins need manual configuration beyond kube-router&lt;/li&gt;
&lt;li&gt;Enterprise support is a paid product from Mirantis&lt;/li&gt;
&lt;li&gt;Fewer third-party integrations and tutorials&lt;/li&gt;
&lt;/ul&gt;
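
&lt;p&gt;Switching away from kube-router is done in the k0s cluster configuration. The sketch below uses a hypothetical helper, &lt;code&gt;render_k0s_network&lt;/code&gt;, that prints a minimal &lt;code&gt;ClusterConfig&lt;/code&gt; fragment for one of the provider values k0s documents (&lt;code&gt;kuberouter&lt;/code&gt;, &lt;code&gt;calico&lt;/code&gt;, or &lt;code&gt;custom&lt;/code&gt;). It is an illustration under those assumptions, so check the k0s documentation for your version before relying on it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
# render_k0s_network is a hypothetical helper; it only prints YAML.
# Usage: render_k0s_network PROVIDER
render_k0s_network() {
  case "$1" in
    kuberouter|calico|custom) ;;
    # Reject anything that is not a known provider value
    *) return 1 ;;
  esac
  printf 'apiVersion: k0s.k0sproject.io/v1beta1\n'
  printf 'kind: ClusterConfig\n'
  printf 'spec:\n'
  printf '  network:\n'
  printf '    provider: %s\n' "$1"
}

render_k0s_network calico
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;custom&lt;/code&gt; value is the one to reach for when you plan to install a CNI yourself, which is where the manual configuration mentioned above comes in.&lt;/p&gt;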

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Environments where host OS diversity is a challenge&lt;/strong&gt; — mixed Linux distributions, heavily locked-down servers, or compliance-driven deployments needing FIPS 140-2.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. RKE2 — Security-First Enterprise K8s
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What It Is
&lt;/h3&gt;

&lt;p&gt;RKE2 (Rancher Kubernetes Engine 2) is the enterprise evolution of K3s. Where K3s optimises for minimal resource usage and edge deployability, RKE2 optimises for security hardening and compliance. It ships hardened by default with CIS Kubernetes Benchmark compliance, FIPS 140-2 support, automatic etcd snapshots, and deep Rancher integration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RKE2&lt;/strong&gt; starts from K3s's architecture and adds a hardening layer: Pod Security Admission enforced by default, etcd encryption at rest, CIS-compliant API server flags, audit logging enabled, and Canal CNI with network policy enforcement. It is Kubernetes made appropriate for government and financial sector requirements.&lt;/p&gt;

&lt;p&gt;If K3s is the lightweight sports car, RKE2 is the armoured vehicle. More resource-intensive, harder to damage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌───────────────────────────────────────────────────────────┐
│              RKE2 Server (Hardened Control Plane)         │
│                                                           │
│  ┌──────────────────────────────────────────────────────┐ │
│  │  CIS-Hardened Kubernetes                             │ │
│  │                                                      │ │
│  │  • Hardened API Server (PSA enforced)                │ │
│  │  • etcd with automated snapshots                     │ │
│  │  • Hardened Scheduler &amp;amp; Controller Manager           │ │
│  │  • Canal / Calico / Cilium CNI (configurable)        │ │
│  │  • containerd runtime                                │ │
│  │  • Automatic certificate rotation                    │ │
│  └──────────────────────────────────────────────────────┘ │
│                    │                                      │
│          ┌─────────┴──────────┐                           │
│          ▼                    ▼                           │
│  ┌────────────────┐  ┌────────────────┐                   │
│  │  RKE2 Agent 1  │  │  RKE2 Agent 2  │                   │
│  │  (Worker)      │  │  (Worker)      │                   │
│  └────────────────┘  └────────────────┘                   │
│                                                           │
│  ┌──────────────────────────────────────────────────────┐ │
│  │  Rancher Management Plane (optional)                 │ │
│  └──────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Designed to pass the CIS Kubernetes Benchmark with minimal manual hardening (the benchmark version covered depends on the RKE2 release)&lt;/li&gt;
&lt;li&gt;FIPS 140-2 cryptographic compliance&lt;/li&gt;
&lt;li&gt;etcd with automated periodic snapshots and restoration&lt;/li&gt;
&lt;li&gt;Multiple CNI options: Canal (default), Calico, Cilium&lt;/li&gt;
&lt;li&gt;Automated certificate rotation&lt;/li&gt;
&lt;li&gt;Helm chart integration&lt;/li&gt;
&lt;li&gt;Air-gap install support&lt;/li&gt;
&lt;li&gt;Deep Rancher management platform integration&lt;/li&gt;
&lt;li&gt;Role-based node configuration&lt;/li&gt;
&lt;/ul&gt;
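
&lt;p&gt;Much of this behaviour is driven by &lt;code&gt;/etc/rancher/rke2/config.yaml&lt;/code&gt;. As an illustrative sketch, the hypothetical helper below prints a config fragment that turns on the CIS profile, picks a CNI, and schedules etcd snapshots. The &lt;code&gt;profile: cis&lt;/code&gt; value is an assumption that fits recent releases (older ones used versioned profile names), and CIS mode also expects host-level preparation that is not shown here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
# render_rke2_config is a hypothetical helper; it only prints YAML.
# Usage: render_rke2_config CNI SNAPSHOT_CRON SNAPSHOT_RETENTION
render_rke2_config() {
  printf 'profile: cis\n'
  printf 'cni: %s\n' "$1"
  printf 'etcd-snapshot-schedule-cron: "%s"\n' "$2"
  printf 'etcd-snapshot-retention: %s\n' "$3"
}

# Snapshot every six hours and keep the ten most recent
render_rke2_config canal "0 */6 * * *" 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Printing the fragment first makes it easy to review before writing it to the config file on each server.&lt;/p&gt;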

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Install RKE2 server&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.rke2.io | sh -

systemctl  &lt;span class="nb"&gt;enable  &lt;/span&gt;rke2-server.service

systemctl  start  rke2-server.service

&lt;span class="c"&gt;# Get kubeconfig&lt;/span&gt;

&lt;span class="nb"&gt;export  &lt;/span&gt;&lt;span class="nv"&gt;KUBECONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/rancher/rke2/rke2.yaml

&lt;span class="c"&gt;# Get join token for workers&lt;/span&gt;

&lt;span class="nb"&gt;cat&lt;/span&gt;  /var/lib/rancher/rke2/server/node-token

&lt;span class="c"&gt;# On worker nodes&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.rke2.io | &lt;span class="nv"&gt;INSTALL_RKE2_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"agent"&lt;/span&gt; sh -

&lt;span class="nb"&gt;mkdir&lt;/span&gt;  &lt;span class="nt"&gt;-p&lt;/span&gt;  /etc/rancher/rke2/

&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /etc/rancher/rke2/config.yaml &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;

server: https://&amp;lt;SERVER_IP&amp;gt;:9345

token: &amp;lt;NODE_TOKEN&amp;gt;
&lt;/span&gt;&lt;span class="no"&gt;
EOF

&lt;/span&gt;systemctl  &lt;span class="nb"&gt;enable  &lt;/span&gt;rke2-agent.service

systemctl  start  rke2-agent.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;CIS Kubernetes Benchmark compliance out of the box — no manual hardening&lt;/li&gt;
&lt;li&gt;FIPS 140-2 for regulated environments (finance, government, healthcare)&lt;/li&gt;
&lt;li&gt;Automated etcd snapshots — point-in-time restore capability&lt;/li&gt;
&lt;li&gt;Multiple CNI choices (Canal, Calico, Cilium) for varied network requirements&lt;/li&gt;
&lt;li&gt;Excellent Rancher multi-cluster management integration&lt;/li&gt;
&lt;li&gt;Automated certificate rotation&lt;/li&gt;
&lt;li&gt;Strong air-gap support for isolated environments&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;4 GB RAM minimum makes it unsuitable for edge/IoT&lt;/li&gt;
&lt;li&gt;Longer startup time (~2 minutes)&lt;/li&gt;
&lt;li&gt;More operationally complex than K3s&lt;/li&gt;
&lt;li&gt;Overkill for non-compliance use cases&lt;/li&gt;
&lt;li&gt;Tightly coupled to the Rancher ecosystem&lt;/li&gt;
&lt;li&gt;Larger binary and resource footprint&lt;/li&gt;
&lt;li&gt;etcd only — no SQLite lightweight option&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Enterprise, compliance-driven, and government workloads&lt;/strong&gt; where security hardening and audit-readiness are non-negotiable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Scoring Across 8 Dimensions
&lt;/h2&gt;




&lt;p&gt;Scores are relative, on a 1–10 scale where higher is better:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;KIND&lt;/th&gt;
&lt;th&gt;Minikube&lt;/th&gt;
&lt;th&gt;MicroK8s&lt;/th&gt;
&lt;th&gt;K3s&lt;/th&gt;
&lt;th&gt;Vcluster&lt;/th&gt;
&lt;th&gt;k0s&lt;/th&gt;
&lt;th&gt;RKE2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ease of use&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Production readiness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Resource efficiency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-node support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Addon ecosystem&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Edge / IoT fit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-tenancy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CI/CD suitability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
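&lt;p&gt;The scores above can be folded into a single ranking once you decide how much each dimension matters to you. The sketch below does that with a weighted sum. The weights and the short dimension keys (&lt;code&gt;ease&lt;/code&gt;, &lt;code&gt;production&lt;/code&gt;, and so on) are made up for illustration; only the scores themselves come from the table.&lt;/p&gt;

```shell
# Scores copied from the table above, one row per dimension.
# Columns after the key: KIND Minikube MicroK8s K3s Vcluster k0s RKE2
scores() {
    printf '%s\n' \
        'ease 7 9 8 7 5 6 4' \
        'production 2 2 7 9 8 8 10' \
        'resources 7 5 9 10 8 8 3' \
        'multinode 9 4 8 9 8 8 9' \
        'addons 5 9 8 6 5 4 7' \
        'edge 1 1 6 10 1 7 3' \
        'tenancy 1 1 2 3 10 2 4' \
        'cicd 10 7 7 7 9 6 5'
}

# Hypothetical helper: weighted_totals "key=weight key=weight ..."
# Dimensions without a weight count as 0. Output is sorted, best first.
weighted_totals() {
    scores | awk -v spec="$1" '
        BEGIN {
            split("KIND Minikube MicroK8s K3s Vcluster k0s RKE2", name, " ")
            split(spec, pair, " ")
            for (i in pair) { split(pair[i], kv, "="); weight[kv[1]] = kv[2] }
        }
        # Column c of a row belongs to distribution number c - 1.
        { for (c = NF; c > 1; c--) total[c - 1] += weight[$1] * $c }
        END { for (c in total) print name[c], total[c] }
    ' | sort -k2,2nr
}
```

&lt;p&gt;For example, &lt;code&gt;weighted_totals 'edge=3 resources=2 production=1'&lt;/code&gt; puts K3s first with 59 points, which matches the edge-focused recommendations elsewhere in this post.&lt;/p&gt;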




&lt;h2&gt;
  
  
  Use Case Decision Guide
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your Situation&lt;/th&gt;
&lt;th&gt;Best Choice&lt;/th&gt;
&lt;th&gt;Runner-Up&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Actions / GitLab CI pipelines&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;KIND&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Vcluster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local development on macOS/Windows/Linux&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Minikube&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MicroK8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Developer on Ubuntu workstation&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MicroK8s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raspberry Pi cluster at home&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;K3s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MicroK8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Industrial IoT / factory floor&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;K3s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;k0s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ARM-based edge server&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;K3s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MicroK8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production workload on lightweight infra&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;K3s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MicroK8s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Government / regulated enterprise&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;RKE2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;k0s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FIPS 140-2 compliance required&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;RKE2 or k0s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-tenant dev environments&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Vcluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Namespace isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-team isolated clusters&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Vcluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;KIND&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mixed Linux OS fleet&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;k0s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Air-gap / offline environment&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;K3s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;k0s or RKE2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing Kubernetes itself&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;KIND&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HA on bare metal with minimal ops&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MicroK8s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K3s embedded etcd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes with Rancher management&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;RKE2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;K3s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  The Decision Tree
&lt;/h3&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Do you need production-grade?
├── No → Is it for CI/CD testing?
│         ├── Yes → KIND
│         └── No  → Are you on Ubuntu?
│                   ├── Yes → MicroK8s
│                   └── No  → Minikube
└── Yes → Do you need compliance (FIPS/CIS)?
          ├── Yes → RKE2 (CIS+FIPS) or k0s (FIPS)
          └── No  → Is it edge/IoT/ARM?
                    ├── Yes → K3s
                    └── No  → Need multi-tenancy?
                              ├── Yes → Vcluster
                              └── No  → K3s or MicroK8s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
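&lt;p&gt;The tree above translates directly into a small shell function. This is only a sketch: the function name &lt;code&gt;pick_distribution&lt;/code&gt; and the yes/no argument order are assumptions made for the example, while the branches and answers mirror the tree.&lt;/p&gt;

```shell
# Sketch of the decision tree above. Each argument is "yes" or "no":
#   $1 production-grade?   $2 CI/CD testing?   $3 on Ubuntu?
#   $4 compliance (FIPS/CIS)?   $5 edge/IoT/ARM?   $6 multi-tenancy?
pick_distribution() {
    production="$1"; cicd="$2"; ubuntu="$3"
    compliance="$4"; edge="$5"; tenancy="$6"

    # Left half of the tree: non-production use.
    if [ "$production" != "yes" ]; then
        if [ "$cicd" = "yes" ]; then echo "KIND"
        elif [ "$ubuntu" = "yes" ]; then echo "MicroK8s"
        else echo "Minikube"
        fi
        return 0
    fi

    # Right half of the tree: production use.
    if [ "$compliance" = "yes" ]; then echo "RKE2 or k0s"
    elif [ "$edge" = "yes" ]; then echo "K3s"
    elif [ "$tenancy" = "yes" ]; then echo "Vcluster"
    else echo "K3s or MicroK8s"
    fi
}
```

&lt;p&gt;Calling &lt;code&gt;pick_distribution yes no no no yes no&lt;/code&gt; (production, edge hardware, no compliance requirement) prints &lt;code&gt;K3s&lt;/code&gt;.&lt;/p&gt;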






&lt;h2&gt;
  
  
  Final Verdict
&lt;/h2&gt;




&lt;p&gt;After a thorough review, the landscape shakes out clearly:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;K3s&lt;/strong&gt; is the most remarkable project in the lightweight Kubernetes space. It delivers a complete, CNCF-certified Kubernetes distribution in under 100 MB, runs on 512 MB of RAM, and works in air-gapped ARM environments. For the vast majority of production lightweight Kubernetes use cases, K3s is the correct answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vcluster&lt;/strong&gt; solves a problem no other distribution addresses: genuine Kubernetes API-level multi-tenancy without dedicated hardware. If you need to give 10 teams their own isolated clusters, Vcluster is the only sensible approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;KIND&lt;/strong&gt; is indispensable for CI/CD. If you run Kubernetes integration tests in any CI system, KIND's 30-second, Docker-native, multi-node clusters are the right tool with no close competitor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minikube&lt;/strong&gt; remains the best onboarding experience for developers who are new to Kubernetes. The addon ecosystem and built-in dashboard lower the barrier to entry substantially.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MicroK8s&lt;/strong&gt; is the best Kubernetes for Ubuntu. If your team lives on Ubuntu workstations and servers, snap-based installation, self-healing, and dqlite HA make it the most frictionless operational experience on that platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;k0s&lt;/strong&gt; fills an important niche: mixed Linux fleets and environments where zero host OS dependencies matter more than community size or addon richness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RKE2&lt;/strong&gt; is the right answer when your compliance officer needs CIS Kubernetes Benchmark and FIPS 140-2. The resource overhead is the price of admission to heavily regulated sectors.&lt;/p&gt;




&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kind.sigs.k8s.io/" rel="noopener noreferrer"&gt;KIND Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://minikube.sigs.k8s.io/docs/" rel="noopener noreferrer"&gt;Minikube Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://microk8s.io/docs" rel="noopener noreferrer"&gt;MicroK8s Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.k3s.io/" rel="noopener noreferrer"&gt;K3s Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.vcluster.com/docs" rel="noopener noreferrer"&gt;Vcluster Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.k0sproject.io/" rel="noopener noreferrer"&gt;k0s Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.rke2.io/" rel="noopener noreferrer"&gt;RKE2 Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cncf.io/certification/software-conformance/" rel="noopener noreferrer"&gt;CNCF Certified Kubernetes Conformance&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This post was written in April 2025. Kubernetes moves fast — always check the official documentation for the latest version information.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt;  &lt;code&gt;kubernetes&lt;/code&gt;  &lt;code&gt;k8s&lt;/code&gt;  &lt;code&gt;k3s&lt;/code&gt;  &lt;code&gt;kind&lt;/code&gt;  &lt;code&gt;minikube&lt;/code&gt;  &lt;code&gt;microk8s&lt;/code&gt;  &lt;code&gt;vcluster&lt;/code&gt;  &lt;code&gt;k0s&lt;/code&gt;  &lt;code&gt;rke2&lt;/code&gt;  &lt;code&gt;devops&lt;/code&gt;  &lt;code&gt;infrastructure&lt;/code&gt;  &lt;code&gt;edge-computing&lt;/code&gt;  &lt;code&gt;cloud-native&lt;/code&gt;  &lt;code&gt;containers&lt;/code&gt;  &lt;code&gt;cncf&lt;/code&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Running k3s on Proxmox: A Multi-Node Cluster with a VM and LXC Worker — The Hard Way and Back</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Tue, 21 Apr 2026 03:30:00 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/running-k3s-on-proxmox-a-multi-node-cluster-with-a-vm-and-lxc-worker-the-hard-way-and-back-1cb4</link>
      <guid>https://dev.to/pendelabhargavasai/running-k3s-on-proxmox-a-multi-node-cluster-with-a-vm-and-lxc-worker-the-hard-way-and-back-1cb4</guid>
      <description>&lt;p&gt;&lt;em&gt;A practical guide covering installation, troubleshooting, and the real story of getting k3s to run inside an LXC container&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvo4jx0xy1tzo3m10e1pj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvo4jx0xy1tzo3m10e1pj.png" alt=" " width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;




&lt;p&gt;Kubernetes is powerful but notorious for being heavy. k3s, the lightweight Kubernetes distribution from Rancher, fixes that. It strips out legacy APIs, bundles containerd, and ships as a single binary under 100 MB, which makes it a good fit for homelabs, edge deployments, and resource-constrained environments.&lt;br&gt;
(More about k3s: &lt;a href="https://traefik.io/glossary/k3s-explained/?ref=adventuresintech.org" rel="noopener noreferrer"&gt;https://traefik.io/glossary/k3s-explained/&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;This is the first in a series of posts describing how to bootstrap a Kubernetes cluster on &lt;a href="https://proxmox.com/?ref=adventuresintech.org" rel="noopener noreferrer"&gt;Proxmox&lt;/a&gt; using Ubuntu VMs and LXC containers. By the end of the series, the aim is to have a fully working Kubernetes (&lt;a href="https://k3s.io/?ref=adventuresintech.org" rel="noopener noreferrer"&gt;k3s&lt;/a&gt;) install including a &lt;a href="https://metallb.universe.tf/?ref=adventuresintech.org" rel="noopener noreferrer"&gt;MetalLB&lt;/a&gt; load balancer, a &lt;a href="https://gateway-api.sigs.k8s.io/guides/getting-started/" rel="noopener noreferrer"&gt;Gateway API&lt;/a&gt; controller, and an Istio service mesh. I’ll also install some sample applications for good measure.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Do I Need a Kubernetes Cluster?
&lt;/h2&gt;



&lt;p&gt;At work, I’ve used large Kubernetes clusters in production on AWS, but there the clusters are abstracted away behind platform teams. That is efficient for delivery, yet it leaves gaps in understanding how scheduling, networking, storage, and controllers really behave under the hood. Setting up your own cluster gives you that missing layer of operational intuition: you get to break things, debug them, and understand why they broke. For someone already running a fairly complex home setup, using Kubernetes as a unifying platform to experiment on, whether or not you fully migrate all your Docker Compose stacks, is less about necessity and more about building practical, transferable expertise.&lt;/p&gt;

&lt;p&gt;In this post I document how I built a three-node k3s cluster on Proxmox VE with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;1 master node&lt;/strong&gt; — a Proxmox VM running Ubuntu&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;1 VM worker node&lt;/strong&gt; — a standard Proxmox VM (worker1)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;1 LXC worker node&lt;/strong&gt; — a Proxmox LXC container (worker2)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The VM setup was straightforward. The LXC setup was not. This post focuses heavily on the LXC journey — the errors, the fixes, the Linux internals involved, and what it finally took to make it work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkgf71akebq4knbgx4fi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkgf71akebq4knbgx4fi.png" width="476" height="106"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 1: Setting Up the Master Node
&lt;/h2&gt;


&lt;h3&gt;
  
  
  &lt;em&gt;Installing k3s Server&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;On the master VM, installing k3s is a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
curl  &lt;span class="nt"&gt;-sfL&lt;/span&gt;  https://get.k3s.io | sh  -

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;k3s sets up a systemd service, installs containerd, and bootstraps a single-node Kubernetes cluster automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;u&gt;Fixing kubectl Access&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;After installation, running &lt;code&gt;kubectl get nodes&lt;/code&gt; immediately fails:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;The connection to the server localhost:8080 was refused&lt;/code&gt; &lt;/p&gt;

&lt;p&gt;This happens because kubectl defaults to &lt;code&gt;localhost:8080&lt;/code&gt; when no kubeconfig is set. k3s stores its kubeconfig at &lt;code&gt;/etc/rancher/k3s/k3s.yaml&lt;/code&gt;. The fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt;  &lt;span class="nt"&gt;-p&lt;/span&gt;  ~/.kube

&lt;span class="nb"&gt;sudo  cp&lt;/span&gt;  /etc/rancher/k3s/k3s.yaml  ~/.kube/config

&lt;span class="nb"&gt;sudo  chown&lt;/span&gt;  &lt;span class="nv"&gt;$USER&lt;/span&gt;:&lt;span class="nv"&gt;$USER&lt;/span&gt;  ~/.kube/config

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
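&lt;p&gt;The kubeconfig that k3s generates points at &lt;code&gt;https://127.0.0.1:6443&lt;/code&gt;, so the copy above works only on the master itself. A minimal sketch for producing a copy that works from another machine follows. The helper name &lt;code&gt;kubeconfig_for_remote&lt;/code&gt; is hypothetical, and &lt;code&gt;192.168.1.44&lt;/code&gt; is the master address used in the join commands later in this post.&lt;/p&gt;

```shell
# Hypothetical helper: print a kubeconfig with the loopback server address
# replaced by the address other machines use to reach the master.
kubeconfig_for_remote() {
    src="$1"        # path to a copy of k3s.yaml
    master_ip="$2"  # for example 192.168.1.44
    sed "s#https://127\.0\.0\.1:6443#https://${master_ip}:6443#" "$src"
}
```

&lt;p&gt;On the master, something like &lt;code&gt;kubeconfig_for_remote ~/.kube/config 192.168.1.44 &amp;gt; remote-config&lt;/code&gt; would write a file you can copy to a workstation and use as its &lt;code&gt;~/.kube/config&lt;/code&gt;.&lt;/p&gt;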



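&lt;p&gt;Alternatively, export the path permanently. By default &lt;code&gt;/etc/rancher/k3s/k3s.yaml&lt;/code&gt; is readable only by root, so this variant suits a root shell:&lt;br&gt;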
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt;  &lt;span class="s1"&gt;'export KUBECONFIG=/etc/rancher/k3s/k3s.yaml'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc

&lt;span class="nb"&gt;source&lt;/span&gt;  ~/.bashrc

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Retrieve the Node Token
&lt;/h3&gt;

&lt;p&gt;Worker nodes need a token to join the cluster. Grab it from the master:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;sudo  cat&lt;/span&gt;  /var/lib/rancher/k3s/server/node-token

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keep this value — it is used in every worker join command.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 2: Adding the VM Worker (worker1)
&lt;/h2&gt;




&lt;h3&gt;
  
  
  &lt;em&gt;Joining the Cluster&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;On the worker VM, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
curl  &lt;span class="nt"&gt;-sfL&lt;/span&gt;  https://get.k3s.io | &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;K3S_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://192.168.1.44:6443  &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;K3S_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;node-token&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
sh  -

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;u&gt;Problem: Node Password Rejected&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;The agent started but immediately logged:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
Node password rejected, duplicate hostname or contents of

'/etc/rancher/node/password' may not match server node-passwd entry

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This happened because the worker VM had previously joined the cluster. k3s stores a node password on both the node (&lt;code&gt;/etc/rancher/node/password&lt;/code&gt;) and the master (as a Kubernetes secret). When they don't match, the server rejects the node.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix — on the worker:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;sudo  &lt;/span&gt;systemctl  stop  k3s-agent

&lt;span class="nb"&gt;sudo  rm&lt;/span&gt;  &lt;span class="nt"&gt;-f&lt;/span&gt;  /etc/rancher/node/password

&lt;span class="nb"&gt;sudo  rm&lt;/span&gt;  &lt;span class="nt"&gt;-rf&lt;/span&gt;  /var/lib/rancher/k3s/agent/

&lt;span class="nb"&gt;sudo  &lt;/span&gt;systemctl  start  k3s-agent

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fix — on the master, delete the stale secret:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
kubectl  get  secrets  &lt;span class="nt"&gt;-n&lt;/span&gt;  kube-system | &lt;span class="nb"&gt;grep  &lt;/span&gt;node-password

kubectl  delete  secret  worker1.node-password.k3s  &lt;span class="nt"&gt;-n&lt;/span&gt;  kube-system

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
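&lt;p&gt;The secret name follows the pattern seen above, the node name followed by &lt;code&gt;.node-password.k3s&lt;/code&gt;. As a small sketch, the cleanup can be wrapped so it works for any node. Both function names are hypothetical, and the second one needs working &lt;code&gt;kubectl&lt;/code&gt; access to the cluster.&lt;/p&gt;

```shell
# Hypothetical helper: print the kube-system secret name that holds a
# node's password, following the pattern worker1.node-password.k3s.
node_password_secret() {
    printf '%s.node-password.k3s\n' "$1"
}

# Hypothetical wrapper: delete the stale secret for one node.
# Requires kubectl access to the cluster; nothing runs until you call it.
delete_stale_node_password() {
    kubectl delete secret "$(node_password_secret "$1")" -n kube-system
}
```

&lt;p&gt;Running &lt;code&gt;delete_stale_node_password worker1&lt;/code&gt; on the master is then equivalent to the second command above.&lt;/p&gt;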



&lt;h3&gt;
  
  
  &lt;u&gt;Problem: Duplicate Hostname&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;Both the master and worker had the hostname &lt;code&gt;k3s&lt;/code&gt;. k3s uses the hostname as the node name, so the server rejected the second node as a duplicate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix — rename the worker:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;sudo  &lt;/span&gt;hostnamectl  set-hostname  worker1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After renaming the worker, cleaning up the stale secret, and restarting &lt;code&gt;k3s-agent&lt;/code&gt;, the worker joined successfully.&lt;/p&gt;
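&lt;p&gt;Because k3s takes the node name from the hostname, it helps to check for clashes before joining more machines. A minimal sketch, where the helper name &lt;code&gt;duplicate_hostnames&lt;/code&gt; is hypothetical and the hostnames are collected by hand:&lt;/p&gt;

```shell
# Hypothetical helper: print every hostname that appears more than once
# in the argument list. No output means all names are unique.
duplicate_hostnames() {
    printf '%s\n' "$@" | sort | uniq -d
}
```

&lt;p&gt;With the situation described above, &lt;code&gt;duplicate_hostnames k3s k3s&lt;/code&gt; prints &lt;code&gt;k3s&lt;/code&gt;. If renaming a host is not an option, k3s also accepts an explicit node name through the &lt;code&gt;K3S_NODE_NAME&lt;/code&gt; environment variable at install time.&lt;/p&gt;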




&lt;h2&gt;
  
  
  Part 3: The LXC Worker — The Real Story
&lt;/h2&gt;




&lt;h3&gt;
  
  
  What is an LXC Container?
&lt;/h3&gt;

&lt;p&gt;LXC (Linux Containers) is a lightweight virtualisation technology. Unlike VMs, which emulate full hardware, LXC containers share the host kernel directly. They use Linux namespaces for isolation and cgroups for resource control. That makes them faster and more efficient than VMs, at the cost of weaker isolation.&lt;/p&gt;

&lt;p&gt;Proxmox LXC containers can be &lt;strong&gt;privileged&lt;/strong&gt; (root inside = root on host) or &lt;strong&gt;unprivileged&lt;/strong&gt; (root inside maps to a regular user on host via UID namespacing). Unprivileged is the default and more secure option.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the LXC Container
&lt;/h3&gt;

&lt;p&gt;In Proxmox, I created a Debian Trixie LXC container with:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xq11x12p45pm40cx5l9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xq11x12p45pm40cx5l9.png" width="800" height="487"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Joining the Cluster&lt;/em&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
curl  &lt;span class="nt"&gt;-sfL&lt;/span&gt;  https://get.k3s.io | &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;K3S_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://192.168.1.44:6443  &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;K3S_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;node-token&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
sh  -

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The install script ran and printed &lt;code&gt;[INFO] systemd: Starting k3s-agent&lt;/code&gt; — and then nothing. It just hung.&lt;/p&gt;

&lt;p&gt;Checking the journal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
journalctl  &lt;span class="nt"&gt;-u&lt;/span&gt;  k3s-agent  &lt;span class="nt"&gt;-f&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  &lt;u&gt;Error 1: &lt;code&gt;/dev/kmsg: no such file or directory&lt;/code&gt;&lt;/u&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;
Error: failed to run Kubelet: failed to create kubelet: open /dev/kmsg: no such file or directory

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What is &lt;code&gt;/dev/kmsg&lt;/code&gt;?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/dev/kmsg&lt;/code&gt; is the kernel message buffer device. The Linux kernel uses it to log messages (this is what &lt;code&gt;dmesg&lt;/code&gt; reads). kubelet uses it to watch for OOM (Out of Memory) kill events via the &lt;code&gt;oomWatcher&lt;/code&gt;. Without it, kubelet refuses to start.&lt;/p&gt;

&lt;p&gt;In an unprivileged LXC container, &lt;code&gt;/dev/kmsg&lt;/code&gt; does not exist because the container does not have access to kernel devices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix — bind mount from host:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;/etc/pve/lxc/209.conf&lt;/code&gt; on the Proxmox host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;
&lt;span class="n"&gt;lxc&lt;/span&gt;.&lt;span class="n"&gt;mount&lt;/span&gt;.&lt;span class="n"&gt;entry&lt;/span&gt;: /&lt;span class="n"&gt;dev&lt;/span&gt;/&lt;span class="n"&gt;kmsg&lt;/span&gt; &lt;span class="n"&gt;dev&lt;/span&gt;/&lt;span class="n"&gt;kmsg&lt;/span&gt; &lt;span class="n"&gt;none&lt;/span&gt; &lt;span class="n"&gt;bind&lt;/span&gt;,&lt;span class="n"&gt;create&lt;/span&gt;=&lt;span class="n"&gt;file&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This bind mounts the host's &lt;code&gt;/dev/kmsg&lt;/code&gt; into the container. Stop and start (not restart) the LXC:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
pct  stop  209

pct  start  209

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  &lt;u&gt;Error 2: &lt;code&gt;/dev/kmsg: operation not permitted&lt;/code&gt;&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;After adding the bind mount, the error changed slightly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;
open /dev/kmsg: operation not permitted

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The file now existed in the container but the process was not allowed to open it. The container was still running in user namespace mode (unprivileged), and AppArmor was blocking the access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix — disable AppArmor restriction:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;
&lt;span class="n"&gt;lxc&lt;/span&gt;.&lt;span class="n"&gt;apparmor&lt;/span&gt;.&lt;span class="n"&gt;profile&lt;/span&gt;: &lt;span class="n"&gt;unconfined&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AppArmor is a Linux Security Module that applies mandatory access control policies. The default Proxmox LXC AppArmor profile blocks access to kernel devices like &lt;code&gt;/dev/kmsg&lt;/code&gt;. Setting it to &lt;code&gt;unconfined&lt;/code&gt; removes all AppArmor restrictions for this container.&lt;/p&gt;
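&lt;p&gt;The two &lt;code&gt;/dev/kmsg&lt;/code&gt; failures can be told apart from inside the container before starting k3s. The following is a rough sketch with a hypothetical helper name, &lt;code&gt;check_kmsg&lt;/code&gt;. It tries to open the device the same way a reader would, using &lt;code&gt;dd&lt;/code&gt; with &lt;code&gt;count=0&lt;/code&gt; so that nothing is actually read.&lt;/p&gt;

```shell
# Hypothetical check: report whether a device path exists and can be opened
# for reading. Defaults to /dev/kmsg, the path kubelet opens at startup.
check_kmsg() {
    path="${1:-/dev/kmsg}"
    if [ ! -e "$path" ]; then
        echo "missing"
    # dd with count=0 only opens the file, which is the step kubelet fails at.
    elif dd if="$path" of=/dev/null count=0 2>/dev/null; then
        echo "ok"
    else
        echo "cannot open"
    fi
}
```

&lt;p&gt;In terms of the errors above, &lt;code&gt;missing&lt;/code&gt; corresponds to Error 1 (no bind mount yet) and &lt;code&gt;cannot open&lt;/code&gt; to Error 2 (the file exists but access is blocked).&lt;/p&gt;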




&lt;h3&gt;
  
  
  &lt;u&gt;Error 3: &lt;code&gt;/proc/sys/kernel/panic: read-only file system&lt;/code&gt;&lt;/u&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;
Failed to start ContainerManager:

open /proc/sys/kernel/panic: read-only file system

open /proc/sys/kernel/panic_on_oops: read-only file system

open /proc/sys/vm/overcommit_memory: read-only file system

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What is &lt;code&gt;/proc/sys&lt;/code&gt;?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/proc&lt;/code&gt; is a virtual filesystem the kernel exposes so userspace can read and write kernel parameters. &lt;code&gt;/proc/sys/&lt;/code&gt; specifically contains sysctl values — tuneable kernel settings.&lt;/p&gt;

&lt;p&gt;kubelet needs to write to these on startup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;kernel/panic&lt;/code&gt; — configure kernel panic timeout&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;kernel/panic_on_oops&lt;/code&gt; — whether a kernel oops causes a panic&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;vm/overcommit_memory&lt;/code&gt; — memory overcommit policy&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In an unprivileged LXC container, &lt;code&gt;/proc/sys&lt;/code&gt; is mounted read-only for safety. No process inside the container, not even root, can modify these values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix — mount proc and sys as read-write:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;
&lt;span class="n"&gt;lxc&lt;/span&gt;.&lt;span class="n"&gt;mount&lt;/span&gt;.&lt;span class="n"&gt;auto&lt;/span&gt;: &lt;span class="s2"&gt;"proc:rw sys:rw"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells LXC to mount &lt;code&gt;/proc&lt;/code&gt; and &lt;code&gt;/sys&lt;/code&gt; fully read-write, so the sysctl files under &lt;code&gt;/proc/sys&lt;/code&gt; are no longer read-only.&lt;/p&gt;
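&lt;p&gt;Whether the change took effect can be checked from inside the container by looking at the mount table. This sketch assumes a read-only &lt;code&gt;/proc/sys&lt;/code&gt; shows up as its own entry in &lt;code&gt;/proc/mounts&lt;/code&gt;, which is how such a remount typically appears; the helper name &lt;code&gt;mount_mode&lt;/code&gt; is hypothetical.&lt;/p&gt;

```shell
# Hypothetical helper: print "ro" or "rw" for a mount point, based on a
# mounts table in /proc/mounts format (device, mount point, type, options).
# When a mount point is listed more than once, the last entry wins.
# Prints nothing if the mount point is not listed at all.
mount_mode() {
    mount_point="$1"
    table="${2:-/proc/mounts}"
    awk -v mp="$mount_point" '
        $2 == mp {
            mode = "rw"
            split($4, opt, ",")
            for (i in opt) if (opt[i] == "ro") mode = "ro"
        }
        END { if (mode != "") print mode }
    ' "$table"
}
```

&lt;p&gt;Before the fix, &lt;code&gt;mount_mode /proc/sys&lt;/code&gt; would be expected to print &lt;code&gt;ro&lt;/code&gt;. After a full stop and start it should print &lt;code&gt;rw&lt;/code&gt;, or nothing if &lt;code&gt;/proc/sys&lt;/code&gt; is no longer a separate mount.&lt;/p&gt;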




&lt;h3&gt;
  
  
  &lt;u&gt;Error 4: Various Permission Denied Errors&lt;/u&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;
write /proc/self/oom_score_adj: permission denied

Failed to set sysctl: open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These were caused by the container still running as unprivileged — the process was root inside the container but mapped to a normal user on the host, so many privileged operations were blocked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix — switch to privileged container:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;
&lt;span class="py"&gt;unprivileged&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the most significant change. A privileged container maps root inside to actual root on the host. This removes the UID namespace remapping that caused most of the permission errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Also needed:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;
&lt;span class="n"&gt;lxc&lt;/span&gt;.&lt;span class="n"&gt;cgroup2&lt;/span&gt;.&lt;span class="n"&gt;devices&lt;/span&gt;.&lt;span class="n"&gt;allow&lt;/span&gt;: &lt;span class="n"&gt;a&lt;/span&gt;

&lt;span class="n"&gt;lxc&lt;/span&gt;.&lt;span class="n"&gt;cap&lt;/span&gt;.&lt;span class="n"&gt;drop&lt;/span&gt;:

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;cgroup2.devices.allow: a&lt;/code&gt; — allows the container access to all devices via the cgroup device controller&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;cap.drop:&lt;/code&gt; (empty) clears the list of Linux capabilities that would otherwise be dropped. The stock LXC configuration drops a handful by default (for example &lt;code&gt;sys_module&lt;/code&gt;, &lt;code&gt;sys_rawio&lt;/code&gt;, and &lt;code&gt;sys_time&lt;/code&gt;); with an empty list the container keeps the full set, including &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; and &lt;code&gt;CAP_NET_ADMIN&lt;/code&gt;, which k3s relies on.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Also needed: &lt;code&gt;features: keyctl=1,nesting=1&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
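&lt;li&gt;&lt;p&gt;&lt;code&gt;keyctl=1&lt;/code&gt; allows the &lt;code&gt;keyctl()&lt;/code&gt; system call inside the container, which gives access to the Linux kernel keyring. The container runtime that containerd invokes uses it to create a session keyring for each container it starts.&lt;/p&gt;&lt;/li&gt;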
&lt;li&gt;&lt;p&gt;&lt;code&gt;nesting=1&lt;/code&gt; — enables nested containerisation. k3s runs containerd inside the LXC container, and containerd runs pods (more containers) inside itself. Without nesting enabled, Proxmox blocks the inner container creation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;u&gt;Final Working LXC Config&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;After applying all these changes and doing a full &lt;code&gt;pct stop&lt;/code&gt; / &lt;code&gt;pct start&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
journalctl  &lt;span class="nt"&gt;-u&lt;/span&gt;  k3s-agent  &lt;span class="nt"&gt;-f&lt;/span&gt;

&lt;span class="c"&gt;# ... containerd is now running&lt;/span&gt;

&lt;span class="c"&gt;# ... Server ACTIVE&lt;/span&gt;

&lt;span class="c"&gt;# ... Started kubelet&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
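&lt;p&gt;Collected in one place, the settings discussed in the sections above can be checked against a container's config file on the Proxmox host. This is a sketch rather than a definitive tool: the function names are hypothetical, and it matches whole lines exactly, so a config that writes the same setting differently (for example &lt;code&gt;features: nesting=1,keyctl=1&lt;/code&gt;) would be reported as missing.&lt;/p&gt;

```shell
# The settings discussed above, one per line, as written in this post.
required_lxc_settings() {
    printf '%s\n' \
        'unprivileged: 0' \
        'features: keyctl=1,nesting=1' \
        'lxc.apparmor.profile: unconfined' \
        'lxc.cgroup2.devices.allow: a' \
        'lxc.cap.drop:' \
        'lxc.mount.auto: "proc:rw sys:rw"' \
        'lxc.mount.entry: /dev/kmsg dev/kmsg none bind,create=file'
}

# Hypothetical helper: print each required line that is absent from the
# given config file. No output means every setting is present.
missing_lxc_settings() {
    conf="$1"
    required_lxc_settings | while IFS= read -r line; do
        # -x matches whole lines, -F treats the setting as a fixed string.
        if ! grep -qxF -- "$line" "$conf"; then
            printf '%s\n' "$line"
        fi
    done
}
```

&lt;p&gt;On the Proxmox host, &lt;code&gt;missing_lxc_settings /etc/pve/lxc/209.conf&lt;/code&gt; would list whatever still needs to be added for the container used in this post.&lt;/p&gt;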






&lt;h2&gt;
  
  
  Summary: What Each Modification Does
&lt;/h2&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjchnlx7oapagd55avh30.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjchnlx7oapagd55avh30.png" width="800" height="1022"&gt;&lt;/a&gt;&lt;/p&gt;
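&lt;p&gt;The same mapping from error message to fix can be written as a lookup, which is handy when scanning the agent journal. This is a sketch under the assumption that the log lines contain the error text quoted earlier in this post; the function name &lt;code&gt;suggest_lxc_fix&lt;/code&gt; is hypothetical, and only the primary setting for each error is printed.&lt;/p&gt;

```shell
# Hypothetical helper: given one log line, print the LXC config setting
# that this post used to resolve it. Returns 1 for unrecognised lines.
suggest_lxc_fix() {
    case "$1" in
        *"/dev/kmsg: no such file or directory"*)
            echo 'lxc.mount.entry: /dev/kmsg dev/kmsg none bind,create=file' ;;
        *"/dev/kmsg: operation not permitted"*)
            echo 'lxc.apparmor.profile: unconfined' ;;
        *"read-only file system"*)
            echo 'lxc.mount.auto: "proc:rw sys:rw"' ;;
        *"permission denied"*)
            echo 'unprivileged: 0' ;;
        *)
            return 1 ;;
    esac
}
```

&lt;p&gt;For instance, &lt;code&gt;suggest_lxc_fix "open /proc/sys/kernel/panic: read-only file system"&lt;/code&gt; prints the &lt;code&gt;lxc.mount.auto&lt;/code&gt; line. The permission-denied case also needed the &lt;code&gt;cgroup2&lt;/code&gt;, &lt;code&gt;cap.drop&lt;/code&gt;, and &lt;code&gt;features&lt;/code&gt; settings described above.&lt;/p&gt;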




&lt;h2&gt;
  
  
  Part 4: LXC as a k3s Worker — Features and Limitations
&lt;/h2&gt;




&lt;h3&gt;
  
  
  &lt;u&gt;&lt;em&gt;Features / Advantages&lt;/em&gt;&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Resource efficiency&lt;/strong&gt; — LXC containers consume significantly less memory and CPU than VMs. A VM needs a full OS kernel in memory. An LXC container shares the host kernel, so the overhead is minimal. worker2 running k3s uses around 250–300MB RAM idle versus a VM which would use 500MB+ for the OS alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fast startup&lt;/strong&gt; — LXC containers start in 1–3 seconds versus 15–30 seconds for a VM. For ephemeral worker nodes or autoscaling scenarios this matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Storage efficiency&lt;/strong&gt; — LXC uses the host filesystem directly (with a root filesystem overlay). No separate virtual disk emulation layer. I/O is closer to bare metal performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simple networking&lt;/strong&gt; — LXC containers participate in the same Proxmox bridge (&lt;code&gt;vmbr0&lt;/code&gt;) as VMs. No extra networking configuration is needed for k3s to communicate between the master VM and the LXC worker.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Density&lt;/strong&gt; — you can run more LXC containers on the same Proxmox host than VMs, making it ideal for testing multi-node cluster topologies on limited hardware.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;u&gt;&lt;em&gt;Limitations&lt;/em&gt;&lt;/u&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Shared kernel — no kernel version isolation&lt;/strong&gt; — all LXC containers on a host run the same kernel version as the host. You cannot run a different kernel inside an LXC container. This matters if you need a specific kernel feature or version for your workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Privileged mode is a security trade-off&lt;/strong&gt; — to get k3s working we had to switch to a privileged container and disable AppArmor. In a privileged container, a root escape inside the container gives root on the host. For a homelab or trusted environment this is acceptable; for production or multi-tenant setups it is a significant risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No hardware virtualisation&lt;/strong&gt; — LXC containers cannot run nested VMs. If your workloads need hardware-level isolation or GPU passthrough in the container, a VM is required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kernel module limitations&lt;/strong&gt; — the LXC container cannot load kernel modules that aren't already loaded on the host. During setup we saw:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
modprobe: FATAL: Module br_netfilter not found

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These modules need to be loaded on the Proxmox host, not inside the container.&lt;/p&gt;
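&lt;p&gt;A quick way to see which modules are missing is to parse &lt;code&gt;/proc/modules&lt;/code&gt; on the Proxmox host. The snippet below is a minimal Python sketch, not part of the original setup: the helper names are hypothetical, &lt;code&gt;overlay&lt;/code&gt; is my assumption about what else k3s will want, and the sample text stands in for the real file. Modules compiled into the kernel do not appear in &lt;code&gt;/proc/modules&lt;/code&gt;, so treat a "missing" result as a hint rather than proof.&lt;/p&gt;

```python
# br_netfilter comes from the error above; overlay is an assumption
# added for illustration.
REQUIRED_MODULES = ("br_netfilter", "overlay")


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first column is the module name
    return names


def missing_modules(proc_modules_text, required=REQUIRED_MODULES):
    """Return the required modules that are not loaded, in the given order."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]


# Sample text standing in for the real /proc/modules on the host.
SAMPLE = """overlay 151552 3 - Live 0x0000000000000000
veth 32768 0 - Live 0x0000000000000000
"""

for name in missing_modules(SAMPLE):
    print(f"load on the Proxmox host with: modprobe {name}")
```

&lt;p&gt;On the real host, pass the contents of &lt;code&gt;/proc/modules&lt;/code&gt; instead of the sample and run &lt;code&gt;modprobe&lt;/code&gt; there for anything the function reports.&lt;/p&gt;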

&lt;p&gt;&lt;strong&gt;Some syscalls are blocked&lt;/strong&gt; — even in privileged mode, certain syscalls that could affect the host are restricted. This can cause subtle compatibility issues with some container workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not suitable for untrusted workloads&lt;/strong&gt; — because the kernel is shared, a kernel exploit inside an LXC container could theoretically affect the host and all other containers. Never run untrusted code in a privileged LXC container.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;




&lt;p&gt;Getting k3s running on a Proxmox LXC container is absolutely possible, but it requires understanding why each restriction exists and selectively removing the ones that conflict with k3s's requirements. The journey from a blank LXC to a working cluster node touched on AppArmor, Linux capabilities, cgroups, kernel device access, namespace nesting, and virtual filesystem permissions.&lt;/p&gt;

&lt;p&gt;The key takeaway: LXC containers are not VMs. They share the host kernel, and every security restriction that makes them safe is also a potential blocker for complex software like k3s that expects a full OS environment. The solution is not to blindly disable everything — it is to understand each error, trace it to the underlying Linux feature, and make the minimal change required to unblock it.&lt;/p&gt;

&lt;p&gt;The final cluster, with one control plane VM and two workers (one VM, one LXC), runs stably: k3s handles scheduling and networking across all three nodes, and CoreDNS provides cluster DNS.&lt;/p&gt;

&lt;p&gt;I now have a vanilla multi-node Kubernetes cluster running across Ubuntu VMs and an LXC container, accessible from my machine. It’s got nothing deployed inside it yet, but that’s easily fixed. See you in part 2.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built on Proxmox VE with k3s v1.34.6+k3s1, a Debian Trixie LXC, and Ubuntu VM nodes.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>linux</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>"Why can’t I just mount S3 like a drive?” AWS finally answering that question in 2026</title>
      <dc:creator>Pendela BhargavaSai</dc:creator>
      <pubDate>Sun, 12 Apr 2026 13:35:35 +0000</pubDate>
      <link>https://dev.to/pendelabhargavasai/why-cant-i-just-mount-s3-like-a-drive-aws-finally-answering-that-question-in-2026-4g00</link>
      <guid>https://dev.to/pendelabhargavasai/why-cant-i-just-mount-s3-like-a-drive-aws-finally-answering-that-question-in-2026-4g00</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;From "why can't I just mount S3 like a drive?" to AWS finally answering that question in 2026.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;I've had that conversation more times than I can count.&lt;/p&gt;

&lt;p&gt;A developer joins a new AWS project, looks at the architecture, and asks: &lt;em&gt;"We're already storing everything in S3 — why do we also need EFS? Can't we just mount S3 directly?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And every time, the answer was the same patient explanation about object storage vs file systems, why they're fundamentally different, and why you need separate services for separate workloads. It was the right answer. It just wasn't a satisfying one.&lt;/p&gt;

&lt;p&gt;That changed in April 2026 when AWS launched &lt;strong&gt;S3 Files&lt;/strong&gt; — and suddenly that conversation got a lot shorter.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne1ezqqr8ls1axsuyqwh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne1ezqqr8ls1axsuyqwh.png" alt=" " width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But before we get there, let's start from the beginning. Because understanding &lt;em&gt;why&lt;/em&gt; S3 Files matters requires understanding the problem it's solving. And that means understanding the full AWS storage landscape.&lt;/p&gt;


&lt;h2&gt;
  
  
  The AWS Storage Trinity (Before S3 Files)
&lt;/h2&gt;

&lt;p&gt;AWS has three primary storage services, each built for a completely different purpose. Engineers often get confused because on the surface they all seem to do the same thing: store data. But the &lt;em&gt;way&lt;/em&gt; they store it — and who can access it and how — is completely different.&lt;/p&gt;

&lt;p&gt;Here's the simplest way I know to think about it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;S3&lt;/strong&gt; is like a giant library. You can store billions of books (objects), and anyone with the right access can retrieve any book. But to fix a typo on page 47, you have to reprint the entire book.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EBS&lt;/strong&gt; is like a hard drive physically attached to your computer. Super fast, but only your computer can use it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EFS&lt;/strong&gt; is like a shared office filing cabinet on a network. Anyone in the office can open a drawer, pull out a folder, and edit a document — at the same time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's go deeper on each one.&lt;/p&gt;


&lt;h2&gt;
  
  
  Amazon S3 — Object Storage Built for Scale
&lt;/h2&gt;

&lt;p&gt;S3 (Simple Storage Service) launched in 2006 and fundamentally changed how the world thinks about storing data. The core idea is simple: you have &lt;strong&gt;buckets&lt;/strong&gt;, and inside buckets you store &lt;strong&gt;objects&lt;/strong&gt;. Each object is just a file plus its metadata, stored at a unique key (think of it like a URL).&lt;/p&gt;
&lt;h3&gt;
  
  
  What makes S3 special
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Virtually unlimited scale.&lt;/strong&gt; S3 stores more than 500 trillion objects across hundreds of exabytes today.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;11 nines of durability (99.999999999%).&lt;/strong&gt; AWS automatically replicates your data across at least three Availability Zones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pay only for what you use.&lt;/strong&gt; No minimum capacity, no infrastructure to manage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple storage classes.&lt;/strong&gt; From S3 Standard (~$0.023/GB-month) down to Glacier Deep Archive (~$0.00099/GB-month) for data you almost never touch.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  The one thing S3 cannot do
&lt;/h3&gt;

&lt;p&gt;Here's the catch that trips everyone up: &lt;strong&gt;S3 is not a file system.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you store something in S3, it becomes an immutable object. If you want to change even a single character in a file, you have to download the entire object, make your change, and re-upload the whole thing as a new object. There's no such thing as "open this file and edit line 47." That's just not how object storage works.&lt;/p&gt;

&lt;p&gt;This isn't a bug — it's by design. The immutability of objects is part of what makes S3 so durable and scalable. But it creates real friction for any workload that needs to &lt;em&gt;work with&lt;/em&gt; data the way normal applications do: open a file, read some bytes, write some bytes, save.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What you can do with S3&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;myfile.txt s3://my-bucket/myfile.txt    &lt;span class="c"&gt;# upload&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://my-bucket/myfile.txt ./myfile.txt  &lt;span class="c"&gt;# download&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;rm &lt;/span&gt;s3://my-bucket/myfile.txt               &lt;span class="c"&gt;# delete&lt;/span&gt;

&lt;span class="c"&gt;# What you CANNOT do&lt;/span&gt;
&lt;span class="c"&gt;# Open myfile.txt and append a line — impossible without full re-upload&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
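&lt;p&gt;To make the cost of that design concrete, here is a toy in-memory model in Python. It is not the S3 API or boto3: the &lt;code&gt;ObjectStore&lt;/code&gt; class and the &lt;code&gt;append_line&lt;/code&gt; helper are hypothetical and only mimic whole-object PUT and GET so the bytes moved can be counted.&lt;/p&gt;

```python
class ObjectStore:
    """Toy model of whole-object PUT/GET semantics (not the real S3 API)."""

    def __init__(self):
        self._objects = {}
        self.bytes_transferred = 0

    def put(self, key, data):
        # A PUT always replaces the entire object stored under the key.
        self._objects[key] = bytes(data)
        self.bytes_transferred += len(data)

    def get(self, key):
        data = self._objects[key]
        self.bytes_transferred += len(data)
        return data


def append_line(store, key, line):
    """Emulate an append: download everything, modify locally, re-upload."""
    current = store.get(key)
    store.put(key, current + line)


store = ObjectStore()
store.put("myfile.txt", b"x" * 1000)
store.bytes_transferred = 0            # count only the append from here on
append_line(store, "myfile.txt", b"new line\n")
print(store.bytes_transferred)         # full download plus full re-upload
```

&lt;p&gt;Appending nine bytes moves roughly twice the size of the object. Scale the 1,000 bytes up to a multi-gigabyte file and the friction becomes obvious.&lt;/p&gt;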



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihtdjmavdki0i9x7xit0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihtdjmavdki0i9x7xit0.jpg" alt=" " width="800" height="397"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Amazon EBS — The Fast Attached Drive
&lt;/h2&gt;

&lt;p&gt;EBS (Elastic Block Store) is block storage — the AWS equivalent of an SSD attached directly to your server. When you launch an EC2 instance, the root volume (where the operating system lives) is an EBS volume.&lt;/p&gt;

&lt;h3&gt;
  
  
  What EBS is good at
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speed.&lt;/strong&gt; EBS delivers single-digit millisecond latency because it behaves like a local disk.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;POSIX semantics.&lt;/strong&gt; You can open files, write individual bytes, seek to specific positions — everything a normal file system supports.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency.&lt;/strong&gt; What you write is immediately readable. No eventual consistency concerns.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The hard limit of EBS
&lt;/h3&gt;

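&lt;p&gt;&lt;strong&gt;An EBS volume can normally be attached to only one EC2 instance at a time.&lt;/strong&gt; The exception is Multi-Attach, which is limited to Provisioned IOPS (io1/io2) volumes within a single Availability Zone and needs a cluster-aware file system to be used safely.&lt;/p&gt;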

&lt;p&gt;This means if you have a cluster of 10 EC2 instances all running your application, each one needs its own EBS volume. They can't share data through EBS. If instance A writes a file, instance B can't see it without some kind of sync mechanism.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EC2 Instance A  →  EBS Volume A  (can't share)
EC2 Instance B  →  EBS Volume B  (separate, isolated)
EC2 Instance C  →  EBS Volume C  (separate, isolated)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For single-instance workloads — databases, operating system volumes, single-server applications — EBS is excellent. The moment you need shared storage across multiple servers, you hit a wall.&lt;/p&gt;




&lt;h2&gt;
  
  
  Amazon EFS — The Shared Network Drive
&lt;/h2&gt;

&lt;p&gt;EFS (Elastic File System) is AWS's managed Network File System (NFS). Think of it as a shared drive that any number of EC2 instances, containers, or Lambda functions can mount simultaneously and use like a local file system.&lt;/p&gt;

&lt;h3&gt;
  
  
  What EFS solves
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent access.&lt;/strong&gt; Thousands of compute resources can mount and use the same EFS volume at the same time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full POSIX semantics.&lt;/strong&gt; Open files, edit bytes in-place, file locking, directory operations — everything works.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scales automatically.&lt;/strong&gt; The file system grows and shrinks as you add or remove files. No capacity planning required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-millisecond read latency&lt;/strong&gt; on Standard tier.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EC2 Instance A  ──┐
EC2 Instance B  ──┤──→  EFS Volume  (all share the same files)
EC2 Instance C  ──┤
Lambda Function ──┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F368aftu96o0epx2bty0g.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F368aftu96o0epx2bty0g.jpg" alt=" " width="800" height="588"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Where EFS falls short
&lt;/h3&gt;

&lt;p&gt;The pricing model. &lt;strong&gt;EFS charges you for every gigabyte stored, whether you touched it this month or not.&lt;/strong&gt; Standard tier is $0.30/GB-month — roughly 13x more expensive than S3 Standard per gigabyte.&lt;/p&gt;

&lt;p&gt;This is fine when your data is "hot" (actively accessed). It's painful when you have petabytes of data where only a fraction is actively used at any time. You end up paying full file system prices for data that's sitting idle.&lt;/p&gt;

&lt;p&gt;And the other problem: &lt;strong&gt;EFS has zero native integration with S3.&lt;/strong&gt; They're completely separate systems. Your data lake is in S3. Your compute needs EFS. So you write sync scripts to copy data back and forth — and now you have two copies of everything, two storage bills, and a manual process that breaks at the worst possible times.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Old Workflow Pain (The Problem All of This Creates)
&lt;/h2&gt;

&lt;p&gt;Before S3 Files, a typical ML or data engineering team's workflow looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;S3 Data Lake
    ↓  (manual copy — takes time, costs money)
EFS Volume
    ↓  (mount on EC2)
EC2 Training Job
    ↓  (output back to EFS)
    ↓  (another manual copy)
S3 Data Lake  ← results stored here for analytics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every arrow in that diagram is a point of failure. Every copy step is a delay, a cost, and a potential for the two copies to drift out of sync. Engineers were spending real engineering hours maintaining these sync pipelines — hours that weren't building anything valuable.&lt;/p&gt;

&lt;p&gt;This is the problem that s3fs tried to solve, years before AWS had an official answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  s3fs-fuse — The Community's Workaround
&lt;/h2&gt;

&lt;p&gt;If you've been working with AWS for a few years, you've probably encountered &lt;code&gt;s3fs-fuse&lt;/code&gt;. It's an open-source FUSE (Filesystem in Userspace) tool that lets you mount an S3 bucket as a local directory on Linux, macOS, or FreeBSD.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;s3fs

&lt;span class="c"&gt;# Configure credentials&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"ACCESS_KEY_ID:SECRET_ACCESS_KEY"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; ~/.passwd-s3fs
&lt;span class="nb"&gt;chmod &lt;/span&gt;600 ~/.passwd-s3fs

&lt;span class="c"&gt;# Mount your bucket&lt;/span&gt;
s3fs my-bucket /mnt/s3-data &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;passwd_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~/.passwd-s3fs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, you can run &lt;code&gt;ls&lt;/code&gt;, &lt;code&gt;cp&lt;/code&gt;, &lt;code&gt;cat&lt;/code&gt; — your S3 bucket looks like a local folder. For a quick demo or a simple use case, it feels magical.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's actually happening under the hood
&lt;/h3&gt;

&lt;p&gt;Here's the thing nobody tells you upfront: s3fs isn't &lt;em&gt;really&lt;/em&gt; giving you file system access to S3. It's translating file commands into S3 API calls — and the translation has serious limitations.&lt;/p&gt;

&lt;p&gt;When you "edit" a file through s3fs, this is what actually happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: nano myfile.txt  (make a small change, save)
     ↓
s3fs: GET entire object from S3 → download to local temp cache
s3fs: You edit the local temp copy
s3fs: On file close → PUT entire object back to S3 (full re-upload)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Change one character in a 10GB file? s3fs downloads all 10GB, makes the change, and uploads all 10GB again. Every time.&lt;/p&gt;

&lt;h3&gt;
  
  
  The real limitations you need to know
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;No file locking.&lt;/strong&gt; If two processes write to the same file through s3fs at the same time, the last full upload wins and the other writer's changes are lost. You get no error message, just silent data corruption.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No atomic renames.&lt;/strong&gt; Renaming a file in s3fs copies it to a new key and deletes the old one. Any application that relies on atomic renames (which includes most databases and many log processors) will break.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slow directory listings.&lt;/strong&gt; Every &lt;code&gt;ls&lt;/code&gt; turns into one or more paginated &lt;code&gt;ListObjects&lt;/code&gt; API calls to S3, at up to 1,000 keys per page. On a bucket with millions of objects, this is painfully slow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No hard links.&lt;/strong&gt; S3 has no concept of links. s3fs can emulate symbolic links as specially marked objects, but hard links are not available.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Operation          | What s3fs does              | Problem
-------------------|-----------------------------|-----------------------
Read file          | GET entire object           | Slow for large files
Edit file          | Download → edit → full PUT  | Expensive re-upload
Append to file     | Rewrite entire object       | Very expensive
Rename file        | Copy + Delete               | Not atomic
File lock          | Not supported               | Data corruption risk
List directory     | ListObjects API call        | Slow on large buckets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
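&lt;p&gt;The "Copy + Delete" row deserves a closer look. Below is a small Python sketch that uses a plain dict as a stand-in for a bucket. It is not s3fs code, and &lt;code&gt;rename_steps&lt;/code&gt; is a hypothetical helper. It records which keys are visible after each step of the emulated rename.&lt;/p&gt;

```python
def rename_steps(bucket, old_key, new_key):
    """Yield the visible keys after each step of a copy-then-delete rename."""
    bucket[new_key] = bucket[old_key]   # step 1: copy to the new key
    yield sorted(bucket)
    del bucket[old_key]                 # step 2: delete the old key
    yield sorted(bucket)


bucket = {"logs/app.log": b"entries"}
states = list(rename_steps(bucket, "logs/app.log", "logs/app.log.1"))
for step, keys in enumerate(states, start=1):
    print(f"after step {step}: {keys}")
```

&lt;p&gt;After step 1 both keys exist at once. A reader that lists the directory at that moment sees the file twice, and a crash between the two steps leaves both copies behind. That window is exactly what an atomic rename is supposed to rule out.&lt;/p&gt;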



&lt;p&gt;s3fs works well for lightweight, read-heavy, single-process use cases. But the moment you need multi-process access, in-place edits, or production reliability — it starts breaking down. The community built it because AWS didn't have a better answer. Eventually, AWS tried building their own version.&lt;/p&gt;




&lt;h2&gt;
  
  
  Mountpoint for S3 — AWS's Open-Source Attempt (2023)
&lt;/h2&gt;

&lt;p&gt;In 2023, AWS released &lt;strong&gt;Mountpoint for S3&lt;/strong&gt;, their own open-source FUSE client. It was faster than s3fs-fuse and better optimised for cloud-native read-heavy workloads.&lt;/p&gt;

&lt;p&gt;But it still couldn't do in-place edits, directory renames, or file locking. It was better than s3fs-fuse, but it still hit the same fundamental ceiling: you can't make S3's API behave like a real file system by pretending.&lt;/p&gt;

&lt;p&gt;AWS knew this. Internally, they'd been trying to solve it properly for years.&lt;/p&gt;




&lt;h2&gt;
  
  
  Amazon S3 Files — The Real Solution (April 2026)
&lt;/h2&gt;

&lt;p&gt;On April 7, 2026, AWS launched &lt;strong&gt;S3 Files&lt;/strong&gt;, arguably the most significant S3 update since the service launched.&lt;/p&gt;

&lt;p&gt;The internal project was even called "EFS3" at one point. One engineer on the team described the design process as &lt;em&gt;"a battle of unpalatable compromises."&lt;/em&gt; Getting object storage and file system semantics to truly coexist is genuinely hard engineering. Every design decision forced a tradeoff where either the file presentation or the object presentation had to give something up.&lt;/p&gt;

&lt;p&gt;What they landed on is clever: instead of trying to make the S3 API &lt;em&gt;behave&lt;/em&gt; like a file system (which is what s3fs does), they did the opposite — they took a real, production-grade file system (EFS) and connected it directly to S3 storage.&lt;/p&gt;

&lt;h3&gt;
  
  
  How S3 Files actually works
&lt;/h3&gt;

&lt;p&gt;S3 Files uses a &lt;strong&gt;two-tier architecture&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tier 1 — EFS Cache Layer (hot data)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stores your active working set: recently written files, recently read files, metadata&lt;/li&gt;
&lt;li&gt;Delivers ~1ms latency&lt;/li&gt;
&lt;li&gt;Serves small files (under 128KB by default) entirely from cache&lt;/li&gt;
&lt;li&gt;Handles all NFS file operations — open, read, write, rename, lock&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tier 2 — S3 Bucket (your full dataset)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Holds your complete data at normal S3 prices (~$0.023/GB-month)&lt;/li&gt;
&lt;li&gt;Large reads (1MB+) bypass the cache entirely and stream directly from S3 for free&lt;/li&gt;
&lt;li&gt;Changes made through the file system sync back to S3 automatically within minutes
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Application
      ↓  (NFS mount — standard Linux file operations)
EFS Cache Layer  ←→  Smart Router
      ↓                    ↓
   Hot data            Cold/large data
   (~1ms)              (streams from S3, free)
      ↓                    ↓
      └────────────────────┘
                  ↓
            S3 Bucket
       (your data, always here)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;your data never leaves S3.&lt;/strong&gt; The EFS cache is just a smart caching layer on top. You're not maintaining two copies — you have one copy in S3, accessible via both the S3 API and the file system mount simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgk0a728p2arun7vntopk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgk0a728p2arun7vntopk.png" alt=" " width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Old way vs. new way
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F662i5jvty3f4qfmx1swi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F662i5jvty3f4qfmx1swi.png" alt=" " width="800" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting started in 3 steps
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Create an S3 file system&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the AWS Console → S3 → File Systems → Create file system. Enter your bucket name, done.&lt;/p&gt;

&lt;p&gt;Or via CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api create-file-system &lt;span class="nt"&gt;--bucket&lt;/span&gt; my-bucket
aws s3api create-mount-target &lt;span class="nt"&gt;--file-system-id&lt;/span&gt; fs-xxxx &lt;span class="nt"&gt;--subnet-id&lt;/span&gt; subnet-xxxx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2: Mount it on your EC2 instance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Make sure the &lt;code&gt;amazon-efs-utils&lt;/code&gt; package is installed (preinstalled on AWS AMIs), then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /mnt/s3files
&lt;span class="nb"&gt;sudo &lt;/span&gt;mount &lt;span class="nt"&gt;-t&lt;/span&gt; s3files fs-0aa860d05df9afdfe:/ /mnt/s3files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Use it like any local directory&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a file&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Hello S3 Files"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /mnt/s3files/hello.txt

&lt;span class="c"&gt;# Edit it in place&lt;/span&gt;
&lt;span class="c"&gt;# Append to it in place&lt;/span&gt;

&lt;span class="c"&gt;# List files&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /mnt/s3files/

&lt;span class="c"&gt;# The same data is accessible via S3 API too&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;ls &lt;/span&gt;s3://my-bucket/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Changes you make through the file system mount appear in S3 within minutes. Changes made directly to the S3 bucket appear in the file system within seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security — what you need to know
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;IAM integration for access control at both file system and object level&lt;/li&gt;
&lt;li&gt;Data encrypted in transit using TLS 1.3&lt;/li&gt;
&lt;li&gt;Data encrypted at rest using SSE-S3 (or KMS if you prefer customer-managed keys)&lt;/li&gt;
&lt;li&gt;POSIX permissions (UID/GID) stored as S3 object metadata&lt;/li&gt;
&lt;li&gt;Monitor via CloudWatch metrics and CloudTrail logs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing — the part that actually makes sense
&lt;/h3&gt;

&lt;p&gt;S3 Files charges EFS-level rates, but &lt;strong&gt;only on the fraction of data you're actively working with&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What you pay for&lt;/th&gt;
&lt;th&gt;Rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;High-performance storage (hot data)&lt;/td&gt;
&lt;td&gt;$0.30/GB-month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reads (small files served from cache)&lt;/td&gt;
&lt;td&gt;$0.03/GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Writes&lt;/td&gt;
&lt;td&gt;$0.06/GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Everything else in your S3 bucket&lt;/td&gt;
&lt;td&gt;Standard S3 rates (~$0.023/GB-month)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you have a 100TB dataset but only 1TB is actively used at any time, you pay EFS rates on 1TB and S3 rates on the other 99TB. AWS claims up to 90% cost savings compared to the old pattern of cycling data between S3 and a dedicated EFS volume.&lt;/p&gt;
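&lt;p&gt;Here is that 100TB example as back-of-the-envelope arithmetic in Python. This is a rough sketch using only the storage rates quoted in this post. It ignores the per-GB read and write charges, assumes decimal terabytes (1TB = 1,000GB), and compares against a baseline I picked for illustration: keeping the whole dataset on EFS-priced storage. The function name is hypothetical.&lt;/p&gt;

```python
EFS_RATE = 0.30     # USD per GB-month, hot-tier rate quoted above
S3_RATE = 0.023     # USD per GB-month, S3 Standard rate quoted above
GB_PER_TB = 1000    # assumption: decimal terabytes


def monthly_storage_cost(total_tb, hot_tb):
    """Storage-only monthly cost when hot_tb is billed at the hot rate."""
    hot_gb = hot_tb * GB_PER_TB
    cold_gb = (total_tb - hot_tb) * GB_PER_TB
    return hot_gb * EFS_RATE + cold_gb * S3_RATE


tiered = monthly_storage_cost(total_tb=100, hot_tb=1)
all_hot = monthly_storage_cost(total_tb=100, hot_tb=100)
saving = 1 - tiered / all_hot
print(f"tiered: ${tiered:,.0f}  all-hot: ${all_hot:,.0f}  saving: {saving:.0%}")
```

&lt;p&gt;With these inputs the tiered bill is about $2,577 a month against $30,000 for the all-hot baseline, which lands close to the 90% figure. Your own ratio depends almost entirely on how small the hot working set is relative to the full dataset.&lt;/p&gt;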




&lt;h2&gt;
  
  
  Putting It All Together — Which Service Should You Use?
&lt;/h2&gt;

&lt;p&gt;Here's the honest answer:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use this&lt;/th&gt;
&lt;th&gt;When you need&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bulk storage, backups, data lakes, analytics, static assets, anything accessed via API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EBS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OS volumes, databases, single-instance high-performance storage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EFS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shared file system for legacy NAS migration, on-premises workloads moving to cloud, apps that need pure NFS without S3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Files&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ML pipelines, agentic AI workflows, data engineering, any workload where both S3 API and file system access are needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;s3fs-fuse&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Quick prototypes, read-heavy single-process scripts, legacy apps where you can't change the architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The quick comparison
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy897f8eanwniy70iw975.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy897f8eanwniy70iw975.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters for ML and AI Workloads
&lt;/h2&gt;

&lt;p&gt;If you're building machine learning pipelines or agentic AI systems, S3 Files is worth paying close attention to.&lt;/p&gt;

&lt;p&gt;The old workflow was: data lives in S3 → copy to EFS before training → run training job → copy results back to S3. For large datasets, that copy step alone could take hours. You were also paying double storage costs during the transition.&lt;/p&gt;

&lt;p&gt;With S3 Files, your training job mounts the S3 bucket directly. The EFS cache warms up as your training job reads data. No copy step. No sync script. No duplicate storage.&lt;/p&gt;

&lt;p&gt;For agentic AI systems specifically, where multiple agents need to coordinate through shared files, read each other's outputs, and maintain shared state, S3 Files provides the concurrent NFS access with close-to-open consistency that these workloads need. Standard Python file operations and standard shell tools all work against data that lives in S3.&lt;/p&gt;
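
&lt;p&gt;To make the shared-file pattern concrete, here is a minimal sketch of two agents exchanging JSON files through a shared directory using only the Python standard library. It is an illustration under assumptions, not an S3 Files API: the &lt;code&gt;SHARED_DIR&lt;/code&gt; variable, the &lt;code&gt;publish&lt;/code&gt; and &lt;code&gt;read_outputs&lt;/code&gt; helpers, and the agent names are all hypothetical, and the sketch falls back to a local temp directory so it runs anywhere. The write-to-temp-then-rename step is a generic file system habit, assumed here to behave on the mount as it does on an ordinary NFS share.&lt;/p&gt;

```python
import json
import os
import tempfile

# Hypothetical location of the mounted bucket. Falls back to a local temp
# directory so the sketch runs without any mount at all.
SHARED_DIR = os.environ.get("SHARED_DIR") or tempfile.mkdtemp(prefix="shared-")


def publish(agent, payload, shared_dir=SHARED_DIR):
    """Write one agent's output and close it before exposing the final name."""
    final_path = os.path.join(shared_dir, agent + ".json")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    # The file is closed at this point. Close-to-open consistency means a
    # reader that opens the file after this close sees the complete contents.
    os.replace(tmp_path, final_path)
    return final_path


def read_outputs(shared_dir=SHARED_DIR):
    """Open every published file fresh and return its contents by agent name."""
    outputs = {}
    for name in sorted(os.listdir(shared_dir)):
        if not name.endswith(".json"):
            continue  # skips in-progress ".json.tmp" files
        with open(os.path.join(shared_dir, name), encoding="utf-8") as fh:
            outputs[name[: -len(".json")]] = json.load(fh)
    return outputs


if __name__ == "__main__":
    publish("planner", {"step": 1, "task": "summarize"})
    publish("worker", {"step": 1, "status": "done"})
    print(read_outputs())
```

&lt;p&gt;The point of the sketch is that nothing in it knows about S3: if the directory is a mounted bucket, the same code applies unchanged. Readers open each file anew on every call rather than holding handles open, which is the access pattern that close-to-open consistency is designed around.&lt;/p&gt;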




&lt;h2&gt;
  
  
  The Short Version
&lt;/h2&gt;

&lt;p&gt;For a decade, AWS storage was a choice: pay S3 prices and lose file system semantics, or pay EFS prices and lose S3 integration. Teams wrote sync scripts, maintained duplicate data, and spent engineering time on storage plumbing instead of actual product work.&lt;/p&gt;

&lt;p&gt;s3fs-fuse was the community's best attempt at a workaround — and it worked, up to a point. But it was always emulating file system behavior on top of an API that wasn't designed for it.&lt;/p&gt;

&lt;p&gt;S3 Files is the first time AWS has genuinely solved this at the right layer. Real NFS semantics, real S3 storage, real production reliability. One bucket, two protocols, no compromises.&lt;/p&gt;

&lt;p&gt;If you've ever maintained a sync script between your data lake and your compute layer, you know exactly what problem this solves. And you know exactly how good it feels to delete that script.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/s3/features/files/" rel="noopener noreferrer"&gt;Amazon S3 Files product page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/aws/launching-s3-files-making-s3-buckets-accessible-as-file-systems/" rel="noopener noreferrer"&gt;AWS Blog: Launching S3 Files&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-files.html" rel="noopener noreferrer"&gt;S3 Files documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/s3fs-fuse/s3fs-fuse" rel="noopener noreferrer"&gt;s3fs-fuse on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/s3/pricing/" rel="noopener noreferrer"&gt;Amazon S3 pricing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/efs/pricing/" rel="noopener noreferrer"&gt;Amazon EFS pricing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=zb8TdNJhZCk" rel="noopener noreferrer"&gt;Intro to S3 Files by Darko Mesaros&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Published April 2026. All pricing figures reflect us-east-1 as of the time of writing.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If this helped you, drop a reaction or leave a comment — curious what storage patterns others are running into in the wild.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>machinelearning</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
