Originally published on lavkesh.com
Kubernetes grew out of Google's experience running Borg, its internal cluster management system. Google open-sourced Kubernetes in 2014, and it's now the standard for running containerized workloads at scale. If you're running Docker containers in production, you likely need Kubernetes.
Kubernetes automates deployment, scaling, and management of containerized applications. You describe the desired state, and Kubernetes makes it happen and keeps it that way. If a container crashes, Kubernetes restarts it. If traffic spikes and you have autoscaling configured, it spins up more replicas. This means you stop managing servers manually.
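To make the desired-state idea concrete, here is a minimal sketch of a reconcile step in Python. It is a toy model, not how the Kubernetes controllers are actually implemented; the pod dictionaries and the replacement-N names are made up for illustration.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions needed to move the actual state toward the desired state."""
    actions = []
    healthy = [pod for pod in running_pods if pod["healthy"]]

    # In this toy model, unhealthy pods are replaced rather than repaired.
    for pod in running_pods:
        if not pod["healthy"]:
            actions.append(("delete", pod["name"]))

    missing = desired_replicas - len(healthy)
    if missing > 0:
        # Too few healthy pods: create replacements.
        actions.extend(("create", f"replacement-{i}") for i in range(missing))
    elif missing < 0:
        # Too many healthy pods: remove the surplus.
        for pod in healthy[:-missing]:
            actions.append(("delete", pod["name"]))
    return actions


pods = [
    {"name": "webapp-1", "healthy": True},
    {"name": "webapp-2", "healthy": False},
    {"name": "webapp-3", "healthy": True},
]
print(reconcile(3, pods))
# [('delete', 'webapp-2'), ('create', 'replacement-0')]
```

The point is that you never issue the create or delete yourself. You only change the desired number, and the loop works out the difference every time it runs.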
A Kubernetes cluster has two main parts: the control plane and worker nodes. The control plane is the brain. It runs the API server, scheduler, controller manager, and etcd. The API server is the entry point for all cluster operations. etcd is a distributed key-value store that holds the cluster's state. The scheduler assigns workloads to nodes. The controller manager keeps the actual state matching the desired state.
Worker nodes are where your containers actually run. Each node runs kubelet, which communicates with the control plane, kube-proxy, which handles network routing, and a container runtime like containerd.
Some core concepts in Kubernetes include pods, deployments, services, ConfigMaps, Secrets, and namespaces. A pod is the smallest deployable unit. It usually wraps a single container, though you can run multiple tightly coupled containers in one pod. A deployment manages a set of pod replicas. You tell it how many copies of your app to run and what container image to use. It handles rolling updates and rollbacks.
A service provides a stable network endpoint for a set of pods. Pods come and go, but the service keeps a consistent IP and DNS name. Kubernetes handles load balancing across the pods behind it. ConfigMaps and Secrets let you inject configuration and credentials into pods without hardcoding them in container images. Namespaces provide logical partitions within a cluster, which is useful for separating teams or environments.
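How a service finds its pods is easier to see in code. The sketch below models label selection in plain Python: a pod is a backend if it is ready and carries every label in the selector. The pod list, IPs, and labels are hypothetical, and the real endpoint logic in Kubernetes handles more cases than this.

```python
def select_endpoints(selector, pods):
    """Return the IPs of ready pods whose labels include every selector key/value."""
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


pods = [
    {"ip": "10.0.0.4", "ready": True, "labels": {"app": "webapp", "tier": "frontend"}},
    {"ip": "10.0.0.5", "ready": False, "labels": {"app": "webapp", "tier": "frontend"}},
    {"ip": "10.0.0.6", "ready": True, "labels": {"app": "api"}},
]
print(select_endpoints({"app": "webapp"}, pods))
# ['10.0.0.4']
```

This is why pods can come and go freely: the service never tracks individual pods, only the labels they carry.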
To deploy an application, you write a YAML manifest describing your deployment and apply it with kubectl apply -f deployment.yaml. Kubernetes creates the pods and keeps them running; a Service manifest, applied the same way, exposes them. That's the basic flow. For scaling, you can run kubectl scale deployment webapp --replicas=5 or configure a Horizontal Pod Autoscaler (HPA) to scale automatically based on CPU or memory.
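If you prefer to generate manifests rather than hand-write them, the following is a rough sketch that builds a Deployment and a Service as Python dictionaries and prints them as JSON, which kubectl apply -f accepts alongside YAML. The image name example.com/webapp:1.0.0 and the port numbers are placeholders I made up; swap in your own.

```python
import json


def deployment_manifest(name, image, replicas, port):
    """Build a minimal apps/v1 Deployment for a single-container app."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def service_manifest(name, port, target_port):
    """Build a ClusterIP Service that targets pods labelled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


# Hypothetical app: 3 replicas listening on 8080, exposed inside the cluster on 80.
bundle = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        deployment_manifest("webapp", "example.com/webapp:1.0.0", 3, 8080),
        service_manifest("webapp", 80, 8080),
    ],
}
print(json.dumps(bundle, indent=2))
```

Redirect the output to a file such as webapp.json and apply it like any other manifest. The one detail worth checking in any generated manifest is that the Deployment selector, the pod template labels, and the Service selector all agree.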
Kubernetes also supports rolling updates and rollbacks. When you push a new image version, Kubernetes rolls it out gradually. Old pods come down as new ones come up. If the new version is broken, kubectl rollout undo deployment/webapp takes you back to the previous version in seconds.
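How gradual "gradually" is depends on two Deployment settings the text above doesn't name: maxSurge and maxUnavailable, which both default to 25%. As a small sketch, this is the arithmetic for the pod-count bounds during a rollout, assuming percentage values (maxSurge rounds up, maxUnavailable rounds down).

```python
def rollout_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Return the pod-count limits that hold while a rolling update is in progress."""
    surge = -(-replicas * max_surge_pct // 100)           # ceiling division
    unavailable = replicas * max_unavailable_pct // 100   # floor division
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rollout_bounds(10))
# {'max_total_pods': 13, 'min_available_pods': 8}
```

So a 10-replica deployment on the defaults never drops below 8 available pods and never runs more than 13 at once while old and new versions overlap.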
Some essential kubectl commands include kubectl apply -f file.yaml to apply a configuration to the cluster, kubectl get pods to list running pods, kubectl describe pod <name> for detailed info about a pod, kubectl scale deployment <name> --replicas=N to scale a deployment, kubectl rollout status deployment/<name> to check rollout progress, kubectl rollout undo deployment/<name> to roll back to the previous version, and kubectl logs <name> to view pod logs.
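On Azure, AKS (Azure Kubernetes Service) manages the control plane for you. On the Free tier you pay for worker nodes, not the control plane; the Standard tier adds a per-cluster charge in exchange for an uptime SLA. For most workloads, this is much cheaper than running the control plane yourself.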
Most teams underestimate the importance of setting resource requests and limits for containers. I've seen clusters crash at 3am because a single misconfigured app consumed 100% CPU and starved the node. Use kubectl describe node to check allocatable resources. A 16-core node with no limits is a death sentence when 5 apps each try to use 4 cores: that's 20 cores of demand on a 16-core machine. Start with requests of 500m CPU and 512Mi memory per container, and scale up only after observing production usage.
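The 16-core example is simple arithmetic, and it's worth scripting before you deploy. Here is a minimal sketch that converts Kubernetes CPU quantities to millicores and checks whether a set of requests fits on a node. The function names are mine, not part of any Kubernetes client library.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as '500m' or '4' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def fits_on_node(node_allocatable, requests):
    """Return (total requested millicores, whether that total fits on the node)."""
    total = sum(parse_cpu(request) for request in requests)
    return total, total <= parse_cpu(node_allocatable)


print(fits_on_node("16", ["4"] * 5))      # (20000, False)
print(fits_on_node("16", ["500m"] * 5))   # (2500, True)
```

Five apps at 4 cores each overshoot the node by 4 cores, while the same five apps at the suggested 500m starting point use under a sixth of it.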
Memory pressure is a silent killer. A container that exceeds its memory limit is killed by the kernel and shows up as OOMKilled, and when a node crosses the kubelet's eviction thresholds, whole pods are evicted from it. Neither makes much noise. Keep the sum of memory limits on a node about 20% below its capacity to leave room for system daemons. I once spent 4 hours debugging why a service kept restarting in production; it turned out the sysdig agent was eating memory without limits.
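The 20% headroom rule can be checked the same way as CPU. This is a rough sketch, assuming binary suffixes (Ki, Mi, Gi) and a 16Gi node picked purely as an example; it compares the sum of container memory limits against 80% of node capacity.

```python
UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_memory(quantity):
    """Convert a quantity such as '512Mi' or '16Gi' to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already in bytes


def memory_budget(node_capacity, headroom_pct=20):
    """Bytes available to workloads after reserving headroom for system daemons."""
    return parse_memory(node_capacity) * (100 - headroom_pct) // 100


def over_budget(node_capacity, limits, headroom_pct=20):
    """True if the summed memory limits exceed the node's workload budget."""
    total = sum(parse_memory(limit) for limit in limits)
    return total > memory_budget(node_capacity, headroom_pct)


print(over_budget("16Gi", ["512Mi"] * 24))  # False: 12Gi of limits
print(over_budget("16Gi", ["512Mi"] * 26))  # True: 13Gi of limits
```

Any container without a limit, like that monitoring agent, falls outside this sum entirely, which is exactly why it can take the node down.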
Networking can break your day. A service of type ClusterIP is only reachable from inside the cluster; on its own it will never accept traffic from outside. On on-prem bare-metal clusters, use MetalLB to get LoadBalancer services working. On cloud providers, avoid creating a LoadBalancer for every service, because each one provisions its own cloud load balancer and that gets expensive. Instead, run a single Ingress controller (NGINX or Traefik) and route traffic to your ClusterIP services via host-based routing. We once wasted $12k/month on AWS ELBs before switching to a shared Ingress.
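Host-based routing is conceptually just a lookup on the request's Host header. The sketch below shows the idea in Python; the hostnames, service names, and ports are invented, and a real Ingress controller also handles paths, TLS, and wildcards.

```python
# Hypothetical routing table: one shared entry point, many backend services.
INGRESS_RULES = {
    "app.example.com": ("webapp", 80),
    "api.example.com": ("api", 8080),
}


def route(host_header, rules):
    """Pick the backend (service name, port) for a request based on its Host header."""
    host = host_header.split(":")[0].lower()  # drop any port, ignore case
    return rules.get(host)  # None means no rule matched


print(route("API.example.com:443", INGRESS_RULES))  # ('api', 8080)
print(route("unknown.example.com", INGRESS_RULES))  # None
```

Every service behind the table stays a plain ClusterIP service, so you pay for one external load balancer instead of one per service.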
StatefulSets are not your friend unless you need stable network identifiers. Deploying databases in StatefulSets is fine, but for stateless apps, Deployments are 90% more efficient. The reverse mistake hurts too: I've seen teams spend weeks trying to debug why their Redis cluster kept resharding, and it turned out they had mistakenly used a Deployment instead of a StatefulSet. The lesson: only use StatefulSets when you need a stable identity, ordered startup, and persistent storage for each pod.
AKS's managed control plane doesn't absolve you from cost control. We had a customer's AKS cluster grow to 120 nodes in 3 weeks because they left their Horizontal Pod Autoscalers effectively unbounded. The fix? Set maxReplicas to a reasonable number and pair HPA with a Cluster Autoscaler that scales down nodes when idle. For batch jobs, use Azure Batch instead of AKS; the cost difference is 40% for CPU workloads.
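To see why maxReplicas matters, here is a small sketch of the core scaling formula from the HPA documentation: desired replicas = ceil(current replicas x current metric / target metric), clamped between minReplicas and maxReplicas. It deliberately ignores the tolerance band and stabilization windows the real autoscaler applies, and the numbers are illustrative.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Compute the HPA target replica count, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas at 90% CPU with a 60% target: scale to 6.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=20))     # 6

# A runaway metric asks for 60 replicas; the cap decides what you pay for.
print(desired_replicas(10, 300, 50, min_replicas=2, max_replicas=20))   # 20
print(desired_replicas(10, 300, 50, min_replicas=2, max_replicas=500))  # 60
```

With a generous cap, one misbehaving metric multiplies your pod count, and the Cluster Autoscaler adds nodes to hold them.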