Michael Levan

Posted on Apr 6, 2022

Prerequisites To Learning And Understanding Kubernetes

#kubernetes #devops #git #cloud

Kubernetes is at the top of everyone’s mind today. From engineers to CTOs and senior leadership teams, everyone is hearing about this whole “Kubernetes” thing and its promise to make application and cloud-native deployments easier.

The problem is you simply cannot just “turn on some Kubernetes” and poof, all of your problems go away. Not only is Kubernetes extremely complex, but there are several underlying components that make up Kubernetes. With the underlying components, there’s a ton of underlying knowledge an engineer needs to truly understand Kubernetes.

In this blog post, you’ll learn what you need to understand before even breaking into Kubernetes if you want to be successful with it.

Operating Systems

First things first: operating systems (OS). For Kubernetes purposes, you can keep the list of operating systems very brief:

Linux
Windows

Kubernetes runs somewhere, even the cloud Kubernetes services like Azure Kubernetes Service (AKS) and Elastic Kubernetes Service (EKS). That somewhere is an operating system. The operating system is usually Linux, but AKS now supports Windows Node Pools, so you can run Windows workloads as well.

If you try breaking into Kubernetes without understanding operating systems, chances are you won’t be able to debug a node group (the Kubernetes Worker Nodes where your Kubernetes apps are running) or keep the underlying operating systems up to date from a security standpoint.
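As a rough illustration of keeping an eye on the operating systems under your cluster, here is a minimal Python sketch that summarizes the OS of each node. It assumes you have the JSON that `kubectl get nodes -o json` returns; the sample data, node names, and the `summarize_node_os` function are hypothetical and only there for the example.

```python
import json

# Hypothetical, trimmed sample of what `kubectl get nodes -o json` returns.
SAMPLE_NODES_JSON = """
{
  "items": [
    {
      "metadata": {"name": "worker-1"},
      "status": {"nodeInfo": {
        "operatingSystem": "linux",
        "osImage": "Ubuntu 20.04.4 LTS",
        "kernelVersion": "5.4.0-1072-azure"
      }}
    },
    {
      "metadata": {"name": "worker-2"},
      "status": {"nodeInfo": {
        "operatingSystem": "windows",
        "osImage": "Windows Server 2019 Datacenter",
        "kernelVersion": "10.0.17763.2686"
      }}
    }
  ]
}
"""


def summarize_node_os(nodes_json):
    """Return one small record per node describing its operating system."""
    nodes = json.loads(nodes_json)
    summary = []
    for node in nodes.get("items", []):
        info = node.get("status", {}).get("nodeInfo", {})
        summary.append({
            "name": node.get("metadata", {}).get("name", "<unknown>"),
            "os": info.get("operatingSystem", "<unknown>"),
            "image": info.get("osImage", "<unknown>"),
            "kernel": info.get("kernelVersion", "<unknown>"),
        })
    return summary


if __name__ == "__main__":
    for row in summarize_node_os(SAMPLE_NODES_JSON):
        print(f'{row["name"]}: {row["os"]} / {row["image"]} / kernel {row["kernel"]}')
```

A list like this is only a starting point. You still need OS knowledge to decide whether a given image or kernel version is out of date.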

To learn operating systems for this purpose, pretty much any “linux fundamentals” free course on YouTube would do the trick.

Infrastructure

Speaking of operating systems, they run somewhere too. That somewhere is infrastructure. The infrastructure can be anything from:

Bare metal
Virtual machines in a hypervisor like ESXi or Hyper-V
EC2 instances in AWS
Azure virtual machines

and several other places.

At one point or another, you will have to troubleshoot the underlying infrastructure. The goal is definitely to not have to RDP or SSH into servers, but sometimes you might have to.

Not to mention that to run Kubernetes, you need to understand infrastructure architecture. Things like how many Kubernetes worker nodes you should run, data centers they should run in, high availability, auto scaling, and network connectivity all play a crucial role in Kubernetes operating the way it should.
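To make the "how many worker nodes" question a bit more concrete, here is a back-of-the-napkin sketch in Python. It assumes every Pod requests the same CPU and memory and every node is the same size, and it ignores resources reserved for the OS and Kubernetes itself. The function name and the numbers are made up for illustration.

```python
import math


def estimate_worker_nodes(pod_count, cpu_per_pod, mem_per_pod_gib,
                          node_cpu, node_mem_gib, spare_nodes=1):
    """Estimate how many worker nodes cover the requested CPU and memory.

    spare_nodes adds headroom so that losing a node (or draining one for
    patching) does not leave Pods with nowhere to run.
    """
    if node_cpu <= 0 or node_mem_gib <= 0:
        raise ValueError("node capacity must be positive")
    # Size for whichever resource runs out first.
    nodes_for_cpu = math.ceil(pod_count * cpu_per_pod / node_cpu)
    nodes_for_mem = math.ceil(pod_count * mem_per_pod_gib / node_mem_gib)
    return max(nodes_for_cpu, nodes_for_mem) + spare_nodes


if __name__ == "__main__":
    # 30 Pods at 0.5 CPU and 1 GiB each, on nodes with 4 CPUs and 16 GiB.
    print(estimate_worker_nodes(30, 0.5, 1, node_cpu=4, node_mem_gib=16))
```

In this example CPU is the limiting resource, which is exactly the kind of thing you only notice once you understand the infrastructure underneath.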

To learn infrastructure, you’ll typically get this knowledge from a systems administrator, infrastructure engineer, cloud engineer, or virtualization engineer style role.

Storage

When you’re deploying a stateful application, that state needs to be stored somewhere. Kubernetes Pods and those containers in the Pods are 100% ephemeral, meaning, they are meant to be turned off, destroyed, and re-deployed. Because of that, you can’t rely on out-of-the-box Kubernetes Pods to manage state for you.

With that being said, you have to understand storage. Storage, from an IT perspective, is really just where you store data. That data can be anything from:

Database data
Pictures
Audio clippings

and pretty much anything else. When thinking about storage, think about Google Drive, Dropbox, etc.

When it comes to applications, you’ll typically see data stored for the application that cannot be deleted without corrupting the application. You’ll also see applications that need to connect to a database, like MySQL.

The primary ways that Kubernetes typically works with some type of stateful data are:

StatefulSets
Volumes
Database connections from Kubernetes to a database like MySQL

When it comes to an application, something always has to be stored somewhere. It’s no different when it comes to Kubernetes.
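To give you a feel for what working with volumes looks like, here is a small Python sketch that builds a PersistentVolumeClaim and a Pod that mounts it, then prints both as JSON. The names (`app-data`, `demo-app`), the `nginx:1.21` image, the size, and the mount path are all assumptions for the example. Whether the claim actually gets storage depends on the storage classes available in your cluster.

```python
import json


def pvc_manifest(name, size):
    """Build a PersistentVolumeClaim requesting `size` of storage, e.g. "1Gi"."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_using_claim(pod_name, image, claim_name, mount_path):
    """Build a Pod whose single container mounts the given claim."""
    volume_name = "data"  # links the volume definition to the container mount
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": pod_name,
                "image": image,
                "volumeMounts": [{"name": volume_name, "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": volume_name,
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


if __name__ == "__main__":
    claim = pvc_manifest("app-data", "1Gi")
    pod = pod_using_claim("demo-app", "nginx:1.21", "app-data", "/data")
    print(json.dumps(claim, indent=2))
    print(json.dumps(pod, indent=2))
```

The Pod can be destroyed and re-created, but anything written under the mount path lives in the claimed volume rather than in the Pod itself.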

Networking

Much like infrastructure, networking is needed for Kubernetes. In fact, networking is one of the most crucial aspects of Kubernetes. Without knowledge of networking, ports, load balancers, and firewalls, you won’t be able to deploy a Kubernetes application that needs to either:

Send data to or receive data from another backend Kubernetes application
Be public facing so users can reach it, like a frontend web app or website

Networking is huge in Kubernetes for all aspects of communication between Kubernetes Deployments, Pods, and Services.
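One piece of that communication is how a Service finds the Pods it sends traffic to: it matches labels. The Python sketch below builds a LoadBalancer Service manifest and mimics that label matching. It is a simplified model for learning, not how Kubernetes implements it, and the names, labels, and port numbers are assumptions.

```python
def service_manifest(name, selector, port, target_port):
    """Build a Service that exposes `port` and forwards to `target_port` on Pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",  # public facing; ClusterIP would stay internal
            "selector": dict(selector),
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


def service_selects(service, pod_labels):
    """Return True if every selector key/value is present in the Pod's labels."""
    selector = service["spec"].get("selector") or {}
    if not selector:
        # A Service without a selector does not pick any Pods on its own.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


if __name__ == "__main__":
    svc = service_manifest("frontend", {"app": "frontend"}, port=80, target_port=8080)
    print(service_selects(svc, {"app": "frontend", "tier": "web"}))
    print(service_selects(svc, {"app": "backend"}))
```

If a Service never seems to reach your app, a label mismatch like the second case is one of the first things worth checking.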

If you’re not up to speed with networking, you should definitely read a Network+ book or watch some videos on YouTube.

Security

Have you heard of the most recent security breach? Sure you have! There’s about fifty million per day!

Obviously exaggerating, but you get the point.

When it comes to breaches, a lot of the time it’s related to application security. Because you’re deploying applications to Kubernetes, you need to understand how to secure them. This doesn’t mean you need to be a red hat hacker taking down the FBI, but you do need to understand how to deploy an application in a secure fashion and also monitor it.

The other bit is the actual infrastructure layer of Kubernetes and the security that’s needed there. It’s not just about the application, but about who has permissions to which application, which cluster, and which portions of Kubernetes.

Below are a few tips from a security standpoint when it comes to Kubernetes:

Ensure you have a solid grasp of Role Based Access Control (RBAC)
API security (who’s accessing it)
Firewall and encryption
Audit logging
Ensure that if a new version of Kubernetes comes out and it has a security update, you plan the upgrade
Ensure that Kubernetes Secrets are at the forefront of your mind
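On that last point, one thing worth seeing for yourself is that the values in a Secret manifest's `data` field are base64-encoded, not encrypted. The Python sketch below builds a Secret and then decodes it right back. The secret name and values are made up, and the helper functions are only for illustration.

```python
import base64


def encode_secret_data(values):
    """Base64-encode each value, as the `data` field of a Secret expects."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }


def decode_secret_data(data):
    """Reverse the encoding. No key or password is needed to do this."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


def secret_manifest(name, values):
    """Build an Opaque Secret manifest from plain-text values."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encode_secret_data(values),
    }


if __name__ == "__main__":
    secret = secret_manifest("db-credentials", {"username": "admin"})
    print(secret["data"])                      # looks scrambled...
    print(decode_secret_data(secret["data"]))  # ...but is trivially readable
```

This is why who can read Secrets (RBAC) and how they are stored matter so much. Anyone who can see the manifest can see the value.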

Programming

What's an application made of? Code of course! If you're deploying an application, you need to understand coding. There are a lot of people that believe this isn't true, but it's a reality. I won't try to change your mind, but I'll give you a few viewpoints:

If you're deploying an application to Kubernetes, it's almost certain you will have to troubleshoot at some point. If you don't have debugging experience, you'll fall short on the needs of deploying Kubernetes.
If you're just starting out with Kubernetes, chances are you'll need an application to deploy. Do you REALLY want to rely on applications that someone wrote on GitHub? Not to mention that even if an application is already written, you'll still have to do some work to prepare it to be deployed via Kubernetes (Dockerfiles, manifests, etc.). In psychology (specifically NLP), there's an idea around what you can control and what you cannot control. A lot of your energy should go into what you can control. Don't rely on others to get you to where you want to be. Don't rely on the need to find someone else's code.

Do you need to be a 10+ year programmer building the next Twitter or Instagram? No, absolutely not. Do you need to know how to troubleshoot code errors and write your own code (functions, automation code, etc.)? Absolutely.

Putting on your developer hat is a big part of Kubernetes. In fact, any Kubernetes application running is created from a Kubernetes Manifest, which is YAML code. Before I get roasted, I know YAML isn’t a programming language. That’s just a reference to explain that pretty much everything you do in Kubernetes is code-related.
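To show how code-like a manifest really is, here is a minimal Python sketch that generates a Deployment manifest and prints it as JSON, which `kubectl` accepts as well as YAML. The app name, the `nginx:1.21` image, the replica count, and the port are assumptions chosen for the example.

```python
import json


def deployment_manifest(name, image, replicas=2, port=80):
    """Build a Deployment whose selector matches the labels on its Pods."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the Pod template labels below.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(deployment_manifest("web", "nginx:1.21"), indent=2))
```

If you save the output to a file such as `deployment.json`, you could deploy it the same way you would a YAML manifest. Variables, functions, and data structures show up even in something this small.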

If you’re new to programming, you can’t go wrong with learning Python and/or Go (golang). Learn basic things like functions, variables, APIs and how to troubleshoot code. After that, you should be well on your way.

Automation

When thinking about all of the bonuses that Kubernetes gives you in terms of deploying and managing applications, the next question is how do I deploy applications to Kubernetes?

Surely you don’t want to manually sit at a terminal running kubectl create -f deployment.yaml all day long, so you have to have a way to deploy applications to Kubernetes. There are several options, like GitOps, but GitOps is an advanced topic that you should learn after learning Kubernetes.

The best place to start from an automation standpoint would be CI/CD.

CI/CD stands for Continuous Integration (CI) and Continuous Deployment/Delivery (CD). CI is a way to package up your code to get it ready for deployment. When you think of “packaging the code up”, think of it like putting the code into a well-defined tiny little box that’s safe, secure, and has all of the requirements it needs to run successfully. CD is about taking that well-packaged code and sending it to some environment. That environment could be anything from VMs on-prem to the cloud to Kubernetes.
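The flow described above can be sketched in a few lines of Python. This is a toy model of a pipeline, not a real CI/CD tool: the stage names and the stand-in stage functions are hypothetical, and in practice each stage would run tests, build a container image, or apply manifests to a cluster.

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order and stop at the first failing step.

    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name, step in stages:
        if not step():
            # Never deploy something that failed an earlier stage.
            return completed, name
        completed.append(name)
    return completed, None


# Stand-in stages for the example; each returns True on success.
def run_tests():
    return True


def build_image():
    return True


def deploy_to_cluster():
    return True


if __name__ == "__main__":
    pipeline = [
        ("test", run_tests),            # CI: check the code
        ("build", build_image),         # CI: package it up
        ("deploy", deploy_to_cluster),  # CD: send it to an environment
    ]
    completed, failed = run_pipeline(pipeline)
    print("completed:", completed, "failed:", failed)
```

The key idea is the ordering: the deploy step only runs if everything before it succeeded.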

Troubleshooting

Last but certainly, absolutely, 110% not least, troubleshooting. I’ll keep this section brief because if you’ve worked in tech for more than one month, you’ll know what troubleshooting is. Essentially, it’s the art of figuring out a problem in your tech stack.

You will find yourself time and time again looking at Kubernetes logs from the cluster level to the Pod/application level to figure out what’s going on with failed deployments. To have the ability to do this, you’ll need solid troubleshooting skills.
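As a small example of that kind of digging, here is a Python sketch that scans Pod statuses for containers that are stuck waiting or restarting a lot. It assumes input shaped like the output of `kubectl get pods -o json`; the sample Pods, the restart threshold, and the function name are made up for illustration.

```python
# Hypothetical, trimmed sample of what `kubectl get pods -o json` returns.
SAMPLE_PODS = {
    "items": [
        {"metadata": {"name": "web-abc"},
         "status": {"containerStatuses": [
             {"name": "web", "restartCount": 0, "state": {"running": {}}}]}},
        {"metadata": {"name": "api-xyz"},
         "status": {"containerStatuses": [
             {"name": "api", "restartCount": 7,
              "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}},
        {"metadata": {"name": "worker-123"},
         "status": {"containerStatuses": [
             {"name": "worker", "restartCount": 5, "state": {"running": {}}}]}},
    ]
}


def find_problem_containers(pod_list, max_restarts=3):
    """Return (pod, container, reason, restarts) for containers worth a look.

    A container is flagged if it is in a waiting state (for any reason,
    including harmless ones like still being created) or if it has
    restarted more than max_restarts times.
    """
    problems = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            waiting = status.get("state", {}).get("waiting")
            reason = waiting.get("reason", "Waiting") if waiting else None
            restarts = status.get("restartCount", 0)
            if reason or restarts > max_restarts:
                problems.append(
                    (pod_name, status["name"], reason or "HighRestartCount", restarts)
                )
    return problems


if __name__ == "__main__":
    for pod, container, reason, restarts in find_problem_containers(SAMPLE_PODS):
        print(f"{pod}/{container}: {reason} (restarts: {restarts})")
```

A script like this only tells you where to look. The actual troubleshooting starts when you run `kubectl describe pod` and `kubectl logs` against whatever it flags.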

Unfortunately, there’s really no course or book to teach you troubleshooting (other than psychology-related content to understand your mind a bit more), so the best way to figure out troubleshooting is by doing it.

Did I mention it’s an art?