Understanding Kubernetes through a concrete example
Chuck Ha Jul 13
Date: July 12th, 2018
Kubernetes Version: v1.11.0
📝 is a sidebar
⚠️ is a warning
Kubernetes is a container orchestration system. You can think of a container as an application and a pod as a group of one or more containers that run together. The orchestration part means you tell Kubernetes which containers you want running, and it takes care of actually running them in your cluster, routing traffic to the correct pods, and much more.
Kubernetes provides a schema for your pod definitions. This means you fill in a well-documented template and give it to the Kubernetes cluster. Kubernetes then figures out what will run where, runs your pods, and configures the cluster network. If a pod crashes, Kubernetes notices that the system is no longer in the defined state and acts to return it to that state.
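To make the idea of a defined state concrete, here is a minimal sketch of one way to declare it. It assumes a Deployment, a controller this post does not otherwise cover, and the name `web-example` and the label `app: web-example` are made up for illustration. The manifest asks for one replica of an nginx pod; if that pod disappears, the controller should create a replacement so that the observed state matches the declared one again.

```yaml
# Hypothetical example: the name and labels are illustrative, not from the post.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-example
spec:
  replicas: 1                # desired state: exactly one pod
  selector:
    matchLabels:
      app: web-example       # which pods this Deployment owns
  template:                  # the pod definition to create pods from
    metadata:
      labels:
        app: web-example     # must match the selector above
    spec:
      containers:
      - name: nginx
        image: registry.hub.docker.com/library/nginx:1.15
        ports:
        - containerPort: 80
```

The rest of this post defines pods directly, which keeps the manifests shorter.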
Let's look at an example: A blogging platform called devtoo.com on a Kubernetes cluster.
The first step is to figure out the components of devtoo.com. Let's say these are all the components necessary:
- A web server that accepts HTTP traffic from the internet. Examples of web servers include nginx and Apache.
- An application server that loads the Rails app into memory and serves requests. This would be the Rails application that powers devtoo.com.
- A database to store all of our awesome posts. Postgres, MySQL, and MongoDB are all examples of databases.
- A cache to bypass the application and database and immediately return a result. Examples of caches include redis and memcached.
The end goal
The next step is to figure out what the final system should look like. Kubernetes gives you a lot of choice here. The components could each run in their own pod or they could all be put into one pod. I like to start at the simplest place and then fix the solution if it sucks. To me, that means each component will be run in its own pod.
A typical web request will enter the system and hit the web server. The web server will ask the cache if it has a result for that endpoint. If it does, the result is returned immediately. If it does not, the request is passed on to the application server. The application server is configured to talk to the database and generate dynamic content which gets sent back to the web browser.
Defining the system
Kubernetes maps services to pods. There will be one service for each pod. This will allow you to reference other pods with DNS.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: registry.hub.docker.com/library/nginx:1.15
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
The service defines a selector, app: web. The service will route traffic to any pod that matches that selector. If you look at the pod definition you will see that there is an app: web label defined on the pod. That means traffic comes into the service on port 80 and gets sent to the nginx pod on the targetPort, also 80 in this case. The containerPort in the pod definition must match the targetPort.
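The two port numbers on the service do not have to be equal. As a sketch, assume a hypothetical image `example.com/hypothetical/web:1.0` whose process listens on 8080 (the image and the name `web-8080` are made up here). The service can still accept traffic on port 80 and forward it to 8080, as long as `targetPort` and `containerPort` agree.

```yaml
# Hypothetical pod and service: the image and names are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: web-8080
  labels:
    app: web-8080
spec:
  containers:
  - name: web
    image: example.com/hypothetical/web:1.0
    ports:
    - containerPort: 8080    # port the container listens on
---
apiVersion: v1
kind: Service
metadata:
  name: web-8080
spec:
  selector:
    app: web-8080            # matches the pod label above
  ports:
  - protocol: TCP
    port: 80                 # port that clients of the service connect to
    targetPort: 8080         # port on the pod that receives the traffic
```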
Here, you wave your magic wand and produce an nginx config, embedded in the nginx image, that sends traffic to the cache first and then, if there is no cached result, to the app server.
Here is the cache definition:
apiVersion: v1
kind: Pod
metadata:
  name: redis
  labels:
    app: cache
spec:
  containers:
  - name: redis
    image: registry.hub.docker.com/library/redis:4.0
    ports:
    - containerPort: 6379
---
apiVersion: v1
kind: Service
metadata:
  name: cache
spec:
  selector:
    app: cache
  ports:
  - protocol: TCP
    port: 6379
    targetPort: 6379
And the database definition:
apiVersion: v1
kind: Pod
metadata:
  name: postgres
  labels:
    app: db
spec:
  containers:
  - name: db
    image: registry.hub.docker.com/library/postgres:10.4
    ports:
    - containerPort: 5432  # Postgres listens on 5432 by default
---
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  selector:
    app: db  # must match the label on the postgres pod
  ports:
  - protocol: TCP
    port: 5432
    targetPort: 5432
Those are all of the dependencies that were considered for this deployment of devtoo.com. Next the application itself must be configured. Rails can use an environment variable to connect to a database. You could define that in the pod YAML like this:
⚠️This is super insecure! Kubernetes has much better ways to do this but I'm omitting them to keep the scope of this post "small".
apiVersion: v1
kind: Pod
metadata:
  name: app
  labels:
    app: app
spec:
  containers:
  - name: devtoo-com
    env:
    - name: DATABASE_URL
      value: postgresql://user1:password1@database/dev_to_db
    image: registry.hub.docker.com/devtoo.com/app:v9001
    ports:
    - containerPort: 3001
---
apiVersion: v1
kind: Service
metadata:
  name: devtoo-com
spec:
  selector:
    app: app
  ports:
  - protocol: TCP
    port: 3001
    targetPort: 3001
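One of the better ways hinted at in the warning is a Secret. The following is a minimal sketch, assuming a Secret named `devtoo-db` with a key `database-url` (both names are made up here). The pod reads the value through `secretKeyRef`, so the connection string no longer appears in the pod definition.

```yaml
# Hypothetical Secret: the name and key are assumptions for illustration.
apiVersion: v1
kind: Secret
metadata:
  name: devtoo-db
type: Opaque
stringData:                  # written as plain text; stored base64-encoded
  database-url: postgresql://user1:password1@database/dev_to_db
---
apiVersion: v1
kind: Pod
metadata:
  name: app
  labels:
    app: app
spec:
  containers:
  - name: devtoo-com
    image: registry.hub.docker.com/devtoo.com/app:v9001
    env:
    - name: DATABASE_URL
      valueFrom:
        secretKeyRef:
          name: devtoo-db    # the Secret defined above
          key: database-url  # the key inside that Secret
    ports:
    - containerPort: 3001
```

Secrets are not encrypted at rest by default, so this is a step up from a literal value in the pod spec rather than a complete answer. Who can read the Secret still needs to be restricted.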
The last piece needed is an ingress point, a place where traffic can enter the cluster from the outside world.
📝 I'm glossing over Ingress controllers because, while required, they are an implementation detail to be ignored at this level of understanding.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dev-to
spec:
  backend:
    serviceName: web
    servicePort: 80
This says that any traffic received at this ingress point will be sent to the service with a name of web on port 80.
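An ingress can also route by hostname and path instead of sending everything to a single backend. As a sketch in the same `extensions/v1beta1` schema, the following sends requests for `devtoo.com` to the `web` service; the name `dev-to-by-host` is made up. Later Kubernetes releases moved Ingress to a different API group with a different backend syntax, so treat this as specific to the version used in this post.

```yaml
# Sketch only: a host-based variant of the ingress above.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dev-to-by-host       # hypothetical name
spec:
  rules:
  - host: devtoo.com         # only requests for this hostname match
    http:
      paths:
      - path: /
        backend:
          serviceName: web   # the service defined earlier
          servicePort: 80
```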
Now that your cluster is set up, let's trace a request for this blog post. You enter http://devtoo.com/chuck_ha/this-post into your browser. devtoo.com resolves in DNS to an IP address that belongs to a load balancer in front of your Kubernetes cluster. The load balancer sends the traffic to your ingress point. Since there is only one service behind the ingress, the traffic is then sent to the web service, which is mapped to the nginx pod. The nginx container inspects the request and sends it to the cache service, which is mapped to the redis pod. The redis pod has never seen this URL before, so nginx passes the request on to the application server, where this page is generated, cached, and returned to your web browser.
Then you click on the 🦄 button!
📝 A list of things I skipped so you could focus on the meat and not get lost in the details: