
The Kubernetes Workloads

Philippe MARTIN ・ Originally published at leanpub.com ・ 10 min read

Kubernetes Overview Extracts (3 Part Series)

1) The Kubernetes Workloads 2) Kubernetes Authentication 3) Kubernetes Authorization

This article is part of the eBook Kubernetes Overview: Prepare CKA & CKAD certifications (https://leanpub.com/learning-kubernetes).

The Pod is the cornerstone of the Kubernetes cluster architecture.

The fundamental goal of Kubernetes is to help you manage your containers. The Pod is the smallest deployable unit in a Kubernetes cluster, containing one or several containers.

From the kubectl command line, you can run a pod containing a container as simply as running this command:

$ kubectl run --generator=run-pod/v1 nginx --image=nginx
pod/nginx created

By adding --dry-run -o yaml to the command, you can see the YAML template you would have to write to create the same Pod:

$ kubectl run --generator=run-pod/v1 nginx --image=nginx --dry-run -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Or, if you greatly simplify the template by keeping only the required fields:

# simple.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx

You can now start the Pod by using this template:

$ kubectl apply -f simple.yaml
pod/nginx created

The created Pod is ready... if you are not very fussy. Otherwise, the Pod spec offers a long list of fields to make it more production-ready. Here are all these fields.

Pod specs

Here is a classification of the Pod specification fields.

  • Containers fields will define and parameterize more precisely each container of the Pod, whether it is a normal container (containers) or an init container (initContainers). The imagePullSecrets field will help to download container images from private registries.

  • Volumes field (volumes) will define a list of volumes that containers will be able to mount and share.

  • Scheduling fields will help you define the most appropriate Node to deploy the Pod, by selecting nodes by labels (nodeSelector), directly specifying a node name (nodeName), using affinity and tolerations, by selecting a specific scheduler (schedulerName), and by requiring a specific runtime class (runtimeClassName). They will also be used to prioritize a Pod over other Pods (priorityClassName and priority).

  • Lifecycle fields will help define if a Pod should restart after termination (restartPolicy) and fine-tune the periods after which processes running in the containers of a terminating pod are killed (terminationGracePeriodSeconds) or after which a running Pod will be stopped if not yet terminated (activeDeadlineSeconds). They also help define readiness of a pod (readinessGates).

  • Hostname and Name resolution fields will help define the hostname (hostname) and part of the FQDN (subdomain) of the Pod, add hosts in the /etc/hosts files of the containers (hostAliases), fine-tune the /etc/resolv.conf files of the containers (dnsConfig) and define a policy for the DNS configuration (dnsPolicy).

  • Host namespaces fields will help indicate if the Pod must use host namespaces for network (hostNetwork), PIDs (hostPID), IPC (hostIPC) and if containers will share the same (non-host) process namespace (shareProcessNamespace).

  • Service account fields will be useful to give specific rights to a Pod, by assigning it a specific service account (serviceAccountName) or by disabling the automount of the default service account with automountServiceAccountToken.

  • Security Context field (securityContext) helps define various security attributes and common container settings at the pod level.
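To illustrate, here is a sketch of a Pod manifest combining some of these fields; the node label, service account name and volume are invented for the example and would need to exist in your cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  restartPolicy: Always                # Lifecycle: restart containers on termination
  terminationGracePeriodSeconds: 30    # Lifecycle: grace period before processes are killed
  nodeSelector:                        # Scheduling: only nodes carrying this label
    disktype: ssd
  hostname: web                        # Name resolution: hostname inside the Pod
  serviceAccountName: web-sa           # Service account: rights given to the Pod
  securityContext:
    fsGroup: 2000                      # Security context: group owning mounted volumes
  volumes:                             # Volumes: mountable and shareable by containers
  - name: data
    emptyDir: {}
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
```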

Container Specs

An important part of the definition of a Pod is the definition of the containers it will contain.

We can separate container fields into two parts. The first part contains fields related to the container runtime (image, entrypoint, ports, environment variables and volumes); the second part contains fields that will be handled by the Kubernetes system.

The fields related to container runtime are:

  • Image fields define the image of the container (image) and the policy to pull the image (imagePullPolicy).

  • Entrypoint fields define the command (command) and arguments (args) of the entrypoint and its working directory (workingDir).

  • Ports field (ports) defines the list of ports to expose from the container.

  • Environment variables fields help define the environment variables that will be exported in the container, either directly (env) or by referencing ConfigMap or Secret values (envFrom).

  • Volumes fields define the volumes to mount into the container, whether they are a filesystem volume (volumeMounts) or a raw block volume (volumeDevices).

The fields related to Kubernetes are:

  • Resources field (resources) helps define the resource requirements and limits for a container.

  • Lifecycle fields help define handlers on lifecycle events (lifecycle), parameterize the termination message (terminationMessagePath and terminationMessagePolicy), and define probes to check liveness (livenessProbe) and readiness (readinessProbe) of the container.

  • Security Context field helps define various security attributes and common container settings at the container level.

  • Debugging fields are very specialized fields, mostly for debugging purposes (stdin, stdinOnce and tty).
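As an illustration, here is a sketch of a container spec using several of these fields; the resource values, probe paths and variable names are arbitrary example choices:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.13
    imagePullPolicy: IfNotPresent   # Image: pull only if not already present on the node
    command: ["nginx"]              # Entrypoint: override the image entrypoint
    args: ["-g", "daemon off;"]     # Entrypoint: arguments passed to the command
    ports:
    - containerPort: 80             # Ports: port exposed by the container
    env:
    - name: FOO                     # Environment: variable exported in the container
      value: bar
    resources:
      requests:                     # Resources: minimum guaranteed to the container
        cpu: 100m
        memory: 64Mi
      limits:                       # Resources: maximum allowed for the container
        cpu: 500m
        memory: 128Mi
    livenessProbe:                  # Lifecycle: container is restarted if this fails
      httpGet:
        path: /
        port: 80
    readinessProbe:                 # Lifecycle: container receives traffic only when this succeeds
      httpGet:
        path: /
        port: 80
```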

Pod Controllers

The Pod, although the cornerstone of the Kubernetes architecture, is rarely used alone. You will generally use a controller to run a Pod with some specific policies.

The different controllers handling pods are:

  • ReplicaSet: ensures that a specified number of pod replicas are running at any given time.
  • Deployment: enables declarative updates for Pods and ReplicaSets.
  • StatefulSet: manages the deployment and scaling of a set of Pods, with guarantees about their ordering and uniqueness, useful for stateful applications.
  • DaemonSet: ensures that all or some nodes are running a copy of a Pod.
  • Job: starts pods and ensures they complete.
  • CronJob: creates a Job on a time-based schedule.

In Kubernetes, all controllers respect the principle of the reconcile loop: the controller perpetually watches some objects of interest, to detect whether the actual state of the cluster (the objects running in the cluster) satisfies the specs of the objects the controller is responsible for, and adapts the cluster accordingly.

Let's examine more precisely how the widely used ReplicaSet and Deployment controllers work.

ReplicaSet controller

Fields for a ReplicaSet are:

  • replicas indicates how many replicas of selected Pods you want.
  • selector defines the Pods you want the ReplicaSet controller to manage.
  • template is the template used to create new Pods when insufficient replicas are detected by the Controller.
  • minReadySeconds indicates the number of seconds a newly created Pod should run without failing before the controller considers it available.

The ReplicaSet controller perpetually watches the Pods with the labels specified with selector. At any given time, if the number of actual running Pods with these labels:

  • is greater than the requested replicas, some Pods will be terminated to satisfy the number of replicas. Note that the terminated Pods are not necessarily Pods that were created by the ReplicaSet controller,
  • is lower than the requested replicas, new Pods will be created from the specified Pod template to satisfy the number of replicas. Note that to prevent the ReplicaSet controller from creating Pods in a loop, the specified template must create a Pod selectable by the specified selector (this is why you must set the same labels in the selector.matchLabels and template.metadata.labels fields).

Note that:

  • the selector field of a ReplicaSet is immutable,
  • changing the template of a ReplicaSet will not have an immediate effect; it will affect only the Pods created after this change,
  • changing the replicas field will immediately trigger the creation or termination of Pods.
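For example, a minimal ReplicaSet manifest could look like the following sketch; note the same app: nginx labels in selector.matchLabels and template.metadata.labels:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
spec:
  replicas: 3             # number of Pod replicas requested
  selector:
    matchLabels:
      app: nginx          # Pods managed by this ReplicaSet
  template:
    metadata:
      labels:
        app: nginx        # must match selector.matchLabels
    spec:
      containers:
      - name: nginx
        image: nginx
```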

Deployment controller

Fields for a Deployment are:

  • replicas indicates the number of replicas requested.
  • selector defines the Pods you want the Deployment controller to manage.
  • template is the template requested for the Pods.
  • minReadySeconds indicates the number of seconds a newly created Pod should run without failing before the controller considers it available.
  • strategy is the strategy to apply when changing the replicas of the previously and currently active ReplicaSets.
  • revisionHistoryLimit is the number of old ReplicaSets to keep for rollback purposes.
  • paused indicates whether rollouts for this Deployment are paused.
  • progressDeadlineSeconds is the number of seconds after which the Deployment is considered failed if it has made no progress.
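Putting these fields together, a Deployment manifest could look like this sketch; the values are arbitrary examples:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  minReadySeconds: 5          # seconds a Pod must run without failing to be available
  revisionHistoryLimit: 10    # old ReplicaSets kept for rollbacks
  strategy:
    type: RollingUpdate       # strategy used when moving replicas between ReplicaSets
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.10
```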

The Deployment controller perpetually watches the ReplicaSets with the requested selector. Among these:

  • if a ReplicaSet with the requested template exists, the controller will ensure that the number of replicas for this ReplicaSet equals the number of requested replicas (by using the requested strategy) and minReadySeconds equals the requested one.
  • if no ReplicaSet exists with the requested template, the controller will create a new ReplicaSet with the requested replicas, selector, template and minReadySeconds.
  • for ReplicaSets with a non-matching template, the controller will ensure that the number of replicas is set to zero (by using the requested strategy).

Note that:

  • the selector field of a Deployment is immutable,
  • changing the template field of a Deployment will immediately:
    • either trigger the creation of a new ReplicaSet if none exists with the requested selector and template
    • or update an existing one matching the requested selector and template with the requested replicas (using strategy),
    • set to zero the number of replicas of other ReplicaSets,
  • changing the replicas or minReadySeconds field of a Deployment will immediately update the corresponding value of the corresponding ReplicaSet (the one with the requested template).

With this method, the Deployment controller manages a series of ReplicaSets, one for each revision of the Pod template. The active ReplicaSet is the one with a positive number of replicas; the other revisions have their number of replicas set to zero.

This way, you can switch from one revision to another (for a rollback for example) by switching the Pod template from one revision to another.

Update and Rollback

Let's first deploy an image of the nginx server, with the help of a Deployment:

$ kubectl create deployment nginx --image=nginx:1.10
deployment.apps/nginx created

The command kubectl rollout provides several subcommands to work with deployments.

The subcommand status gives us the status of the deployment:

$ kubectl rollout status deployment nginx
deployment "nginx" successfully rolled out

The history subcommand gives us the history of revisions for the deployment. Here the deployment is at its first revision:

$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>

We will now update the image of nginx to use version 1.11. One way is to use the kubectl set image command:

$ kubectl set image deployment nginx nginx=nginx:1.11
deployment.extensions/nginx image updated

We can see with the history subcommand that the deployment is at its second revision:

$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>
2         <none>

The change-cause is empty by default. It can either contain the command used to make the rollout, by using the --record flag:

$ kubectl set image deployment nginx nginx=nginx:1.12 --record
deployment.apps/nginx image updated
$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
3         kubectl set image deployment nginx nginx=nginx:1.12 --record=true

Or by setting the kubernetes.io/change-cause annotation after the rollout:

$ kubectl set image deployment nginx nginx=nginx:1.13
deployment.apps/nginx image updated
$ kubectl annotate deployment nginx \
  kubernetes.io/change-cause="update to revision 1.13" \
  --record=false --overwrite=true
$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
3         kubectl set image deployment nginx nginx=nginx:1.12 --record=true
4         update to revision 1.13

It is also possible to edit the specifications of the deployment:

$ kubectl edit deployment nginx

Your preferred editor opens; you can, for example, add an environment variable FOO=bar to the specification of the container:

[...]
    spec:
      containers:
      - image: nginx:1.13
        env:
        - name: FOO
          value: bar
[...]

After you save the template and quit the editor, the new revision is deployed. Let's verify that the new pod contains this environment variable:

$ kubectl describe pod -l app=nginx
[...]
    Environment:
      FOO:  bar
[...]

Let's set a change-cause for this release and see the history:

$ kubectl annotate deployment nginx \
  kubernetes.io/change-cause="add FOO environment variable" \
  --record=false --overwrite=true
$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
3         kubectl set image deployment nginx nginx=nginx:1.12 --record=true
4         update to revision 1.13
5         add FOO environment variable

Now let's roll back the last rollout with the undo subcommand:

$ kubectl rollout undo deployment nginx
deployment.apps/nginx rolled back
$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
3         kubectl set image deployment nginx nginx=nginx:1.12 --record=true
5         add FOO environment variable
6         update to revision 1.13

We see that we rolled back to the 4th revision (which disappeared from the list and has been renumbered as revision 6).

It is also possible to roll back to a specific revision. For example, to use the nginx:1.12 image again:

$ kubectl rollout undo deployment nginx --to-revision=3
deployment.apps/nginx rolled back
$ kubectl rollout history deployment nginx
deployment.apps/nginx
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
5         add FOO environment variable
6         update to revision 1.13
7         kubectl set image deployment nginx nginx=nginx:1.12 --record=true

Finally, you can verify that one ReplicaSet exists for each revision:

$ kubectl get replicaset
NAME               DESIRED   CURRENT   READY   AGE
nginx-65c8969d67   0         0         0       58m
nginx-68b47b4d58   0         0         0       62m
nginx-7856959c59   1         1         1       62m
nginx-84c7fd7848   0         0         0       62m
nginx-857df58577   0         0         0       62m

Deployment Strategies

You have seen in the Deployment controller section that the controller provides different strategies when changing the number of replicas of the old and new ReplicaSets.

The Recreate Strategy

The simplest strategy is the Recreate strategy: the old ReplicaSet is downsized to zero, and when all the Pods of this ReplicaSet are stopped, the new ReplicaSet is upsized to the number of requested replicas.

Some consequences are:

  • there will be a short downtime, between the moment the old Pods stop and the moment the new Pods start,
  • no additional resources are necessary to run previous and new pods in parallel,
  • old and new versions will not run simultaneously.
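In the Deployment manifest, this strategy is selected with the strategy field:

```yaml
spec:
  strategy:
    type: Recreate
```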

The RollingUpdate Strategy

The RollingUpdate strategy is a more advanced strategy, and the default one when you create a Deployment.

The goal of this strategy is to update from previous to new version without downtime.

This strategy combines the ability to downsize and upsize ReplicaSets with the ability to expose Pods through Services.

You will see in Discovery and Load-balancing that Pods are traditionally accessed through Services. A Service resource declares a list of endpoints, which are the Pods exposed through this Service. Pods are removed from the endpoints of a Service when they are not ready to serve requests and are added back when they become ready.

The readiness of a Pod is determined by the state of the readiness probes declared for its containers. If you do not declare readiness probes for your containers, the risk is that Pods are detected as ready before they really are, and traffic is sent to them while they are still in their startup phase.
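For example, a readiness probe checking an HTTP endpoint could be declared as follows in the containers section; the path and timing values are assumptions to adapt to your application:

```yaml
  containers:
  - name: nginx
    image: nginx
    readinessProbe:
      httpGet:
        path: /              # assumed to respond 2xx only when ready to serve
        port: 80
      initialDelaySeconds: 5 # wait before the first check
      periodSeconds: 10      # interval between checks
```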

During a rolling update, the deployment controller will, on the one hand:

  • upsize the number of replicas of the new version,
  • when a replica is ready, it is added to the service endpoints by the endpoints controller,

and on the other hand:

  • mark replicas of the old version as not ready, so they are removed from the service endpoints by the endpoints controller,
  • stop these replicas.

Depending on traffic and available resources, you may want either to first increase the number of new-version replicas and then stop the old ones, or conversely to first stop old replicas and then start new-version replicas.

For this, the maxSurge and maxUnavailable fields of the Deployment strategy indicate how many replicas can be present, respectively, above or below the expected number of replicas. Depending on these values, the Deployment controller will either start new-version Pods first or, conversely, stop old-version Pods first.
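For example, the following strategy allows one extra replica during the update and never fewer ready replicas than requested, so new Pods are started before old ones are stopped:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one replica above the requested count
      maxUnavailable: 0    # never fewer ready replicas than requested
```

Conversely, maxSurge: 0 with maxUnavailable: 1 would stop an old Pod before starting a new one, requiring no extra resources during the update.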

This article is part of the eBook Kubernetes Overview: Prepare CKA & CKAD certifications. If you learned things, read on with 15 more chapters: https://leanpub.com/learning-kubernetes

