Understanding Kubernetes Requests and Limits: A Simple Guide

Kubernetes is a powerful container orchestration platform, widely used for deploying and managing applications. One of the key features that makes Kubernetes flexible and efficient is the ability to set resource requests and limits for containers. But what do these terms mean, and why are they important?

In this blog post, we'll break down what requests and limits are, why they matter, and how you can use them to optimize your Kubernetes deployments.

What are Kubernetes Requests and Limits?

When you deploy a containerized application in Kubernetes, each container needs access to system resources like CPU and memory (RAM). Kubernetes lets you define the amount of CPU and memory a container can use. This is done through requests and limits.

  • Requests: This is the amount of resources that Kubernetes will guarantee for a container. When you set a request, you are telling Kubernetes, "I need at least this amount of CPU or memory to run the container." The scheduler uses the request values to decide which node in the cluster should run the container.

  • Limits: This is the maximum amount of resources that the container can use. If the container tries to use more than its limit, Kubernetes intervenes: CPU usage above the limit is throttled, while a container that exceeds its memory limit is terminated (killed) and restarted according to the Pod's restart policy.

Why Do Requests and Limits Matter?

Setting requests and limits correctly helps ensure that your applications run smoothly and efficiently. Here are some of the key reasons why they matter:

  1. Resource Efficiency: By setting both requests and limits, you give the scheduler accurate information about what each container needs and cap what it can consume, which helps optimize cluster resource usage and prevent bottlenecks.

  2. Fair Resource Distribution: If you don’t set resource limits, one container could consume all available CPU or memory on a node, starving other containers of the resources they need. With limits, Kubernetes ensures that no container can monopolize the node’s resources (a LimitRange, sketched after this list, can enforce defaults for containers that don’t declare their own).

  3. Preventing Resource Exhaustion: Without limits, a single container can end up consuming far more than you intended, causing other workloads to suffer. With proper limits in place, Kubernetes caps each container’s consumption and protects the rest of the node.

  4. Avoiding Unexpected Container Failures: Setting memory limits is especially important: if a container exceeds its memory limit, Kubernetes kills it rather than letting it grow unbounded. This prevents a runaway container from eventually exhausting the node’s memory and destabilizing everything else running on it.
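Related to point 2 above: if you want containers that forget to declare limits to stay bounded anyway, Kubernetes offers a LimitRange object that applies default requests and limits to every container in a namespace. The object name and values below are purely illustrative assumptions, not recommendations; a minimal sketch looks like this:

apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
spec:
  limits:
  - type: Container
    defaultRequest:      # applied when a container omits requests
      cpu: "250m"
      memory: "128Mi"
    default:             # applied when a container omits limits
      cpu: "500m"
      memory: "256Mi"

Containers created in that namespace without their own resources block inherit these values, so a forgotten limit no longer means an unbounded container.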

How to Set Requests and Limits

To define requests and limits for CPU and memory, you specify them in your Pod's configuration YAML file. Here’s an example:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: nginx
    resources:
      requests:
        memory: "64Mi"  # 64 MiB of memory guaranteed
        cpu: "250m"     # 250 milliCPU (0.25 of a CPU core) guaranteed
      limits:
        memory: "128Mi" # 128 MiB of memory maximum
        cpu: "500m"     # 500 milliCPU (0.5 of a CPU core) maximum
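To try this manifest, save it to a file (the name example-pod.yaml here is just an assumption) and apply it with kubectl apply -f example-pod.yaml. Running kubectl describe pod example-pod afterwards shows the Requests and Limits that Kubernetes recorded for example-container, so you can confirm the values were picked up.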

Key Points:

  • Requests are defined under requests, and they specify the minimum amount of resources the container needs.
  • Limits are defined under limits, and they specify the maximum amount of resources the container can use.

Understanding CPU and Memory Units

  • CPU is measured in cores, and fractional amounts are usually written in millicores (m). For example:

    • 1 CPU is represented as 1000m or 1, which means 1 full CPU core.
    • 500m means half of a CPU core.
    • 250m means a quarter of a CPU core.
  • Memory is measured in units like:

    • Mi (mebibytes), which is 1024 * 1024 bytes. Note the binary unit: 1Mi is slightly larger than 1M (a megabyte, 1,000,000 bytes).
    • Gi (gibibytes), which is 1024 MiB or approximately 1.07 GB.
    • For example: 64Mi means 64 mebibytes of memory.
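To tie the notation together, here is a resources block with arbitrary example values, chosen only to illustrate the units (not as a recommendation), in the same shape as the container spec shown earlier:

resources:
  requests:
    cpu: "250m"      # a quarter of a core; "0.25" is an equivalent spelling
    memory: "512Mi"  # 512 * 1024 * 1024 bytes
  limits:
    cpu: "1"         # one full core, the same as "1000m"
    memory: "1Gi"    # 1024Mi, roughly 1.07 GB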

What Happens When a Container Exceeds Its Limits?

  • CPU: If the container uses more CPU than its limit, Kubernetes will throttle the container’s CPU usage. This means that it will not be allowed to exceed the allocated CPU time, which could slow down the container but won’t kill it.

  • Memory: If the container exceeds its memory limit, Kubernetes terminates it (an OOMKilled event) and restarts it according to the Pod's restart policy. Excessive memory usage is often a sign of a memory leak or a misbehaving process, and killing the container protects the stability of the node.
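If you suspect a container was killed for exceeding its memory limit, kubectl describe pod <pod-name> typically shows the previous container state as Terminated with reason OOMKilled, and the Pod's restart count climbs each time it happens. That is usually the cue to raise the memory limit or investigate a leak.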

Best Practices for Requests and Limits

To make the most of Kubernetes' resource management features, here are some best practices to follow:

  1. Set Requests and Limits for Every Container: It's a good practice to always set both requests and limits for your containers. Without them, Kubernetes may not be able to schedule your pods efficiently or might overcommit the resources, causing instability.

  2. Use Realistic Values: Don’t set requests and limits too high or too low. If you set them too low, your container may not have enough resources to run properly. If you set them too high, you might waste cluster resources, leaving less for other workloads.

  3. Monitor and Adjust: Kubernetes doesn’t provide a one-size-fits-all solution. It's important to monitor the performance of your pods and adjust the resource values accordingly. Over time, you’ll get a better sense of the resources your application really needs.

  4. Use Resource Requests to Control Pod Scheduling: The scheduler only places a Pod on a node whose unreserved capacity covers the Pod's requests, so accurate requests ensure your application lands on nodes that can actually handle its load.

  5. Consider Vertical Pod Autoscaling (VPA): If you are unsure about the right values for requests and limits, you can use Kubernetes Vertical Pod Autoscaler (VPA), which adjusts resource requests and limits for your containers based on historical usage.
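If you want to experiment with point 5, note that the Vertical Pod Autoscaler is installed separately from core Kubernetes (it ships its own custom resources and controllers). Assuming it is installed, a recommendation-only configuration looks roughly like this; the Deployment name below is an assumption:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment   # assumed name of the workload to analyze
  updatePolicy:
    updateMode: "Off"          # publish recommendations only; do not evict pods

With updateMode set to "Off", the autoscaler only records recommended requests (visible with kubectl describe vpa example-vpa), which you can review and apply to your manifests manually.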

Conclusion

Kubernetes requests and limits are powerful tools to ensure that your containerized applications are both resource-efficient and resilient. By setting both minimum (requests) and maximum (limits) resource values, you can ensure that your containers get the resources they need while preventing them from consuming too much and affecting other workloads in your cluster.

Remember to always monitor your containers’ resource usage and adjust these values as needed to keep your Kubernetes cluster running smoothly.
