Assuming we already have an AWS EKS cluster with worker nodes.
In this post, we will connect to a newly created cluster, create a test deployment with an HPA (Kubernetes Horizontal Pod Autoscaler), and try to get information about resource usage using kubectl top.
Kubernetes cluster
Create a test cluster using eksctl:
$ eksctl create cluster --profile arseniy --region us-east-2 --name eks-dev-1
...
[✔] node "ip-192-168-54-141.us-east-2.compute.internal" is ready
[✔] node "ip-192-168-85-24.us-east-2.compute.internal" is ready
[✔] kubectl command should work with "/home/setevoy/.kube/config", try 'kubectl get nodes'
[✔] EKS cluster "eks-dev-1" in "us-east-2" region is ready
And switch to it.
Kubernetes cluster context
Configure your kubectl:
$ aws eks --profile arseniy --region us-east-2 update-kubeconfig --name eks-dev-1
Added new context arn:aws:eks:us-east-2:534***385:cluster/eks-dev-1 to /home/setevoy/.kube/config
Check EKS clusters available in your AWS account in the us-east-2 region:
$ aws eks --profile arseniy --region us-east-2 list-clusters --output text
CLUSTERS eksctl-bttrm-eks-production-1
CLUSTERS mobilebackend-dev-eks-0-cluster
CLUSTERS eks-dev-1
And check the current context in use:
$ kubectl config current-context
arn:aws:eks:us-east-2:534***385:cluster/eks-dev-1
aws eks update-kubeconfig has already configured the newly created cluster as the current context for kubectl.
If needed, you can always list all available contexts using get-contexts:
$ kubectl config get-contexts
And switch to the necessary one:
$ kubectl config use-context arn:aws:eks:us-east-2:534***385:cluster/eks-dev-1
Switched to context "arn:aws:eks:us-east-2:534***385:cluster/eks-dev-1".
Deployment
For testing purposes, let's create a deployment with an autoscaler.
The HPA will use metrics collected by the metrics-server to get data about resource usage on nodes and pods, so it knows when to scale a particular deployment's pods up or down.
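For reference, the scaling algorithm from the Kubernetes documentation computes the desired replica count from the ratio of the current and the target metric values:
desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)
For example, one replica running at 160% of its CPU request with the 80% target gives ceil(1 * 160 / 80) = 2 replicas.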
Create the HPA and deployment:
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: gcr.io/google-samples/node-hello:1.0
        resources:
          limits:
            cpu: "0.1"
          requests:
            cpu: "0.1"
Apply them:
$ kubectl apply -f example-deployment.yml
horizontalpodautoscaler.autoscaling/hello-hpa created
deployment.apps/hello created
Check:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hello-hpa Deployment/hello <unknown>/80% 1 2 1 26s
HPA – TARGETS and unable to get metrics
If you check the HPA right now, you'll see that it is unable to collect data about its targets (nodes and pods):
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hello-hpa Deployment/hello <unknown>/80% 1 2 1 26s
In the Conditions and Events you can find more details about the issue:
$ kubectl describe hpa hello-hpa
...
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource
(get pods.metrics.k8s.io)
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 12s (x3 over 43s) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Warning FailedComputeMetricsReplicas 12s (x3 over 43s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Actually, here is our cause:
failed to get cpu utilization: unable to get metrics for resource cpu
Also, if you try to use top for nodes and pods now, Kubernetes will throw another error:
$ kubectl top node
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
$ kubectl top pod
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
To display resources usage, top tries to connect to the Heapster service, as can be seen in the kubectl source code:
...o.Client = metricsutil.NewHeapsterMetricsClient(clientset.CoreV1(), o.HeapsterOptions.Namespace, o.HeapsterOptions.Scheme, o.HeapsterOptions.Service, o.HeapsterOptions.Port)...
But Heapster is a deprecated service that was used earlier to collect metrics.
Nowadays, the metrics-server service is used for CPU and Memory metrics.
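metrics-server implements the Resource Metrics API (the metrics.k8s.io API group), which both kubectl top and the HPA controller consume. Once it is deployed (see the next section), you can query this API directly as an additional check:
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes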
Running metrics-server
Clone the repository:
$ git clone https://github.com/kubernetes-sigs/metrics-server.git
$ cd metrics-server/
metrics-server configuration for AWS EKS
To make metrics-server able to find all resources in an AWS Elastic Kubernetes Service cluster – edit its deployment file deploy/kubernetes/metrics-server-deployment.yaml
, and add the command
with four arguments:
...
command:
- /metrics-server
- --logtostderr
- --kubelet-insecure-tls=true
- --kubelet-preferred-address-types=InternalIP
- --v=2
...
- kubelet-insecure-tls – do not verify the kubelet clients' CA certificate on the nodes
- kubelet-preferred-address-types – how to reach resources in the Kubernetes space – by Hostname, InternalDNS, InternalIP, ExternalDNS, or ExternalIP; for EKS, set it to the InternalIP value (see the check below)
- v=2 – the logs verbosity level
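To see which InternalIP addresses will be used, check the nodes with a standard kubectl call; the INTERNAL-IP column is what metrics-server will use to reach the kubelets:
$ kubectl get nodes -o wide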
Save and deploy it:
$ kubectl apply -f deploy/kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
Check the system services:
$ kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
aws-node-mt9pq 1/1 Running 0 2m5s
aws-node-rl7t2 1/1 Running 0 2m2s
coredns-74dd858ddc-xmrhj 1/1 Running 0 7m33s
coredns-74dd858ddc-xpcwx 1/1 Running 0 7m33s
kube-proxy-b85rv 1/1 Running 0 2m5s
kube-proxy-n647l 1/1 Running 0 2m2s
metrics-server-546565fdc9-56xwl 1/1 Running 0 6s
Or by using:
$ kubectl get apiservices | grep metr
v1beta1.metrics.k8s.io kube-system/metrics-server True 91s
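If the AVAILABLE column shows False here, kubectl describe can help to find the reason, check the Conditions there:
$ kubectl describe apiservice v1beta1.metrics.k8s.io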
The service's logs:
$ kubectl -n kube-system logs -f metrics-server-546565fdc9-qswck
I0215 11:59:05.896014 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0215 11:59:06.439725 1 manager.go:95] Scraping metrics from 0 sources
I0215 11:59:06.439749 1 manager.go:148] ScrapeMetrics: time: 2.728µs, nodes: 0, pods: 0
I0215 11:59:06.450735 1 secure_serving.go:116] Serving securely on [::]:4443
E0215 11:59:10.096632 1 reststorage.go:160] unable to fetch pod metrics for pod default/hello-7d6c85c755-r88xn: no metrics known for pod
E0215 11:59:25.109059 1 reststorage.go:160] unable to fetch pod metrics for pod default/hello-7d6c85c755-r88xn: no metrics known for pod
Try top for nodes:
$ kubectl top node
error: metrics not available yet
And for pods:
$ kubectl top pod
W0215 13:59:58.319317 4014051 top_pod.go:259] Metrics not available for pod default/hello-7d6c85c755-r88xn, age: 4m51.319306547s
error: Metrics not available for pod default/hello-7d6c85c755-r88xn, age: 4m51.319306547s
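This is expected right after the start: metrics-server has to complete at least one scrape cycle first. By default, it queries the kubelets about once per minute (the interval is controlled by its --metric-resolution option).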
After 1-2 minutes, check the metrics-server logs again:
I0215 12:00:06.439839 1 manager.go:95] Scraping metrics from 2 sources
I0215 12:00:06.447003 1 manager.go:120] Querying source: kubelet_summary:ip-192-168-54-141.us-east-2.compute.internal
I0215 12:00:06.450994 1 manager.go:120] Querying source: kubelet_summary:ip-192-168-85-24.us-east-2.compute.internal
I0215 12:00:06.480781 1 manager.go:148] ScrapeMetrics: time: 40.886465ms, nodes: 2, pods: 8
I0215 12:01:06.439817 1 manager.go:95] Scraping metrics from 2 sources
And try top again:
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
ip-192-168-54-141.us-east-2.compute.internal 25m 1% 406Mi 5%
ip-192-168-85-24.us-east-2.compute.internal 26m 1% 358Mi 4%
Pods:
$ kubectl top pod
NAME CPU(cores) MEMORY(bytes)
hello-7d6c85c755-r88xn 0m 8Mi
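By the way, kubectl top has a couple of useful options here, for example, to display per-container usage or pods from all namespaces:
$ kubectl top pod --containers
$ kubectl top pod --all-namespaces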
And check the HPA:
$ kubectl describe hpa hello-hpa
...
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited True TooFewReplicas the desired replica count is more than the maximum replica count
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 4m23s (x12 over 7m10s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Warning FailedGetResourceMetric 4m8s (x13 over 7m10s) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Pay attention to the following messages here:
- the HPA was able to successfully calculate a replica count from cpu resource utilization – the HPA is now able to collect metrics
- unable to get metrics for resource cpu – check the Age column: its counter must stop growing (it stays at x13 here)
Actually, that's all that is needed to run the metrics-server and use it with an HPA.
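To check that the HPA really scales the deployment, you can generate some CPU load on its pods. A minimal sketch, assuming you first expose the hello deployment with a Service (the node-hello sample image listens on port 8080):
$ kubectl expose deployment hello --port=8080
$ kubectl run load-generator -i --tty --rm --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://hello:8080; done"
Then watch kubectl get hpa: once the CPU utilization goes above the 80% target, the REPLICAS count must grow up to the maxReplicas value.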
Below is one more common error when using an HPA with the metrics-server.
HPA was unable to compute the replica count: missing request for cpu
Sometimes HPA can report about the following errors:
$ kubectl describe hpa hello-hpa
...
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: missing request for cpu
...
Warning FailedGetResourceMetric 2s (x2 over 17s) horizontal-pod-autoscaler missing request for cpu
This may happen if a pod's template in a Deployment has no resources.requests defined.
In the current case, I commented out those lines:
...
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: gcr.io/google-samples/node-hello:1.0
#        resources:
#          limits:
#            cpu: "0.1"
#          requests:
#            cpu: "0.1"
And deployed the pod without requests, which resulted in the “missing request for cpu” error message.
Set the requests back, and it will work again.
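To quickly verify that requests are set on a live Deployment, you can use a standard kubectl call:
$ kubectl get deployment hello -o jsonpath='{.spec.template.spec.containers[0].resources.requests}'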
Done.