Pradeep Singh

Posted on Oct 29, 2020

Running a self managed kubernetes cluster on AWS

#aws #devops #kubernetes

In this post I want to walk through the steps needed to run a self-managed Kubernetes cluster on AWS. One reason to do this is to run the newest version of Kubernetes before it is available as a managed service.

We are going to use kubeadm to install the Kubernetes components and use AWS integrations for things like load balancing, CIDR ranges, etc. I am going to install Kubernetes version 1.19 on Ubuntu Server 18.04 LTS. Docker will be the container runtime.

The control plane will be set up manually. The worker nodes will be part of an autoscaling group. Autoscaling will be managed by cluster-autoscaler.

Both the control plane and data plane will be deployed in private subnets. There is no reason for any of these instances to be in public subnets.

For convenience, I am going to use the same AMI for both control plane and data plane.

The things I struggled with were using the flag --cloud-provider=aws correctly and tagging instances with kubernetes.io/cluster/kubernetes.

Creating the AMI

Start an instance with base ubuntu 18.04 image. Assign a key pair while launching so that we can ssh into the instance.

Let's install kubeadm, kubelet and kubectl first. Following the directions at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/, with slight changes, we need to run this script

sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Swap needs to be disabled on the instance.
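The article does not show the swap commands, so here is a minimal sketch of one common way to do it. The helper name comment_out_swap is hypothetical. It comments out active swap entries in an fstab-style file so that swap stays off after a reboot. If the base image has no swap configured, both steps are effectively no-ops.

```shell
# Comment out every active swap entry in an fstab-style file.
# A backup copy is kept next to it with a .bak suffix.
comment_out_swap() {
  sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# On the instance itself (requires root) the equivalent steps would be:
#   sudo swapoff -a
#   sudo sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab
```

swapoff -a turns swap off for the running system, while the fstab edit keeps it off for every instance later launched from the AMI.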

Install docker -


sudo apt-get update && sudo apt-get upgrade -y && sudo apt-get install -y apt-transport-https curl jq
sudo apt-get install docker.io -y
sudo systemctl enable docker.service
sudo systemctl start docker.service

The worker nodes need the AWS CLI, so let's install it -

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
sudo apt-get install -y unzip
unzip awscliv2.zip
sudo ./aws/install

The kubelet needs to run with the flag --cloud-provider=aws. We can apply this change at one place in this AMI and it will be carried over to all instances launched with this image.

Edit the file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and change the KUBELET_KUBECONFIG_ARGS line to include the cloud-provider flag at the end -

Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cloud-provider=aws"
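If you would rather script this edit than make it by hand, the following is a rough sketch. The function name add_cloud_provider_flag is hypothetical, and it assumes the KUBELET_KUBECONFIG_ARGS line ends with a closing double quote, as in the line shown above.

```shell
# Append --cloud-provider=aws inside the quoted KUBELET_KUBECONFIG_ARGS value.
# Does nothing if the flag is already present, so it is safe to run twice.
add_cloud_provider_flag() {
  conf="$1"
  if grep -q -- '--cloud-provider=aws' "$conf"; then
    return 0
  fi
  sed -i.bak -E \
    '/^Environment="KUBELET_KUBECONFIG_ARGS=/ s/"$/ --cloud-provider=aws"/' "$conf"
}

# Possible use on the instance (requires root):
#   sudo bash -c "$(declare -f add_cloud_provider_flag); \
#     add_cloud_provider_flag /etc/systemd/system/kubelet.service.d/10-kubeadm.conf"
#   sudo systemctl daemon-reload   # only matters if the kubelet runs before imaging
```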

At this point we can create an AMI from this instance. Let's call it k8s-image.

Setting up control plane

Load Balancer

Because our control plane is going to be highly available, its instances are going to run behind a load balancer. So I created a classic ELB which forwards TCP traffic on port 6443, the default port used by the control plane. I was unable to get the NLB to work in this experiment.

I set up the load balancer to communicate with private subnets in 3 availability zones, where the control plane will be located. Let's say the DNS name of the load balancer is internal-k8s-111.us-east-1.elb.amazonaws.com

These security groups were manually created -

ELB security group

I created a security group for the ELB which allowed ingress on port 6443 from the security group attached to the worker nodes, described below. Let's call it k8s-loadbalancer-sg.

Control plane/Worker node security group

For convenience I used the same security group for both control plane and data plane instances. This security group allowed ingress from the load balancer security group mentioned above on port 6443. It also allowed ingress on all ports if the source was this security group itself. Let's call it k8s-custom-sg. I also enabled ssh on port 22.
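I created these groups by hand in the console. For readers who prefer the AWS CLI, the same rules could be sketched roughly as follows. The function name and its two inputs (a VPC ID and a CIDR range allowed to ssh) are assumptions for illustration; the group names are the ones used in this article.

```shell
# Sketch: create both security groups and the ingress rules described above.
# $1 = VPC ID, $2 = CIDR range allowed to ssh (both hypothetical inputs).
create_k8s_security_groups() {
  vpc_id="$1"
  ssh_cidr="$2"

  elb_sg=$(aws ec2 create-security-group --group-name k8s-loadbalancer-sg \
    --description "k8s API load balancer" --vpc-id "$vpc_id" \
    --query GroupId --output text)
  node_sg=$(aws ec2 create-security-group --group-name k8s-custom-sg \
    --description "k8s control plane and worker nodes" --vpc-id "$vpc_id" \
    --query GroupId --output text)

  # The ELB accepts API traffic (6443) from the nodes.
  aws ec2 authorize-security-group-ingress --group-id "$elb_sg" \
    --protocol tcp --port 6443 --source-group "$node_sg" > /dev/null
  # The nodes accept API traffic (6443) from the ELB.
  aws ec2 authorize-security-group-ingress --group-id "$node_sg" \
    --protocol tcp --port 6443 --source-group "$elb_sg" > /dev/null
  # The nodes accept all traffic from members of their own group.
  aws ec2 authorize-security-group-ingress --group-id "$node_sg" \
    --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=$node_sg}]" > /dev/null
  # ssh from a restricted range.
  aws ec2 authorize-security-group-ingress --group-id "$node_sg" \
    --protocol tcp --port 22 --cidr "$ssh_cidr" > /dev/null

  echo "elb_sg=$elb_sg node_sg=$node_sg"
}
```

Calling create_k8s_security_groups vpc-0123 10.0.0.0/16 would print the two group IDs, which are needed when launching instances and creating the load balancer.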

IAM role

Let's create an IAM role which will be attached to our instances. I am going to call it CustomK8sRole. The same IAM role will be attached to both control plane and data plane instances in this experiment. In the real world you would grant an IAM role only the minimum necessary privileges.

You will need to update this role to provide access to all the AWS services we are going to mention in this article.

Creating the first machine

Launch an instance using our custom AMI k8s-image and IAM role CustomK8sRole.

Tag the node so that it can be discovered by worker nodes, e.g., with key k8s-control-plane. Tag key is sufficient, value will not be used.

Add another tag kubernetes.io/cluster/kubernetes with the value of owned. This seems to be necessary if we are using a cloud provider. Here kubernetes is the cluster name which is the default name used by kubeadm. If you have chosen a different name for the cluster, change the tag key accordingly.
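Both tags can also be applied from the command line. The following is a small sketch using aws ec2 create-tags. The function name is hypothetical, and the cluster name defaults to kubernetes, matching the kubeadm default mentioned above.

```shell
# Tag an instance as a control plane node of the given cluster.
# $1 = instance ID, $2 = cluster name (optional, defaults to "kubernetes").
tag_control_plane_instance() {
  instance_id="$1"
  cluster_name="${2:-kubernetes}"
  aws ec2 create-tags --resources "$instance_id" --tags \
    "Key=k8s-control-plane,Value=" \
    "Key=kubernetes.io/cluster/${cluster_name},Value=owned"
}

# Example: tag_control_plane_instance i-0123456789abcdef0
```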

Change the hostname of the instance to its EC2 private DNS name, which the AWS cloud provider uses to match the node to its instance -

sudo hostnamectl set-hostname \
$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)

This is being done manually for control plane. It will be automated for worker nodes.

For the API server and controller manager to integrate with the AWS cloud provider, we need to start them with the flag --cloud-provider=aws. I couldn't find a way to tell kubeadm to do this using command line arguments, so we are going to run kubeadm with a config file, config.yaml. If we use a config file then all arguments need to go in the file, including the API server (control plane) endpoint. The control plane endpoint is the DNS name of the ELB we created above.

Create the file config.yaml -

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: aws
controllerManager:
  extraArgs:
    cloud-provider: aws
    configure-cloud-routes: "false"
controlPlaneEndpoint: internal-k8s-111.us-east-1.elb.amazonaws.com:6443

The kubelet also needs to run with this flag but that configuration is baked into this custom AMI because of the change we have made to the file 10-kubeadm.conf, so we don't need a section to configure the kubelet.

We do not need to provide CIDR ranges for pod IPs and service IPs in this configuration file because we will use the AWS VPC CNI plugin.

Attach this instance to the ELB. It will initially show as OutOfService. That's OK because there is nothing running on the instance yet. But it needs to be attached now because this is the control plane endpoint the kubelet will try to access. After the API server starts running on the instance, it will show as InService.
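Attaching the instance and checking its health can also be done with the classic ELB commands of the AWS CLI. This is only a sketch. The function names are hypothetical, and the load balancer name (not its DNS name) is whatever you chose when creating the ELB.

```shell
# Register an instance with a classic ELB.
# $1 = load balancer name, $2 = instance ID.
attach_to_elb() {
  aws elb register-instances-with-load-balancer \
    --load-balancer-name "$1" --instances "$2" > /dev/null
}

# Print the health state of one instance: InService or OutOfService.
elb_instance_state() {
  aws elb describe-instance-health \
    --load-balancer-name "$1" --instances "$2" \
    --query 'InstanceStates[0].State' --output text
}

# Example:
#   attach_to_elb my-k8s-elb i-0123456789abcdef0
#   elb_instance_state my-k8s-elb i-0123456789abcdef0
```

Re-running elb_instance_state after kubeadm init finishes is a quick way to confirm that the load balancer can reach the API server.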

Start kubeadm -

sudo kubeadm init --config config.yaml --upload-certs

This should be it. The only error I came across in this step was a security group misconfiguration where the ELB was not able to communicate with the kubelet.

It should print a message about how you can add additional control plane and data plane nodes to the cluster.

If there are errors, debug them using journalctl -a -xeu kubelet -f.

Set up the .kube/config file as indicated in the message and you should see one master node running -

kubectl get nodes
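For reference, the kubeconfig step mentioned above usually amounts to copying /etc/kubernetes/admin.conf into ~/.kube/config and making it owned by your user. Here is a sketch wrapped in a hypothetical helper, install_kubeconfig, whose two optional arguments exist only to make it easy to try out on other paths.

```shell
# Copy the cluster admin kubeconfig to the current user's kubectl config.
# $1 = source file (default /etc/kubernetes/admin.conf)
# $2 = destination (default ~/.kube/config)
install_kubeconfig() {
  src="${1:-/etc/kubernetes/admin.conf}"
  dest="${2:-$HOME/.kube/config}"
  mkdir -p "$(dirname "$dest")"
  if [ -r "$src" ]; then
    cp "$src" "$dest"
  else
    # admin.conf is normally readable by root only.
    sudo cp "$src" "$dest"
    sudo chown "$(id -u):$(id -g)" "$dest"
  fi
}

# Example, on the first control plane node: install_kubeconfig
```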

CNI

Nothing is going to work until you install a CNI plugin. Following AWS's documentation about AWS CNI plugin, install it substituting correct value for <region-code> -

curl -o aws-k8s-cni.yaml https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.7.5/config/v1.7/aws-k8s-cni.yaml
sed -i -e 's/us-west-2/<region-code>/' aws-k8s-cni.yaml
kubectl apply -f aws-k8s-cni.yaml

Adding control plane nodes

Start up as many machines as you would like using the custom image and custom IAM role.

Add the tags kubernetes.io/cluster/kubernetes (with the value owned) and k8s-control-plane (with an empty value), as done for the first node.

Change the host name before doing anything and add the machine to the ELB.

Then run the control plane join command that kubeadm init printed on the first node -

sudo hostnamectl set-hostname \
$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)
sudo kubeadm join internal-k8s-111.us-east-1.elb.amazonaws.com:6443 --token 111.111 \
    --discovery-token-ca-cert-hash sha256:111 \
    --control-plane --certificate-key 111

Worker nodes

The command for a worker node to join the cluster was printed when we set up our first control plane node. It looks something like this -

kubeadm join internal-k8s-111.us-east-1.elb.amazonaws.com:6443 --token 111.111 \
    --discovery-token-ca-cert-hash sha256:111

So we need a token generated on a control plane node, the control plane endpoint, and the certificate hash. We have these values available right now because we just set up our control plane. But how do we handle worker nodes being added to the cluster by autoscaling a month down the road? Bootstrap tokens expire (after 24 hours by default), so it is not a good idea to hardcode these values.

EC2 instances allow us to provide a script as UserData which executes immediately after the instance launches. We can use UserData to automate this process.

Our script will make use of AWS CLI which is conveniently baked into our image.

We will also make use of AWS Systems Manager Run Command to execute commands remotely on a control plane node. I benefited greatly from the provided examples.

Our instances will need to be part of an autoscaling group (ASG). The ASG defines how many instances we currently want running, the minimum number of instances that should always be running, and the maximum number of instances it will allow. The ASG also declares the subnets where the instances can be launched.

The ASG needs a Launch Configuration which defines the AMI that will be used to launch the instances. We will use our custom AMI.

We can also declare resource tags which will be applied to instances launched using this Launch Configuration. The resource tag that needs to be configured here is kubernetes.io/cluster/kubernetes with the value of owned.
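As a rough sketch, the Launch Configuration and ASG could be created from the CLI as shown here. Everything not named in this article is an assumption: the function name, the launch configuration name k8s-worker-lc, the instance type, the sizes, and the instance profile being named the same as the role. Note that AWS now steers users toward launch templates, so treat this as an illustration of the settings rather than a recommendation.

```shell
# Sketch: create a launch configuration and an ASG for worker nodes.
# $1 = AMI ID of k8s-image      $2 = ID of k8s-custom-sg
# $3 = comma-separated private subnet IDs
# $4 = path to the UserData script
create_worker_asg() {
  ami_id="$1"
  node_sg="$2"
  subnets="$3"
  userdata_file="$4"

  aws autoscaling create-launch-configuration \
    --launch-configuration-name k8s-worker-lc \
    --image-id "$ami_id" \
    --instance-type t3.medium \
    --iam-instance-profile CustomK8sRole \
    --security-groups "$node_sg" \
    --user-data "file://$userdata_file"

  # PropagateAtLaunch copies the cluster tag onto every launched instance.
  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name k8s-worker-asg-1 \
    --launch-configuration-name k8s-worker-lc \
    --min-size 1 --max-size 5 --desired-capacity 2 \
    --vpc-zone-identifier "$subnets" \
    --tags "Key=kubernetes.io/cluster/kubernetes,Value=owned,PropagateAtLaunch=true"
}
```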

Our script will query for EC2 instances which have been tagged with k8s-control-plane, which all of our control plane nodes are. Then it will execute a command remotely on the first node from this list using aws ssm send-command to generate a new token and generate the command to join the cluster as a worker node. It will then execute this command and finally execute another remote command to delete the token which was generated.

Here's the UserData which we can supply to our launch configuration. The SSM agent is already installed on the control plane nodes because we used an AWS-managed Ubuntu image. Our CustomK8sRole will need policies that allow it to execute these commands.

#!/bin/bash
# Set the hostname to the EC2 private DNS name before joining the cluster.
sudo hostnamectl set-hostname \
$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)

# Find running instances tagged as control plane nodes.
instances=$(aws ec2 describe-instances \
    --filters "Name=tag-key,Values=k8s-control-plane" "Name=instance-state-name,Values=running" \
    | jq -r ".Reservations[].Instances[].InstanceId")
echo "control plane instances- $instances"
# $instances is left unquoted on purpose so newlines collapse to spaces for cut.
instance=$(echo $instances | cut -d ' ' -f 1)
echo "working with instance- $instance. Generating token."

# Step 1: generate a token value on the control plane node.
sh_command_id=$(aws ssm send-command \
    --instance-ids "${instance}" \
    --document-name "AWS-RunShellScript" \
    --comment "Generate kubernetes token" \
    --parameters commands="kubeadm token generate" \
    --output text \
    --query "Command.CommandId")
sleep 5
echo "Receiving token"
result=$(aws ssm list-command-invocations --command-id "$sh_command_id" --details | jq -j ".CommandInvocations[0].CommandPlugins[0].Output")
token=$(echo $result | cut -d ' ' -f 1)

# Step 2: register the token and get the join command for a worker node.
echo "generating join command"
sh_command_id=$(aws ssm send-command \
    --instance-ids "${instance}" \
    --document-name "AWS-RunShellScript" \
    --comment "Generate kubeadm command to join worker node to cluster" \
    --parameters commands="kubeadm token create $token --print-join-command" \
    --output text \
    --query "Command.CommandId")
sleep 10
echo "getting result"
result=$(aws ssm list-command-invocations --command-id "$sh_command_id" --details | jq -j ".CommandInvocations[0].CommandPlugins[0].Output")
# Drop anything from the first "---" onward, keeping only the join command.
join_command=$(echo ${result%%---*})

# Step 3: join the cluster. Word splitting of $join_command is intended.
echo "executing join command"
$join_command

# Step 4: delete the token so it cannot be reused.
echo "deleting kubernetes token"
sh_command_id=$(aws ssm send-command \
    --instance-ids "${instance}" \
    --document-name "AWS-RunShellScript" \
    --comment "Delete kubernetes token" \
    --parameters commands="kubeadm token delete $token" \
    --output text \
    --query "Command.CommandId")
sleep 5
result=$(aws ssm list-command-invocations --command-id "$sh_command_id" --details | jq -j ".CommandInvocations[0].CommandPlugins[0].Output")
echo "$result"
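The fixed sleep calls in the script above can turn out too short on a busy node or needlessly long otherwise. One possible alternative, sketched here and not what I ran in this experiment, is to let the CLI wait for the command to finish and then read its output. The helper name run_remote is hypothetical. Be aware that the waiter can fail if it polls before SSM has registered the invocation, so a retry around it may be needed.

```shell
# Run one shell command on an instance through SSM and print its stdout.
# $1 = instance ID, $2 = command line to run remotely.
run_remote() {
  instance_id="$1"
  remote_cmd="$2"
  cmd_id=$(aws ssm send-command \
    --instance-ids "$instance_id" \
    --document-name "AWS-RunShellScript" \
    --parameters "commands=$remote_cmd" \
    --query "Command.CommandId" --output text)
  # Block until the invocation finishes instead of guessing with sleep.
  aws ssm wait command-executed \
    --command-id "$cmd_id" --instance-id "$instance_id"
  aws ssm get-command-invocation \
    --command-id "$cmd_id" --instance-id "$instance_id" \
    --query "StandardOutputContent" --output text
}

# Possible use inside UserData:
#   join_command=$(run_remote "$instance" \
#     "kubeadm token create --ttl 10m --print-join-command")
#   $join_command
```

In this variant kubeadm token create generates the token itself and --ttl lets it expire shortly afterwards, which would make the separate generate and delete steps unnecessary.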

Cluster Autoscaler

Download the YAML manifest -

curl -O https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-one-asg.yaml

We are using the one-asg manifest in this case. But if your setup uses persistent volumes then you will have to use multi-asg manifest by configuring one ASG per availability zone.

Replace k8s-worker-asg-1 in the file with the name of your ASG and edit the section for certificates like so -

        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

The location mentioned in the file /etc/ssl/certs/ca-bundle.crt is incorrect for our setup.
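Both edits to the manifest can be scripted. This is a small sketch with a hypothetical function name. It assumes the ASG name contains no characters that are special to sed, such as a slash.

```shell
# Point the downloaded manifest at our ASG and at Ubuntu's CA bundle path.
# $1 = manifest file, $2 = name of the worker ASG.
patch_autoscaler_manifest() {
  manifest="$1"
  asg_name="$2"
  sed -i.bak \
    -e "s/k8s-worker-asg-1/${asg_name}/g" \
    -e 's#/etc/ssl/certs/ca-bundle.crt#/etc/ssl/certs/ca-certificates.crt#' \
    "$manifest"
}

# Example: patch_autoscaler_manifest cluster-autoscaler-one-asg.yaml my-worker-asg
```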

Apply the manifest -

kubectl apply -f cluster-autoscaler-one-asg.yaml

Calico

To be able to enforce network policies, Calico is one option -

kubectl apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.7.5/config/v1.7/calico.yaml

AWS Load Balancer Controller

The AWS ALB Ingress Controller is now known as AWS Load Balancer Controller. It can create Application Load Balancers for the services that you want to expose to the internet.

I was able to deploy the echoserver. The public subnets had to be tagged with kubernetes.io/role/elb and private subnets with kubernetes.io/role/internal-elb.
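The subnet tags can be added with aws ec2 create-tags. In this sketch the function name is hypothetical, and the tag value 1 is an assumption based on AWS's documentation for subnet discovery, since only the tag keys are given above.

```shell
# Tag one or more subnets with a load balancer role tag.
# $1 = tag key, remaining arguments = subnet IDs.
tag_subnets_for_load_balancers() {
  role_tag="$1"
  shift
  aws ec2 create-tags --resources "$@" --tags "Key=${role_tag},Value=1"
}

# Examples (subnet IDs are placeholders):
#   tag_subnets_for_load_balancers kubernetes.io/role/elb subnet-aaa subnet-bbb
#   tag_subnets_for_load_balancers kubernetes.io/role/internal-elb subnet-ccc
```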

Top comments (1)

GPUONCLOUD • Mar 24 '22

Hi Pradeep,
Thanks for putting this together.
I have followed these steps; however, the cluster-autoscaler has been restarting frequently. There is not much info in the pod logs. I suspect a minor glitch in the way it's configured at my end, but I am unable to figure it out.

FYI - For testing purposes I have implemented it on AWS with self-managed Kubernetes, with a single master and 3 worker nodes on EC2. For testing I have also skipped the ALB.

Any inputs would be of great help.
thanks.