Olufisayo Bamidele

Kubernetes Services; Expose your app to the Internet

In the previous articles in this series, we talked about the main pillars of Kubernetes: clusters, the control plane, nodes, workloads, pods, containers, and services. We dockerized a simple Node.js echo server and ran it on our machine inside a single-node Kubernetes cluster. In this article, we run that same workload in a Kubernetes cluster in the cloud, but before that, let's talk about a topic I only touched on briefly in the last article: Services.

Kubernetes Services

While writing the last articles, I realized that the word "service" is overloaded in tech; it could mean a web backend application, a daemon running in an OS, a cloud offering, or even a module in a codebase. A "Service" is also a thing in Kubernetes, and it differs from all the meanings above. For clarity, we will be very specific when discussing "Services" in Kubernetes: we will call them k8s-Services.

What is a K8S-Service?

A k8s-service is an abstraction that controls communication with a target group of Pods within a cluster and the exposure of those Pods outside the cluster. Put more simply, a k8s-service abstracts the networking of related Pods.

Why Do We Need K8s-Services?

In the last part of this series, we mentioned that each Pod gets assigned a unique IP address that we can use to communicate with it inside the cluster. So why can't we just talk to Pods directly using their IP addresses? We can, but we shouldn't. Pods, by nature, are short-lived; they come in and out of existence depending on many factors, such as:

  1. The Pod's health: If it maxes out its allocated memory and CPU, it will stop receiving traffic. Once Kubernetes detects this, it kills that Pod and replaces it with a new one.

  2. The initial configuration of the Deployment that started the Pod: Pods are usually started as members of a Deployment. A Deployment is a Kubernetes configuration that describes the desired state of a Pod, including how many replicas of it we would like to have. It is also possible to specify how the number of Pods should change when a specific event occurs. For example, one can add a rule like this: "If total CPU usage goes above 60%, create a new replica of this Pod to handle the extra requests" (see the sketch below). With a rule like that in place, replicas of a Pod are expected to come in and out of existence depending on how intensively each Pod is being used.
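
In practice, a rule like the one above is usually expressed with a HorizontalPodAutoscaler that targets the Deployment, rather than inside the Deployment itself. Below is a minimal sketch of what that could look like; the name node-echo-hpa and the replica limits are made up for illustration.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: node-echo-hpa # hypothetical name, for illustration only
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: node-echo-deployment # the Deployment we created in the previous article
  minReplicas: 1
  maxReplicas: 5 # illustrative upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60 # add replicas once average CPU usage crosses 60%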

Each new Pod replica gets its own unique IP address, so relying on Pod IP addresses is unreliable; every replacement Pod shows up with a different one. See the image below.

Problem with using Pod IPs for communication

K8s-services solve the problem described above. To communicate with a group of Pods without manually keeping track of the Pods' IP addresses, we create a k8s-service that sits between a request and the destination Pods. Like Pods, k8s-services get assigned unique IP addresses on creation, but unlike Pods, k8s-services are not ephemeral because nothing capable of crashing is running inside them. They are there until deleted. They are just a data structure translated into network routing configuration on the operating system.

illustration of k8s-service

With the setup in the image above, clients of the Pod replicas do not need to remember the IP address of each Pod. That responsibility is handed to the k8s-service; clients only need to remember the k8s-service's IP address.

Geek Bit ℹī¸: The image above shows that k8s-services keep track of destination pod IP addresses in a table. While most of the diagram oversimplifies a Kubernetes cluster, this part is literal. A component of Kubernetes called Kubeproxy is responsible for translating the specification of your k8s-service to a network configuration. The configuration is usually implemented on your OS as a NAT iptable or as ipvs. Most cloud services providers like AWS and Google run kubeproxy in NAT IPTable mode. If you're running Kubernetes on Linux, you can view the translation of your k8s-service configuration as NAT table by running sudo iptables -t nat -L KUBE-SERVICES -n | column -t. As a Kubernetes user. Of course, you are usually not concerned about this implementation detail unless you're an administrator.

Types Of k8s-Service

ClusterIP K8S-Service

Whenever you create a k8s-service without specifying a type, ClusterIP is the type Kubernetes creates. When you create a ClusterIP k8s-service, Kubernetes assigns it a private static IP address, and the service routes requests to matching Pods. ClusterIP services only allow communication within the cluster; machines outside the cluster cannot reach Pods through a ClusterIP k8s-service unless you expose it through something called an Ingress. We will cover Ingresses in another part of this series. See the illustration below.

ClusterIP Illustration

Defining a ClusterIP in Kubernetes

This example edits the k8s-service example in the last article.
file_name: node-echo-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80 
      targetPort: 5001

selector: app: node-echo instructs Kubernetes to put this ClusterIP k8s-service in front of any Pod with the label app=node-echo. port: 80 is the port the service binds to. targetPort: 5001 is the port our container is listening on; that is where the k8s-service will forward traffic to.

To create the service, run kubectl apply -f node-echo-service.yaml. If the configuration does not contain a syntax error, you should get an output on your terminal that says service/node-echo-service created.

To confirm the creation of our service, type

kubectl get services

You should see the following output.

kubectl get services nodeip

Geek Bit ℹī¸: Pay attention attention, especially to the external IP column. Notice that the value is none. Also, notice that the ClusterIP column is a class A private IP address. Networking 101: Private IPs are only used in LAN. The Kubernetes Cluster's network is the LAN in this scenario. This ClusterIP k8s-service cannot receive internet traffic by default. The other two kinds of k8s services are built on top of ClusterIPs.

NodePort K8S-Service

NodePort k8s-services build on top of ClusterIP. In addition to getting a static private IP address, a NodePort k8s-service receives traffic from outside the Kubernetes cluster by opening up a port on every Node. Traffic through these open ports can hit any Node, and the service forwards the requests to the available matching Pods. When configuring a NodePort k8s-service, we deal with three ports:

targetPort: The destination port on the matching Pods; usually the port your container is listening on
port: The port that the k8s-service binds to
nodePort: The port on each Node that accepts external traffic

See the illustration below.

NodePort k8s-service

💡 I like to describe a NodePort k8s-service as a k8s-service whose public IP address is the address of every Node in the Kubernetes cluster.

Although NodePort k8s-services allow clients from outside the Cluster to communicate with our Pods, they are not production-ready for the following reasons:

  1. NodePort k8s-services only allow traffic on ports 30000 to 32767 by default. Those are non-standard ports in a production environment. Browsers and HTTP clients use port 80 by default for HTTP and port 443 for HTTPS; any other port would require users to specify it explicitly. Imagine having to remember the port of every website you visit.
  2. A NodePort k8s-service receives internet traffic through all the Nodes in the cluster, which is problematic in production because clients must keep track of all those Node IP addresses. At the very least, you need a single static public IP address associated with the k8s-service for your workload to be production-ready. You can achieve this by creating an Ingress (don't think about this for now) or by using the next k8s-service type, LoadBalancer.

Defining a NodePort K8s-service

To define a NodePort k8s-service, we need to add two new properties to the configuration in the ClusterIP section.

  1. Under the spec property, add the property "type" whose value is NodePort, i.e., type: NodePort
  2. Under the port object, add the property "nodePort", whose value is any port you choose, i.e., nodePort: 30000

file_name: node-echo-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: NodePort # telling k8s that we are talking about NodePort
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80 
      targetPort: 5001
      nodePort: 30000 # the port on each Node that receives traffic from outside the cluster

Note: If you skip the nodePort property, Kubernetes will automatically choose one for you from the 30000–32767 range.

Submit the new configuration to Kubernetes by running kubectl apply -f node-echo-service.yaml. If your configuration contains no syntax error, you should get an output that says service/node-echo-service configured.

To see the result, run kubectl get services node-echo-service -o wide. Your result should look similar to the screenshot below.

Kubectl get services node-echo-service -o wide

Pay attention to the TYPE and PORT(S) columns. The TYPE column now says "NodePort", and the PORT(S) column maps the k8s-service's port 80 to port 30000 on our machine.

We can now communicate with our workload by running curl -d "amazing" 127.0.0.1:30000

GeekBit ℹī¸: NodePort is not useless in production; it's just not unsuitable for most web applications. Assuming I run a compute-intensive workload(say, image processing) in Kubernetes, I have dedicated an entire Cluster to this workload. I want to balance incoming tasks across Nodes so that every Node in the cluster always has the same number of tasks running inside them. I'd go for a NodePort k8s-service and set the externalTrafficPolicy to Local, ensuring that traffic to a Node only fulfills a request inside that Node. Finally, I'd put a network load balancer in front of the k8s-service. Of course, don't worry about it if you don't understand everything. Keep following this series, and it'll eventually make sense.

LoadBalancer K8s-service

With the LoadBalancer k8s-service type, Kubernetes assigns the service a static public IP address, which is what we want in production for web servers 🤗. The IP address is then announced across the underlying network infrastructure.
See the illustration below.

LoadBalancer Service Type

⚠ī¸ Note that Kubernetes doesn't come by default with a network Loadbalancer, so one would usually have to install a plugin such as metallb for load balancing but you don't need to worry about this since your cloud provider would have made this available to your Cluster by default

There are two other types of Kubernetes services which I am intentionally skipping in this part. As we dive deeper into Kubernetes networking in the future, I will talk about these in more detail.

Defining a LoadBalancer K8s-service

To define a k8s-service of type LoadBalancer, take the yaml config file from the NodePort section and

  1. Change the type from NodePort to LoadBalancer, i.e. type: LoadBalancer
  2. Remove the nodePort property. The resulting YAML should look like this:

file_name: node-echo-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: LoadBalancer
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80 
      targetPort: 5001

Apply the configuration by running kubectl apply -f node-echo-service.yaml. You should get the following output: service/node-echo-service configured.

Running kubectl get services node-echo-service -o wide, you should get an output similar to the screenshot below.

load balancer

Observe the EXTERNAL-IP column. It now says localhost; this is because I don't have a load balancer installed in my local cluster, but in the cloud you would get a public IP address here.
If we run curl -d "load balancers are amazing" localhost without specifying any port, we should get those exact words echoed back to us.
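
Once you are in the cloud and a load balancer has actually been provisioned, you can also read the public IP straight from the service object instead of eyeballing the table. A small sketch (on some providers, such as AWS, the ingress entry exposes hostname rather than ip):

kubectl get service node-echo-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'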

Deploying our Kubernetes Workload to the Cloud

From the first part of this series to this article, we have learned the basic things we need to nail down to deploy workloads to a Kubernetes cluster in the cloud. Now it's time to do the do. Let's take our workload to the cloud.

I chose Google Cloud for this demo because I've had more experience with Kubernetes on GCP.

Step 1: Set up the project we want to deploy

We will be using the project we used in the previous part.

  1. Clone the repository and duplicate the folder "k8s-node-echo"

  2. Rename the duplicate folder with a name of your choice. I'm calling mine "k8s-node-echo-with-loadbalancer".

  3. cd into the k8s-node-echo-with-loadbalancer

Step 2: Build The Project As a Docker Image

  1. Create a Docker Hub account. Docker Hub is like GitHub, but for Docker images; this is where we will push our Docker image. Note: Docker Hub is not the only place we can push our images, just as GitHub is not the only place to push our code; it is just one of the popular destinations for open-source Docker images. Take note of your username while signing up; it will be useful later on.

  2. Log in to your Docker Hub account in Docker Desktop. Click on the Login icon in the Docker Desktop UI, as shown in the screenshot below. Docker desktop login

  3. Back in your terminal, in our project folder, run the following command

docker build -f Dockerfile . -t <yourdockerhubusername>/node-echo:v1 -t <yourdockerhubusername>/node-echo:latest

For example, for me, that would be

docker build -f Dockerfile  . -t ngfizzy/node-echo:v1 -t ngfizzy/node-echo:latest

🚨 If you're working on an M1 or newer MacBook, remember to add the --platform linux/amd64 flag, i.e., docker build --platform linux/amd64 -f Dockerfile . -t ngfizzy/node-echo:v1 -t ngfizzy/node-echo:latest. This is because the ARM architecture (which Apple Silicon chips are based on) is not the default architecture most cloud service providers use.

The -t flag specifies the name of your image. The image name contains three parts:
ngfizzy: your Docker Hub username
node-echo: our application name
v1: the version of our application

The second -t option only aliases v1 as the latest version.
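
As an aside, the same result can be achieved by building with a single tag and adding the alias afterwards with docker tag; this is just an alternative to passing -t twice, not what we do in this article.

docker build -f Dockerfile . -t ngfizzy/node-echo:v1
docker tag ngfizzy/node-echo:v1 ngfizzy/node-echo:latest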

Running docker images | grep "REPOSITORY\|ngfizzy" should show you more information about the image you just built, like the screenshot below

Docker images

Step 3: Push the image to Docker Hub

Run

docker push ngfizzy/node-echo:v1

If everything works out, your output should look similar to mine in the screenshot below.

Docker push

Run the same command for the ngfizzy/node-echo:latest tag, i.e., docker push ngfizzy/node-echo:latest. If you visit your Docker Hub account, you should be able to see both images there now. Here's mine

Step 4: Update your node-echo-deployment.yaml file

  1. Clear the excessive comments I used to explain the file in the last part of this series
  2. Update the image property to ngfizzy/node-echo:latest (use your own Docker Hub username)
  3. Change the imagePullPolicy property's value to Always

The resulting configuration should look like this.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-echo-deployment 
spec:
  replicas: 1 
  selector:
    matchLabels:
      app: node-echo
  template:
    metadata:
      labels:
        app: node-echo
    spec:
      containers:
        - name: node-echo
          image: ngfizzy/node-echo:latest
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 1 
              memory: 256Mi
          ports:
            - name: node-echo-port 
              containerPort: 5001
          livenessProbe:
            httpGet:
              path: /
              port: node-echo-port 
          readinessProbe:
            httpGet:
              path: /
              port: node-echo-port
          startupProbe: # configuration for endpoints
            httpGet:
              path: / 
              port: node-echo-port    

Only the image and imagePullPolicy lines changed in the configuration above.

Step 5: Update the node-echo-service.yaml file

Replace the content of node-echo-service.yaml with the LoadBalancer configuration in the load balancer section of this article. Here's the configuration to save you from scrolling

apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: LoadBalancer
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80 
      targetPort: 5001

Step 6: Create a GKE Cluster On GCP

  1. Create a GCP account if you don't already have one
  2. Create a GCP project if you don't already have one
  3. On the home page, click on the Create GKE Cluster button as shown in the image below GCP Home

⚠ī¸ If you have previously enabled if you have not previously enabled Cloud Compute and GKE API, you'd be prompted to do so by following the prompts. When you're done, return to the home page and click the Create GKE Cluster button again.

You'll be presented with the following settings page after clicking.

GKE Autopilot config

For demo purposes, we will accept all the default settings and click the submit button at the bottom of the screen.

That should redirect you to the page shown in the screenshot below.

Cluster creating

It takes a couple of minutes for the Cluster to be created.

Step 7: Install the gcloud CLI if you haven't already

Follow the instructions here https://cloud.google.com/sdk/docs/install

Step 8: Log in to Google Cloud on your CLI and set your gcloud project

  1. gcloud auth login
  2. gcloud config set project <your-project-id>

Step 9: Connect to your GKE Cluster from your local machine

On the GKE Cluster page, click on the Connect button. Follow the numbers in the screenshot below for navigation.

Connect to GKE

Click on the pop-up box after clicking Connect, then click the copy icon to copy the connection command to your clipboard.
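
The copied command is a gcloud container clusters get-credentials call. A rough sketch with placeholders is shown below; the version you copy from the console will already have your cluster name, region, and project ID filled in.

gcloud container clusters get-credentials <your-cluster-name> \
  --region <your-region> \
  --project <your-project-id>

Go back to your CLI, paste the command, and run it. You should get the following output.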

Fetching cluster endpoint and auth data.
kubeconfig entry generated for <your cluster name>.

Confirm that you are now connected by running kubectl config get-contexts; you should see at least one entry in the table, and the name of one of them should start with gke_.
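
If the gke_ entry is not already the active context (marked with an asterisk in the CURRENT column), you can switch to it with kubectl config use-context. The context name below is a placeholder following GKE's gke_<project>_<region>_<cluster> naming pattern.

kubectl config use-context gke_<your-project-id>_<your-region>_<your-cluster-name>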

Deploy To GKE Cluster

Now that we are connected to the GKE cluster:

  1. Apply our deployment by running kubectl apply -f node-echo-deployment.yaml. You might get a warning saying, Warning: autopilot-default-resources-mutator:Autopilot updated Deployment... Don't worry about this
  2. Apply your k8s-service config by running kubectl apply -f node-echo-service.yaml
  3. Confirm the deployment by running kubectl get all. You should see an output similar to the screenshot below.

Kubectl get all

Test your deployment

  1. Run kubectl get services node-echo-service
  2. Copy the IP address under the EXTERNAL-IP column and send a POST request to it like the one below
curl -d "hello world" 34.123.423.124
# The server should echo "hello world" back to you

Summary

In this article, we took a more detailed look at Kubernetes services; we then used our knowledge to deploy a simple server to the internet. We are just scratching the surface of Kubernetes. In this series, I aim to gradually reveal container and Kubernetes features until we can paint a complete picture of how everything works end to end.

In the next article, we will come full circle and take a deeper look at containers, exploring how they do what they do.
