Network Observability For Kubernetes Part 1: KubeShark

#kubernetes #programming #cloud #devops

If you aren’t aware of what’s happening from a networking perspective when it comes to Kubernetes, your environment is not properly managed.

If you don’t have networking skills, as in, truly understanding the way networks work (IP addressing, routing, ports, firewall rules, ingress/egress, etc…), your environment is not managed properly.

The thing is, networking isn’t at the top of every engineer’s list, and it may not be a skill that every engineer has picked up. Although it may not be a “hot new topic”, networking is arguably the most important piece of any good implementation.

This blog post is the first in a four-part series that will help you retrieve the networking information you need from Kubernetes, with a focus on API calls.

Why Network Observability Is Crucial For Kubernetes

Networking is arguably the most important part of Kubernetes. Without proper networking, clusters and application workloads are usually in complete shambles.

But what does this mean?

Without properly understanding ingress and egress, you’re likely to leave the defaults in place, and by default Pods accept traffic from anywhere and can send traffic anywhere. This is an attacker’s dream.
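To make the ingress/egress point concrete, here is a minimal sketch of a default-deny NetworkPolicy written out from the shell. The policy name default-deny-all and the test Namespace are assumptions for illustration, and the policy only takes effect if your CNI plugin enforces NetworkPolicy.

```shell
#!/bin/sh
# Sketch: write a default-deny NetworkPolicy manifest to a scratch directory.
# The policy name and the "test" Namespace are illustrative assumptions.
workdir=$(mktemp -d)
manifest="$workdir/default-deny.yaml"

cat > "$manifest" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: test
spec:
  # An empty podSelector selects every Pod in the Namespace
  podSelector: {}
  # Listing both types with no allow rules blocks all ingress and egress
  policyTypes:
    - Ingress
    - Egress
EOF

# Nothing is applied automatically; review the file first.
echo "Manifest written to $manifest"
echo "Apply it with: kubectl apply -f $manifest"
```

Once a policy like this is in place, you would add narrower policies that allow only the traffic each workload actually needs.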

Without understanding port forwarding and communication over ports, you won’t be able to ensure the traffic that’s supposed to reach a particular container inside of a Pod will reach its destination.

Without understanding IP addressing, CIDRs, and the way applications talk to each other over a network, you can’t secure the traffic and you can’t properly ensure that network traffic is going to the proper location.
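If CIDR notation feels abstract, the sketch below shows the arithmetic behind it: an address belongs to a block when its leading prefix bits match the network's. The function names and sample addresses are assumptions for illustration (10.244.0.0/16 is only an example Pod range, not necessarily what your cluster uses).

```shell
#!/bin/sh
# Sketch: test whether an IPv4 address belongs to a CIDR block using integer math.
# Function names and the sample addresses below are illustrative assumptions.

# Convert a dotted-quad IPv4 address (e.g. 10.244.3.17) to a 32-bit integer.
# Assumes octets are written without leading zeros.
ip_to_int() {
  # Split on "." with a here-document so this also works in plain POSIX sh
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( (o1 << 24) + (o2 << 16) + (o3 << 8) + o4 ))
}

# Succeed (exit 0) when address $1 is inside CIDR block $2.
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  network=$(ip_to_int "${2%/*}")   # part before the slash
  prefix=${2#*/}                   # part after the slash
  # Netmask: "prefix" one-bits followed by zero-bits, kept to 32 bits
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( network & mask )) ]
}

# Usage: is a Pod IP inside an example Pod CIDR?
if ip_in_cidr 10.244.3.17 10.244.0.0/16; then
  echo "10.244.3.17 is inside 10.244.0.0/16"
fi
```

The same check is what a NetworkPolicy ipBlock or a firewall rule performs when it decides whether a source address matches a range.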

Above all else, without proper networking knowledge, there’s absolutely no way to ensure that the network is working as expected from an observability and performance perspective.

There are three main parts of networking in Kubernetes:

How cluster components are talking to Kubernetes resources.
How Pods and other resources are talking to each other.
What traffic is going to Pods (ingress) and what traffic is leaving from Pods (egress).
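As a rough starting point, each of the three areas above maps to a read-only kubectl query. The helper name network_overview and the KUBECTL override variable are hypothetical; setting KUBECTL=echo previews the commands without touching a cluster.

```shell
#!/bin/sh
# Sketch: one read-only kubectl query per traffic category listed above.
# network_overview and the KUBECTL override are hypothetical names for illustration.
network_overview() {
  ns="${1:-default}"
  k="${KUBECTL:-kubectl}"

  # 1. Cluster components: control plane and node-level Pods usually run in kube-system
  "$k" get pods -n kube-system -o wide

  # 2. Pod-to-Pod and Pod-to-Service paths: Pod IPs, Services, and their Endpoints
  "$k" get pods,services,endpoints -n "$ns" -o wide

  # 3. Ingress and egress rules defined for the Namespace
  "$k" get networkpolicies,ingresses -n "$ns"
}

# Example (requires cluster access): network_overview test
```

These commands only show how the objects are configured. They do not show the traffic actually flowing between them, which is the gap the rest of this post addresses.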

The question is: how can you see that traffic in an efficient way?

What Is KubeShark

This is where KubeShark comes into play: it gives you an efficient way to see network API traffic.

Kubernetes is constantly sending network traffic, whether it’s to various Kubernetes components in the Control Plane, components on the Worker Nodes, or resources (Pods, Services) running inside of the cluster. Packets are packets regardless of where they’re coming from, yet these network packets are rarely observed or monitored unless something goes wrong.

KubeShark’s primary focus is analyzing API traffic for Kubernetes resources, which, from a Kubernetes perspective, makes up the vast majority of the traffic (90% or more, as a rough estimate). Nearly everything in Kubernetes happens through an API call over the network, so a tool like KubeShark gives you a lot of visibility. You can see which Pods are talking to other Pods and which Control Plane and Worker Node components are talking to each other or to Pods. You can also see which APIs are being called within Kubernetes.

Installing KubeShark

There are a few different methods of installing KubeShark.

Shell:

sh <(curl -Ls https://kubeshark.co/install)

Mac:

brew tap kubeshark/kubeshark
brew install kubeshark

Helm:

helm repo add kubeshark https://helm.kubeshark.co

helm install kubeshark kubeshark/kubeshark

# Forward the frontend if you don't use the kubeshark command
kubectl port-forward -n default service/kubeshark-front 8899:80

When you install with the Helm Chart, KubeShark is deployed to the cluster right away. If you use the kubeshark command instead, it installs the Helm Chart in the background when you run a command such as kubeshark tap.
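If you go the Helm route, a quick sanity check might look like the sketch below. The function name and the HELM and KUBECTL override variables are hypothetical; the release name kubeshark, the default Namespace, and the kubeshark-front Service are taken from the commands above.

```shell
#!/bin/sh
# Sketch: confirm the Helm release exists and the frontend Service is present.
# verify_kubeshark_release and the HELM/KUBECTL overrides are hypothetical names.
verify_kubeshark_release() {
  ns="${1:-default}"
  helm_bin="${HELM:-helm}"
  kubectl_bin="${KUBECTL:-kubectl}"

  # Stop early if Helm does not know about a release named "kubeshark"
  "$helm_bin" status kubeshark -n "$ns" || return 1

  # List the Pods in the Namespace so you can check that they are Running
  "$kubectl_bin" get pods -n "$ns"

  # The frontend Service is the one the port-forward command targets
  "$kubectl_bin" get service kubeshark-front -n "$ns"
}

# Example (requires cluster access): verify_kubeshark_release default
```

If the Service lookup fails, the port-forward command will fail too, so this is a cheap check to run before opening the dashboard.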

The most popular method appears to be the kubeshark command, because it lets you specify the Namespace that you want to view traffic for.

For example, running the following command will show traffic in the test Namespace.

kubeshark tap -n test

You should see a tab open in your web browser showing API traffic.
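You can usually narrow the tap further by passing a Pod name regex after the Namespace. The wrapper below is a sketch: tap_namespace and the KUBESHARK_BIN override are hypothetical names, and flags can differ between KubeShark versions, so check kubeshark tap --help for yours.

```shell
#!/bin/sh
# Sketch: tap one Namespace, optionally limited to Pods matching a regex.
# tap_namespace and KUBESHARK_BIN are hypothetical names for illustration.
tap_namespace() {
  ns="${1:-}"
  pod_regex="${2:-}"
  bin="${KUBESHARK_BIN:-kubeshark}"

  if [ -z "$ns" ]; then
    echo "usage: tap_namespace <namespace> [pod-regex]" >&2
    return 2
  fi

  if [ -n "$pod_regex" ]; then
    # Only Pods whose names match the regex are tapped
    "$bin" tap -n "$ns" "$pod_regex"
  else
    # No regex: tap every Pod in the Namespace
    "$bin" tap -n "$ns"
  fi
}

# Examples (require the kubeshark CLI and cluster access):
#   tap_namespace test
#   tap_namespace test "(frontend.*|api.*)"
```

When you are done, recent releases document a kubeshark clean command for removing the resources the CLI created; confirm it exists in your version with kubeshark --help before relying on it.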

Understanding Filtering

Once KubeShark is deployed, you’re going to see various traffic coming in from all Pods and resources depending on what Namespace KubeShark is looking at.

To cut down on the noise and make the output more readable, you can filter the traffic.

For example, below is a filter that collects all the traffic to and from the npm-metrics-cluster-service.