Platforms, applications, and various systems have no business being deployed in production without proper monitoring and observability.
Without monitoring, you can’t see what’s happening in real time.
Without observability, you can’t act on your metrics, logs, or traces.
The thing is, you’ve always needed multiple platforms to do this… until now.
In this blog post, you’ll learn about OpenTelemetry and how to implement it in your Kubernetes environment.
OpenTelemetry
What exactly is OpenTelemetry?
Well, let’s break it down.
I’ll be honest - when I first started diving into OpenTelemetry, I wasn’t really sure why I’d need it. I thought to myself, “I can just use Prometheus and a log aggregator for this.”
But then it hit me… yes, I could use multiple tools to do the same thing, but why would I use multiple tools when I could instead just use one?
After further research on OpenTelemetry, I came to the conclusion that it’s not such a bad idea.
Essentially, the heart and soul of it is this: it gives you full observability (logs, metrics, and traces) in one place for the entire cluster instead of having to piecemeal tools together. For example, Prometheus is great, but it only handles metrics. You’ll need other tools for traces and logs. With OpenTelemetry, you don’t need multiple tools.
With the rapid rate at which tools and platforms are being created in the Kubernetes space, a tool like OpenTelemetry, which covers the whole observability story in one platform, is amazing. It takes the confusion, hardship, and overall need to manage multiple tools/platforms out of the equation.
From a monitoring perspective, yes, you may still need multiple tools.
But from a logging, metrics, and tracing perspective, you only need one: OpenTelemetry.
Keep in mind that OpenTelemetry isn’t just for Kubernetes. You can also put it in your application code and run it on various systems and containerization platforms.
How Does OpenTelemetry Work With Kubernetes?
Like any tool/platform that’s ready to be consumed and utilized by Kubernetes Engineers, OpenTelemetry has a Kubernetes Operator.
💡 A Kubernetes Operator consists of various Custom Resource Definitions (CRDs) and a Controller that reconciles the current state of the deployment toward the desired state.
For OpenTelemetry to work on Kubernetes, you must have cert-manager installed, which is another Operator that automatically adds certificates to Kubernetes resources. It also automates obtaining and renewing those certs. You’ll see how to deploy cert-manager in the installation section below.
For OpenTelemetry Collector deployments in general, whether on Kubernetes or elsewhere, you have two options:
- Agent
- Gateway
The Agent works like any other agent. It’s a Collector instance that runs close to your workloads to collect observability data from the environment.
The Gateway runs as a standalone instance per cluster. It can help limit the egress points needed to send data out and consolidate API token management. The Collectors in a Gateway installation also operate independently, so they’re much easier to scale.
With Kubernetes, however, you would use the Operator. The Collector and Operator follow the same versioning: whenever a new version of the Collector is available, a matching version of the Operator is released.
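As a sketch of how those deployment patterns map onto the Operator: the OpenTelemetryCollector CRD has a `mode` field, so an agent-style rollout is just a matter of declaring `daemonset` mode (the name below is a placeholder):

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: node-agent        # placeholder name
spec:
  mode: daemonset         # one Collector Pod per node; other modes include deployment, sidecar, statefulset
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
```

The default mode is `deployment`, which is closer to the Gateway pattern described above.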
You’ll see a few different installation methods in the next section.
Once it’s installed, you implement a Collector and define the destination you want the data sent to. For example, Kafka or Jaeger.
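For instance, the exporters section of a Collector config is where you point the data somewhere. A sketch assuming a Jaeger collector and a Kafka broker reachable at the in-cluster addresses shown (both endpoints are placeholders for your environment):

```yaml
exporters:
  jaeger:
    endpoint: jaeger-collector.observability.svc:14250   # placeholder address
    tls:
      insecure: true
  kafka:
    brokers:
      - kafka.observability.svc:9092                     # placeholder address
    topic: otlp_spans
```

You’d then reference `jaeger` or `kafka` in a pipeline under `service.pipelines`, which you’ll see in the full example later.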
Purpose Of OpenTelemetry
At this point, you may be thinking a few things, but if you’re not, I’d like to point them out:
- OpenTelemetry isn’t doing anything from a technical perspective that other tools like Datadog and New Relic aren’t.
- It’s collecting/receiving/processing observability pieces like Metrics, Logging, and Traces, much like other tools and platforms do.
So, what’s the point of OpenTelemetry then?
It’s so you’re not locked into a specific platform.
No, that doesn’t mean “vendor lock-in” in the traditional sense of “you’re using a platform, you’re stuck there”. What it means is OpenTelemetry allows you to take the observability data and send it to whatever location you’d like.
Want to send it to Grafana? No problem. Want to send it to Kafka? No problem. Want to send it to Fluent Bit? No problem. Want to send it to some backend that you created or a home-grown solution? No problem.
The other beautiful part is that you can do it all declaratively using CRDs in your Kubernetes cluster, which a lot of other tools don’t give you the ability to do. It’s meant to work like all other CRDs, making it feel “closer to home” when it comes to Kubernetes.
Collectors
Because OpenTelemetry is an open platform in the sense that it essentially just aggregates observability data, it’s then your job to figure out where you want to send the data.
That’s where Collectors come in.
Collectors are a vendor-agnostic way to process and export telemetry data. The OpenTelemetry Collector removes the need for engineers to implement a ton of different agents and exporters based on where and how they need to transfer observability data from the Kubernetes cluster to the location of their choosing.
You may be thinking: sure, I don’t have to have a bunch of agents/exporters, but I do need a bunch of Collectors if I’m exporting to various locations.
Not necessarily: a single Collector can be configured with multiple exporters. And even when you do run several, each one is handled as a Kubernetes resource/object, so the declarative nature is far more natural in a Kubernetes environment, which makes it far easier to manage.
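To illustrate, a single pipeline can fan out to several exporters at once. This is just the `service` fragment of a Collector config, assuming `otlp` and `kafka` exporters have been defined elsewhere in the same config:

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, otlp, kafka]   # one pipeline, three destinations
```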
Installing OpenTelemetry
There are two installation methods:
- A raw Kubernetes Manifest.
- A Helm Chart.
Helm Charts are better from a management perspective for production, but let’s see how to use both.
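First, since the Operator depends on cert-manager, install it if it isn’t already on the cluster. One common way is applying its release manifest directly (the version shown is an example; check the cert-manager releases page for the current one):

```shell
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
```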
For the raw k8s Manifest installation:
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
For the Helm installation:
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator
You can then deploy a Collector based on your needs. For example, this is a minimal Collector that receives traces over OTLP and writes them to the console via the logging exporter.
kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: simplest
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 10s
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [logging]
EOF
You can now see that it’s running in your Kubernetes cluster.
kubectl get OpenTelemetryCollector
NAME       MODE         VERSION   AGE
simplest   deployment   0.66.0    52s
Conclusion
I do like the idea of OpenTelemetry. As an engineer, I like home-grown solutions and choosing my own adventure. Having the ability to be as flexible as possible and send the observability data wherever you want is an amazing thing. With OpenTelemetry, I don’t have to worry about a tool/platform no longer being supported or it migrating functionality in a way that I dislike. If that happens, I can just write another Collector that points my data from one place to another.
The problem is, I don’t know if this is going to work for the enterprise.
I can see a lot of managers and leaders saying something along the lines of: “I have to pay people to write all of this code for the Collectors to get it working? And manage it? I just want to buy X Solution so it does all of that for me.”
Only time will tell if I’m right or not.
On the flip side, OpenTelemetry is truly awesome for teams that want complete flexibility. I personally like the flexibility of it.