Firat Celik

Introducing oteldoctor: a production-readiness analyzer for OpenTelemetry Collector configs

I’ve just released oteldoctor v0.1.0, an open-source Go CLI that analyzes OpenTelemetry Collector configurations before they reach production.

Why I built it

OpenTelemetry Collector configs often start small.

Then they grow.

A few receivers.
A few processors.
A couple of exporters.
Some Kubernetes manifests.
A little bit of batching.
A debug exporter during testing.
A quick HTTP endpoint.
A few attributes added for convenience.

Eventually, that YAML file becomes production-critical infrastructure.

The problem is that a Collector config can be valid YAML, and even acceptable to the Collector, while still being risky for production.

For example:

  • telemetry may be dropped when exporters fail
  • the Collector may restart during memory spikes
  • debug endpoints may be exposed
  • hardcoded secrets may leak into version control
  • high-cardinality attributes may increase observability cost
  • service identity may be inconsistent across environments

That’s the gap oteldoctor tries to fill.

What oteldoctor checks

oteldoctor analyzes Collector configs across six categories:

| Category | Examples |
| --- | --- |
| Structural | Undefined references, unused components, empty pipelines |
| Reliability | Missing memory_limiter, missing batch, retry/queue gaps |
| Security | Plain HTTP, hardcoded secrets, exposed debug endpoints |
| Cost / Cardinality | High-cardinality dimensions, missing sampling, debug in production |
| Semantic Quality | Deprecated attributes, missing service identity |
| Kubernetes Readiness | GOMEMLIMIT, resource limits, probes, exposure risks |
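
To make the structural and reliability categories more concrete, here is a minimal Go sketch of what such checks could look like. It is not oteldoctor's actual implementation: the `Config` and `Pipeline` types, the `Check` function, and the finding messages are all hypothetical, and the config is modeled as plain Go values instead of being parsed from YAML.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Pipeline lists the component IDs one Collector pipeline refers to.
// (Hypothetical type for illustration only.)
type Pipeline struct {
	Receivers, Processors, Exporters []string
}

// Config is a simplified model of a Collector config: the sets of
// defined component IDs plus the service pipelines that use them.
type Config struct {
	Receivers, Processors, Exporters map[string]bool
	Pipelines                        map[string]Pipeline
}

// Check returns sorted findings for undefined references, unused
// components, and pipelines without a memory_limiter processor.
func Check(c Config) []string {
	var findings []string
	used := map[string]bool{}

	// scan marks each referenced ID as used and reports undefined ones.
	scan := func(pipeline, kind string, refs []string, defined map[string]bool) {
		for _, id := range refs {
			used[kind+"/"+id] = true
			if !defined[id] {
				findings = append(findings,
					fmt.Sprintf("pipeline %s: undefined %s %q", pipeline, kind, id))
			}
		}
	}
	for name, p := range c.Pipelines {
		scan(name, "receiver", p.Receivers, c.Receivers)
		scan(name, "processor", p.Processors, c.Processors)
		scan(name, "exporter", p.Exporters, c.Exporters)

		// Component IDs have the form "type" or "type/name".
		hasLimiter := false
		for _, id := range p.Processors {
			if id == "memory_limiter" || strings.HasPrefix(id, "memory_limiter/") {
				hasLimiter = true
			}
		}
		if !hasLimiter {
			findings = append(findings,
				fmt.Sprintf("pipeline %s: no memory_limiter processor", name))
		}
	}

	// unused reports components that are defined but never referenced.
	unused := func(kind string, defined map[string]bool) {
		for id := range defined {
			if !used[kind+"/"+id] {
				findings = append(findings, fmt.Sprintf("unused %s %q", kind, id))
			}
		}
	}
	unused("receiver", c.Receivers)
	unused("processor", c.Processors)
	unused("exporter", c.Exporters)

	sort.Strings(findings) // map iteration order is random; sort for stable output
	return findings
}

func main() {
	cfg := Config{
		Receivers:  map[string]bool{"otlp": true},
		Processors: map[string]bool{"batch": true},
		Exporters:  map[string]bool{"otlp": true, "debug": true},
		Pipelines: map[string]Pipeline{
			"traces": {
				Receivers:  []string{"otlp"},
				Processors: []string{"batch", "filter"},
				Exporters:  []string{"otlp"},
			},
		},
	}
	for _, f := range Check(cfg) {
		fmt.Println(f)
	}
}
```

The sample config in `main` is something the YAML parser would accept without complaint, yet it references a processor that is never defined, leaves a `debug` exporter unused, and has no `memory_limiter` in its only pipeline.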

Example usage

oteldoctor analyze ./deploy --profile production

Generate SARIF for GitHub Code Scanning:

oteldoctor analyze ./deploy --profile production --format sarif > oteldoctor.sarif

Render the Collector pipeline as a graph:

oteldoctor graph collector.yaml --format mermaid
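
For readers unfamiliar with Mermaid, the sketch below shows one way a single pipeline could be turned into a Mermaid flowchart in Go. It only illustrates the idea and does not reproduce what `oteldoctor graph` actually emits: the `Pipeline` type, the `Mermaid` function, and the node naming scheme are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Pipeline lists the component IDs of one Collector pipeline.
// (Hypothetical type for illustration only.)
type Pipeline struct {
	Receivers, Processors, Exporters []string
}

type node struct{ kind, id string }

// ref renders a node as `kind_id["id"]`. Characters that are not letters
// or digits are replaced with '_' so the Mermaid node ID stays valid.
func ref(n node) string {
	safe := strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return r
		}
		return '_'
	}, n.id)
	return fmt.Sprintf("%s_%s[\"%s\"]", n.kind, safe, n.id)
}

// Mermaid renders a left-to-right flowchart: every receiver feeds the
// processor chain in order, and the last stage fans out to every exporter.
func Mermaid(p Pipeline) string {
	// Layer 0 holds all receivers, each processor gets its own layer,
	// and the final layer holds all exporters.
	layers := [][]node{{}}
	for _, r := range p.Receivers {
		layers[0] = append(layers[0], node{"receiver", r})
	}
	for _, pr := range p.Processors {
		layers = append(layers, []node{{"processor", pr}})
	}
	var exporters []node
	for _, e := range p.Exporters {
		exporters = append(exporters, node{"exporter", e})
	}
	layers = append(layers, exporters)

	var b strings.Builder
	b.WriteString("flowchart LR\n")
	// Connect every node in a layer to every node in the next layer.
	for i := 0; i+1 < len(layers); i++ {
		for _, from := range layers[i] {
			for _, to := range layers[i+1] {
				fmt.Fprintf(&b, "  %s --> %s\n", ref(from), ref(to))
			}
		}
	}
	return b.String()
}

func main() {
	fmt.Print(Mermaid(Pipeline{
		Receivers:  []string{"otlp"},
		Processors: []string{"memory_limiter", "batch"},
		Exporters:  []string{"otlp/backend", "debug"},
	}))
}
```

The resulting text can be pasted into any Mermaid renderer, which makes it easy to see at a glance which exporters a given receiver ultimately feeds.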

Explain a rule:

oteldoctor explain OTEL-SEC-202

Install:

go install github.com/firfircelik/oteldoctor/cmd/oteldoctor@v0.1.0

What it is not

oteldoctor does not replace the OpenTelemetry Collector’s own configuration validation.

The Collector can tell you whether a config is syntactically valid and whether it will load.

oteldoctor focuses on production readiness: reliability, security, cost/cardinality, semantic convention quality, and Kubernetes deployment risks.

GitHub: https://github.com/firfircelik/oteldoctor

This is the first public release. I’d love feedback from anyone using OpenTelemetry Collector, especially around real-world configs and rule calibration.
