DEV Community

mrasu


How OpenTelemetry Organizes Distributed Tracing

OpenTelemetry is an observability framework for generating, collecting, and exporting traces, metrics, and logs. It consists of several components, including a wire protocol (OTLP) and SDKs for many programming languages.

This article explains how OpenTelemetry implements distributed tracing and the mechanisms that make it work.

What is Distributed Tracing?

Distributed tracing is a technique for following a single request as it passes through multiple servers. It is commonly used in microservices architectures.

OpenTelemetry's distributed tracing is built on two concepts: Trace and Span.
A Span records the details of one operation, such as its timestamps or the SQL query it ran, while a Trace is a tree structure with Spans as its nodes.

In the visual representation below, the entire figure represents a Trace, and each bar represents a Span.
Image of Trace

It's important to note that a Trace is not a tangible entity; programs emit only Spans.
The backend assembles a Trace from the Spans it receives.

Generating Traces from Spans

Distributed tracing works because the backend can reconstruct a Trace from individual Spans.

To do so, it relies on three fields carried by every Span:

  • TraceId
  • SpanId
  • ParentSpanId

These fields are defined in the OpenTelemetry protocol definitions, specifically in trace.proto.

  • TraceId: The unique identifier for the entire Trace. If, for instance, the TraceId is 9023c11c... in hexadecimal, every Span carrying this TraceId (9023c11c...) belongs to the same Trace.
  • SpanId: The identifier of the Span itself, unique within its Trace.
  • ParentSpanId: The identifier of the parent Span. When the SpanId of one Span matches the ParentSpanId of another, the first Span is the parent of the second.
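
The three fields above can be sketched as a small record type. This is only an illustration, not the OpenTelemetry SDK: the `Span` dataclass and the `new_trace_id` / `new_span_id` helpers are hypothetical names. The ID sizes follow trace.proto, where a trace ID is 16 bytes and a span ID is 8 bytes; representing a root Span's missing parent as `None` is an assumption of this sketch (the proto uses an empty field).

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, minimal stand-in for the Span message in trace.proto."""
    trace_id: str                   # 16 bytes, written as 32 hex characters
    span_id: str                    # 8 bytes, written as 16 hex characters
    parent_span_id: Optional[str]   # None for a root span (assumption of this sketch)
    name: str


def new_trace_id() -> str:
    # 16 random bytes, hex-encoded
    return secrets.token_hex(16)


def new_span_id() -> str:
    # 8 random bytes, hex-encoded
    return secrets.token_hex(8)


# A root span starts a new trace; its child reuses the TraceId and
# points back at the root through ParentSpanId.
root = Span(new_trace_id(), new_span_id(), None, "GET /orders")
child = Span(root.trace_id, new_span_id(), root.span_id, "SELECT orders")
```

Here `child.parent_span_id == root.span_id` is what makes `root` the parent, and the shared `trace_id` is what places both Spans in the same Trace.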

By combining these elements, we can construct a Trace tree.
Let's illustrate this with an example:

Image of Trace

In this example, the ParentSpanId of each Span matches the SpanId of its parent, and Spans with different TraceIds form separate trees.
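
As a rough sketch of what a backend might do with the Spans it receives, the code below groups a flat list of Spans by TraceId and links each one to its parent through ParentSpanId. The dictionary layout, the short IDs, and the function names `build_traces` and `render` are all assumptions made for illustration; real backends store and index Spans in their own ways.

```python
from collections import defaultdict


def build_traces(spans):
    """Group flat span records by trace_id and link children to parents."""
    traces = defaultdict(lambda: {"roots": [], "children": defaultdict(list)})
    for span in spans:
        trace = traces[span["trace_id"]]
        parent = span.get("parent_span_id")
        if parent:
            # This span hangs under the span whose span_id equals `parent`.
            trace["children"][parent].append(span["span_id"])
        else:
            # No parent: this span is the root of its trace.
            trace["roots"].append(span["span_id"])
    return traces


def render(trace, span_id, depth=0):
    """Return the subtree under span_id as indented lines."""
    lines = ["  " * depth + span_id]
    for child in trace["children"].get(span_id, []):
        lines.extend(render(trace, child, depth + 1))
    return lines


# Shortened, made-up IDs for readability.
spans = [
    {"trace_id": "t1", "span_id": "a", "parent_span_id": None},
    {"trace_id": "t1", "span_id": "b", "parent_span_id": "a"},
    {"trace_id": "t1", "span_id": "c", "parent_span_id": "b"},
    {"trace_id": "t2", "span_id": "x", "parent_span_id": None},
]

traces = build_traces(spans)
for trace_id, trace in traces.items():
    for root_id in trace["roots"]:
        print(trace_id, render(trace, root_id))
```

Note that the Spans can arrive in any order and from any process; only the three identifiers are needed to rebuild the trees.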

Propagation in Distributed Tracing

The previous section showed that Traces can be constructed from fields in Spans.

Now, the question arises: How does a program determine its own TraceId and ParentSpanId?

Within a single process, the TraceId can simply be shared through memory. Across machines, however, the identifiers have to travel along with the request.

This is where Propagation becomes essential.

The idea is simple: "Pass your TraceId and SpanId to other machines somehow."
For instance, when making an HTTP request, you can include these identifiers in the headers.

For historical reasons, there are several formats for carrying this information in HTTP headers. I'll focus on the one standardized by the W3C, which has been widely adopted in the industry.

W3C Trace Context

As its name suggests, W3C Trace Context is a specification published by the W3C.

With W3C Trace Context, trace information is carried in the traceparent HTTP request header.

The header follows the format ${version}-${trace-id}-${parent-id}-${trace-flags}.
When using curl, it looks like the example below:

curl \
  -H "traceparent:00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" \
  localhost

The traceparent value in this request breaks down as follows:

  • Version: 00 (currently, there is no version other than 00)
  • TraceId: 4bf92f3577b34da6a3ce929d0e0e4736
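  • ParentSpanId: 00f067aa0ba902b7 (the SpanId of the caller's Span, which becomes the ParentSpanId of the Span the receiver creates)
  • Flag: the trailing 01 means the sampled bit is set, indicating that the caller may have recorded this request.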
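
To make the field layout concrete, here is a minimal parsing sketch for the version 00 format described above. The function name `parse_traceparent` and the returned dictionary keys are hypothetical; the sketch only accepts the exact four-field layout and does not attempt to handle future versions. Rejecting all-zero IDs and version ff follows the W3C specification.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(value):
    """Split a version-00 style traceparent value into its fields.

    Returns a dict, or None if the value is malformed.
    """
    match = TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The specification treats version ff and all-zero IDs as invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # The lowest bit of trace-flags is the sampled flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


parsed = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
print(parsed)
```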

This header enables the sharing of Trace information with another service.
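
On the receiving side, the service continues the Trace rather than starting a new one. The sketch below shows the idea under simplifying assumptions: `continue_trace` is a hypothetical helper, the incoming header is assumed to be well-formed, and Spans are plain dictionaries. In practice an OpenTelemetry SDK's propagator does this work for you.

```python
import secrets


def continue_trace(incoming_traceparent):
    """Create a child span from an incoming header and build the outgoing one."""
    _version, trace_id, parent_id, flags = incoming_traceparent.split("-")

    span = {
        "trace_id": trace_id,               # same trace as the caller
        "span_id": secrets.token_hex(8),    # new 8-byte ID for this service's span
        "parent_span_id": parent_id,        # the caller's SpanId
    }

    # When this service calls the next one, its own SpanId goes into the
    # parent-id position, so the next span will point back at this one.
    outgoing = f"00-{trace_id}-{span['span_id']}-{flags}"
    return span, outgoing


span, outgoing = continue_trace(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
print(span)
print(outgoing)
```

The TraceId stays the same at every hop, while the parent-id field is replaced with the current Span's SpanId each time the request moves on.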

Conclusion

In conclusion, despite its formidable name, distributed tracing essentially boils down to passing three pieces of data: TraceId, SpanId, and ParentSpanId.

That's the way OpenTelemetry facilitates distributed tracing.
