mrasu

Posted on Feb 1, 2024

How OpenTelemetry Organizes Distributed Tracing

#sre #webdev #microservices #monitoring

OpenTelemetry is a framework for sending traces, metrics, and logs. It consists of various components, such as protocols and SDKs tailored for many programming languages.

This article explores how OpenTelemetry accomplishes distributed tracing and delves into the underlying mechanisms that make it work.

What is Distributed Tracing?

Distributed tracing is a mechanism for tracking a single request as it travels across multiple servers, commonly employed in microservices architectures.

OpenTelemetry's distributed tracing primarily consists of two elements: Trace and Span.
A Span contains execution details such as timestamps or an SQL query, while a Trace is a tree structure with Spans as its nodes.

In the visual representation below, the entire figure represents a Trace, and each bar represents a Span.

It's important to note that a Trace is not a tangible entity; programs generate only Spans.
The backend assembles a Trace from the Spans it receives.

Generating Traces from Spans

Distributed tracing can be achieved when the backend constructs a Trace based on Spans.

To create a Trace based on Spans, you need to consider three key elements within the Span:

TraceId
SpanId
ParentSpanId

You can find them in the OpenTelemetry proto files, specifically in trace.proto.

TraceId: This is the unique identifier for the entire trace. If, for instance, the TraceId is 9023c11c... in hexadecimal, any Span sharing this TraceId (9023c11c...) is part of the same Trace.
SpanId: This is the identifier of the Span itself. It is unique within its Trace.
ParentSpanId: This is the identifier of the parent Span. When the SpanId of one Span matches the ParentSpanId of another, the first Span is the parent of the second. A root Span has an empty ParentSpanId.

By combining these elements, we can construct a Trace tree.
Let's illustrate this with an example:

In this example, observe that the ParentSpanId of each Span matches the SpanId of its parent, while different TraceIds give rise to different trees.
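The same construction can be sketched in code. The snippet below is a minimal illustration, not how any particular backend does it: the dictionary keys (trace_id, span_id, parent_span_id, name), the function build_traces, and the sample spans are all hypothetical names chosen for this example.

```python
from collections import defaultdict


def build_traces(spans):
    """Group spans by trace_id and link each span to its children."""
    children = defaultdict(list)  # (trace_id, parent_span_id) -> child spans
    roots = defaultdict(list)     # trace_id -> spans with no parent

    for span in spans:
        if span["parent_span_id"]:
            key = (span["trace_id"], span["parent_span_id"])
            children[key].append(span)
        else:
            roots[span["trace_id"]].append(span)

    def to_tree(span):
        # A span's children are the spans whose parent_span_id equals its span_id.
        kids = children.get((span["trace_id"], span["span_id"]), [])
        return {"name": span["name"], "children": [to_tree(k) for k in kids]}

    return {tid: [to_tree(r) for r in rs] for tid, rs in roots.items()}


# Hypothetical spans, as a backend might receive them from several services.
spans = [
    {"trace_id": "aaaa", "span_id": "01", "parent_span_id": "", "name": "GET /checkout"},
    {"trace_id": "aaaa", "span_id": "02", "parent_span_id": "01", "name": "SELECT orders"},
    {"trace_id": "aaaa", "span_id": "03", "parent_span_id": "01", "name": "POST /payment"},
    {"trace_id": "bbbb", "span_id": "01", "parent_span_id": "", "name": "GET /health"},
]

traces = build_traces(spans)
```

Note that the SpanId 01 appears in both traces without conflict, because lookups are keyed by TraceId and SpanId together. This sketch silently ignores spans whose parent never arrives; a real backend has to decide how to handle them.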

Propagation in Distributed Tracing

As established in the previous section, Traces can be constructed from the identifiers stored in Spans.

Now, the question arises: How does a program determine its own TraceId and ParentSpanId?

While sharing TraceId within the same process can be achieved through memory, distributed tracing in a multi-machine environment necessitates a more sophisticated approach.

This is where Propagation becomes essential.

The idea is simple: "Pass your TraceId and SpanId to other machines somehow."
For instance, when making an HTTP request, you can include these identifiers in the headers.
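To make the idea concrete, here is a rough sketch of what each side does with the identifiers. It is an assumption-laden illustration rather than the OpenTelemetry SDK API: new_span, outgoing_context, and the dictionary layout are hypothetical. The only facts taken from OpenTelemetry are the ID sizes (16 bytes for a TraceId, 8 bytes for a SpanId).

```python
import secrets


def new_span(name, incoming=None):
    """Create a span, continuing the caller's trace if context was received."""
    if incoming:
        # Continue the caller's trace: reuse its TraceId, and record
        # the caller's SpanId as our ParentSpanId.
        trace_id = incoming["trace_id"]
        parent_span_id = incoming["span_id"]
    else:
        # No incoming context: this span is the root of a new trace.
        trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
        parent_span_id = ""
    return {
        "trace_id": trace_id,
        "span_id": secrets.token_hex(8),  # 8 random bytes -> 16 hex chars
        "parent_span_id": parent_span_id,
        "name": name,
    }


def outgoing_context(span):
    """The data a caller must pass along with its request."""
    return {"trace_id": span["trace_id"], "span_id": span["span_id"]}


# Service A starts a trace and calls service B.
frontend = new_span("GET /checkout")
carried = outgoing_context(frontend)  # in practice, sent e.g. as HTTP headers

# Service B, on another machine, receives the context and creates a child span.
backend = new_span("POST /payment", incoming=carried)
```

After this exchange, both spans share one TraceId and the second span's ParentSpanId equals the first span's SpanId, which is exactly what the backend needs to link them.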

For historical reasons, there are several formats for transmitting this information in HTTP headers. I'll focus on the one standardized by the W3C, which has gained adoption in the industry.

W3C Trace Context

As its name indicates, W3C Trace Context is a format specified by the W3C.

When utilizing W3C Trace Context, the traceparent field is employed in the HTTP request header.

The header follows the format ${version}-${trace-id}-${parent-id}-${trace-flags}.
When using curl, it looks like the example below:

curl \
  -H "traceparent:00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" \
  localhost

This example demonstrates how the traceparent field is included in an HTTP request header, conveying essential trace information to the destination:

Version: 00 (currently, there is no version other than 00)
TraceId: 4bf92f3577b34da6a3ce929d0e0e4736
ParentSpanId: 00f067aa0ba902b7
Flag: 01 at the end means the sampled flag is set; 00 would mean the request is not being sampled.

This header enables the sharing of Trace information with another service.
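As a sketch of how a receiving service might read this header, the snippet below parses and formats a traceparent value. It is a simplified illustration that only handles the version 00 layout; the function names parse_traceparent and format_traceparent and the returned dictionary keys are my own, not part of any library.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Return the fields of a traceparent header, or None if it is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec treats all-zero identifiers as invalid.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_span_id": parent_id,
        # The lowest bit of trace-flags is the "sampled" flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


def format_traceparent(trace_id, span_id, sampled=True):
    """Build the header value a caller would send to the next service."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
context = parse_traceparent(header)
```

Here context holds the TraceId and ParentSpanId from the curl example above. In a real application you would normally rely on the propagator shipped with your OpenTelemetry SDK instead of parsing the header by hand.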

Conclusion

In conclusion, despite its formidable name, distributed tracing essentially boils down to passing a few identifiers (TraceId, SpanId, ParentSpanId) between services.

That's the way OpenTelemetry facilitates distributed tracing.
