Siddhant Khare

The Mechanics of Distributed Tracing in OpenTelemetry

Introduction

OpenTelemetry is an open-source observability framework that provides mechanisms for creating and sending traces, metrics, and logs. It consists of various elements such as protocols for transmission and SDKs for different programming languages. In this article, we will explore how OpenTelemetry achieves distributed tracing.

What is Distributed Tracing?

Distributed tracing is a technique for tracking a request as it passes through multiple services, such as those in a microservices architecture. It helps to visualize and understand the flow of the request, including where time is spent and where errors occur.

Key Components of Distributed Tracing

  • Trace: A collection of spans representing a single request or transaction.
  • Span: A single unit of work within a trace, representing a specific operation.

A trace is a tree structure composed of multiple spans. Here's a visual representation:


Explain Like I'm 5 explanation about Distributed Tracing (for LinkedIn users, for Twitter users)

Understanding Trace from Span

To achieve distributed tracing, it is essential to understand the relationship between traces and spans. Each span includes the following elements:

  • TraceId: The ID of the trace to which the span belongs.
  • SpanId: A unique ID for the span within the trace.
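  • ParentSpanId: The ID of the parent span. It is empty for the root span of a trace.

These fields are defined on the Span message in the OpenTelemetry protocol definitions, which are written in Protocol Buffers.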

Opentelemetry proto

Code snippet ref
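In the proto definition, the trace ID is a 16-byte value and the span ID is an 8-byte value, and an ID made up entirely of zero bytes is treated as invalid. The following is a minimal sketch of what that means for the hex strings that show up in exporter output. The helper names newID and isValidID are hypothetical and are not part of any OpenTelemetry package; the SDK generates IDs for you.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newID returns n random bytes encoded as lowercase hex.
// Hypothetical helper for illustration only.
func newID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err) // no sensible recovery if the system RNG fails
	}
	return hex.EncodeToString(b)
}

// isValidID reports whether id decodes to exactly n bytes
// and is not made up entirely of zero bytes.
func isValidID(id string, n int) bool {
	b, err := hex.DecodeString(id)
	if err != nil || len(b) != n {
		return false
	}
	for _, v := range b {
		if v != 0 {
			return true
		}
	}
	return false
}

func main() {
	traceID := newID(16) // 16 bytes, printed as 32 hex characters
	spanID := newID(8)   // 8 bytes, printed as 16 hex characters
	fmt.Println(traceID, spanID)

	// An all-zero ID means "no span", which is how a missing parent is printed.
	fmt.Println(isValidID("0000000000000000", 8))
}
```

This is why every TraceID in the output later in this article is 32 characters long and every SpanID is 16.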

Example of Span Elements in a Trace

Consider the following Go code example:

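package main

import (
    "context"
    "log"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func CreateTrace() {
    // Export finished spans to stdout as JSON.
    exp, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
    if err != nil {
        log.Fatal(err)
    }
    tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
    defer tp.Shutdown(context.Background()) // flush buffered spans on exit
    otel.SetTracerProvider(tp)

    tracer := otel.Tracer("example-tracer")
    ctx, parentSpan := tracer.Start(context.Background(), "parent-span")
    defer parentSpan.End()

    // Starting a span from ctx makes it a child of parent-span.
    _, childSpan := tracer.Start(ctx, "child-span")
    defer childSpan.End()
}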

With a stdout exporter registered, the finished spans are printed roughly like this (abridged to the fields discussed here):

{
  "Name": "child-span",
  "SpanContext": {
    "TraceID": "9023c11c3272a955da5f499faa9afa71",
    "SpanID": "ca44f59e13b40d44"
  },
  "Parent": {
    "TraceID": "9023c11c3272a955da5f499faa9afa71",
    "SpanID": "70e471ef5735034d"
  }
}
{
  "Name": "parent-span",
  "SpanContext": {
    "TraceID": "9023c11c3272a955da5f499faa9afa71",
    "SpanID": "70e471ef5735034d"
  },
  "Parent": {
    "TraceID": "00000000000000000000000000000000",
    "SpanID": "0000000000000000"
  }
}

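In this example, the TraceId is the same for both spans, indicating they belong to the same trace. The ParentSpanId of the child span matches the SpanId of the parent span, establishing a parent-child relationship. The parent span is the root of the trace, so its own Parent IDs are all zeros. The child span is printed first because it ends first.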

example diagram of parent, child span
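To make the parent-child link concrete, here is a small sketch of how a tracing backend could rebuild the tree from nothing but these three IDs. It assumes spans arrive as a flat list in arbitrary order. The Span type and the renderTree function are hypothetical names for illustration, not the SDK's own types.

```go
package main

import (
	"fmt"
	"strings"
)

// Span holds only a name and the three IDs discussed above.
// Hypothetical type for illustration, not the SDK's span type.
type Span struct {
	Name, TraceID, SpanID, ParentSpanID string
}

// noParent is the all-zero parent ID that marks a root span.
const noParent = "0000000000000000"

// renderTree returns an indented outline of the spans, starting from the roots.
func renderTree(spans []Span) string {
	// Index spans by the ID of their parent.
	children := map[string][]Span{}
	for _, s := range spans {
		children[s.ParentSpanID] = append(children[s.ParentSpanID], s)
	}

	var sb strings.Builder
	var walk func(parentID string, depth int)
	walk = func(parentID string, depth int) {
		for _, s := range children[parentID] {
			sb.WriteString(strings.Repeat("  ", depth) + s.Name + "\n")
			walk(s.SpanID, depth+1) // descend into this span's children
		}
	}
	walk(noParent, 0)
	return sb.String()
}

func main() {
	// IDs copied from the stdout output above.
	const traceID = "9023c11c3272a955da5f499faa9afa71"
	spans := []Span{
		{"child-span", traceID, "ca44f59e13b40d44", "70e471ef5735034d"},
		{"parent-span", traceID, "70e471ef5735034d", noParent},
	}
	fmt.Print(renderTree(spans))
}
```

The order in which spans are exported does not matter here, because each span carries enough information to find its place in the tree.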

Propagation of Trace Context

To enable distributed tracing across multiple services, the trace context needs to be propagated. This is achieved by passing the TraceId and SpanId through headers in HTTP requests.

W3C Trace Context

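The W3C Trace Context specification standardizes how trace context information is passed. The traceparent header is used in HTTP requests with the format: ${version}-${trace-id}-${parent-id}-${trace-flags}. In the current version (00), trace-id is 32 lowercase hex characters, parent-id is the 16-character span ID of the caller, and trace-flags is two hex characters whose lowest bit marks the request as sampled.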

Example using curl:

curl -H "traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" localhost
Enter fullscreen mode Exit fullscreen mode
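As a rough illustration of the header format, the sketch below splits a version-00 traceparent value into its four fields and checks their lengths. The TraceParent type and parseTraceParent function are hypothetical names. In real code the OpenTelemetry propagator does this for you, and it handles details this sketch skips, such as future header versions.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceParent holds the fields of a version-00 traceparent header.
// Hypothetical type for illustration only.
type TraceParent struct {
	Version, TraceID, ParentID string
	Sampled                    bool
}

// parseTraceParent splits and validates a version-00 traceparent value.
func parseTraceParent(h string) (TraceParent, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 {
		return TraceParent{}, errors.New("traceparent: want 4 dash-separated fields")
	}
	wantLen := []int{2, 32, 16, 2} // hex characters per field
	for i, p := range parts {
		if len(p) != wantLen[i] || p != strings.ToLower(p) {
			return TraceParent{}, fmt.Errorf("traceparent: field %d is malformed", i)
		}
		if _, err := hex.DecodeString(p); err != nil {
			return TraceParent{}, fmt.Errorf("traceparent: field %d is not hex", i)
		}
		// All-zero trace-id and parent-id values are invalid.
		if (i == 1 || i == 2) && strings.Trim(p, "0") == "" {
			return TraceParent{}, fmt.Errorf("traceparent: field %d is all zeros", i)
		}
	}
	if parts[0] != "00" {
		return TraceParent{}, errors.New("traceparent: this sketch only handles version 00")
	}
	flags, _ := hex.DecodeString(parts[3]) // already validated above
	return TraceParent{
		Version:  parts[0],
		TraceID:  parts[1],
		ParentID: parts[2],
		Sampled:  flags[0]&0x01 == 1, // lowest bit is the sampled flag
	}, nil
}

func main() {
	// Header value taken from the curl example above.
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", tp)
}
```

Note that parent-id carries the span ID of the caller, which is what the receiving service records as its ParentSpanId.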

Propagation in Go

Here's an example of a server and client in Go that demonstrates trace context propagation:

Server Code

package main

import (
    "fmt"
    "log"
    "net/http"
    "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
    "go.opentelemetry.io/otel/propagation"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func RunServer() {
    // Export finished spans to stdout as JSON.
    exp, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
    if err != nil {
        log.Fatal(err)
    }
    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exp),
    )
    // Register the W3C propagator; without it the traceparent header is ignored.
    otel.SetTextMapPropagator(propagation.TraceContext{})

    otelHandler := otelhttp.NewHandler(http.HandlerFunc(handler), "handle-request", otelhttp.WithTracerProvider(tp))
    http.Handle("/", otelHandler)
    log.Fatal(http.ListenAndServe(":9002", nil))
}

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Println("handled")
}

Client Code

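package main

import (
    "context"
    "io"
    "log"
    "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
    "go.opentelemetry.io/otel/propagation"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func CreatePropagationTrace() {
    exp, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
    if err != nil {
        log.Fatal(err)
    }
    tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
    defer tp.Shutdown(context.Background()) // flush buffered spans on exit
    otel.SetTracerProvider(tp)
    // Register the W3C propagator so traceparent is added to outgoing requests.
    otel.SetTextMapPropagator(propagation.TraceContext{})
    tracer := otel.Tracer("example-tracer")
    ctx, span := tracer.Start(context.Background(), "hello-span")
    defer span.End()

    resp, err := otelhttp.Get(ctx, "http://localhost:9002")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    io.ReadAll(resp.Body)
}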

Output

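When you run the server and then the client, the exported spans show the propagation. All three spans share one TraceID, and the server span's Parent is the client's HTTP GET span: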

Server Trace

{
  "Name": "handle-request",
  "SpanContext": {
    "TraceID": "817f4043c5837f2bbb44562f3683f274",
    "SpanID": "3bba3b994e029bfc"
  },
  "Parent": {
    "TraceID": "817f4043c5837f2bbb44562f3683f274",
    "SpanID": "892d624c6f0c01a6"
  }
}

Client Trace

{
  "Name": "HTTP GET",
  "SpanContext": {
    "TraceID": "817f4043c5837f2bbb44562f3683f274",
    "SpanID": "892d624c6f0c01a6"
  },
  "Parent": {
    "TraceID": "817f4043c5837f2bbb44562f3683f274",
    "SpanID": "1f312e90fb65c0e3"
  }
}
{
  "Name": "hello-span",
  "SpanContext": {
    "TraceID": "817f4043c5837f2bbb44562f3683f274",
    "SpanID": "1f312e90fb65c0e3"
  },
  "Parent": {
    "TraceID": "00000000000000000000000000000000",
    "SpanID": "0000000000000000"
  }
}

Overall process

Conclusion

Distributed tracing with OpenTelemetry enables us to track and monitor requests across multiple services by passing trace context through headers. By understanding and implementing the elements of TraceId, SpanId, and ParentSpanId, we can visualize the flow of a request and diagnose issues more effectively.

With the standardized W3C Trace Context, trace context propagation becomes consistent and interoperable across different services and platforms.

This article has covered the basics of how OpenTelemetry achieves distributed tracing, providing code examples and visualizations to illustrate the concepts. Happy tracing!

For more details, visit the OpenTelemetry Documentation.


For more tips and insights on monitoring and tech, follow me on Twitter @Siddhant_K_code and stay updated with the latest & detailed tech content like this. Happy coding!
