Shafqat Awan
End-to-End Distributed Tracing in Node.js Microservices Using OpenTelemetry & Grafana


Observability is a cornerstone of modern cloud-native architectures. In this tutorial, we build a production-style Node.js microservices system using an API Gateway, Auth Service, and Order Service, and implement end-to-end distributed tracing using OpenTelemetry and Grafana Tempo.

The objective is to see every request end-to-end — from the client, through the API Gateway, into backend services — both locally with Docker and in the cloud using Grafana Cloud.

This is not a toy demo. It mirrors how observability is implemented in real systems.


Why Distributed Tracing Matters

In a microservices architecture, a single user request often passes through multiple services. Without tracing, questions like these are extremely hard to answer:

  • Why is this request slow?
  • Which service failed?
  • Where is latency introduced?
  • How are services interacting in production?

OpenTelemetry standardizes how traces are generated and propagated, while Grafana Tempo stores and visualizes them at scale.


Architecture Overview

Client

API Gateway (Cloudflare Workers)
├── Auth Service (Railway) → Login / JWT Verification
└── Order Service (Railway) → Protected Business Data

Key design decisions:

  • All external traffic enters through the API Gateway
  • The API Gateway acts as the root span
  • Trace context propagates automatically downstream
  • All spans share a single trace ID

Core Components

API Gateway

  • Entry point for all client requests
  • Creates root trace span
  • Forwards requests to backend services
  • Propagates trace headers

Auth Service

  • Handles login and token verification
  • Emits child spans
  • Participates in trace propagation

Order Service

  • Serves protected resources
  • Emits child spans linked to the same trace

Observability Module

  • Local Docker-based observability stack
  • Grafana Tempo + Grafana UI
  • OTLP HTTP ingestion

Local Setup

Start the observability stack

cd observability
docker compose up
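If you are assembling the stack from scratch, a minimal docker-compose.yml might look like the sketch below. The image tags, the tempo.yaml config file, and the anonymous-auth setting are assumptions, not the repository's actual files; adjust them to your setup.

```yaml
# observability/docker-compose.yml (sketch)
services:
  tempo:
    image: grafana/tempo:latest
    command: ["-config.file=/etc/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
    ports:
      - "4318:4318"   # OTLP HTTP ingestion
      - "3200:3200"   # Tempo query API (used by Grafana)
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"   # Grafana UI
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true   # convenient for local use only
```

In Grafana, add Tempo as a data source pointing at http://tempo:3200 to query the ingested traces.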

Grafana UI will be available at:

http://localhost:3000

Start services (in separate terminals)

cd auth-service
node index.js

cd order-service
node index.js

cd api-gateway
node index.js

Implementing Distributed Tracing

1. Tracing Initialization (tracing.js)

Each service initializes OpenTelemetry using the Node SDK. This file must be loaded before any application code (for example via node -r ./tracing.js index.js) so the auto-instrumentation can patch modules like http and express before they are first required:

const { NodeSDK } = require("@opentelemetry/sdk-node");
const { getNodeAutoInstrumentations } = require("@opentelemetry/auto-instrumentations-node");
const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-http");

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    // Services started with `node index.js` run on the host, so they reach
    // the Dockerized Tempo via localhost. Use http://tempo:4318/v1/traces
    // instead if the services themselves run inside the Docker network.
    url: "http://localhost:4318/v1/traces",
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

This automatically instruments:

  • HTTP
  • Express
  • Fetch
  • Node runtime operations

2. API Gateway as Root Span

The API Gateway creates the root span for every request:

const { context, trace, SpanKind } = require("@opentelemetry/api");
const tracer = trace.getTracer("api-gateway");

app.use((req, res, next) => {
  const span = tracer.startSpan(`HTTP ${req.method} ${req.path}`, {
    kind: SpanKind.SERVER,
    attributes: {
      "service.name": "api-gateway",
      "http.method": req.method,
      "http.route": req.path,
    },
  });

  const spanContext = trace.setSpan(context.active(), span);

  context.with(spanContext, () => {
    res.on("finish", () => span.end());
    next();
  });
});

This ensures:

  • Every request starts a new trace
  • All downstream spans attach to this root

3. Propagating Trace Headers

Trace context must be forwarded when calling downstream services. The Node auto-instrumentation already injects headers into outgoing http and fetch calls; explicit injection like the helper below is useful in environments without the Node SDK, such as a Cloudflare Workers deployment of the gateway:

const { propagation, context } = require("@opentelemetry/api");

function injectTraceHeaders(headers = {}) {
  const carrier = {};
  propagation.inject(context.active(), carrier);
  return { ...headers, ...carrier };
}

// Example downstream call
await fetch(`${AUTH_SERVICE_URL}/verify`, {
  method: "POST",
  headers: injectTraceHeaders({
    Authorization: token,
  }),
});

This links Auth Service and Order Service spans under the same trace.


Observing Traces in Grafana

Once requests flow through the system, Grafana Tempo displays traces like:

api-gateway (root)
├── auth-service
│   └── verify-token
└── order-service
    └── get-orders

This allows you to:

  • See request latency per service
  • Identify bottlenecks
  • Debug failures visually

Cloud Deployment

Component         Platform
API Gateway       Cloudflare Workers
Auth Service      Railway
Order Service     Railway
Tracing Backend   Grafana Cloud Tempo
Visualization     Grafana Cloud UI

The same tracing setup works across cloud boundaries; only the exporter endpoint and credentials change, ideally supplied through environment variables rather than code edits.
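One way to retarget the exporter without touching code is the standard OTLP environment variables, assuming tracing.js constructs the exporter without a hardcoded url option (the exporter then falls back to OTEL_EXPORTER_OTLP_ENDPOINT and appends /v1/traces). The endpoint and header values below are placeholders, not real credentials; copy the actual values from your Grafana Cloud stack's OTLP configuration page:

```
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp-gateway-<zone>.grafana.net/otlp"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <base64 of instanceId:apiToken>"
export OTEL_SERVICE_NAME="auth-service"
node index.js
```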


What You’ll Learn

After completing this tutorial, you will understand:

  • How distributed tracing works internally
  • Why API Gateways must act as trace roots
  • How trace context flows across services
  • How Grafana Tempo visualizes microservice interactions
  • How production systems are debugged using observability

Video Walkthrough

You can watch the complete step-by-step coding walkthrough on YouTube:

Node.js Microservices Observability — OpenTelemetry & Grafana
👉 https://youtu.be/wyiem6fc47Q (CodingMavrick)


Final Thoughts

Observability is not an optional add-on — it’s a core requirement for modern backend systems.

This project demonstrates how real microservices are:

  • Designed
  • Connected
  • Observed
  • Debugged

If you’re building Node.js microservices for production, this setup gives you the full picture — from first request to cloud trace visualization.

Happy tracing!
CodingMavrick
