End-to-End Distributed Tracing in Node.js Microservices with OpenTelemetry & Grafana
Observability is a cornerstone of modern cloud-native architectures. In this tutorial, we build a production-style Node.js microservices system using an API Gateway, Auth Service, and Order Service, and implement end-to-end distributed tracing using OpenTelemetry and Grafana Tempo.
The objective is to see every request end-to-end — from the client, through the API Gateway, into backend services — both locally with Docker and in the cloud using Grafana Cloud.
This is not a toy demo. It mirrors how observability is implemented in real systems.
Why Distributed Tracing Matters
In a microservices architecture, a single user request often passes through multiple services. Without tracing, questions like these are extremely hard to answer:
- Why is this request slow?
- Which service failed?
- Where is latency introduced?
- How are services interacting in production?
OpenTelemetry standardizes how traces are generated and propagated, while Grafana Tempo stores and visualizes them at scale.
Architecture Overview
Client
↓
API Gateway (Cloudflare Workers)
├── Auth Service (Railway) → Login / JWT Verification
└── Order Service (Railway) → Protected Business Data
Key design decisions:
- All external traffic enters through the API Gateway
- The API Gateway creates the root span for every trace
- Trace context propagates automatically to downstream services (via the W3C traceparent header shown below)
- All spans of a request share a single trace ID
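Concretely, the context travels in the traceparent header defined by the W3C Trace Context specification; it carries the trace ID, the parent span ID, and sampling flags:
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
Any service that receives this header continues the existing trace instead of starting a new one, which is how all spans end up under one trace ID. (The IDs above are the example values from the W3C specification.)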
Core Components
API Gateway
- Entry point for all client requests
- Creates root trace span
- Forwards requests to backend services
- Propagates trace headers
Auth Service
- Handles login and token verification
- Emits child spans
- Participates in trace propagation
Order Service
- Serves protected resources
- Emits child spans linked to the same trace
Observability Module
- Local Docker-based observability stack
- Grafana Tempo + Grafana UI
- OTLP HTTP ingestion
Local Setup
Start the observability stack
cd observability
docker compose up
The Grafana UI will be available at:
http://localhost:3000
Start services (in separate terminals)
cd auth-service
node index.js
cd order-service
node index.js
cd api-gateway
node index.js
Implementing Distributed Tracing
1. Tracing Initialization (tracing.js)
Each service initializes OpenTelemetry using the Node SDK:
// tracing.js: initialize the OpenTelemetry Node SDK before anything else is loaded
const { NodeSDK } = require("@opentelemetry/sdk-node");
const { getNodeAutoInstrumentations } = require("@opentelemetry/auto-instrumentations-node");
const { OTLPTraceExporter } = require("@opentelemetry/exporter-trace-otlp-http");

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    // "tempo" resolves inside the Docker network; when running a service
    // directly on the host, use http://localhost:4318/v1/traces instead
    url: "http://tempo:4318/v1/traces",
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
This automatically instruments:
- HTTP
- Express
- Fetch
- Node runtime operations
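For these instrumentations to take effect, tracing.js must be loaded before any other module in the process. A minimal sketch, assuming each service's entry point is index.js (the port is illustrative):
// index.js: load tracing first so http/express are patched before they are required
require("./tracing");

const express = require("express");
const app = express();

// ...routes for the service...

app.listen(process.env.PORT || 4000);
Alternatively, preload it with node -r ./tracing.js index.js so the application code stays untouched.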
2. API Gateway as Root Span
The API Gateway creates the root span for every request:
const { context, trace, SpanKind } = require("@opentelemetry/api");

const tracer = trace.getTracer("api-gateway");

app.use((req, res, next) => {
  // Root span for the incoming request
  const span = tracer.startSpan(`HTTP ${req.method} ${req.path}`, {
    kind: SpanKind.SERVER,
    attributes: {
      "service.name": "api-gateway",
      "http.method": req.method,
      "http.route": req.path,
    },
  });

  // Make the span active for everything that runs during this request
  const ctx = trace.setSpan(context.active(), span);
  context.with(ctx, () => {
    res.on("finish", () => span.end()); // close the span when the response is sent
    next();
  });
});
This ensures:
- Every request starts a new trace
- All downstream spans attach to this root
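In practice it also helps to record the outcome of the request on the root span before ending it. A small, hedged extension of the middleware above, using SpanStatusCode from @opentelemetry/api:
const { SpanStatusCode } = require("@opentelemetry/api");

// replaces the bare span.end() in the middleware above
res.on("finish", () => {
  span.setAttribute("http.status_code", res.statusCode);
  if (res.statusCode >= 500) {
    span.setStatus({ code: SpanStatusCode.ERROR });
  }
  span.end();
});
Failed requests then show up as errored root spans in Tempo, so you can spot them without opening every trace.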
3. Propagating Trace Headers
Trace context must be forwarded when calling downstream services:
const { propagation, context } = require("@opentelemetry/api");

// Inject the active trace context (traceparent header) into outgoing headers
function injectTraceHeaders(headers = {}) {
  const carrier = {};
  propagation.inject(context.active(), carrier);
  return { ...headers, ...carrier };
}

// Example downstream call
await fetch(`${AUTH_SERVICE_URL}/verify`, {
  method: "POST",
  headers: injectTraceHeaders({
    Authorization: token,
  }),
});
This links Auth Service and Order Service spans under the same trace.
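On the receiving side, the HTTP auto-instrumentation extracts these headers automatically, so spans created in the Auth Service become children of the gateway's root span. If you want an explicit span around the verification logic itself, here is a hedged sketch (the jsonwebtoken usage and JWT_SECRET variable are illustrative, not part of the original services):
const { trace } = require("@opentelemetry/api");
const jwt = require("jsonwebtoken"); // illustrative; any verification logic works here

const tracer = trace.getTracer("auth-service");

app.post("/verify", (req, res) => {
  // startActiveSpan creates a child of whatever span is active for this request
  tracer.startActiveSpan("verify-token", (span) => {
    try {
      const payload = jwt.verify(req.headers.authorization, process.env.JWT_SECRET);
      res.json({ valid: true, sub: payload.sub });
    } catch (err) {
      span.recordException(err);
      res.status(401).json({ valid: false });
    } finally {
      span.end();
    }
  });
});
The same pattern applies to the Order Service's get-orders handler.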
Observing Traces in Grafana
Once requests flow through the system, Grafana Tempo displays traces like:
api-gateway (root)
├── auth-service
│ └── verify-token
└── order-service
└── get-orders
This allows you to:
- See request latency per service
- Identify bottlenecks
- Debug failures visually
Cloud Deployment
| Component | Platform |
|---|---|
| API Gateway | Cloudflare Workers |
| Auth Service | Railway |
| Order Service | Railway |
| Tracing Backend | Grafana Cloud Tempo |
| Visualization | Grafana Cloud UI |
The same tracing code works across cloud boundaries; in practice only the exporter endpoint and credentials change, typically via environment variables, not the instrumentation itself.
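A hedged sketch of how the exporter in tracing.js can be pointed at Grafana Cloud instead of local Tempo. The environment variable names (GRAFANA_OTLP_ENDPOINT, GRAFANA_INSTANCE_ID, GRAFANA_API_TOKEN) are assumptions; take the actual OTLP endpoint and credentials from your Grafana Cloud stack's details page:
// tracing.js: exporter configured from environment variables
const exporter = new OTLPTraceExporter({
  // e.g. https://otlp-gateway-<region>.grafana.net/otlp/v1/traces for Grafana Cloud
  url: process.env.GRAFANA_OTLP_ENDPOINT || "http://tempo:4318/v1/traces",
  headers: process.env.GRAFANA_INSTANCE_ID
    ? {
        // Grafana Cloud's OTLP gateway expects basic auth: <instance id>:<API token>
        Authorization:
          "Basic " +
          Buffer.from(
            `${process.env.GRAFANA_INSTANCE_ID}:${process.env.GRAFANA_API_TOKEN}`
          ).toString("base64"),
      }
    : {},
});
Locally the variables stay unset and the exporter keeps sending to the Docker-based Tempo instance.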
What You’ll Learn
After completing this tutorial, you will understand:
- How distributed tracing works internally
- Why API Gateways must act as trace roots
- How trace context flows across services
- How Grafana Tempo visualizes microservice interactions
- How production systems are debugged using observability
Video Walkthrough
You can watch the complete step-by-step coding walkthrough on YouTube:
Node.js Microservices Observability — OpenTelemetry & Grafana
👉 https://youtu.be/wyiem6fc47Q (CodingMavrick)
Final Thoughts
Observability is not an optional add-on — it’s a core requirement for modern backend systems.
This project demonstrates how real microservices are:
- Designed
- Connected
- Observed
- Debugged
If you’re building Node.js microservices for production, this setup gives you the full picture — from first request to cloud trace visualization.
Happy tracing!
CodingMavrick