Sagyam Thapa

Originally published at blog.sagyamthapa.com.np

Distributed Tracing with OpenTelemetry and Jaeger for Nest Application

Introduction

Have you ever hit a bug in production and had no idea what went wrong because your logs did not tell you enough, or seen a request that takes unusually long to process?

Sometimes debugging these issues without a tracing system is close to impossible. A tracing system is like a CCTV camera that captures everything: what happened, when it happened, in what order, and how long each event took. This information is vital for debugging and identifying performance bottlenecks in complex distributed applications.

Prerequisites

  • Node.js

  • TypeScript

  • NestJS

  • Docker

Terminology

  • Trace: A trace is like a complete journey map of a single request as it moves through your entire distributed system. Imagine it as a detailed travel log that follows a request from its starting point to its final destination, capturing every stop and interaction along the way.

  • Instrumentation: The process of adding code to your application to collect telemetry data. It's like installing GPS trackers in different parts of your system.

  • Exporter: A component responsible for sending collected trace data to a back-end system for storage and analysis. Think of it as a postal service that sends your travel logs to a central archive.

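  • Span: A single timed operation within a trace, such as an HTTP call or a database query. A trace is made up of one or more spans.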

    • Root span: The first span in a trace, marking the beginning of the entire request journey. It's like the starting point of your travel log.
    • Child span: A span that is nested within another span, representing a more specific operation within a broader process.
  • Context propagation: The mechanism of transferring trace information between different services and components. It's like passing a traveler's passport that contains their complete journey details.

  • Metrics: Numerical data that tells us about the app's performance, health, and behavior.

  • Logs: Logs are text entries describing usage patterns, activities, and operations within your application.
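To make the trace, root span, and child span terms above concrete, here is a minimal sketch in plain TypeScript. The `SpanRecord` type, the helper functions, and the example span names are all hypothetical and simplified; they are not the OpenTelemetry SDK's own types.

```typescript
// Illustrative model only: a simplified stand-in, not the OpenTelemetry Span interface.
interface SpanRecord {
  traceId: string;        // shared by every span in one trace
  spanId: string;         // unique per span
  parentSpanId?: string;  // undefined for the root span
  name: string;
  startMs: number;
  endMs: number;
}

// The root span is the span in the trace that has no parent.
function findRoot(spans: SpanRecord[]): SpanRecord | undefined {
  return spans.find((s) => s.parentSpanId === undefined);
}

// Child spans are those whose parentSpanId points at the given span.
function childrenOf(spans: SpanRecord[], parent: SpanRecord): SpanRecord[] {
  return spans
    .filter((s) => s.parentSpanId === parent.spanId)
    .sort((a, b) => a.startMs - b.startMs);
}

// Render the trace as an indented tree with the duration of each span.
function renderTrace(spans: SpanRecord[]): string[] {
  const lines: string[] = [];
  const visit = (span: SpanRecord, depth: number): void => {
    lines.push(`${'  '.repeat(depth)}${span.name} (${span.endMs - span.startMs} ms)`);
    for (const child of childrenOf(spans, span)) visit(child, depth + 1);
  };
  const root = findRoot(spans);
  if (root) visit(root, 0);
  return lines;
}

// Hypothetical trace for a single GET /users request.
const exampleTrace: SpanRecord[] = [
  { traceId: 't1', spanId: 'a', name: 'GET /users', startMs: 0, endMs: 42 },
  { traceId: 't1', spanId: 'b', parentSpanId: 'a', name: 'UsersService.findAll', startMs: 5, endMs: 40 },
  { traceId: 't1', spanId: 'c', parentSpanId: 'b', name: 'prisma:client:operation', startMs: 8, endMs: 38 },
];

console.log(renderTrace(exampleTrace).join('\n'));
```

A tracing backend such as Jaeger draws essentially this tree, with the durations shown as bars on a shared timeline.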

Three horsemen of observability

Observability lets you understand a system from the outside by letting you ask questions about that system without knowing its inner workings. It allows you to easily troubleshoot and handle novel problems, that is, “unknown unknowns”. It also answers the question “Why is this happening?” It rests on the three signals defined above: traces, metrics, and logs.

Setting up the project

pnpm i -g @nestjs/cli
nest new tracing-app
cd tracing-app

Installing dependencies

Install the OpenTelemetry libraries (Jaeger itself runs later as a Docker container):

pnpm install @opentelemetry/sdk-trace-node @opentelemetry/resources @opentelemetry/sdk-trace-base 
pnpm install @opentelemetry/instrumentation @prisma/instrumentation @opentelemetry/instrumentation-net @opentelemetry/instrumentation-http @opentelemetry/instrumentation-express
pnpm install @opentelemetry/exporter-trace-otlp-http
pnpm install @opentelemetry/api @opentelemetry/semantic-conventions

Install Prisma ORM, SQLite, class-validator, and the Nest Swagger module:

pnpm install @prisma/client sqlite3 class-validator
pnpm install prisma --save-dev
pnpm install --save @nestjs/swagger

Initialize Prisma:

npx prisma init

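This will create a prisma directory with a schema.prisma file. Replace its contents with the following, which switches the datasource to SQLite and defines a User model: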

datasource db {
  provider = "sqlite"
  url      = "file:./dev.db"
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id    Int     @id @default(autoincrement())
  name  String
  email String  @unique
}

Run Prisma migrations:

npx prisma migrate dev --name init

Generate Prisma Client:

npx prisma generate

Setup a CRUD endpoint

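Generate a CRUD module for users. When prompted, choose REST API as the transport layer and answer yes to generating CRUD entry points: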

pnpm nest generate resource users

This will create a users module with a controller, service, and DTOs.

Create a prisma.service.ts file in a src/prisma folder:

import { Injectable, OnModuleInit, OnModuleDestroy } from '@nestjs/common';
import { PrismaClient } from '@prisma/client';

@Injectable()
export class PrismaService extends PrismaClient implements OnModuleInit, OnModuleDestroy {
  async onModuleInit() {
    await this.$connect();
  }

  async onModuleDestroy() {
    await this.$disconnect();
  }
}

Update the users.module.ts file to include the PrismaService:

import { Module } from '@nestjs/common';
import { UsersService } from './users.service';
import { UsersController } from './users.controller';
import { PrismaService } from '../prisma/prisma.service';

@Module({
  controllers: [UsersController],
  providers: [UsersService, PrismaService],
})
export class UsersModule { }

Create a file named create-user.dto.ts in the users/dto directory:

import { IsEmail, IsNotEmpty, IsString } from 'class-validator';
// PartialType is needed by the UpdateUserDto declaration at the bottom of this file
import { ApiProperty, PartialType } from '@nestjs/swagger';

export class CreateUserDto {
    @ApiProperty({
        description: 'The name of the user',
        example: 'John Doe',
    })
    @IsNotEmpty()
    @IsString()
    name: string;

    @ApiProperty({
        description: 'The email of the user',
        example: 'email@domain.com',
    })
    @IsNotEmpty()
    @IsEmail()
    email: string;
}

export class UpdateUserDto extends PartialType(CreateUserDto) {}

Update the users.service.ts file to use Prisma:

import { Injectable } from '@nestjs/common';
import { PrismaService } from '../prisma/prisma.service';
import { CreateUserDto } from './dto/create-user.dto';
import { UpdateUserDto } from './dto/update-user.dto';

@Injectable()
export class UsersService {
  constructor(private prisma: PrismaService) {}

  create(createUserDto: CreateUserDto) {
    return this.prisma.user.create({
      data: createUserDto,
    });
  }

  findAll() {
    return this.prisma.user.findMany();
  }

  findOne(id: number) {
    return this.prisma.user.findUnique({
      where: { id },
    });
  }

  update(id: number, updateUserDto: UpdateUserDto) {
    return this.prisma.user.update({
      where: { id },
      data: updateUserDto,
    });
  }

  remove(id: number) {
    return this.prisma.user.delete({
      where: { id },
    });
  }
}

Update the users.controller.ts file:

import { Controller, Get, Post, Body, Patch, Param, Delete } from '@nestjs/common';
import { UsersService } from './users.service';
import { CreateUserDto } from './dto/create-user.dto';
import { UpdateUserDto } from './dto/update-user.dto';
import { ApiCreatedResponse, ApiGoneResponse, ApiNotFoundResponse, ApiOkResponse, ApiOperation, ApiParam, ApiTags } from '@nestjs/swagger';

@ApiTags('users')
@Controller('users')
export class UsersController {
  constructor(private readonly usersService: UsersService) { }

  @ApiOperation({ summary: 'Create user' })
  // POST handlers in Nest respond with 201 Created by default
  @ApiCreatedResponse({ description: 'User created' })
  @Post()
  create(@Body() createUserDto: CreateUserDto) {
    return this.usersService.create(createUserDto);
  }

  @ApiOperation({ summary: 'Get all users' })
  @ApiOkResponse({ description: 'Users found' })
  @Get()
  findAll() {
    return this.usersService.findAll();
  }

  @ApiOperation({ summary: 'Get user by id' })
  @ApiOkResponse({ description: 'User found' })
  @ApiNotFoundResponse({ description: 'User not found' })
  @ApiParam({ name: 'id', description: 'User id' })
  @Get(':id')
  findOne(@Param('id') id: string) {
    return this.usersService.findOne(+id);
  }

  @ApiOperation({ summary: 'Update user' })
  @ApiOkResponse({ description: 'User updated' })
  @ApiNotFoundResponse({ description: 'User not found' })
  @ApiParam({ name: 'id', description: 'User id' })
  @Patch(':id')
  update(@Param('id') id: string, @Body() updateUserDto: UpdateUserDto) {
    return this.usersService.update(+id, updateUserDto);
  }

  @ApiOperation({ summary: 'Delete user' })
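  // A successful delete responds with 200 OK; 410 Gone describes a resource that is permanently unavailable
  @ApiOkResponse({ description: 'User deleted' })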
  @ApiParam({ name: 'id', description: 'User id' })
  @Delete(':id')
  remove(@Param('id') id: string) {
    return this.usersService.remove(+id);
  }
}
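The controller documents a "User not found" response, but Prisma's findUnique resolves to null when no row matches, so the service as written never raises a 404 for a lookup. One way to close that gap is a small helper that turns a missing result into an error. The sketch below is dependency-free so it can run on its own: `NotFoundError`, `orThrowNotFound`, and the in-memory `fakeFindUnique` are hypothetical names, and in the Nest app you would most likely throw NotFoundException from @nestjs/common instead.

```typescript
// Hypothetical stand-in for Nest's NotFoundException, so the sketch has no dependencies.
class NotFoundError extends Error {
  readonly status = 404;

  constructor(message: string) {
    super(message);
    this.name = 'NotFoundError';
  }
}

// Await a lookup and turn a null or undefined result into a NotFoundError.
async function orThrowNotFound<T>(lookup: Promise<T | null | undefined>, what: string): Promise<T> {
  const result = await lookup;
  if (result === null || result === undefined) {
    throw new NotFoundError(`${what} not found`);
  }
  return result;
}

// In-memory stand-in for this.prisma.user.findUnique({ where: { id } }).
type User = { id: number; name: string; email: string };
const fakeUsers: User[] = [{ id: 1, name: 'John Doe', email: 'email@domain.com' }];
const fakeFindUnique = async (id: number): Promise<User | null> =>
  fakeUsers.find((u) => u.id === id) ?? null;

async function demo(): Promise<void> {
  const found = await orThrowNotFound(fakeFindUnique(1), 'User 1');
  console.log(found.name);
  try {
    await orThrowNotFound(fakeFindUnique(2), 'User 2');
  } catch (err) {
    console.log((err as NotFoundError).status, (err as Error).message);
  }
}

demo();
```

In the service, the same idea would wrap the existing call, for example in findOne, so the controller's documented 404 matches what the endpoint actually does.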

Configuring exporters

Create a file tracing.ts in your src directory:

import { ATTR_SERVICE_NAME, ATTR_SERVICE_VERSION } from '@opentelemetry/semantic-conventions';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { NetInstrumentation } from '@opentelemetry/instrumentation-net';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { PrismaInstrumentation } from '@prisma/instrumentation';
import { Resource } from '@opentelemetry/resources';
import { diag, DiagConsoleLogger, DiagLogLevel } from '@opentelemetry/api';
import { registerInstrumentations } from '@opentelemetry/instrumentation';

export function setupTracing() {
    // Enable OpenTelemetry diagnostic logging
    diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.INFO);

    // Create a resource with service information
    const resource = new Resource({
        [ATTR_SERVICE_NAME]: process.env.SERVICE_NAME || 'tracer-app',
        [ATTR_SERVICE_VERSION]: process.env.npm_package_version || '1.0.0',
    });

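    // The url option is used as-is, so read the signal-specific variable, which includes the /v1/traces path
    const otlpExporter = new OTLPTraceExporter({
        url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT || 'http://localhost:4318/v1/traces',
    });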


    // Create tracer provider with resource and span processors
    const provider = new NodeTracerProvider({
        resource,
        spanProcessors: [
            new BatchSpanProcessor(otlpExporter, {
                maxQueueSize: 100,
                scheduledDelayMillis: 5000,
                exportTimeoutMillis: 30000,
                maxExportBatchSize: 50,
            })
        ]
    });

    // Register instrumentations with more comprehensive coverage
    registerInstrumentations({
        tracerProvider: provider,
        instrumentations: [
            new HttpInstrumentation({
                requestHook: (span, request) => {
                    span.setAttribute('http.request.method', request.method);
                },
            }),
            new NetInstrumentation(),
            new ExpressInstrumentation(),
            new PrismaInstrumentation({ middleware: true }),
        ],
    });

    // Register the provider
    provider.register();

    // Return the provider for potential manual instrumentation
    return provider;
}

// Call this at application startup
setupTracing();
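A note on the exporter URL: the OpenTelemetry specification defines two environment variables for OTLP over HTTP. OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is a full URL that is used as-is, while OTEL_EXPORTER_OTLP_ENDPOINT is a base URL to which /v1/traces is appended. The exporter's url option is taken literally, so a base URL passed there would miss the path. The following sketch shows one way to resolve the value by hand; `resolveTracesUrl` is a hypothetical helper, not part of any OpenTelemetry package.

```typescript
type Env = Record<string, string | undefined>;

// Resolve the OTLP/HTTP traces URL following the precedence described above.
function resolveTracesUrl(env: Env): string {
  // 1. Signal-specific variable: a full URL, used as-is.
  const signalSpecific = env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT;
  if (signalSpecific) return signalSpecific;

  // 2. Generic variable: a base URL, so the traces path has to be appended.
  const base = env.OTEL_EXPORTER_OTLP_ENDPOINT;
  if (base) return `${base.replace(/\/+$/, '')}/v1/traces`;

  // 3. Fallback used in this article: a local Jaeger listening for OTLP over HTTP.
  return 'http://localhost:4318/v1/traces';
}

console.log(resolveTracesUrl({}));
console.log(resolveTracesUrl({ OTEL_EXPORTER_OTLP_ENDPOINT: 'http://jaeger:4318' }));
```

Inside tracing.ts the result of `resolveTracesUrl(process.env)` could then be handed to the exporter's url option.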

Explanation

  • Diagnostic Logging: Enables diagnostic logging using a console logger at the INFO level to debug tracing setup.

  • Resource Initialization:

    • Defines metadata about the service, like SERVICE_NAME and SERVICE_VERSION.
    • This metadata is attached to every trace and helps identify which service the trace belongs to.
  • OTLP Trace Exporter: Configures the OpenTelemetry Protocol (OTLP) exporter to send trace data over HTTP to the backend, which is Jaeger for now. Jaeger can be swapped for another backend such as Honeycomb or Zipkin, and you can switch from HTTP to the more efficient gRPC transport.

  • Tracer Provider with Span Processor: Creates a NodeTracerProvider, which manages tracers and spans:

    • Resource: Includes service metadata.
    • BatchSpanProcessor: Buffers spans and exports them in batches to minimize performance impact. Key configurations:

      • maxQueueSize: Maximum number of spans held in the buffer. Once the queue is full, new spans are dropped.
      • scheduledDelayMillis: Frequency of flushing spans.
      • exportTimeoutMillis: Max time allowed for export.
      • maxExportBatchSize: Maximum spans per export batch.
  • Register Instrumentation: Automatically captures traces for libraries and frameworks:

    • HttpInstrumentation: Captures time taken by HTTP requests/responses.
    • NetInstrumentation: Captures time taken by low-level networking events.
    • ExpressInstrumentation: Tracks time taken by Express middleware and routes.
    • PrismaInstrumentation: Tracks time taken by SQL queries generated by Prisma
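To illustrate how maxQueueSize and maxExportBatchSize interact, here is a simplified, synchronous sketch of a batching queue. `TinyBatchQueue` is a hypothetical class written for this explanation; the real BatchSpanProcessor is asynchronous and timer-driven, so treat this as a mental model rather than its implementation.

```typescript
interface BatchOptions {
  maxQueueSize: number;       // upper bound on buffered items
  maxExportBatchSize: number; // upper bound on items per export call
}

class TinyBatchQueue<T> {
  private items: T[] = [];
  private options: BatchOptions;
  private exportBatch: (batch: T[]) => void;
  dropped = 0;

  constructor(options: BatchOptions, exportBatch: (batch: T[]) => void) {
    this.options = options;
    this.exportBatch = exportBatch;
  }

  // Called when a span ends: buffer it, or drop it if the queue is already full.
  add(item: T): void {
    if (this.items.length >= this.options.maxQueueSize) {
      this.dropped += 1;
      return;
    }
    this.items.push(item);
  }

  // Stands in for the scheduled timer (scheduledDelayMillis): export at most one batch.
  tick(): void {
    if (this.items.length === 0) return;
    const batch = this.items.splice(0, this.options.maxExportBatchSize);
    this.exportBatch(batch);
  }

  get pending(): number {
    return this.items.length;
  }
}

// Seven spans arrive, but only five fit in the queue; each tick exports two.
const exportedBatches: number[][] = [];
const spanQueue = new TinyBatchQueue<number>(
  { maxQueueSize: 5, maxExportBatchSize: 2 },
  (batch) => { exportedBatches.push(batch); },
);
for (let i = 1; i <= 7; i++) spanQueue.add(i);
spanQueue.tick();

console.log(exportedBatches, spanQueue.pending, spanQueue.dropped);
```

The takeaway for tuning: a queue that is too small for your traffic silently loses spans, so maxQueueSize should comfortably exceed the number of spans produced between two exports.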

Inject instrumenting code in your application

Import and initialize the tracing configuration in your main application file main.ts:

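// Load tracing first so the instrumentations can patch http, express and Prisma before those modules are imported
import './tracing';
import { NestFactory } from '@nestjs/core';
import { SwaggerModule, DocumentBuilder } from '@nestjs/swagger';
import { AppModule } from './app.module';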

async function bootstrap() {
  const app = await NestFactory.create(AppModule);

  const config = new DocumentBuilder()
    .setTitle('Tracing example')
    .setDescription('The tracing API description')
    .setVersion('1.0')
    .addTag('tracing')
    .build();
  const documentFactory = () => SwaggerModule.createDocument(app, config);
  SwaggerModule.setup('api-docs', app, documentFactory);

  await app.listen(process.env.PORT ?? 3000);
}
bootstrap();
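Once tracing is loaded, the context propagation described in the Terminology section happens through HTTP headers. By default OpenTelemetry JS uses the W3C Trace Context format, where a traceparent header carries the trace id and the caller's span id. The sketch below parses and formats that header by hand to show what travels between services; `parseTraceParent` and `formatTraceParent` are hypothetical helpers, it only handles version 00, and in the app itself the instrumentation does this for you.

```typescript
// W3C Trace Context "traceparent" header: version-traceId-parentId-flags
interface TraceParent {
  version: string;  // 2 hex characters, "00" in the current specification
  traceId: string;  // 32 hex characters, shared by the whole trace
  parentId: string; // 16 hex characters, the span id of the caller
  sampled: boolean; // lowest bit of the flags byte
}

const TRACEPARENT_PATTERN = /^(00)-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceParent(header: string): TraceParent | null {
  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (!match) return null;
  const version = match[1];
  const traceId = match[2];
  const parentId = match[3];
  const flags = match[4];
  // All-zero ids are invalid according to the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

function formatTraceParent(tp: TraceParent): string {
  return `${tp.version}-${tp.traceId}-${tp.parentId}-${tp.sampled ? '01' : '00'}`;
}

// Example header value taken from the W3C Trace Context specification.
const incoming = '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01';
console.log(parseTraceParent(incoming));
```

A downstream service that receives this header starts its own span with the same trace id and uses the parent id to attach itself to the caller's span, which is how Jaeger can stitch spans from several services into one trace.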

Run Your Application

pnpm run start:dev

Setting up Jaeger for a development environment

The easiest way to set up Jaeger is with Docker Compose, which works fine for a development environment.

Create a docker-compose.yaml file with the content below, then start Jaeger by running docker compose up -d:

services:
  jaeger:
    image: jaegertracing/all-in-one:1.63.0
    container_name: jaeger
    environment:
      COLLECTOR_OTLP_ENABLED: "true"
    ports:
      - "4317:4317" # OTLP over gRPC
      - "4318:4318" # OTLP over HTTP
      - "16686:16686" # Web UI

networks:
  default:
    driver: bridge

Containerize app (optional)

If you have a recent version of Docker installed, you can use the docker init command to generate an optimized Dockerfile automatically. Otherwise, a Dockerfile like the following works:

# Arguments for versions
ARG NODE_VERSION=20.18.0
ARG PNPM_VERSION=9.12.2
ARG ALPINE_VERSION=3.20

################################################################################
# Base stage: Build the application
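FROM node:${NODE_VERSION}-alpine${ALPINE_VERSION} AS builder

# Re-declare the argument: ARGs defined before FROM are not visible inside a build stage
ARG PNPM_VERSION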

# Set working directory
WORKDIR /usr/src/app

# Install pnpm globally with cache
RUN --mount=type=cache,target=/root/.npm \
    npm install -g pnpm@${PNPM_VERSION}

# Copy package.json and pnpm-lock.yaml to install dependencies (paths are relative to the build context)
COPY package.json pnpm-lock.yaml ./

# Install dependencies with cache
RUN --mount=type=cache,target=/root/.pnpm-store \
    pnpm install --frozen-lockfile

# Copy all application code
COPY . .

# Setup prisma
RUN pnpm prisma generate

# Build the application
RUN pnpm run build

# Runner Stage
FROM node:${NODE_VERSION}-alpine${ALPINE_VERSION} AS runner

# Re-declare the argument for this stage as well
ARG PNPM_VERSION

# Set working directory
WORKDIR /usr/src/app

# Copy the built application from the builder stage
COPY --from=builder /usr/src/app/dist ./dist
COPY package.json pnpm-lock.yaml ./
COPY prisma/schema.prisma ./prisma/schema.prisma

# Install pnpm globally
RUN --mount=type=cache,target=/root/.npm \
    npm install -g pnpm@${PNPM_VERSION}

# Install dependencies with cache
RUN --mount=type=cache,target=/root/.pnpm-store \
    pnpm install --frozen-lockfile --prod

# Set NODE_ENV to production
ENV NODE_ENV=production

# Run the application
CMD ["pnpm", "run", "start:prod"]

Swagger UI

Visit http://localhost:3000/api-docs and make some API calls.

Visualizing traces

Open your browser and go to http://localhost:16686 to see the Jaeger UI. Send a few requests to the API, select the service in the search panel, click Find Traces, and then click on a trace to inspect its spans.
