Ameni Ben Saada

Debugging Microservices: How Correlation IDs Cut Our Debug Time from Hours to Minutes

Hey dev friends! 💻

I'm back with a new article to share what I've recently learned. Today, we're going to talk about logging in microservices and how implementing proper logging transformed our debugging workflow from a nightmare into something actually manageable. 🚀

If you're working with microservices, you know the pain: something breaks in production, and you're jumping between different services trying to figure out what happened. Sound familiar? Let's fix that!


🤔 The Problem

Picture this: You have 5 microservices running. A user reports an error. You start investigating.

You check Service A → Nothing obvious.

You check Service B → Maybe something?

You check Service C → Still not sure.

Instead of clear answers, you're piecing together a puzzle from scattered log files, trying to match timestamps and hoping you'll find the connection. Hours pass. 😰

This is a common reality. Many teams have logs, but they aren't useful. Each service logs independently with no way to track a request's journey through the system.


💡 The Solution: Correlation IDs

The game changer? Correlation IDs.

Think of correlation IDs like a tracking number for your package 📦. Just like you can track a package through every shipping center it passes through, a correlation ID lets you track a request through every microservice it touches.

Every request gets a unique ID that follows it through your entire system. When something fails, you:

  • Search that one ID
  • See the complete request journey
  • Identify the exact failure point

Simple, powerful, effective.
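
To make the tracking-number idea concrete, here is a minimal, framework-free sketch. The `serviceA`, `serviceB`, and `serviceC` functions and the in-memory `logs` array are hypothetical stand-ins for real services and a real log store; the point is only that the ID is generated once and passed along unchanged.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical in-memory log store standing in for a real log backend.
interface LogEntry {
  service: string;
  correlationId: string;
  level: 'info' | 'error';
  message: string;
}

const logs: LogEntry[] = [];

function record(
  service: string,
  correlationId: string,
  level: LogEntry['level'],
  message: string,
): void {
  logs.push({ service, correlationId, level, message });
}

// Each "service" receives the ID from its caller and passes it on unchanged.
function serviceC(correlationId: string): void {
  record('service-c', correlationId, 'error', 'Validation failed');
}

function serviceB(correlationId: string): void {
  record('service-b', correlationId, 'info', 'Processing data');
  serviceC(correlationId);
}

function serviceA(): string {
  const correlationId = randomUUID(); // generated once, at the edge
  record('service-a', correlationId, 'info', 'Request received');
  serviceB(correlationId);
  return correlationId;
}

// Search by one ID: returns the request's journey in the order it was logged.
function traceRequest(correlationId: string): LogEntry[] {
  return logs.filter((entry) => entry.correlationId === correlationId);
}

// Identify the first failing hop, if there is one.
function findFailure(correlationId: string): LogEntry | undefined {
  return traceRequest(correlationId).find((entry) => entry.level === 'error');
}
```

Calling `findFailure(serviceA())` returns the `service-c` entry, even when entries from other requests are interleaved in the same store.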


🛠️ How to Build This System

Here's how you can implement a centralized logging system for your microservices:

→ Pino for structured, performant logging

→ Sentry for error tracking and real-time alerts

→ Custom NestJS providers for consistency across services

→ Correlation IDs to trace requests end-to-end

Let me show you how to do it step by step! 👇


📝 Step 1: Setting Up Pino in NestJS

First, install the necessary packages:

npm install pino pino-pretty pino-http
# pino bundles its own TypeScript types, so a separate @types/pino install is not needed

Why Pino? It's fast (really fast!), produces structured JSON logs, and has great NestJS support.


🎯 Step 2: Creating a Custom Logger Provider

Here's where the magic happens. We create a custom logger provider that includes correlation IDs in every log:

// logger.service.ts
import { Injectable, Scope } from '@nestjs/common';
import * as pino from 'pino';

@Injectable({ scope: Scope.TRANSIENT })
export class LoggerService {
  private logger: pino.Logger;
  private context: string;
  private correlationId: string;

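  constructor() {
    this.logger = pino({
      level: process.env.LOG_LEVEL || 'info',
      // Pretty-print only outside production; production emits raw JSON lines
      transport:
        process.env.NODE_ENV !== 'production'
          ? {
              target: 'pino-pretty',
              options: {
                colorize: true,
                translateTime: 'SYS:standard',
                ignore: 'pid,hostname',
              },
            }
          : undefined,
    });
  }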

  setContext(context: string) {
    this.context = context;
  }

  setCorrelationId(correlationId: string) {
    this.correlationId = correlationId;
  }

  private formatMessage(message: string, data?: any) {
    return {
      message,
      context: this.context,
      correlationId: this.correlationId,
      timestamp: new Date().toISOString(),
      ...data,
    };
  }

  log(message: string, data?: any) {
    this.logger.info(this.formatMessage(message, data));
  }

  error(message: string, trace?: string, data?: any) {
    this.logger.error(this.formatMessage(message, { trace, ...data }));
  }

  warn(message: string, data?: any) {
    this.logger.warn(this.formatMessage(message, data));
  }

  debug(message: string, data?: any) {
    this.logger.debug(this.formatMessage(message, data));
  }
}

What's happening here?

  • We create a transient-scoped service (each class that injects it gets its own instance, rather than one instance per request)
  • Each log includes: message, context, correlationId, timestamp
  • We support different log levels: info, error, warn, debug
  • Everything is structured as JSON for easy searching
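
One detail worth noting in `formatMessage`: because `...data` is spread last, a caller-supplied `correlationId` or `context` key would overwrite the logger's own value. Whether that is desirable is a design choice. Below is a minimal sketch of the alternative, where the reserved fields always win. `buildLogEntry` is a hypothetical standalone helper, not part of the service above.

```typescript
interface LogFields {
  message: string;
  context?: string;
  correlationId?: string;
  timestamp: string;
  [key: string]: unknown;
}

// Hypothetical standalone variant of formatMessage.
// Caller data is spread first, so the reserved fields below cannot be overwritten.
function buildLogEntry(
  message: string,
  context: string | undefined,
  correlationId: string | undefined,
  data: Record<string, unknown> = {},
): LogFields {
  return {
    ...data,
    message,
    context,
    correlationId,
    timestamp: new Date().toISOString(),
  };
}
```

With this ordering, `buildLogEntry('Fetching users', 'UserController', 'abc-123', { correlationId: 'spoofed' })` still reports `abc-123`.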

🔑 Step 3: Generating Correlation IDs

Now we need to generate and pass correlation IDs. We do this with a middleware:

// correlation-id.middleware.ts
import { Injectable, NestMiddleware } from '@nestjs/common';
import { Request, Response, NextFunction } from 'express';
import { v4 as uuidv4 } from 'uuid';

@Injectable()
export class CorrelationIdMiddleware implements NestMiddleware {
  use(req: Request, res: Response, next: NextFunction) {
    // Check if correlation ID already exists (from previous service)
    const correlationId = req.headers['x-correlation-id'] as string || uuidv4();

    // Attach it to the request
    req['correlationId'] = correlationId;

    // Add it to response headers
    res.setHeader('x-correlation-id', correlationId);

    next();
  }
}

Key points:

  • If a correlation ID exists (from another service), we use it
  • If not, we generate a new one using UUID
  • We attach it to both request and response
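
The reuse-or-generate decision can also be pulled out into a small pure function, which makes it easy to test. The sketch below is one possible shape: `resolveCorrelationId` is a hypothetical helper, and the 128-character cap and allowed character set are assumptions for illustration, not a standard. It also handles the case where Node delivers a repeated header as an array.

```typescript
import { randomUUID } from 'node:crypto';

// Header values in Node can be a string, an array of strings, or undefined.
type HeaderValue = string | string[] | undefined;

// Hypothetical helper: reuse an incoming ID when it looks sane, otherwise generate one.
// The length cap and character set are assumptions chosen for this sketch.
function resolveCorrelationId(header: HeaderValue): string {
  const candidate = Array.isArray(header) ? header[0] : header;
  if (candidate && /^[A-Za-z0-9._-]{1,128}$/.test(candidate)) {
    return candidate;
  }
  return randomUUID();
}
```

Inside the middleware, this would replace the inline expression: `const correlationId = resolveCorrelationId(req.headers['x-correlation-id']);`.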

🔗 Step 4: Using the Logger in Your Services

Register the middleware in your main module:

// app.module.ts
import { Module, NestModule, MiddlewareConsumer } from '@nestjs/common';
import { CorrelationIdMiddleware } from './correlation-id.middleware';
import { LoggerService } from './logger.service';

@Module({
  providers: [LoggerService],
  exports: [LoggerService],
})
export class AppModule implements NestModule {
  configure(consumer: MiddlewareConsumer) {
    consumer.apply(CorrelationIdMiddleware).forRoutes('*');
  }
}

Now use it in your controllers:

// user.controller.ts
import { Controller, Get, Req } from '@nestjs/common';
import { Request } from 'express';
import { LoggerService } from './logger.service';

@Controller('users')
export class UserController {
  constructor(private readonly logger: LoggerService) {
    this.logger.setContext('UserController');
  }

  @Get()
  async getUsers(@Req() req: Request) {
    const correlationId = req['correlationId'];
    this.logger.setCorrelationId(correlationId);

    this.logger.log('Fetching users');

    // Your business logic here (empty placeholder so the example compiles)
    const users: unknown[] = [];

    return users;
  }
}
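
One caveat with the controller above: NestJS controllers are singletons by default, so the logger injected into `UserController` is shared by every request that hits it. Calling `setCorrelationId` per request can therefore let two concurrent requests overwrite each other's ID. A common way around this in Node.js is `AsyncLocalStorage`, which keeps a value scoped to one asynchronous call chain. The sketch below is framework-free, and `runWithCorrelationId`, `currentCorrelationId`, and `handleRequest` are hypothetical names.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// One store per process; each request gets its own value inside run().
const correlationStore = new AsyncLocalStorage<string>();

// Hypothetical helper: run `fn` with `correlationId` visible to everything it calls,
// including code that resumes after an `await`.
function runWithCorrelationId<T>(correlationId: string, fn: () => T): T {
  return correlationStore.run(correlationId, fn);
}

// Hypothetical helper: read the ID for the current async call chain, if any.
function currentCorrelationId(): string | undefined {
  return correlationStore.getStore();
}

// Simulated request handler that yields to the event loop before "logging".
async function handleRequest(delayMs: number): Promise<string | undefined> {
  await new Promise((resolve) => setTimeout(resolve, delayMs));
  return currentCorrelationId();
}
```

In a NestJS app, the correlation middleware could call `runWithCorrelationId(correlationId, () => next())`, and the logger could read `currentCorrelationId()` inside `formatMessage` instead of relying on `setCorrelationId`.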

🚨 Step 5: Integrating Sentry for Error Tracking

Install Sentry:

npm install @sentry/node

Configure it in your main file:

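// main.ts
import { NestFactory } from '@nestjs/core';
import * as Sentry from '@sentry/node';
import { AppModule } from './app.module';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV,
  tracesSampleRate: 1.0,
});

async function bootstrap() {
  const app = await NestFactory.create(AppModule);

  // Add Sentry request and error handlers.
  // Note: Sentry.Handlers is the @sentry/node v7 API; it was removed in v8,
  // so check the Sentry docs for the SDK version you have installed.
  app.use(Sentry.Handlers.requestHandler());
  app.use(Sentry.Handlers.errorHandler());

  await app.listen(3000);
}

bootstrap();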

Update your logger to send errors to Sentry:

// logger.service.ts
// Requires at the top of the file: import * as Sentry from '@sentry/node';
error(message: string, trace?: string, data?: any) {
  const errorData = this.formatMessage(message, { trace, ...data });

  this.logger.error(errorData);

  // Also send to Sentry, with the correlation ID included in the extra data
  Sentry.captureException(new Error(message), {
    extra: errorData,
  });
}

🌐 Step 6: Passing Correlation IDs Between Services

When making HTTP calls to other microservices, pass the correlation ID:

// some.service.ts
import { HttpService } from '@nestjs/axios';
import { Injectable } from '@nestjs/common';
import { firstValueFrom } from 'rxjs';
import { LoggerService } from './logger.service';

@Injectable()
export class SomeService {
  constructor(
    private readonly httpService: HttpService,
    private readonly logger: LoggerService,
  ) {}

  async callAnotherService(correlationId: string, data: any) {
    // This logger instance is separate from the controller's, so set the ID here too
    this.logger.setCorrelationId(correlationId);
    this.logger.log('Calling Service B', { data });

    // firstValueFrom replaces the deprecated Observable.toPromise()
    return firstValueFrom(
      this.httpService.post('http://service-b/endpoint', data, {
        headers: {
          'x-correlation-id': correlationId, // ← Pass it along!
        },
      }),
    );
  }
}
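
To avoid repeating the header in every outgoing call, the propagation step can be factored into a small helper. This is a minimal sketch: `withCorrelationId` is a hypothetical name, and the header name matches the `x-correlation-id` used by the middleware in Step 3.

```typescript
const CORRELATION_HEADER = 'x-correlation-id';

// Hypothetical helper: returns a new headers object with the correlation ID set.
// The input is not mutated. Any existing copy of the header, in any letter case,
// is dropped so that only one value is sent.
function withCorrelationId(
  correlationId: string,
  headers: Record<string, string> = {},
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (value !== undefined && name.toLowerCase() !== CORRELATION_HEADER) {
      result[name] = value;
    }
  }
  result[CORRELATION_HEADER] = correlationId;
  return result;
}
```

An outgoing call would then pass `{ headers: withCorrelationId(correlationId) }` as its request options.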

📊 What This Looks Like in Practice

Before:

[2025-01-15 10:30:45] Request received
[2025-01-15 10:30:46] Processing data
[2025-01-15 10:30:47] Error: Operation failed

Which request? Which user? No idea. 🤷‍♀️

After:

{
  "level": "info",
  "message": "Request received",
  "context": "ApiController",
  "correlationId": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2025-01-15T10:30:45.123Z"
}

{
  "level": "info",
  "message": "Processing data",
  "context": "DataService",
  "correlationId": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2025-01-15T10:30:46.456Z"
}

{
  "level": "error",
  "message": "Operation failed: Validation error",
  "context": "DataService",
  "correlationId": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2025-01-15T10:30:47.789Z"
}

Now you can search for 550e8400-e29b-41d4-a716-446655440000 and see the entire request journey across all services! 🎯
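
In practice a log aggregation tool runs this search for you, but the operation itself is simple. The sketch below assumes each service emits newline-delimited JSON with the same field names as the sample entries above; `parseLogLines` and `journeyFor` are hypothetical helper names.

```typescript
interface ParsedLog {
  level: string;
  message: string;
  context: string;
  correlationId: string;
  timestamp: string;
}

// Parse newline-delimited JSON, skipping lines that are not valid JSON
// or that carry no correlationId (for example, stray plain-text output).
function parseLogLines(raw: string): ParsedLog[] {
  const entries: ParsedLog[] = [];
  for (const line of raw.split('\n')) {
    if (line.trim() === '') continue;
    try {
      const parsed = JSON.parse(line);
      if (parsed && typeof parsed.correlationId === 'string') {
        entries.push(parsed as ParsedLog);
      }
    } catch (err) {
      // Not JSON: ignore this line.
    }
  }
  return entries;
}

// Collect every entry for one request across services, oldest first.
// ISO 8601 UTC timestamps in the same format sort correctly as plain strings.
function journeyFor(correlationId: string, ...serviceLogs: string[]): ParsedLog[] {
  let all: ParsedLog[] = [];
  for (const log of serviceLogs) {
    all = all.concat(parseLogLines(log));
  }
  return all
    .filter((entry) => entry.correlationId === correlationId)
    .sort((a, b) => (a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0));
}
```

Passing the raw log output of each service to `journeyFor` returns that request's entries in chronological order, regardless of which service wrote them.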


✨ The Impact

Before implementing this:

  • Production issue? → Check each service manually
  • Try to correlate timestamps
  • Spend hours debugging

After implementation:

  • Production issue? → Search by correlation ID
  • See complete request flow instantly
  • Debug in minutes

Example scenario: A user reports an error. Instead of manually checking logs across multiple services, you can now search by the correlation ID and immediately see the request traveled through Service A → Service B → Service C, and exactly where it failed with the specific error message. Total time: under 10 minutes. ⚡

Without this system? That same investigation could take an hour or more of manually checking logs and trying to correlate timestamps.


🎯 Key Takeaways

What separates good logging from great logging:

Structure → JSON over plain text (searchable and parseable)

Context → Correlation IDs tracking every request

Proper levels → Debug/info/error used correctly, not everything as info

Centralization → One place to search everything

Real-time alerts → Sentry catches errors before users complain


💭 Final Thoughts

Setting up proper logging takes time upfront. A solid implementation might take a couple of days. But it can save you hours, sometimes days, on every production issue after that.

If you're building microservices, don't treat logging as an afterthought. It can be the difference between debugging in 10 minutes and debugging for hours.

Your future self will thank you. Trust me! 😊


What about you? 🤔 Have you implemented correlation IDs in your microservices? What challenges did you face? Share your experiences below!

And if you found this helpful, follow me for more articles about web development, NestJS, and DevOps practices. Let's learn together ❤️
