Adnan Rahić for Kubeshop

Posted on • Originally published at tracetest.io

Monitoring and Testing Cloud Native APIs with Grafana

Grafana, when combined with distributed tracing, is widely used for troubleshooting and diagnosing problems. What if you could use the data captured in the distributed trace as part of your testing strategy to prevent errors from reaching production in the first place?

By combining Grafana Tempo with Tracetest, you can create a robust solution for monitoring and testing APIs with distributed tracing.

This tutorial guides you through setting up and using Docker Compose to run Grafana Tempo and Tracetest, enabling effective monitoring and testing of your APIs.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687272161/Blogposts/grafana-tracetest/screely-1687272153898_ei28gt.png

See the full code for the example app you’ll build in the GitHub repo, here.

Microservices are Hard to Monitor…

I’ll use a sample microservice app called Pokeshop to demo distributed tracing and how to forward traces to Grafana Tempo.

It consists of 5 services.

  1. Node.js API
    1. HTTP
    2. gRPC
  2. Node.js Worker
  3. RabbitMQ (Queue)
  4. Redis (Cache)
  5. Postgres

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687427148/Blogposts/grafana-tracetest/pokeshop-draw-jun22_f3lvgo.png

I’ve prepared a docker-compose.yaml file with the Pokeshop services. Check it out here.

version: "3"
services:

  # Demo
  postgres:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_USER: postgres
    healthcheck:
      test: pg_isready -U "$$POSTGRES_USER" -d "$$POSTGRES_DB"
      interval: 1s
      timeout: 5s
      retries: 60
    ports:
      - 5432:5432

  demo-cache:
    image: redis:6
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 1s
      timeout: 3s
      retries: 60

  demo-queue:
    image: rabbitmq:3.8-management
    restart: unless-stopped
    healthcheck:
      test: rabbitmq-diagnostics -q check_running
      interval: 1s
      timeout: 5s
      retries: 60

  demo-api:
    image: kubeshop/demo-pokemon-api:latest
    restart: unless-stopped
    pull_policy: always
    environment:
      REDIS_URL: demo-cache
      DATABASE_URL: postgresql://postgres:postgres@postgres:5432/postgres?schema=public
      RABBITMQ_HOST: demo-queue
      POKE_API_BASE_URL: https://pokeapi.co/api/v2
      COLLECTOR_ENDPOINT: http://otel-collector:4317
      NPM_RUN_COMMAND: api
    ports:
      - "8081:8081"
    healthcheck:
      test: ["CMD", "wget", "--spider", "localhost:8081"]
      interval: 1s
      timeout: 3s
      retries: 60
    depends_on:
      postgres:
        condition: service_healthy
      demo-cache:
        condition: service_healthy
      demo-queue:
        condition: service_healthy

  demo-worker:
    image: kubeshop/demo-pokemon-api:latest
    restart: unless-stopped
    pull_policy: always
    environment:
      REDIS_URL: demo-cache
      DATABASE_URL: postgresql://postgres:postgres@postgres:5432/postgres?schema=public
      RABBITMQ_HOST: demo-queue
      POKE_API_BASE_URL: https://pokeapi.co/api/v2
      COLLECTOR_ENDPOINT: http://otel-collector:4317
      NPM_RUN_COMMAND: worker
    depends_on:
      postgres:
        condition: service_healthy
      demo-cache:
        condition: service_healthy
      demo-queue:
        condition: service_healthy

  demo-rpc:
    image: kubeshop/demo-pokemon-api:latest
    restart: unless-stopped
    pull_policy: always
    environment:
      REDIS_URL: demo-cache
      DATABASE_URL: postgresql://postgres:postgres@postgres:5432/postgres?schema=public
      RABBITMQ_HOST: demo-queue
      POKE_API_BASE_URL: https://pokeapi.co/api/v2
      COLLECTOR_ENDPOINT: http://otel-collector:4317
      NPM_RUN_COMMAND: rpc
    ports:
      - 8082:8082
    healthcheck:
      test: ["CMD", "lsof", "-i", ":8082"]
      interval: 1s
      timeout: 3s
      retries: 60
    depends_on:
      postgres:
        condition: service_healthy
      demo-cache:
        condition: service_healthy
      demo-queue:
        condition: service_healthy
  # Demo End

OpenTelemetry Instrumentation in the Pokeshop Microservice App

The Pokeshop is configured with OpenTelemetry code instrumentation using the official tracing libraries. These libraries will capture and propagate distributed traces across the Pokeshop microservice app.

The tracing libraries are configured to send traces to the OpenTelemetry Collector, which then forwards them to Grafana Tempo. This setup is explained in the following section.

Open tracing.ts to see how the OpenTelemetry SDK is set up to instrument your code. It contains all the required modules and helper functions.

// tracing.ts

import * as opentelemetry from '@opentelemetry/api';
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc';
import { Resource } from '@opentelemetry/resources';
import * as dotenv from 'dotenv';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
import { SpanStatusCode } from '@opentelemetry/api';

dotenv.config(); // Loaded from .env

const { COLLECTOR_ENDPOINT = '', SERVICE_NAME = 'pokeshop' } = process.env;

// [rest of the file]
// ...

I’m using an env var for the OpenTelemetry Collector endpoint. See the .env file here.

DATABASE_URL="postgresql://ashketchum:squirtle123@localhost:5434/pokeshop?schema=public"
REDIS_URL=localhost
RABBITMQ_HOST=localhost
POKE_API_BASE_URL=https://pokeapi.co/api/v2
COLLECTOR_ENDPOINT=http://localhost:4317
APP_PORT=8081
RPC_PORT=8082

The rest of the tracing.ts file contains helper methods for creating trace spans.

// tracing.ts

// [...]

let globalTracer: opentelemetry.Tracer | null = null;

async function createTracer(): Promise<opentelemetry.Tracer> {
  const collectorExporter = new OTLPTraceExporter({
    url: COLLECTOR_ENDPOINT,
  });

  const sdk = new NodeSDK({
    traceExporter: collectorExporter,
    instrumentations: [],
  });

  sdk.addResource(
    new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: SERVICE_NAME,
    })
  );

  await sdk.start();
  process.on('SIGTERM', () => {
    sdk
      .shutdown()
      .then(
        () => console.log('SDK shut down successfully'),
        err => console.log('Error shutting down SDK', err)
      )
      .finally(() => process.exit(0));
  });

  const tracer = opentelemetry.trace.getTracer(SERVICE_NAME);

  globalTracer = tracer;

  return globalTracer;
}

async function getTracer(): Promise<opentelemetry.Tracer> {
  if (globalTracer) {
    return globalTracer;
  }

  return createTracer();
}

async function getParentSpan(): Promise<opentelemetry.Span | undefined> {
  const parentSpan = opentelemetry.trace.getSpan(opentelemetry.context.active());
  if (!parentSpan) {
    return undefined;
  }

  return parentSpan;
}

async function createSpan(
  name: string,
  parentSpan?: opentelemetry.Span | undefined,
  options?: opentelemetry.SpanOptions | undefined
): Promise<opentelemetry.Span> {
  const tracer = await getTracer();
  if (parentSpan) {
    const context = opentelemetry.trace.setSpan(opentelemetry.context.active(), parentSpan);

    return createSpanFromContext(name, context, options);
  }

  return tracer.startSpan(name);
}

async function createSpanFromContext(
  name: string,
  ctx: opentelemetry.Context,
  options?: opentelemetry.SpanOptions | undefined
): Promise<opentelemetry.Span> {
  const tracer = await getTracer();
  if (!ctx) {
    return tracer.startSpan(name, options, opentelemetry.context.active());
  }

  return tracer.startSpan(name, options, ctx);
}

async function runWithSpan<T>(parentSpan: opentelemetry.Span, fn: () => Promise<T>): Promise<T> {
  const ctx = opentelemetry.trace.setSpan(opentelemetry.context.active(), parentSpan);

  try {
    return await opentelemetry.context.with(ctx, fn);
  } catch (ex) {
    parentSpan.recordException(ex);
    parentSpan.setStatus({ code: SpanStatusCode.ERROR });
    throw ex;
  }
}

export { getTracer, getParentSpan, createSpan, createSpanFromContext, runWithSpan };

Monitoring with Grafana, Tempo and OpenTelemetry Collector

Grafana Tempo is a powerful solution for monitoring and testing APIs using distributed tracing. Tempo provides a highly scalable, cost-effective, and easy-to-use trace data store. It’s optimized for trace visualization with Grafana. With Tempo, you can monitor and test your APIs in real time. This allows you to identify potential bottlenecks or performance issues and respond quickly to ensure the reliability and performance of your APIs.

In this section, you’ll learn how to configure:

  • Grafana Tempo. First, you’ll set up Grafana Tempo to receive and store traces from the Pokeshop app. It will need the OpenTelemetry Collector as the main trace receiver and forwarder.
  • OpenTelemetry Collector. The OpenTelemetry Collector will receive traces from the Pokeshop app and forward them to Grafana Tempo.
  • Grafana. Lastly, I’ll explain how to configure Grafana to read trace data from Tempo.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687427573/Blogposts/grafana-tracetest/pokeshop_grafana-draw-jun22_e2wek4.png

Adding OpenTelemetry Collector, Tempo and Grafana to Docker Compose

You need to add three more services to the Docker Compose file.

# docker-compose.yaml

[...]

  # Grafana
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.59.0
    command:
      - "--config"
      - "/otel-local-config.yaml"
    volumes:
      - ./collector.config.yaml:/otel-local-config.yaml
    depends_on:
      - tempo
  tempo:
    image: grafana/tempo:latest
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - ./tempo.config.yaml:/etc/tempo.yaml
      - ./tempo-data:/tmp/tempo
    ports:
      - "3200"   # tempo
      - "4317"  # otlp grpc
      - "4318"  # otlp http
  grafana:
    image: grafana/grafana:9.4.3
    volumes:
      - ./grafana.config.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
      - GF_AUTH_DISABLE_LOGIN_FORM=true
      - GF_FEATURE_TOGGLES_ENABLE=traceqlEditor
    ports:
      - "3000:3000"
  # Grafana End

Each of these three services loads a dedicated config file. Keep the config files in the same directory as the docker-compose.yaml file. Let’s move on to the configuration!

OpenTelemetry Collector Configuration

The OpenTelemetry Collector is configured via a config file. Let’s configure it to ingest traces on the default HTTP and gRPC ports via the OTLP protocol.

  • HTTP: 4318
  • gRPC: 4317

Create a file called collector.config.yaml.

# collector.config.yaml

receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:
    timeout: 100ms

exporters:
  logging:
    loglevel: debug
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]

The exporter config defines where traces are sent: in this case, Tempo. The Tempo ingestion endpoint also speaks OTLP, on the same port as the OpenTelemetry Collector, 4317.

Now, let’s configure Tempo to receive the traces.

Grafana Tempo Configuration

Tempo is configured with a config file. Create another file in the same directory as the docker-compose.yaml, called tempo.config.yaml.

# tempo.config.yaml

auth_enabled: false

server:
  http_listen_port: 3200
  grpc_listen_port: 9095

distributor:
  receivers:
    otlp:
      protocols:
        http:
        grpc:

ingester:
  trace_idle_period: 10s
  max_block_bytes: 1_000_000
  max_block_duration: 5m

compactor:
  compaction:
    compaction_window: 1h
    max_compaction_objects: 1000000
    block_retention: 1h
    compacted_block_retention: 10m

storage:
  trace:
    backend: local
    wal:
      path: /tmp/tempo/wal
    local:
      path: /tmp/tempo/blocks
    pool:
      max_workers: 100
      queue_depth: 10000

The important configs to note are the server and distributor sections.

The server section defines how to access and query Tempo, while the distributor defines how traces are ingested into Tempo.

  • Use port 3200 to query for traces from Tempo in the Grafana dashboards.
  • Use port 9095 to query for traces from Tracetest when running integration tests.

Let’s set up Grafana and explore the trace data.

Configuring Grafana Data Sources

Grafana data sources are defined in a provisioning config file. Create another file in the same directory as the docker-compose.yaml and name it grafana.config.yaml.

# grafana.config.yaml

apiVersion: 1

datasources:
- name: Tempo
  type: tempo
  access: proxy
  orgId: 1
  url: http://tempo:3200
  basicAuth: false
  isDefault: true
  version: 1
  editable: false
  apiVersion: 1
  uid: tempo

You can see that the URL field matches the Tempo http_listen_port.

View Traces in Grafana

With Tempo, OpenTelemetry Collector and Grafana added, restart your Docker Compose.

docker compose down
docker compose up --build

Trigger a simple cURL request to generate a few traces.

curl -d '{"id":"6"}' -H "Content-Type: application/json" -X POST http://localhost:8081/pokemon/import

Open Grafana at http://localhost:3000 and go to Explore. Choose the Tempo data source and the TraceQL tab.

Add and run the query below.

{ name="POST /pokemon/import" }

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687203262/Blogposts/grafana-tracetest/screely-1687203243668_r8mlu6.png

Choose a trace from here and it will open up in the panel on the right. With OpenTelemetry instrumentation and Grafana configuration, you can elevate your trace debugging and validation, as well as build integration tests to validate API behavior.
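TraceQL can also go beyond exact name matching. As a rough sketch, the queries below filter on a resource attribute combined with a duration threshold, and on span status (the service name pokeshop comes from the SERVICE_NAME env var shown earlier; the first query returns traces from the Pokeshop service slower than 100 ms, the second returns traces containing an errored span):

```
{ resource.service.name = "pokeshop" && duration > 100ms }
{ status = error }
```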

Trace Validation and Integration Testing with Tracetest

Tracetest is an open-source project, part of the CNCF landscape. It allows you to quickly build integration and end-to-end tests, powered by your distributed traces.

Tracetest uses your existing distributed traces to power trace-based testing with assertions against your trace data at every point of the request transaction.

You only need to point Tracetest to your Tempo instance, or send traces to Tracetest directly!

With Tracetest you can:

  • Define tests and assertions against every single microservice that a trace goes through.
  • Work with your existing distributed tracing solution, allowing you to build tests based on your already instrumented system.
  • Define multiple transaction triggers, such as a GET against an API endpoint, a gRPC request, etc.
  • Define assertions against both the response and trace data, ensuring both your response and the underlying processes worked correctly, quickly, and without errors.
  • Save and run the tests manually or via CI build jobs with the Tracetest CLI.

Install and Configure Tracetest

Tracetest runs as a container in your Docker Compose stack, just like Tempo, or the OpenTelemetry Collector.

Start by adding Tracetest to the docker-compose.yaml.

[...]

  # Tracetest
  tracetest:
    image: kubeshop/tracetest:${TAG:-latest}
    platform: linux/amd64
    volumes:
      - type: bind
        source: ./tracetest.config.yaml
        target: /app/tracetest.yaml
      - type: bind
        source: ./tracetest.provision.yaml
        target: /app/provisioning.yaml
    ports:
      - 11633:11633
    command: --provisioning-file /app/provisioning.yaml
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      postgres:
        condition: service_healthy
      otel-collector:
        condition: service_started
    healthcheck:
      test: ["CMD", "wget", "--spider", "localhost:11633"]
      interval: 1s
      timeout: 3s
      retries: 60
    environment:
      TRACETEST_DEV: ${TRACETEST_DEV}
  # Tracetest End

Tracetest requires a Postgres instance to store its test data, and it connects to Postgres via a configuration file. Create a tracetest.config.yaml file in the same directory as the docker-compose.yaml.

# tracetest.config.yaml

---
postgres:
  host: postgres
  user: postgres
  password: postgres
  port: 5432
  dbname: postgres
  params: sslmode=disable

Connecting Tracetest to Grafana Tempo can be done in the Web UI, but it’s just as easy with a provisioning file. Create a tracetest.provision.yaml file like this.

# tracetest.provision.yaml

---
type: PollingProfile
spec:
  name: Default
  strategy: periodic
  default: true
  periodic:
    retryDelay: 5s
    timeout: 10m

---
type: DataStore
spec:
  name: Tempo
  type: tempo
  tempo:
    type: grpc
    grpc:
      endpoint: tempo:9095
      tls:
        insecure: true

---
type: Demo
spec:
  type: pokeshop
  enabled: true
  name: pokeshop
  opentelemetryStore: {}
  pokeshop:
    httpEndpoint: http://demo-api:8081
    grpcEndpoint: demo-rpc:8082

Remember exposing port 9095 for Tempo? You’re using it here to query for traces with Tracetest when running integration tests.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687427895/Blogposts/grafana-tracetest/pokeshop_grafana_tracetest-draw-jun22_vetdhq.png

Restart Docker Compose.

docker compose down
docker compose up --build

Navigate to http://localhost:11633 and open the settings. You’ll see Tempo selected and the endpoint set to tempo:9095.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1688650947/Blogposts/grafana-tracetest/screely-1687206947865_1_jfxp2v.png

You can also configure this manually without the provision file.

The Demo section in the provisioning file enables a preset of tests against the Pokeshop API for easier test definition. Omitting it has no impact on the rest of the setup.

Let’s jump into validating the traces generated by the Pokeshop API.

Validate API Traces Against OpenTelemetry Rules and Standards

The Tracetest Analyzer is the first-ever tool to analyze traces! It can validate traces, identify patterns, and fix issues with code instrumentation. It’s the easy way to adhere to OpenTelemetry rules and standards to ensure high-quality telemetry data.

Let’s create a new test in Tracetest and run the Analyzer.

To create a test in Tracetest, see the docs or follow these instructions:

  1. Click Create
  2. Click Create New Test
  3. Select HTTP Request
  4. Add a name for your test
  5. The URL field should be POST http://demo-api:8081/pokemon/import
  6. The Header list should be Content-Type: application/json
  7. Set the Request Body type to JSON with the content {"id":6}
  8. Click Create and Run

This will trigger the test and display a distributed trace in the Trace tab and run validations against it.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687264364/Blogposts/grafana-tracetest/screely-1687264357784_ugzefx.png

This allows you to validate your OpenTelemetry instrumentation before committing code. All rules and standards you need to adhere to will be displayed for you to see exactly what to improve!

Next, when you’re happy with the traces, move on to creating test specifications.

Define Test Scenarios with Tracetest

This section will cover adding four different test scenarios.

  1. Validate that all HTTP spans return a status code 200.
  2. Validate that a span exists after the RabbitMQ queue, meaning a value has been picked up from it.
  3. Validate that Redis is using the correct Pokemon id.
  4. Validate that Postgres is inserting the correct Pokemon.

Opening the Test tab will let you create Test Specs.

Adding Common Test Specs from Snippets

Once you land on the Test tab, you’ll be greeted with 6 test snippets for common test cases.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687270092/Blogposts/grafana-tracetest/screely-1687270083384_npdbdq.png

These assertions will validate the properties from the trace spans the Pokeshop API generates.

By default, Tracetest will give you snippets to add common assertions like:

  • All HTTP spans return the status code 200
  • All database spans return in less than 100ms

Start by adding a first test spec for validating all HTTP spans return status code 200.

  • Click All HTTP Spans: Status code is 200
  • Save Test Spec
  • Save

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687270388/Blogposts/grafana-tracetest/screely-1687270382058_x2ilav.png

This case is common and easy to test with traditional tools. Running tests on message queues and caches, however, is not. Let’s jump into that.

Adding Test Cases for RabbitMQ, Redis and Postgres

Create another test spec by clicking on the import pokemon span and the Add Test Spec button.

To learn more about selectors and expressions check the docs.

The selector you need is:

span[tracetest.span.type="general" name="import pokemon"]

Validating that this span exists confirms that the value has been picked up from the RabbitMQ queue.

attr:tracetest.selected_spans.count = 1

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687271255/Blogposts/grafana-tracetest/screely-1687271249650_boqri6.png

Save the test spec and move on to adding one for Redis. To validate that Redis is using the correct Pokemon ID, compare it to the value returned from Redis.

Select the Redis span. You’ll use this selector:

span[tracetest.span.type="database" name="get pokemon_6" db.system="redis" db.operation="get" db.redis.database_index="0"]

And this assertion:

attr:db.payload = '{"key":"pokemon_6"}'

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687271649/Blogposts/grafana-tracetest/screely-1687271642790_kigthc.png

Lastly, select the Postgres span. Here you’re validating that the value inserted into the Postgres database contains the correct Pokemon name.

Create another test spec. Use this selector:

span[tracetest.span.type="database" name="create postgres.pokemon" db.system="postgres" db.name="postgres" db.user="postgres" db.operation="create" db.sql.table="pokemon"]

And this assertion:

attr:db.result contains "charizard"

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687271685/Blogposts/grafana-tracetest/screely-1687271679294_iumerc.png

After all this work, you’ll end up with 4 test specs.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687272161/Blogposts/grafana-tracetest/screely-1687272153898_ei28gt.png

This complex test scenario will run an API test with specs against trace data and give you deep assertion capabilities for microservices and async processes that are incredibly difficult to test with legacy testing tools.

With the test scenarios laid out, let’s automate!

Run Automated Tests with Tracetest

Tracetest is designed to work with all CI/CD platforms and automation tools. To enable Tracetest to run in CI/CD environments, make sure to install the Tracetest CLI and configure it to access your Tracetest server.

Installing the CLI is a single command.

brew install kubeshop/tracetest/tracetest

Note: Check out the download page for more info about installing on either Linux or Windows. You can also follow the official documentation to install the Tracetest server in your existing infrastructure running in Kubernetes or Docker.

Configuring the CLI is one more command.

tracetest configure --endpoint http://localhost:11633

Make sure to run the Docker Compose stack before configuring the CLI.

You can see the --endpoint is set to http://localhost:11633. It’s where your Tracetest server is running.

You’re ready to run automated tests!

Create a Tracetest Test Definition

But, first, you need a test definition. In the Tracetest Web UI open the test you created, click the ⚙️ in the top right and then the Test Definition button.

https://res.cloudinary.com/djwdcmwdz/image/upload/v1687285591/Blogposts/grafana-tracetest/screely-1687285584545_jqsokl.png

Download the file and give it a name. I’ll call it test.yaml because reasons. 😁

# test.yaml

type: Test
spec:
  id: -ao9stJVg
  name: Pokeshop - Import
  description: Import a Pokemon
  trigger:
    type: http
    httpRequest:
      url: http://demo-api:8081/pokemon/import
      method: POST
      headers:
      - key: Content-Type
        value: application/json
      body: '{"id":6}'
  specs:
  - name: Import Pokemon Span Exists
    selector: span[tracetest.span.type="general" name="import pokemon"]
    assertions:
    - attr:tracetest.selected_spans.count = 1
  - name: Matching db result with the Pokemon Name
    selector: span[tracetest.span.type="database" name="create postgres.pokemon" db.system="postgres"
      db.name="postgres" db.user="postgres" db.operation="create" db.sql.table="pokemon"]
    assertions:
    - attr:db.result contains "charizard"
  - name: Uses correct Pokemon ID
    selector: span[tracetest.span.type="database" name="get pokemon_6" db.system="redis"
      db.operation="get" db.redis.database_index="0"]
    assertions:
    - attr:db.payload = '{"key":"pokemon_6"}'
  - name: 'All HTTP Spans: Status code is 200'
    selector: span[tracetest.span.type="http"]
    assertions:
    - attr:http.status_code = 200

This test definition contains the HTTP trigger and test specs for the API test.


# Trigger
  trigger:
    type: http
    httpRequest:
      url: http://demo-api:8081/pokemon/import
      method: POST
      headers:
      - key: Content-Type
        value: application/json
      body: '{"id":6}'

# Test Specs
  specs:
  - name: Import Pokemon Span Exists
    selector: span[tracetest.span.type="general" name="import pokemon"]
    assertions:
    - attr:tracetest.selected_spans.count = 1
  - name: Matching db result with the Pokemon Name
    selector: span[tracetest.span.type="database" name="create postgres.pokemon" db.system="postgres"
      db.name="postgres" db.user="postgres" db.operation="create" db.sql.table="pokemon"]
    assertions:
    - attr:db.result contains "charizard"
  - name: Uses correct Pokemon ID
    selector: span[tracetest.span.type="database" name="get pokemon_6" db.system="redis"
      db.operation="get" db.redis.database_index="0"]
    assertions:
    - attr:db.payload = '{"key":"pokemon_6"}'
  - name: 'All HTTP Spans: Status code is 200'
    selector: span[tracetest.span.type="http"]
    assertions:
    - attr:http.status_code = 200

If you wanted to, you could have written this entire test in YAML right away!

Run a Tracetest Test with the CLI

Once you’ve saved the file, triggering the test with the CLI is done like this.

tracetest test run --definition ./tests/test.yaml --wait-for-result

[Output]
✔ Pokeshop - Import (http://localhost:11633/test/-ao9stJVg/run/8/test)
    ✔ Import Pokemon Span Exists
    ✔ Matching db result with the Pokemon Name
    ✔ Uses correct Pokemon ID
    ✔ All HTTP Spans: Status code is 200

You can access the test run by following the URL in the test response.

To automate this behavior, you’ll specify a list of test definitions and run them with the CLI in your preferred CI/CD platform.

Alternatively, you do not need to download the CLI in your CI/CD platform. Instead, use the official Tracetest Docker image that comes with the CLI installed.
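As an illustration, a minimal GitHub Actions job might look like the sketch below. The workflow file name, trigger, and test path are assumptions for this example, and the CLI install step is left as a placeholder since the exact Linux install command lives on the Tracetest download page:

```
# .github/workflows/api-tests.yaml — hypothetical workflow file
name: api-tests
on: [push]
jobs:
  trace-based-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Start the full stack: Pokeshop, OpenTelemetry Collector, Tempo, Tracetest
      - run: docker compose up -d --build
      # Install the Tracetest CLI here (see the download page for the Linux command)
      - name: Run trace-based tests
        run: |
          tracetest configure --endpoint http://localhost:11633
          tracetest test run --definition ./tests/test.yaml --wait-for-result
```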

Here’s a list of guides we’ve compiled for you in the docs.

Analyze Test Results

You have successfully configured both Grafana and Tracetest. By enabling distributed tracing and trace-based testing, you can now monitor test executions and analyze the captured traces to gain insights into your API's performance and identify any issues.

Use Grafana Tempo's querying capabilities to filter and search for specific traces based on attributes like service name, operation name, or tags. This will help you pinpoint the root cause of any performance degradation or errors.
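For instance, a hedged TraceQL query like the one below surfaces traces where the Pokeshop API returned a server error (the attribute name assumes the OpenTelemetry HTTP semantic conventions used by the demo's instrumentation):

```
{ resource.service.name = "pokeshop" && span.http.status_code >= 500 }
```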

Leverage Grafana's rich visualization capabilities to create dashboards and charts to track the performance and health of your APIs over time.

Use Tracetest to leverage existing distributed traces to power trace-based testing. You can define tests and assertions against every single microservice that a trace goes through. With Tracetest, you can work with Grafana Tempo to define assertions against both the response and trace data. This ensures both your response and the underlying processes work as expected. Finally, save and run the tests in CI pipelines with the Tracetest CLI to enable automation.

How Grafana Works with Tracetest to Enhance Observability

In conclusion, by combining Grafana Tempo with Tracetest, you can effectively monitor and test your APIs using distributed tracing. This tutorial has provided a step-by-step guide to setting up and using these powerful tools, enabling you to ensure the reliability, performance, and scalability of your APIs in complex distributed systems.

Do you want to learn more about Tracetest and what it brings to the table? Check the docs and try it out by downloading it today!

Also, please feel free to join our Discord community, give Tracetest a star on GitHub, or schedule a time to chat 1:1.
