Grafana Tempo Has a Free API: Here's How to Do Distributed Tracing at Scale

#observability #devops #opensource #tutorial

Grafana Tempo is a high-scale distributed tracing backend. It ingests traces in OpenTelemetry, Jaeger, and Zipkin formats and needs only object storage to run, with no Elasticsearch or Cassandra cluster to operate.

Why Tempo?

Cost-effective: Uses object storage (S3, GCS, or Azure Blob Storage) instead of an Elasticsearch or Cassandra cluster
Massive scale: Handles billions of traces
Multi-format: OpenTelemetry, Jaeger, Zipkin protocols
Grafana native: Built-in trace visualization
TraceQL: Query language for traces
Metrics from traces: Auto-generate RED metrics
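
To make the last point concrete: RED stands for rate, errors, and duration. The sketch below is a toy Python illustration of how those three numbers can be derived from finished spans. It is not how Tempo computes them internally, and the `Span` record and `red_metrics` helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span record (not an OpenTelemetry type)."""
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, window_seconds):
    """Aggregate request rate, error ratio, and mean duration per service."""
    grouped = {}
    for span in spans:
        grouped.setdefault(span.service, []).append(span)

    result = {}
    for service, items in grouped.items():
        errors = sum(1 for s in items if s.is_error)
        result[service] = {
            "rate_per_s": len(items) / window_seconds,
            "error_ratio": errors / len(items),
            "mean_duration_ms": sum(s.duration_ms for s in items) / len(items),
        }
    return result
```

Calling `red_metrics(spans, window_seconds=60)` on the spans seen in the last minute returns one small dictionary per service. A real pipeline would typically track duration as a histogram rather than a mean.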

Docker Setup

services:
  tempo:
    image: grafana/tempo:latest
    command: ['-config.file=/etc/tempo.yaml']
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
      - tempo-data:/tmp/tempo
    ports:
      - '3200:3200'   # Tempo API
      - '4317:4317'   # OTLP gRPC
      - '4318:4318'   # OTLP HTTP
      - '14268:14268' # Jaeger Thrift HTTP (enabled in tempo.yaml)

  grafana:
    image: grafana/grafana:latest
    ports:
      - '3000:3000'
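    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      # Local demo only: lets the anonymous user add the Tempo data source and use Explore
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin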

volumes:
  tempo-data:

Tempo Config

# tempo.yaml
server:
  http_listen_port: 3200

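distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          # Bind to all interfaces; newer Tempo releases default to localhost,
          # which is unreachable through a published container port
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"
    jaeger:
      protocols:
        thrift_http:
          endpoint: "0.0.0.0:14268"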

storage:
  trace:
    backend: local
    local:
      path: /tmp/tempo/blocks
    wal:
      path: /tmp/tempo/wal
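
After starting the stack, it can help to wait until Tempo reports ready before sending traces. The following is a minimal sketch that polls Tempo's `/ready` endpoint on the HTTP port configured above. The helper names `http_probe` and `wait_until_ready`, and the retry defaults, are assumptions made for this example.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True when the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx answers all count as "not ready"
        return False


def wait_until_ready(url, attempts=30, delay=1.0, probe=http_probe, sleep=time.sleep):
    """Poll `url` until `probe` succeeds; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

A typical call would be `wait_until_ready("http://localhost:3200/ready")`. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running Tempo instance.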

Send Traces (OpenTelemetry JS)

import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

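const sdk = new NodeSDK({
  serviceName: 'my-service', // shown as the service name in Tempo and Grafana
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces', // OTLP HTTP port published by the Tempo container
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

// Start the SDK before the rest of the application is loaded
sdk.start();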

Send Traces (Python)

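from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# service.name is what Tempo and Grafana display as the service
resource = Resource.create({'service.name': 'my-service'})
provider = TracerProvider(resource=resource)
exporter = OTLPSpanExporter(endpoint='http://localhost:4318/v1/traces')
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

# The tracer name identifies the instrumentation scope, not the service
tracer = trace.get_tracer('my-service')
with tracer.start_as_current_span('process-order') as span:
    span.set_attribute('order.id', '12345')
    # Your code here

# Flush spans still buffered by the batch processor before the process exits
provider.shutdown()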
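
For a sense of what the exporters above put on the wire, here is a rough standard-library sketch that builds a single span in OTLP's JSON encoding and posts it to the same `/v1/traces` endpoint. The helper names `build_otlp_payload` and `send_span`, the `manual-example` scope name, and the default duration are assumptions made for this example. In real services the SDKs are the better choice.

```python
import json
import secrets
import time
import urllib.request


def build_otlp_payload(service_name, span_name, attributes, duration_ms=50):
    """Build an OTLP/JSON body containing a single finished span."""
    end_ns = time.time_ns()
    start_ns = end_ns - int(duration_ms * 1_000_000)
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "manual-example"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 16 random bytes as 32 hex chars
                    "spanId": secrets.token_hex(8),    # 8 random bytes as 16 hex chars
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                    "attributes": [
                        {"key": key, "value": {"stringValue": str(value)}}
                        for key, value in attributes.items()
                    ],
                }],
            }],
        }]
    }


def send_span(payload, url="http://localhost:4318/v1/traces"):
    """POST the payload to an OTLP/HTTP endpoint and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With Tempo running, `send_span(build_otlp_payload("my-service", "process-order", {"order.id": "12345"}))` should create a one-span trace that can then be looked up by its `traceId`.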

API: Search Traces

# By trace ID
curl http://localhost:3200/api/traces/TRACE_ID

# Search with TraceQL
curl -G http://localhost:3200/api/search \
  --data-urlencode 'q={span.http.status_code >= 500}' \
  --data-urlencode 'limit=20'
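
The same search can be scripted. Below is a small Python sketch using only the standard library. It assumes the search response is a JSON object with a `traces` list whose entries carry `traceID`, `rootServiceName`, and `durationMs` fields. The helper names `search_url`, `summarize`, and `search_traces` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

TEMPO_URL = "http://localhost:3200"


def search_url(traceql, limit=20, base_url=TEMPO_URL):
    """Build the /api/search URL with the TraceQL query URL-encoded."""
    query = urllib.parse.urlencode({"q": traceql, "limit": limit})
    return f"{base_url}/api/search?{query}"


def summarize(search_response):
    """Flatten a search response into (trace ID, root service, duration ms) tuples."""
    return [
        (t.get("traceID"), t.get("rootServiceName"), t.get("durationMs"))
        for t in search_response.get("traces", [])
    ]


def search_traces(traceql, limit=20, base_url=TEMPO_URL):
    """Run a TraceQL search against Tempo and return summarized matches."""
    url = search_url(traceql, limit, base_url)
    with urllib.request.urlopen(url, timeout=10) as response:
        return summarize(json.load(response))
```

For example, `search_traces('{span.http.status_code >= 500}')` would list the traces matching the curl query above, and each returned trace ID can be passed to `/api/traces/<id>` to fetch the full trace.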

TraceQL Examples

# Find slow database queries
{span.db.system = "postgresql" && duration > 500ms}

# Find errors in specific service
{resource.service.name = "api-gateway" && status = error}

# Find traces with specific user
{span.user.id = "user-123"}
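
When queries like these are assembled from user input, quoting mistakes are easy to make. One possible approach is a tiny Python builder that joins conditions with `&&` inside a single pair of braces. The `traceql` and `attr_equals` helpers are hypothetical, and the backslash escaping of quotes is an assumption about TraceQL string literals that should be checked against the TraceQL documentation.

```python
def attr_equals(scope, name, value):
    """Build `scope.name = "value"` with quotes and backslashes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{scope}.{name} = "{escaped}"'


def traceql(*conditions):
    """Join conditions with && inside one span selector."""
    if not conditions:
        return "{}"  # an empty selector, intended to match every span
    return "{" + " && ".join(conditions) + "}"


# Rebuild the first two example queries from their parts
slow_db = traceql(attr_equals("span", "db.system", "postgresql"), "duration > 500ms")
gateway_errors = traceql(attr_equals("resource", "service.name", "api-gateway"), "status = error")
```

The resulting strings can be passed as the `q` parameter of the search API.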

Real-World Use Case

A SaaS company replaced Jaeger + Elasticsearch ($500/mo in storage) with Tempo + S3 ($15/mo). Query performance improved 3x because Tempo is purpose-built for trace lookups, not general search.

