Sanskriti Harmukh for Vultr

Posted with Aashish Chaurasiya • Originally published at docs.vultr.com

Deploying Vector High-Performance Observability Data Pipeline on Ubuntu 24.04

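Vector is a high-performance observability data pipeline from Datadog that collects, transforms, and routes logs, metrics, and traces across heterogeneous backends. This guide deploys Vector with Docker Compose, using Traefik to provide automatic HTTPS for the GraphQL API and the HTTP ingest endpoint, and builds a working sources → transforms → sinks pipeline. It assumes Docker with the Compose plugin is already installed and that a DNS A record for your domain points to the server, which the Let's Encrypt HTTP challenge requires. By the end, Vector will accept JSON over HTTPS and forward each event to three sinks: the container console, a local file, and an external HTTP endpoint.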


Set Up the Directory Structure and Configuration

1. Create the project directory structure:

$ mkdir -p ~/vector/{config,data}
$ cd ~/vector

2. Create the environment file, replacing the example domain and email address with your own values:

$ nano .env

Add the following contents to the file:

DOMAIN=vector.example.com
LETSENCRYPT_EMAIL=admin@example.com
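
Docker Compose substitutes `${DOMAIN}` and `${LETSENCRYPT_EMAIL}` from this file into the manifest used later in this guide, so an empty value tends to surface only later as a routing or certificate failure. The following is a minimal sketch of a pre-flight check. The function name `check_env_file` is hypothetical, and the check only confirms that both keys have a non-empty value, not that the values are valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vector, Traefik, or Docker): confirm that
# an env file assigns a non-empty value to each key the manifest interpolates.
check_env_file() {
    env_file="$1"
    if [ ! -f "$env_file" ]; then
        echo "missing file: $env_file" >&2
        return 1
    fi
    for key in DOMAIN LETSENCRYPT_EMAIL; do
        # Require KEY=<at least one character> at the start of a line.
        if ! grep -Eq "^${key}=.+" "$env_file"; then
            echo "missing or empty: $key" >&2
            return 1
        fi
    done
    echo "env ok"
}
```

After defining the function in your shell, run `check_env_file .env` from `~/vector`. It prints `env ok` when both keys are set and returns a non-zero status otherwise.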

3. Create the Vector pipeline configuration:

$ nano config/vector.yaml

Add the following contents to the file:

api:
  enabled: true
  address: "0.0.0.0:8686"

sources:
  demo_logs:
    type: "demo_logs"
    format: "syslog"
    interval: 1.0

  http_input:
    type: "http_server"
    address: "0.0.0.0:8080"
    decoding:
      codec: "json"

transforms:
  parse_logs:
    type: "remap"
    inputs:
      - "demo_logs"
      - "http_input"
    source: |
      .processed_at = now()
      .pipeline = "vector-demo"

sinks:
  console_output:
    type: "console"
    inputs:
      - "parse_logs"
    encoding:
      codec: "json"

  file_output:
    type: "file"
    inputs:
      - "parse_logs"
    path: "/var/lib/vector/logs-%Y-%m-%d.log"
    encoding:
      codec: "json"

  http_output:
    type: "http"
    inputs:
      - "parse_logs"
    uri: "https://httpbin.org/post"
    encoding:
      codec: "json"
    batch:
      max_bytes: 1048576
      timeout_secs: 10
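
The configuration above wires two sources into one `remap` transform and then fans the result out to three sinks, so a misspelled name under `inputs` would break the pipeline. As a hedged sketch, the wiring can be checked before deployment with Vector's `validate` subcommand, run from the same image the guide deploys. The wrapper name `validate_vector_config` is hypothetical, and the sketch assumes the image's entrypoint is the `vector` binary.

```shell
#!/bin/sh
# Hypothetical wrapper: run `vector validate` in the image used by this guide.
# Assumes the image entrypoint is the vector binary.
validate_vector_config() {
    config="$1"
    if [ ! -f "$config" ]; then
        echo "config not found: $config" >&2
        return 2
    fi
    # Docker bind mounts need an absolute host path.
    config_dir="$(cd "$(dirname "$config")" && pwd)"
    config_abs="${config_dir}/$(basename "$config")"
    # --no-environment skips component and health checks, so only the
    # configuration's syntax and topology are checked.
    docker run --rm \
        -v "${config_abs}:/etc/vector/vector.yaml:ro" \
        timberio/vector:0.44.0-alpine \
        validate --no-environment /etc/vector/vector.yaml
}
```

Run `validate_vector_config config/vector.yaml` from `~/vector`. A non-zero exit status indicates that the file is missing or that Vector rejected the configuration.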

Deploy with Docker Compose

1. Create the Docker Compose manifest:

$ nano docker-compose.yaml

Add the following contents to the file:

services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "./letsencrypt:/letsencrypt"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    restart: unless-stopped

  vector:
    image: timberio/vector:0.44.0-alpine
    container_name: vector
    expose:
      - "8080"
      - "8686"
    volumes:
      - "./config/vector.yaml:/etc/vector/vector.yaml:ro"
      - "./data:/var/lib/vector"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.vector-api.rule=Host(`${DOMAIN}`) && (PathPrefix(`/playground`) || PathPrefix(`/graphql`) || PathPrefix(`/health`))"
      - "traefik.http.routers.vector-api.entrypoints=websecure"
      - "traefik.http.routers.vector-api.tls.certresolver=letsencrypt"
      - "traefik.http.routers.vector-api.service=vector-api"
      - "traefik.http.services.vector-api.loadbalancer.server.port=8686"
      - "traefik.http.routers.vector-ingest.rule=Host(`${DOMAIN}`) && PathPrefix(`/ingest`)"
      - "traefik.http.routers.vector-ingest.entrypoints=websecure"
      - "traefik.http.routers.vector-ingest.tls.certresolver=letsencrypt"
      - "traefik.http.routers.vector-ingest.service=vector-ingest"
      - "traefik.http.services.vector-ingest.loadbalancer.server.port=8080"
      - "traefik.http.middlewares.strip-ingest.stripprefix.prefixes=/ingest"
      - "traefik.http.routers.vector-ingest.middlewares=strip-ingest"
    restart: unless-stopped
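
Because the Traefik labels depend on `${DOMAIN}` being interpolated from `.env`, it can help to have Compose parse the manifest before anything starts. The sketch below assumes the hypothetical wrapper name `check_compose_file`; it relies on `docker compose config --quiet`, which validates the file without printing the rendered output.

```shell
#!/bin/sh
# Hypothetical wrapper: parse the Compose manifest without starting services.
check_compose_file() {
    compose_file="$1"
    if [ ! -f "$compose_file" ]; then
        echo "manifest not found: $compose_file" >&2
        return 2
    fi
    # --quiet validates the manifest and prints nothing on success;
    # warnings about unset variables still appear on stderr.
    docker compose -f "$compose_file" config --quiet
}
```

Run `check_compose_file docker-compose.yaml` from `~/vector`. A warning that `DOMAIN` or `LETSENCRYPT_EMAIL` is not set points back to the `.env` file.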

2. Start the services:

$ docker compose up -d

3. Verify the services are running:

$ docker compose ps
$ docker compose logs vector
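
Traefik may need a short while after startup to obtain the certificate, so an HTTPS request made immediately can fail even when both containers are healthy. A minimal sketch of a retry helper follows. The function name `retry` and the attempt and delay values in the usage note are illustrative assumptions, not part of the original guide.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Pause only when another attempt remains.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry 30 2 curl -fsS https://vector.example.com/health` polls the `/health` route that the `vector-api` router exposes, up to 30 times at 2-second intervals. Replace the host name with your domain.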

Verify the Pipeline

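1. POST a JSON log to the ingest endpoint, replacing vector.example.com with your domain. Traefik strips the /ingest prefix before forwarding the request to Vector's http_server source on port 8080: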

$ curl -X POST https://vector.example.com/ingest \
    -H "Content-Type: application/json" \
    -d '{"level":"error","service":"api","message":"Database connection timeout","user_id":12345}'

2. Confirm the file sink wrote the event. The container's /var/lib/vector directory is mounted from data/ on the host:

$ ls -lh data/
$ grep "Database connection timeout" data/logs-*.log

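3. Stream the console sink output from the container logs (the demo_logs source emits one event per second, so the output is continuous):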

$ docker compose logs -f vector
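
The grep in step 2 shows that the event reached the file sink, but not that it passed through the `parse_logs` transform. As a rough sketch, the helper below counts matching lines that also carry the `pipeline` field set by the transform. The function name `count_tagged_events` is hypothetical, and the sketch assumes the file sink writes one JSON-encoded event per line.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that contain a given message and also
# carry the pipeline tag added by the remap transform.
# Usage: count_tagged_events <message> <file> [file...]
count_tagged_events() {
    message="$1"
    shift
    # -h drops filename prefixes when several files are given; -F treats the
    # message as a literal string. The second grep tolerates optional
    # whitespace around the colon. Exits non-zero when the count is 0.
    grep -hF -- "$message" "$@" \
        | grep -Ec '"pipeline" *: *"vector-demo"'
}
```

Run `count_tagged_events "Database connection timeout" data/logs-*.log` from `~/vector`. A count of at least 1 suggests the posted event was enriched by the transform before being written.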

Next Steps

Vector is running with HTTPS ingest and three sinks active. From here you can:

  • Add sources for files, Kafka, syslog, journald, or Kubernetes logs
  • Route to production sinks (Loki, Elasticsearch, S3, Datadog, Splunk)
  • Use VRL (Vector Remap Language) for richer transforms and enrichment

For the full guide with additional tips, visit the original article on Vultr Docs.