Alexandr Bandurchin for Uptrace

Uptrace v2.0: How ClickHouse JSON Type Accelerates Trace Queries by 10x

Uptrace v2.0 introduces native ClickHouse JSON type support for storing observability data, resulting in 10x query performance improvements. This guide covers deployment, configuration, and optimization strategies for production environments.

Key improvements in v2.0:

  • Native JSON column storage with dot notation queries
  • UI-based data transformations
  • Flexible retention policies per data type
  • Built-in Let's Encrypt SSL automation
  • Enhanced query builder with autocomplete

The Problem with Traditional Approaches

Observability systems handle nested JSON structures with unpredictable attribute sets. Each span or log entry contains variable fields:

{
  "trace_id": "abc123",
  "service_name": "checkout",
  "http.method": "POST",
  "http.target": "/api/v1/orders/12345",
  "user_id": "user_987",
  "error": true,
  "db.statement": "SELECT * FROM orders WHERE id = ?"
}

Traditional storage approaches have significant limitations:

Approach 1: Flattened Schema

CREATE TABLE spans (
  trace_id String,
  service_name String,
  http_method String,
  http_target String,
  user_id String
  -- ...hundreds of columns for all possible attributes
)

Drawbacks:

  • Schema bloat
  • Loss of flexibility
  • ETL pipeline required for each new attribute

Approach 2: String Storage + JSONExtractString

CREATE TABLE spans (
  trace_id String,
  attributes String  -- Raw JSON string
)

-- Query performance issue:
SELECT count()
FROM spans
WHERE JSONExtractString(attributes, 'service_name') = 'checkout';

Drawbacks:

  • JSON parsing on every query
  • No indexing on nested fields
  • 2.7 seconds on 50M records

What Changed in Uptrace v2.0

ClickHouse introduced a native JSON data type, and Uptrace v2.0 fully leverages this capability.

Before (v1.x):

SELECT count()
FROM spans_old
WHERE JSONExtractString(attributes, 'service_name') = 'checkout';

-- Result: 2.754 seconds on 50M records
-- Processed: 50.00 million rows, 8.43 GB

After (v2.0):

SELECT count()
FROM spans_new
WHERE attributes.service_name = 'checkout';

-- Result: 0.287 seconds on 50M records ⚡
-- Processed: 50.00 million rows, 3.12 GB

10x performance improvement achieved through:

  1. JSON parsing during insertion (once)
  2. Columnar storage of JSON data
  3. Native indexing on nested fields
  4. Direct dot notation access
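
The article doesn't show Uptrace's actual table schema, so here is a minimal hedged sketch of what a spans table with ClickHouse's native JSON column can look like (table and column names are illustrative; older ClickHouse releases gate the type behind the `allow_experimental_json_type` setting):

```sql
-- Sketch only: illustrative schema, not Uptrace's real one.
-- On older ClickHouse releases you may first need:
--   SET allow_experimental_json_type = 1;
CREATE TABLE spans_new (
  trace_id String,
  service_name LowCardinality(String),
  timestamp DateTime64(9),
  duration UInt64,
  attributes JSON  -- parsed once at insert, stored columnar
)
ENGINE = MergeTree
ORDER BY (service_name, timestamp);

-- Nested fields are addressed with dot notation, no parsing at query time:
SELECT count() FROM spans_new WHERE attributes.user_id = 'user_987';
```

Because each JSON path is materialized as its own subcolumn, a filter on `attributes.user_id` reads only that subcolumn instead of re-parsing every raw JSON string.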

Installation Guide

Prerequisites

  • Docker and Docker Compose
  • 4GB RAM minimum (8GB recommended)
  • 10GB disk space

Quick Start

# Clone repository
git clone https://github.com/uptrace/uptrace
cd uptrace/example/docker

# Start all services
docker-compose up -d

Services included:

  • ClickHouse - Data storage
  • Uptrace - UI and API
  • PostgreSQL - Metadata storage

Access the UI at http://localhost:14318:

  • Username: admin@uptrace.local
  • Password: admin

Docker Compose Structure

The default setup includes:

services:
  clickhouse:    # Data storage engine
  postgres:      # Metadata and configuration
  uptrace:       # Main application
  otelcol:       # OpenTelemetry Collector
  redis:         # Caching layer

Configuration

Project Setup via UI

Navigate to Organization → New Org → New Project to create your first project through the web interface.

Required fields:

  • Name: Project identifier (e.g., production)
  • Organization: Company/team name
  • Token: Auto-generated for OTLP authentication

Configuration File Alternative

For infrastructure-as-code deployments, use uptrace.yml:

seed_data:
  users:
    - key: admin_user
      name: Admin
      email: admin@example.com
      password: change_this_password

  orgs:
    - key: main_org
      name: Company Name

  org_users:
    - key: org_admin
      org: main_org
      user: admin_user
      role: owner

  projects:
    - key: prod_project
      name: production
      org: main_org

Key feature: The key field enables declarative resource management. Uptrace automatically creates, updates, or removes resources based on configuration changes.

Sending Traces with OpenTelemetry

Example Node.js integration:

const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');

// Configure exporter
const exporter = new OTLPTraceExporter({
  url: 'http://localhost:14318/v1/traces',
  headers: {
    'uptrace-dsn': 'http://project_token@localhost:14318/1'
  }
});

// Setup provider
const provider = new NodeTracerProvider({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'api-service',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
  })
});

provider.addSpanProcessor(new BatchSpanProcessor(exporter));
provider.register();

// Use in application
const tracer = provider.getTracer('api-service');

async function handleRequest(req, res) {
  const span = tracer.startSpan('handleRequest', {
    attributes: {
      'http.method': req.method,
      'http.target': req.url,
      'user_id': req.user?.id
    }
  });

  try {
    await processOrder(req.body);
    span.setStatus({ code: 1 }); // OK
  } catch (error) {
    span.recordException(error);
    span.setStatus({ code: 2, message: error.message }); // ERROR
    throw error;
  } finally {
    span.end();
  }
}

Query Builder Features

Uptrace v2.0 introduces an enhanced query builder with several powerful features:

1. Attribute Autocomplete

The query builder provides intelligent suggestions for available attributes as you type. Start typing user_ and see all user-related attributes.

2. Toggle Query Parts

Temporarily disable query conditions without deleting them, useful for debugging and exploration.

3. Search Clause

Combine structured filters with full-text search:

where service_name = "checkout" | search error|timeout|failed

This query finds all spans from the checkout service containing the words "error", "timeout", or "failed" in any attribute or log message.
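
The matching semantics can be sketched in a few lines of Python. This is not Uptrace's implementation, just a rough model of what the pipe-separated `search` clause does: a span matches if any of its attribute values contains at least one term.

```python
# Rough model (not Uptrace's implementation) of `search error|timeout|failed`:
# a span matches if any attribute value contains any pipe-separated term.
def matches_search(span: dict, query: str) -> bool:
    terms = [t.lower() for t in query.split("|")]
    for value in span.values():
        text = str(value).lower()
        if any(term in text for term in terms):
            return True
    return False

span = {"service_name": "checkout", "log": "payment timeout after 30s"}
print(matches_search(span, "error|timeout|failed"))  # True
print(matches_search(span, "oom|segfault"))          # False
```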

Example: Debugging Failed Payment

where attributes.user_id = "user_12345" 
  and timestamp >= now() - interval 1 hour
  | search payment|stripe|checkout

This query locates all payment-related spans for a specific user within the last hour.

Data Transformations

Data transformations process telemetry data before storage, enabling:

  • Attribute normalization
  • PII removal
  • Data type conversion
  • Sampling strategies

Use Case 1: Reducing URL Cardinality

Problem: Dynamic URLs create high cardinality:

/user/123/orders/456
/user/124/orders/457
/user/125/orders/458
...

This causes index bloat and slow queries.

Solution:

// Project → Transformations → New Operation
setAttr("http_target", 
  replaceGlob(
    attr("http_target"), 
    "/user/*/orders/*", 
    "/user/{userId}/orders/{orderId}"
  )
);

Result: All URLs normalized to:

/user/{userId}/orders/{orderId}
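
For intuition, a regex equivalent of that `replaceGlob` call can be sketched in Python (hypothetical helper name; `replaceGlob` itself is Uptrace's expr-lang function, where each `*` matches a single path segment):

```python
import re

# Illustrative equivalent of the replaceGlob transformation above:
# each glob `*` becomes a non-slash wildcard, so only two-segment
# /user/<id>/orders/<id> paths are rewritten.
def normalize_target(target: str) -> str:
    pattern = re.compile(r"^/user/[^/]+/orders/[^/]+$")
    if pattern.match(target):
        return "/user/{userId}/orders/{orderId}"
    return target

print(normalize_target("/user/123/orders/456"))  # /user/{userId}/orders/{orderId}
print(normalize_target("/health"))               # /health (unchanged)
```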

Use Case 2: Type Conversion

Some libraries send numeric values as strings:

{
  "elapsed_ms": "1234.56"  // String instead of number
}

Fix with transformation:

setAttr("elapsed_ms", parseFloat(attr("elapsed_ms")))

Benefit: Enable aggregate functions:

SELECT avg(attributes.elapsed_ms) FROM spans;

Use Case 3: PII Removal

Remove personally identifiable information for GDPR compliance:

if (has(attr("user.email"))) {
  setAttr("user.email", "***@***")
}
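
The same masking can be modeled outside the transformation runtime. This Python sketch (illustrative, not Uptrace's expr-lang engine) replaces any email-shaped value with the fixed placeholder so the raw address never reaches storage:

```python
import re

# Illustrative PII scrubber: mask email-shaped values with the same
# "***@***" placeholder used in the transformation above.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def scrub_email(attributes: dict) -> dict:
    value = attributes.get("user.email")
    if value is not None and EMAIL_RE.fullmatch(str(value)):
        attributes = {**attributes, "user.email": "***@***"}
    return attributes

print(scrub_email({"user.email": "jane@example.com", "user_id": "u1"}))
# {'user.email': '***@***', 'user_id': 'u1'}
```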

Transformation language: Uptrace uses expr-lang for transformation logic, providing a simple yet powerful expression language.

Configuration location: Project → Transformations → New Operation

Retention Policies

Different data types require different retention periods:

  • Traces: Recent debugging (7 days)
  • Logs: Audit trail (30 days)
  • Metrics: Long-term trends (90 days)

Configuration

projects:
  - name: production
    retention:
      spans: 168h      # 7 days
      logs: 720h       # 30 days  
      events: 720h     # 30 days
      metrics: 2160h   # 90 days

Alternative: Settings → Project → Data Retention

Cost Impact

Example calculation:

  • Traces: ~100GB/day
  • Retaining all data types for 90 days: 9TB
  • Retaining with a 7/30/90-day split: 4TB
  • Storage savings: 56%
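
The arithmetic generalizes to any volumes. In this sketch, the ~100GB/day traces figure comes from the article, while the logs and metrics volumes are hypothetical placeholders chosen only to illustrate the calculation (so the totals differ from the example above):

```python
# Back-of-the-envelope retention cost model.
# traces volume is from the article; logs/metrics volumes are hypothetical.
DAILY_GB = {"traces": 100, "logs": 30, "metrics": 10}
SPLIT_DAYS = {"traces": 7, "logs": 30, "metrics": 90}

flat_90 = sum(gb * 90 for gb in DAILY_GB.values())          # keep everything 90 days
split = sum(DAILY_GB[k] * SPLIT_DAYS[k] for k in DAILY_GB)  # per-type retention

print(f"90 days for everything: {flat_90 / 1000:.1f} TB")  # 12.6 TB
print(f"7/30/90 split:          {split / 1000:.1f} TB")    # 2.5 TB
print(f"savings:                {1 - split / flat_90:.0%}")
```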

Migration Strategy

For zero-downtime migration from v1.x or other systems, use parallel deployment:

┌─────────────┐
│   Services  │
└──────┬──────┘
       │
       │ OpenTelemetry
       │
┌──────▼───────────────┐
│ OTel Collector       │
└──┬────────────────┬──┘
   │                │
   │                │
┌──▼────┐      ┌───▼─────┐
│ v1.x  │      │ v2.0    │
│ (old) │      │ (new)   │
└───────┘      └─────────┘

OpenTelemetry Collector Configuration

exporters:
  otlphttp/v1:
    endpoint: http://uptrace-v1:14318
    headers:
      uptrace-dsn: "http://token_v1@uptrace-v1:14318/1"

  otlphttp/v2:
    endpoint: http://uptrace-v2:14318
    headers:
      uptrace-dsn: "http://token_v2@uptrace-v2:14318/1"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/v1, otlphttp/v2]  # Dual export

Migration Steps

  1. Deploy v2.0 alongside existing v1.x
  2. Configure dual export to both instances
  3. Monitor for 7 days to ensure stability
  4. Migrate dashboards and alerts to v2.0
  5. Decommission v1.x

Performance Benchmarks

Real-world performance comparison on 500M spans over 7 days:

Query 1: Top 5 Slowest Endpoints

SELECT 
  attributes.http_target as endpoint,
  count() as requests,
  quantile(0.95)(duration) as p95_duration
FROM spans
WHERE service_name = 'api-gateway'
  AND timestamp >= now() - interval 24 hour
GROUP BY endpoint
ORDER BY p95_duration DESC
LIMIT 5;
  • v1.x: 4.2 seconds
  • v2.0: 0.38 seconds
  • Improvement: 11x faster

Query 2: User Error Traces

SELECT 
  trace_id,
  span_name,
  attributes.error_message
FROM spans
WHERE attributes.user_id = 'user_12345'
  AND attributes.error = true
  AND timestamp >= now() - interval 7 day
ORDER BY timestamp DESC;
  • v1.x: 6.8 seconds
  • v2.0: 0.52 seconds
  • Improvement: 13x faster

Performance Factors

  1. Columnar JSON storage
  2. Native indexing on nested fields
  3. Zero JSON parsing overhead per query

SSL Configuration

Uptrace v2.0 includes built-in Let's Encrypt integration:

# uptrace.yml
certmagic:
  enabled: true
  staging_ca: false  # Use true for testing
  http_challenge_addr: :80

listen:
  https:
    addr: :443
    domain: uptrace.example.com

Automatic features:

  • Certificate issuance
  • HTTP to HTTPS redirect
  • Auto-renewal (certificates are reissued around day 60 of their 90-day lifetime)

Requirements:

  • Domain DNS points to server
  • Port 80 open for HTTP-01 challenge

When to Use Uptrace v2.0

Ideal Use Cases

✅ Microservices architecture (5+ services)

✅ High query performance requirements

✅ Need unified traces, logs, and metrics

✅ Comfortable running ClickHouse

✅ Want flexible data transformations

Not Recommended For

❌ Small projects (1-2 services)

❌ Existing Grafana stack working well

❌ Need managed SaaS only

❌ No infrastructure management capability

Conclusion

Uptrace v2.0's adoption of ClickHouse JSON type provides substantial performance improvements for observability workloads. The 10x query acceleration comes from architectural changes that eliminate JSON parsing overhead while maintaining storage flexibility.

Key benefits:

  • Native JSON storage with columnar performance
  • Real-time data transformations
  • Flexible retention policies
  • Built-in SSL automation
  • Unified observability platform
