Step-by-Step: Implementing OpenTelemetry Declarative Configuration with Traces
Introduction
It was quite some time ago that I first came across the article “How to Write Your First OpenTelemetry Declarative Config File with Trace” (by Nawaz Dhandala). As is so often the case with compelling technical write-ups, it went straight onto my “to-read” stack. Recently, I finally dug into it, with a little help from Bob, naturally.
Historically, keeping all three observability signals (traces, metrics, and logs) configured in harmony has felt like a high-wire juggling act involving a messy mix of environment variables, programmatic SDK initialization boilerplate, and complex collector configurations. The OpenTelemetry declarative configuration format completely changes the game. By allowing you to define trace, metric, and log pipelines within a single YAML file that the SDK parses at startup, it drastically streamlines how we approach system visibility.
Excerpt from the original article:
Getting all three observability signals (traces, metrics, and logs) configured in a single place has historically been a juggling act of environment variables, programmatic SDK initialization code, and collector configs. The OpenTelemetry declarative configuration format changes that by letting you define trace, metric, and log pipelines in one YAML file that the SDK reads at startup.
Intrigued by the article’s premise, I tasked Bob with taking the concept from theory to reality by building a complete, end-to-end working application. What follows is a practical breakdown of that implementation, demonstrating exactly how this declarative approach functions under the hood.
Implementation: System Architecture & Component Mapping
To showcase the power of OpenTelemetry’s programmatic-free initialization, Bob and I designed a lightweight microservice (order-service) that ships its telemetry signals without relying on complex SDK setup code inside the application logic.
Here is how the components interact in our target architecture:
+--------------------------------------------------------------+
| Local Machine |
| |
| +-----------------------+ +-----------------------+ |
| | order-service | | otel-collector | |
| | (Python/Flask) | | (OTel Collector) | |
| | | | | |
| | +---------------+ | | +---------------+ | |
| | | otel-config | | | | otel-collector| | |
| | | .yaml | | | | -config.yaml | | |
| | +-------+-------+ | | +-------+-------+ | |
| | | | | | | |
| | v | | | | |
| | [OTel SDK Engine] | | | | |
| | | | | | | |
| +-----------|-----------+ +-----------|-----------+ |
| | (OTLP/gRPC) | |
| v | |
| +-------+-------+ | |
| | otel-collector| <--------------------+ |
| | :4317 | |
| +-------+-------+ |
| | |
| +---------------+---------------+ |
| | (OTLP) | (OTLP) | (Prometheus) |
| v v v |
| +-----------+ +-----------+ +-----------+ |
| | jaeger | | loki | |prometheus | |
| | :4317 | | :3100 | | :9090 | |
| +-----------+ +-----------+ +-----+-----+ |
| | |
| v |
| +-----------+ |
| | grafana | |
| | :3000 | |
| +-----------+ |
+--------------------------------------------------------------+
In practice, the project is laid out as follows:
otel-declarative/
├── app/
│ └── main.py # Flask application with OTel instrumentation
├── config/
│ ├── otel-config.yaml # OpenTelemetry declarative configuration
│ ├── otel-collector-config.yaml # Collector configuration
│ ├── prometheus.yml # Prometheus configuration
│ └── grafana-datasources.yml # Grafana datasources
├── dashboard/
│ ├── index.html # Dashboard HTML
│ ├── styles.css # Dashboard styles
│ └── dashboard.js # Dashboard JavaScript
├── scripts/
│ ├── setup.sh # Setup script
│ ├── start-backend.sh # Start backend services
│ ├── run-app.sh # Run application
│ ├── stop-app.sh # Stop application
│ ├── stop-backend.sh # Stop backend services
│ └── open-dashboard.sh # Open dashboard
├── Docs/
│ ├── Architecture.md # Detailed architecture documentation
│ ├── Configuration.md # Configuration reference guide
│ ├── QuickStart.md # Quick start guide
│ ├── Podman-Support.md # Podman setup and support
│ ├── Verification-Guide.md # Implementation verification guide
│ ├── Jaeger-Guide.md # Complete Jaeger tracing guide
│ ├── Prometheus-Guide.md # Comprehensive Prometheus metrics guide
│ └── Grafana-Guide.md # Complete Grafana visualization guide
├── Blog/
│ ├── building-observability-stack-with-otel.md # Comprehensive blog post
│ ├── README.md # Blog post overview
│ └── images/ # Supporting images
├── output/
│ └── 2026-05-31_OTEL_DEMO_COMPLETION_SUMMARY.md # Project summary
├── docker-compose.yml # Docker Compose configuration
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore rules
└── README.md # This file
Component Breakdown
- order-service: A Python-based Flask mock server exposing endpoints for user and order data processing, simulated errors, and performance anomalies.
- OpenTelemetry SDK (In-App): Reads the single otel-config.yaml file at startup to define the trace, metric, and log pipelines.
- OpenTelemetry Collector: Receives the standardized OTLP signals over gRPC on port 4317, then routes them to the specialized backends.
- Backend Observability Engines: Jaeger stores distributed traces, Loki aggregates system and application logs, and Prometheus scrapes metrics, all brought together in Grafana dashboards.
Setting Up the Declarative Configuration (otel-config.yaml)
This is the file that replaces lines of boilerplate setup logic. By feeding this schema straight to the OpenTelemetry SDK environment, we establish a clean separation of concerns.
# OpenTelemetry Declarative Configuration
# File format version 0.3
# This configuration file defines the complete OpenTelemetry setup for the application
file_format: "0.3"

# Disabled flag - set to true to disable the SDK entirely
disabled: false

# Resource attributes shared by all signals (traces, metrics, logs)
# These attributes identify the service and its environment
resource:
  attributes:
    - name: service.name
      value: "order-service"
    - name: service.version
      value: "1.0.0"
    - name: deployment.environment
      value: "production"
    - name: service.namespace
      value: "ecommerce"
    - name: service.instance.id
      value: "${HOSTNAME}"
    - name: host.name
      value: "${HOSTNAME}"

# Attribute limits to protect against runaway instrumentation
attribute_limits:
  attribute_value_length_limit: 4096
  attribute_count_limit: 128

# Trace pipeline configuration
tracer_provider:
  # Span processors define how spans are processed before export
  processors:
    # Batch processor groups spans before export for efficiency
    - batch:
        schedule_delay: 5000          # milliseconds between exports
        max_queue_size: 2048          # max spans queued in memory
        max_export_batch_size: 512    # max spans per export call
        export_timeout: 30000         # export timeout in ms
        # OTLP exporter sends spans to the OpenTelemetry Collector
        exporter:
          otlp:
            protocol: grpc
            endpoint: "http://otel-collector:4317"
            timeout: 10000            # export timeout in ms
            compression: gzip
            headers: {}
  # Sampler controls what percentage of traces are recorded
  # This helps manage data volume in high-traffic scenarios
  sampler:
    parent_based:
      root:
        trace_id_ratio_based:
          ratio: 0.25                 # sample 25% of root spans
  # Limits to protect against runaway instrumentation
  limits:
    attribute_count_limit: 128
    attribute_value_length_limit: 4096
    event_count_limit: 128
    link_count_limit: 128
    event_attribute_count_limit: 128
    link_attribute_count_limit: 128

# Metric pipeline configuration
meter_provider:
  # Metric readers define how metrics are collected and exported
  readers:
    # Periodic reader exports metrics at regular intervals
    - periodic:
        interval: 30000               # export every 30 seconds
        timeout: 5000                 # export timeout in ms
        # OTLP exporter sends metrics to the OpenTelemetry Collector
        exporter:
          otlp:
            protocol: grpc
            endpoint: "http://otel-collector:4317"
            timeout: 10000
            compression: gzip
            headers: {}
            temporality_preference: cumulative
  # Views allow customization of metric aggregation
  views:
    - selector:
        instrument_name: "*"
        instrument_type: histogram
      stream:
        aggregation:
          explicit_bucket_histogram:
            boundaries: [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000]

# Logger pipeline configuration
logger_provider:
  # Log record processors define how logs are processed before export
  processors:
    # Batch processor groups log records before export
    - batch:
        schedule_delay: 5000
        max_queue_size: 2048
        max_export_batch_size: 512
        export_timeout: 30000
        # OTLP exporter sends logs to the OpenTelemetry Collector
        exporter:
          otlp:
            protocol: grpc
            endpoint: "http://otel-collector:4317"
            timeout: 10000
            compression: gzip
            headers: {}
  # Limits for log records
  limits:
    attribute_count_limit: 128
    attribute_value_length_limit: 4096

# Context propagation configuration
# Defines how trace context is propagated across service boundaries
propagator:
  composite:
    - tracecontext: {}   # W3C Trace Context
    - baggage: {}        # W3C Baggage
    - b3: {}             # Zipkin B3 (for compatibility)
# Made with Bob
Key Takeaway: Notice that we aren’t writing any explicit backend exporters for Prometheus or Jaeger here. The application only knows how to talk to the OTel Collector over a single OTLP/gRPC endpoint (http://otel-collector:4317 in the configuration above).
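To make the sampler block in the configuration more concrete, here is a minimal sketch of how a parent-based, ratio-based sampling decision can be made. It is an illustration under stated assumptions, not the SDK's own code: the helper name `should_sample` and its arguments are hypothetical, and comparing the low 64 bits of the trace ID against `ratio * 2**64` is one common way to implement ratio sampling.

```python
import random
from typing import Optional

TRACE_ID_LOW_BITS = (1 << 64) - 1  # mask for the low 64 bits of a 128-bit trace ID


def should_sample(trace_id: int, ratio: float,
                  parent_sampled: Optional[bool] = None) -> bool:
    """Hypothetical helper mirroring parent_based + trace_id_ratio_based.

    parent_sampled is None for a root span; otherwise the parent's
    decision is simply inherited, which keeps whole traces intact.
    """
    if parent_sampled is not None:
        return parent_sampled
    # Root span: keep it when the low bits fall under the ratio threshold.
    bound = round(ratio * (1 << 64))
    return (trace_id & TRACE_ID_LOW_BITS) < bound


if __name__ == "__main__":
    rng = random.Random(42)
    trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
    kept = sum(should_sample(t, 0.25) for t in trace_ids)
    print(f"kept {kept} of {len(trace_ids)} root spans")
```

Because the decision depends only on the trace ID, every service that applies the same ratio reaches the same verdict for a given trace, and child spans follow their parent rather than rolling the dice again.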
Core Application Implementation (main.py)
With our configuration defined in YAML, the application script stays free of SDK initialization code. We simply write idiomatic Python using Flask, relying on the Flask instrumentation for automatic request spans and on the OpenTelemetry API for custom spans and metrics.
"""
OpenTelemetry Declarative Config Demo Application
A Flask-based REST API demonstrating OpenTelemetry instrumentation
using declarative configuration.
"""
import os
import time
import random

from flask import Flask, jsonify, request
from flask_cors import CORS
from opentelemetry import trace, metrics
from opentelemetry.sdk.resources import Resource
from opentelemetry.instrumentation.flask import FlaskInstrumentor

# Initialize Flask application
app = Flask(__name__)

# Enable CORS for all routes
CORS(app, resources={r"/*": {"origins": "*"}})

# Get tracer and meter for manual instrumentation
tracer = trace.get_tracer(__name__)
meter = metrics.get_meter(__name__)

# Create custom metrics
request_counter = meter.create_counter(
    name="http_requests_total",
    description="Total number of HTTP requests",
    unit="1"
)
request_duration = meter.create_histogram(
    name="http_request_duration_seconds",
    description="HTTP request duration in seconds",
    unit="s"
)
order_counter = meter.create_counter(
    name="orders_created_total",
    description="Total number of orders created",
    unit="1"
)
error_counter = meter.create_counter(
    name="errors_total",
    description="Total number of errors",
    unit="1"
)

# Simulate in-memory data store
users_db = [
    {"id": 1, "name": "Alice Johnson", "email": "alice@example.com"},
    {"id": 2, "name": "Bob Smith", "email": "bob@example.com"},
    {"id": 3, "name": "Charlie Brown", "email": "charlie@example.com"}
]
orders_db = []
order_id_counter = 1


@app.before_request
def before_request():
    """Record request start time for duration calculation"""
    request.start_time = time.time()


@app.after_request
def after_request(response):
    """Record metrics after each request"""
    if hasattr(request, 'start_time'):
        duration = time.time() - request.start_time
        # Record request counter
        request_counter.add(
            1,
            {
                "method": request.method,
                "endpoint": request.endpoint or "unknown",
                "status": response.status_code
            }
        )
        # Record request duration
        request_duration.record(
            duration,
            {
                "method": request.method,
                "endpoint": request.endpoint or "unknown",
                "status": response.status_code
            }
        )
    return response
@app.route('/')
def home():
    """Home endpoint with basic service information"""
    with tracer.start_as_current_span("home_handler") as span:
        span.set_attribute("http.route", "/")
        info = {
            "service": "order-service",
            "version": "1.0.0",
            "status": "healthy",
            "endpoints": [
                "/",
                "/api/users",
                "/api/users/<id>",
                "/api/orders",
                "/api/orders/<id>",
                "/api/error",
                "/api/slow",
                "/metrics"
            ]
        }
        return jsonify(info)


@app.route('/api/users', methods=['GET'])
def get_users():
    """Get all users"""
    with tracer.start_as_current_span("get_users") as span:
        span.set_attribute("http.route", "/api/users")
        span.set_attribute("user.count", len(users_db))
        # Simulate database query delay
        time.sleep(random.uniform(0.01, 0.05))
        return jsonify({"users": users_db, "count": len(users_db)})


@app.route('/api/users/<int:user_id>', methods=['GET'])
def get_user(user_id):
    """Get a specific user by ID"""
    with tracer.start_as_current_span("get_user") as span:
        span.set_attribute("http.route", "/api/users/<id>")
        span.set_attribute("user.id", user_id)
        # Simulate database query delay
        time.sleep(random.uniform(0.01, 0.03))
        user = next((u for u in users_db if u["id"] == user_id), None)
        if user:
            span.set_attribute("user.found", True)
            return jsonify(user)
        else:
            span.set_attribute("user.found", False)
            span.add_event("User not found", {"user.id": user_id})
            return jsonify({"error": "User not found"}), 404
@app.route('/api/orders', methods=['GET', 'POST'])
def orders():
    """Get all orders or create a new order"""
    if request.method == 'GET':
        with tracer.start_as_current_span("get_orders") as span:
            span.set_attribute("http.route", "/api/orders")
            span.set_attribute("order.count", len(orders_db))
            # Simulate database query delay
            time.sleep(random.uniform(0.02, 0.08))
            return jsonify({"orders": orders_db, "count": len(orders_db)})
    else:  # POST
        with tracer.start_as_current_span("create_order") as span:
            global order_id_counter
            span.set_attribute("http.route", "/api/orders")
            data = request.get_json()
            # Validate request
            if not data or 'user_id' not in data or 'items' not in data:
                span.set_attribute("order.validation", "failed")
                error_counter.add(1, {"type": "validation_error"})
                return jsonify({"error": "Invalid request"}), 400
            # Simulate user lookup
            with tracer.start_as_current_span("lookup_user") as user_span:
                user_span.set_attribute("user.id", data['user_id'])
                time.sleep(random.uniform(0.01, 0.03))
                user = next((u for u in users_db if u["id"] == data['user_id']), None)
                if not user:
                    user_span.set_attribute("user.found", False)
                    error_counter.add(1, {"type": "user_not_found"})
                    return jsonify({"error": "User not found"}), 404
                user_span.set_attribute("user.found", True)
            # Calculate total
            with tracer.start_as_current_span("calculate_total") as calc_span:
                total = sum(item.get('price', 0) * item.get('quantity', 1)
                            for item in data['items'])
                calc_span.set_attribute("order.total", total)
                calc_span.set_attribute("order.item_count", len(data['items']))
            # Create order
            order = {
                "id": order_id_counter,
                "user_id": data['user_id'],
                "user_name": user['name'],
                "items": data['items'],
                "total": total,
                "status": "pending",
                "created_at": time.time()
            }
            orders_db.append(order)
            order_id_counter += 1
            # Record metrics
            order_counter.add(1, {"user_id": str(data['user_id'])})
            span.set_attribute("order.id", order['id'])
            span.set_attribute("order.total", total)
            span.add_event("Order created", {
                "order.id": order['id'],
                "user.id": data['user_id']
            })
            # Simulate order processing delay
            time.sleep(random.uniform(0.05, 0.15))
            return jsonify(order), 201
@app.route('/api/orders/<int:order_id>', methods=['GET'])
def get_order(order_id):
    """Get a specific order by ID"""
    with tracer.start_as_current_span("get_order") as span:
        span.set_attribute("http.route", "/api/orders/<id>")
        span.set_attribute("order.id", order_id)
        # Simulate database query delay
        time.sleep(random.uniform(0.02, 0.05))
        order = next((o for o in orders_db if o["id"] == order_id), None)
        if order:
            span.set_attribute("order.found", True)
            return jsonify(order)
        else:
            span.set_attribute("order.found", False)
            span.add_event("Order not found", {"order.id": order_id})
            return jsonify({"error": "Order not found"}), 404


@app.route('/api/error')
def simulate_error():
    """Endpoint to simulate errors for testing"""
    with tracer.start_as_current_span("simulate_error") as span:
        span.set_attribute("http.route", "/api/error")
        error_type = random.choice(['validation', 'database', 'timeout', 'internal'])
        span.set_attribute("error.type", error_type)
        error_counter.add(1, {"type": error_type})
        if error_type == 'validation':
            span.add_event("Validation error occurred")
            return jsonify({"error": "Validation failed"}), 400
        elif error_type == 'database':
            span.add_event("Database error occurred")
            return jsonify({"error": "Database connection failed"}), 500
        elif error_type == 'timeout':
            span.add_event("Timeout error occurred")
            time.sleep(2)
            return jsonify({"error": "Request timeout"}), 504
        else:
            span.add_event("Internal error occurred")
            span.record_exception(Exception("Simulated internal error"))
            return jsonify({"error": "Internal server error"}), 500


@app.route('/api/slow')
def slow_endpoint():
    """Endpoint with intentional delay for testing"""
    with tracer.start_as_current_span("slow_endpoint") as span:
        span.set_attribute("http.route", "/api/slow")
        delay = random.uniform(1.0, 3.0)
        span.set_attribute("delay.seconds", delay)
        time.sleep(delay)
        return jsonify({
            "message": "Slow operation completed",
            "delay_seconds": delay
        })


@app.route('/metrics')
def metrics_endpoint():
    """Simple metrics endpoint showing current stats"""
    with tracer.start_as_current_span("metrics_endpoint") as span:
        span.set_attribute("http.route", "/metrics")
        stats = {
            "users_count": len(users_db),
            "orders_count": len(orders_db),
            "total_order_value": sum(o.get('total', 0) for o in orders_db)
        }
        return jsonify(stats)


@app.errorhandler(404)
def not_found(error):
    """Handle 404 errors"""
    error_counter.add(1, {"type": "not_found"})
    return jsonify({"error": "Not found"}), 404


@app.errorhandler(500)
def internal_error(error):
    """Handle 500 errors"""
    error_counter.add(1, {"type": "internal_error"})
    return jsonify({"error": "Internal server error"}), 500


if __name__ == '__main__':
    # Instrument Flask application
    FlaskInstrumentor().instrument_app(app)
    # Run the application
    port = int(os.environ.get('PORT', 8080))
    app.run(host='0.0.0.0', port=port, debug=False)
# Made with Bob
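One detail worth checking when pairing this code with the histogram view in otel-config.yaml: http_request_duration_seconds is recorded in seconds, while the configured bucket boundaries run from 0 to 10000, which read like millisecond values. The sketch below is only an illustration, not SDK code. `bucket_index` and `bucket_label` are hypothetical helpers, and they assume upper-inclusive buckets, the convention OpenTelemetry uses for explicit-bucket histograms.

```python
from bisect import bisect_left

# Boundaries copied from the views section of otel-config.yaml
BOUNDARIES = [0, 5, 10, 25, 50, 75, 100, 250, 500, 750,
              1000, 2500, 5000, 7500, 10000]


def bucket_index(value: float, boundaries=BOUNDARIES) -> int:
    """Index of the bucket holding value.

    Bucket i covers (boundaries[i-1], boundaries[i]]; the last bucket is
    open-ended.
    """
    return bisect_left(boundaries, value)


def bucket_label(index: int, boundaries=BOUNDARIES) -> str:
    """Human-readable range for a bucket index (hypothetical helper)."""
    if index == 0:
        return f"(-inf, {boundaries[0]}]"
    if index == len(boundaries):
        return f"({boundaries[-1]}, +inf)"
    return f"({boundaries[index - 1]}, {boundaries[index]}]"


if __name__ == "__main__":
    for seconds in (0.03, 0.15, 2.5):
        as_seconds = bucket_label(bucket_index(seconds))
        as_millis = bucket_label(bucket_index(seconds * 1000))
        print(f"{seconds} s -> {as_seconds} in seconds, {as_millis} in ms")
```

If that reading is right, every request in this demo lands in the same (0, 5] bucket when values are recorded in seconds, so it may be worth either recording milliseconds or choosing second-scale boundaries.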
Bootstrapping the Run Environment
To make the OpenTelemetry SDK pick up our declarative configuration file instead of expecting programmatic setup, we pass the file's path through an environment variable at startup.
- Running the Application as a Local Instance: point the OTEL_EXPERIMENTAL_CONFIG_FILE variable at the YAML configuration file, then start the Flask application (it listens on port 8080 unless PORT is set):
export OTEL_EXPERIMENTAL_CONFIG_FILE="config/otel-config.yaml"
python app/main.py
- Running with Container Isolation (Podman/Docker): if you prefer building and running in rootless container environments like Podman, make sure your compose file mounts the configuration file into the container:
# podman-compose.yaml fragment
services:
  order-service:
    build: .
    ports:
      - "8080:8080"
    environment:
      - OTEL_EXPERIMENTAL_CONFIG_FILE=/app/otel-config.yaml
    volumes:
      - ./config/otel-config.yaml:/app/otel-config.yaml:ro
    depends_on:
      - otel-collector
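Before starting the service either way, it can help to confirm that the variable is set and points at a real file, since a typo in the path is easy to miss. The following is a minimal preflight sketch using only the standard library. The helper `resolve_config_path` is hypothetical and is not part of OpenTelemetry.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CONFIG_ENV_VAR = "OTEL_EXPERIMENTAL_CONFIG_FILE"


def resolve_config_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the configured path, raising if it is unset or not a file."""
    env = os.environ if env is None else env
    raw = env.get(CONFIG_ENV_VAR, "").strip()
    if not raw:
        raise RuntimeError(f"{CONFIG_ENV_VAR} is not set")
    path = Path(raw).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"{CONFIG_ENV_VAR} points to a missing file: {path}"
        )
    return path


if __name__ == "__main__":
    try:
        print(f"Using OpenTelemetry config: {resolve_config_path()}")
    except (RuntimeError, FileNotFoundError) as exc:
        print(f"Config check failed: {exc}")
```

A check like this could run at the top of a start script so that a missing mount or misspelled path fails loudly instead of silently producing no telemetry.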
Correlating Telemetry Signals in Visual Dashboards
Once you start querying the application endpoints, the traces, metrics, and logs they produce can be inspected side by side in the visualization layer.
- Jaeger Traces View: Navigating to http://localhost:16686 lets you pick order-service and follow the nested hierarchy of a request, for example from the create_order span down to child spans like lookup_user and calculate_total.
- Prometheus Metrics Scrape: The application's custom counters (such as orders_created_total) can be queried with PromQL expressions.
- Unified Grafana Panel: By defining standard Loki and Prometheus datasources, you can correlate error logs with metric spikes and the corresponding trace IDs, solving performance issues without combing through disjointed data dumps.
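As a rough illustration of the Prometheus step, the sketch below builds and runs an instant query against the Prometheus HTTP API (`/api/v1/query`). The base URL assumes the port published in the compose file, the helper names are hypothetical, and the metric name is an assumption: it is the orders_created_total counter defined in main.py, but the name Prometheus ends up storing may differ depending on how the Collector's exporter is configured.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS_URL = "http://localhost:9090"  # assumed: port published by the compose file


def build_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def run_query(expr: str, base_url: str = PROMETHEUS_URL,
              timeout: float = 5.0) -> list:
    """Run the query and return the result list from the JSON payload."""
    with urlopen(build_query_url(expr, base_url), timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    # Assumed metric name; adjust to whatever Prometheus actually exposes.
    expr = "sum(rate(orders_created_total[5m]))"
    print(build_query_url(expr))
```

With the stack running, calling `run_query(expr)` should return the per-second order creation rate over the last five minutes, provided the metric exists under that name.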
Data Flow
The data flow begins within the application layer, where incoming HTTP traffic hits the Python service’s API routes. Instead of relying on manual code setups, the internal OpenTelemetry SDK bootstraps itself instantly by parsing the declarative otel-config.yaml file at launch, establishing trace, metric, and log providers. As runtime events execute, these providers automatically package structural instrumentation details into standard OpenTelemetry Protocol (OTLP) signal streams. The application pushes these telemetry batches over high-throughput OTLP/gRPC channels to an independent OpenTelemetry Collector instance listening on port 4317. Once received, the Collector processes, filters, and distributes the data to its respective domain backends: distributed trace contexts land in Jaeger, scraped metric dimensions map into Prometheus, and application execution logs route directly into Loki—all eventually correlated and visualized under unified Grafana dashboards.
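To give a feel for what "packaging telemetry into batches" means, here is a deliberately small sketch of the batching idea behind the max_queue_size and max_export_batch_size settings. It is not the SDK's batch processor: the class `TinyBatcher` is hypothetical, it handles plain strings instead of spans, and the timer that would call `flush` every schedule_delay milliseconds is left out.

```python
from collections import deque
from typing import Callable, Deque, List


class TinyBatcher:
    """Illustrative batcher: bounded queue, fixed maximum batch size."""

    def __init__(self, export: Callable[[List[str]], None],
                 max_queue_size: int = 2048,
                 max_export_batch_size: int = 512) -> None:
        self._export = export
        self._queue: Deque[str] = deque()
        self._max_queue_size = max_queue_size
        self._max_export_batch_size = max_export_batch_size
        self.dropped = 0  # items rejected because the queue was full

    def on_end(self, span_name: str) -> None:
        """Queue a finished item, dropping it if the queue is already full."""
        if len(self._queue) >= self._max_queue_size:
            self.dropped += 1
            return
        self._queue.append(span_name)

    def flush(self) -> int:
        """Export everything queued, in batches; return the number of export calls."""
        calls = 0
        while self._queue:
            size = min(self._max_export_batch_size, len(self._queue))
            batch = [self._queue.popleft() for _ in range(size)]
            self._export(batch)
            calls += 1
        return calls


if __name__ == "__main__":
    batcher = TinyBatcher(lambda batch: print(f"exporting {len(batch)} items"),
                          max_queue_size=5, max_export_batch_size=2)
    for i in range(7):
        batcher.on_end(f"span-{i}")
    batcher.flush()
    print(f"dropped: {batcher.dropped}")
```

The trade-off the real settings control is the same one visible here: a larger queue tolerates bursts at the cost of memory, while a smaller batch size means more, smaller export calls to the Collector.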
Deployment Flow
The infrastructure is deployed using a local container network managed via Docker Compose or rootless Podman environments. Initialization begins by firing a setup script that sets up a clean Python virtual environment, installs runtime dependencies, and checks container socket availability. Next, a background script spins up the core observability stack inside an isolated virtual network, provisioning individual container images for the OTel Collector, Jaeger, Prometheus, Loki, and Grafana. When the core application container or local process initializes, the host orchestrator maps the local otel-config.yaml file into the runtime environment as a read-only volume, passing its path through the OTEL_EXPERIMENTAL_CONFIG_FILE variable. This decoupled deployment topology ensures that modifying telemetry processing rules, adjusting metric sampling frequencies, or switching destination backends can be handled entirely within configuration files without forcing a rebuild or modification of the application code.
services:
  # OpenTelemetry Collector
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.91.0
    container_name: otel-collector
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./config/otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"     # OTLP gRPC receiver
      - "4318:4318"     # OTLP HTTP receiver
      - "8888:8888"     # Prometheus metrics exposed by the collector
      - "8889:8889"     # Prometheus exporter metrics
      - "13133:13133"   # health_check extension
    networks:
      - otel-network
    depends_on:
      - jaeger
      - prometheus

  # Jaeger - Distributed Tracing
  jaeger:
    image: jaegertracing/all-in-one:1.52
    container_name: jaeger
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686"   # Jaeger UI
      - "14250:14250"   # gRPC
    networks:
      - otel-network

  # Prometheus - Metrics Storage
  prometheus:
    image: prom/prometheus:v2.48.0
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    volumes:
      - ./config/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"
    networks:
      - otel-network

  # Loki - Log Aggregation
  loki:
    image: grafana/loki:2.9.3
    container_name: loki
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    networks:
      - otel-network

  # Grafana - Visualization
  grafana:
    image: grafana/grafana:10.2.2
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - ./config/grafana-datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml
      - grafana-data:/var/lib/grafana
    ports:
      - "3001:3000"
    networks:
      - otel-network
    depends_on:
      - prometheus
      - loki
      - jaeger

networks:
  otel-network:
    driver: bridge

volumes:
  prometheus-data:
  grafana-data:
# Made with Bob
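The `${HOSTNAME}` placeholders in otel-config.yaml are filled in from the process environment, which matters in a container setup because the order-service process only sees the variables its own environment provides. As a rough sketch of that substitution step: the helper `substitute_env` is hypothetical, and the `${VAR:-default}` fallback form reflects my understanding of the declarative configuration format rather than anything shown in the original article.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}\n]*))?\}"
)


def substitute_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} placeholders in text with values from env.

    Unset or empty variables fall back to the default after ':-', or to
    an empty string when no default is given.
    """
    env = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        value = env.get(match.group("name"))
        if not value:
            return match.group("default") or ""
        return value

    return _PLACEHOLDER.sub(replace, text)


if __name__ == "__main__":
    line = 'value: "${HOSTNAME:-unknown-host}"'
    print(substitute_env(line))
```

Running a config file through a helper like this before deployment is a quick way to spot placeholders that would resolve to an empty service.instance.id.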
Real-Time Visualization: The Frontend Metrics Dashboard
While backend telemetry storage layers like Jaeger, Prometheus, and Grafana are vital for deep-dive diagnostics, Bob and I wanted immediate feedback during local development. To achieve this, the project features a lightweight, real-time frontend dashboard (index.html and dashboard.js) powered by Chart.js.
This dashboard sits directly on top of our instrumented application, letting us execute actions with a single click and watch our OpenTelemetry metric pipelines react instantly.
Dashboard Architecture & Flow
+------------------+ Click Action +-----------------------+
| HTML UI Button | -------------------> | Flask App Route |
| (index.html) | | (main.py) |
+------------------+ +-----------------------+
^ |
| Polls Metrics (/metrics) | Emits OTLP
| Every 5 Seconds v
+------------------+ +-----------------------+
| dashboard.js | | OpenTelemetry Engine |
| (Chart.js UI) | | (Reads otel-config) |
+------------------+ +-----------------------+
Frontend Implementation (index.html)
The user interface splits your workspace into active experiment buttons and live-updating operational cards. It includes direct links to your backend triage tooling for an integrated debugging loop.
<div class="container">
<header>
<h1>🔭 OpenTelemetry Declarative Config Demo</h1>
<p class="subtitle">Real-time Metrics Dashboard</p>
</header>
<div class="metrics-grid">
<div class="card">
<h2>📈 Request Traffic Rate</h2>
<canvas id="requestRateChart"></canvas>
</div>
<div class="card">
<h2>⏱️ Mean Response Latency</h2>
<canvas id="responseTimeChart"></canvas>
</div>
</div>
<div class="actions-section">
<h2>🎯 Quick Actions</h2>
<div class="button-group">
<button onclick="testEndpoint('/api/users')" class="btn btn-primary">Test Users API</button>
<button onclick="testEndpoint('/api/orders')" class="btn btn-primary">Test Orders API</button>
<button onclick="createTestOrder()" class="btn btn-success">Create Test Order</button>
<button onclick="testEndpoint('/api/slow')" class="btn btn-warning">Test Slow Endpoint</button>
<button onclick="testEndpoint('/api/error')" class="btn btn-danger">Trigger Error</button>
</div>
</div>
<div class="links-section">
<h2>🔗 Production Observability Tools</h2>
<div class="button-group">
<a href="http://localhost:16686" target="_blank" class="btn btn-link">Jaeger UI (Traces)</a>
<a href="http://localhost:9090" target="_blank" class="btn btn-link">Prometheus (Metrics)</a>
<a href="http://localhost:3001" target="_blank" class="btn btn-link">Grafana (Dashboards)</a>
</div>
</div>
</div>
State Coordination Engine (dashboard.js)
To capture and render data points without overcomplicating the setup, dashboard.js polls the application's /metrics endpoint every five seconds and feeds the returned values straight into the Chart.js line charts.
// dashboard.js excerpt
const API_BASE_URL = 'http://localhost:8080';
const UPDATE_INTERVAL = 5000; // Poll metrics every 5 seconds

let state = {
  requestCount: 0,
  errorCount: 0,
  requestHistory: [],
  responseTimeHistory: []
};

// Periodic background metric gathering loop
async function updateDashboard() {
  try {
    const response = await fetch(`${API_BASE_URL}/metrics`);
    if (!response.ok) throw new Error('Metrics unreachable');
    const stats = await response.json();

    // Update UI info cards dynamically
    document.getElementById('service-status').innerText = "Healthy";
    document.getElementById('service-status').className = "status-value success";

    // Push fresh structural values to Chart data lists
    const now = new Date().toLocaleTimeString();
    updateChartData(requestRateChart, now, stats.orders_count);
  } catch (error) {
    document.getElementById('service-status').innerText = "Unreachable";
    document.getElementById('service-status').className = "status-value danger";
  }
}

function startAutoUpdate() {
  setInterval(updateDashboard, UPDATE_INTERVAL);
}

document.addEventListener('DOMContentLoaded', () => {
  initializeCharts();
  startAutoUpdate();
});
Interacting with the Loop
When you run this architecture locally and load the web dashboard, you get an immediate visual playing field for testing the telemetry pipelines:
- Simulating Traffic Spikes: Clicking “Create Test Order” repeatedly sends POST requests to /api/orders. Each successful request increments the orders_created_total counter, and the polling graph picks up the new order count within seconds.
- Diagnosing Anomalies: Clicking “Test Slow Endpoint” forces a mock processing lag of one to three seconds. You will see a distinct jump on the Mean Response Latency graph.
- Correlating Errors: Clicking “Trigger Error” returns a simulated error response (400, 500, or 504, chosen at random) and increments the errors_total counter. From there you can jump over to Jaeger or Grafana via the quick links, locate the matching trace, and see exactly which handler failed.
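The same interactions can be scripted when you want a steadier stream of traffic than clicking allows. The sketch below is one way to do that with the standard library. The action names and helper functions are hypothetical, the base URL assumes the application's default port of 8080, and the order payload simply mirrors the fields main.py validates (`user_id` and `items`).

```python
import json
import random
from urllib.error import HTTPError
from urllib.request import Request, urlopen

API_BASE_URL = "http://localhost:8080"  # assumed: same base URL as dashboard.js

# Hypothetical action names mapped to the GET endpoints in main.py
GET_PATHS = {
    "users": "/api/users",
    "orders": "/api/orders",
    "slow": "/api/slow",
    "error": "/api/error",
}


def build_request(action: str, base_url: str = API_BASE_URL) -> Request:
    """Build the HTTP request that corresponds to one dashboard button."""
    if action == "create_order":
        body = json.dumps(
            {"user_id": 1, "items": [{"price": 9.99, "quantity": 2}]}
        ).encode("utf-8")
        return Request(f"{base_url}/api/orders", data=body,
                       headers={"Content-Type": "application/json"},
                       method="POST")
    return Request(f"{base_url}{GET_PATHS[action]}", method="GET")


def send(action: str) -> int:
    """Send one request and return the HTTP status, including error statuses."""
    try:
        with urlopen(build_request(action), timeout=10) as response:
            return response.status
    except HTTPError as exc:
        return exc.code  # 4xx/5xx responses are expected from /api/error


if __name__ == "__main__":
    rng = random.Random(7)
    actions = ["create_order", *GET_PATHS]
    for _ in range(20):
        action = rng.choice(actions)
        try:
            print(action, send(action))
        except OSError as exc:
            print(f"service unreachable: {exc}")
            break
```

Leaving a loop like this running for a minute or two should give Jaeger, Prometheus, and the dashboard enough data to make the three scenarios above easy to spot.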
Bob Can Provide More Than Code
Beyond handling complex architectures, configuration schemas, and frontend wiring, Bob proved to be just as capable when it came to content creation — generating a comprehensive, production-ready blog post draft (building-observability-stack-with-otel.md) alongside the code. He effortlessly translated the technical implementation details into a structured, deep-dive narrative, mapping out everything from the core philosophy of programmatic-free initialization to granular troubleshooting tips and telemetry validation steps. Whether you need a rapid prototype or the documentation to explain it to the community, Bob bridges the gap between engineering and technical writing seamlessly, shrinking a task that would normally take days into a matter of minutes.
Maybe next time I’ll just copy/paste Bob’s content 😂
Conclusion
Ultimately, this practical implementation proves that the era of cluttering application source code with verbose, programmatic SDK initialization boilerplate is officially behind us. By offloading the configuration of traces, metrics, and logs into a single, structured otel-config.yaml file, we have achieved a clean separation of concerns where the application remains entirely focused on business logic, while the OpenTelemetry engine dynamically orchestrates the telemetry pipelines at startup. From mapping out a decoupled architecture to wiring up a real-time Chart.js interactive frontend, every layer of this project demonstrates how accessible enterprise-grade observability can be. With Bob seamlessly bridging the gap between rapid code prototyping and comprehensive technical documentation, this end-to-end demonstration stands as a definitive blueprint for modern, declarative, and developer-friendly system visibility.
Thanks Bob 🤗 and thanks for reading!
Links
- The original article “How to Write Your First OpenTelemetry Declarative Config File with Trace”: https://github.com/open-telemetry/opentelemetry-configuration
- OpenTelemetry Configuration Repository: https://github.com/open-telemetry/opentelemetry-configuration
- The full code for the blog post: https://github.com/aairom/otel-declarative






