Alexandr Bandurchin for Uptrace

Posted on • Originally published at uptrace.dev

OpenTelemetry OpenAI Instrumentation

By adding OpenTelemetry instrumentation to the OpenAI SDK, you get automatic tracing for every API call — model name, token usage, finish reason, and errors — without modifying your existing OpenAI code. This data flows into any OpenTelemetry APM including Uptrace.

Quick Setup

| Step | Action | Code/Command |
|------|--------|--------------|
| 1. Install | Install the official OTel instrumentation package | `pip install opentelemetry-instrumentation-openai-v2` |
| 2. Instrument | Call `instrument()` before any OpenAI calls | `OpenAIInstrumentor().instrument()` |
| 3. Configure | Point the exporter at your backend | See exporter setup below |
| 4. Run | Make OpenAI calls as normal | Spans captured automatically |

Minimal working example:

from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

# Call once at application startup, before any OpenAI API calls
OpenAIInstrumentor().instrument()

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "What is OpenTelemetry?"}]
)

This single call automatically creates a span for every client.chat.completions.create(), client.embeddings.create(), and other OpenAI SDK operations your code makes.

Two packages exist. opentelemetry-instrumentation-openai-v2 is the official OpenTelemetry implementation (maintained in the opentelemetry-python-contrib repo). The older opentelemetry-instrumentation-openai is a community package by Traceloop/OpenLLMetry — still maintained but not the OTel standard. Use the -v2 package for new projects.

What Gets Captured

The instrumentation populates standard OpenTelemetry GenAI semantic convention attributes on each span:

| Attribute | Example value | Description |
|-----------|---------------|-------------|
| `gen_ai.system` | `openai` | AI provider |
| `gen_ai.operation.name` | `chat` | Operation type |
| `gen_ai.request.model` | `gpt-5` | Requested model |
| `gen_ai.response.model` | `gpt-5` | Actual model version returned by the API |
| `gen_ai.usage.input_tokens` | `142` | Prompt tokens consumed |
| `gen_ai.usage.output_tokens` | `38` | Completion tokens generated |
| `gen_ai.response.finish_reason` | `stop` | Why generation ended |

These attributes follow the OpenTelemetry GenAI semantic conventions and are supported by Uptrace, Datadog (v1.37+), and Grafana. The gen_ai.usage.input_tokens and gen_ai.usage.output_tokens values are the foundation for LLM cost monitoring — multiply them by model price per token to get per-request spend.
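The multiplication is straightforward once the token attributes are on the span. A sketch with a hypothetical price table — the per-million-token rates below are placeholders, not actual OpenAI pricing:

```python
# Hypothetical USD prices per 1M tokens -- substitute current rates for your models.
PRICES = {
    "gpt-5": {"input": 1.25, "output": 10.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate spend for one request from its span's token attributes."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

# Using the example values from the table above (142 input, 38 output tokens):
cost = request_cost("gpt-5", 142, 38)
```

In practice you would run this over the `gen_ai.request.model` and token attributes of exported spans, aggregated per user, feature, or time window.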

Full Setup with OTLP Exporter

Note on Examples: This guide uses Uptrace as the OpenTelemetry backend in code examples. OpenTelemetry is vendor-neutral and works with any OTLP-compatible backend (Jaeger, Grafana Tempo, Prometheus, etc.). Simply replace the endpoint and headers with your preferred backend configuration.

A complete setup configures the SDK, adds a span processor, and instruments OpenAI before your application logic runs:

import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

def setup_tracing():
    exporter = OTLPSpanExporter(
        endpoint=os.environ["OTLP_ENDPOINT"],   # e.g. https://api.uptrace.dev:4317
        headers={"uptrace-dsn": os.environ["UPTRACE_DSN"]},
    )
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)

    OpenAIInstrumentor().instrument()

setup_tracing()

After this, all OpenAI SDK calls in the process produce spans automatically — no changes to existing code required.
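Alternatively, the standard OpenTelemetry SDK environment variables can carry the exporter configuration instead of hardcoding it, provided you construct `OTLPSpanExporter()` without arguments. The values below are placeholders for your own backend:

```shell
export OTEL_SERVICE_NAME=my-llm-app
export OTEL_EXPORTER_OTLP_ENDPOINT=https://api.uptrace.dev:4317
export OTEL_EXPORTER_OTLP_HEADERS="uptrace-dsn=<your-dsn>"
```

This keeps credentials out of source code and lets the same application binary point at different backends per environment.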

Capturing Prompt and Completion Content

By default the instrumentation records token counts and metadata but not the actual message content, to avoid accidentally logging sensitive data. To capture prompts and completions as span events, set the environment variable before starting your application:

OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true

With this enabled, each span gets log events containing the full input messages and the model's response text. Use this in development and debugging environments; evaluate carefully before enabling in production where prompts may contain PII.
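One way to enforce that policy is to set the flag conditionally in startup code, before `instrument()` runs. A sketch assuming a hypothetical `APP_ENV` variable distinguishes your environments:

```python
import os

# Hypothetical APP_ENV convention -- adapt to however your app detects its environment.
# This must run before OpenAIInstrumentor().instrument() reads the variable.
if os.environ.get("APP_ENV", "development") != "production":
    os.environ.setdefault(
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "true"
    )
```

With this in place, developers see full prompts and completions locally while production spans stay limited to metadata and token counts.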

Streaming Responses

The instrumentation handles streaming responses; token counts are recorded once the stream completes. With chat completions, pass `stream_options={"include_usage": True}` so the API reports token usage in the final chunk:

stream = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "List 5 observability best practices"}],
    stream=True,
    stream_options={"include_usage": True},  # include token usage in the final chunk
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# gen_ai.usage.input_tokens and gen_ai.usage.output_tokens are
# set on the span after the stream closes

Adding Custom Attributes

For context beyond what the auto-instrumentation captures — user ID, feature flag, request source — add attributes to the current span:

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def generate_summary(text: str, user_id: str) -> str:
    with tracer.start_as_current_span("summarize") as span:
        span.set_attribute("app.user_id", user_id)
        span.set_attribute("app.input_length", len(text))

        response = client.chat.completions.create(
            model="gpt-5.4-nano",
            messages=[
                {"role": "system", "content": "Summarize the following text."},
                {"role": "user", "content": text},
            ]
        )
        return response.choices[0].message.content

The auto-instrumented OpenAI call appears as a child span under summarize, carrying the gen_ai.* attributes — so you get both application context and model-level detail in the same trace.

Async Support

The instrumentation works with the async OpenAI client without additional configuration:

from openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def async_chat(prompt: str) -> str:
    response = await async_client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

Sending Traces to Uptrace

Uptrace accepts OpenAI traces via OTLP and stores gen_ai.* attributes as queryable fields. Find your DSN under Project Settings → Connection details in the Uptrace dashboard:

from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://api.uptrace.dev:4317",
    headers={"uptrace-dsn": "https://<token>@api.uptrace.dev?grpc=4317"},
)

In Uptrace you can filter traces by gen_ai.request.model, group by model to compare token usage, and sum gen_ai.usage.input_tokens + gen_ai.usage.output_tokens per trace to track consumption over time. If you're using LangChain, combine this setup with the LangChain observability guide to trace chain and agent structure alongside individual OpenAI calls.

JavaScript / TypeScript

The equivalent package for Node.js is @opentelemetry/instrumentation-openai (community maintained):

import { OpenAIInstrumentation } from '@opentelemetry/instrumentation-openai';
import { NodeSDK } from '@opentelemetry/sdk-node';

const sdk = new NodeSDK({
  instrumentations: [new OpenAIInstrumentation()],
});

sdk.start();

The same gen_ai.* attributes are captured on the Node side.
