Alexandr Bandurchin for Uptrace

Posted on • Originally published at uptrace.dev

OpenTelemetry OpenAI Instrumentation

By adding OpenTelemetry instrumentation to the OpenAI SDK, you get automatic tracing for every API call — model name, token usage, finish reason, and errors — without modifying your existing OpenAI code. This data flows into any OpenTelemetry APM including Uptrace.

Quick Setup

| Step | Action | Code/Command |
|------|--------|--------------|
| 1. Install | Install the official OTel instrumentation package | `pip install opentelemetry-instrumentation-openai-v2` |
| 2. Instrument | Call `instrument()` before any OpenAI calls | `OpenAIInstrumentor().instrument()` |
| 3. Configure | Point the exporter at your backend | See exporter setup below |
| 4. Run | Make OpenAI calls as normal | Spans captured automatically |

Minimal working example:

from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

# Call once at application startup, before any OpenAI API calls
OpenAIInstrumentor().instrument()

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "What is OpenTelemetry?"}]
)

This single call automatically creates a span for every client.chat.completions.create(), client.embeddings.create(), and other OpenAI SDK operations your code makes.

Two packages exist. opentelemetry-instrumentation-openai-v2 is the official OpenTelemetry implementation (maintained in the opentelemetry-python-contrib repo). The older opentelemetry-instrumentation-openai is a community package by Traceloop/OpenLLMetry — still maintained but not the OTel standard. Use the -v2 package for new projects.

What Gets Captured

The instrumentation populates standard OpenTelemetry GenAI semantic convention attributes on each span:

| Attribute | Example value | Description |
|-----------|---------------|-------------|
| `gen_ai.system` | `openai` | AI provider |
| `gen_ai.operation.name` | `chat` | Operation type |
| `gen_ai.request.model` | `gpt-5` | Requested model |
| `gen_ai.response.model` | `gpt-5` | Actual model version returned by the API |
| `gen_ai.usage.input_tokens` | `142` | Prompt tokens consumed |
| `gen_ai.usage.output_tokens` | `38` | Completion tokens generated |
| `gen_ai.response.finish_reason` | `stop` | Why generation ended |

These attributes follow the OpenTelemetry GenAI semantic conventions and are supported by Uptrace, Datadog (v1.37+), and Grafana. The gen_ai.usage.input_tokens and gen_ai.usage.output_tokens values are the foundation for LLM cost monitoring — multiply them by model price per token to get per-request spend.
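The multiplication is straightforward once the token attributes are on the span. A sketch with a hypothetical price table — the per-million-token rates below are placeholders, not actual OpenAI pricing:

```python
# Hypothetical USD prices per 1M tokens -- substitute current rates for your models.
PRICES = {
    "gpt-5": {"input": 1.25, "output": 10.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate spend for one request from its span's token attributes."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

# Using the example values from the table above (142 input, 38 output tokens):
cost = request_cost("gpt-5", 142, 38)
```

In practice you would run this over the `gen_ai.request.model` and token attributes of exported spans, aggregated per user, feature, or time window.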

Full Setup with OTLP Exporter

Note on Examples: This guide uses Uptrace as the OpenTelemetry backend in code examples. OpenTelemetry is vendor-neutral and works with any OTLP-compatible backend (Jaeger, Grafana Tempo, Prometheus, etc.). Simply replace the endpoint and headers with your preferred backend configuration.

A complete setup configures the SDK, adds a span processor, and instruments OpenAI before your application logic runs:

import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

def setup_tracing():
    exporter = OTLPSpanExporter(
        endpoint=os.environ["OTLP_ENDPOINT"],   # e.g. https://api.uptrace.dev:4317
        headers={"uptrace-dsn": os.environ["UPTRACE_DSN"]},
    )
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)

    OpenAIInstrumentor().instrument()

setup_tracing()

After this, all OpenAI SDK calls in the process produce spans automatically — no changes to existing code required.
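Alternatively, the standard OpenTelemetry SDK environment variables can carry the exporter configuration instead of hardcoding it, provided you construct `OTLPSpanExporter()` without arguments. The values below are placeholders for your own backend:

```shell
export OTEL_SERVICE_NAME=my-llm-app
export OTEL_EXPORTER_OTLP_ENDPOINT=https://api.uptrace.dev:4317
export OTEL_EXPORTER_OTLP_HEADERS="uptrace-dsn=<your-dsn>"
```

This keeps credentials out of source code and lets the same application binary point at different backends per environment.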

Capturing Prompt and Completion Content

By default the instrumentation records token counts and metadata but not the actual message content, to avoid accidentally logging sensitive data. To capture prompts and completions as span events, set the environment variable before starting your application:

OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true

With this enabled, each span gets log events containing the full input messages and the model's response text. Use this in development and debugging environments; evaluate carefully before enabling in production where prompts may contain PII.
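One way to enforce that policy is to set the flag conditionally in startup code, before `instrument()` runs. A sketch assuming a hypothetical `APP_ENV` variable distinguishes your environments:

```python
import os

# Hypothetical APP_ENV convention -- adapt to however your app detects its environment.
# This must run before OpenAIInstrumentor().instrument() reads the variable.
if os.environ.get("APP_ENV", "development") != "production":
    os.environ.setdefault(
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "true"
    )
```

With this in place, developers see full prompts and completions locally while production spans stay limited to metadata and token counts.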

Streaming Responses

The instrumentation handles streaming responses; token counts are recorded once the stream completes. With chat completions, pass `stream_options={"include_usage": True}` so the API reports token usage in the final chunk:

stream = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "List 5 observability best practices"}],
    stream=True,
    stream_options={"include_usage": True},  # include token usage in the final chunk
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# gen_ai.usage.input_tokens and gen_ai.usage.output_tokens are
# set on the span after the stream closes

Adding Custom Attributes

For context beyond what the auto-instrumentation captures — user ID, feature flag, request source — add attributes to the current span:

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def generate_summary(text: str, user_id: str) -> str:
    with tracer.start_as_current_span("summarize") as span:
        span.set_attribute("app.user_id", user_id)
        span.set_attribute("app.input_length", len(text))

        response = client.chat.completions.create(
            model="gpt-5.4-nano",
            messages=[
                {"role": "system", "content": "Summarize the following text."},
                {"role": "user", "content": text},
            ]
        )
        return response.choices[0].message.content

The auto-instrumented OpenAI call appears as a child span under summarize, carrying the gen_ai.* attributes — so you get both application context and model-level detail in the same trace.

Async Support

The instrumentation works with the async OpenAI client without additional configuration:

from openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def async_chat(prompt: str) -> str:
    response = await async_client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

Sending Traces to Uptrace

Uptrace accepts OpenAI traces via OTLP and stores gen_ai.* attributes as queryable fields. Find your DSN under Project Settings → Connection details in the Uptrace dashboard:

from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://api.uptrace.dev:4317",
    headers={"uptrace-dsn": "https://<token>@api.uptrace.dev?grpc=4317"},
)

In Uptrace you can filter traces by gen_ai.request.model, group by model to compare token usage, and sum gen_ai.usage.input_tokens + gen_ai.usage.output_tokens per trace to track consumption over time. If you're using LangChain, combine this setup with the LangChain observability guide to trace chain and agent structure alongside individual OpenAI calls.

JavaScript / TypeScript

The equivalent package for Node.js is @opentelemetry/instrumentation-openai (community maintained):

import { OpenAIInstrumentation } from '@opentelemetry/instrumentation-openai';
import { NodeSDK } from '@opentelemetry/sdk-node';

const sdk = new NodeSDK({
  instrumentations: [new OpenAIInstrumentation()],
});

sdk.start();

The same gen_ai.* attributes are captured on the Node side.
