AI agents are distributed systems. They fan out across LLM calls, tool invocations, memory lookups, and multi-step reasoning loops — often asynchronously. But until recently, the observability tooling hadn't caught up. You'd get logs, maybe a dashboard, but no trace of what actually happened across a full agent run.
That's the gap Jaeger v2 is positioned to close — and it's not a stretch.
What actually changed in Jaeger v2
Jaeger v2, released in late 2024, didn't just add features: it rebuilt its internal architecture on the OpenTelemetry Collector framework.
What that means in practice:
- Native OTLP ingestion. No more translation layer from OTLP → Jaeger internal format. Telemetry flows in as-is, with no data loss from conversion.
- Single binary, OTel-native config. The old `jaeger-agent` / `jaeger-collector` / `jaeger-ingester` / `jaeger-query` split is gone. One binary, configured via the same YAML model as the OTel Collector (see the config sketch after this list).
- Access to the full OTel Collector ecosystem. Tail-based sampling, span-to-metric connectors, PII filtering processors, Kafka pipelines: all available without Jaeger maintaining separate implementations.
- Tail-based sampling, previously hard to retrofit, is now first-class via the upstream OTel contrib processor.
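To make the single-binary model concrete, here's a minimal config sketch: one Jaeger v2 process receiving OTLP, tail-sampling for errors, and writing to in-memory storage. The key names follow the sample configs in the Jaeger v2 docs at time of writing and may drift between releases, and `some_storage` is an arbitrary label, so treat this as a shape rather than a copy-paste deployment.

```yaml
# Minimal single-binary Jaeger v2 config: OTLP in, tail-sampled, stored in memory.
# "some_storage" is a placeholder label; swap the memory backend for a real one.
service:
  extensions: [jaeger_storage, jaeger_query]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [jaeger_storage_exporter]

extensions:
  jaeger_storage:
    backends:
      some_storage:
        memory:
          max_traces: 100000
  jaeger_query:
    storage:
      traces: some_storage

receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:
  tail_sampling:        # the upstream OTel contrib processor, no Jaeger fork
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]

exporters:
  jaeger_storage_exporter:
    trace_storage: some_storage
```

Note that the sampling policy lives in the same YAML as storage and query: that's the OTel Collector pipeline model doing the work, not Jaeger-specific machinery.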
The architecture shift means Jaeger v2 inherits everything OTel ships — including the new GenAI semantic conventions.
The GenAI conventions: tracing AI agents properly
OpenTelemetry is now actively developing semantic conventions specifically for AI workloads. These define how to represent:
- Model spans — individual LLM inference calls (token counts, model name, latency)
- Agent spans — the higher-level reasoning loops and orchestration steps
- Events — prompt inputs, completions, tool call results
- Metrics — token usage, latency distributions, error rates
And coverage is already provider-specific: OpenAI, Anthropic, AWS Bedrock, and Azure AI Inference all have dedicated conventions. There's even a draft for Model Context Protocol (MCP) — so tool calls via MCP-compatible servers can be traced as first-class spans.
These conventions are still in Development status, but the instrumentation is shipping now. Libraries like LangChain, LlamaIndex, and OpenAI's own SDKs are beginning to emit OTel-compatible telemetry. Jaeger v2 — being natively OTLP — can receive all of it.
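Even without a framework, the conventions are straightforward to emit by hand. Here's a minimal Python sketch that sends one GenAI-convention LLM span to a local Jaeger v2 over OTLP. The attribute names come from the Development-status semconv (so they may still change), and the endpoint, tracer name, model, and token counts are placeholders:

```python
# Minimal sketch: emit one GenAI-convention LLM span over OTLP.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Point the exporter at Jaeger v2's OTLP gRPC receiver (default port 4317).
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo-agent")

# Span name follows the semconv's "{operation} {model}" pattern.
with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.operation.name", "chat")
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    # ... call the model here, then record usage from the response:
    span.set_attribute("gen_ai.usage.input_tokens", 152)
    span.set_attribute("gen_ai.usage.output_tokens", 89)
```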
Why this matters for teams building agents
The classic distributed tracing use case is: trace a request across microservices, find the slow hop, fix it. The AI agent version is: trace a user prompt → agent planning span → LLM call → tool invocation → second LLM call → final response. Across potentially different services, with retries, branching, and non-determinism.
Without proper trace context propagation, this is a black box. With OTel GenAI conventions + Jaeger v2, you get the full picture — latency per LLM call, token consumption, which tool calls fired and how long they took, where the reasoning went sideways.
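In code, that full picture is just span nesting. Continuing with the tracer from the earlier sketch, here's a hypothetical agent run; the `invoke_agent` and `execute_tool` operation names follow the agent-span part of the semconv (still in Development), while the agent and tool names are made up:

```python
# Hypothetical agent run: one root agent span with nested LLM and tool spans.
# Nesting via start_as_current_span is what gives Jaeger the full tree.
with tracer.start_as_current_span("invoke_agent research-agent") as agent:
    agent.set_attribute("gen_ai.operation.name", "invoke_agent")

    with tracer.start_as_current_span("chat gpt-4o"):
        pass  # planning call: model decides which tool to use

    with tracer.start_as_current_span("execute_tool web_search") as tool:
        tool.set_attribute("gen_ai.tool.name", "web_search")
        # tool runs here; retries would appear as sibling spans

    with tracer.start_as_current_span("chat gpt-4o"):
        pass  # synthesis call: model turns tool output into the final answer
```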
That's debugging capability that didn't exist in a standardised form until now.
What to do
- Already on Jaeger v1? Check the v1→v2 migration guide. The architecture shift is real, but the storage backends are backward-compatible.
- Building AI agents? Start instrumenting with OTel GenAI semconv now, even in Development status. You'll be ahead of the curve when it stabilises, and Jaeger v2 will ingest it today.
- Using LangChain/LlamaIndex/OpenAI SDKs? Check their OTel instrumentation status; several already support it or ship experimental packages (see the OpenAI sketch after this list).
- Not on Jaeger? The GenAI conventions are backend-agnostic. Any OTLP-compatible backend (Grafana Tempo, Honeycomb, etc.) can receive this telemetry.
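For the OpenAI SDK specifically, the OTel Python contrib project ships an experimental instrumentation package. Assuming `opentelemetry-instrumentation-openai-v2` is installed (package and import names may change while the semconv is in Development), enabling it is two lines:

```python
# Experimental: auto-instrument the OpenAI client so chat/completion calls
# emit GenAI-convention spans through whatever tracer provider is configured.
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

OpenAIInstrumentor().instrument()
```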
Sources: The New Stack · Jaeger v2 release post · OTel GenAI semantic conventions
✏️ Drafted with KewBot (AI), edited and approved by Drew.