AI agents fail in ways that logs don't capture. The agent called the right function, got a valid response, then produced the wrong output. By the time you notice, the trace is gone.
OpenTelemetry fixes this. Here's the full setup for a Claude-based agent.
## The Problem With console.log Debugging
A typical agent debugging session:
- User reports wrong output
- You add `console.log` at suspected failure points
- Reproduce the failure (if you can)
- Find the log line, add more logs around it
- Repeat
This works for synchronous code. For agents that run multi-step workflows, call tools in parallel, or execute asynchronously — it breaks down. You can't correlate log lines across steps without request IDs threaded through every call.
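Threading request IDs by hand is possible, but every function on the call path has to cooperate. A minimal sketch of that discipline in plain Node (the step names here are hypothetical), using `AsyncLocalStorage` so the ID survives async boundaries:

```typescript
// Manual correlation: every log call looks up the current request ID.
// AsyncLocalStorage carries it across awaits, but you still have to wrap
// every entry point and remember to use this logger everywhere.
import { AsyncLocalStorage } from 'node:async_hooks'

const requestContext = new AsyncLocalStorage<{ requestId: string }>()

function log(message: string): string {
  const id = requestContext.getStore()?.requestId ?? 'no-request'
  const line = `[${id}] ${message}`
  console.log(line)
  return line
}

async function handleRequest(requestId: string, userMessage: string) {
  // Everything awaited inside this callback sees the same requestId
  return requestContext.run({ requestId }, async () => {
    log(`received: ${userMessage}`)
    await someToolCall()
    return log('done')
  })
}

async function someToolCall() {
  log('tool call started') // correlated without passing requestId explicitly
}
```

This is exactly the bookkeeping that distributed tracing does for you, with timing and hierarchy included.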
OpenTelemetry gives you distributed tracing: every step of agent execution is a span, spans are linked into a trace, and you can visualize the full execution tree.
## Setup: Jaeger + OTEL SDK
Run Jaeger locally:
```bash
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest
```
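Once the container is running, a quick sanity check (assuming the default ports above) confirms the UI is reachable:

```shell
# The Jaeger UI answers on 16686; OTLP/HTTP ingest listens on 4318
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:16686
```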
Install OTEL packages:
```bash
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http @opentelemetry/api
```
Create the tracer setup (load before anything else):
```typescript
// instrumentation.ts
import { NodeSDK } from '@opentelemetry/sdk-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { Resource } from '@opentelemetry/resources'
import { SEMRESATTRS_SERVICE_NAME } from '@opentelemetry/semantic-conventions'

const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: 'claude-agent',
  }),
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318/v1/traces',
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': { enabled: true },
      // Note: '@opentelemetry/instrumentation-fetch' is browser-only; in Node,
      // the undici instrumentation covers the built-in fetch
      '@opentelemetry/instrumentation-undici': { enabled: true },
    }),
  ],
})

sdk.start()
process.on('SIGTERM', () => sdk.shutdown())
```
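The file has to run before any application imports so auto-instrumentation can patch modules first. With compiled JavaScript the usual pattern is a preload flag (the paths here are assumptions about your build layout):

```shell
# Preload the instrumentation before the app entry point
node --require ./dist/instrumentation.js dist/main.js

# Equivalent via environment, handy in Dockerfiles and process managers
NODE_OPTIONS="--require ./dist/instrumentation.js" node dist/main.js
```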
## Instrumenting the Agent
```typescript
// lib/agent/traced-agent.ts
import Anthropic from '@anthropic-ai/sdk'
import { trace, SpanStatusCode } from '@opentelemetry/api'

const tracer = trace.getTracer('claude-agent', '1.0.0')
const client = new Anthropic()

interface Tool {
  name: string
  description: string
  input_schema: object
  execute: (input: unknown) => Promise<unknown>
}

export async function runAgent(userMessage: string, tools: Tool[], sessionId: string) {
  return tracer.startActiveSpan('agent.run', async (rootSpan) => {
    rootSpan.setAttributes({
      'agent.session_id': sessionId,
      'agent.user_message': userMessage.slice(0, 200),
    })

    try {
      const messages: Anthropic.MessageParam[] = [{ role: 'user', content: userMessage }]
      let iteration = 0

      while (iteration < 10) {
        const response = await tracer.startActiveSpan('agent.llm_call', async (llmSpan) => {
          llmSpan.setAttributes({
            'llm.model': 'claude-sonnet-4-6',
            'llm.iteration': iteration,
            'llm.message_count': messages.length,
          })
          try {
            const result = await client.messages.create({
              model: 'claude-sonnet-4-6',
              max_tokens: 4096,
              tools: tools.map(t => ({
                name: t.name,
                description: t.description,
                input_schema: t.input_schema as Anthropic.Tool['input_schema'],
              })),
              messages,
            })
            llmSpan.setAttributes({
              'llm.input_tokens': result.usage.input_tokens,
              'llm.output_tokens': result.usage.output_tokens,
              'llm.stop_reason': result.stop_reason ?? '',
            })
            return result
          } finally {
            // end the span even when the API call throws
            llmSpan.end()
          }
        })

        if (response.stop_reason === 'end_turn') {
          const output = response.content
            .filter(b => b.type === 'text')
            .map(b => (b as Anthropic.TextBlock).text)
            .join('')
          rootSpan.setAttribute('agent.output', output.slice(0, 500))
          rootSpan.setStatus({ code: SpanStatusCode.OK })
          rootSpan.end()
          return output
        }

        const toolUses = response.content.filter(b => b.type === 'tool_use')
        messages.push({ role: 'assistant', content: response.content })

        const toolResults = await Promise.all(
          toolUses.map(async (block) => {
            const toolBlock = block as Anthropic.ToolUseBlock
            const tool = tools.find(t => t.name === toolBlock.name)
            return tracer.startActiveSpan(`agent.tool.${toolBlock.name}`, async (toolSpan) => {
              toolSpan.setAttributes({
                'tool.name': toolBlock.name,
                'tool.input': JSON.stringify(toolBlock.input).slice(0, 500),
              })
              try {
                // fail into an is_error tool_result instead of crashing on an unknown tool name
                if (!tool) throw new Error(`Unknown tool: ${toolBlock.name}`)
                const result = await tool.execute(toolBlock.input)
                toolSpan.setStatus({ code: SpanStatusCode.OK })
                toolSpan.end()
                return {
                  type: 'tool_result' as const,
                  tool_use_id: toolBlock.id,
                  content: JSON.stringify(result),
                }
              } catch (err) {
                toolSpan.setStatus({ code: SpanStatusCode.ERROR, message: String(err) })
                toolSpan.recordException(err as Error)
                toolSpan.end()
                return {
                  type: 'tool_result' as const,
                  tool_use_id: toolBlock.id,
                  content: `Error: ${String(err)}`,
                  is_error: true,
                }
              }
            })
          })
        )

        messages.push({ role: 'user', content: toolResults })
        iteration++
      }

      throw new Error('Max iterations reached')
    } catch (err) {
      rootSpan.setStatus({ code: SpanStatusCode.ERROR, message: String(err) })
      rootSpan.recordException(err as Error)
      rootSpan.end()
      throw err
    }
  })
}
```
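The tool fan-out in the loop (parallel execution, with failures converted into `is_error` tool results so one bad tool never rejects the whole batch) can be exercised on its own. A stripped-down sketch without the SDK, using stub tools:

```typescript
// Stripped-down version of the Promise.all tool dispatch: each call either
// returns a serialized result or an is_error result; nothing rejects.
interface ToolResult {
  type: 'tool_result'
  tool_use_id: string
  content: string
  is_error?: boolean
}

type ToolFn = (input: unknown) => Promise<unknown>

async function dispatchTools(
  calls: { id: string; name: string; input: unknown }[],
  tools: Record<string, ToolFn>,
): Promise<ToolResult[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolResult> => {
      try {
        const tool = tools[call.name]
        if (!tool) throw new Error(`Unknown tool: ${call.name}`)
        const result = await tool(call.input)
        return { type: 'tool_result', tool_use_id: call.id, content: JSON.stringify(result) }
      } catch (err) {
        return {
          type: 'tool_result',
          tool_use_id: call.id,
          content: `Error: ${String(err)}`,
          is_error: true,
        }
      }
    }),
  )
}
```

Keeping errors inside the result objects matters here: the model gets a chance to react to a failed tool call on the next iteration instead of the whole run aborting.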
## What You See in Jaeger
After running a few agent calls, open http://localhost:16686. Select the claude-agent service and pick any trace. You'll see:
```
agent.run (340ms)
├── agent.llm_call [iteration=0] (210ms)
│     input_tokens=847, output_tokens=312
├── agent.tool.search_documents (45ms)
│     query="invoice #1234"
├── agent.tool.get_customer (23ms)
│     customer_id="cust_abc"
├── agent.llm_call [iteration=1] (180ms)
│     input_tokens=1204, output_tokens=89
└── [end_turn]
```
When a tool fails, the span turns red. When the LLM loops unexpectedly, you see the iteration count climb. Token costs per session are visible without any extra instrumentation.
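Turning those `llm.input_tokens` / `llm.output_tokens` attributes into a dollar figure is a one-liner. A sketch with placeholder prices (the rates below are made-up examples, not real Anthropic pricing):

```typescript
// Sum session cost from per-call token attributes.
// EXAMPLE_RATES are placeholder numbers, not real prices.
const EXAMPLE_RATES = { inputPerMTok: 3.0, outputPerMTok: 15.0 }

function sessionCostUsd(spans: { inputTokens: number; outputTokens: number }[]): number {
  return spans.reduce(
    (sum, s) =>
      sum +
      (s.inputTokens / 1_000_000) * EXAMPLE_RATES.inputPerMTok +
      (s.outputTokens / 1_000_000) * EXAMPLE_RATES.outputPerMTok,
    0,
  )
}
```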
## Production Considerations
- **Sample aggressively** — trace 10% of traffic, 100% of errors
- **Redact PII** — never put user content in span attributes; use hashed IDs
- **Set span limits** — truncate `agent.output` to 500 chars to prevent attribute size errors
- **Use baggage for session ID** — propagate `session_id` through async boundaries with `context.with()`
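The "10% of traffic" half of the sampling rule can be done at the head, deterministically from the trace ID. A plain-TypeScript illustration (not the OTEL sampler API; in practice you would configure `TraceIdRatioBasedSampler`, and "100% of errors" needs tail sampling in the collector, since the error is not known when the root span starts):

```typescript
// Head-sampling sketch: derive a uniform value in [0, 1) from the trace ID
// so every service makes the same keep/drop decision for a given trace.
function shouldSample(traceId: string, ratio = 0.1): boolean {
  // Use the low 8 hex digits of the 32-hex-digit trace ID
  const low = parseInt(traceId.slice(-8), 16)
  return low / 0x100000000 < ratio
}
```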
## Full Observability Stack
OpenTelemetry traces + structured logs + Stripe event webhooks give you the complete picture of every agent session. This pattern is built into the Workflow Automator MCP — it adds tracing to any Claude agent running in the IDE.
- Workflow Automator MCP — $15/mo — pre-built OTEL instrumentation for Claude agent loops
- AI SaaS Starter Kit — $99 one-time — full production agent stack with tracing, auth, and billing