Adding observability to your Vercel AI SDK app in 30 seconds

#ai #vercel #javascript #npm

Last week I was debugging a streamText call in my Next.js chatbot and realized I had no idea how many tokens it actually used, what the latency was, or how much it cost — three things I really should know in production.

The Vercel AI SDK emits OpenTelemetry spans natively the moment you flip experimental_telemetry: { isEnabled: true }. The wire is there. You just need something to listen on the other end.

This is the 30-second setup I ended up with.

Install

npm install @voightxyz/vercel-ai @vercel/otel @ai-sdk/otel

Three small packages:

@vercel/otel — Vercel's OpenTelemetry bootstrap for Next.js
@ai-sdk/otel — bridges the AI SDK's telemetry into OpenTelemetry
@voightxyz/vercel-ai — the SpanExporter that sends the captured spans somewhere you can read them (in my case, the Voight dashboard, but this is just a standard OTel SpanExporter — pair it with whatever)
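
To make "a standard OTel SpanExporter" a bit more concrete, here is a minimal sketch of the shape such an exporter has. The SketchSpan, SketchExportResult, and SketchSpanExporter types are simplified local stand-ins I made up so the snippet runs without installing anything (the real interfaces ship with the OpenTelemetry JS SDK), and InMemoryExporter is a hypothetical exporter, not part of any package mentioned here.

```typescript
// Simplified stand-ins for the OpenTelemetry types (assumption: trimmed down
// for illustration; the real ones carry many more fields).
interface SketchSpan {
  name: string
  attributes: Record<string, string | number | boolean | undefined>
}

interface SketchExportResult {
  code: 0 | 1 // 0 = success, 1 = failed
  error?: Error
}

interface SketchSpanExporter {
  export(spans: SketchSpan[], resultCallback: (result: SketchExportResult) => void): void
  shutdown(): Promise<void>
}

// Hypothetical exporter that keeps spans in memory instead of sending them anywhere.
class InMemoryExporter implements SketchSpanExporter {
  readonly received: SketchSpan[] = []
  private stopped = false

  export(spans: SketchSpan[], resultCallback: (result: SketchExportResult) => void): void {
    if (this.stopped) {
      resultCallback({ code: 1, error: new Error('exporter is shut down') })
      return
    }
    this.received.push(...spans)
    resultCallback({ code: 0 })
  }

  async shutdown(): Promise<void> {
    this.stopped = true
  }
}

const memoryExporter = new InMemoryExporter()
const exportCodes: number[] = []
memoryExporter.export(
  [{ name: 'ai.streamText', attributes: { 'ai.model.id': 'gpt-4o-mini' } }],
  (result) => exportCodes.push(result.code),
)
console.log(memoryExporter.received.length, exportCodes)
```

A real exporter does the same thing with a network call in the middle: receive a batch of finished spans, ship them, report success or failure through the callback.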

Register the exporter

In instrumentation.ts (Next.js convention — root of the project, or src/instrumentation.ts if you use src/):

import { registerTelemetry } from 'ai'
import { LegacyOpenTelemetry } from '@ai-sdk/otel'
import { registerOTel } from '@vercel/otel'
import { VoightExporter } from '@voightxyz/vercel-ai'

registerTelemetry(new LegacyOpenTelemetry())

export function register() {
  registerOTel({
    serviceName: 'my-chatbot',
    traceExporter: new VoightExporter({
      agent: 'production-chat-api',
      privacy: 'standard',
    }),
  })
}

That's the whole instrumentation step. Done.

Enable telemetry on your LLM calls

In your route handler:

import { openai } from '@ai-sdk/openai'
import { streamText } from 'ai'

export async function POST(req: Request) {
  const result = streamText({
    model: openai('gpt-4o-mini'),
    prompt: (await req.json()).prompt,
    experimental_telemetry: {
      isEnabled: true,
      functionId: 'stream-text',
    },
  })
  // Stream the generated text back to the client as plain text
  return result.toTextStreamResponse()
}

Now every streamText / generateText / streamObject / generateObject call that sets isEnabled: true carries token counts, model ID, prompt, response text, tool calls, finish reason, and latency, all the way to the exporter.
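
Once token counts arrive on the span, turning them into a dollar figure is plain arithmetic. The sketch below assumes the counts show up under the attribute keys gen_ai.usage.input_tokens and gen_ai.usage.output_tokens and the model under ai.model.id (attribute names have shifted between SDK versions, so check what your spans actually contain). PRICE_TABLE, its model name, and its prices are made-up placeholders, not real pricing.

```typescript
// Hypothetical per-million-token prices in USD. Placeholders, not real pricing.
interface ModelPrice {
  inputPerMillion: number
  outputPerMillion: number
}

const PRICE_TABLE: Record<string, ModelPrice> = {
  'example-small-model': { inputPerMillion: 1, outputPerMillion: 4 },
}

type SpanAttributes = Record<string, string | number | boolean | undefined>

// Returns the estimated cost in USD, or undefined when the model has no price entry.
function estimateCostUsd(attrs: SpanAttributes): number | undefined {
  const price = PRICE_TABLE[String(attrs['ai.model.id'])]
  if (!price) return undefined
  // Assumed attribute keys; missing counts are treated as zero.
  const inputTokens = Number(attrs['gen_ai.usage.input_tokens'] ?? 0)
  const outputTokens = Number(attrs['gen_ai.usage.output_tokens'] ?? 0)
  return (inputTokens * price.inputPerMillion + outputTokens * price.outputPerMillion) / 1_000_000
}

const sampleCost = estimateCostUsd({
  'ai.model.id': 'example-small-model',
  'gen_ai.usage.input_tokens': 2_000,
  'gen_ai.usage.output_tokens': 500,
})
console.log(sampleCost)
```

A hosted backend usually does this lookup for you; the point is only that nothing beyond the span attributes is needed to compute it.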

Bonus: per-user cost attribution in one line

This is the part that surprised me. You can attach arbitrary metadata to the telemetry block, and a good exporter will surface it as searchable tags:

experimental_telemetry: {
  isEnabled: true,
  metadata: {
    userId: session.user.id,
    plan: session.user.plan,
    org: session.user.org,
  },
}

In the Voight dashboard this populates a "Users" sub-tab automatically — you get cost-per-end-user without writing any analytics code. The metadata.userId key is the one that triggers the per-user aggregation; everything else becomes a filterable tag.

If you've ever needed to answer "which of my users is costing me the most?" — this is how. Same pattern works regardless of which observability backend you wire up; it's just ai.telemetry.metadata.<key> span attributes under the hood.
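
If your backend does not do the per-user rollup for you, it is a small aggregation over those attributes. A rough sketch, assuming each call has already been reduced to its span attributes plus a cost figure. The CostedCall type and the costUsd field are hypothetical names for this example:

```typescript
// Hypothetical shape: one entry per LLM call, with a precomputed cost.
interface CostedCall {
  attributes: Record<string, string | undefined>
  costUsd: number
}

// Sums cost per user, reading the user id from the metadata span attribute.
function costByUser(calls: CostedCall[]): Map<string, number> {
  const totals = new Map<string, number>()
  for (const call of calls) {
    // Calls without a userId are grouped under 'unknown'.
    const userId = call.attributes['ai.telemetry.metadata.userId'] ?? 'unknown'
    totals.set(userId, (totals.get(userId) ?? 0) + call.costUsd)
  }
  return totals
}

const userTotals = costByUser([
  { attributes: { 'ai.telemetry.metadata.userId': 'u_1' }, costUsd: 0.25 },
  { attributes: { 'ai.telemetry.metadata.userId': 'u_2' }, costUsd: 0.5 },
  { attributes: { 'ai.telemetry.metadata.userId': 'u_1' }, costUsd: 0.5 },
  { attributes: {}, costUsd: 0.125 },
])

// Most expensive user first.
const ranked = Array.from(userTotals.entries()).sort((a, b) => b[1] - a[1])
console.log(ranked)
```

Grouping by plan or org works the same way, just with a different metadata key.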

What you actually get

After the 3 steps above, every LLM call carries:

| Signal | Where |
| --- | --- |
| Model ID | `model` |
| Provider (`openai`, `anthropic`, …) | `metadata.provider` |
| Prompts and response text | with optional PII scrubbing |
| Token counts (input, output, cache reads) | `metadata.tokens` |
| Tool calls with arguments | `metadata.toolCalls` |
| Streaming flag | `metadata.streaming` |
| Latency | `durationMs` |
| Errors | `outcome: 'failed'` + `errorMessage` |
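
As for what "PII scrubbing" can mean in practice, here is a deliberately naive sketch that redacts email addresses from a prompt before it is stored. This is my own illustration, not how Voight or any other exporter implements its privacy modes. The scrubEmails helper and the pattern are assumptions, and real scrubbing has to cover far more than emails.

```typescript
// Naive email matcher, good enough for a demo and not for compliance.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g

// Replaces every email-looking substring with a fixed placeholder.
function scrubEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, '[redacted-email]')
}

const scrubbedPrompt = scrubEmails('Reply to jane.doe@example.com about the invoice')
console.log(scrubbedPrompt) // Reply to [redacted-email] about the invoice
```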

Why this approach (not a middleware)

The Vercel AI SDK docs list a long row of observability providers (Langfuse, Phoenix, Braintrust, Datadog, Sentry, W&B). They all consume the same OpenTelemetry wire — none of them require a custom middleware that wraps your streamText call.

That's by design. If you commit to a middleware, you've coupled yourself to one vendor's API. If you commit to OpenTelemetry, you can swap exporters whenever your needs change, run several exporters side by side by registering one span processor per exporter, or plug into your existing OTel pipeline without touching your streamText calls.
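
To show why sending the same spans to more than one backend is cheap, here is a sketch of the fan-out idea. In a real setup you would normally let the OpenTelemetry SDK do this by registering one span processor per exporter. FanOutExporter and the FanSpan, FanResult, and FanExporter types are hypothetical stand-ins written for this example.

```typescript
// Simplified stand-ins for the OpenTelemetry exporter types (assumption).
interface FanSpan { name: string }
interface FanResult { code: 0 | 1; error?: Error } // 0 = success, 1 = failed
interface FanExporter {
  export(spans: FanSpan[], done: (result: FanResult) => void): void
}

// Forwards every batch to all targets and reports failure if any target failed.
class FanOutExporter implements FanExporter {
  private readonly targets: FanExporter[]

  constructor(targets: FanExporter[]) {
    this.targets = targets
  }

  export(spans: FanSpan[], done: (result: FanResult) => void): void {
    if (this.targets.length === 0) {
      done({ code: 0 })
      return
    }
    let pending = this.targets.length
    let firstError: Error | undefined
    for (const target of this.targets) {
      target.export(spans, (result) => {
        if (result.code === 1 && !firstError) {
          firstError = result.error ?? new Error('export failed')
        }
        pending -= 1
        // Report once, after the last target has answered.
        if (pending === 0) {
          done(firstError ? { code: 1, error: firstError } : { code: 0 })
        }
      })
    }
  }
}

const seenByA: string[] = []
const seenByB: string[] = []
const fanOut = new FanOutExporter([
  { export: (spans, done) => { seenByA.push(...spans.map((s) => s.name)); done({ code: 0 }) } },
  { export: (spans, done) => { seenByB.push(...spans.map((s) => s.name)); done({ code: 1, error: new Error('backend B is down') }) } },
])

let fanOutCode: number = -1
fanOut.export([{ name: 'ai.generateText' }], (result) => { fanOutCode = result.code })
console.log(seenByA, seenByB, fanOutCode)
```

Both targets receive the batch even though one of them fails, which is the property you want when you are trialing a second backend next to the one you already trust.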

Wrapping up

If you've been putting off adding observability to your AI app because the existing tutorials are 50 lines of setup — try the 30-second version. The Vercel AI SDK gives you the telemetry for free; all you're really doing is pointing it at a backend.

If you're curious about Voight specifically (the exporter I used here), it's Apache 2.0: