<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dangel Jesus Rodríguez</title>
    <description>The latest articles on DEV Community by Dangel Jesus Rodríguez (@seenfinity).</description>
    <link>https://dev.to/seenfinity</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3944854%2F45018f74-5e40-4256-a481-f3e7c5bd6d5c.jpg</url>
      <title>DEV Community: Dangel Jesus Rodríguez</title>
      <link>https://dev.to/seenfinity</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/seenfinity"/>
    <language>en</language>
    <item>
      <title>Per-user cost attribution for your AI APP</title>
      <dc:creator>Dangel Jesus Rodríguez</dc:creator>
      <pubDate>Thu, 21 May 2026 23:39:13 +0000</pubDate>
      <link>https://dev.to/seenfinity/per-user-cost-attribution-for-your-ai-app-16o</link>
      <guid>https://dev.to/seenfinity/per-user-cost-attribution-for-your-ai-app-16o</guid>
      <description>&lt;p&gt;You ship your AI feature. It works. A week later your OpenAI bill is $400 and you have no idea which of your users caused which $0.05.&lt;/p&gt;

&lt;p&gt;This is the single most underrated metric in production LLM apps — &lt;strong&gt;cost per end-user&lt;/strong&gt; — and it's surprisingly easy to instrument if you know what to do.&lt;/p&gt;

&lt;p&gt;Here are the three approaches I've found work in practice, ranked by setup time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 1: Wrap your provider client (5 minutes)
&lt;/h2&gt;

&lt;p&gt;Works for Express, Next.js Route Handlers, Fastify — anything that has a single OpenAI or Anthropic client instance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;wrapOpenAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;withTrace&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@voightxyz/openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;wrapOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;production-chat-api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;withTrace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;routeTag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST /api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trick is &lt;code&gt;withTrace({ tags: { userId } })&lt;/code&gt; at the request boundary. Every LLM call inside the block — direct or nested — inherits those tags automatically via &lt;code&gt;AsyncLocalStorage&lt;/code&gt;. You don't have to thread &lt;code&gt;userId&lt;/code&gt; through every function.&lt;/p&gt;

&lt;p&gt;Pros: simplest. Pros: works with both OpenAI and Anthropic the same way.&lt;br&gt;
Cons: requires you to use the dedicated wrapper SDKs.&lt;/p&gt;
&lt;h2&gt;
  
  
  Approach 2: OpenTelemetry telemetry metadata (Vercel AI SDK)
&lt;/h2&gt;

&lt;p&gt;If you're on the Vercel AI SDK, &lt;code&gt;experimental_telemetry.metadata&lt;/code&gt; is the equivalent hook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@ai-sdk/openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;streamText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;experimental_telemetry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toAIStreamResponse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lifts onto &lt;code&gt;ai.telemetry.metadata.&amp;lt;key&amp;gt;&lt;/code&gt; span attributes that any OpenTelemetry-compatible observability tool (Langfuse, Phoenix, Voight, Braintrust, Datadog) picks up.&lt;/p&gt;

&lt;p&gt;Pros: zero coupling — pure OTel, swap exporters whenever.&lt;br&gt;
Cons: only works if your SDK emits OTel spans. AI SDK does. Many others don't yet.&lt;/p&gt;
&lt;h2&gt;
  
  
  Approach 3: Raw event emission (autonomous bots / non-HTTP)
&lt;/h2&gt;

&lt;p&gt;For background workers, agents calling LLMs in loops, or anything that doesn't have a request boundary — emit events manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Voight&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@voightxyz/sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;voight&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Voight&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;agentId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my-bot&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;t0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.openai.com/v1/chat/completions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...],&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="nx"&gt;voight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reasoning&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;durationMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;t0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completion_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is more code per call, but you control everything. Useful when the LLM call doesn't fit cleanly inside a wrapper (e.g. you're proxying through your own router).&lt;/p&gt;

&lt;p&gt;Pros: full control over what gets emitted.&lt;br&gt;
Cons: more boilerplate. You're responsible for token counting.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can answer once &lt;code&gt;userId&lt;/code&gt; is in your tags
&lt;/h2&gt;

&lt;p&gt;Once &lt;code&gt;tags.userId&lt;/code&gt; (or whatever you name it) is on every event, the questions you can answer change shape:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Question&lt;/th&gt;
&lt;th&gt;How&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Which user costs me the most this month?&lt;/td&gt;
&lt;td&gt;Group by &lt;code&gt;tags.userId&lt;/code&gt;, sum cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Is my free tier subsidising power users?&lt;/td&gt;
&lt;td&gt;Filter by &lt;code&gt;tags.plan: 'free'&lt;/code&gt; + sort by cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Did our last release explode anyone's bill?&lt;/td&gt;
&lt;td&gt;Filter by &lt;code&gt;tags.userId&lt;/code&gt; + date range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What's our cost-to-revenue ratio per customer?&lt;/td&gt;
&lt;td&gt;Join with your Stripe data on &lt;code&gt;userId&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You don't need a separate analytics SDK on the client. You don't need to copy &lt;code&gt;userId&lt;/code&gt; into LLM messages. You don't need anything custom on top — the tags propagate from the request boundary down to every span.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on GDPR / multi-tenant safety
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;userId&lt;/code&gt; here means your internal stable identifier — &lt;code&gt;user_a3f9c2&lt;/code&gt; or whatever — not the user's email or wallet. Never put PII into telemetry metadata. The good observability tools scrub PII anyway, but garbage-in is still garbage.&lt;/p&gt;

&lt;p&gt;For multi-tenant SaaS, add a second tag: &lt;code&gt;tags: { userId, tenantId }&lt;/code&gt;. That way you can ask both "which customer is this?" and "which of their users?".&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;Three approaches, one mental model: stamp &lt;code&gt;userId&lt;/code&gt; at the boundary, let it propagate to every LLM call inside the request.&lt;/p&gt;

&lt;p&gt;The wrappers I used here are Apache 2.0:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.npmjs.com/package/@voightxyz/openai" rel="noopener noreferrer"&gt;@voightxyz/openai&lt;/a&gt; for OpenAI&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.npmjs.com/package/@voightxyz/anthropic" rel="noopener noreferrer"&gt;@voightxyz/anthropic&lt;/a&gt; for Anthropic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.npmjs.com/package/@voightxyz/vercel-ai" rel="noopener noreferrer"&gt;@voightxyz/vercel-ai&lt;/a&gt; for the Vercel AI SDK&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.npmjs.com/package/@voightxyz/sdk" rel="noopener noreferrer"&gt;@voightxyz/sdk&lt;/a&gt; for library mode&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same approach works with &lt;a href="https://langfuse.com" rel="noopener noreferrer"&gt;Langfuse&lt;/a&gt;, &lt;a href="https://phoenix.arize.com" rel="noopener noreferrer"&gt;Phoenix&lt;/a&gt;, &lt;a href="https://braintrust.dev" rel="noopener noreferrer"&gt;Braintrust&lt;/a&gt;, or your existing OTel pipeline — the metadata.userId pattern is the universal part.&lt;/p&gt;




&lt;p&gt;How do you currently track per-user spend in your AI app? Stripe metering? Server logs? Or have you been flying blind?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>saas</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
    <item>
      <title>Adding observability to your Vercel AI SDK app in 30 seconds</title>
      <dc:creator>Dangel Jesus Rodríguez</dc:creator>
      <pubDate>Thu, 21 May 2026 21:33:06 +0000</pubDate>
      <link>https://dev.to/seenfinity/adding-observability-to-your-vercel-ai-sdk-app-in-30-seconds-2n4m</link>
      <guid>https://dev.to/seenfinity/adding-observability-to-your-vercel-ai-sdk-app-in-30-seconds-2n4m</guid>
      <description>&lt;p&gt;Last week I was debugging a &lt;code&gt;streamText&lt;/code&gt; call in my Next.js chatbot and realized I had &lt;strong&gt;no idea&lt;/strong&gt; how many tokens it actually used, what the latency was, or how much it cost — three things I really should know in production.&lt;/p&gt;

&lt;p&gt;The Vercel AI SDK emits OpenTelemetry spans natively the moment you flip &lt;code&gt;experimental_telemetry: { isEnabled: true }&lt;/code&gt;. The wire is there. You just need something to listen on the other end.&lt;/p&gt;

&lt;p&gt;This is the 30-second setup I ended up with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @voightxyz/vercel-ai @vercel/otel @ai-sdk/otel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three small packages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;@vercel/otel&lt;/code&gt; — Vercel's OpenTelemetry bootstrap for Next.js&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@ai-sdk/otel&lt;/code&gt; — bridges the AI SDK's telemetry into OpenTelemetry&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@voightxyz/vercel-ai&lt;/code&gt; — the SpanExporter that sends the captured spans somewhere you can read them (in my case, the Voight dashboard, but this is just a standard OTel SpanExporter — pair it with whatever)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Register the exporter
&lt;/h2&gt;

&lt;p&gt;In &lt;code&gt;instrumentation.ts&lt;/code&gt; (Next.js convention — root of the project, or &lt;code&gt;src/instrumentation.ts&lt;/code&gt; if you use &lt;code&gt;src/&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;registerTelemetry&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LegacyOpenTelemetry&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@ai-sdk/otel&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;registerOTel&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@vercel/otel&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;VoightExporter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@voightxyz/vercel-ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="nf"&gt;registerTelemetry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LegacyOpenTelemetry&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;registerOTel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my-chatbot&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;traceExporter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;VoightExporter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;production-chat-api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;privacy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;standard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole instrumentation step. Done.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enable telemetry on your LLM calls
&lt;/h2&gt;

&lt;p&gt;In your route handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@ai-sdk/openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;streamText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;experimental_telemetry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;functionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;stream-text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toAIStreamResponse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now every &lt;code&gt;streamText&lt;/code&gt; / &lt;code&gt;generateText&lt;/code&gt; / &lt;code&gt;streamObject&lt;/code&gt; / &lt;code&gt;generateObject&lt;/code&gt; call carries token counts, model id, prompt, response text, tool calls, finish reason, and latency — all the way to the exporter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus: per-user cost attribution in one line
&lt;/h2&gt;

&lt;p&gt;This is the part that surprised me. You can attach arbitrary metadata to the telemetry block, and a good exporter will surface it as searchable tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;experimental_telemetry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;org&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;org&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the Voight dashboard this populates a "Users" sub-tab automatically — you get cost-per-end-user without writing any analytics code. The &lt;code&gt;metadata.userId&lt;/code&gt; key is the one that triggers the per-user aggregation; everything else becomes a filterable tag.&lt;/p&gt;

&lt;p&gt;If you've ever needed to answer "which of my users is costing me the most?" — this is how. Same pattern works regardless of which observability backend you wire up; it's just &lt;code&gt;ai.telemetry.metadata.&amp;lt;key&amp;gt;&lt;/code&gt; span attributes under the hood.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you actually get
&lt;/h2&gt;

&lt;p&gt;After the 3 steps above, every LLM call carries:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Where&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Model ID&lt;/td&gt;
&lt;td&gt;&lt;code&gt;model&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Provider (&lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;anthropic&lt;/code&gt;, …)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;metadata.provider&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompts and response text&lt;/td&gt;
&lt;td&gt;with optional PII scrubbing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token counts (input, output, cache reads)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;metadata.tokens&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool calls with arguments&lt;/td&gt;
&lt;td&gt;&lt;code&gt;metadata.toolCalls&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Streaming flag&lt;/td&gt;
&lt;td&gt;&lt;code&gt;metadata.streaming&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency&lt;/td&gt;
&lt;td&gt;&lt;code&gt;durationMs&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Errors&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;outcome: 'failed'&lt;/code&gt; + &lt;code&gt;errorMessage&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why this approach (not a middleware)
&lt;/h2&gt;

&lt;p&gt;The Vercel AI SDK docs list a long row of observability providers (Langfuse, Phoenix, Braintrust, Datadog, Sentry, W&amp;amp;B). They all consume the same OpenTelemetry wire — none of them require a custom middleware that wraps your &lt;code&gt;streamText&lt;/code&gt; call.&lt;/p&gt;

&lt;p&gt;That's by design. If you commit to a middleware, you've coupled to one vendor's API. If you commit to OpenTelemetry, you can swap exporters whenever your needs change, pair multiple exporters via &lt;code&gt;MultiSpanProcessor&lt;/code&gt;, or wire into your existing OTel pipeline at zero cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;If you've been putting off adding observability to your AI app because the existing tutorials are 50 lines of setup — try the 30-second version. The Vercel AI SDK gives you the telemetry for free; all you're really doing is pointing it at a backend.&lt;/p&gt;

&lt;p&gt;If you're curious about Voight specifically (the exporter I used here), it's Apache 2.0:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/@voightxyz/vercel-ai" rel="noopener noreferrer"&gt;&lt;code&gt;@voightxyz/vercel-ai&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs: &lt;a href="https://docs.voight.xyz/ai-apps/vercel-ai" rel="noopener noreferrer"&gt;docs.voight.xyz/ai-apps/vercel-ai&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;What observability backend are you using for your AI SDK apps? Did you go with one of the big OTel-friendly options or roll your own?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vercel</category>
      <category>javascript</category>
      <category>npm</category>
    </item>
  </channel>
</rss>
