<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrei Popescu</title>
    <description>The latest articles on DEV Community by Andrei Popescu (@andreipopescu).</description>
    <link>https://dev.to/andreipopescu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4007858%2Ff232349d-d416-4e88-94ff-5dffe9fb367e.png</url>
      <title>DEV Community: Andrei Popescu</title>
      <link>https://dev.to/andreipopescu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/andreipopescu"/>
    <language>en</language>
    <item>
      <title>Tracing LLM Requests End-to-End</title>
      <dc:creator>Andrei Popescu</dc:creator>
      <pubDate>Thu, 02 Jul 2026 17:25:33 +0000</pubDate>
      <link>https://dev.to/andreipopescu/tracing-llm-requests-end-to-end-4fg2</link>
      <guid>https://dev.to/andreipopescu/tracing-llm-requests-end-to-end-4fg2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fph6x6aonx4cbl8bgjb4r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fph6x6aonx4cbl8bgjb4r.png" alt="Tracing LLM Requests End-to-End" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Traditional application logs can tell you that an LLM-powered system is running, but they can't tell you if it's working correctly. End-to-end tracing provides the necessary visibility to debug failures, optimize performance, and understand the complex, multi-step execution paths of modern AI applications.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;LLM-powered applications often fail silently. Instead of throwing a 500 error, they return a confident, grammatically perfect, and completely wrong answer. This makes debugging with traditional logs a process of guesswork. When a user gets a bad response, was the cause a poorly formed prompt, a slow database query, a retrieval step that pulled irrelevant context, or a model hallucination? Without a clear view of the application's internal workflow, it's nearly impossible to know.&lt;/p&gt;

&lt;p&gt;This is the problem that distributed tracing solves. By recording the path of a single request as it flows through the various components of an application, tracing transforms an opaque black box into a transparent system. It's an essential practice for building reliable AI, especially for complex Retrieval-Augmented Generation (RAG) pipelines and multi-agent systems.&lt;/p&gt;

&lt;h2&gt;What is an LLM Trace?&lt;/h2&gt;

&lt;p&gt;An LLM trace is a complete, structured record of a single request's journey through your application. It's composed of a hierarchy of timed operations called &lt;strong&gt;spans&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Trace&lt;/strong&gt;: Represents the entire end-to-end execution for a single user request, like a user asking a question to a chatbot. A trace is essentially a collection of all its related spans.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Span&lt;/strong&gt;: Represents a single, discrete unit of work within the trace. In an LLM application, a span could be a call to a vector database, a function that formats a prompt, or an API call to an LLM provider.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each span contains a name, a start and end time, and a rich set of key-value metadata called &lt;strong&gt;attributes&lt;/strong&gt;. These attributes are critical for LLM observability, capturing details like the model name, prompt/completion content, token counts, and temperature settings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F082u4vq258uwwyq0c3cm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F082u4vq258uwwyq0c3cm.png" alt="A stylized, abstract representation of a single timeline branching into several smaller, nested timelines, depicting a t" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This hierarchical structure allows developers to visualize the entire workflow, see the duration of each step, and inspect the specific data that flowed through it. If a RAG application returns an irrelevant answer, a trace can immediately show whether the problem was in the retrieval step (e.g., wrong documents were fetched) or the generation step (e.g., the LLM failed to use the provided context correctly).&lt;/p&gt;
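&lt;p&gt;To make the trace/span hierarchy concrete, here is a minimal sketch that models it with plain Python data structures. It is an illustration only, not the OpenTelemetry API: the &lt;code&gt;Span&lt;/code&gt; class, the &lt;code&gt;render&lt;/code&gt; helper, the timings, and the attribute values are all assumptions made up for this example.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Illustrative span record: a name, timing, attributes, and child spans."""
    name: str
    start_ms: int
    end_ms: int
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


def render(span, depth=0):
    """Return one indented line per span, parents before children."""
    indent = "  " * depth
    lines = [f"{indent}{span.name} ({span.duration_ms} ms) {span.attributes}"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical timings (milliseconds from request start) for one chatbot request.
retrieval = Span("retrieve_documents", 0, 120, {"db.retrieved_count": 3})
generation = Span("generate_response", 120, 950, {"llm.model_name": "gpt-4"})
root = Span("rag_pipeline", 0, 960, children=[retrieval, generation])

trace_lines = render(root)
print("\n".join(trace_lines))
```

&lt;p&gt;Read top to bottom, the output shows the root span first and its two children indented beneath it. In this made-up trace, generation accounts for most of the 960 ms request, and the retrieval span's attributes show how many documents came back.&lt;/p&gt;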

&lt;h2&gt;Why OpenTelemetry is the Standard&lt;/h2&gt;

&lt;p&gt;To make tracing work across different services, languages, and platforms, a standardized approach is necessary. &lt;strong&gt;OpenTelemetry (OTel)&lt;/strong&gt;, a Cloud Native Computing Foundation (CNCF) project, has emerged as the industry standard for instrumenting, generating, and collecting telemetry data. It provides a unified set of APIs and libraries that let you instrument your code once and send the data to any compatible backend.&lt;/p&gt;

&lt;p&gt;OpenTelemetry reduces vendor lock-in and fragmented observability. Before common standards existed, tracing systems often used proprietary propagation headers and data formats, so traces could break at the boundaries between services instrumented with different vendors' tools. OTel addresses this with a shared set of components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;APIs and SDKs&lt;/strong&gt;: For instrumenting code in various languages.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The OTel Collector&lt;/strong&gt;: A flexible component for receiving, processing, and exporting telemetry data.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;OpenTelemetry Protocol (OTLP)&lt;/strong&gt;: A general-purpose protocol for transmitting telemetry data between sources, collectors, and backends.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For LLM applications, projects like &lt;a href="https://opentelemetry.io/blog/2023/openllmetry/" rel="noopener noreferrer"&gt;OpenLLMetry&lt;/a&gt; extend the OpenTelemetry standard with semantic conventions specific to generative AI, ensuring that data like prompt content and token usage are captured consistently.&lt;/p&gt;

&lt;h3&gt;How Context Propagation Works&lt;/h3&gt;

&lt;p&gt;Spans are stitched together across service boundaries by &lt;strong&gt;context propagation&lt;/strong&gt;: every request carries the trace's identifiers with it as it hops between services. The &lt;a href="https://www.w3.org/TR/trace-context/" rel="noopener noreferrer"&gt;W3C Trace Context specification&lt;/a&gt; defines a standard set of HTTP headers that all compliant tools can understand, which solves the interoperability problem.&lt;/p&gt;

&lt;p&gt;The two key headers are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;code&gt;traceparent&lt;/code&gt;: Carries the essential, universally understood context: a version, a unique &lt;code&gt;trace-id&lt;/code&gt;, a &lt;code&gt;parent-id&lt;/code&gt; (the ID of the calling span), and &lt;code&gt;trace-flags&lt;/code&gt; for sampling decisions.&lt;/li&gt;
&lt;li&gt; &lt;code&gt;tracestate&lt;/code&gt;: An optional header that allows different tracing vendors to include their own proprietary information without breaking the trace.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;OpenTelemetry uses W3C Trace Context as its default propagation format, so services instrumented with OTel can join the same distributed trace without custom header handling, as long as both the incoming and outgoing calls are instrumented.&lt;/p&gt;
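&lt;p&gt;As a rough illustration of what a propagator does with the &lt;code&gt;traceparent&lt;/code&gt; header, the sketch below parses an incoming header and builds the header for an outgoing call using only the Python standard library. The function names are hypothetical, and the code handles only the version &lt;code&gt;00&lt;/code&gt; layout described above. In practice the OpenTelemetry SDK does this for you.&lt;/p&gt;

```python
import re
import secrets

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def build_traceparent(trace_id, span_id, sampled=True):
    """Format a version-00 traceparent header value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is malformed."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero ids are invalid.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": int(flags, 16) % 2 == 1,  # lowest bit is the sampled flag
    }


def child_traceparent(incoming):
    """Keep the caller's trace-id, but advertise a fresh span id as the new parent."""
    ctx = parse_traceparent(incoming)
    if ctx is None:
        # No usable context: start a new trace instead of failing the request.
        return build_traceparent(secrets.token_hex(16), secrets.token_hex(8))
    return build_traceparent(ctx["trace_id"], secrets.token_hex(8), ctx["sampled"])


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
print(parse_traceparent(incoming))
print(outgoing)
```

&lt;p&gt;The outgoing header keeps the same &lt;code&gt;trace-id&lt;/code&gt; and sampling flag but carries a new span id. That is how a backend can later reassemble spans from different services into one trace.&lt;/p&gt;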

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F79a5vuvp5bvi2ail7ozr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F79a5vuvp5bvi2ail7ozr.png" alt="A visual metaphor of several services as distinct islands, with glowing light bridges connecting them, representing W3C " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Implementing Tracing in an LLM App&lt;/h2&gt;

&lt;p&gt;Getting started with tracing involves a few key steps.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Choose a Tracing Framework&lt;/strong&gt;: For most teams, this means adopting OpenTelemetry. It's vendor-agnostic and has broad support across languages and frameworks like LangChain and LlamaIndex.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Instrument Your Application&lt;/strong&gt;: Instrumentation is the process of adding code to your application to capture and export trace data.

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Auto-instrumentation&lt;/strong&gt;: Many OpenTelemetry SDKs provide automatic instrumentation for common libraries (e.g., HTTP clients, database drivers, LLM SDKs). This is the fastest way to get started.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Manual Instrumentation&lt;/strong&gt;: For more granular control, you can manually create spans to wrap specific functions or business logic. This allows you to define custom attributes and get deeper visibility into your application's behavior.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Configure an Exporter&lt;/strong&gt;: The instrumented code uses an exporter to send trace data to a backend. The OTLP exporter can send data to an OpenTelemetry Collector or directly to a compatible observability platform.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Select a Backend&lt;/strong&gt;: A backend is where you store, visualize, and analyze your traces. Options range from open-source tracing tools like Jaeger and Zipkin to LLM-focused observability platforms, both commercial and open-source, such as LangSmith, Langfuse, and Arize.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here is a simplified Python example showing manual instrumentation with the OpenTelemetry SDK for a RAG pipeline:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry.sdk.trace&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TracerProvider&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opentelemetry.sdk.trace.export&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConsoleSpanExporter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SimpleSpanProcessor&lt;/span&gt;

&lt;span class="c1"&gt;# Configure the tracer to print to the console
&lt;/span&gt;&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_tracer_provider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TracerProvider&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_tracer_provider&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add_span_processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;SimpleSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ConsoleSpanExporter&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_tracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_as_current_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retrieve_documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db.query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# In a real app, this would query a vector database
&lt;/span&gt;        &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Document about &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db.retrieved_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_as_current_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generate_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Query: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Context: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm.prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm.model_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# In a real app, this would call an LLM API
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This is a generated answer about &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm.response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;rag_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_as_current_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rag_pipeline_trace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;parent_span&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parent_span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user.query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;final_answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;final_answer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;rag_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is distributed tracing?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Tracing Beyond the Basics: Multi-Agent Systems&lt;/h2&gt;

&lt;p&gt;As applications evolve from simple RAG pipelines to complex, multi-agent systems, robust tracing becomes even more important. In an agentic workflow, an initial user request can trigger a cascade of interactions between different agents, tools, and API calls. Distributed tracing is one of the most practical ways to visualize these causal chains and understand how an initial prompt leads to a series of handoffs and tool executions.&lt;/p&gt;

&lt;p&gt;By instrumenting each agent and tool call as a span, developers can debug non-deterministic behaviors, optimize token usage across an entire fleet of agents, and pinpoint the root cause of failures in complex, emergent workflows.&lt;/p&gt;
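&lt;p&gt;The kind of analysis this enables can be sketched with a few lines of plain Python. The example assumes span records have already been exported as flat dictionaries linked by parent ids. The field names (&lt;code&gt;agent&lt;/code&gt;, &lt;code&gt;tokens&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;) and all values are hypothetical, not a real backend's schema.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical flat span records for a single multi-agent trace.
spans = [
    {"id": "a", "parent": None, "name": "planner_agent", "agent": "planner", "tokens": 420, "error": False},
    {"id": "b", "parent": "a", "name": "search_tool", "agent": "researcher", "tokens": 0, "error": False},
    {"id": "c", "parent": "a", "name": "researcher_llm_call", "agent": "researcher", "tokens": 1310, "error": False},
    {"id": "d", "parent": "c", "name": "summarizer_llm_call", "agent": "summarizer", "tokens": 560, "error": True},
]


def tokens_by_agent(spans):
    """Sum token usage per agent across every span in the trace."""
    totals = defaultdict(int)
    for span in spans:
        totals[span["agent"]] += span["tokens"]
    return dict(totals)


def causal_chain(spans, span_id):
    """Walk parent links from a span back to the root; return names root-first."""
    by_id = {span["id"]: span for span in spans}
    chain = []
    current = by_id.get(span_id)
    while current is not None:
        chain.append(current["name"])
        current = by_id.get(current["parent"])
    return list(reversed(chain))


usage = tokens_by_agent(spans)
failed = [span for span in spans if span["error"]]
failure_path = causal_chain(spans, failed[0]["id"])

print(usage)
print(" / ".join(failure_path))
```

&lt;p&gt;Here the failing summarizer call is traced back through the researcher's LLM call to the planner, and the per-agent totals show where the tokens went. Observability backends typically offer this kind of view out of the box.&lt;/p&gt;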

&lt;p&gt;Tracing is no longer a "nice-to-have" for LLM applications; it is a foundational component of a modern observability stack. It provides the ground truth needed to move from guessing to knowing, enabling teams to build, deploy, and scale reliable AI products with confidence.&lt;/p&gt;

&lt;h2&gt;Sources&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://opentelemetry.io/" rel="noopener noreferrer"&gt;OpenTelemetry Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://www.w3.org/TR/trace-context/" rel="noopener noreferrer"&gt;W3C Trace Context Specification&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://www.traceloop.com/docs/openllmetry/traces-and-spans" rel="noopener noreferrer"&gt;Understanding Traces and Spans in LLM Applications | Traceloop&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://fastio.com/blog/ai-agent-distributed-tracing/" rel="noopener noreferrer"&gt;AI Agent Distributed Tracing: The Complete Guide | Fastio&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>observability</category>
      <category>opentelemetry</category>
      <category>llm</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
