<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergiy Dybskiy</title>
    <description>The latest articles on DEV Community by Sergiy Dybskiy (@sergical).</description>
    <link>https://dev.to/sergical</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3953361%2Fb181fe5e-334c-4297-a081-54d48ebe515f.jpeg</url>
      <title>DEV Community: Sergiy Dybskiy</title>
      <link>https://dev.to/sergical</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sergical"/>
    <language>en</language>
    <item>
      <title>Errors, traces, logs, metrics: when to reach for what</title>
      <dc:creator>Sergiy Dybskiy</dc:creator>
      <pubDate>Mon, 08 Jun 2026 20:33:37 +0000</pubDate>
      <link>https://dev.to/sentry/errors-traces-logs-metrics-when-to-reach-for-what-3j5f</link>
      <guid>https://dev.to/sentry/errors-traces-logs-metrics-when-to-reach-for-what-3j5f</guid>
      <description>&lt;p&gt;When should I reach for a log, a trace, or a metric? I hit that question constantly when I instrument code, and I watch coding agents hit it too. It sounds like it should be obvious. Errors, traces, logs, and metrics are the four kinds of telemetry most apps run on, four tools in one box, and they overlap enough that the honest answer is every developer’s favourite: &lt;em&gt;it depends&lt;/em&gt;. You can stuff context into span attributes instead of logging it. You can count log events instead of emitting a metric. You can add a duration to a log and call it a span.&lt;/p&gt;

&lt;p&gt;[I had a Spider-Man meme here, but legal told me it would be infringing, so I removed it]&lt;/p&gt;

&lt;p&gt;But the fact that you &lt;em&gt;can&lt;/em&gt; doesn’t mean you &lt;em&gt;should&lt;/em&gt;. Each signal exists because it answers a different question, and feeds a different workflow once it lands. Left without solid guidelines, the default is to reach for whatever’s most familiar or already there, and miss what the other kinds are for.&lt;/p&gt;

&lt;p&gt;This post is the guidance I wanted to have, for myself and my robots. Want just the skill? Skip to the end.&lt;/p&gt;

&lt;p&gt;In Sentry, errors, traces, logs, and metrics all come from one SDK, included on every plan. Errors and &lt;a href="https://sentry.io/product/tracing/" rel="noopener noreferrer"&gt;tracing&lt;/a&gt; have been around for years (&lt;a href="https://blog.sentry.io/the-story-of-sentry/" rel="noopener noreferrer"&gt;2012&lt;/a&gt; and &lt;a href="https://blog.sentry.io/see-slow-faster-with-performance-monitoring/" rel="noopener noreferrer"&gt;2020&lt;/a&gt;), &lt;a href="https://sentry.io/product/logs/" rel="noopener noreferrer"&gt;structured logs landed last year&lt;/a&gt;, and &lt;a href="https://sentry.io/product/metrics/" rel="noopener noreferrer"&gt;Application Metrics&lt;/a&gt; completed the set back in May of this year. If you’ve had your application instrumented with Sentry for a while, errors and traces are probably already flowing, with logs and metrics left as tools for you to complete your telemetry story.&lt;/p&gt;

&lt;h2&gt;
  
  
  Errors, traces, logs, metrics: one question each
&lt;/h2&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://docs.sentry.io/product/issues/" rel="noopener noreferrer"&gt;Errors&lt;/a&gt;: “What just broke?”
&lt;/h4&gt;

&lt;p&gt;A stack trace and an exception type, grouped into an Issue that gets deduplicated, assigned, and tracked until it’s resolved. If your code threw an exception, it’s an error.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://docs.sentry.io/product/explore/trace-explorer/" rel="noopener noreferrer"&gt;Traces&lt;/a&gt;: “Did the request flow the way it was supposed to?”
&lt;/h4&gt;

&lt;p&gt;A trace is a waterfall of timed spans. It’s how you follow a request across your services and see where the time went: the DB query that dragged, the API call that timed out, the LLM tool call that took 8 seconds instead of 200ms.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://docs.sentry.io/product/explore/metrics/" rel="noopener noreferrer"&gt;Metrics&lt;/a&gt;: “How’s this trending over time?”
&lt;/h4&gt;

&lt;p&gt;Counters, gauges, and distributions, each kept as an individual measurement, so you can slice by any attribute and drill from an aggregate back into the samples (and the trace) behind it. Not just “12,000 checkouts this week,” but 8,400 from the US, 2,600 from the EU, and 1,000 from everywhere else, and how that line moved across the last deploy. Metrics are a historical signal as much as a right-now one, which makes them a natural fit for dashboards and alerts (though you can set up alerts on pretty much every signal in Sentry).&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://docs.sentry.io/product/explore/logs/" rel="noopener noreferrer"&gt;Logs&lt;/a&gt;: “What was happening at this point in the code?”
&lt;/h4&gt;

&lt;p&gt;The state of the system at one specific moment, captured as a structured event: config values, feature flags, the inputs and outputs of a function, the user ID. Logs are the trail through a function’s decision tree: the markers you drop at the points where the code makes a choice, so that later, a human or an agent can follow the reasoning. They fill in the &lt;em&gt;why&lt;/em&gt; once errors and traces have told you what broke and where the time went.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real(ish) world example
&lt;/h2&gt;

&lt;p&gt;Let’s say you run a storefront with a React frontend and a Python API. Support starts forwarding tickets: for a chunk of logged-in customers, the product recommendations on the account page look generic, showing bestsellers instead of the personalized picks they’re used to. The vibes are off.&lt;/p&gt;

&lt;h3&gt;
  
  
  Did anything crash?
&lt;/h3&gt;

&lt;p&gt;First place I’d look is Issues. No exception in the React app, no failed request, every call to &lt;code&gt;/recommendations/{user_id}&lt;/code&gt; came back 200. As far as error tracking is concerned, the app is perfectly healthy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Was anything slow, or did the request go off-path?
&lt;/h3&gt;

&lt;p&gt;Pull a trace for one of the affected requests. The route and the database queries are auto-instrumented; I added a few &lt;a href="https://docs.sentry.io/platforms/python/tracing/instrumentation/custom-instrumentation/#add-spans-to-a-transaction" rel="noopener noreferrer"&gt;named spans&lt;/a&gt; for the recommendation steps:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Frecommendations-trace-waterfall.DhViXsUW.png%26w%3D1920%26q%3D100" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Frecommendations-trace-waterfall.DhViXsUW.png%26w%3D1920%26q%3D100" alt="An affected request's trace in Sentry: an http.server span for the GET /recommendations route over child spans for the user lookup, the ranking_v2 flag check, the empty recommendations_v2 query, the fallback to popular items, and ranking." width="1600" height="276"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The request loaded the user, evaluated the &lt;code&gt;ranking_v2&lt;/code&gt; flag, queried &lt;code&gt;recommendations_v2&lt;/code&gt;, fell back to popular items, and ranked them. The path is right and the timing’s fine. That &lt;code&gt;recommendations_v2&lt;/code&gt; query &lt;em&gt;succeeded&lt;/em&gt; (returning zero rows is a perfectly successful query), so the code did what it was built to do and fell back. The trace tells me the request flowed as designed. It can’t tell me the design just quietly failed this user. On the surface, everything is fine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can we dig a little deeper?
&lt;/h3&gt;

&lt;p&gt;Search the logs for the user from the ticket, and the structured log from inside the handler will give you the state at the moment it decided to fall back.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Frecommendations-logs-user-search.BpBrroFS.png%26w%3D1920%26q%3D100" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Frecommendations-logs-user-search.BpBrroFS.png%26w%3D1920%26q%3D100" alt="The recommendations lookup log for user.id 124 in Sentry, expanded to show its attributes: the ranking_v2 flag is on, source_table is recommendations_v2, candidate_count is 0, and outcome is fallback." width="1600" height="928"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This user got bucketed into the &lt;code&gt;ranking_v2&lt;/code&gt; feature flag, which reads personalized picks from a new &lt;code&gt;recommendations_v2&lt;/code&gt; table. The table shipped, but the rows were never backfilled, so the lookup came back empty. To the code, an empty result is a perfectly valid “no personalized recs for this user,” the same thing a brand-new user with no history would get. So it falls back to bestsellers and returns 200.&lt;/p&gt;

&lt;p&gt;Why not just attach this data to the span? You could set &lt;code&gt;outcome&lt;/code&gt; and &lt;code&gt;candidate_count&lt;/code&gt; as span attributes. But traces might be sampled, and the one request a customer is complaining about &lt;em&gt;usually&lt;/em&gt; ends up being the one that’s sampled out (at least with my luck). A span attribute is great for reading a trace you’ve found; it can’t help you find one that was never kept. Logs aren’t sampled.&lt;/p&gt;

&lt;h3&gt;
  
  
  How many people hit it?
&lt;/h3&gt;

&lt;p&gt;One affected customer is a support ticket. Knowing whether it’s a small subset of users or a significant chunk is the difference between fixing it Monday and paging someone tonight. A &lt;code&gt;recommendations.served&lt;/code&gt; counter, tagged with &lt;code&gt;ranking_version&lt;/code&gt; and &lt;code&gt;outcome&lt;/code&gt;, draws the line:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Frecommendations-metric-rate.D44k6wab.png%26w%3D1920%26q%3D100" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Frecommendations-metric-rate.D44k6wab.png%26w%3D1920%26q%3D100" alt="Sentry's Application Metrics explorer showing the recommendations.served counter with two queries (one filtered to outcome:personalized, one for the total) and an equation A / B * 100 grouped by ranking_version, producing a personalized rate of 97.9% for v1 and 3.3% for v2." width="1600" height="1030"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The v2 path is serving almost nothing but fallbacks, v1 is normal, and the drop lines up with the flag rollout. Scope and trigger, without opening a single trace.&lt;/p&gt;
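&lt;p&gt;To make the arithmetic behind that chart concrete, here is a minimal sketch of the same &lt;code&gt;A / B * 100&lt;/code&gt; equation in plain Python. The sample counts are hypothetical, picked only so the result matches the rates in the screenshot, and &lt;code&gt;personalized_rate&lt;/code&gt; is an illustrative helper, not part of any SDK.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical counter samples: (ranking_version, outcome, count).
# The counts are made up to reproduce the rates shown in the screenshot.
SAMPLES = [
    ("v1", "personalized", 979),
    ("v1", "fallback", 21),
    ("v2", "personalized", 33),
    ("v2", "fallback", 967),
]


def personalized_rate(samples):
    """Return {ranking_version: percent of requests served personalized}."""
    personalized = Counter()  # query A: outcome:personalized
    total = Counter()         # query B: every outcome
    for version, outcome, count in samples:
        total[version] += count
        if outcome == "personalized":
            personalized[version] += count
    # Equation A / B * 100, grouped by ranking_version.
    return {v: round(personalized[v] / total[v] * 100, 1) for v in total}


rates = personalized_rate(SAMPLES)
print(rates)
```

&lt;p&gt;The grouping only works because the counter was tagged with both &lt;code&gt;ranking_version&lt;/code&gt; and &lt;code&gt;outcome&lt;/code&gt; when it was emitted; without those attributes there is nothing to slice by.&lt;/p&gt;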

&lt;p&gt;No one signal cracked it; each ruled something out. No Issues in the feed meant it wasn’t a crash. The metric said it wasn’t a one-off: the whole &lt;code&gt;v2&lt;/code&gt; cohort was falling back. The trace, where one was sampled, showed the path running exactly as designed, which is why it slipped through. The log, pulled up by the &lt;code&gt;user_id&lt;/code&gt; from the ticket, said &lt;em&gt;why&lt;/em&gt;, and I never needed the trace to get to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to reach for what
&lt;/h2&gt;

&lt;p&gt;I use this as a gut check:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What you want to know&lt;/th&gt;
&lt;th&gt;Reach for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Something crashed, show the stack trace&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Errors&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;How long did this take? Which step was slow?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Traces&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Did the request flow through the steps I expected?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Traces&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What was the state when the code made this decision?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Logs&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What did this function receive and return?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Logs&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;How often does X happen? Is the rate normal?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Metrics&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Did something change after the deploy?&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Metrics&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The tricky cases are the overlaps, where the same value can reasonably show up in more than one signal.&lt;/p&gt;

&lt;h4&gt;
  
  
  Span attribute or metric?
&lt;/h4&gt;

&lt;p&gt;If it’s context about &lt;em&gt;one request’s flow through the system&lt;/em&gt; and you want it while reading that trace, it’s a span attribute. It rides on the span in the waterfall. If it’s a standalone value you want to chart, alert on, or slice over time across &lt;em&gt;all&lt;/em&gt; requests, it’s a metric. The same number can warrant both: &lt;code&gt;candidate_count&lt;/code&gt; as a span attribute lets me read one request; &lt;code&gt;recommendations.served&lt;/code&gt; as a metric lets me watch the rate. One is for inspecting a single flow, the other for watching the aggregate.&lt;/p&gt;

&lt;h4&gt;
  
  
  Log or span?
&lt;/h4&gt;

&lt;p&gt;The span is the timed node in the flow, and most of them are auto-instrumented, so you rarely write them. The log is the decision-point state &lt;em&gt;inside&lt;/em&gt; that node, and you always write it on purpose. Span answers &lt;em&gt;where&lt;/em&gt; and &lt;em&gt;how long&lt;/em&gt;; log answers &lt;em&gt;what was true and why&lt;/em&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Log or metric?
&lt;/h4&gt;

&lt;p&gt;A log is one request’s story, the needle. A metric is the aggregate, the question of whether the haystack is normal. When you want to find the specific request that went wrong, that’s a log. When you want to know how many requests went wrong, that’s a metric.&lt;/p&gt;

&lt;h4&gt;
  
  
  Error or log?
&lt;/h4&gt;

&lt;p&gt;If it needs a stack trace and should be tracked as an Issue, it’s an error. If it’s an unexpected-but-handled condition worth recording, it’s a log. If it’s truly non-critical, calling &lt;code&gt;logger.warning("recs lookup failed", exc_info=True)&lt;/code&gt; from inside the &lt;code&gt;except&lt;/code&gt; block captures the traceback in logs without creating noise in your error feed.&lt;/p&gt;
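&lt;p&gt;As a rough illustration of the handled-but-worth-recording case, the sketch below uses Python’s standard &lt;code&gt;logging&lt;/code&gt; module. The &lt;code&gt;load_recent_views&lt;/code&gt; and &lt;code&gt;recent_views_or_empty&lt;/code&gt; helpers are hypothetical, and whether these records reach Sentry as logs depends on how the SDK’s logging support is configured in your project.&lt;/p&gt;

```python
import io
import logging

# Capture output in memory so the example is self-contained.
buffer = io.StringIO()
logger = logging.getLogger("recommendations")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(buffer))
logger.propagate = False


def load_recent_views(user_id):
    """Hypothetical helper that always fails, for the sake of the example."""
    raise TimeoutError("recent-views cache timed out")


def recent_views_or_empty(user_id):
    try:
        return load_recent_views(user_id)
    except TimeoutError:
        # Handled and non-critical: record the traceback as a log,
        # then carry on with a safe default instead of raising.
        logger.warning("recent views unavailable for user %s", user_id, exc_info=True)
        return []


views = recent_views_or_empty(124)
print(buffer.getvalue())
```

&lt;p&gt;Note that &lt;code&gt;logger.warning&lt;/code&gt; needs a message as its first argument; &lt;code&gt;exc_info=True&lt;/code&gt; only attaches the traceback of the exception currently being handled.&lt;/p&gt;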

&lt;h2&gt;
  
  
  What the instrumentation looks like
&lt;/h2&gt;

&lt;p&gt;Everything above came out of one endpoint: the &lt;code&gt;GET /recommendations/{user_id}&lt;/code&gt; route from the walkthrough, the function that loads the user, checks the &lt;code&gt;ranking_v2&lt;/code&gt; flag, queries &lt;code&gt;recommendations_v2&lt;/code&gt;, and falls back to popular items when it comes back empty. Here’s that same handler with the instrumentation in place.&lt;/p&gt;

&lt;p&gt;Most of it you don’t write. The FastAPI integration traces the request and the database integration traces every query, so you get the path and the timing without a single hand-written span.&lt;/p&gt;

&lt;p&gt;What you do place by hand are the deliberate signals: a span attribute or two to enrich the flow, the decision-point log, and the metric.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sentry_sdk&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentry_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;

&lt;span class="c1"&gt;# The route is auto-instrumented. FastAPI gives you the request span;
# the DB integration gives you a span for every query below. You write none of it.
&lt;/span&gt;&lt;span class="nd"&gt;@app.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/recommendations/{user_id}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_recommendations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                          &lt;span class="c1"&gt;# auto-instrumented db span
&lt;/span&gt;    &lt;span class="n"&gt;use_v2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;flag_enabled&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ranking_v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ranking_version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;use_v2&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;personalized_recs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ranking_version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# auto db span
&lt;/span&gt;    &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;personalized&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fallback&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;popular_items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;             &lt;span class="c1"&gt;# auto db span on the fallback
&lt;/span&gt;
    &lt;span class="c1"&gt;# SPAN ATTRIBUTE: context about THIS request's flow, read inside the trace.
&lt;/span&gt;    &lt;span class="c1"&gt;# It rides on the auto-instrumented request span; no new span needed.
&lt;/span&gt;    &lt;span class="n"&gt;span&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sentry_sdk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_current_span&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# None when no span is active
&lt;/span&gt;        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ranking_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ranking_version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recommendation.outcome&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# LOG: the trail through the decision tree, the state at the moment the
&lt;/span&gt;    &lt;span class="c1"&gt;# code chose personalized vs. fallback. The only signal that records *why*.
&lt;/span&gt;    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recommendations lookup&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ranking_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ranking_version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flag.ranking_v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;use_v2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_table&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recommendations_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ranking_version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;candidate_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outcome&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# METRIC: the rate across all requests, sliceable by version and outcome.
&lt;/span&gt;    &lt;span class="n"&gt;sentry_sdk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recommendations.served&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ranking_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ranking_version&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outcome&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three deliberate touches, each carrying a piece the others can’t. The span attribute tags the request’s flow with the ranking path so it’s right there when I open the trace. The log records what the function decided and why, at the instant it decided. The metric counts the outcome with enough dimension to slice it later.&lt;/p&gt;

&lt;p&gt;If you &lt;em&gt;do&lt;/em&gt; want a sub-operation timed in the waterfall (say the ranking step, or a call to an external recommender), you can wrap it in a custom span with &lt;a href="https://docs.sentry.io/platforms/python/tracing/instrumentation/custom-instrumentation/" rel="noopener noreferrer"&gt;&lt;code&gt;sentry_sdk.start_span&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Beyond what you write, the SDK fills in even more on its own. Frontend SDKs tag everything with the browser, OS, and release. Call &lt;code&gt;sentry_sdk.set_user()&lt;/code&gt; once and that user follows the errors, spans, logs, and metrics for the request. And because all four come from the same SDK, they share a &lt;code&gt;trace_id&lt;/code&gt; and correlate on their own: every log carries the trace it belongs to, and you can jump from a metric spike straight into the traces behind it, without gluing four vendors together to get there.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Ftrace-connected-waterfall.NzzCgsnV.png%26w%3D1920%26q%3D100" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Ftrace-connected-waterfall.NzzCgsnV.png%26w%3D1920%26q%3D100" alt="Sentry trace view for the GET /recommendations route: the http.server route span and the database query spans are auto-instrumented, alongside a few custom spans for the recommendation steps, with Waterfall, Logs, and Application Metrics tabs all hanging off the same trace." width="1600" height="832"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All of this is ready for you to use and included in every plan. The deliberate signals (the span attributes, the decision-point logs, the metrics) are the ones you place yourself, and they only help if you do it ahead of time, at the spots where your code makes a decision worth questioning later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Right tool for the job
&lt;/h2&gt;

&lt;p&gt;The split above isn’t just conceptual. It’s baked into the APIs, and each one is tuned for its job. The &lt;strong&gt;Metrics API&lt;/strong&gt; is built for emitting counts and measures you’ll aggregate. The &lt;strong&gt;span API&lt;/strong&gt; is built for measuring durations and the shape of a request. The &lt;strong&gt;log API&lt;/strong&gt; integrates with your favourite &lt;a href="https://sentry.io/product/logs/" rel="noopener noreferrer"&gt;structured logging&lt;/a&gt; library, so the lines you already write become queryable events. Reaching for the API that matches the workflow usually means reaching for the one that matches the &lt;em&gt;kind&lt;/em&gt; of value you have: a count, a duration, or a moment.&lt;/p&gt;

&lt;p&gt;Sampling falls out of the same logic. Traces are best as a &lt;a href="https://docs.sentry.io/platforms/python/tracing/configure-sampling/" rel="noopener noreferrer"&gt;&lt;em&gt;sampled representation&lt;/em&gt;&lt;/a&gt; of your traffic: you don’t need every request to understand where time goes, so a percentage is plenty (and cheaper). Logs are the opposite: you keep all of them, because the entire point is to find the one rare request that went sideways, and you can’t find what you sampled away. Metrics aren’t sampled either; as with logs, you trim them by filtering in a callback rather than by sampling, in this case &lt;a href="https://docs.sentry.io/platforms/python/metrics/#before_send_metric" rel="noopener noreferrer"&gt;&lt;code&gt;before_send_metric&lt;/code&gt;&lt;/a&gt;. Match the retention to the question: a representative sample for “where does time go,” every single event for “what happened to &lt;em&gt;this&lt;/em&gt; request.”&lt;/p&gt;
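&lt;p&gt;A filter callback for metrics might look roughly like the sketch below. It assumes the metric handed to the callback behaves like a dict with a &lt;code&gt;name&lt;/code&gt; key and that returning &lt;code&gt;None&lt;/code&gt; drops it; the &lt;code&gt;debug.&lt;/code&gt; prefix is a made-up naming convention, so check the linked docs for the exact shape before relying on it. You would pass the function as &lt;code&gt;before_send_metric&lt;/code&gt; to &lt;code&gt;sentry_sdk.init&lt;/code&gt;, next to &lt;code&gt;traces_sample_rate&lt;/code&gt;.&lt;/p&gt;

```python
# Assumption: the metric passed to the callback behaves like a dict with
# "name" and "attributes" keys. Check the SDK docs for the exact shape.
NOISY_PREFIXES = ("debug.",)  # hypothetical naming convention


def before_send_metric(metric, hint):
    """Drop noisy metrics and pass everything else through unchanged."""
    if metric["name"].startswith(NOISY_PREFIXES):
        return None  # returning None discards the metric
    return metric


kept = before_send_metric({"name": "recommendations.served", "attributes": {}}, {})
dropped = before_send_metric({"name": "debug.cache_probe", "attributes": {}}, {})
print(kept, dropped)
```

&lt;p&gt;The point of filtering instead of sampling is that the decision is deterministic: a metric you care about is either always kept or always dropped, never kept some of the time.&lt;/p&gt;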

&lt;h2&gt;
  
  
  You’re not the only one debugging your codebase anymore
&lt;/h2&gt;

&lt;p&gt;Cody from &lt;a href="https://modem.dev/" rel="noopener noreferrer"&gt;Modem&lt;/a&gt; instrumented his AI agent to find out where it was spending time. He worked with Codex to wrap the async work and the logical chunks (everything that runs before the call to the model, say) in spans. Cache hits and time-to-first-token became metrics he could watch over time. Values that only meant something next to a specific operation stayed as span attributes, and the lightweight “this happened here” markers became logs. The span-attribute-versus-metric call wasn’t always obvious to him; his rule was that if a value only made sense in the context of a span, it lived on the span.&lt;/p&gt;

&lt;p&gt;With the tracing in place, he pointed Codex at the Sentry data through the MCP server, feeding it real runs from his Playwright tests in development, and gave it one goal: optimize the code path. The agent read the spans, found work that could run in parallel, and rewrote the code to stop awaiting results until they were actually needed.&lt;/p&gt;

&lt;p&gt;It could do that because a trace is a structured dependency tree with timing on every node, a format an agent can reason about directly. Hand it the same information as a stream of log lines and it would have to reconstruct the call graph from timestamps and string matching first.&lt;/p&gt;
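&lt;p&gt;As a toy illustration of why that structure helps, the sketch below treats a trace as a list of spans with a parent and a duration, then estimates how much time a parent could save if its children ran concurrently. The span names and timings are hypothetical, and the estimate assumes the children ran one after another and are independent of each other, which the trace alone cannot prove.&lt;/p&gt;

```python
# Hypothetical spans from one trace: each has a parent and a duration in ms.
SPANS = [
    {"id": "root", "parent": None, "name": "agent.run", "ms": 900},
    {"id": "a", "parent": "root", "name": "load_history", "ms": 300},
    {"id": "b", "parent": "root", "name": "fetch_tools", "ms": 250},
    {"id": "c", "parent": "root", "name": "build_prompt", "ms": 350},
]


def children_of(spans, parent_id):
    """Return the direct child spans of the given parent."""
    return [s for s in spans if s["parent"] == parent_id]


def parallel_savings(spans, parent_id):
    """Upper bound on ms saved if the parent's children ran concurrently.

    Sequential cost is the sum of the child durations; concurrent cost is
    the longest single child. The difference is the best-case saving.
    """
    durations = [s["ms"] for s in children_of(spans, parent_id)]
    if not durations:
        return 0
    return sum(durations) - max(durations)


saving = parallel_savings(SPANS, "root")
print(saving)
```

&lt;p&gt;Doing the same with flat log lines would first require rebuilding the parent and child relationships that the spans already carry.&lt;/p&gt;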

&lt;h2&gt;
  
  
  But what about wide events?
&lt;/h2&gt;

&lt;p&gt;There’s a popular argument that the four signals are overkill: emit one rich, wide event per request and derive the rest later. It’s half right.&lt;/p&gt;

&lt;p&gt;Emit wide, absolutely. The best version of any signal is a structured event packed with context (the flag that was on, the user, the inputs and the outputs), not a bare number or a one-line string.&lt;/p&gt;

&lt;p&gt;But the shape you emit is the shape you get to work with. One fat event in a columnar store charts fine after the fact, but it can’t group itself into a deduplicated Issue, render itself as a waterfall, or fire a real-time alert on a threshold you haven’t defined yet. Those are workflows, and each needs its data in a particular shape.&lt;/p&gt;

&lt;p&gt;So emit wide, into the signal whose workflow you actually need. That’s why the handler emits both a metric and a log: same decision, same trace, two shapes, because watching a rate and reconstructing one request are different jobs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;Logs and metrics are the two you probably haven’t turned on yet. They’re relatively new to Sentry, and people are still finding them. Both are included on every plan.&lt;/p&gt;

&lt;p&gt;You don’t have to wire them up by hand. Point your coding agent at &lt;a href="https://skills.sentry.dev/" rel="noopener noreferrer"&gt;Sentry’s setup skills&lt;/a&gt; for your stack and it installs the SDK, turns on tracing, logs, and metrics, and drops instrumentation at the decision points. Then aim it at your Sentry data through the &lt;a href="https://mcp.sentry.dev/" rel="noopener noreferrer"&gt;MCP server&lt;/a&gt; and give it something real: your slowest trace, your newest issue.&lt;/p&gt;

&lt;p&gt;Prefer to grab just the decision framework? It’s a skill of its own:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add getsentry/sentry-for-ai &lt;span class="nt"&gt;--skill&lt;/span&gt; sentry-instrumentation-guide
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The telemetry you emit to debug is the same telemetry your agent reads to help.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on the &lt;a href="https://blog.sentry.io/errors-traces-logs-metrics-when-to-reach-for-what/" rel="noopener noreferrer"&gt;Sentry Blog&lt;/a&gt; by &lt;a href="https://blog.sentry.io/authors/sergiy-dybskiy/" rel="noopener noreferrer"&gt;Sergiy Dybskiy&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>monitoring</category>
      <category>devops</category>
      <category>logging</category>
      <category>observability</category>
    </item>
    <item>
      <title>Your agent can't fix what it can't see</title>
      <dc:creator>Sergiy Dybskiy</dc:creator>
      <pubDate>Thu, 28 May 2026 14:04:58 +0000</pubDate>
      <link>https://dev.to/sentry/your-agent-cant-fix-what-it-cant-see-4391</link>
      <guid>https://dev.to/sentry/your-agent-cant-fix-what-it-cant-see-4391</guid>
      <description>&lt;p&gt;Agents are getting better and better at fixing bugs. They’re even getting better at testing their work, thanks to headless browsers, sandboxes, simulators, etc.&lt;/p&gt;

&lt;p&gt;But what about the bugs that only show up once you bring in different browsers, languages, extensions, internet speeds, and all the other variables that get mixed in the second you ship to prod? Or all the bugs that only show up when you account for… well, humans being humans and doing weird stuff you didn’t expect them to do?&lt;/p&gt;

&lt;p&gt;The bottleneck for self-healing software isn’t agent intelligence. It’s that agents have no idea what actually broke. They’re debugging from source code alone, which is roughly as effective as diagnosing a server outage by skimming the README. What they’re missing is production context: the stack trace, the request payload, the environment, the breadcrumbs leading up to the failure.&lt;/p&gt;

&lt;p&gt;Your agents need someone/something telling them what’s breaking in the wild &lt;em&gt;and&lt;/em&gt; giving them the context they need to understand why.&lt;/p&gt;

&lt;p&gt;We built &lt;a href="https://mcp.sentry.dev/" rel="noopener noreferrer"&gt;Sentry MCP&lt;/a&gt; and the &lt;a href="https://cli.sentry.dev/" rel="noopener noreferrer"&gt;Sentry CLI&lt;/a&gt; to make that context available to both humans and, increasingly just as important, their agents. You can wire up a system today where a Sentry alert triggers an agent, the agent investigates the issue using the same evidence you would, and a draft PR with a fix lands in your repo before you open a browser.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why draft PRs, not auto-merge
&lt;/h2&gt;

&lt;p&gt;Let’s be honest about what’s realistic. A system that detects, fixes, tests, deploys, and monitors its own patches without human involvement is not something you should build today. That’s how you get a very exciting incident review.&lt;/p&gt;

&lt;p&gt;The useful version is more modest: a production error fires, an agent investigates it with real Sentry context, writes a small fix with a regression test, and opens a draft PR. A human is very much in the loop.&lt;/p&gt;

&lt;p&gt;That’s not fully autonomous, but it’s not trivial either. Most bugs sit in a queue, triaged, prioritized, assigned, waiting, and often lose out to new features. Seer diagnoses the root cause in under two minutes. A complete Autofix run, from root cause analysis to an opened PR, takes about six minutes.&lt;/p&gt;

&lt;p&gt;An agent that opens a reviewable, mergeable fix six minutes after the error fires is a meaningful change to your mean time to resolution, even if a human still clicks merge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two ways to give your agent production context
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Sentry MCP&lt;/strong&gt; is the right choice for agents that support the Model Context Protocol (Claude Code, Cursor, Codex, Windsurf, VS Code with Copilot). Your agent connects to the hosted server, authenticates via OAuth, and gets structured access to issues, events, traces, and Seer analysis. No local install required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-liner for any MCP-compatible client&lt;/span&gt;
npx add-mcp https://mcp.sentry.dev/mcp

&lt;span class="c"&gt;# Or for Claude Code specifically&lt;/span&gt;
claude mcp add &lt;span class="nt"&gt;--transport&lt;/span&gt; http sentry https://mcp.sentry.dev/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your client doesn’t support the one-liner, add the config manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sentry"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://mcp.sentry.dev/mcp"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Sentry CLI&lt;/strong&gt; is the right choice for scripted workflows, CI pipelines, or any automation where you need structured output you can pipe to &lt;code&gt;jq&lt;/code&gt; or feed into another process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://cli.sentry.dev/install &lt;span class="nt"&gt;-fsS&lt;/span&gt; | bash
sentry auth login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;sentry issue list

Issues &lt;span class="k"&gt;in &lt;/span&gt;acme/checkout:
╭──────────────┬──────────────────────────────────────────────────────┬──────┬─────┬────────┬───────┬──────────────╮
│ SHORT ID     │ ISSUE                                                │ SEEN │ AGE │ EVENTS │ USERS │ TRIAGE       │
├──────────────┼──────────────────────────────────────────────────────┼──────┼─────┼────────┼───────┼──────────────┤
│ CHECKOUT-P1  │ TimeoutError: Payment charge exceeded 30s            │   3h │  3h │  1.8k  │   340 │ High  86%    │
├──────────────┼──────────────────────────────────────────────────────┼──────┼─────┼────────┼───────┼──────────────┤
│ CHECKOUT-N7  │ TypeError: Cannot &lt;span class="nb"&gt;read &lt;/span&gt;property &lt;span class="s1"&gt;'total'&lt;/span&gt;              │   1d │  5d │    215 │    82 │ High  71%    │
├──────────────┼──────────────────────────────────────────────────────┼──────┼─────┼────────┼───────┼──────────────┤
│ API-34       │ RateLimitError: Too many requests to /v1/charges     │   3d │ 21d │     67 │    24 │ Med   42%    │
╰──────────────┴──────────────────────────────────────────────────────┴──────┴─────┴────────┴───────┴──────────────╯
Tip: Use &lt;span class="s1"&gt;'sentry issue view &amp;lt;ID&amp;gt;'&lt;/span&gt; to view details.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CHECKOUT-P1&lt;/code&gt; is at the top, a timeout in the checkout service with 1.8k events and an 86% fixability score. Drill in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;sentry issue view CHECKOUT-P1

CHECKOUT-P1: TimeoutError: Payment charge exceeded 30s
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
╭────────────┬─────────────────────────────────────────────╮
│ Status     │ ● Unresolved &lt;span class="o"&gt;(&lt;/span&gt;Ongoing&lt;span class="o"&gt;)&lt;/span&gt;                      │
│ Fixability │ High &lt;span class="o"&gt;(&lt;/span&gt;86%&lt;span class="o"&gt;)&lt;/span&gt;                                  │
│ Level      │ error                                       │
│ Platform   │ node                                        │
│ Project    │ checkout-service                            │
│ Events     │ 1832                                        │
│ Users      │ 340                                         │
│ First seen │ 3 hours ago                                 │
│ Last seen  │ 12 minutes ago                              │
│ Culprit    │ chargeCustomer &lt;span class="o"&gt;(&lt;/span&gt;src/payment.ts&lt;span class="o"&gt;)&lt;/span&gt;             │
│ Link       │ https://acme.sentry.io/issues/CHECKOUT-P1/  │
╰────────────┴─────────────────────────────────────────────╯

Tip: Use &lt;span class="s1"&gt;'sentry issue explain CHECKOUT-P1'&lt;/span&gt; &lt;span class="k"&gt;for &lt;/span&gt;AI root cause analysis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks like a straightforward timeout. An agent with just this would add retry logic or bump the timeout. But run &lt;code&gt;sentry issue explain&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;sentry issue explain CHECKOUT-P1

ℹ Starting root cause analysis, it can take several minutes...

Root Cause Analysis Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Cause #0: The checkout service's /charge endpoint times out
waiting for the payment service, which blocks on an inventory
availability check. The inventory service's check_stock query
regressed from ~200ms to ~28s after migration
0047_drop_unused_indexes removed the compound index on
(product_id, warehouse_id).

Repository: acme/inventory-service
Affected: src/queries/check_stock.ts:18
First seen: release-3.1.0 (deployed 3h ago)

Reproduction steps:
1. User submits checkout → POST /charge
2. Payment service calls inventory.check_stock(items)
3. check_stock runs full table scan (missing index) → 28s
4. Payment call exceeds 30s timeout → TimeoutError bubbles up to checkout

To create a plan, run: sentry issue plan CHECKOUT-P1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The root cause isn’t in the checkout service at all. It’s a dropped database index in the inventory service, two hops away in the trace. No amount of retry logic in &lt;code&gt;payment.ts&lt;/code&gt; fixes that.&lt;/p&gt;

&lt;h2&gt;
  
  
  From alert to draft PR
&lt;/h2&gt;

&lt;p&gt;When a Sentry alert fires on a new or regressed issue, a webhook triggers a worker that checks out your repo and runs a coding agent with a prompt grounded in the specific issue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A production error was captured by Sentry. The issue ID is CHECKOUT-P1.

Use Sentry MCP to retrieve the full issue details: stack trace,
breadcrumbs, tags, release, environment, distributed traces,
suspect commits, and Seer analysis.

Based on the evidence:

1. Identify the root cause. Follow traces across services.
2. Make the smallest safe fix in the right repository.
3. Add or update a regression test that covers this failure.
4. Run the test suite.
5. Open a draft PR with the Sentry issue link, root-cause
   summary, files changed, and test results.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
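
&lt;p&gt;The worker’s first step, turning an alert into a job for the agent, could be sketched like this. The &lt;code&gt;AlertPayload&lt;/code&gt; shape, &lt;code&gt;toAgentJob&lt;/code&gt;, and &lt;code&gt;buildPrompt&lt;/code&gt; are all hypothetical; the real webhook body isn’t shown here, and the prompt is abbreviated from the one above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical, simplified alert payload; the real webhook body is not shown in this article.
type AlertPayload = { issueId: string; kind: "new" | "regressed" | "other" };

// Short IDs look like CHECKOUT-P1; reject anything else before it reaches the prompt.
const SHORT_ID = /^[A-Z0-9]+-[A-Z0-9]+$/;

// Abbreviated; the full prompt text is the one shown above.
function buildPrompt(issueId: string) {
  return [
    "A production error was captured by Sentry. The issue ID is " + issueId + ".",
    "Use Sentry MCP to retrieve the full issue details.",
  ].join("\n");
}

// Returns a job for the worker, or null when the alert should be ignored.
function toAgentJob(payload: AlertPayload) {
  if (payload.kind === "other") return null;
  if (!SHORT_ID.test(payload.issueId)) return null;
  return { issueId: payload.issueId, prompt: buildPrompt(payload.issueId), draftPr: true };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Validating the ID matters because it is interpolated straight into the prompt. Anything that doesn’t look like a short ID is dropped rather than passed to the agent.&lt;/p&gt;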



&lt;p&gt;The agent pulls the issue via MCP. The distributed trace shows the checkout call chaining through the payment service into an inventory check that’s taking 28 seconds. Metrics confirm the inventory service’s p99 spiked from 200ms to 28s three hours ago. Suspect commits point at a migration in &lt;code&gt;acme/inventory-service&lt;/code&gt; that dropped a compound index. Session replay shows users rage-clicking “Pay” while nothing happens, generating duplicate charge attempts.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sentry issue plan CHECKOUT-P1&lt;/code&gt; lays out the fix: restore the compound index on &lt;code&gt;(product_id, warehouse_id)&lt;/code&gt;. A draft PR lands in &lt;code&gt;acme/inventory-service&lt;/code&gt; with the migration, a root-cause summary linking back to the Sentry trace, and a regression test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Fsentry-self-healing-loop-diagram.BgBqAHWo.png%26w%3D828%26q%3D100" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.sentry.io%2F_vercel%2Fimage%3Furl%3D_astro%252Fsentry-self-healing-loop-diagram.BgBqAHWo.png%26w%3D828%26q%3D100" alt="Self-healing loop: production error flows to Sentry for context and root cause, triggers a coding agent that opens a draft PR, human reviews and merges the fix" width="828" height="669"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it with Cursor Automations
&lt;/h2&gt;

&lt;p&gt;We publish a &lt;a href="https://sentry.io/cookbook/regressed-issue-to-pr-cursor/" rel="noopener noreferrer"&gt;cookbook recipe&lt;/a&gt; for this exact workflow using Cursor’s Automations feature. It walks through connecting your repo to Sentry, adding the MCP server to an automation, and configuring a webhook alert to trigger on regressed issues.&lt;/p&gt;

&lt;p&gt;Because Sentry knows the release history and suspect commits, the agent doesn’t search the entire repo for the problem. It starts where the evidence points. For regressed issues specifically, it can identify which commit reintroduced the bug, read the original fix, and understand what went wrong the second time around.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s next
&lt;/h2&gt;

&lt;p&gt;The more telemetry your app sends to Sentry (traces, metrics, logs, session replays), the harder the bugs an agent can tackle. Today it’s dropped indexes across service boundaries. Six months ago it was null checks. The merge rate on Autofix PRs has climbed from 41% to 46% in that time, and the diagnosis complexity is growing with it.&lt;/p&gt;

&lt;p&gt;There are real limits. Bugs that need product judgment, issues in code the agent can’t reach, and problems where there isn’t enough telemetry to connect the dots: those still need you. But the surface area of what agents can fix is expanding every month.&lt;/p&gt;

&lt;p&gt;Connect &lt;a href="https://mcp.sentry.dev/" rel="noopener noreferrer"&gt;Sentry MCP&lt;/a&gt; to your editor or install the &lt;a href="https://cli.sentry.dev" rel="noopener noreferrer"&gt;CLI&lt;/a&gt;. Hook up your repos for code mappings and tracing. Run &lt;code&gt;sentry issue explain&lt;/code&gt; on something that’s been sitting in your backlog and see what it finds.&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href="https://docs.sentry.io/product/ai-in-sentry/seer/autofix/" rel="noopener noreferrer"&gt;Seer Autofix docs&lt;/a&gt; for more on coding agent handoff to Claude Code and Cursor.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on the &lt;a href="https://blog.sentry.io/agents-need-production-context/" rel="noopener noreferrer"&gt;Sentry Blog&lt;/a&gt; by &lt;a href="https://blog.sentry.io/authors/sergiy-dybskiy/" rel="noopener noreferrer"&gt;Sergiy Dybskiy&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
