DEV Community

POTHURAJU JAYAKRISHNA YADAV

Reducing Sentry APM Costs in FastAPI by Sending Only What Matters

When I first enabled managed Sentry APM for a FastAPI application, the visibility felt amazing.
Every request was traced. Every endpoint had performance data.

But that excitement didn’t last long.

After a few days in production, I realized something important:

Most of my Sentry APM usage was coming from perfectly healthy, fast requests that I never looked at again.

Successful GETs, quick POSTs, and OpenAPI calls were all being sent to Sentry, eating up quota and increasing cost without providing real value.

So instead of scaling my Sentry plan, I chose a different approach:

👉 Send fewer transactions, but send the right ones.

The Real Problem with Default APM

With tracing turned on at a 100% sample rate (traces_sample_rate=1.0), Sentry APM is very generous:

  • Every request becomes a transaction
  • Even sub-second successful calls are recorded
  • Docs and schema endpoints are also traced

For high-traffic APIs, this quickly turns into:

  • Large transaction volume
  • Faster quota exhaustion
  • Paying for noise instead of insight

In reality, I only needed visibility into:

  • Requests that fail (5xx)
  • Requests that are slow
  • Anything abnormal or risky

Everything else was just background noise.

The Cost-Saving Strategy

I defined very simple rules:

Always Send to Sentry

  • Any request returning 5xx
  • Any request taking more than 5 seconds

Drop from Sentry

  • Successful (2xx/3xx) GET / POST / PUT requests that complete in under 3 seconds
  • /docs and /openapi.json endpoints

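This keeps Sentry focused on problems, not traffic volume. Anything that matches neither list, such as a 4xx response or a request that takes between 3 and 5 seconds, is still sent.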
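To make the rules concrete, here is a minimal sketch of them as a plain function, independent of Sentry. The name should_send and the constants are hypothetical and exist only for illustration; the thresholds are the ones listed above.

```python
from typing import Optional

SLOW_THRESHOLD_S = 5.0  # always send requests slower than this
FAST_THRESHOLD_S = 3.0  # drop successful requests faster than this
IGNORED_PATHS = ("/docs", "/openapi.json")
DROPPABLE_METHODS = {"GET", "POST", "PUT"}


def should_send(path: str, method: str, status_code: int,
                duration: Optional[float]) -> bool:
    """Return True when a request is worth a Sentry transaction."""
    # Docs and schema endpoints are never interesting
    if any(ignored in path for ignored in IGNORED_PATHS):
        return False
    # Server errors are always sent
    if 500 <= status_code <= 599:
        return True
    # Slow requests are always sent
    if duration is not None and duration > SLOW_THRESHOLD_S:
        return True
    # Fast, successful requests are dropped
    if (method in DROPPABLE_METHODS
            and 200 <= status_code < 400
            and duration is not None
            and duration < FAST_THRESHOLD_S):
        return False
    # Everything else (4xx, 3-5 s requests, unknown duration) is kept
    return True


print(should_send("/users", "GET", 200, 0.12))    # False
print(should_send("/users", "GET", 500, 0.12))    # True
print(should_send("/reports", "POST", 200, 4.0))  # True
print(should_send("/docs", "GET", 200, 0.05))     # False
```

Keeping the decision in a pure function like this makes the thresholds easy to unit test before they are wired into a Sentry hook.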

Why Two Middlewares Are Required

This part is important and often misunderstood.

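from sentry_sdk.integrations.asgi import SentryAsgiMiddleware
# The last middleware added runs outermost, so Sentry wraps TimingMiddleware and the duration is set while the transaction is still open
app.add_middleware(TimingMiddleware)
app.add_middleware(SentryAsgiMiddleware)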

These two middlewares do different jobs, and both are required.

SentryAsgiMiddleware – Enables APM

SentryAsgiMiddleware is what actually:

  • Starts and finishes Sentry transactions
  • Hooks into the ASGI request lifecycle
  • Sends performance data to Sentry

Without this middleware (or an SDK integration that installs it for you):

  • No transactions are created
  • before_send_transaction is never called
  • APM simply does not work

In short:

No SentryAsgiMiddleware = No APM

TimingMiddleware – Adds Intelligence

The second middleware is custom.

It measures the real execution time of each request and attaches it to the Sentry scope.

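import time

import sentry_sdk
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request


class TimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # perf_counter() is monotonic, so system clock adjustments cannot skew the result
        start_time = time.perf_counter()
        response = await call_next(request)
        duration = time.perf_counter() - start_time

        # Attach the duration (in seconds) to the scope; the filter reads it from event["extra"]
        with sentry_sdk.configure_scope() as scope:
            scope.set_extra("duration", duration)

        return response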

Why this is needed:

  • Execution time is required to decide whether a request is “important”
  • Sentry’s internal timing isn’t easily usable for filtering
  • Without this, cost-control logic becomes guesswork

Think of it this way:

  • SentryAsgiMiddleware is the pipeline
  • TimingMiddleware is the brain
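BaseHTTPMiddleware is convenient, but the same measurement can also be written as a plain ASGI wrapper. The following is a minimal sketch, not the setup used in this post: PureTimingMiddleware and its on_duration callback are hypothetical names, and the try/finally is there so a duration is recorded even when the request raises.

```python
import time
from typing import Callable


class PureTimingMiddleware:
    """Plain ASGI wrapper that reports how long each HTTP request took."""

    def __init__(self, app, on_duration: Callable[[float], None]):
        self.app = app
        # Hypothetical callback; receives the elapsed time in seconds
        self.on_duration = on_duration

    async def __call__(self, scope, receive, send):
        # Only time HTTP requests; pass lifespan and websocket traffic through
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            # Runs for successful and failing requests alike
            self.on_duration(time.perf_counter() - start)
```

In a real application the callback would probably be something like `lambda d: sentry_sdk.set_extra("duration", d)`, registered with `app.add_middleware(PureTimingMiddleware, on_duration=...)`.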

Filtering Transactions Before They Are Sent

Sentry provides a hook called before_send_transaction.
This runs just before a transaction is sent to Sentry and allows you to drop it.

This is where the cost optimization happens.

SLOW_SECONDS = 5  # always send requests slower than this
FAST_SECONDS = 3  # drop successful requests faster than this


def before_send_transaction(event, hint):
    transaction_name = event.get("transaction", "")
    request_method = event.get("request", {}).get("method", "")
    status_code = event.get("contexts", {}).get("response", {}).get("status_code", 0)

    # Seconds measured by TimingMiddleware; None if the extra was never set
    duration = event.get("extra", {}).get("duration")

    # Ignore docs and schema
    if "/docs" in transaction_name or "/openapi.json" in transaction_name:
        return None

    # Always send server errors
    if 500 <= status_code <= 599:
        return event

    # Send slow requests
    if duration is not None and duration > SLOW_SECONDS:
        return event

    # Drop fast successful requests
    if (
        request_method in ("GET", "POST", "PUT")
        and 200 <= status_code < 400
        and duration is not None
        and duration < FAST_SECONDS
    ):
        return None

    # Everything else (4xx, 3-5 s requests, unknown duration) is sent
    return event

If this function returns:

  • event → transaction is sent
  • None → transaction is dropped

Simple, predictable, and fully under your control.
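One weak spot is a transaction that arrives without the duration extra: the filter then cannot tell whether it was fast, so it is sent. Transaction events also carry start_timestamp and timestamp fields, so a fallback is possible. The sketch below assumes those fields may show up as ISO-8601 strings, datetime objects, or epoch numbers (check what your SDK version actually passes to the hook); duration_from_event and _to_epoch are hypothetical helper names.

```python
from datetime import datetime
from typing import Any, Optional


def _to_epoch(value: Any) -> Optional[float]:
    """Convert a datetime, ISO-8601 string, or epoch number to seconds."""
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, datetime):
        return value.timestamp()
    if isinstance(value, str):
        try:
            # fromisoformat() rejects a trailing "Z" before Python 3.11
            return datetime.fromisoformat(value.replace("Z", "+00:00")).timestamp()
        except ValueError:
            return None
    return None


def duration_from_event(event: dict) -> Optional[float]:
    """Prefer the middleware's measurement, else derive it from timestamps."""
    measured = event.get("extra", {}).get("duration")
    if measured is not None:
        return measured

    start = _to_epoch(event.get("start_timestamp"))
    end = _to_epoch(event.get("timestamp"))
    if start is None or end is None:
        return None  # nothing usable; let the caller decide
    return end - start
```

Inside before_send_transaction, the duration lookup could then be replaced by a call to duration_from_event(event), with the rest of the rules unchanged.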

Initializing Sentry with Custom Filtering

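import os

import sentry_sdk

sentry_sdk.init(
    dsn=os.environ["SENTRY_DSN"],  # read the DSN from the environment instead of hard-coding it
    send_default_pii=True,  # attaches user data such as IP addresses; turn off if that is a concern
    traces_sample_rate=1.0,  # create every transaction; the hook below decides what is sent
    before_send_transaction=before_send_transaction,
)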

Instead of relying on random sampling, this approach gives deterministic filtering based on real behavior.
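The docs and schema rule does not need a status code or a duration, so it could also be applied earlier, before a transaction is even created, through the SDK's traces_sampler option. This is a sketch under one assumption: that the sampling context contains the ASGI scope under the key "asgi_scope", which is worth verifying against your SDK version. The function name and IGNORED_PATHS are illustrative.

```python
IGNORED_PATHS = ("/docs", "/openapi.json")


def traces_sampler(sampling_context: dict) -> float:
    """Return 0.0 for docs/schema routes and 1.0 for everything else."""
    # Assumption: the ASGI integration exposes the request scope here
    asgi_scope = sampling_context.get("asgi_scope") or {}
    path = asgi_scope.get("path", "")

    # str.startswith accepts a tuple, so one call covers every ignored prefix
    if path.startswith(IGNORED_PATHS):
        return 0.0
    return 1.0
```

It would be passed to sentry_sdk.init as traces_sampler=traces_sampler in place of traces_sample_rate. Status code and duration are not known at sampling time, so the 5xx and slow-request rules still belong in before_send_transaction.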

What Changed After This

Lower Cost

Transaction volume dropped sharply, and quota consumption slowed as soon as the filter went live.

Cleaner Dashboards

Only slow or failing requests appeared.
Debugging became easier, not harder.

Better Signal

Every transaction in Sentry now means:

  • “This is worth looking at.”

When This Approach Makes Sense

This works best when:

  • Your API traffic is high
  • Most requests are successful and fast
  • You care more about issues than metrics

If you want every request traced forever, this is not the right approach.
If you want useful observability without burning money, it absolutely is.

Final Thoughts

APM should help you find problems, not create new ones in your billing dashboard.

By combining:

  • SentryAsgiMiddleware
  • A simple timing middleware
  • before_send_transaction

you turn Sentry from “collect everything” into “collect what actually matters”.

And that small change makes a huge difference in real production systems.
