POTHURAJU JAYAKRISHNA YADAV

Posted on Jan 8

Reducing Sentry APM Costs in FastAPI by Sending Only What Matters

#sentry #python

When I first enabled managed Sentry APM for a FastAPI application, the visibility felt amazing.
Every request was traced. Every endpoint had performance data.

But that excitement didn’t last long.

After a few days in production, I realized something important:

Most of my Sentry APM usage was coming from perfectly healthy, fast requests that I never looked at again.

Successful GETs, quick POSTs, OpenAPI calls: all of them were being sent to Sentry, eating up quota and increasing cost without providing real value.

So instead of scaling my Sentry plan, I chose a different approach:

👉 Send fewer transactions, but send the right ones.

The Real Problem with Default APM

With tracing enabled at traces_sample_rate=1.0, Sentry APM is very generous:

Every request becomes a transaction
Even sub-second successful calls are recorded
Docs and schema endpoints are also traced

For high-traffic APIs, this quickly turns into:

Large transaction volume
Faster quota exhaustion
Paying for noise instead of insight

In reality, I only needed visibility into:

Requests that fail (5xx)
Requests that are slow
Anything abnormal or risky

Everything else was just background noise.

The Cost-Saving Strategy

I defined very simple rules:

Always Send to Sentry

Any request returning 5xx
Any request taking more than 5 seconds

Drop from Sentry

Successful (2xx/3xx) GET / POST / PUT requests completing in under 3 seconds
/docs and /openapi.json endpoints

This keeps Sentry focused on problems, not traffic volume.
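
The rules above can be written as a small pure function, which makes them easy to unit-test before wiring anything into Sentry. This is only a sketch: the name keep_transaction and the constants are illustrative and not part of any Sentry API, and it matches paths exactly rather than by substring.

```python
from typing import Optional

SLOW_SECONDS = 5.0   # always keep anything slower than this
FAST_SECONDS = 3.0   # successful requests faster than this are dropped
IGNORED_PATHS = ("/docs", "/openapi.json")


def keep_transaction(path: str, method: str, status_code: int,
                     duration: Optional[float]) -> bool:
    """Return True if the request should be reported (hypothetical helper)."""
    if path in IGNORED_PATHS:
        return False
    if 500 <= status_code <= 599:
        return True
    if duration is not None and duration > SLOW_SECONDS:
        return True
    if (method in ("GET", "POST", "PUT")
            and 200 <= status_code < 400
            and duration is not None
            and duration < FAST_SECONDS):
        return False
    # Anything not covered by a rule (4xx, 3-5 s responses, ...) is kept.
    return True
```

Note that a successful request taking between 3 and 5 seconds matches neither the "always send" nor the "drop" rule, so it is kept.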

Why Two Middlewares Are Required

This part is important and often misunderstood.

from sentry_sdk.integrations.asgi import SentryAsgiMiddleware

app.add_middleware(SentryAsgiMiddleware)
app.add_middleware(TimingMiddleware)

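These two middlewares do different jobs, and both are required. Keep in mind that Starlette runs the most recently added middleware outermost, so the registration order decides which one wraps the other.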

SentryAsgiMiddleware – Enables APM

SentryAsgiMiddleware is what actually:

Starts and finishes Sentry transactions
Hooks into the ASGI request lifecycle
Sends performance data to Sentry

Without this middleware:

No transactions are created
before_send_transaction is never called
APM simply does not work

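In short:

No ASGI instrumentation = No APM

(Recent sentry-sdk releases enable their FastAPI integration automatically when fastapi is installed, and that integration also instruments requests. The explicit middleware matters on setups where the integration is not active.)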

TimingMiddleware – Adds Intelligence

The second middleware is custom.

It measures the real execution time of each request and attaches it to the Sentry scope.

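import time

import sentry_sdk
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request


class TimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # perf_counter is monotonic, so clock adjustments cannot skew the result
        start_time = time.perf_counter()
        response = await call_next(request)
        duration = time.perf_counter() - start_time

        # Attach the duration (in seconds) so before_send_transaction can read it
        with sentry_sdk.configure_scope() as scope:
            scope.set_extra("duration", duration)

        return response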

Why this is needed:

Execution time is required to decide whether a request is “important”
Sentry’s internal timing isn’t easily usable for filtering
Without this, cost-control logic becomes guesswork
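
If the duration extra is ever missing (for example because the timing middleware did not run for a request), one possible fallback is to derive it from the start_timestamp and timestamp fields that transaction events carry. The sketch below assumes those fields may arrive as datetime objects, epoch numbers, or ISO-8601 strings depending on the SDK version; event_duration and _to_epoch are hypothetical helpers, not Sentry functions.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _to_epoch(value: Any) -> Optional[float]:
    """Convert a datetime, number, or ISO-8601 string to epoch seconds."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)  # assume UTC
        return value.timestamp()
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            return None
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.timestamp()
    return None


def event_duration(event: dict) -> Optional[float]:
    """Return the transaction duration in seconds, or None if unknown."""
    start = _to_epoch(event.get("start_timestamp"))
    end = _to_epoch(event.get("timestamp"))
    if start is None or end is None:
        return None
    return end - start
```

Inside the filter hook, this could be used as a fallback when event.get("extra", {}).get("duration") returns None.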

Think of it this way:

SentryAsgiMiddleware is the pipeline
TimingMiddleware is the brain

Filtering Transactions Before They Are Sent

Sentry provides a hook called before_send_transaction.
This runs just before a transaction is sent to Sentry and allows you to drop it.

This is where the cost optimization happens.

def before_send_transaction(event, hint):
    """Return the event to send it, or None to drop it."""
    transaction_name = event.get("transaction", "")
    request_method = event.get("request", {}).get("method", "")
    status_code = event.get("contexts", {}).get("response", {}).get("status_code", 0)

    # Set by TimingMiddleware; None if the middleware did not run
    duration = event.get("extra", {}).get("duration")

    # Ignore docs and schema
    if "/docs" in transaction_name or "/openapi.json" in transaction_name:
        return None

    # Always send server errors
    if 500 <= status_code <= 599:
        return event

    # Send slow requests (explicit None check so a 0.0 duration is not skipped)
    if duration is not None and duration > 5:
        return event

    # Drop fast successful requests
    if (
        request_method in ("GET", "POST", "PUT")
        and 200 <= status_code < 400
        and duration is not None
        and duration < 3
    ):
        return None

    # Everything else (4xx, 3-5 second requests, unknown duration) is sent
    return event

If this function returns:

event → transaction is sent
None → transaction is dropped

Simple, predictable, and fully under your control.
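
To check that the filter really reduces volume, it can help to count how many transactions the hook sends versus drops. The following is a minimal sketch of such a wrapper; counted, decisions, and demo_hook are hypothetical names, and demo_hook is only a stand-in for the real filter.

```python
from collections import Counter
from typing import Any, Callable, Dict, Optional

Event = Dict[str, Any]
Hook = Callable[[Event, Dict[str, Any]], Optional[Event]]

# Running tally of hook decisions: {"sent": n, "dropped": m}
decisions: Counter = Counter()


def counted(hook: Hook) -> Hook:
    """Wrap a before_send_transaction-style hook and tally its decisions."""
    def wrapper(event: Event, hint: Dict[str, Any]) -> Optional[Event]:
        result = hook(event, hint)
        decisions["sent" if result is not None else "dropped"] += 1
        return result
    return wrapper


# Stand-in hook for demonstration: drops anything named "/docs".
@counted
def demo_hook(event: Event, hint: Dict[str, Any]) -> Optional[Event]:
    return None if event.get("transaction") == "/docs" else event


demo_hook({"transaction": "/docs"}, {})
demo_hook({"transaction": "/orders"}, {})
print(dict(decisions))
```

The same decorator could wrap the real hook, with the counters logged periodically or exposed on a health endpoint.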

Initializing Sentry with Custom Filtering

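import sentry_sdk

sentry_sdk.init(
    dsn="SENTRY_DSN",  # placeholder; use your project's DSN
    send_default_pii=True,
    traces_sample_rate=1.0,  # trace every request; the hook decides what is sent
    before_send_transaction=before_send_transaction,
)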

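Instead of relying on random sampling, this approach gives deterministic filtering based on real behavior. Note that with traces_sample_rate=1.0 the SDK still records every transaction in-process; the hook only decides which ones leave the application.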
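
Docs and schema requests can also be rejected earlier, before a transaction is created at all, through the SDK's traces_sampler option. The sketch below assumes the ASGI integration exposes the raw ASGI scope under the asgi_scope key of the sampling context; treat that key as an assumption and verify it against your SDK version.

```python
from typing import Any, Dict

IGNORED_PATHS = {"/docs", "/openapi.json"}


def traces_sampler(sampling_context: Dict[str, Any]) -> float:
    """Return 0.0 (never trace) for docs/schema paths, 1.0 otherwise."""
    # "asgi_scope" is assumed to hold the ASGI connection scope.
    asgi_scope = sampling_context.get("asgi_scope") or {}
    path = asgi_scope.get("path", "")
    return 0.0 if path in IGNORED_PATHS else 1.0
```

It would be passed as traces_sampler=traces_sampler in sentry_sdk.init; when a sampler is set, it is used in place of traces_sample_rate. Slow and failing requests still have to be filtered in before_send_transaction, because their outcome is not known when sampling happens.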

What Changed After This

Lower Cost

Transaction volume dropped sharply.
Sentry quota consumption slowed down immediately.

Cleaner Dashboards

Only slow or failing requests appeared.
Debugging became easier, not harder.

Better Signal

Every transaction in Sentry now means:

“This is worth looking at.”

When This Approach Makes Sense

This works best when:

Your API traffic is high
Most requests are successful and fast
You care more about issues than metrics

If you want every request traced forever, this is not the right approach.
If you want useful observability without burning money, it absolutely is.

Final Thoughts

APM should help you find problems, not create new ones in your billing dashboard.

By combining:

SentryAsgiMiddleware
A simple timing middleware
before_send_transaction

You turn Sentry from:

“collect everything”
into
“collect what actually matters”

And that small change makes a huge difference in real production systems.
