5 Stripe Billing Patterns That Signal Churn (And How to Monitor Them in Real-Time)

#stripe #python #saas #monitoring

Last month, a founder I follow posted about discovering they'd had a Stripe configuration error for 6 weeks. Failed payments were silently retrying and failing with no alert. By the time they caught it, they'd missed revenue from ~40 customers who had soft-declined cards but never got a dunning email.

Six weeks. No alert. They just happened to notice the number looked off.

This is more common than people admit. Most Stripe setups are reactive — you find out about billing problems when customers complain or when you stare at your MRR dashboard long enough. Here are 5 billing patterns I now actively monitor, and the detection logic behind each one.

1. Subscription Churn Spike

What it is: A sudden increase in cancellations or non-renewals within a rolling window.

Why it's tricky: Individual cancellations don't set off alarms. But 8 cancellations in 2 hours at 2am on a Tuesday? That's either a bug in your billing code, a bad email you sent, or a competitor just dropped pricing.

Detection logic:

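from datetime import datetime, timedelta, timezone

def detect_churn_spike(events, window_hours=2, threshold=5):
    """Alert if at least `threshold` churn events occurred in the last `window_hours`."""
    # Stripe's `created` is a UTC epoch timestamp, so build the cutoff from an aware datetime
    cutoff = (datetime.now(timezone.utc) - timedelta(hours=window_hours)).timestamp()

    def is_churn(e):
        # A deleted subscription is always churn; an update only counts
        # when a cancellation at period end has been scheduled
        if e['type'] == 'customer.subscription.deleted':
            return True
        return (e['type'] == 'customer.subscription.updated'
                and e.get('data', {}).get('object', {}).get('cancel_at_period_end') is True)

    recent_churns = [e for e in events if e['created'] > cutoff and is_churn(e)]
    return len(recent_churns) >= threshold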

BillingWatch implementation: Configurable per-tenant thresholds. Default is 5 churns in 2 hours.

2. Failed Payment Surge

What it is: Payment failure rate exceeding your baseline.

Why it's tricky: Some failure rate is normal (expired cards, insufficient funds). But if your failure rate spikes from 3% to 15%, something is wrong — either your Stripe configuration changed, a card processor is having issues, or you have a dunning problem.

Detection logic:

from datetime import datetime, timedelta, timezone

def calculate_failure_rate(events, window_hours=24):
    """Share of charge attempts in the last `window_hours` that failed."""
    # Use an aware UTC datetime; a naive utcnow().timestamp() is read as local time
    cutoff = (datetime.now(timezone.utc) - timedelta(hours=window_hours)).timestamp()
    recent = [e for e in events if e['created'] > cutoff]

    successes = sum(1 for e in recent if e['type'] == 'charge.succeeded')
    failures = sum(1 for e in recent if e['type'] == 'charge.failed')

    total = successes + failures
    if total == 0:
        return 0.0
    return failures / total

# Alert when failure rate exceeds 2x the 7-day rolling baseline
def is_failure_rate_anomalous(current_rate, baseline_rate, multiplier=2.0):
    return current_rate > (baseline_rate * multiplier)

The key is comparing against your own baseline, not a fixed threshold. A SaaS with enterprise customers might have a normal 8% failure rate (annual renewals with outdated cards). A consumer app might sit at 1.5%. Know your baseline.
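One way to get that baseline, sketched under a few assumptions: events are dicts with Stripe's `type` and `created` fields, the baseline is the 7 days immediately before the current 24-hour window, and the helper names (`failure_rate_between`, `failure_rate_vs_baseline`) are hypothetical, not part of Stripe or BillingWatch.

```python
from datetime import datetime, timedelta, timezone

def failure_rate_between(events, start_ts, end_ts):
    """Failure rate of charge events with start_ts <= created < end_ts."""
    window = [e for e in events if start_ts <= e['created'] < end_ts]
    failures = sum(1 for e in window if e['type'] == 'charge.failed')
    successes = sum(1 for e in window if e['type'] == 'charge.succeeded')
    total = failures + successes
    return failures / total if total else 0.0

def failure_rate_vs_baseline(events, now=None, window_hours=24,
                             baseline_days=7, multiplier=2.0):
    """Return (current_rate, baseline_rate, is_anomalous)."""
    now = now or datetime.now(timezone.utc)
    window_start = now - timedelta(hours=window_hours)
    # The baseline period ends where the current window begins,
    # so a spike in progress does not inflate its own baseline
    baseline_start = window_start - timedelta(days=baseline_days)

    current = failure_rate_between(events, window_start.timestamp(), now.timestamp())
    baseline = failure_rate_between(events, baseline_start.timestamp(), window_start.timestamp())
    return current, baseline, current > baseline * multiplier
```

Keeping the baseline period separate from the current window is a design choice: if the two overlap, a sustained spike gradually raises the baseline and the alert stops firing.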

3. Webhook Delivery Failures

What it is: Stripe failed to deliver webhooks to your endpoint.

Why it's tricky: Stripe retries webhooks for 72 hours, but if your endpoint is consistently failing, you're processing events out of order — or missing them entirely. Subscription state gets desynced from Stripe's truth.

Detection logic:

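from datetime import datetime, timedelta, timezone

# Stripe's API does not expose per-attempt delivery logs (those live in the Dashboard),
# so this checks two proxies: disabled endpoints and events still pending delivery.
# `stripe_client` is the configured `stripe` module.
def check_webhook_health(stripe_client, days_back=1):
    cutoff = int((datetime.now(timezone.utc) - timedelta(days=days_back)).timestamp())

    # Stripe can disable an endpoint that keeps failing
    endpoints = stripe_client.WebhookEndpoint.list(limit=100)
    disabled = [ep['url'] for ep in endpoints.auto_paging_iter() if ep['status'] != 'enabled']

    # `pending_webhooks` counts deliveries that have not yet succeeded for an event
    events = stripe_client.Event.list(created={'gt': cutoff}, limit=100)
    undelivered = [e['id'] for e in events.auto_paging_iter() if e['pending_webhooks'] > 0]

    return {'disabled_endpoints': disabled, 'undelivered_events': undelivered}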

Practical tip: Have your webhook handler record the timestamp of each event it processes successfully, and expose that timestamp through a /health endpoint. Alert when the gap between the last successful delivery and now exceeds 30 minutes during business hours.
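A minimal sketch of that gap check. The function name, the 09:00 to 18:00 weekday definition of business hours, and the use of UTC are all assumptions for illustration; adjust them to when your traffic actually arrives.

```python
from datetime import datetime, time, timedelta, timezone

def webhook_gap_exceeded(last_success, now,
                         max_gap=timedelta(minutes=30),
                         business_start=time(9, 0),
                         business_end=time(18, 0)):
    """True when `now` is a weekday in business hours and no webhook
    has been processed successfully within `max_gap`."""
    in_business_hours = (
        now.weekday() < 5  # Monday=0 ... Friday=4
        and business_start <= now.time() < business_end
    )
    return in_business_hours and (now - last_success) > max_gap
```

Usage: call it from a scheduled job with `last_success` read from wherever the handler stores it, for example `webhook_gap_exceeded(last_success, datetime.now(timezone.utc))`. Outside business hours a quiet webhook stream is normal, so the check stays silent.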

4. MRR Drop Without Corresponding Churn Events

What it is: Your calculated MRR drops but Stripe isn't showing matching cancellation events.

Why it's tricky: This is the sneaky one. It usually means a subscription was incorrectly updated (quantity changed, price override applied incorrectly, or trial extended accidentally). The subscription isn't cancelled — it's just silently billing less.

Detection logic:

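def reconcile_mrr(stripe_client, stored_mrr):
    """Compare our stored MRR (in dollars) with MRR recomputed from Stripe's active subscriptions."""
    stripe_mrr = 0

    # Pull all active subscriptions (auto_paging_iter follows pagination)
    subscriptions = stripe_client.Subscription.list(status='active', limit=100)
    for sub in subscriptions.auto_paging_iter():
        for item in sub['items']['data']:
            unit_amount = item['price']['unit_amount']
            if unit_amount is None:
                continue  # tiered prices have no flat unit_amount; account for them separately
            # normalize_to_monthly is a helper you define (e.g. yearly -> amount / 12)
            monthly_amount = normalize_to_monthly(
                unit_amount,
                item['price']['recurring']['interval']
            )
            stripe_mrr += monthly_amount * item['quantity']

    stripe_mrr_dollars = stripe_mrr / 100  # Stripe stores amounts in cents
    delta = abs(stripe_mrr_dollars - stored_mrr)
    # If we stored zero MRR but Stripe shows revenue, treat it as a full mismatch
    pct_delta = delta / stored_mrr if stored_mrr > 0 else (1.0 if delta > 0 else 0.0)

    if pct_delta > 0.05:  # Alert on >5% discrepancy
        # AnomalyAlert is your own alert type, not part of the Stripe library
        return AnomalyAlert(
            type='mrr_mismatch',
            our_mrr=stored_mrr,
            stripe_mrr=stripe_mrr_dollars,
            delta=delta
        )
    return None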

I run this reconciliation daily. Even small MRR discrepancies compound.
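The `normalize_to_monthly` helper is not shown above. Here is one possible sketch, assuming Stripe's four recurring intervals (`day`, `week`, `month`, `year`) and an optional `interval_count` argument that the call above does not pass. The day and week conversions are approximations based on a 365-day, 52-week year.

```python
def normalize_to_monthly(unit_amount, interval, interval_count=1):
    """Convert a recurring price (in the smallest currency unit)
    to its approximate monthly equivalent."""
    # How many months one billing interval covers
    months_per_interval = {
        'day': 12 / 365,
        'week': 12 / 52,
        'month': 1,
        'year': 12,
    }
    if interval not in months_per_interval:
        raise ValueError(f"Unknown billing interval: {interval!r}")
    # A price billed every `interval_count` intervals covers that many times more months
    return unit_amount / (months_per_interval[interval] * interval_count)
```

For example, a $120.00 yearly price (`unit_amount=12000`) normalizes to 1000 cents per month, and a $30.00 price billed every 3 months (`interval_count=3`) also normalizes to 1000.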

5. New Subscription Dry Spell

What it is: No new subscriptions for longer than your normal cadence.

Why it's tricky: If you normally get 3-5 new subs per day and suddenly go 48 hours without one, something might be broken in your signup flow — not your billing. Stripe's checkout session might be erroring. Your payment form might be broken on mobile. Or you might have a traffic issue.

Detection logic:

from datetime import datetime, timedelta, timezone

def detect_acquisition_drought(events, expected_daily_signups, drought_hours=24):
    # Use an aware UTC datetime; a naive utcnow().timestamp() is read as local time
    cutoff = (datetime.now(timezone.utc) - timedelta(hours=drought_hours)).timestamp()
    recent_new_subs = [
        e for e in events
        if e['type'] == 'customer.subscription.created'
        and e['created'] > cutoff
    ]

    expected_in_window = (expected_daily_signups / 24) * drought_hours

    if len(recent_new_subs) < (expected_in_window * 0.2):  # Less than 20% of expected
        # DroughtAlert is your own alert type, not part of the Stripe library
        return DroughtAlert(
            expected=expected_in_window,
            actual=len(recent_new_subs),
            hours=drought_hours
        )
    return None

This one requires you to know your baseline. I recommend tracking a rolling 30-day average as your expected rate.
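A rough sketch of that rolling average, assuming the same event dicts as above. The function name and the `now` parameter (there so the calculation can be tested against a fixed clock) are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def rolling_daily_signup_rate(events, days=30, now=None):
    """Average number of new subscriptions per day over the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).timestamp()
    signups = sum(
        1 for e in events
        if e['type'] == 'customer.subscription.created'
        and cutoff < e['created'] <= now.timestamp()
    )
    return signups / days
```

The result can be passed straight in as `expected_daily_signups`. A plain average ignores weekday and weekend differences, so if your signups are strongly seasonal you may want a per-weekday baseline instead.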

Setting Up Real-Time Monitoring

All 5 of these run in BillingWatch — a self-hosted FastAPI service that ingests your Stripe webhooks and runs anomaly detection in real-time.

The setup is minimal:

Point your Stripe webhook endpoint at your BillingWatch instance
Configure thresholds per tenant
Get Slack/webhook alerts when anomalies fire

BillingWatch is on GitHub — MIT licensed, self-hosted, runs on Docker.

The alternative is building this yourself (the patterns above are a solid starting point) or using a paid service. Both are valid. The main thing is having some monitoring in place before you have a 6-week silent failure to explain to your investors.

BillingWatch: github.com/rmbell09-lang/billingwatch

Stack: FastAPI, SQLite, Stripe Webhooks, Docker

Not affiliated with Stripe.