Why Self-Host Billing Monitoring?
SaaS billing monitoring tools exist. They're fine. They also cost money, send your billing data to someone else's servers, and lock you into their alerting UX. If you're running multiple Stripe accounts — or just care about controlling sensitive financial telemetry — self-hosting is worth the hour of setup.
BillingWatch is what I built to replace a paid tool: a FastAPI + SQLite stack that ingests Stripe webhooks, runs 7 real-time anomaly detectors, and alerts you when something looks wrong. Multi-tenant by default.
What BillingWatch Detects
Seven detectors ship out of the box:
- Charge Failure Spike — unusual jump in failed charges vs. baseline
- Duplicate Charge — same customer, same amount, short time window
- Fraud Spike — surge in charge.dispute.created events
- Revenue Drop — successful charge volume drops below rolling average
- Silent Lapse — no webhook activity for N hours (dead webhook endpoint)
- Webhook Lag — event timestamp vs. received-at gap exceeds threshold
- Invoice Mismatch — invoice amount doesn't match subscription plan amount
Each detector is independently configurable per tenant — you can set different thresholds for your production account vs. a staging account.
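To give a sense of how small these checks are, here's a self-contained sketch of the webhook-lag logic. The function name and the `now` parameter are illustrative (not from the repo), but Stripe events really do carry a `created` Unix timestamp:

```python
import time
from typing import Optional

# Illustrative sketch: lag is receipt time minus the event's `created`
# timestamp; alert when it exceeds the configured maximum.
def webhook_lag_alert(event: dict, max_lag_seconds: int = 30,
                      now: Optional[float] = None):
    received_at = now if now is not None else time.time()
    lag = received_at - event["created"]
    if lag > max_lag_seconds:
        return {
            "type": "webhook_lag",
            "severity": "medium",
            "detail": f"event delivered {lag:.0f}s after creation",
        }
    return None
```

The `now` override exists purely so the check is deterministic in tests.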
Architecture
BillingWatch/
├── app/
│ ├── main.py # FastAPI app, webhook endpoint
│ ├── detectors/ # One file per anomaly type
│ │ ├── charge_failure.py
│ │ ├── duplicate_charge.py
│ │ └── ...
│ ├── models.py # SQLite schema (SQLAlchemy)
│ └── alerts.py # Alert routing (email, Slack, webhook)
├── docker-compose.yml
└── .env.example
FastAPI handles the ingestion. SQLite stores everything locally — no external DB required. Docker Compose wires it up.
Webhook Signature Verification
This is non-negotiable with Stripe. Every webhook must be verified before processing:
import stripe
from fastapi import FastAPI, Request, HTTPException, Header
from typing import Optional

app = FastAPI()

@app.post("/webhooks/{tenant_id}")
async def receive_webhook(
    tenant_id: str,
    request: Request,
    stripe_signature: Optional[str] = Header(None)
):
    payload = await request.body()

    # Get this tenant's webhook secret
    webhook_secret = get_tenant_webhook_secret(tenant_id)

    try:
        event = stripe.Webhook.construct_event(
            payload=payload,
            sig_header=stripe_signature,
            secret=webhook_secret
        )
    except stripe.error.SignatureVerificationError:
        raise HTTPException(status_code=400, detail="Invalid signature")
    except ValueError:
        raise HTTPException(status_code=400, detail="Invalid payload")

    # Store and process
    store_event(tenant_id, event)
    run_detectors(tenant_id, event)

    return {"status": "ok"}
Each tenant gets their own webhook endpoint URL (/webhooks/<tenant_id>) and their own Stripe webhook secret. This is how multi-tenancy works — complete isolation at the HTTP layer.
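The `get_tenant_webhook_secret` helper isn't shown above; here's a minimal sketch of what it might look like, with a plain dict standing in for the tenants table (in the real app this would be a primary-key lookup on the Tenant model, returning a 404 for unknown tenants):

```python
# Stand-in for the tenants table; keys are tenant IDs, values are the
# per-tenant Stripe webhook secrets. Values here are made-up examples.
_TENANT_SECRETS = {
    "prod": "whsec_prod_example",
    "staging": "whsec_staging_example",
}

def get_tenant_webhook_secret(tenant_id: str) -> str:
    # Unknown tenant -> fail loudly rather than verify against a wrong secret.
    secret = _TENANT_SECRETS.get(tenant_id)
    if secret is None:
        raise KeyError(f"unknown tenant: {tenant_id}")
    return secret
```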
Multi-Tenant Setup
Tenants are just rows in a config table:
from sqlalchemy import Column, String, JSON
from .database import Base

class Tenant(Base):
    __tablename__ = "tenants"

    id = Column(String, primary_key=True)  # e.g. "prod", "staging", "client-acme"
    name = Column(String)
    stripe_webhook_secret = Column(String)
    thresholds = Column(JSON)  # Per-tenant detector config

# Default thresholds
DEFAULT_THRESHOLDS = {
    "charge_failure_spike": {"multiplier": 3.0, "window_hours": 1},
    "duplicate_charge": {"window_seconds": 300},
    "fraud_spike": {"multiplier": 5.0, "window_hours": 24},
    "revenue_drop": {"drop_pct": 50, "window_hours": 24},
    "silent_lapse": {"max_silence_hours": 4},
    "webhook_lag": {"max_lag_seconds": 30},
    "invoice_mismatch": {"tolerance_cents": 0}
}
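How a tenant's overrides combine with these defaults is an assumption on my part, but a shallow per-detector merge is the natural shape — a tenant only sets the keys it wants to change:

```python
from copy import deepcopy
from typing import Optional

# Assumed merge behavior: start from the defaults and overlay only the
# keys a tenant explicitly set, detector by detector.
def resolve_thresholds(defaults: dict, tenant_overrides: Optional[dict]) -> dict:
    merged = deepcopy(defaults)
    for detector, override in (tenant_overrides or {}).items():
        merged.setdefault(detector, {}).update(override)
    return merged
```

So a tenant with `{"revenue_drop": {"drop_pct": 30}}` gets a 30% drop threshold but keeps the default 24-hour window.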
To add a new tenant:
curl -X POST http://localhost:8000/tenants \
-H "Content-Type: application/json" \
-d '{"id": "prod", "name": "Production", "stripe_webhook_secret": "whsec_..."}'
Then register https://yourdomain.com/webhooks/prod as a webhook endpoint in your Stripe dashboard.
A Detector Example: Charge Failure Spike
from datetime import datetime, timedelta
from ..models import StripeEvent, session

class ChargeFailureDetector:
    def run(self, tenant_id: str, event: dict, thresholds: dict):
        if event['type'] != 'charge.failed':
            return None

        config = thresholds.get('charge_failure_spike', {})
        window = timedelta(hours=config.get('window_hours', 1))
        multiplier = config.get('multiplier', 3.0)

        # Count recent failures
        cutoff = datetime.utcnow() - window
        recent_failures = session.query(StripeEvent).filter(
            StripeEvent.tenant_id == tenant_id,
            StripeEvent.event_type == 'charge.failed',
            StripeEvent.received_at >= cutoff
        ).count()

        # Compare to baseline (same window, 7 days ago)
        week_ago_cutoff = cutoff - timedelta(days=7)
        baseline_failures = session.query(StripeEvent).filter(
            StripeEvent.tenant_id == tenant_id,
            StripeEvent.event_type == 'charge.failed',
            StripeEvent.received_at.between(week_ago_cutoff, week_ago_cutoff + window)
        ).count()

        if baseline_failures > 0 and recent_failures >= (baseline_failures * multiplier):
            return {
                "type": "charge_failure_spike",
                "severity": "high",
                "detail": f"{recent_failures} failures vs {baseline_failures} baseline"
            }
        return None
Each detector returns None (no anomaly) or a dict (fire an alert). Clean, testable, easy to add new ones.
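A `run_detectors` dispatcher built on that contract might look like this — the names and signature are my sketch of the idea, not the repo's exact API:

```python
# Sketch: call each detector against the incoming event, route any
# non-None result to the alert sender, and return the fired alerts.
def run_detectors(tenant_id, event, detectors, thresholds, send_alert):
    alerts = []
    for detector in detectors:
        result = detector.run(tenant_id, event, thresholds)
        if result is not None:
            send_alert(tenant_id, result)
            alerts.append(result)
    return alerts
```

Because every detector shares the same `run(...) -> dict | None` contract, adding a new one is just another class in the list.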
Docker Compose
version: '3.8'
services:
  billingwatch:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=sqlite:///./billingwatch.db
    volumes:
      - ./data:/app/data
    restart: unless-stopped
git clone https://github.com/rmbell09-lang/BillingWatch
cp .env.example .env
docker-compose up -d
Local Testing With Stripe CLI
# Install Stripe CLI
brew install stripe/stripe-cli/stripe
# Forward real Stripe events to local BillingWatch
stripe listen --forward-to localhost:8000/webhooks/dev
# Trigger a test event
stripe trigger charge.failed
The Stripe CLI outputs the webhook signing secret you'll need for local testing — drop it in your .env as DEV_WEBHOOK_SECRET.
What It Looks Like in Practice
After running for a few weeks on a production Stripe account:
- Caught 2 duplicate charge attempts (card testing, both auto-blocked by Stripe but good to know)
- Silent lapse alert fired once when Stripe had a webhook delivery outage — useful for knowing "is this us or them?"
- Revenue drop detector needed threshold tuning — first week had false positives on low-traffic days
The tuning overhead is real but worth it. After calibrating thresholds to your actual traffic patterns, it's quiet — and when it fires, it means something.
GitHub
Full source at github.com/rmbell09-lang/BillingWatch. Issues and contributions welcome, especially new detector implementations.