We Leaked 1,368 Customers into Our LIVE Stripe Account via E2E Tests

Six weeks ago we wired up signup-to-paid-plan E2E tests against our staging stack. Last week we found 1,368 fake customers in our production Stripe account. No charges. No invoices. No alert ever fired. Here is what went wrong, the boring fix, and a checklist to see whether the same pattern is silently filling your own Stripe.

The pattern fires silently. Stripe charges nothing for Customer objects, so the usual monitoring catches none of it.

The setup that looked fine

Our signup flow does what every SaaS signup flow does. POST email and password → create a row in users → call stripe.Customer.create with the email → store the returned cus_xxx on the user row → send a welcome email. Nothing exotic.

Our E2E suite, run on every commit to master, signs up a fresh user, walks the onboarding, hits the dashboard, signs out. Around 30 signups per CI run. Test emails follow a pattern: {slug}+e2e@access-proof.com.

Two assumptions, both wrong:

We assumed the CI environment was configured with the Stripe TEST secret key. It was not. Someone had set it to LIVE during a debugging session a couple of months back. The CI variable was never reverted.
We assumed Stripe would somehow flag the obviously-fake emails. It does not. Creating a Customer object is rate-limited but otherwise free. There is no fraud signal, no review queue, no email-bounce side effect, no soft alert.

Six weeks of pushes × ~30 signups per CI run ≈ 1,400 fake customers, which works out to roughly 45 runs. Almost all of them used addresses ending in +e2e@access-proof.com; a handful used other test patterns from our scripts.

How we found it

By accident, looking at the "Customers" page of the Stripe dashboard during a debugging session for something else. The total count surprised us — we have maybe 80 real signups so far. Sorting by recency showed a torrent of +e2e@ addresses.

What didn't catch it: no monitoring, no Stripe webhook (we listen for invoices and subscriptions, not raw Customer creates), no email alert (Stripe doesn't send any), no quota issue (we are nowhere near Stripe's account-level limits).
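A webhook subscribed to Stripe's customer.created event is one way to close that gap. The following is a minimal sketch, not what we run: it assumes the event has already been signature-verified and parsed into a dict, and `should_alert` and `TEST_EMAIL_RE` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical patterns; adjust to whatever your test suites actually use.
TEST_EMAIL_RE = re.compile(
    r"(\+e2e@|\+test@|@example\.com$|@mailosaur\.)", re.IGNORECASE
)


def should_alert(event: dict) -> bool:
    """True when a live-mode customer.created event carries a test-looking email."""
    if event.get("type") != "customer.created":
        return False
    if not event.get("livemode", False):
        return False  # test-mode customers are expected
    customer = event.get("data", {}).get("object", {})
    email = customer.get("email") or ""
    return bool(TEST_EMAIL_RE.search(email))


# Example payload, trimmed to the fields used above.
event = {
    "type": "customer.created",
    "livemode": True,
    "data": {"object": {"id": "cus_123", "email": "signup-42+e2e@access-proof.com"}},
}
print(should_alert(event))  # True for this payload
```

The check keys off the event's livemode flag, so test-mode traffic stays quiet and only a live-mode customer with a test-looking address raises a flag.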

The lesson is uncomfortable. Compliance debt — and this is a flavour of GDPR-relevant data hygiene — does not always announce itself. It accumulates quietly in places nobody has a reason to look.

The boring fix

Two changes. First, the email-pattern skip at the boundary:

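# app/services/billing.py

import re

import stripe
import structlog

# Assumption: logger is structlog-style (accepts keyword fields);
# User is the app's user model, import omitted here.
logger = structlog.get_logger()

# Addresses our test suites and scripts sign up with.
TEST_EMAIL_PATTERNS = (
    re.compile(r".+\+e2e@access-proof\.com$"),
    re.compile(r".+\+test@.+$"),
    re.compile(r".+@mailosaur\.io$"),
)

def is_test_email(email: str) -> bool:
    """True if the address matches any known test pattern."""
    return any(p.match(email) for p in TEST_EMAIL_PATTERNS)

async def create_stripe_customer_if_real(user: User) -> str | None:
    """Create a Stripe customer unless the user signed up with a test address."""
    if is_test_email(user.email):
        logger.info("stripe.customer.skip", email=user.email)
        return None
    customer = await stripe.Customer.create_async(email=user.email)
    return customer.id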

Second, the boot-time guard so the same drift cannot happen again:

# app/main.py

import os

# Fail fast at startup: a LIVE key must never be present when running in CI.
if settings.STRIPE_SECRET_KEY.startswith("sk_live_") and os.getenv("CI") == "true":
    raise RuntimeError(
        "Refusing to boot: LIVE Stripe key detected in CI environment. "
        "Set STRIPE_SECRET_KEY to a TEST key (sk_test_...) for CI."
    )

This is the fix that pays back. The first change keeps any signup that matches a test pattern from ever reaching Stripe, whichever key is configured. The second makes the misconfiguration loud at startup, so it surfaces during the next pipeline run, not six weeks later.
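The guard only matches standard secret keys. A slightly broader variant, sketched here as an assumption and not as our production code, also rejects restricted live keys (which Stripe prefixes with rk_live_) and takes the environment as a parameter so it can be unit-tested. `check_stripe_key` and `UnsafeStripeKeyError` are hypothetical names.

```python
from collections.abc import Mapping


class UnsafeStripeKeyError(RuntimeError):
    """Raised when a live-mode Stripe key shows up in a CI environment."""


def check_stripe_key(key: str, environ: Mapping[str, str]) -> None:
    """Raise if a live-mode key is present where only test keys belong."""
    is_live = key.startswith(("sk_live_", "rk_live_"))
    in_ci = environ.get("CI", "").lower() == "true"
    if is_live and in_ci:
        raise UnsafeStripeKeyError("LIVE Stripe key detected in CI environment")


# A test key in CI passes silently.
check_stripe_key("sk_test_123", {"CI": "true"})

# A restricted live key in CI is refused.
try:
    check_stripe_key("rk_live_123", {"CI": "true"})
except UnsafeStripeKeyError as exc:
    print(f"refused: {exc}")
```

At startup the call would be something like check_stripe_key(settings.STRIPE_SECRET_KEY, os.environ).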

The purge

Cleaning up the 1,368 ghosts was a one-shot script:

# scripts/purge_stripe_test_customers.py

import csv
import os
import time
from datetime import date

import stripe

from app.services.billing import is_test_email

stripe.api_key = os.environ["STRIPE_LIVE_KEY"]
deleted = []

for c in stripe.Customer.list(limit=100).auto_paging_iter():
    if is_test_email(c.email or ""):
        stripe.Customer.delete(c.id)
        deleted.append({"id": c.id, "email": c.email})
        time.sleep(0.1)  # stay well under the rate limit

print(f"Deleted {len(deleted)} test customers")

# Write the CSV audit trail, header row included.
with open(f"deleted-{date.today()}.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "email"])
    writer.writeheader()
    writer.writerows(deleted)

About 12 minutes of script time. A CSV with the 1,368 IDs and emails went into our audit folder. Stripe's general live-mode rate limit is about 100 requests per second. The 0.1 s sleep caps the script at 10 deletes per second, and with request latency the real pace was closer to 2 per second, so we never saw a 429.
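Before a destructive run against a live account, a dry run that only builds the deletion list can be reviewed by a second person. The sketch below is an illustration under assumptions: `plan_purge` and `to_csv` are hypothetical helpers, and the sample data stands in for what stripe.Customer.list would return.

```python
import csv
import io
import re
from typing import Iterable

E2E_RE = re.compile(r".+\+e2e@access-proof\.com$")


def plan_purge(customers: Iterable[dict]) -> list[dict]:
    """Select the customers a purge would delete, without deleting anything."""
    return [
        {"id": c["id"], "email": c["email"]}
        for c in customers
        if c.get("email") and E2E_RE.match(c["email"])
    ]


def to_csv(rows: list[dict]) -> str:
    """Render the plan as CSV with a header row, ready for review."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "email"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Stand-in data; a real run would read from the Stripe API instead.
sample = [
    {"id": "cus_1", "email": "abc+e2e@access-proof.com"},
    {"id": "cus_2", "email": "real.person@customer.example"},
    {"id": "cus_3", "email": None},
]
plan = plan_purge(sample)
print(to_csv(plan))
```

Writing the plan first also means the audit record exists even if the delete loop fails halfway.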

What else might be leaking silently

Once you start looking, the pattern repeats. Things that get created in tests but never trigger a charge, an invoice, or an alert:

Mailgun / SendGrid contacts — every welcome email adds a contact unless you skip on test addresses.
Mixpanel / Segment / Amplitude identify calls — every test signup becomes a real user in your analytics, polluting funnel data.
Algolia / Meilisearch indexes — if you index users for admin search, your index fills with fake rows.
CRMs (HubSpot, Pipedrive) — every signup creates a contact. Sales calls a real-looking lead address that bounces.
S3 / R2 buckets — if test signups create folders or default assets, you grow object storage forever.

For each one, the fix is the same: pick a stable test-email pattern, skip the call at the boundary, add a boot-time guard against mode drift.
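The boundary skip generalises to any of these services. One possible shape, offered as a sketch and not as a prescription, is a decorator around each outbound call; `skip_for_test_emails` and `add_crm_contact` are hypothetical names, and the CRM call is a placeholder.

```python
import functools
import re
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Hypothetical pattern; use the same stable pattern everywhere.
TEST_EMAIL_RE = re.compile(r".+\+(e2e|test)@.+$")


def skip_for_test_emails(fn: Callable[[str], T]) -> Callable[[str], Optional[T]]:
    """Wrap a third-party call so it is never made for test addresses."""

    @functools.wraps(fn)
    def wrapper(email: str) -> Optional[T]:
        if TEST_EMAIL_RE.match(email):
            return None  # boundary skip: nothing leaves the app
        return fn(email)

    return wrapper


calls = []


@skip_for_test_emails
def add_crm_contact(email: str) -> str:
    # Placeholder for a real CRM, analytics, or mailing-list client call.
    calls.append(email)
    return f"contact:{email}"


add_crm_contact("ci-run-7+e2e@access-proof.com")  # skipped
add_crm_contact("jane@customer.example")  # forwarded
print(calls)  # ['jane@customer.example']
```

Keeping the pattern in one place means a new test-address convention only has to be registered once.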

The audit trail matters more than the deletion

If you discover a leak like this and there is any chance the data flowed to a third party, keep a dated record. GDPR Article 5 (data minimisation, plus the accountability principle in 5(2)) and Article 32 (security of processing) both come down to being able to show what you did. A CSV of the 1,368 IDs you purged, dated and committed somewhere durable, is better than a cleaner slate with no record.

We also added the leak to our internal PITFALLS.md — the running register of mistakes we never want to repeat. If you do not keep one already, the entry takes one line: date, what happened, root cause, the boring fix that prevents the recurrence.

One last check

Go open your Stripe dashboard. Sort customers by recency. Skim the last 100. If you see emails with +test, +e2e, @example.com, @mailosaur, or your CI-tagged patterns, you are doing the same thing we were. Apply the four steps above: skip test emails at the boundary, add the boot-time guard, purge the ghosts, and keep the dated audit trail. The whole job is under an hour.
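The same skim can be scripted. This is a rough sketch under assumptions: `find_suspects` and `SUSPECT_RE` are hypothetical names, and the hard-coded list stands in for emails pulled from the API.

```python
import re

# Hypothetical pattern covering the markers listed above.
SUSPECT_RE = re.compile(
    r"(\+test@|\+e2e@|@example\.com$|@mailosaur\.)", re.IGNORECASE
)


def find_suspects(emails):
    """Return the emails that look like test traffic."""
    return [e for e in emails if e and SUSPECT_RE.search(e)]


# With the stripe library and a key that can read customers, the input could be:
#   emails = [c.email for c in stripe.Customer.list(limit=100).data]
# (Customer.list returns the most recently created customers first.)
emails = ["ann@customer.example", "run9+e2e@access-proof.com", None, "qa+test@foo.dev"]
print(find_suspects(emails))  # ['run9+e2e@access-proof.com', 'qa+test@foo.dev']
```

Any non-empty result on a live account is worth a closer look.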

For everything else that runs silently in the background of a SaaS (accessibility regressions, GDPR lapses, broken email deliverability), the free WCAG scan we run is the same flavour of low-effort, dated audit. It will not catch a Stripe leak. But it will catch the accessibility issues your tests are not even trying to find.

Originally published on access-proof.com.