TL;DR: If your Playwright/Cypress/Pytest tests involving email keep failing intermittently in CI, the cause is almost always shared inbox state. The fix is per-test inbox isolation, not longer timeouts.
The Symptom
Your CI run produces output that looks like this:
✓ user can register (3.2s)
✗ user receives welcome email (timeout after 30s)
✓ user receives welcome email (2.1s) ← same test, re-run, now passes
The test passes locally. It fails in CI. It passes again on retry. Your team has learned to just re-run the pipeline and move on. The underlying problem is never fixed.
This is not an infrastructure problem. It's not "CI is slower than local." It's a state isolation problem.
The Root Cause
When email-dependent tests share an inbox, several things go wrong.
Race conditions in parallel runs
CI pipelines run tests concurrently. If you're using -n 4 in Pytest (via pytest-xdist) or --workers=4 in Playwright, four tests may be signing up simultaneously, all using the same test@yourcompany.com address.
Worker 1 signs up. Worker 2 signs up. Both trigger welcome emails. Worker 1 reads the inbox and finds two emails. Which one is its? It doesn't know. It reads the wrong one. The test either fails on assertion or false-passes.
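The collision can be reproduced without any mail server. The following is a minimal sketch that uses a plain Python list as a stand-in for the shared inbox; `signup` and `read_welcome_email` are hypothetical helper names chosen for illustration, not part of any library.

```python
# A plain list stands in for the single shared inbox (test@yourcompany.com).
shared_inbox = []


def signup(worker, username):
    """Simulate the app sending a welcome email to the shared address."""
    shared_inbox.append({
        "to": "test@yourcompany.com",
        "subject": "Welcome",
        "body": f"Hello {username}",
        "sent_for": worker,
    })


def read_welcome_email():
    """What a naive test does: take the first 'Welcome' email it finds."""
    return next(m for m in shared_inbox if m["subject"] == "Welcome")


# One interleaving that parallel workers can produce: worker 2's email lands first.
signup("worker-2", "bob")
signup("worker-1", "alice")

email_seen_by_worker_1 = read_welcome_email()
print(email_seen_by_worker_1["body"])  # worker 1 expected "Hello alice", gets "Hello bob"
```

Nothing in the inbox tells worker 1 which message belongs to it, so any assertion on the body depends on arrival order.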
Stale emails from previous runs
A test passes. The inbox now contains a "Welcome" email. The next test run starts. It signs up again, polls the inbox, and immediately finds the email from the previous run — before the new email has even been sent. The test passes — but it's asserting on stale data.
This is why "it works locally, fails in CI" is common: locally you likely run tests serially and far less often, so collisions and leftover emails are rarer. CI runs them fast and in parallel.
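The stale-read case is just as easy to simulate. This is a rough sketch, again with an in-memory list standing in for a long-lived mailbox; `send_welcome` and `poll_for_welcome` are made-up names for illustration.

```python
import time

# Persists across test runs, like a mailbox that is never cleaned up.
shared_inbox = []


def send_welcome(run_id):
    """Simulate the app delivering a welcome email during a given run."""
    shared_inbox.append(
        {"subject": "Welcome", "run": run_id, "received_at": time.time()}
    )


def poll_for_welcome():
    """Naive poll: return the first matching email, regardless of its age."""
    for msg in shared_inbox:
        if msg["subject"] == "Welcome":
            return msg
    return None


# Run 1 completes normally and leaves its email behind.
send_welcome(run_id=1)

# Run 2 signs up, but its email has not arrived yet when the test polls.
found = poll_for_welcome()
print(found["run"])  # 1: the test "passes" on the previous run's email
```

The poll succeeds immediately, which looks like a fast, healthy test, while run 2's email was never checked at all.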
The "longer timeout" trap
The instinct is to increase the wait:
time.sleep(10) # give the email time to arrive
This doesn't fix the race condition. It makes your test suite slower and still non-deterministic.
Why the Obvious Solutions Don't Fully Work
Mocking the email service
If you control the email service, you can intercept emails before sending. But:
- Many teams use third-party auth providers (Auth0, Cognito, Firebase) that send emails you can't intercept
- Mocking removes confidence in the real integration
- It doesn't test whether the email was actually sent, properly formatted, or delivered
A dedicated test Gmail account
Better than sharing, but still has problems:
- Gmail IMAP is rate-limited; fast parallel tests will hit limits
- You still need to clean up emails between runs (fragile teardown)
- Shared between all test workers unless you manage multiple accounts
Mailhog / MailDev (self-hosted SMTP)
Great for testing your own email service, but:
- Doesn't work when the email originates from a third party (Cognito SES, SendGrid, etc.)
- Requires infrastructure in CI (Docker, port mapping)
- Shared inbox per service instance (same race condition problem unless you implement per-test routing)
The Fix: Per-Test Inbox Isolation
The correct architecture is:
- Before each test, create a fresh, unique email inbox
- Use the inbox's address in the test
- Poll that specific inbox for the expected email
- The inbox expires automatically — no teardown
This eliminates shared state entirely. Each test is completely independent. Parallel execution is safe by design.
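To make the steps concrete before wiring up a real service, here is a small in-memory sketch. `FakeInboxService` and `run_signup_test` are hypothetical stand-ins invented for this example (expiry is omitted); they only show that a unique address per test removes the ambiguity.

```python
import uuid


class FakeInboxService:
    """In-memory stand-in for a hosted inbox API (illustration only)."""

    def __init__(self):
        self._inboxes = {}

    def create_inbox(self):
        # A random address means no two tests can ever share an inbox.
        address = f"{uuid.uuid4().hex}@example.test"
        self._inboxes[address] = []
        return address

    def deliver(self, address, subject, body):
        self._inboxes[address].append({"subject": subject, "body": body})

    def messages(self, address):
        return list(self._inboxes[address])


service = FakeInboxService()


def run_signup_test(username):
    inbox = service.create_inbox()                           # fresh inbox for this test
    service.deliver(inbox, "Welcome", f"Hello {username}")   # app emails that address
    return inbox, service.messages(inbox)                    # read only this inbox


inbox_a, mails_a = run_signup_test("alice")
inbox_b, mails_b = run_signup_test("bob")
print(mails_a[0]["body"], "|", mails_b[0]["body"])  # Hello alice | Hello bob
```

Each test sees exactly one message, regardless of how many other tests run at the same time or ran before it.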
How to Implement It
Python (Pytest)
# conftest.py
import os
import time

import pytest
import requests

API_KEY = os.environ["MINUTEMAIL_API_KEY"]
BASE = "https://api.minutemail.co/v1"
HDRS = {"Authorization": f"Bearer {API_KEY}"}

@pytest.fixture
def inbox():
    # Create a fresh mailbox for this test; it expires on its own, so no teardown.
    resp = requests.post(
        f"{BASE}/mailboxes",
        headers=HDRS,
        json={"domain": "minutemail.cc", "expiresIn": 10},
    )
    resp.raise_for_status()
    yield resp.json()

def wait_for_email(mailbox_id, timeout=30):
    # Poll this mailbox only, until a message arrives or the timeout elapses.
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = requests.get(f"{BASE}/mailboxes/{mailbox_id}/mails", headers=HDRS)
        resp.raise_for_status()
        msgs = resp.json().get("items", [])
        if msgs:
            return msgs[0]
        time.sleep(2)
    raise TimeoutError("No email")
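If an inbox can receive more than one message (for example a receipt and a welcome email), the polling helper above can be extended to wait for a specific message. The following is one possible variation, not part of the MinuteMail API: `wait_for_matching_email` is a hypothetical helper, the fetch function is injected so the logic can be exercised without network access, and the `subject` field on messages is an assumption about the response shape.

```python
import time


def wait_for_matching_email(fetch_messages, matches, timeout=30.0, interval=2.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_messages() until a message satisfies matches(), or time out."""
    deadline = clock() + timeout
    while True:
        for msg in fetch_messages():
            if matches(msg):
                return msg
        if clock() >= deadline:
            raise TimeoutError("No matching email before the deadline")
        sleep(interval)


# Demo with a fake fetcher: empty inbox, then a receipt, then the welcome email.
responses = iter([
    [],
    [{"subject": "Receipt"}],
    [{"subject": "Receipt"}, {"subject": "Welcome"}],
])
calls = []


def fake_fetch():
    calls.append(1)
    return next(responses)


email = wait_for_matching_email(
    fake_fetch,
    lambda m: m["subject"] == "Welcome",
    sleep=lambda _seconds: None,  # skip real waiting in the demo
)
print(email["subject"], len(calls))  # Welcome 3
```

In a real test, `fetch_messages` would wrap the `requests.get(...)` call from `wait_for_email` for the mailbox created by the fixture.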
TypeScript (Playwright)
// fixtures/email.ts
import { test as base } from '@playwright/test';

const BASE = 'https://api.minutemail.co/v1';
const headers = {
  Authorization: `Bearer ${process.env.MINUTEMAIL_API_KEY}`,
  'Content-Type': 'application/json', // declare the JSON body sent on POST
};

type Inbox = {
  address: string;
  waitForEmail: (timeout?: number) => Promise<any>;
};

export const test = base.extend<{ inbox: Inbox }>({
  inbox: async ({}, use) => {
    // Create a fresh mailbox for this test; it expires on its own.
    const mb = await fetch(`${BASE}/mailboxes`, {
      method: 'POST',
      headers,
      body: JSON.stringify({ domain: 'minutemail.cc', expiresIn: 10 }),
    }).then(r => r.json());

    await use({
      address: mb.address,
      // Poll this mailbox only, until a message arrives or the timeout elapses.
      waitForEmail: async (timeout = 30000) => {
        const deadline = Date.now() + timeout;
        while (Date.now() < deadline) {
          const { items = [] } = await fetch(`${BASE}/mailboxes/${mb.id}/mails`, { headers }).then(r => r.json());
          if (items.length) return items[0];
          await new Promise(r => setTimeout(r, 2000));
        }
        throw new Error('Email timeout');
      },
    });
  },
});
With this pattern, parallel CI becomes stable by design, not by accident.
What to Use for the Inbox Service
You need a hosted service with:
- Per-inbox API: create a new inbox on demand, get a unique address back
- TTL control: inbox expires automatically (no cleanup logic needed)
- Message polling API: read received emails programmatically
- Reliable delivery: must actually receive emails from third-party senders
A few options:
| Service | API | Per-inbox TTL | Free tier | Notes |
|---|---|---|---|---|
| Mailinator | Limited | No | Yes | Known domains often blocked |
| Mailtrap | Yes | No | Yes (limited) | Better for SMTP testing |
| MinuteMail | Yes | Yes (1–60 min) | Yes (100 calls/day) | Built for this use case |
MinuteMail was built specifically for developer/QA use cases: each POST /mailboxes returns a unique address with configurable TTL. Full docs at https://docs.minutemail.co.
The Outcome
After switching to per-test inbox isolation:
- Tests that were "retry to fix" become deterministic
- Parallel CI works correctly without coordination between workers
- You get confidence that the full email flow works end-to-end, not just that your code tried to send
The pattern requires ~20 lines of fixture code. The payoff is a test suite you can trust.
Tags: #testing #devops #automation #ci
Created: 2026-02-26