Email is the test surface nobody wants to own. It's asynchronous, depends on external providers, has its own version of "flaky" (sometimes a delivery just… takes 90 seconds), and the bugs that bite hardest in production — wrong template variables, broken unsubscribe links, OTPs that expire too fast — are the ones unit tests can't catch.
This guide is the working developer's playbook for testing email in 2026. We'll cover the three layers (unit, integration, end-to-end), the tooling for each, and the patterns that actually scale.
The Three Layers of Email Testing
| Layer | What you're testing | Tooling |
|---|---|---|
| Unit | Template rendering, variable substitution, plain-text fallback | Jest / Vitest with snapshot tests |
| Integration | Your app correctly calls your email provider's API | Provider sandbox or mock SDK |
| End-to-end | The email actually arrives and the user can complete the flow | YoBox Temp Mail + Cypress / Playwright |
Most teams cover unit. Many cover integration. Few cover end-to-end — and that's exactly where the production bugs live.
Layer 1: Unit Tests for Templates
If you're using a templating engine (MJML, Handlebars, JSX-email, React Email), render the template with test data and snapshot the output. This catches:
- Missing variable substitutions (Hello, {{name}} rendered with no name)
- Broken HTML
- Missing plain-text fallback
- Localization bugs
test('welcome email renders', () => {
  const html = renderWelcomeEmail({ name: 'Ada', verifyUrl: 'https://example.com/v/abc' });
  expect(html).toMatchSnapshot();
  expect(html).toContain('Hello, Ada');
  expect(html).toContain('https://example.com/v/abc');
});
These tests are cheap, fast, and catch the "template rendered with undefined literally in the body" class of bugs.
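A snapshot only flags that something changed, so an explicit check for leftovers is a cheap complement. The following is a minimal sketch, assuming Handlebars-style {{...}} placeholders; findUnresolved is a hypothetical helper, not part of any templating library, and it can produce false positives if your copy legitimately contains words like "undefined".

```javascript
// Hypothetical helper: returns suspicious leftovers found in rendered email output.
function findUnresolved(rendered) {
  const problems = [];
  // Unsubstituted Handlebars-style placeholders such as {{name}}.
  for (const match of rendered.matchAll(/\{\{\s*[^}]+?\s*\}\}/g)) {
    problems.push(match[0]);
  }
  // Values that were stringified from missing or malformed data.
  for (const match of rendered.matchAll(/\bundefined\b|\bNaN\b|\[object Object\]/g)) {
    problems.push(match[0]);
  }
  return problems;
}
```

In a template test this could sit next to the snapshot assertion, for example `expect(findUnresolved(html)).toEqual([])`.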
Layer 2: Integration Tests for Send Logic
Your app calls Postmark / SendGrid / SES / Resend with a payload. The integration test asserts that you call the provider with the right payload, not that the email arrives.
Most providers ship a test mode or a sandbox endpoint. Use it.
import { expect, test, vi } from 'vitest';
test('signup triggers verification email', async () => {
  // Stub the call so nothing is sent; drop the stub to exercise the provider's sandbox instead.
  const sendSpy = vi.spyOn(emailProvider, 'send').mockResolvedValue(undefined);
  await signup({ email: 'test@example.com' });
  expect(sendSpy).toHaveBeenCalledWith(expect.objectContaining({
    to: 'test@example.com',
    template: 'verify-email',
    variables: expect.objectContaining({ code: expect.stringMatching(/^\d{6}$/) }),
  }));
});
You'll catch:
- Wrong recipient
- Wrong template
- Missing variables
- Codes outside expected format
You will not catch:
- The email being rejected by Gmail's spam filter
- The OTP arriving 90 seconds late
- The magic link 404'ing because the route changed
Those need end-to-end.
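As one illustration of a provider test mode, SendGrid's v3 Mail Send request body accepts a mail_settings.sandbox_mode.enable flag that asks the API to validate the request without delivering it. The sketch below builds such a body; buildVerifyMail, the sender address, and the template ID are illustrative assumptions, and other providers expose their test modes differently.

```javascript
// Builds a SendGrid v3 Mail Send request body for a verification email.
// With sandbox on (the default here), SendGrid validates the request but does not deliver it.
// The function name, sender address, and template ID are placeholders for illustration.
function buildVerifyMail({ to, code, sandbox = true }) {
  return {
    personalizations: [{ to: [{ email: to }], dynamic_template_data: { code } }],
    from: { email: 'no-reply@example.com' },
    template_id: 'd-verify-email-placeholder',
    mail_settings: { sandbox_mode: { enable: sandbox } },
  };
}
```

Defaulting to sandbox means a forgotten flag in a test cannot send real mail; the production send path has to opt out explicitly with `sandbox: false`.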
Layer 3: End-to-End with Disposable Inboxes
This is where most teams give up. Don't.
The pattern, in one paragraph: spin up a real disposable inbox via API, submit a real signup with that address, let the email actually traverse SMTP, poll the inbox until the message arrives, parse out the OTP or link, submit it back to your app, assert the user is now signed in.
Full implementation in "How to Test Email Flows Without a Real Inbox" — works in Cypress, Playwright, or any HTTP-capable runner.
The YoBox Temp Mail API is designed for this loop: no auth, no rate limit on normal use, no captcha. The address generates in under a second, OTPs typically arrive in 2–5 seconds, and inboxes persist long enough for slow senders (SES, in-house SMTP).
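The "poll the inbox until the message arrives, parse out the OTP" step can be sketched independently of any inbox service. In this minimal version, fetchMessages is whatever function you write to read the inbox; the message shape (an array of objects with a text field) is an assumption, not the YoBox response format, and pollInbox and pickOtp are hypothetical names.

```javascript
// Generic polling loop: calls fetchMessages() until pick(messages) returns a value,
// or throws once the next wait would run past the timeout.
async function pollInbox(fetchMessages, pick, { timeoutMs = 60000, intervalMs = 2000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const found = pick(await fetchMessages());
    if (found !== undefined) return found;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`No matching email within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Picks the first standalone 6-digit code from any message body.
function pickOtp(messages) {
  for (const message of messages) {
    const match = message.text.match(/\b\d{6}\b/);
    if (match) return match[0];
  }
  return undefined;
}
```

A generous timeout with a short interval keeps fast deliveries fast while still tolerating the occasional 90-second straggler; the failure message then says clearly that the email never arrived, rather than surfacing as a confusing assertion error later in the test.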
Pair with Webhook Testing
Many email flows also fire downstream webhooks. SendGrid fires processed, delivered, open; Postmark fires Delivery, Open, Click. To fully test, point those webhooks at the YoBox Webhook Tester during your test run, and assert on both halves:
- Trigger signup
- Wait for OTP email
- Verify code
- Wait for user.verified webhook on your test endpoint
- Assert on both halves
This catches the production-only bugs: the email sent but the webhook didn't fire (bad downstream config), the webhook fired but with wrong payload, the order-of-operations bug where the user clicks before the webhook lands.
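The steps above can be sketched as one sequential helper with a labeled timeout on each wait, so a failure says which half stalled. All four step functions are hypothetical callbacks you would supply (for example, polling loops over the inbox and over the webhook capture endpoint); nothing here reflects a specific YoBox API.

```javascript
// Rejects if the promise does not settle within ms; label names the half that stalled.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} not received within ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Runs signup -> OTP email -> verify -> webhook in order and returns both halves
// so the test can assert on each. The steps object is supplied by the caller.
async function signupAndCapture(steps, ms = 60000) {
  await steps.triggerSignup();
  const code = await withTimeout(steps.waitForOtp(), ms, 'OTP email');
  await steps.verifyCode(code);
  const event = await withTimeout(steps.waitForWebhook('user.verified'), ms, 'user.verified webhook');
  return { code, event };
}
```

Returning both the code and the webhook event keeps the assertions in the test body, where a payload mismatch is easier to read than inside a helper.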
Common Test Scenarios
Signup with OTP
import { test, expect } from '@playwright/test';
// createTempInbox and pollForOtp are your own helpers wrapping the disposable-inbox API.
test('signup', async ({ page, request }) => {
  const inbox = await createTempInbox(request);
  await page.goto('/signup');
  await page.fill('[name=email]', inbox.address);
  await page.click('text=Send code');
  const code = await pollForOtp(request, inbox.token);
  await page.fill('[name=otp]', code);
  await page.click('text=Verify');
  await expect(page).toHaveURL(/dashboard/);
});
Password Reset
test('password reset', async ({ page, request }) => {
  // Assumes existingUser = { email, token } was seeded in setup with a disposable inbox.
  await page.goto('/forgot');
  await page.fill('[name=email]', existingUser.email);
  await page.click('text=Reset password');
  const msg = await pollForEmail(request, existingUser.token);
  const link = msg.text.match(/https:\/\/[^\s]+\/reset\?token=\S+/)?.[0];
  expect(link).toBeTruthy(); // fail here, not on a confusing goto(undefined)
  await page.goto(link);
  await page.fill('[name=password]', 'new-pw');
  await page.click('text=Save');
  await expect(page.locator('text=Password updated')).toBeVisible();
});
Double Opt-In
test('newsletter confirmation', async ({ page, request }) => {
  const inbox = await createTempInbox(request);
  await page.goto('/subscribe');
  await page.fill('[name=email]', inbox.address);
  await page.click('text=Subscribe');
  const msg = await pollForEmail(request, inbox.token);
  const confirmLink = msg.text.match(/https:\/\/[^\s]+\/confirm\?\S+/)?.[0];
  expect(confirmLink).toBeTruthy(); // fail here, not on a confusing goto(undefined)
  await page.goto(confirmLink);
  await expect(page.locator('text=Subscription confirmed')).toBeVisible();
});
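The inline regexes in the reset and confirmation scenarios will also swallow trailing punctuation if the email ends a sentence right after the link. A slightly more defensive sketch is below; extractLink is a hypothetical helper, and stripping trailing punctuation is a heuristic that assumes your tokens never end in one of those characters.

```javascript
// Hypothetical helper: returns the first https URL in text whose path ends with
// pathSuffix (e.g. '/reset'), with trailing punctuation stripped, or undefined.
function extractLink(text, pathSuffix) {
  for (const raw of text.match(/https:\/\/[^\s<>"']+/g) ?? []) {
    const cleaned = raw.replace(/[.,;:!?)\]]+$/, '');
    let url;
    try {
      url = new URL(cleaned);
    } catch {
      continue; // not a parseable URL; try the next candidate
    }
    if (url.pathname.endsWith(pathSuffix)) return url.href;
  }
  return undefined;
}
```

Matching on the parsed pathname rather than the raw string also skips unrelated links (logo, footer, unsubscribe) that appear earlier in the message.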
Deliverability Testing
Unit and integration tests can't tell you if Gmail will route your message to spam. For that, you need:
- Mail-tester.com for one-off checks (send your email to the address it generates and get a deliverability score).
- GlockApps or similar for ongoing inbox-placement monitoring.
- DMARC reports for production sender health.
These aren't replacements for the test layers above; they're a separate axis.
Anti-Patterns to Avoid
- Snapshot the entire HTML email and break on every CSS tweak. Snapshot the key text + structure, not the whole document.
- Use your real personal Gmail for test signups. You'll regret it.
- Stub email entirely in E2E tests. You'll miss the bugs E2E exists to find.
- Share one disposable inbox across parallel tests. Race conditions on messages.
- Set OTP TTL to 60 seconds "for security." Real users with slow inboxes can't complete the flow.
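For the first anti-pattern, one way to snapshot "key text + structure" is to reduce the rendered HTML to its visible text and link targets before snapshotting. The sketch below is a rough, regex-based reduction; emailTextSkeleton is a hypothetical helper, not a real HTML parser, and it assumes simple templates with double-quoted href attributes.

```javascript
// Rough reduction of an HTML email to visible text, one line per block element.
function emailTextSkeleton(html) {
  return html
    .replace(/<(style|script|head)\b[\s\S]*?<\/\1>/gi, '') // drop non-visible sections
    .replace(/<a\b[^>]*href="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, '$2 [$1]') // keep link targets
    .replace(/<\/(p|div|h[1-6]|li|tr)>|<br\s*\/?>/gi, '\n') // block ends become newlines
    .replace(/<[^>]+>/g, '') // strip remaining tags
    .split('\n')
    .map((line) => line.replace(/\s+/g, ' ').trim())
    .filter(Boolean)
    .join('\n');
}
```

Snapshotting `emailTextSkeleton(html)` instead of `html` means a color or padding change leaves the snapshot untouched, while a changed sentence or link target still fails the test.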
Tooling Cheat Sheet
| Need | Tool |
|---|---|
| Render template snapshots | Jest / Vitest |
| Mock email provider SDK | provider's own test mode or vi.spyOn |
| Disposable inbox for E2E | YoBox Temp Mail |
| Capture downstream webhooks | YoBox Webhook Tester |
| Local SMTP catcher for dev | Mailhog, MailCatcher |
| Deliverability score | mail-tester.com |
| Inbox-placement monitoring | GlockApps |
FAQ
Should I test against staging or production email providers?
Staging if available. Production providers usually have a test mode (Postmark "Sandbox", SendGrid "Sandbox Mode") that returns success without actually delivering.
How do I avoid hitting send rate limits in CI?
Cap parallelism, use disposable addresses (so the recipient never hits its rate limit), and reserve full E2E runs for nightly builds.
Can I run these tests in GitHub Actions?
Yes. YoBox Temp Mail and Webhook Tester have no auth and no captcha, so they work in any CI.
What about testing email in mobile apps?
Same pattern: the disposable address is platform-agnostic, so you just drive the mobile UI with Detox or Appium instead of Playwright.
Do I need a separate domain for test emails?
Not strictly. But sending from a different From address in tests (e.g. test@yourdomain) keeps your production sender reputation clean.
Bottom Line
Email testing has three layers and you should cover all three. Unit tests catch template bugs. Integration tests catch send-logic bugs. End-to-end tests with disposable inboxes and webhook capture catch the bugs that only show up when the email actually flies. Skip any layer and you'll find the bug in production instead of CI.