ko-chan

Posted on Jan 13 • Edited on Jan 24 • Originally published at ko-chan.github.io

Testing WebAuthn in CI: E2E Automation with Virtual Authenticators and Mailpit [Part 2]

#e2e #playwright #webauthn #automation

This article was originally published on Saru Blog.

What You'll Learn

How to test WebAuthn (passkey) authentication in CI environments
Automating OTP email retrieval with Mailpit API
Preventing email race conditions in parallel E2E tests
Locale-specific testing for multilingual UIs

Introduction

In Part 1, I introduced the overall architecture and automation strategy for "Saru," a multi-tenant SaaS platform. This article dives deeper into the E2E testing implementation that forms the core of that automation.

The most challenging aspect is testing authentication flows. Saru uses two authentication methods:

Portal	Auth Method	Challenge
System / Provider	OTP + Passkey	Email retrieval, WebAuthn
Reseller / Consumer	Keycloak OAuth	External IdP integration

This article explains how to automate testing all of these in CI.

1. WebAuthn Virtual Authenticator: Testing Passkeys in CI

The Challenge with Passkey Authentication

WebAuthn (passkeys) typically requires physical security keys or biometric authentication. At first glance, testing this in CI seems impossible.

Solution: Chrome DevTools Protocol (CDP) Virtual Authenticator

Playwright allows you to create virtual authenticators through CDP. This enables testing the full WebAuthn flow without physical devices.

Note: CDP virtual authenticators are Chromium-only. They don't work with Safari (WebKit) or Firefox. For cross-browser testing, run WebAuthn tests only on Chromium and mock authenticated state for other browsers.

Implementation Code

import { test, expect, type BrowserContext } from '@playwright/test';

test('should complete signup with Passkey registration', async ({ page, context }) => {
  // Enable virtual authenticator
  const cdpSession = await context.newCDPSession(page);
  await cdpSession.send('WebAuthn.enable');

  // Add virtual authenticator
  await cdpSession.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',           // CTAP2 protocol
      transport: 'usb',            // Emulate USB connection
      hasResidentKey: true,        // Passkey capable
      hasUserVerification: true,   // Emulate biometric auth
      isUserVerified: true,        // Always succeed verification
      automaticPresenceSimulation: true, // Auto-respond
    },
  });

  // ... Execute signup flow ...

  // Click Passkey registration button
  await page.getByRole('button', { name: 'Passkey' }).click();

  // Virtual authenticator responds automatically
  await expect(page.getByText('Passkey registered')).toBeVisible();

  // Cleanup
  await cdpSession.send('WebAuthn.disable');
});
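
Note that the `WebAuthn.disable` call above only runs if every earlier step passes. Below is a minimal sketch of a wrapper that cleans up even when the test body throws. The helper name `withVirtualAuthenticator` and the `CdpSessionLike` interface are hypothetical and not from Saru's codebase. The interface is a structural stand-in for the session returned by `context.newCDPSession(page)`, and the sketch assumes `WebAuthn.addVirtualAuthenticator` resolves with an `authenticatorId`.

```typescript
// Structural stand-in for Playwright's CDPSession (assumption: only `send` is needed).
interface CdpSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical helper: registers a virtual authenticator, runs `body`,
// then removes the authenticator and disables WebAuthn emulation,
// even if `body` throws.
async function withVirtualAuthenticator<T>(
  cdp: CdpSessionLike,
  body: () => Promise<T>,
): Promise<T> {
  await cdp.send('WebAuthn.enable');
  const added = (await cdp.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',
      transport: 'usb',
      hasResidentKey: true,
      hasUserVerification: true,
      isUserVerified: true,
      automaticPresenceSimulation: true,
    },
  })) as { authenticatorId: string };

  try {
    return await body();
  } finally {
    await cdp.send('WebAuthn.removeVirtualAuthenticator', {
      authenticatorId: added.authenticatorId,
    });
    await cdp.send('WebAuthn.disable');
  }
}
```

Inside a test, the signup steps would go in the callback, for example `await withVirtualAuthenticator(cdpSession, async () => { /* signup flow */ })`. Depending on your Playwright typings, a cast may be needed to pass the real session object.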

Alignment Between Transport Settings and Server Configuration

When setting up the WebAuthn virtual authenticator, alignment with server-side settings is crucial.

In Saru's case, the backend generates WebAuthn registration options with AuthenticatorAttachment: CrossPlatform. This setting expresses a preference for roaming authenticators (USB keys, etc.).

Initially, I used transport: 'internal' (platform authenticator), which caused registration to fail.

// When the server prefers CrossPlatform, alignment matters
// transport: 'internal',  // Platform authenticator → may fail
transport: 'usb',          // Roaming authenticator → aligns with server

Key Point: The virtual authenticator's transport setting should align with the server's AuthenticatorAttachment setting. If registration fails, check the server configuration first. The WebAuthn spec doesn't require an exact 1:1 correspondence, but misalignment is a common cause of failures.
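
One way to keep the two settings aligned is to derive the transport from the attachment value the server sends. This is a minimal sketch, not Saru's implementation: `transportFor` is a hypothetical helper, and falling back to `'usb'` when the server states no preference is an arbitrary choice. The attachment values are the ones defined by the WebAuthn spec (`'platform'` and `'cross-platform'`).

```typescript
// Values defined by the WebAuthn spec for authenticatorAttachment.
type Attachment = 'platform' | 'cross-platform';

// Transports accepted by CDP's WebAuthn.addVirtualAuthenticator.
type VirtualTransport = 'usb' | 'nfc' | 'ble' | 'cable' | 'internal';

// Hypothetical helper: pick a virtual transport that matches the
// server's attachment preference. With no preference, 'usb' is an
// arbitrary default (assumption).
function transportFor(attachment?: Attachment): VirtualTransport {
  return attachment === 'platform' ? 'internal' : 'usb';
}
```

The result can be passed as the `transport` option when adding the virtual authenticator.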

2. Automating OTP Email Retrieval: Mailpit API Integration

Problems with Traditional Approaches

Many E2E test suites retrieve the OTP from a test-only endpoint:

// Get OTP via test mode (not recommended)
const response = await request.get(`/signup/${sessionId}/test/otp`);
const { otp } = await response.json();

Problems:

Adds TEST_MODE branches to production code
Doesn't test actual email sending
Diverges from real user flows

Solution: Mailpit API

Saru uses the API of Mailpit (a mail server for development) to extract the OTP from emails that were actually sent.

const MAILPIT_API_URL = 'http://localhost:8025/api/v1';

// Subset of the Mailpit message summary fields used below
interface MailpitMessage {
  To: { Address: string }[];
  Subject: string;
  Created: string;
  Snippet: string;
}

export async function waitForOtpEmail(
  email: string,
  type: 'login' | 'signup',
  maxAttempts = 30,
  sentAfter?: string
): Promise<string | null> {
  const subjectPatterns = {
    login: ['ログインコード', 'Login Code'],
    signup: ['認証コード', 'Verification Code'],
  };

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Wait 1 second between polls
    await new Promise(resolve => setTimeout(resolve, 1000));

    const response = await fetch(`${MAILPIT_API_URL}/messages`);
    // Retry on a failed API response instead of parsing an error body
    if (!response.ok) continue;
    const data = (await response.json()) as { messages: MailpitMessage[] };

    // Search for email
    const otpEmail = data.messages.find(msg => {
      // Check recipient
      if (!msg.To.some(to => to.Address === email)) return false;
      // Check subject pattern
      if (!subjectPatterns[type].some(p => msg.Subject.includes(p))) return false;
      // Timestamp filter (explained below)
      if (sentAfter && new Date(msg.Created) < new Date(sentAfter)) return false;
      return true;
    });

    if (otpEmail) {
      // Extract 6-digit OTP
      const match = otpEmail.Snippet.match(/(\d{6})/);
      if (match) return match[1];
    }
  }
  return null;
}

Key Points:

Tests actual email sending flow
Supports both Japanese/English subject patterns
Polls for up to 30 seconds (handles SMTP cold start delays)
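
The `/(\d{6})/` pattern above also matches the first six digits of any longer number in the snippet. A slightly stricter extraction can be sketched as follows. `extractOtp` is a hypothetical helper name, and the sketch assumes the OTP appears in the text as a standalone 6-digit run.

```typescript
// Hypothetical helper: returns the first standalone 6-digit run, or null.
function extractOtp(text: string): string | null {
  // (?:^|\D) requires a non-digit (or start of text) before the code,
  // and (?!\d) rejects a run that continues with more digits.
  const match = text.match(/(?:^|\D)(\d{6})(?!\d)/);
  return match ? match[1] : null;
}
```

With this, a reference number such as an 8-digit order ID that appears before the code is skipped rather than truncated to its first six digits.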

3. Preventing Email Race Conditions in Parallel Tests

The Problem: OTP Mix-ups in Parallel Execution

When multiple tests run in parallel in CI, they all share one Mailpit inbox, so a test may accidentally retrieve another test's OTP.

For example:

Test A: Sends OTP to user-a@example.com
Test B: Sends OTP to user-b@example.com
Test A: Searches Mailpit → Gets Test B's OTP

Solution: Timestamp + Unique Address Filtering

Saru combines two methods to prevent race conditions:

Unique email addresses: Each test uses a different email address
Timestamp filtering: Record time before OTP request, search only emails after that time

// Record timestamp before login
const sentAfter = new Date().toISOString();

// Submit email address (unique per test)
await page.fill('input[type="email"]', email);
await page.getByRole('button', { name: 'Login' }).click();

// Get OTP with timestamp filtering
const otp = await waitForOtpEmail(email, 'login', 30, sentAfter);
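
How the unique `email` is produced is not shown above. One possible sketch, under stated assumptions: `uniqueEmail` is a hypothetical helper, the `example.com` domain and address layout are illustrative, and `workerIndex` is expected to come from Playwright's `testInfo.workerIndex`.

```typescript
// Per-process counter, so repeated calls in one worker never collide.
let emailCounter = 0;

// Hypothetical helper: builds an address that is unique per worker and per call.
function uniqueEmail(label: string, workerIndex: number): string {
  emailCounter += 1;
  // Base-36 timestamp keeps the address short while separating runs.
  const stamp = Date.now().toString(36);
  return `${label}-w${workerIndex}-${stamp}-${emailCounter}@example.com`;
}
```

In a test this might be called as `uniqueEmail('login', testInfo.workerIndex)` before filling in the email field.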

// Filtering inside waitForOtpEmail
const otpEmail = data.messages.find(msg => {
  // Recipient check (narrow down by unique address)
  if (!msg.To.some(to => to.Address === email)) return false;

  // Timestamp filter (exclude old emails)
  if (sentAfter) {
    const emailTime = new Date(msg.Created).getTime();
    const filterTime = new Date(sentAfter).getTime();
    if (emailTime < filterTime) return false;
  }
  return true;
});
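
The filter above compares the test runner's clock with the timestamp Mailpit records, so a small clock difference can reject a valid email. A minimal sketch of a tolerant comparison follows. `isRecentEnough` is a hypothetical helper, and the 2-second default tolerance is an arbitrary assumption, not a value from Saru.

```typescript
// Hypothetical helper: accepts emails created slightly before `sentAfter`
// to absorb small clock differences between the test runner and Mailpit.
function isRecentEnough(
  created: string,
  sentAfter: string,
  toleranceMs = 2000, // assumption: tune to your environment
): boolean {
  const emailTime = new Date(created).getTime();
  const filterTime = new Date(sentAfter).getTime();
  // Unparseable timestamps are treated as non-matching.
  if (Number.isNaN(emailTime) || Number.isNaN(filterTime)) return false;
  return emailTime >= filterTime - toleranceMs;
}
```

A tolerance widens the window in which a stale email for the same address could be accepted, so it is safest when combined with unique addresses per test.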

Alternative Approaches for Parallel Testing

More robust methods to consider:

Method	Pros	Cons
Unique address + timestamp (Saru's approach)	Simple, no backend changes	Vulnerable to clock skew
Embed X-Request-ID in email	Uniquely identifies email	Requires backend changes
Mailpit Search API	Direct filtering by conditions	Depends on API features

Saru's approach prioritizes "simple and works well enough."
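
For comparison, the Mailpit Search API row in the table could look roughly like this. It is a sketch under assumptions, not something Saru uses: it assumes your Mailpit version exposes `GET /api/v1/search` with a `query` parameter and supports the `to:` filter, and `buildSearchUrl` is a hypothetical helper name.

```typescript
// Hypothetical helper: builds a Mailpit search URL restricted to one recipient.
// Assumes the /search endpoint and the `to:` filter exist in your Mailpit version.
function buildSearchUrl(
  email: string,
  baseUrl = 'http://localhost:8025/api/v1',
): string {
  // URLSearchParams handles percent-encoding of ':' and '@'.
  const params = new URLSearchParams({ query: `to:${email}` });
  return `${baseUrl}/search?${params.toString()}`;
}
```

Fetching this URL instead of the full message list moves the recipient check to the server side; the subject and timestamp checks would still run in the test code.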

Deprecated: clearMailpit()

Previously, clearMailpit() deleted all emails before each test, but in parallel execution this also deletes other tests' emails. Timestamp filtering made the function unnecessary, so it is now marked as deprecated.

/**
 * @deprecated Use timestamp-based filtering instead.
 * This function causes race conditions in parallel tests.
 */
export async function clearMailpit(): Promise<void> {
  await fetch(`${MAILPIT_API_URL}/messages`, { method: 'DELETE' });
}

4. Appendix: Locale-Specific Testing

This is not directly related to authentication testing, but it is a useful technique for E2E testing multilingual apps.

The Challenge with Multilingual E2E

Common approach:

// Regex to support multiple languages (old approach)
await expect(page.getByRole('button', {
  name: /(ログイン|Login|登录)/
})).toBeVisible();

Problem: Regex must be updated every time a language is added.

Locale-Specific Testing Pattern

In Saru, we fix the language at test time and directly verify that language's text.

// e2e/utils/locale.ts
import type { BrowserContext } from '@playwright/test';

export async function setLocale(
  context: BrowserContext,
  locale: 'ja' | 'en'
): Promise<void> {
  // Note: domain: 'localhost' may behave differently across browsers
  // Consider using url option if issues arise
  await context.addCookies([{
    name: 'locale',
    value: locale,
    domain: 'localhost',
    path: '/',
  }]);
}

// Test file
const TEXT = {
  LOGIN: 'ログイン',
  PRODUCT_NAME: '商品名',
  CREATE: '作成',
} as const;

test.beforeEach(async ({ context }) => {
  await setLocale(context, 'ja');
});

test('should create a product', async ({ page }) => {
  await page.getByLabel(TEXT.PRODUCT_NAME).fill('テスト商品');
  await page.getByRole('button', { name: TEXT.CREATE }).click();
});

Benefits: Text is explicit and readable; impact scope is clear when adding languages.
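
To make the impact of adding a language visible at compile time, the per-locale strings can be kept in one typed table. This is a minimal sketch: `TEXT_BY_LOCALE` and `textFor` are hypothetical names, and the English strings other than "Login" are illustrative assumptions rather than Saru's actual UI text.

```typescript
type Locale = 'ja' | 'en';
type TextKey = 'LOGIN' | 'PRODUCT_NAME' | 'CREATE';

// Record<Locale, ...> makes the compiler flag any locale that is missing
// or that lacks one of the keys. English values are assumptions.
const TEXT_BY_LOCALE: Record<Locale, Record<TextKey, string>> = {
  ja: { LOGIN: 'ログイン', PRODUCT_NAME: '商品名', CREATE: '作成' },
  en: { LOGIN: 'Login', PRODUCT_NAME: 'Product name', CREATE: 'Create' },
};

// Returns the strings for the locale fixed in the current test file.
function textFor(locale: Locale): Record<TextKey, string> {
  return TEXT_BY_LOCALE[locale];
}
```

A test file would then pick its strings once, for example `const TEXT = textFor('ja')`, and adding a value to `Locale` produces a type error until its strings are supplied.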

5. CI Configuration: Parallel Execution on Self-hosted Runners

Matrix Strategy for Parallelization

The GitHub Actions workflow uses a matrix strategy to run each portal's tests in parallel.

# .github/workflows/e2e-tests.yml
jobs:
  e2e:
    runs-on: [self-hosted, linux, x64]
    strategy:
      fail-fast: false
      matrix:
        portal:
          - name: system
            tests: "e2e/system-*.spec.ts e2e/system-portal/*.spec.ts"
            api_port: 8080
          - name: provider
            tests: "e2e/provider-portal/*.spec.ts"
            api_port: 8081
          # ... other portals

Separating Cross-Portal Tests

Tests spanning multiple portals (e.g., Provider→Reseller integration) run in a separate job.

Reasons:

Tests that log in as the same user interfere with each other
OTP retrievals for the same user overlap in time

e2e-cross-portal:
  needs: [db-setup, e2e]  # Run after other E2E tests complete
  runs-on: [self-hosted, linux, x64]
  steps:
    - name: Run cross-portal tests
      run: |
        pnpm exec playwright test \
          e2e/auth.spec.ts \
          e2e/dashboard.spec.ts \
          e2e/search-filters.spec.ts

6. Running Cross-Portal Tests Locally

Because CI takes 15-20 minutes to reach the cross-portal tests, we use a script to verify them locally first.

# Run all cross-portal tests
./scripts/run-e2e-cross-portal.sh

# Smoke tests only
./scripts/run-e2e-cross-portal.sh smoke

# Run with visible browser
./scripts/run-e2e-cross-portal.sh --headed

# Playwright UI mode
./scripts/run-e2e-cross-portal.sh --ui

Summary

Challenge	Solution	Constraints/Notes
Testing WebAuthn authentication	CDP virtual authenticator	Chromium only
OTP email retrieval	Mailpit API integration	Requires polling
Email race conditions in parallel tests	Unique address + timestamp	Watch for clock skew
Multilingual UI testing	Locale-specific testing	Cookie setup dependent
CI execution time	Matrix parallelization + cross-portal separation	Complex job design

With these mechanisms, Saru's main authentication flows are automated in CI. Production-specific issues (external IdP outages, browser update behavior changes, etc.) still require manual verification, but manual testing in the daily development cycle has been significantly reduced.

Series Articles

Part 1: Tackling Unmanageable Complexity with Automation
Part 2: Testing WebAuthn in CI (this article)
