ko-chan

Posted on • Originally published at ko-chan.github.io

Testing WebAuthn in CI: E2E Automation with Virtual Authenticators and Mailpit [Part 2]

This article was originally published on Saru Blog.


What You'll Learn

  • How to test WebAuthn (passkey) authentication in CI environments
  • Automating OTP email retrieval with Mailpit API
  • Preventing email race conditions in parallel E2E tests
  • Locale-specific testing for multilingual UIs

Introduction

In Part 1, I introduced the overall architecture and automation strategy for "Saru," a multi-tenant SaaS platform. This article dives deeper into the E2E testing implementation that forms the core of that automation.

The most challenging aspect is testing authentication flows. Saru uses two authentication methods:

| Portal | Auth Method | Challenge |
|---|---|---|
| System / Provider | OTP + Passkey | Email retrieval, WebAuthn |
| Reseller / Consumer | Keycloak OAuth | External IdP integration |

This article explains how to automate testing all of these in CI.

1. WebAuthn Virtual Authenticator: Testing Passkeys in CI

The Challenge with Passkey Authentication

WebAuthn (passkeys) typically requires physical security keys or biometric authentication. At first glance, testing this in CI seems impossible.

Solution: Chrome DevTools Protocol (CDP) Virtual Authenticator

Playwright allows you to create virtual authenticators through CDP. This enables testing the full WebAuthn flow without physical devices.

Note: CDP virtual authenticators are Chromium-only. They don't work with Safari (WebKit) or Firefox. For cross-browser testing, run WebAuthn tests only on Chromium and mock authenticated state for other browsers.
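
One way to keep WebAuthn tests on Chromium only is to tag them and exclude that tag in the other browser projects. The sketch below assumes a hypothetical `@webauthn` tag in test titles; the project list is illustrative and not taken from Saru's actual configuration.

```typescript
// playwright.config.ts (sketch; the @webauthn tag is a hypothetical convention)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Chromium runs everything, including tests tagged @webauthn
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    // Firefox and WebKit skip tests whose title contains @webauthn
    { name: 'firefox', use: { ...devices['Desktop Firefox'] }, grepInvert: /@webauthn/ },
    { name: 'webkit', use: { ...devices['Desktop Safari'] }, grepInvert: /@webauthn/ },
  ],
});
```

With this layout, a passkey test would be named something like `'should register a passkey @webauthn'` so that only the Chromium project picks it up.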

Implementation Code

import { test, expect, type BrowserContext } from '@playwright/test';

test('should complete signup with Passkey registration', async ({ page, context }) => {
  // Enable virtual authenticator
  const cdpSession = await context.newCDPSession(page);
  await cdpSession.send('WebAuthn.enable');

  // Add virtual authenticator
  await cdpSession.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',           // CTAP2 protocol
      transport: 'usb',            // Emulate USB connection
      hasResidentKey: true,        // Passkey capable
      hasUserVerification: true,   // Emulate biometric auth
      isUserVerified: true,        // Always succeed verification
      automaticPresenceSimulation: true, // Auto-respond
    },
  });

  // ... Execute signup flow ...

  // Click Passkey registration button
  await page.getByRole('button', { name: 'Passkey' }).click();

  // Virtual authenticator responds automatically
  await expect(page.getByText('Passkey registered')).toBeVisible();

  // Cleanup
  await cdpSession.send('WebAuthn.disable');
});
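
When several tests need an authenticator, the setup and cleanup steps can be factored into helpers. The following is a minimal sketch: `CdpLike`, `setupVirtualAuthenticator`, and `teardownVirtualAuthenticator` are hypothetical names, and the structural `CdpLike` type stands in for Playwright's CDPSession so the snippet stays self-contained (a cast may be needed when passing a real session).

```typescript
// Minimal structural stand-in for Playwright's CDPSession (hypothetical type)
interface CdpLike {
  send(method: string, params?: object): Promise<unknown>;
}

// Enables the WebAuthn domain and adds a roaming CTAP2 authenticator.
// Returns the authenticator ID reported by CDP.
async function setupVirtualAuthenticator(session: CdpLike): Promise<string> {
  await session.send('WebAuthn.enable');
  const result = (await session.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',
      transport: 'usb',
      hasResidentKey: true,
      hasUserVerification: true,
      isUserVerified: true,
      automaticPresenceSimulation: true,
    },
  })) as { authenticatorId: string };
  return result.authenticatorId;
}

// Removes the authenticator, then disables the WebAuthn domain again.
async function teardownVirtualAuthenticator(
  session: CdpLike,
  authenticatorId: string
): Promise<void> {
  await session.send('WebAuthn.removeVirtualAuthenticator', { authenticatorId });
  await session.send('WebAuthn.disable');
}
```

Keeping the returned ID around also makes it possible to remove a single authenticator mid-test, for example to check how the UI reacts when no passkey is available.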

Alignment Between Transport Settings and Server Configuration

When setting up the WebAuthn virtual authenticator, alignment with server-side settings is crucial.

In Saru's case, the backend generates WebAuthn registration options with AuthenticatorAttachment: CrossPlatform, which asks the browser for a roaming authenticator (a USB security key, for example).

Initially, I used transport: 'internal' (platform authenticator), which caused registration to fail.

// The server asks for CrossPlatform, so the transport must be a roaming one
// transport: 'internal',  // Platform authenticator → registration failed
transport: 'usb',          // Roaming authenticator → aligns with server

Key Point: The virtual authenticator's transport setting needs to align with the server's AuthenticatorAttachment setting. If registration fails, check the server configuration first. The WebAuthn spec doesn't require an exact 1:1 correspondence between the two, but misalignment is a common cause of failures.
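
The alignment rule can be written down as a small lookup. This is a sketch under stated assumptions: the attachment values are the WebAuthn strings `platform` and `cross-platform` (which the backend's CrossPlatform constant is assumed to serialize to), the transports are a subset of the CDP transport names, and `compatibleTransports` and `isAligned` are hypothetical helpers.

```typescript
// Attachment values from the WebAuthn spec; transport names as used by CDP
type Attachment = 'platform' | 'cross-platform';
type Transport = 'usb' | 'nfc' | 'ble' | 'internal';

// Maps the server's authenticator attachment to the virtual authenticator
// transports that should be compatible with it.
function compatibleTransports(attachment: Attachment): Transport[] {
  return attachment === 'platform' ? ['internal'] : ['usb', 'nfc', 'ble'];
}

// Returns true when the chosen transport matches the server's attachment.
function isAligned(attachment: Attachment, transport: Transport): boolean {
  return compatibleTransports(attachment).includes(transport);
}
```

A check like `isAligned('cross-platform', 'internal')` returning false mirrors the failure described above, where an internal transport was paired with a CrossPlatform server setting.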

2. Automating OTP Email Retrieval: Mailpit API Integration

Problems with Traditional Approaches

Many E2E test suites retrieve the OTP from a test-only endpoint:

// Get OTP via test mode (not recommended)
const response = await request.get(`/signup/${sessionId}/test/otp`);
const { otp } = await response.json();

Problems:

  • Adds TEST_MODE branches to production code
  • Doesn't test actual email sending
  • Diverges from real user flows

Solution: Mailpit API

Saru uses the API of Mailpit, a mail server for development, to extract the OTP from emails that were actually sent.

const MAILPIT_API_URL = 'http://localhost:8025/api/v1';

// Subset of the Mailpit message summary fields used below
interface MailpitMessage {
  To: { Address: string }[];
  Subject: string;
  Created: string;
  Snippet: string;
}

export async function waitForOtpEmail(
  email: string,
  type: 'login' | 'signup',
  maxAttempts = 30,
  sentAfter?: string
): Promise<string | null> {
  const subjectPatterns = {
    login: ['ログインコード', 'Login Code'],
    signup: ['認証コード', 'Verification Code'],
  };

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Wait one second between polls
    await new Promise(resolve => setTimeout(resolve, 1000));

    const response = await fetch(`${MAILPIT_API_URL}/messages`);
    if (!response.ok) continue; // Mailpit not ready yet; retry on the next poll
    const data: { messages: MailpitMessage[] } = await response.json();

    // Search for email
    const otpEmail = data.messages.find(msg => {
      // Check recipient
      if (!msg.To.some(to => to.Address === email)) return false;
      // Check subject pattern
      if (!subjectPatterns[type].some(p => msg.Subject.includes(p))) return false;
      // Timestamp filter (explained below)
      if (sentAfter && new Date(msg.Created) < new Date(sentAfter)) return false;
      return true;
    });

    if (otpEmail) {
      // Extract 6-digit OTP
      const match = otpEmail.Snippet.match(/(\d{6})/);
      if (match) return match[1];
    }
  }
  // No matching email arrived within maxAttempts polls
  return null;
}

Key Points:

  • Tests actual email sending flow
  • Supports both Japanese/English subject patterns
  • Polls for up to 30 seconds (handles SMTP cold start delays)
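
The pattern /(\d{6})/ used in waitForOtpEmail also matches the first six digits of a longer number, such as an order ID in the email body. A slightly stricter extraction could look like the sketch below; `extractOtp` is a hypothetical helper, not part of Saru's code.

```typescript
// Returns the first standalone 6-digit run in the text, or null if none exists.
// The lookarounds reject digits that are part of a longer number.
function extractOtp(text: string): string | null {
  const match = text.match(/(?<!\d)\d{6}(?!\d)/);
  return match ? match[0] : null;
}
```

Because the lookarounds only test for neighboring digits, the function behaves the same for Japanese and English email bodies.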

3. Preventing Email Race Conditions in Parallel Tests

The Problem: OTP Mix-ups in Parallel Execution

When running multiple tests in parallel in CI, tests may accidentally retrieve another test's OTP.

For example:

  1. Test A: Sends OTP to user-a@example.com
  2. Test B: Sends OTP to user-b@example.com
  3. Test A: Searches Mailpit → Gets Test B's OTP

Solution: Timestamp + Unique Address Filtering

Saru combines two methods to prevent race conditions:

  1. Unique email addresses: Each test uses a different email address
  2. Timestamp filtering: Record the time before the OTP request, and search only emails created after that time

In a test, the two combine as follows:

// Record timestamp before login
const sentAfter = new Date().toISOString();

// Submit email address (unique per test)
await page.fill('input[type="email"]', email);
await page.getByRole('button', { name: 'Login' }).click();

// Get OTP with timestamp filtering
const otp = await waitForOtpEmail(email, 'login', 30, sentAfter);
// Filtering inside waitForOtpEmail
const otpEmail = data.messages.find(msg => {
  // Recipient check (narrow down by unique address)
  if (!msg.To.some(to => to.Address === email)) return false;

  // Timestamp filter (exclude old emails)
  if (sentAfter) {
    const emailTime = new Date(msg.Created).getTime();
    const filterTime = new Date(sentAfter).getTime();
    if (emailTime < filterTime) return false;
  }
  return true;
});
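
The article does not show how the unique addresses are generated. One possible approach, sketched here with a hypothetical `uniqueEmail` helper, combines a timestamp with a random suffix so that parallel workers are very unlikely to collide; the prefix and domain are illustrative.

```typescript
// Builds a practically unique address per call by combining a timestamp
// and a random suffix. The prefix and default domain are illustrative.
function uniqueEmail(prefix: string, domain = 'example.com'): string {
  const timestamp = Date.now().toString(36);
  const random = Math.random().toString(36).slice(2, 8);
  return `${prefix}-${timestamp}-${random}@${domain}`;
}
```

A test would call this once at the start, for example `const email = uniqueEmail('login-test')`, and pass the same value to both the form and waitForOtpEmail.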

Alternative Approaches for Parallel Testing

More robust methods to consider:

| Method | Pros | Cons |
|---|---|---|
| Unique address + timestamp (Saru's approach) | Simple, no backend changes | Vulnerable to clock skew |
| Embed X-Request-ID in email | Uniquely identifies email | Requires backend changes |
| Mailpit Search API | Direct filtering by conditions | Depends on API features |

Saru's approach favors simplicity: it works well enough in practice and needs no backend changes.
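
If clock skew between the test runner and the mail server becomes a problem, the timestamp comparison can allow a small tolerance. This is a sketch, not Saru's implementation: `isSentAfter` is a hypothetical helper and the 2-second default is an arbitrary assumption.

```typescript
// Returns true when the email was created at or after the recorded time,
// allowing the mail server's clock to lag by up to toleranceMs.
function isSentAfter(created: string, sentAfter: string, toleranceMs = 2000): boolean {
  const emailTime = new Date(created).getTime();
  const filterTime = new Date(sentAfter).getTime();
  return emailTime >= filterTime - toleranceMs;
}
```

The tolerance trades one risk for another: a wider window is more forgiving of skew but more likely to match an older email sent to the same address, which is why it works best together with unique addresses.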

Deprecated: clearMailpit()

Previously, clearMailpit() deleted all emails before each test, but in parallel execution this also deletes other tests' emails. Timestamp filtering made the function unnecessary, so it is now deprecated.

/**
 * @deprecated Use timestamp-based filtering instead.
 * This function causes race conditions in parallel tests.
 */
export async function clearMailpit(): Promise<void> {
  await fetch(`${MAILPIT_API_URL}/messages`, { method: 'DELETE' });
}

4. Appendix: Locale-Specific Testing

This is not directly related to authentication testing, but it is a useful technique for E2E testing of multilingual apps.

The Challenge with Multilingual E2E

Common approach:

// Regex to support multiple languages (old approach)
await expect(page.getByRole('button', {
  name: /(ログイン|Login|登录)/
})).toBeVisible();

Problem: Regex must be updated every time a language is added.

Locale-Specific Testing Pattern

In Saru, we fix the language at test time and directly verify that language's text.

// e2e/utils/locale.ts
export async function setLocale(
  context: BrowserContext,
  locale: 'ja' | 'en'
): Promise<void> {
  // Note: domain: 'localhost' may behave differently across browsers
  // Consider using url option if issues arise
  await context.addCookies([{
    name: 'locale',
    value: locale,
    domain: 'localhost',
    path: '/',
  }]);
}
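
The code comment above mentions the url option as a fallback. A variant using it might look like the following sketch; `CookieJar` is a hypothetical structural stand-in for the part of BrowserContext used here, and `baseUrl` is an assumed parameter holding the origin of the portal under test.

```typescript
// Structural stand-in for the part of BrowserContext used here (hypothetical type)
interface CookieJar {
  addCookies(cookies: { name: string; value: string; url: string }[]): Promise<void>;
}

// Same idea as setLocale, but the cookie is scoped by URL
// instead of by a domain and path pair.
async function setLocaleByUrl(
  context: CookieJar,
  locale: 'ja' | 'en',
  baseUrl: string
): Promise<void> {
  await context.addCookies([{ name: 'locale', value: locale, url: baseUrl }]);
}
```

Playwright accepts either a url or a domain and path pair for each cookie, so only one of the two forms should be used at a time.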
// Test file
const TEXT = {
  LOGIN: 'ログイン',
  PRODUCT_NAME: '商品名',
  CREATE: '作成',
} as const;

test.beforeEach(async ({ context }) => {
  await setLocale(context, 'ja');
});

test('should create a product', async ({ page }) => {
  await page.getByLabel(TEXT.PRODUCT_NAME).fill('テスト商品');
  await page.getByRole('button', { name: TEXT.CREATE }).click();
});

Benefits: Text is explicit and readable; impact scope is clear when adding languages.
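
To make the impact of a new language visible at compile time, the text constants can be keyed by locale. This is a sketch rather than Saru's actual layout: `TEXT_BY_LOCALE` and `textFor` are hypothetical names, and the English strings are assumed for illustration.

```typescript
type Locale = 'ja' | 'en';
type TextKey = 'LOGIN' | 'PRODUCT_NAME' | 'CREATE';

// Every locale must provide every key. Adding a locale to the union
// makes the compiler flag the missing dictionary.
const TEXT_BY_LOCALE: Record<Locale, Record<TextKey, string>> = {
  ja: { LOGIN: 'ログイン', PRODUCT_NAME: '商品名', CREATE: '作成' },
  // English strings are assumed for illustration
  en: { LOGIN: 'Login', PRODUCT_NAME: 'Product name', CREATE: 'Create' },
};

// Returns the dictionary for the locale fixed by the test.
function textFor(locale: Locale): Record<TextKey, string> {
  return TEXT_BY_LOCALE[locale];
}
```

A test file would then pick its dictionary once, for example `const TEXT = textFor('ja')`, and keep using `TEXT.CREATE` as before.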

5. CI Configuration: Parallel Execution on Self-hosted Runners

Matrix Strategy for Parallelization

The GitHub Actions workflow uses a matrix strategy to run each portal's tests in parallel.

# .github/workflows/e2e-tests.yml
jobs:
  e2e:
    runs-on: [self-hosted, linux, x64]
    strategy:
      fail-fast: false
      matrix:
        portal:
          - name: system
            tests: "e2e/system-*.spec.ts e2e/system-portal/*.spec.ts"
            api_port: 8080
          - name: provider
            tests: "e2e/provider-portal/*.spec.ts"
            api_port: 8081
          # ... other portals

Separating Cross-Portal Tests

Tests spanning multiple portals (e.g., Provider→Reseller integration) run in a separate job.

Reasons:

  • Tests that log in as the same user compete with each other
  • OTP retrieval timing overlaps

The cross-portal job is defined as follows:

e2e-cross-portal:
  needs: [db-setup, e2e]  # Run after other E2E tests complete
  runs-on: [self-hosted, linux, x64]
  steps:
    - name: Run cross-portal tests
      run: |
        pnpm exec playwright test \
          e2e/auth.spec.ts \
          e2e/dashboard.spec.ts \
          e2e/search-filters.spec.ts

6. Running Cross-Portal Tests Locally

Since it takes 15-20 minutes to reach cross-portal tests in CI, we have scripts for local verification first.

# Run all cross-portal tests
./scripts/run-e2e-cross-portal.sh

# Smoke tests only
./scripts/run-e2e-cross-portal.sh smoke

# Run with visible browser
./scripts/run-e2e-cross-portal.sh --headed

# Playwright UI mode
./scripts/run-e2e-cross-portal.sh --ui

Summary

| Challenge | Solution | Constraints/Notes |
|---|---|---|
| Testing WebAuthn authentication | CDP virtual authenticator | Chromium only |
| OTP email retrieval | Mailpit API integration | Requires polling |
| Email race conditions in parallel tests | Unique address + timestamp | Watch for clock skew |
| Multilingual UI testing | Locale-specific testing | Cookie setup dependent |
| CI execution time | Matrix parallelization + cross-portal separation | Complex job design |

With these mechanisms, Saru's main authentication flows are automated in CI. Production-specific issues (external IdP outages, browser update behavior changes, etc.) still require manual verification, but manual testing in the daily development cycle has been significantly reduced.

