Faizal

Posted on Jun 21

The Playwright Playbook — Part 7: The CI/CD Setup Nobody Shows You

#typescript #playwright #testing #automation

"A test suite that only runs on your laptop isn't a test suite. It's a hobby."

Six parts in, we have a serious framework.

POM-based UI tests. Network interception. Multi-user contexts. A full API testing layer. Visual regression across four viewports. A complete debugging toolkit.

Now it needs to run automatically. On every pull request. On every merge. On every deployment. Without you touching it.

Most CI/CD tutorials for Playwright show you this:

# The "tutorial" version everyone copies
- run: npx playwright test

That's not a CI setup. That's a shell command in a YAML file.

A real production CI/CD pipeline for Playwright has:

Sharding — split tests across multiple machines and finish in a fraction of the time
Browser matrix — Chromium, Firefox, WebKit in parallel
Docker — identical environment on every machine, every time
Artifacts — HTML report, traces, screenshots, videos — downloadable from every run
Failure notifications — your team knows within seconds, not the next morning
Separate VRT workflow — visual regression on its own cadence, not blocking every PR
Environment-specific pipelines — staging vs production, different configurations

Let's build all of it. 🎯

🏗️ Where We Left Off

After Part 6, our full project structure is:

playwright-playbook/
├── tests/
│   ├── auth/login.spec.ts                       ✅ Part 1
│   ├── tasks/task-management.spec.ts            ✅ Part 1
│   ├── network/                                 ✅ Part 2
│   ├── multi-user/                              ✅ Part 3
│   ├── multi-tab/                               ✅ Part 3
│   ├── api/                                     ✅ Part 4
│   ├── visual/                                  ✅ Part 5
│   └── debug/trace-examples.spec.ts             ✅ Part 6
├── pages/
│   ├── LoginPage.ts                             ✅ Part 1
│   ├── TaskPage.ts                              ✅ Part 1
│   └── DashboardPage.ts                         ✅ Part 3
├── api/
│   ├── TaskApiClient.ts                         ✅ Part 4
│   └── AuthApiClient.ts                         ✅ Part 4
├── fixtures/
│   ├── auth.fixture.ts                          ✅ Part 1
│   ├── tasks.json                               ✅ Part 2
│   ├── empty-tasks.json                         ✅ Part 2
│   ├── tasks-har.har                            ✅ Part 2
│   ├── multi-user.fixture.ts                    ✅ Part 3
│   └── api.fixture.ts                           ✅ Part 4
├── scripts/
│   └── record-har.ts                            ✅ Part 2
├── utils/
│   ├── schema-validator.ts                      ✅ Part 4
│   ├── visual-helpers.ts                        ✅ Part 5
│   └── debug-helpers.ts                         ✅ Part 6
├── snapshots/                                   ✅ Part 5
├── .vscode/
│   ├── extensions.json                          ✅ Part 6
│   └── launch.json                              ✅ Part 6
├── .auth/
├── global-setup.ts                              ✅ Part 1
├── playwright.config.ts                         ✅ Parts 1–6
└── .env

By the end of Part 7, we add:

playwright-playbook/
├── .github/
│   └── workflows/                               ← NEW
│       ├── playwright.yml
│       └── playwright-visual.yml
├── docker/                                      ← NEW
│   ├── Dockerfile
│   └── docker-compose.yml
├── scripts/
│   └── notify-slack.ts                          ← NEW
└── .gitignore                                   ← NEW (complete version)

Every file gets fully built below. 👇

🧠 The CI Architecture — Mental Model First

Before we write a single line of YAML, understand the architecture we're building:

On every Pull Request:
  ┌─────────────────────────────────────────────────────┐
  │  playwright.yml                                     │
  │                                                     │
  │  Shard 1 (machine 1): auth + tasks + network tests  │
  │  Shard 2 (machine 2): multi-user + multi-tab tests  │
  │  Shard 3 (machine 3): api tests                     │
  │  Shard 4 (machine 4): debug tests                   │
  │                                                     │
  │  All shards run in parallel → merge reports         │
  │  Upload: HTML report + traces + screenshots         │
  │  Notify Slack on failure                            │
  └─────────────────────────────────────────────────────┘

On merge to main and nightly (VRT):
  ┌─────────────────────────────────────────────────────┐
  │  playwright-visual.yml                              │
  │                                                     │
  │  Runs inside Docker (consistent rendering)          │
  │  Full visual regression suite                       │
  │  On failure: diff images uploaded as artifacts      │
  │  Requires manual approval to update baselines       │
  └─────────────────────────────────────────────────────┘

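Two separate workflows. Different triggers. Different purposes. Clean separation. 🎯

The folder-to-shard mapping in the diagram is only illustrative: Playwright splits the tests across shards itself, so you never assign folders to machines by hand.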

⚙️ Updating `playwright.config.ts` — CI-Ready Final Version

First, the final version of our config — tuned for both local development and CI:

// playwright.config.ts — final version
import { defineConfig, devices } from '@playwright/test';
import * as dotenv from 'dotenv';

dotenv.config();

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,

  // Fail the build if test.only is accidentally committed
  forbidOnly: !!process.env.CI,

  // Retry once on CI — surfaces flaky tests without hiding them
  retries: process.env.CI ? 1 : 0,

  // Parallel workers: CI uses 4, local falls back to Playwright's default (half the logical CPU cores)
  workers: process.env.CI ? 4 : undefined,

  // Reporters: list for CI console output, blob for merging shards, HTML and JSON as artifacts
  reporter: process.env.CI
    ? [
        ['list'],
        ['blob'], // Writes blob-report/, which the CI workflow uploads and merges across shards
        ['html', { open: 'never', outputFolder: 'playwright-report' }],
        ['json', { outputFile: 'test-results/results.json' }],
        ['github'], // Annotates failing tests directly in the PR
      ]
    : [
        ['list'],
        ['html', { open: 'on-failure' }],
      ],

  expect: {
    toHaveScreenshot: {
      maxDiffPixelRatio: 0.002,
      timeout: 10000,
      animations: 'disabled',
    },
  },

  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
    trace: process.env.CI ? 'on-first-retry' : 'on',
    extraHTTPHeaders: {
      'Accept': 'application/json',
      'Content-Type': 'application/json',
    },
  },

  projects: [
    {
      name: 'admin',
      use: {
        ...devices['Desktop Chrome'],
        storageState: '.auth/admin.json',
      },
      testMatch: ['**/auth/**', '**/tasks/**', '**/network/**'],
    },
    {
      name: 'user',
      use: {
        ...devices['Desktop Chrome'],
        storageState: '.auth/user.json',
      },
      testMatch: ['**/tasks/**'],
    },
    {
      name: 'multi-context',
      use: { ...devices['Desktop Chrome'] },
      testMatch: ['**/multi-user/**', '**/multi-tab/**'],
    },
    {
      name: 'api',
      use: {},
      testMatch: ['**/api/**'],
    },
    {
      name: 'visual',
      use: {
        ...devices['Desktop Chrome'],
        storageState: '.auth/admin.json',
        viewport: { width: 1280, height: 720 },
        launchOptions: {
          args: ['--disable-gpu', '--force-device-scale-factor=1'],
        },
      },
      testMatch: ['**/visual/**'],
      snapshotDir: './snapshots',
    },
    {
      name: 'visual-responsive',
      use: {
        ...devices['Desktop Chrome'],
        storageState: '.auth/admin.json',
      },
      testMatch: ['**/visual/responsive**'],
      snapshotDir: './snapshots/responsive',
    },
    // Cross-browser matrix — run on merge to main only
    {
      name: 'firefox',
      use: {
        ...devices['Desktop Firefox'],
        storageState: '.auth/admin.json',
      },
      testMatch: ['**/auth/**', '**/tasks/**'],
    },
    {
      name: 'webkit',
      use: {
        ...devices['Desktop Safari'],
        storageState: '.auth/admin.json',
      },
      testMatch: ['**/auth/**', '**/tasks/**'],
    },
    // Mobile browser projects
    {
      name: 'mobile-chrome',
      use: {
        ...devices['Pixel 7'],
        storageState: '.auth/user.json',
      },
      testMatch: ['**/tasks/**'],
    },
    {
      name: 'mobile-safari',
      use: {
        ...devices['iPhone 14'],
        storageState: '.auth/user.json',
      },
      testMatch: ['**/tasks/**'],
    },
  ],
});
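
To make the CI/local split easier to see, here is a minimal sketch that pulls the environment-dependent values out of the config into a pure function. `resolveRunSettings` and `RunSettings` are hypothetical names for illustration, not part of Playwright; the values simply mirror the config above.

```typescript
// Hypothetical helper: mirrors the CI-dependent values in the config above.
interface RunSettings {
  forbidOnly: boolean;
  retries: number;
  workers: number | undefined; // undefined lets Playwright pick its default
  trace: 'on' | 'on-first-retry';
}

function resolveRunSettings(env: Record<string, string | undefined>): RunSettings {
  // Same truthiness rule as `!!process.env.CI`: any non-empty string counts as CI.
  const isCI = Boolean(env.CI);
  return {
    forbidOnly: isCI,
    retries: isCI ? 1 : 0,
    workers: isCI ? 4 : undefined,
    trace: isCI ? 'on-first-retry' : 'on',
  };
}

const ciSettings = resolveRunSettings({ CI: 'true' });
const localSettings = resolveRunSettings({});
console.log(ciSettings, localSettings);
```

One consequence worth knowing: because the check is plain truthiness, `CI=false` still switches the config into CI mode. Leave the variable unset locally instead of setting it to `false`.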

🐳 Docker — Consistent Environments Everywhere

The single biggest source of VRT flakiness is rendering differences between machines. macOS renders fonts differently from Linux. One engineer's machine differs from another's. CI differs from both.

Docker solves this by running Playwright inside the official Microsoft Playwright image — the same image, everywhere.

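# docker/Dockerfile
# Keep this tag in sync with the @playwright/test version in package.json.
# A mismatch means the browsers baked into the image are not the ones the test runner expects.
FROM mcr.microsoft.com/playwright:v1.47.0-jammy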

WORKDIR /app

# Copy package files first — better Docker layer caching
COPY package*.json ./
RUN npm ci

# Copy the rest of the project
COPY . .

# Default command — run all tests
CMD ["npx", "playwright", "test"]

# docker/docker-compose.yml
version: '3.8'

services:
  playwright:
    build:
      context: ..
      dockerfile: docker/Dockerfile
    environment:
      - CI=true
      - BASE_URL=${BASE_URL:-http://app:3000}
      - ADMIN_EMAIL=${ADMIN_EMAIL}
      - ADMIN_PASSWORD=${ADMIN_PASSWORD}
      - USER_EMAIL=${USER_EMAIL}
      - USER_PASSWORD=${USER_PASSWORD}
    volumes:
      # Mount test-results and reports back to host for inspection
      - ../test-results:/app/test-results
      - ../playwright-report:/app/playwright-report
      - ../snapshots:/app/snapshots
    depends_on:
      - app
    networks:
      - playwright-network

  # Your app container — replace with your actual app image
  app:
    image: your-app:latest
    ports:
      - '3000:3000'
    environment:
      - NODE_ENV=test
      - DB_SEED=true
    networks:
      - playwright-network

networks:
  playwright-network:
    driver: bridge

Run locally with Docker:

# Build and run the full suite in Docker
docker-compose -f docker/docker-compose.yml up --build

# Run only visual tests in Docker
docker-compose -f docker/docker-compose.yml run playwright \
  npx playwright test --project=visual

# Update visual baselines inside Docker (so they match CI rendering)
docker-compose -f docker/docker-compose.yml run playwright \
  npx playwright test --project=visual --update-snapshots
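
The Docker setup only gives consistent results while the image tag and the installed Playwright version agree. A minimal sketch of a guard for that, assuming the Dockerfile uses the `mcr.microsoft.com/playwright:v<version>-<distro>` tag format shown above; `extractImageVersion` and `imageMatchesPackage` are hypothetical helper names:

```typescript
// Hypothetical check: does the Playwright image tag in a Dockerfile match
// the @playwright/test version declared in package.json?
function extractImageVersion(dockerfile: string): string | null {
  // Matches lines like: FROM mcr.microsoft.com/playwright:v1.47.0-jammy
  const match = dockerfile.match(/^FROM\s+mcr\.microsoft\.com\/playwright:v(\d+\.\d+\.\d+)/m);
  return match ? match[1] : null;
}

function imageMatchesPackage(dockerfile: string, packageVersion: string): boolean {
  // Strip a leading range operator (^ or ~) from the package.json value.
  // Note: a range can still resolve to a newer installed version, so comparing
  // against the lockfile's resolved version would be stricter.
  const declared = packageVersion.replace(/^[\^~]/, '');
  return extractImageVersion(dockerfile) === declared;
}

const sampleDockerfile = 'FROM mcr.microsoft.com/playwright:v1.47.0-jammy\nWORKDIR /app\n';
console.log(imageMatchesPackage(sampleDockerfile, '^1.47.0')); // true
console.log(imageMatchesPackage(sampleDockerfile, '1.48.2'));  // false
```

In a real project you would read `docker/Dockerfile` and `package.json` from disk and fail the build when the function returns `false`.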

🔔 Slack Notification Script

When tests fail in CI at 2am, your team should know immediately — not when they check GitHub the next morning.

// scripts/notify-slack.ts
import * as fs from 'fs';
import * as path from 'path';

interface TestResult {
  stats: {
    expected: number;
    unexpected: number;
    skipped: number;
    flaky: number;
    duration: number;
  };
  suites: Array<{
    title: string;
    specs: Array<{
      title: string;
      ok: boolean;
      tests: Array<{ results: Array<{ status: string; error?: { message: string } }> }>;
    }>;
  }>;
}

async function notifySlack(): Promise<void> {
  const webhookUrl = process.env.SLACK_WEBHOOK_URL;
  if (!webhookUrl) {
    console.log('No SLACK_WEBHOOK_URL set — skipping Slack notification.');
    return;
  }

  const resultsPath = path.join(process.cwd(), 'test-results', 'results.json');
  if (!fs.existsSync(resultsPath)) {
    console.error('results.json not found. Run tests first.');
    process.exit(1);
  }

  const results: TestResult = JSON.parse(fs.readFileSync(resultsPath, 'utf-8'));
  const { stats } = results;

  const passed = stats.expected;
  const failed = stats.unexpected;
  const flaky = stats.flaky;
  const skipped = stats.skipped;
  const duration = Math.round(stats.duration / 1000);

  // Only notify when something failed or was flaky
  if (failed === 0 && flaky === 0) {
    console.log(`✅ All ${passed} tests passed. No Slack notification needed.`);
    return;
  }

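  // Collect failed test names. The JSON reporter nests test.describe blocks
  // as child suites, so walk the tree instead of reading only the top level.
  type SuiteNode = TestResult['suites'][number] & { suites?: SuiteNode[] };
  const failedTests: string[] = [];
  const collectFailures = (suite: SuiteNode, trail: string[]): void => {
    const titles = [...trail, suite.title];
    for (const spec of suite.specs) {
      if (!spec.ok) {
        failedTests.push(`• ${titles.join(' › ')} › ${spec.title}`);
      }
    }
    for (const child of suite.suites ?? []) {
      collectFailures(child, titles);
    }
  };
  for (const suite of results.suites) {
    collectFailures(suite, []);
  }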

  const runUrl = process.env.GITHUB_RUN_URL ?? 'N/A';
  const branch = process.env.GITHUB_REF_NAME ?? 'unknown branch';
  const actor = process.env.GITHUB_ACTOR ?? 'unknown';

  const payload = {
    blocks: [
      {
        type: 'header',
        text: {
          type: 'plain_text',
          text: `❌ Playwright Tests Failed — ${branch}`,
        },
      },
      {
        type: 'section',
        fields: [
          { type: 'mrkdwn', text: `*Passed:*\n✅ ${passed}` },
          { type: 'mrkdwn', text: `*Failed:*\n❌ ${failed}` },
          { type: 'mrkdwn', text: `*Flaky:*\n⚠️ ${flaky}` },
          { type: 'mrkdwn', text: `*Duration:*\n⏱ ${duration}s` },
        ],
      },
      {
        type: 'section',
        text: {
          type: 'mrkdwn',
          text: `*Failed tests:*\n${failedTests.slice(0, 10).join('\n')}${
            failedTests.length > 10
              ? `\n_...and ${failedTests.length - 10} more_`
              : ''
          }`,
        },
      },
      {
        type: 'section',
        fields: [
          { type: 'mrkdwn', text: `*Triggered by:*\n${actor}` },
          { type: 'mrkdwn', text: `*Branch:*\n${branch}` },
        ],
      },
      {
        type: 'actions',
        elements: [
          {
            type: 'button',
            text: { type: 'plain_text', text: '🔍 View CI Run' },
            url: runUrl,
            style: 'danger',
          },
        ],
      },
    ],
  };

  const response = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });

  if (!response.ok) {
    console.error(`Slack notification failed: ${response.status}`);
    process.exit(1);
  }

  console.log('📢 Slack notification sent.');
}

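// Log the error and exit non-zero so CI shows that the notification step failed
notifySlack().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});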
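
The script notifies on both failed and flaky runs, so it can help to keep the "should we notify, and with what headline" decision in one small, testable function. A minimal sketch, assuming the same `stats` fields the script reads; `summarizeRun` and `RunStats` are hypothetical names, and the headline wording is only a suggestion:

```typescript
// Hypothetical helper: decide whether to notify and which headline to use.
interface RunStats {
  expected: number;   // tests that passed
  unexpected: number; // tests that failed
  flaky: number;      // tests that passed only after a retry
}

function summarizeRun(stats: RunStats, branch: string): { notify: boolean; headline: string } {
  if (stats.unexpected > 0) {
    return { notify: true, headline: `❌ Playwright Tests Failed: ${branch}` };
  }
  if (stats.flaky > 0) {
    return { notify: true, headline: `⚠️ Playwright Tests Flaky: ${branch}` };
  }
  return { notify: false, headline: `✅ All ${stats.expected} tests passed: ${branch}` };
}

const flakyOnly = summarizeRun({ expected: 40, unexpected: 0, flaky: 2 }, 'main');
console.log(flakyOnly.headline);
```

With this split, a flaky-only run no longer shows up in Slack under a "Failed" header, which keeps the red alerts meaningful.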

🚀 The Main GitHub Actions Workflow

This is the full production workflow — sharded, parallelized, with artifacts and notifications.

# .github/workflows/playwright.yml
name: Playwright Tests

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]
  # Allow manual trigger from GitHub UI
  workflow_dispatch:

env:
  # These come from GitHub repository secrets
  BASE_URL: ${{ secrets.BASE_URL }}
  ADMIN_EMAIL: ${{ secrets.ADMIN_EMAIL }}
  ADMIN_PASSWORD: ${{ secrets.ADMIN_PASSWORD }}
  USER_EMAIL: ${{ secrets.USER_EMAIL }}
  USER_PASSWORD: ${{ secrets.USER_PASSWORD }}

jobs:
  # ─────────────────────────────────────────
  # Job 1: Install and cache dependencies
  # ─────────────────────────────────────────
  install:
    name: Install Dependencies
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Cache Playwright browsers
        uses: actions/cache@v4
        id: playwright-cache
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install Playwright browsers
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps

      # Cache node_modules for downstream jobs
      - name: Cache node_modules
        uses: actions/cache@v4
        with:
          path: node_modules
          key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

  # ─────────────────────────────────────────
  # Job 2: Run tests — sharded across 4 machines
  # ─────────────────────────────────────────
  test:
    name: Test (Shard ${{ matrix.shardIndex }}/${{ matrix.shardTotal }})
    runs-on: ubuntu-latest
    needs: install

    strategy:
      fail-fast: false  # Don't cancel other shards if one fails
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

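      - name: Restore node_modules
        uses: actions/cache@v4
        id: cache-node
        with:
          path: node_modules
          key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install dependencies (cache miss fallback)
        if: steps.cache-node.outputs.cache-hit != 'true'
        run: npm ci

      - name: Restore Playwright browsers
        uses: actions/cache@v4
        id: playwright-cache
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install Chromium (cache miss fallback)
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install chromium

      # System libraries are not part of the browser cache, so install them on every run
      - name: Install Playwright system dependencies
        run: npx playwright install-deps chromium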

      - name: Run Playwright tests (Shard ${{ matrix.shardIndex }})
        run: |
          npx playwright test \
            --project=admin \
            --project=user \
            --project=multi-context \
            --project=api \
            --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
        env:
          CI: true

      # Upload blob report — merged in the next job
      - name: Upload blob report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report/
          retention-days: 3

  # ─────────────────────────────────────────
  # Job 3: Cross-browser tests — main branch only
  # ─────────────────────────────────────────
  cross-browser:
    name: Cross-Browser (${{ matrix.browser }})
    runs-on: ubuntu-latest
    needs: install
    if: github.ref == 'refs/heads/main'

    strategy:
      fail-fast: false
      matrix:
        browser: [firefox, webkit]

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

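      - name: Restore node_modules
        uses: actions/cache@v4
        id: cache-node
        with:
          path: node_modules
          key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install dependencies (cache miss fallback)
        if: steps.cache-node.outputs.cache-hit != 'true'
        run: npm ci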

      - name: Install ${{ matrix.browser }}
        run: npx playwright install --with-deps ${{ matrix.browser }}

      - name: Run tests on ${{ matrix.browser }}
        run: npx playwright test --project=${{ matrix.browser }}
        env:
          CI: true

      - name: Upload blob report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: blob-report-${{ matrix.browser }}
          path: blob-report/
          retention-days: 3

  # ─────────────────────────────────────────
  # Job 4: Merge shard reports into one HTML report
  # ─────────────────────────────────────────
  merge-reports:
    name: Merge Reports & Publish
    runs-on: ubuntu-latest
    # Wait for cross-browser too, so its blob reports are included when it runs
    needs: [test, cross-browser]
    if: always()

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

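      - name: Restore node_modules
        uses: actions/cache@v4
        id: cache-node
        with:
          path: node_modules
          key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install dependencies (cache miss fallback)
        if: steps.cache-node.outputs.cache-hit != 'true'
        run: npm ci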

      - name: Download all blob reports
        uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true

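      - name: Merge into single HTML report
        run: |
          npx playwright merge-reports \
            --reporter html,json \
            ./all-blob-reports
        env:
          # Without this, the JSON reporter prints to stdout and results.json is never written
          PLAYWRIGHT_JSON_OUTPUT_NAME: test-results/results.json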

      - name: Upload merged HTML report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report-${{ github.run_id }}
          path: playwright-report/
          retention-days: 14

      - name: Upload test results JSON
        uses: actions/upload-artifact@v4
        with:
          name: test-results-json
          path: test-results/results.json
          retention-days: 14

  # ─────────────────────────────────────────
  # Job 5: Notify Slack on failure
  # ─────────────────────────────────────────
  notify:
    name: Slack Notification
    runs-on: ubuntu-latest
    needs: [test, merge-reports]
    if: failure()

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

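      - name: Restore node_modules
        uses: actions/cache@v4
        id: cache-node
        with:
          path: node_modules
          key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install dependencies (cache miss fallback)
        if: steps.cache-node.outputs.cache-hit != 'true'
        run: npm ci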

      - name: Download test results
        uses: actions/download-artifact@v4
        with:
          name: test-results-json
          path: test-results/

      - name: Send Slack notification
        run: npx ts-node scripts/notify-slack.ts
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          GITHUB_RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
          GITHUB_REF_NAME: ${{ github.ref_name }}
          GITHUB_ACTOR: ${{ github.actor }}

🎨 The Visual Regression Workflow — Separate Pipeline

VRT runs on its own schedule — not on every PR (too slow), but on every merge to main and nightly.

# .github/workflows/playwright-visual.yml
name: Visual Regression Tests

on:
  push:
    branches: [main]
  schedule:
    # Run every night at 2am UTC
    - cron: '0 2 * * *'
  workflow_dispatch:
    inputs:
      update_snapshots:
        description: 'Update baseline snapshots?'
        required: false
        default: false
        type: boolean

env:
  BASE_URL: ${{ secrets.BASE_URL }}
  ADMIN_EMAIL: ${{ secrets.ADMIN_EMAIL }}
  ADMIN_PASSWORD: ${{ secrets.ADMIN_PASSWORD }}
  USER_EMAIL: ${{ secrets.USER_EMAIL }}
  USER_PASSWORD: ${{ secrets.USER_PASSWORD }}

jobs:
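  visual-regression:
    name: Visual Regression
    runs-on: ubuntu-latest
    # Run inside the same Playwright image as docker/Dockerfile so CI renders
    # pixels the same way as local Docker runs. Keep the tag in sync with the
    # @playwright/test version in package.json.
    container:
      image: mcr.microsoft.com/playwright:v1.47.0-jammy
      options: --user 1001
    # Required so the "Commit updated snapshots" step is allowed to push
    permissions:
      contents: write

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      # Browsers and their system dependencies are already baked into the image,
      # so no `playwright install` step is needed here.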

      - name: Run visual regression tests
        run: |
          npx playwright test \
            --project=visual \
            --project=visual-responsive \
            ${{ github.event.inputs.update_snapshots == 'true' && '--update-snapshots' || '' }}
        env:
          CI: true

      # If snapshots were updated, commit them back to the repo
      - name: Commit updated snapshots
        if: github.event.inputs.update_snapshots == 'true'
        run: |
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'
          git add snapshots/
          git diff --staged --quiet || git commit -m "chore: update visual regression baselines [skip ci]"
          git push
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      # Upload diff images when VRT fails — so you can see exactly what changed
      - name: Upload visual diff artifacts
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: visual-diffs-${{ github.run_id }}
          path: |
            test-results/**/*-diff.png
            test-results/**/*-actual.png
            test-results/**/*-expected.png
          retention-days: 14

      - name: Upload VRT HTML report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: vrt-report-${{ github.run_id }}
          path: playwright-report/
          retention-days: 14

      - name: Notify Slack on VRT failure
        if: failure()
        run: npx ts-node scripts/notify-slack.ts
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          GITHUB_RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
          GITHUB_REF_NAME: ${{ github.ref_name }}
          GITHUB_ACTOR: ${{ github.actor }}
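
The `cond && '--update-snapshots' || ''` expression in the run step is easy to misread. Here is a minimal TypeScript sketch of the same decision, assuming the input arrives as a string the way `github.event.inputs` delivers it; `buildVisualTestArgs` is a hypothetical name used only for illustration:

```typescript
// Hypothetical mirror of the workflow expression that appends --update-snapshots.
function buildVisualTestArgs(updateSnapshotsInput: string | undefined): string[] {
  const args = ['playwright', 'test', '--project=visual', '--project=visual-responsive'];
  // workflow_dispatch inputs read via github.event.inputs are strings;
  // push and schedule runs carry no inputs at all.
  if (updateSnapshotsInput === 'true') {
    args.push('--update-snapshots');
  }
  return args;
}

console.log(buildVisualTestArgs('true').join(' '));
console.log(buildVisualTestArgs(undefined).join(' '));
```

Only a manual run with the checkbox ticked adds the flag, so nightly and merge-triggered runs always compare against the committed baselines.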

📋 The Complete `.gitignore`

After 7 parts, here's the complete gitignore for the whole project:

# .gitignore

# Dependencies
node_modules/

# Environment files — NEVER commit these
.env
.env.local
.env.*.local

# Authentication state — auto-generated by globalSetup
.auth/

# Playwright test artifacts — generated on each run
test-results/
playwright-report/
blob-report/

# Snapshots are committed — they ARE your baselines
# !snapshots/  ← DO NOT ignore snapshots

# TypeScript build output
dist/
build/

# OS files
.DS_Store
Thumbs.db

# IDE files
.idea/
*.swp
*.swo

# Logs
*.log
npm-debug.log*

Note what is NOT ignored: snapshots/. Visual regression baselines must be committed to the repo. They are your source of truth. ✅

🔐 GitHub Secrets — What to Configure

In your GitHub repository → Settings → Secrets and Variables → Actions, add:

Secret name          Value
──────────────────   ──────────────────────────────────────
BASE_URL             https://staging.yourtaskapp.com
ADMIN_EMAIL          admin@test.com
ADMIN_PASSWORD       your-admin-password
USER_EMAIL           user@test.com
USER_PASSWORD        your-user-password
SLACK_WEBHOOK_URL    https://hooks.slack.com/services/...

Never hardcode these in YAML. Never commit them. GitHub Secrets encrypt them at rest and mask them in logs automatically. 🔑
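
A secret that was never configured reaches the workflow as an empty string, and an empty `BASE_URL` makes the config quietly fall back to `http://localhost:3000`. A minimal fail-fast sketch, assuming you would call it early (for example from `global-setup.ts`); `findMissingVars` and `REQUIRED_VARS` are hypothetical names, and the variable list comes from the table above:

```typescript
// Hypothetical fail-fast check; the variable names match the secrets table above.
const REQUIRED_VARS: readonly string[] = [
  'BASE_URL',
  'ADMIN_EMAIL',
  'ADMIN_PASSWORD',
  'USER_EMAIL',
  'USER_PASSWORD',
];

function findMissingVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_VARS,
): string[] {
  // Treat blank values as missing, because an unset secret expands to ''.
  return required.filter((name) => (env[name] ?? '').trim() === '');
}

// Sample environment standing in for process.env
const sampleEnv: Record<string, string | undefined> = {
  BASE_URL: 'https://staging.yourtaskapp.com',
  ADMIN_EMAIL: 'admin@test.com',
  ADMIN_PASSWORD: '',
};

const missingVars = findMissingVars(sampleEnv);
if (missingVars.length > 0) {
  console.error(`Missing required environment variables: ${missingVars.join(', ')}`);
}
```

In the real project you would pass `process.env` and throw when the list is non-empty. This matters most for pull requests from forks, where repository secrets are not available to the workflow.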

📊 Understanding Sharding — Why It Matters

Without sharding, all 100 tests share the workers of a single machine. With 4 shards, each machine gets ~25 tests and the machines run in parallel. To keep the arithmetic simple, assume one worker per machine:

Without sharding (1 machine):
  100 tests × 3 seconds average = 5 minutes total

With sharding (4 machines):
  25 tests × 3 seconds average = ~75 seconds per shard
  All 4 run in parallel = ~75 seconds total

That's close to a 4x speed improvement on the same test suite.
Scale to 8 shards? Up to 8x faster, minus the fixed setup time (checkout, install) that every shard pays.
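
The arithmetic above can be turned into a small estimator. This is a back-of-the-envelope sketch under stated assumptions: one worker per machine, every test taking the average time, and an optional fixed per-shard overhead. `estimateWallClockSeconds` is a hypothetical name and the 60-second overhead is an illustrative number, not a measurement.

```typescript
// Rough wall-clock estimate for a sharded run (one worker per machine assumed).
function estimateWallClockSeconds(
  testCount: number,
  avgTestSeconds: number,
  shards: number,
  perShardOverheadSeconds = 0, // checkout, install, browser setup
): number {
  if (shards < 1) throw new RangeError('shards must be at least 1');
  // The slowest shard carries the rounded-up share of the tests.
  const testsOnBusiestShard = Math.ceil(testCount / shards);
  return testsOnBusiestShard * avgTestSeconds + perShardOverheadSeconds;
}

console.log(estimateWallClockSeconds(100, 3, 1));     // 300
console.log(estimateWallClockSeconds(100, 3, 4));     // 75
console.log(estimateWallClockSeconds(100, 3, 4, 60)); // 135
```

Once overhead is included, the speedup is no longer linear: with 60 seconds of setup, four shards take 135 seconds against 360 for one machine, which is closer to 2.7x than 4x.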

The --shard=N/M flag tells Playwright to run only the Nth chunk of M total chunks:

# Run shard 1 of 4
npx playwright test --shard=1/4

# Run shard 2 of 4
npx playwright test --shard=2/4

Playwright distributes tests evenly across shards automatically. No manual assignment needed. 🎯
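
To build intuition for what "the Nth chunk of M" means, here is an illustrative sketch that slices a list into contiguous, near-equal chunks. It is not Playwright's actual algorithm; `pickShard` is a hypothetical function written only to show the idea.

```typescript
// Illustrative only: NOT Playwright's implementation of --shard.
function pickShard<T>(items: T[], shardIndex: number, shardTotal: number): T[] {
  if (shardTotal < 1 || shardIndex < 1 || shardIndex > shardTotal) {
    throw new RangeError(`Invalid shard ${shardIndex}/${shardTotal}`);
  }
  // Integer boundaries keep chunk sizes within one item of each other.
  const start = Math.floor(((shardIndex - 1) * items.length) / shardTotal);
  const end = Math.floor((shardIndex * items.length) / shardTotal);
  return items.slice(start, end);
}

const sampleTests = Array.from({ length: 10 }, (_, i) => `test-${i + 1}`);
for (let shard = 1; shard <= 4; shard++) {
  console.log(`shard ${shard}/4:`, pickShard(sampleTests, shard, 4));
}
```

Every test lands in exactly one chunk, which is the property that lets the merge job combine shard reports without duplicates or gaps.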

📁 Final Project Structure After Part 7

Every file listed below has been fully built across Parts 1 through 7:

playwright-playbook/
├── .github/
│   └── workflows/                               ✅ Part 7
│       ├── playwright.yml
│       └── playwright-visual.yml
├── tests/
│   ├── auth/login.spec.ts                       ✅ Part 1
│   ├── tasks/task-management.spec.ts            ✅ Part 1
│   ├── network/                                 ✅ Part 2
│   │   ├── api-mocking.spec.ts
│   │   ├── error-simulation.spec.ts
│   │   └── network-assertions.spec.ts
│   ├── multi-user/                              ✅ Part 3
│   │   ├── role-permissions.spec.ts
│   │   └── realtime-collaboration.spec.ts
│   ├── multi-tab/                               ✅ Part 3
│   │   └── multi-tab-flows.spec.ts
│   ├── api/                                     ✅ Part 4
│   │   ├── tasks-api.spec.ts
│   │   ├── auth-api.spec.ts
│   │   ├── graphql-api.spec.ts
│   │   └── api-ui-chain.spec.ts
│   ├── visual/                                  ✅ Part 5
│   │   ├── dashboard-visual.spec.ts
│   │   ├── task-visual.spec.ts
│   │   └── responsive-visual.spec.ts
│   └── debug/                                   ✅ Part 6
│       └── trace-examples.spec.ts
├── pages/
│   ├── LoginPage.ts                             ✅ Part 1
│   ├── TaskPage.ts                              ✅ Part 1
│   └── DashboardPage.ts                         ✅ Part 3
├── api/
│   ├── TaskApiClient.ts                         ✅ Part 4
│   └── AuthApiClient.ts                         ✅ Part 4
├── fixtures/
│   ├── auth.fixture.ts                          ✅ Part 1
│   ├── tasks.json                               ✅ Part 2
│   ├── empty-tasks.json                         ✅ Part 2
│   ├── tasks-har.har                            ✅ Part 2
│   ├── multi-user.fixture.ts                    ✅ Part 3
│   └── api.fixture.ts                           ✅ Part 4
├── scripts/
│   ├── record-har.ts                            ✅ Part 2
│   └── notify-slack.ts                          ✅ Part 7
├── utils/
│   ├── schema-validator.ts                      ✅ Part 4
│   ├── visual-helpers.ts                        ✅ Part 5
│   └── debug-helpers.ts                         ✅ Part 6
├── docker/                                      ✅ Part 7
│   ├── Dockerfile
│   └── docker-compose.yml
├── snapshots/                                   ✅ Part 5
├── .vscode/                                     ✅ Part 6
│   ├── extensions.json
│   └── launch.json
├── .auth/                                       ← git-ignored
├── global-setup.ts                              ✅ Part 1
├── playwright.config.ts                         ✅ Parts 1–7 (final version)
├── .gitignore                                   ✅ Part 7
├── .env                                         ← git-ignored
└── package.json

🗺️ What's Coming in This Series

Part 1 — Stop Writing Tests Like a Beginner              ✅ Done
Part 2 — Network Interception: The Complete Guide        ✅ Done
Part 3 — Multi-User, Multi-Tab & Context Testing         ✅ Done
Part 4 — API Testing (The Underrated Superpower)         ✅ Done
Part 5 — Visual Regression Testing                       ✅ Done
Part 6 — Debugging Like a Pro: Trace Viewer & Inspector  ✅ Done
Part 7 — The CI/CD Setup Nobody Shows You                ← You are here
Part 8 — Playwright Meets AI: Agents, MCP & Self-Healing Tests

In Part 8 — the series finale — we add AI on top of everything we've built. Playwright MCP, AI test agents, and self-healing selectors. The framework we've spent 7 parts building becomes the target for AI-powered test generation and auto-healing.

🔖 Before You Go

Seven parts in. The framework is complete.

And it now runs automatically — on every PR, on every merge, across multiple machines in parallel, with cross-browser coverage, Docker-consistent VRT, downloadable reports, and your team notified the moment something breaks.

That's not a test suite. That's a quality platform. 🏗️

One part left. And it's the one that ties the whole series together: Playwright meets AI.

Follow me so you don't miss Part 8 — the series finale where we add AI agents, MCP, and self-healing tests on top of the framework we've built across all 7 parts.

Drop a comment below 👇

What does your current CI/CD setup for tests look like?
Are you using sharding — or still running everything sequentially?
What's the first GitHub Secret you'd set up from this list?

Let's talk in the comments. 🙌

Faizal Shaikh | Senior Automation Engineer | Playwright & AI Testing
Connect with me on LinkedIn

Top comments (4)

Nazar Boyko • Jun 21

Pulling visual regression out into its own workflow is the call most setups skip, so nice work. Running it on merge and nightly instead of every PR keeps the slow, flaky part from blocking people who just want their code in. The Docker bit underneath it is the actual fix, since font rendering drifts between macOS and Linux and that alone breaks pixel diffs for no real reason. One thing I'd watch is the test and merge jobs. They only restore node_modules from cache and never fall back to npm ci, so a cache miss leaves them with nothing installed and a confusing failure.

Faizal • Jul 11

Really glad that separation landed — you've articulated exactly why it exists. VRT blocking a PR that has nothing to do with UI changes is one of those slow-burn team frustrations that nobody talks about until someone quits. 😄
And the cache miss point is a genuinely good catch. You're right — if the node_modules cache isn't hit, the job just fails with a confusing "module not found" rather than falling back to npm ci. The fix is straightforward:

- name: Restore node_modules
  uses: actions/cache@v4
  id: cache-node
  with:
    path: node_modules
    key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

- name: Install dependencies (cache miss fallback)
  if: steps.cache-node.outputs.cache-hit != 'true'
  run: npm ci

The if: cache-hit != 'true' condition means npm ci only runs when the cache is cold — fast path stays fast, but nothing ever silently breaks. Will add this to the article. Thanks for catching it 🙏

Sol • Jul 7

The sharding + artifact point is the part most CI guides skip: once you can see the failing shard, trace, and browser quickly, the next bottleneck is usually attribution, not detection.

When a test that used to be stable starts flaking in a GitHub Actions setup like this, how does your team figure out which commit introduced it? And if nobody can pin it down confidently, do you usually quarantine it, keep rerunning, or use some other fallback?

Faizal • Jul 11

Great question — and honestly one of the harder problems in CI maintenance.
The way I approach it:
First, the trace + shard artifact tells you WHEN it started failing — you can usually narrow it to a 2-3 commit window just from the run history. From there, git bisect does the heavy lifting if the window is still too wide.
If nobody can pin it confidently after that — I quarantine it. Tag it @flaky, open a tracking issue with the trace attached, and remove it from the blocking suite temporarily. Rerunning indefinitely without understanding why is just hiding the problem.
The one thing I'd add to what's in Part 7: a flaky test annotation in the HTML report is only useful if your team actually looks at it. Pairing the Slack notification with a direct link to the failing shard's trace is what makes attribution fast enough that people actually follow through rather than just re-clicking "re-run". 🎯
