Shiplight

Posted on • Originally published at shiplight.ai

E2E Testing in GitHub Actions: Setup Guide (2026)

Running E2E tests in GitHub Actions is one of the highest-leverage investments a team can make in release confidence. Tests that run automatically on every pull request catch regressions before they reach staging — not after a customer reports them.

But E2E tests in CI have a reputation problem: they're slow, flaky, and often the first thing teams skip when deadlines tighten. This guide covers how to set them up properly — fast, reliable, and self-healing — so they become a trusted release gate rather than a noise source.

What You'll Set Up

By the end of this guide you'll have:

  • E2E tests running on every pull request via GitHub Actions
  • Environment-specific configuration using GitHub Secrets
  • Parallelized execution to keep CI under 5 minutes
  • Failure artifacts (screenshots, videos) uploaded automatically
  • A self-healing layer so tests don't break on routine UI changes

Prerequisites

  • A GitHub repository with a web application
  • E2E tests written in Playwright (or a tool that wraps it, like Shiplight)
  • Node.js-based project

Step 1: Basic GitHub Actions Workflow

Create .github/workflows/e2e.yml:

name: E2E Tests

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]

jobs:
  e2e:
    runs-on: ubuntu-latest
    timeout-minutes: 30

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

      - name: Run E2E tests
        run: npx playwright test
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}

      - name: Upload test artifacts on failure
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 7

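This runs on every pull request that targets main or develop and on every push to main. It passes secrets to the test step as environment variables and uploads the Playwright HTML report when tests fail.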

Playwright config for CI

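// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  timeout: process.env.CI ? 45_000 : 15_000, // CI runners are slower; allow more time per test
  retries: process.env.CI ? 1 : 0,           // retry once, in CI only
  workers: process.env.CI ? 2 : undefined,   // undefined lets Playwright pick a default locally
  // github: inline annotations; html: writes playwright-report/ for the upload step
  reporter: process.env.CI ? [['github'], ['html', { open: 'never' }]] : 'list',
});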

The github reporter outputs test results as GitHub Actions annotations directly in the PR diff view — no artifact download needed for a quick pass/fail check.
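
The workflow passes BASE_URL to the test step, and this guide promises screenshots and videos on failure, but the config above does not consume either. Here is a minimal sketch of the `use` options that would wire both in, assuming your tests navigate with relative paths such as page.goto('/login'). The helper name ciUseOptions is hypothetical; baseURL, screenshot, video, and trace are standard Playwright options.

```typescript
// Sketch: derive Playwright `use` options from environment variables.
// `ciUseOptions` is a hypothetical helper, not part of Playwright.
interface UseOptionsSketch {
  baseURL: string | undefined;
  screenshot: 'off' | 'on' | 'only-on-failure';
  video: 'off' | 'on' | 'retain-on-failure' | 'on-first-retry';
  trace: 'off' | 'on' | 'retain-on-failure' | 'on-first-retry';
}

function ciUseOptions(env: Record<string, string | undefined>): UseOptionsSketch {
  const inCi = Boolean(env.CI); // same truthiness check as `process.env.CI ? ... : ...`
  return {
    baseURL: env.BASE_URL,                      // set from secrets.STAGING_URL in the workflow
    screenshot: 'only-on-failure',              // cheap enough to keep on everywhere
    video: inCi ? 'retain-on-failure' : 'off',  // record in CI, keep only failing runs
    trace: inCi ? 'on-first-retry' : 'off',     // trace the retry of a failed test
  };
}
```

In playwright.config.ts this would be passed as use: ciUseOptions(process.env). When the HTML reporter is enabled, the report includes these attachments; the raw files are also written to test-results/.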

Step 2: Store Secrets Correctly

Never hardcode credentials in your workflow. Store them in Settings → Secrets and variables → Actions:

Secret               Purpose
STAGING_URL          Base URL of your staging environment
TEST_USER_EMAIL      Test account email
TEST_USER_PASSWORD   Test account password
SHIPLIGHT_API_TOKEN  If using Shiplight Cloud for execution

Reference them as ${{ secrets.SECRET_NAME }}. GitHub masks secret values in logs. Secrets are not passed to workflows triggered by pull requests from forks, so those runs see empty values.
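
A missing or empty secret usually surfaces as a confusing navigation or login failure deep inside a test. A small fail-fast check makes the cause obvious. This is a sketch only: requireEnv is a hypothetical helper, and where you call it (for example in a global setup file) is up to you.

```typescript
// Sketch: fail fast when a required environment variable is missing or empty.
// `requireEnv` is a hypothetical helper, not part of Playwright or GitHub Actions.
function requireEnv<K extends string>(
  env: Record<string, string | undefined>,
  names: readonly K[],
): Record<K, string> {
  // An unset secret arrives as undefined or an empty string; treat both as missing.
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const resolved = {} as Record<K, string>;
  for (const name of names) {
    resolved[name] = env[name] as string;
  }
  return resolved;
}
```

A call such as requireEnv(process.env, ['BASE_URL', 'TEST_USER_EMAIL', 'TEST_USER_PASSWORD']) returns the three values as plain strings or throws with the names that are missing.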

Step 3: Parallelize to Keep CI Fast

Playwright's sharding splits your suite across multiple runners:

jobs:
  e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx playwright install --with-deps chromium

      - name: Run shard
        # blob output is the format that `playwright merge-reports` consumes
        run: npx playwright test --shard=${{ matrix.shard }}/4 --reporter=blob
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}

      - name: Upload shard report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: playwright-report-shard-${{ matrix.shard }}
          path: blob-report/

Four shards typically cut a 20-minute suite down to 5–6 minutes.
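
The arithmetic behind that figure: every shard pays the same fixed setup cost (checkout, npm ci, browser install) and then runs its share of the suite. The sketch below assumes the suite splits evenly, which Playwright does not guarantee, since one long spec file can dominate a shard. The function name and the one-minute setup figure are illustrative assumptions.

```typescript
// Sketch: rough wall-clock estimate for a sharded run, assuming an even split.
// `estimateShardedMinutes` is a hypothetical helper for back-of-envelope planning.
function estimateShardedMinutes(
  suiteMinutes: number,  // serial runtime of the whole suite
  shards: number,        // number of parallel runners
  setupMinutes: number,  // per-runner fixed cost: checkout, install, browsers
): number {
  if (!Number.isInteger(shards) || shards < 1) {
    throw new RangeError('shards must be a positive integer');
  }
  // Shards run in parallel, so the setup cost is paid once in wall-clock terms.
  return setupMinutes + suiteMinutes / shards;
}
```

With a 20-minute suite, four shards, and an assumed one minute of setup, estimateShardedMinutes(20, 4, 1) gives 6. Adding shards has diminishing returns because the setup cost does not shrink.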

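Step 4: Merge Reports from Parallel Shards

Add a second job to the same workflow that downloads each shard's artifact and merges them into one HTML report. Note that playwright merge-reports reads blob reports, so each shard has to run with the blob reporter (--reporter=blob) and upload blob-report/: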

  merge-reports:
    needs: e2e
    runs-on: ubuntu-latest
    if: always()

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci

      - name: Download shard reports
        uses: actions/download-artifact@v4
        with:
          pattern: playwright-report-shard-*
          path: all-reports/
          merge-multiple: true

      - name: Merge reports
        run: npx playwright merge-reports --reporter html ./all-reports

      - name: Upload merged report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report-merged
          path: playwright-report/
          retention-days: 14

Step 5: Gate PRs on Test Results

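Make test failures block merges: go to Settings → Branches → Add rule, enable Require status checks to pass before merging, and select the e2e job as a required check. With the sharded matrix from Step 3, each shard is reported as its own check (for example e2e (1)), so select all of them.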

Now tests are a real quality gate — not optional feedback.

Step 6: Run Nightly Full Regression

Separate your fast PR suite (critical paths, ~5 min) from deep regression (scheduled):

name: Nightly Regression

on:
  schedule:
    - cron: '0 2 * * *'  # 2am UTC every day

jobs:
  regression:
    runs-on: ubuntu-latest
    timeout-minutes: 60

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx playwright install --with-deps

      - name: Run full regression suite
        run: npx playwright test --project=regression
        env:
          BASE_URL: ${{ secrets.PRODUCTION_URL }}

      - name: Notify on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          payload: '{"text": "Nightly regression failed"}'
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK  # needed by v1 when the URL is an incoming webhook
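
The --project=regression flag assumes your Playwright config defines a project with that name. One way to split the fast PR suite from the full regression is to tag critical-path tests in their titles and filter on the tag. The sketch below models that idea in plain TypeScript; the project names and the @critical tag are assumptions, while name and grep are real Playwright project options.

```typescript
// Sketch: two projects, one filtered to tagged critical-path tests.
// The names 'pr' and 'regression' and the '@critical' tag are illustrative choices.
interface ProjectSketch {
  name: string;
  grep?: RegExp; // only tests whose title matches are included
}

const projects: ProjectSketch[] = [
  { name: 'pr', grep: /@critical/ }, // fast PR gate
  { name: 'regression' },            // full suite, no filter
];

// Models which test titles a project would pick up.
function selectedTitles(project: ProjectSketch, titles: string[]): string[] {
  const pattern = project.grep;
  return titles.filter((title) => pattern === undefined || pattern.test(title));
}
```

In playwright.config.ts the array would go under the projects key. Once projects are defined, a bare npx playwright test runs every project, so the PR workflow would pass --project=pr.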

Step 7: Add Self-Healing Tests

The biggest CI pain point isn't slow tests — it's tests that break every time a developer renames a CSS class or moves a button. Traditional E2E tests bind to implementation details. Every UI refactor breaks tests even when behavior is unchanged.

Shiplight solves this with intent-based self-healing. Instead of CSS selectors, Shiplight tests store the semantic intent of each step:

# Survives CSS renames, component refactors, layout changes
goal: Verify user can complete checkout
statements:
  - intent: Navigate to the product page
  - intent: Add item to cart
  - intent: Proceed to checkout
  - intent: Enter shipping address
  - VERIFY: order confirmation message is visible

When a UI change breaks a locator, Shiplight's AI resolves the correct element from the live DOM using the intent description. A developer renaming btn-checkout to btn-place-order doesn't break a single test.

The GitHub Actions integration posts results directly back to the PR:

      - name: Run E2E tests with Shiplight
        uses: shiplightai/run-tests@v1
        with:
          api-token: ${{ secrets.SHIPLIGHT_API_TOKEN }}
          suite-id: ${{ vars.E2E_SUITE_ID }}
          environment-id: ${{ vars.STAGING_ENV_ID }}
          post-pr-comment: true

The PR comment includes a pass/fail summary, an AI-generated failure explanation, and a link to the full run with screenshots and a step-by-step trace.

Common Problems and Fixes

Tests pass locally but fail in CI

Cause: Timing differences, missing environment variables, or headless browser behavior.

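Fix: Wait on conditions rather than fixed delays. Avoid page.waitForTimeout(); use web-first assertions such as expect(locator).toBeVisible(), page.waitForResponse() for network requests, or page.waitForLoadState() instead. GitHub Actions already sets CI=true, so process.env.CI checks in your config apply without extra setup; confirm that every other variable the tests read is passed through env.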

Tests are flaky in CI

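// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 2 : 0,  // retry only in CI
  workers: process.env.CI ? 2 : 4,  // fewer workers in CI
  timeout: 30_000,                  // per-test timeout in milliseconds
});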
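
Retries change how a result should be read. Playwright reports a test that fails and then passes on a retry as flaky rather than passed. The sketch below models that categorization in plain TypeScript; it illustrates the rule and is not Playwright's own code.

```typescript
// Sketch: how a sequence of attempts maps to a reported outcome.
// Illustrative model only; `classify` is a hypothetical helper.
type Attempt = 'passed' | 'failed';
type Outcome = 'passed' | 'flaky' | 'failed';

function classify(attempts: Attempt[]): Outcome {
  if (attempts.length === 0) {
    throw new Error('at least one attempt is required');
  }
  const last = attempts[attempts.length - 1];
  if (last === 'failed') {
    return 'failed'; // every attempt, including all retries, failed
  }
  // Retries stop at the first pass, so more than one attempt means an earlier failure.
  return attempts.length > 1 ? 'flaky' : 'passed';
}
```

A flaky test does not fail the run, so a green check can still hide instability. It is worth watching the flaky count in the report instead of raising retries until everything passes.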

If tests break whenever a developer renames a component, retries won't help — the locator is genuinely broken. Migrate to semantic selectors (getByRole, getByTestId) or use Shiplight's self-healing layer.

CI is too slow

Shard (Step 3), scope your PR suite to critical paths, move deep regression to nightly, and cache Playwright browsers:

      - name: Cache Playwright browsers
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

Artifacts not uploading on timeout

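Use if: always() instead of if: failure() so the upload step still runs when the job is cancelled or hits its timeout. It also helps to give the test step its own timeout-minutes, shorter than the job's, so a hung run fails at the step and leaves time for the upload.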

Full Production-Ready Workflow

name: E2E Tests

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  e2e:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3]

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci

      - name: Cache Playwright browsers
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - run: npx playwright install --with-deps chromium

      - name: Run E2E shard
        run: npx playwright test --shard=${{ matrix.shard }}/3
        env:
          CI: true
          BASE_URL: ${{ secrets.STAGING_URL }}
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}

      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: report-shard-${{ matrix.shard }}
          path: playwright-report/
          retention-days: 7

Key Takeaways

  • Gate PRs on E2E results: branch protection rules make tests a real quality signal
  • Shard for speed: four shards typically cut a 20-minute suite to 5–6 minutes
  • Separate PR gate from nightly regression: fast critical paths on PRs, deep coverage overnight
  • Cache Playwright browsers: saves 30–60 seconds per run
  • Self-healing tests reduce CI breakage: the Shiplight Plugin stores the intent behind each step so tests survive CSS renames and refactors automatically

References: GitHub Actions documentation · Playwright CI documentation
