DevHelm

Posted on Jun 19 • Originally published at devhelm.io

Playwright Monitoring: Turn E2E Tests Into Production Monitors


You already have Playwright tests. They run in CI on every pull request, they assert that login works and checkout completes, and then they stop — because CI only runs them against a branch, at merge time. The moment the code is in production, those tests go silent. A third-party script breaks checkout at 3 AM and your perfectly good test suite says nothing, because nothing triggered it.

Playwright monitoring closes that gap: you take the same browser tests and run them on a schedule against production, turning your end-to-end suite into a synthetic monitoring system that watches real user journeys continuously.

Prerequisites

Node.js 18+ and a project with Playwright installed (if it is not yet: npm install -D @playwright/test, then npx playwright install chromium).
A deployed production (or staging) URL to run checks against.
A dedicated synthetic test account — never a real customer's credentials.
A secret store for that account's credentials (GitHub Actions secrets, or your platform's equivalent). Never hard-code them.

Step 1 — Write a check that asserts on what the user sees

A monitor-grade check is not "did the page load." It is "could a user complete the thing they came to do." Assert on the outcome, with a generous timeout for real-world latency:

import { test, expect } from "@playwright/test";

test("checkout reaches confirmation", async ({ page }) => {
  await page.goto("https://shop.example.com");

  await page.getByRole("button", { name: "Add to cart" }).click();
  await page.getByRole("link", { name: "Checkout" }).click();

  await page.getByLabel("Email").fill(process.env.SYNTHETIC_EMAIL!);
  await page.getByLabel("Card number").fill("4242424242424242");
  await page.getByRole("button", { name: "Pay now" }).click();

  // The assertion a 200 OK can never make for you:
  await expect(page.getByText("Order confirmed")).toBeVisible({
    timeout: 15000,
  });
});

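Credentials come from process.env, not the source. The card number is a test number, so it only avoids real charges if the synthetic account's checkout is routed to your payment provider's test or sandbox mode. Confirm that routing before you schedule the check: a live payment integration will typically reject a test number, and a real card would create a real order on every run.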
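One caveat with the check above: the trailing ! on process.env.SYNTHETIC_EMAIL only silences the type checker, so a missing secret would surface later as a less obvious error inside the test. A minimal sketch of a fail-fast guard follows. The helper name requireEnv is hypothetical and not part of Playwright, and it takes the environment as a parameter so it can be exercised without touching real secrets.

```typescript
// Hypothetical helper, not part of Playwright: read a required variable
// from an environment map and fail with a clear message if it is absent.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In the check, this could stand in for process.env.SYNTHETIC_EMAIL!:
//   await page.getByLabel("Email").fill(requireEnv("SYNTHETIC_EMAIL", process.env));
```

With a guard like this, a misconfigured secret fails the run with the variable's name in the message instead of an error deep inside a form-fill step.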

Step 2 — Make assertions wait for conditions, never for time

One of the most common causes of flaky production checks is fixed sleeps. waitForTimeout(3000) either wastes three seconds or races a slow response and fails falsely. Wait for the condition instead:

// Flaky: races real-world timing
await page.waitForTimeout(3000);
expect(await page.getByTestId("balance").textContent()).toBeTruthy();

// Stable: waits for the actual signal, up to a bound
await expect(page.getByTestId("balance")).toBeVisible({ timeout: 10000 });
await expect(page.getByTestId("balance")).not.toBeEmpty();

Playwright's web-first assertions retry automatically until the condition holds or the timeout expires. A check built this way passes in 200 ms when the app is fast and only fails when something is genuinely wrong.
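To make the retry behaviour concrete, here is a minimal sketch of the retry-until-deadline idea. It is not Playwright's implementation. The function name pollUntil and its default interval are assumptions made for illustration only.

```typescript
// Illustrative only: this is NOT Playwright's implementation, just the
// retry-until-deadline idea behind web-first assertions.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs: number,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Condition met: finish immediately, however early that is.
    if (await condition()) return;
    // Another sleep would overrun the deadline: give up with an error.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The timeout is an upper bound, not a delay: a fast app returns on the first poll, and only a condition that never holds costs the full budget.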

Step 3 — Capture evidence on failure

When a production check fails, you need to know why without re-running it by hand. Configure Playwright to keep a screenshot, trace, and video on failure so every alert links to forensic evidence:

// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    screenshot: "only-on-failure",
    trace: "retain-on-failure",
    video: "retain-on-failure",
  },
  timeout: 30000,
  retries: 1, // confirm-on-failure: re-run once before declaring failure
});

retries: 1 is the local form of confirm-on-failure: a single transient blip re-runs once before the check reports red. That removes most false positives, and the only cost to real outage detection is one extra attempt, bounded by the test timeout.
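The confirm-on-failure logic can be sketched in a few lines. This is an illustration, not Playwright's code, and the names runWithConfirmation and Outcome are hypothetical. The three outcomes mirror how Playwright's reports label tests as passed, flaky, or failed when retries are enabled.

```typescript
type Outcome = "passed" | "flaky" | "failed";

// Hypothetical sketch of confirm-on-failure: run a check up to
// 1 + retries times and classify the overall result.
async function runWithConfirmation(
  check: () => Promise<void>,
  retries: number,
): Promise<Outcome> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      await check();
      // A first-attempt pass is clean; a pass that needed a retry is
      // worth recording as flaky rather than alerting on.
      return attempt === 0 ? "passed" : "flaky";
    } catch (e) {
      // Ignore the error and fall through to the next attempt, if any.
    }
  }
  // Every attempt failed: this is the result that should raise an alert.
  return "failed";
}
```

Tracking the flaky outcome separately is a design choice worth keeping: a check that frequently needs its retry is a signal to investigate, even though it never pages anyone.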

Step 4 — Run it on a schedule against production

CI runs tests on commits; monitoring runs them on a clock. The simplest scheduled runner is a cron workflow. In GitHub Actions:

name: synthetic-checkout
on:
  schedule:
    - cron: "*/5 * * * *" # every 5 minutes
  workflow_dispatch:

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci && npx playwright install --with-deps chromium
      - run: npx playwright test checkout.spec.ts
        env:
          SYNTHETIC_EMAIL: ${{ secrets.SYNTHETIC_EMAIL }}
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: failure-evidence
          path: test-results/

This is the honest baseline: it works, it costs nothing on public repositories (private repositories draw on your plan's included Actions minutes), and it gets you scheduled browser checks today. Its limits are also honest: GitHub Actions will not run a scheduled workflow more often than every five minutes and can start scheduled runs late when the platform is busy, GitHub-hosted runners give you one region, and a failed run uploads an artifact but does not page anyone. A dedicated synthetic monitoring platform exists to fix exactly those gaps (sub-minute intervals, multiple regions, built-in alerting); the tool comparison covers when the cron approach stops being enough.
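To judge whether a scheduled runner is fast enough for a given journey, it helps to put a number on detection time. The sketch below is a rough upper bound, using the five-minute cron from the workflow and the timeout and retries values from the config earlier. The function name worstCaseDetectionMs is hypothetical, and the formula deliberately ignores runner queueing and dependency installation, which add to the real figure.

```typescript
// Rough upper bound on the time from "outage begins" to "check reports
// red": one full schedule interval, plus every attempt running all the
// way to its timeout. Runner queueing and setup time are not included.
function worstCaseDetectionMs(
  intervalMs: number,
  testTimeoutMs: number,
  retries: number,
): number {
  const attempts = 1 + retries;
  return intervalMs + attempts * testTimeoutMs;
}

// Values from this article: */5 cron, timeout: 30000, retries: 1
const boundMs = worstCaseDetectionMs(5 * 60 * 1000, 30000, 1);
console.log(`${boundMs / 60000} minutes`); // 6 minutes
```

If that bound is longer than the journey can tolerate, shortening the interval helps far more than trimming the retry, since the interval dominates the sum.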

Step 5 — Alert on failure, routed by severity

A scheduled check is only useful if a failure reaches a human. At minimum, wire the workflow's failure to a notification — Slack, email, PagerDuty — and route it by how much the journey matters. A failed checkout check pages on-call; a failed secondary-page check files a business-hours ticket. Map that to your incident severity levels so the response is consistent with the rest of your reliability process.
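Routing by severity can be as small as a lookup table. The sketch below is one hypothetical way to do it. The check names, the two severity labels, and the channel names are all placeholders for your own incident levels and tooling, and delivering the message (Slack, email, PagerDuty) is left out.

```typescript
type Severity = "critical" | "minor";
type Route = { channel: "pager" | "ticket"; message: string };

// Hypothetical mapping from check name to severity. The entries are
// placeholders for your own checks and incident levels.
const severityByCheck = new Map<string, Severity>([
  ["checkout reaches confirmation", "critical"],
  ["help page loads", "minor"],
]);

function routeFailure(checkName: string, runUrl: string): Route {
  // Unmapped checks default to critical so a new monitor is never silent.
  const severity = severityByCheck.get(checkName) ?? "critical";
  return {
    channel: severity === "critical" ? "pager" : "ticket",
    message: `[${severity}] ${checkName} failed: ${runUrl}`,
  };
}
```

Defaulting unknown checks to the loudest route is deliberate: an over-eager page gets fixed quickly, while a silently dropped failure may go unnoticed.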

Step 6 — Watch the layer underneath the journey

A browser journey sits on top of API endpoints, and when checkout breaks you want to know immediately whether the failure is in the UI or in the API underneath. Monitoring those endpoints directly — with assertions on status, body, and JSON paths — turns "the whole flow is red" into "the /payment-intent endpoint is returning 500," which is most of the diagnosis done for you. It also covers the dependency case: if a synthetic checkout fails because a payment provider is degraded, seeing the vendor's status next to your failing API check shrinks your MTTR from a scramble to a glance.
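As a sketch of what assertions on status, body, and JSON paths can look like, the function below evaluates an already-fetched response, so the logic is separate from the HTTP call. All names here (evaluateApiCheck, getPath, ApiAssertion) are hypothetical, the dot-separated path syntax is a simplification rather than full JSONPath, and values are compared with strict equality, so it suits primitives only.

```typescript
type ApiAssertion = { path: string; equals: unknown };

// Resolve a dot-separated path such as "intent.status" against parsed JSON.
function getPath(value: unknown, path: string): unknown {
  let current: unknown = value;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Return human-readable failures; an empty list means the check passed.
function evaluateApiCheck(
  status: number,
  bodyText: string,
  expectedStatus: number,
  assertions: ApiAssertion[],
): string[] {
  const failures: string[] = [];
  if (status !== expectedStatus) {
    failures.push(`status ${status}, expected ${expectedStatus}`);
  }
  let body: unknown;
  try {
    body = JSON.parse(bodyText);
  } catch (e) {
    failures.push("body is not valid JSON");
    return failures; // path assertions cannot run without a parsed body
  }
  for (const { path, equals } of assertions) {
    const actual = getPath(body, path);
    if (actual !== equals) {
      failures.push(
        `${path} was ${JSON.stringify(actual)}, expected ${JSON.stringify(equals)}`,
      );
    }
  }
  return failures;
}
```

Returning every failure rather than stopping at the first one means the alert for a broken /payment-intent endpoint can say both that the status was 500 and that the body was not the expected JSON.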

What to do next

Read synthetic monitoring best practices for intervals, test-data safety, and de-flaking at scale.
Compare scheduled-runner versus dedicated platforms in the best synthetic monitoring tools in 2026.
Understand where browser checks fit against real-user data in synthetic monitoring vs RUM.

Cover the API endpoints and uptime that your Playwright journeys depend on — with multi-region checks, config-as-code, and a status page that updates from the same data — at app.devhelm.io. Your first monitor is live in about 60 seconds, no credit card.
