DEV Community

Serhiy Pikho

The Quest for Ultra-Reliable E2E UI Tests: Making API Contracts First-Class in Page Objects

Over the last couple of years I’ve repeatedly run into the same test-stability frustrations while writing and maintaining UI tests across different frameworks: Cypress, WebdriverIO, and Playwright.

Much of the literature on reliable UI testing focuses on selector strategy, polling mechanics, and retry behaviour. These techniques are useful and often necessary, and I’ve relied on them extensively. But they all rest on the same assumption: that the DOM is the primary boundary of the system under test.

Increasingly, I’ve come to the conclusion that this assumption is incomplete.


Core Idea

A UI is not a self-contained system. It is a rendering layer over a set of network contracts.

If the API calls required by a page do not complete successfully, the UI cannot function correctly. The DOM may render and elements may exist, but the underlying state is already invalid.

The practical implication is that readiness should be defined in terms of contract completion rather than DOM stability.

Instead of waiting for locators to stabilise, readiness can be defined by the successful resolution of specific, declared network dependencies.


Effects of This Change

Moving readiness to the network layer changes several properties of a test suite.

  1. Polling logic becomes simpler. Instead of repeatedly checking element state, we track deterministic request/response pairs.
  2. Diagnostics improve. Failures are attributed to specific contracts. Instead of a late-stage “locator not found”, the failure identifies the exact request that did not occur or returned non-OK (examples below).
  3. Run time improves. Tests terminate earlier. If a required API fails, there is little value in continuing execution.
  4. Parallel execution becomes more predictable. When tests are aware of backend dependencies, they partially account for system load rather than interpreting load-related failures as UI instability.

This does not eliminate flakiness entirely. It reduces ambiguity and moves failures closer to their cause.


Prior Art and Limits of Existing Signals

Network awareness is not new. Most modern frameworks expose mechanisms to wait for network conditions.

For example, Playwright allows:

await page.goto(url, { waitUntil: "networkidle" })

In practice, broad signals such as “network idle” are imprecise in modern applications. Background traffic, analytics calls, and third-party injections can keep the network active indefinitely. Waiting for the absence of traffic is not equivalent to waiting for the contracts that actually matter.

Direct use of waitForResponse inside individual tests can address specific cases, but it distributes readiness logic across the suite. Each test author decides what to wait for and how strictly to enforce it. The approach described here centralises that concern in the Page Object layer, making readiness a structural property rather than a local decision.

Similarly, network mocking tools solve a different problem: they make inputs deterministic, but they do not enforce that contracts complete correctly in an integrated environment.

The distinction is subtle: this approach targets explicitly declared dependencies rather than relying on global heuristics or per-test decisions.


Implementation Example

Large-scale UI test suites commonly model each page as a Page Object.

A typical Page Object defines:

  • Locators
  • Interaction helpers
  • Page-specific assertions

If we treat a Page Object as a model of a page, then the model should include more than visible elements. It should also describe the network contracts required for that page to function.

One way to structure this is for each page to declare its required and optional API dependencies. A shared base layer observes network traffic, matches it against those declarations, and enforces readiness rules.

Readiness becomes an architectural concern rather than something encoded repeatedly inside tests.


Base Page Layer

All Page Objects inherit from a shared base class. The example below is simplified but representative.

import { type Page } from "@playwright/test";
import { type DataDependency, DataDependencyTracker } from "./dataDependencyTracker";

export abstract class BasePage {
  protected readonly tracker: DataDependencyTracker;

  constructor(protected readonly page: Page) {
    this.tracker = DataDependencyTracker.getInstance(page);
  }

  protected apiContracts = {
    profile: { method: "GET", url: /\/api\/profile(?:\?|$)/ },
    basket: { method: "GET", url: /\/api\/basket(?:\?|$)/ },
    quote: { method: "POST", url: /\/api\/quote(?:\?|$)/ },
    confirm: { method: "POST", url: /\/api\/confirm(?:\?|$)/ }
  } satisfies Record<string, DataDependency>;

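  // Runs `action` while waiting for a set of declared contracts to complete.
  // `override` lets a transition wait on the next page's dependencies
  // instead of this page's own.
  protected async transition(
    action: () => Promise<unknown>,
    override?: DataDependency[]
  ) {
    const dependencies = override ?? this.dataDependencies;

    // waitForAll() is invoked first, so the request waits are set up
    // before action() dispatches the click or navigation.
    await Promise.all([
      this.tracker.waitForAll(dependencies),
      action()
    ]);
  }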

  abstract get dataDependencies(): DataDependency[];
}

Several structural decisions are important:

  • A tracker instance scoped to the Playwright Page, and therefore isolated per page and per worker.
  • A central registry of API contracts.
  • A transition boundary that gates navigation.
  • A dataDependencies getter, allowing dependencies to be declared statically or computed conditionally.
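To illustrate the last point, here is a minimal sketch of a getter that computes its dependencies conditionally. The `BasketPage` class and its `signedIn` flag are hypothetical and exist only for this example; the contract patterns are the ones from the registry above.

```typescript
type DataDependency = { method: string; url: RegExp };

const apiContracts = {
  profile: { method: "GET", url: /\/api\/profile(?:\?|$)/ },
  basket: { method: "GET", url: /\/api\/basket(?:\?|$)/ },
};

// Hypothetical page whose required contracts depend on state the test already knows.
class BasketPage {
  private readonly signedIn: boolean;

  constructor(signedIn: boolean) {
    this.signedIn = signedIn;
  }

  get dataDependencies(): DataDependency[] {
    // Assumption for this sketch: the basket is always fetched,
    // the profile only for signed-in users.
    return this.signedIn
      ? [apiContracts.basket, apiContracts.profile]
      : [apiContracts.basket];
  }
}

const guestDeps = new BasketPage(false).dataDependencies;
const memberDeps = new BasketPage(true).dataDependencies;
```

Because the getter is evaluated at transition time, the same Page Object can gate on different contracts in different scenarios without the test having to say so.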

The ordering inside Promise.all is intentional. Array elements are evaluated in order, so waitForAll() begins waiting for requests before action() is even invoked. If tracking were started after the click, fast requests could be missed. The boundary must observe the transition, not react to it.
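The ordering argument can be checked without a browser. The sketch below uses a tiny `FakeNetwork` class as a stand-in for request events; it is not Playwright's API, and the names are invented for the illustration. A wait created before the action sees the request, while a wait created afterwards does not.

```typescript
type Listener = (url: string) => void;

// Minimal stand-in for a page's request events (not Playwright's API).
class FakeNetwork {
  private listeners: Listener[] = [];

  // Resolves with the next emitted URL; anything emitted before this call is lost.
  waitForRequest(): Promise<string> {
    return new Promise<string>(resolve => {
      this.listeners.push(resolve);
    });
  }

  emit(url: string): void {
    const current = this.listeners;
    this.listeners = [];
    current.forEach(listener => listener(url));
  }
}

// Races a wait against a timer so that a missed event surfaces as "missed".
function withTimeout(wait: Promise<string>, ms: number): Promise<string> {
  return Promise.race([
    wait,
    new Promise<string>(resolve => setTimeout(() => resolve("missed"), ms)),
  ]);
}

async function demo(): Promise<{ armedFirst: string; armedAfter: string }> {
  const net = new FakeNetwork();
  // The "click" fires its request immediately, i.e. the fastest possible case.
  const click = async () => net.emit("/api/basket");

  // Same shape as transition(): the wait exists before the action is called.
  const [armedFirst] = await Promise.all([
    withTimeout(net.waitForRequest(), 50),
    click(),
  ]);

  // Reversed order: the request has already fired when we start listening.
  await click();
  const armedAfter = await withTimeout(net.waitForRequest(), 50);

  return { armedFirst, armedAfter };
}

const result = demo(); // resolves to { armedFirst: "/api/basket", armedAfter: "missed" }
```

This is the same reason Playwright's own documentation creates the waitForResponse promise before triggering the click and awaits it afterwards.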


Data Dependency Tracker

The tracker is a thin layer over Playwright’s network primitives.

import { type Page, type Request, type Response } from "@playwright/test";

export type DataDependency = {
  url: RegExp;
  method: string;
};

export type TimeoutOpts = {
  requestMs: number;
  responseMs: number;
  requireRequest: boolean;
  requireOk: boolean;
};

export class DataDependencyTracker {
  private static trackerRegistry = new WeakMap<Page, DataDependencyTracker>();

  private readonly cached = new Set<string>();
  private readonly notApplicable = new Set<string>();

  private readonly reducedRequestMs = 1_000;

  private readonly defaults: TimeoutOpts = {
    requestMs: 30_000,
    responseMs: 60_000,
    requireRequest: true,
    requireOk: true
  };

  private constructor(private readonly page: Page) {}

  static getInstance(page: Page): DataDependencyTracker {
    let tracker = this.trackerRegistry.get(page);

    if (!tracker) {
      tracker = new DataDependencyTracker(page);
      this.trackerRegistry.set(page, tracker);
    }

    return tracker;
  }

  async waitFor(dep: DataDependency, opts: Partial<TimeoutOpts> = {}) {
    const method = dep.method.toUpperCase();
    const key = `${method} ${dep.url.toString()}`;

    const { requestMs, responseMs, requireRequest, requireOk } = {
      ...this.defaults,
      ...opts
    };

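    // A dependency already seen on this page (or already found absent) gets a
    // much shorter request window, so repeated waits stay cheap.
    const effectiveRequestMs =
      this.cached.has(key) || this.notApplicable.has(key)
        ? this.reducedRequestMs
        : requestMs;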

    let request: Request | undefined;

    try {
      request = await this.page.waitForRequest(
        r =>
          dep.url.test(r.url()) &&
          r.method().toUpperCase() === method,
        { timeout: effectiveRequestMs }
      );
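    } catch {
      // No matching request within the window: remember that for next time,
      // then either fail or treat the dependency as optional.
      this.notApplicable.add(key);

      if (requireRequest) {
        throw new Error(`Expected request did not occur: ${method} ${dep.url}`);
      }

      return;
    }

    // Wait for the response to this exact request, not just any response
    // whose URL happens to match.
    const response = await this.page.waitForResponse(
      r => r.request() === request,
      { timeout: responseMs }
    );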

    this.notApplicable.delete(key);
    this.cached.add(key);

    if (!response.ok() && requireOk) {
      throw new Error(
        `Non-ok response: ${method} ${dep.url} (${response.status()})`
      );
    }
  }

  async waitForAll(deps: DataDependency[], opts: Partial<TimeoutOpts> = {}) {
    await Promise.all(deps.map(d => this.waitFor(d, opts)));
  }
}

Example: A Full Contract-Gated Flow

Consider a minimal checkout flow:

  1. The landing page loads profile data.
  2. The order page loads basket data.
  3. The review page prepares totals.
  4. Submitting the order triggers confirmation.
  5. The success page depends on that confirmation succeeding.

Each page declares its required contracts by referencing the central apiContracts registry. Transitions are gated through transition().
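It is worth being clear about what a registry pattern will and will not match. The sketch below lifts the tracker's request predicate into a hypothetical `matches` helper so it can be exercised against plain strings; the `shop.example` URLs are made up.

```typescript
type DataDependency = { method: string; url: RegExp };

const profile: DataDependency = { method: "GET", url: /\/api\/profile(?:\?|$)/ };

// Hypothetical helper: the same check the tracker passes to waitForRequest,
// applied to a method and URL given as strings.
function matches(dep: DataDependency, method: string, url: string): boolean {
  return dep.url.test(url) && method.toUpperCase() === dep.method.toUpperCase();
}

matches(profile, "GET", "https://shop.example/api/profile");         // true
matches(profile, "get", "https://shop.example/api/profile?lang=en"); // true
matches(profile, "GET", "https://shop.example/api/profile/avatar");  // false
matches(profile, "GET", "https://shop.example/api/profiles");        // false
matches(profile, "POST", "https://shop.example/api/profile");        // false
```

The `(?:\?|$)` suffix is what keeps neighbouring endpoints from satisfying the contract. One caveat: the patterns should not carry the `g` flag, because `RegExp.prototype.test` then becomes stateful between calls.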


Page Objects

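// Assumed file layout: BasePage is taken to live in ./basePage.
import { BasePage } from "./basePage";
import { type DataDependency } from "./dataDependencyTracker";

export class LandingPage extends BasePage {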
  get dataDependencies(): DataDependency[] {
    return [this.apiContracts.profile];
  }

  async goToCompleteOrderPage(): Promise<CompleteOrderPage> {
    const next = new CompleteOrderPage(this.page);

    await this.transition(
      () => this.page.getByRole("link", { name: "Checkout" }).click(),
      next.dataDependencies
    );

    return next;
  }
}

export class CompleteOrderPage extends BasePage {
  get dataDependencies(): DataDependency[] {
    return [this.apiContracts.basket];
  }

  async goToReviewPage(): Promise<ReviewPage> {
    const next = new ReviewPage(this.page);

    await this.transition(
      () => this.page.getByRole("button", { name: "Continue" }).click(),
      next.dataDependencies
    );

    return next;
  }
}

The remaining pages (ReviewPage, OrderSuccessPage, etc.) follow the same pattern.
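As a rough sketch of that pattern, here is how the submit step might look. To keep the example runnable without a browser, `PageLike` and `TrackerLike` are structural stand-ins for Playwright's Page and the DataDependencyTracker, and the base class is trimmed to what the example needs. The "Place order" button name and the choice of `quote` as the review page's contract are assumptions.

```typescript
type DataDependency = { method: string; url: RegExp };

// Structural stand-ins so the sketch runs without Playwright.
interface PageLike {
  getByRole(role: string, opts: { name: string }): { click(): Promise<void> };
}
interface TrackerLike {
  waitForAll(deps: DataDependency[]): Promise<void>;
}

abstract class BasePage {
  protected readonly page: PageLike;
  protected readonly tracker: TrackerLike;

  protected readonly apiContracts = {
    quote: { method: "POST", url: /\/api\/quote(?:\?|$)/ },
    confirm: { method: "POST", url: /\/api\/confirm(?:\?|$)/ },
  };

  constructor(page: PageLike, tracker: TrackerLike) {
    this.page = page;
    this.tracker = tracker;
  }

  protected async transition(action: () => Promise<unknown>, override?: DataDependency[]) {
    const dependencies = override ?? this.dataDependencies;
    await Promise.all([this.tracker.waitForAll(dependencies), action()]);
  }

  abstract get dataDependencies(): DataDependency[];
}

class OrderSuccessPage extends BasePage {
  get dataDependencies(): DataDependency[] {
    return [this.apiContracts.confirm];
  }
}

class ReviewPage extends BasePage {
  get dataDependencies(): DataDependency[] {
    return [this.apiContracts.quote]; // assumption: totals come from the quote call
  }

  async submitOrder(): Promise<OrderSuccessPage> {
    const next = new OrderSuccessPage(this.page, this.tracker);

    // Gate on the success page's contract, i.e. the confirm call.
    await this.transition(
      () => this.page.getByRole("button", { name: "Place order" }).click(),
      next.dataDependencies
    );

    return next;
  }
}
```

The point is that submitOrder() only returns an OrderSuccessPage once the confirm contract has been waited on, so the test never has to express that wait itself.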


The Test

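import { test } from "@playwright/test";
// LandingPage is imported from wherever the Page Objects live in the suite.

test("user completes checkout", async ({ page }) => {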
  const landingPage = new LandingPage(page);
  const completeOrderPage = await landingPage.goToCompleteOrderPage();
  const reviewPage = await completeOrderPage.goToReviewPage();
  await reviewPage.verifyTotals();
  const successPage = await reviewPage.submitOrder();
  await successPage.expectSuccess();
});

No explicit waits are required here.

No waitForResponse().
No expectLoaded().
No arbitrary timeouts.

If a method returns a Page Object, its declared contracts have completed. Readiness is enforced at the architectural boundary, not inside the test.


Failure Modes

If the submission request never occurs:

Expected request did not occur: POST /\/api\/confirm(?:\?|$)/

If the response is non-OK:

Non-ok response: POST /\/api\/confirm(?:\?|$)/ (500)

Failures surface at the transition boundary, not at a later DOM assertion.

Under parallel load, slow paths remain localised to their declared dependencies rather than inflating global timeouts.
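One way to read this is that timeouts are resolved per dependency. The sketch below isolates the tracker's option handling as a pure function: `resolveOpts` is a hypothetical name, the keys are simplified strings, and `seen` stands for the union of the tracker's `cached` and `notApplicable` sets. A slow endpoint can be given a longer response window without touching anything else.

```typescript
type TimeoutOpts = {
  requestMs: number;
  responseMs: number;
  requireRequest: boolean;
  requireOk: boolean;
};

// Same values as the tracker's defaults.
const defaults: TimeoutOpts = {
  requestMs: 30_000,
  responseMs: 60_000,
  requireRequest: true,
  requireOk: true,
};
const reducedRequestMs = 1_000;

// Hypothetical helper mirroring the tracker's logic: merge per-call overrides
// over the defaults, then shorten the request window for keys seen before.
function resolveOpts(
  key: string,
  seen: Set<string>,
  overrides: Partial<TimeoutOpts> = {}
): TimeoutOpts {
  const merged = { ...defaults, ...overrides };
  return seen.has(key) ? { ...merged, requestMs: reducedRequestMs } : merged;
}

const seen = new Set<string>(["GET /api/profile"]);

// A known-slow call gets a longer response window; nothing else changes.
const slowQuote = resolveOpts("POST /api/quote", seen, { responseMs: 120_000 });
// A dependency seen earlier on this page only gets the short request window.
const repeatProfile = resolveOpts("GET /api/profile", seen);
```

Here `slowQuote` keeps the 30-second request window but waits up to two minutes for the response, while `repeatProfile` waits only one second for the request to appear.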


Limitations

This pattern is not a replacement for DOM-level checks. A page can satisfy its network contracts and still render incorrectly.

Contract-driven readiness complements DOM assertions; it does not eliminate them.

It also introduces coupling between Page Objects and backend contracts. When APIs change, test models must change with them. In larger systems this can be desirable, because contract drift becomes visible, but in smaller or rapidly evolving projects it may feel heavy.

Another practical difficulty is discovery. Modern UI applications often abstract API calls behind state management layers such as Zustand, MobX, or Redux, or behind custom data clients. From the perspective of the test author, the network dependencies of a page are not always obvious.

Declaring API contracts manually requires either familiarity with the frontend implementation or careful inspection of network traffic. This adds friction, particularly in large or fast-moving codebases.
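The inspection step can be partly mechanised. The sketch below is not the tooling mentioned in this article, just an illustration of the idea: given a list of observed requests (which in Playwright could be collected with `page.on("request", ...)` using `request.method()` and `request.url()`), a hypothetical `candidateContracts` helper reduces them to unique method and path pairs under an assumed `/api/` prefix.

```typescript
type Observed = { method: string; url: string };

// Hypothetical helper: collapse observed traffic into unique "METHOD /path"
// candidates, keeping only paths under apiPrefix and dropping query strings.
function candidateContracts(observed: Observed[], apiPrefix = "/api/"): string[] {
  const seen = new Set<string>();

  for (const { method, url } of observed) {
    const { pathname } = new URL(url);
    if (pathname.startsWith(apiPrefix)) {
      seen.add(`${method.toUpperCase()} ${pathname}`);
    }
  }

  return Array.from(seen).sort();
}

// Made-up traffic for one page load.
const candidates = candidateContracts([
  { method: "GET", url: "https://shop.example/api/profile?lang=en" },
  { method: "GET", url: "https://shop.example/api/basket" },
  { method: "POST", url: "https://analytics.example/collect" },
  { method: "get", url: "https://shop.example/api/profile" },
  { method: "POST", url: "https://shop.example/api/quote" },
]);
// → ["GET /api/basket", "GET /api/profile", "POST /api/quote"]
```

The output is only a list of candidates. Deciding which of them are required, which are optional, and which are noise still takes judgement.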

In practice, this led to additional tooling to assist with dependency discovery rather than relying solely on manual declaration. That is likely a separate discussion.
