Your AI Agent Can Read the DOM. It Can't See the Screen.

#playwright #ai #typescript #mcp

Here's a test that passes every time:

await expect(page.locator('.checkout-button')).toBeVisible();

And here's what the AI agent doesn't know: the checkout button sits at y: 1450px on a 375×812 mobile viewport. It exists. It's visible according to the DOM. The test is green. But the user has to scroll well past the first screen to reach it, and on some devices a sticky cookie banner covers about 60% of it.

The agent read the accessibility tree. It didn't see the screen.

The gap between the DOM and the render

When an AI agent analyzes a Playwright failure or writes a new test, it works with what Playwright exposes by default: roles, labels, text content, ARIA attributes. This is the right abstraction for functional testing.

But layout bugs don't live in the DOM. They live in the render engine's output: in coordinates, z-indexes, bounding boxes, and intersection ratios. A button can be display: block, visibility: visible, opacity: 1, and still be completely unreachable by a real user.

Current tools for this problem are either pixel-diff based (noisy, and prone to breaking on anti-aliasing differences) or proprietary enterprise platforms (Applitools, Percy). As far as I can tell, no open-source tool gives an AI agent structured geometric data from a live browser.

That's what playwright-spatial-layout-mcp does.

How it works

The MCP server launches a headless Chromium browser, navigates to a URL, and extracts geometric data using getBoundingClientRect() and getComputedStyle() in a single page.evaluate() call — one browser round-trip per element batch.
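The batched extraction described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual source: `extractBoxes`, `DomLike`, and the rule used for `is_visible` are assumptions made for the example. The DOM is passed in as a small interface so the logic can be exercised outside a browser.

```typescript
// Structural stand-ins for the browser APIs, so the sketch runs without a page.
interface Rect { x: number; y: number; width: number; height: number }
interface ElementLike { getBoundingClientRect(): Rect }
interface StyleLike { zIndex: string; display: string; visibility: string }
interface DomLike {
  querySelector(selector: string): ElementLike | null;
  getComputedStyle(el: ElementLike): StyleLike;
}

interface BoxReport {
  selector: string;
  box: Rect;
  z_index: string;
  is_visible: boolean;
}

// Hypothetical helper: reads geometry and style for every selector in one pass.
function extractBoxes(dom: DomLike, selectors: string[]): BoxReport[] {
  const reports: BoxReport[] = [];
  for (const selector of selectors) {
    const el = dom.querySelector(selector);
    if (el === null) continue; // assumption: unmatched selectors are skipped
    const r = el.getBoundingClientRect();
    const style = dom.getComputedStyle(el);
    reports.push({
      selector,
      box: { x: r.x, y: r.y, width: r.width, height: r.height },
      z_index: style.zIndex,
      // Assumption: "visible" means rendered with a non-empty box.
      is_visible:
        style.display !== "none" &&
        style.visibility !== "hidden" &&
        r.width > 0 &&
        r.height > 0,
    });
  }
  return reports;
}
```

In a real Playwright script, the same loop would run inside `page.evaluate()` with `document` and `window.getComputedStyle` in place of the `dom` argument. Playwright serializes the function passed to `page.evaluate()`, so the loop body has to be written inline there rather than imported from another module.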

Four tools:

extract_bounding_boxes — returns position, size, z-index, and viewport visibility for any set of selectors.

{
  "url": "https://your-app.com/checkout",
  "selectors": [".checkout-button", ".cookie-banner", "nav"],
  "viewport": { "width": 375, "height": 812 }
}

[
  {
    "selector": ".checkout-button",
    "box": { "x": 16, "y": 892, "width": 343, "height": 48 },
    "z_index": "auto",
    "is_visible": true,
    "is_in_viewport": false
  }
]

The button exists. It is not in the viewport. The agent now knows this.
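To make the `is_in_viewport` field concrete, here is a small sketch of how such a flag could be derived from a bounding box. The definition used here (any overlap with the viewport counts) and the `scrollNeeded` helper are assumptions for illustration; the tool may define the flag differently.

```typescript
interface Box { x: number; y: number; width: number; height: number }
interface Viewport { width: number; height: number }

// Assumption: a box is "in the viewport" if any part of it intersects the
// viewport rectangle. Coordinates are viewport-relative, as returned by
// getBoundingClientRect() at scroll position 0.
function isInViewport(box: Box, viewport: Viewport): boolean {
  const overlapsHorizontally = box.x < viewport.width && box.x + box.width > 0;
  const overlapsVertically = box.y < viewport.height && box.y + box.height > 0;
  return overlapsHorizontally && overlapsVertically;
}

// Hypothetical helper: pixels the user must scroll down before the whole box
// fits above the bottom edge of the viewport.
function scrollNeeded(box: Box, viewport: Viewport): number {
  return Math.max(0, box.y + box.height - viewport.height);
}

// The sample response above: a 48px-tall button at y = 892 on a 375x812 viewport.
const checkoutButton: Box = { x: 16, y: 892, width: 343, height: 48 };
const mobile: Viewport = { width: 375, height: 812 };

console.log(isInViewport(checkoutButton, mobile)); // false
console.log(scrollNeeded(checkoutButton, mobile)); // 128
```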

detect_visual_occlusion — computes the intersection ratio between two elements' bounding boxes.

{
  "url": "https://your-app.com/checkout",
  "target_selector": ".checkout-button",
  "overlay_selector": ".cookie-banner"
}

{
  "is_occluded": true,
  "intersection_ratio": 0.61,
  "occluded_area_px": 4128
}

61% of the button's area is under the cookie banner. The agent can now report this as a bug, not a passing test.
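The intersection ratio itself is simple rectangle arithmetic. The sketch below shows one way to compute it; the function name `computeOcclusion`, the rounding, and the example boxes are hypothetical, chosen only so the result lines up with the sample response.

```typescript
interface Box { x: number; y: number; width: number; height: number }

interface OcclusionResult {
  is_occluded: boolean;
  intersection_ratio: number; // overlap area divided by the target's area
  occluded_area_px: number;
}

// Hypothetical helper: how much of `target` lies inside `overlay`.
function computeOcclusion(target: Box, overlay: Box): OcclusionResult {
  const overlapWidth = Math.max(
    0,
    Math.min(target.x + target.width, overlay.x + overlay.width) - Math.max(target.x, overlay.x),
  );
  const overlapHeight = Math.max(
    0,
    Math.min(target.y + target.height, overlay.y + overlay.height) - Math.max(target.y, overlay.y),
  );
  const overlapArea = overlapWidth * overlapHeight;
  const targetArea = target.width * target.height;
  const ratio = targetArea > 0 ? overlapArea / targetArea : 0;
  return {
    is_occluded: overlapArea > 0,
    intersection_ratio: Math.round(ratio * 100) / 100, // assumption: two decimals
    occluded_area_px: Math.round(overlapArea),
  };
}

// Hypothetical boxes: a 141x48 button whose right 86px sit under a banner.
const button: Box = { x: 100, y: 700, width: 141, height: 48 };
const banner: Box = { x: 155, y: 680, width: 400, height: 120 };

console.log(computeOcclusion(button, banner));
// { is_occluded: true, intersection_ratio: 0.61, occluded_area_px: 4128 }
```

Note that this sketch compares bounding boxes only. It does not look at z-index or stacking order, so it cannot say which of the two elements is painted on top.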

verify_spatial_relationships — validates layout rules and returns a pass/fail with a human-readable reason per rule.

Six rule types: left_of, right_of, above, below, contains, not_overlapping.

{
  "url": "https://your-app.com",
  "rules": [
    { "type": "above", "element_a": "nav", "element_b": ".hero" },
    { "type": "not_overlapping", "element_a": ".sidebar", "element_b": ".main-content" }
  ]
}

{
  "passed": false,
  "results": [
    { "passed": true,  "reason": "'nav' bottom (64px) is above '.hero' top (64px)" },
    { "passed": false, "reason": "'.sidebar' and '.main-content' overlap by 12%" }
  ]
}

This is layout spec-as-code — the agent asserts design constraints the same way it asserts functional ones.
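As a rough sketch, each of the six rule types reduces to a comparison of box edges. The predicates below are assumptions about reasonable semantics, not the tool's confirmed definitions. In particular, touching edges are treated as satisfying `above` and `not_overlapping`, which would explain why a nav bottom of 64px passes against a hero top of 64px.

```typescript
interface Box { x: number; y: number; width: number; height: number }

type RuleType = "left_of" | "right_of" | "above" | "below" | "contains" | "not_overlapping";

// Hypothetical predicates: each answers "does rule(a, b) hold?".
const ruleChecks: Record<RuleType, (a: Box, b: Box) => boolean> = {
  left_of: (a, b) => a.x + a.width <= b.x,
  right_of: (a, b) => a.x >= b.x + b.width,
  above: (a, b) => a.y + a.height <= b.y,
  below: (a, b) => a.y >= b.y + b.height,
  contains: (a, b) =>
    a.x <= b.x &&
    a.y <= b.y &&
    a.x + a.width >= b.x + b.width &&
    a.y + a.height >= b.y + b.height,
  not_overlapping: (a, b) =>
    a.x + a.width <= b.x ||
    b.x + b.width <= a.x ||
    a.y + a.height <= b.y ||
    b.y + b.height <= a.y,
};

function checkRule(type: RuleType, a: Box, b: Box): boolean {
  return ruleChecks[type](a, b);
}

// Hypothetical desktop layout mirroring the two rules in the request above.
const nav: Box = { x: 0, y: 0, width: 1280, height: 64 };
const hero: Box = { x: 0, y: 64, width: 1280, height: 600 };
const sidebar: Box = { x: 0, y: 64, width: 300, height: 800 };
const mainContent: Box = { x: 280, y: 64, width: 1000, height: 800 };

console.log(checkRule("above", nav, hero)); // true
console.log(checkRule("not_overlapping", sidebar, mainContent)); // false
```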

compute_viewport_reflow — tracks how element geometry changes across breakpoints. All viewports are processed in parallel.

{
  "url": "https://your-app.com",
  "selectors": [".hero-cta", "nav"],
  "viewports": [
    { "width": 375, "height": 812 },
    { "width": 768, "height": 1024 },
    { "width": 1280, "height": 720 }
  ]
}

[
  {
    "selector": ".hero-cta",
    "shifted": true,
    "max_delta_x": 442,
    "max_delta_y": 318,
    "max_delta_width": 897
  }
]

Across the three viewports, the CTA moved by as much as 442px horizontally and 318px vertically. Comparing these deltas per selector tells the agent which element is most volatile across breakpoints.
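One plausible way to derive such a summary is to take, for each property, the spread between its largest and smallest value across the measured viewports. Whether the tool compares extremes or consecutive breakpoints is not stated, so treat `summarizeReflow` and the example boxes below as hypothetical.

```typescript
interface Box { x: number; y: number; width: number; height: number }

interface ReflowSummary {
  selector: string;
  shifted: boolean;
  max_delta_x: number;
  max_delta_y: number;
  max_delta_width: number;
}

// Largest minus smallest value; 0 for an empty list.
function spread(values: number[]): number {
  return values.length === 0 ? 0 : Math.max(...values) - Math.min(...values);
}

// Hypothetical helper: `boxes` holds one measurement of the same element per viewport.
function summarizeReflow(selector: string, boxes: Box[]): ReflowSummary {
  const maxDeltaX = spread(boxes.map((b) => b.x));
  const maxDeltaY = spread(boxes.map((b) => b.y));
  const maxDeltaWidth = spread(boxes.map((b) => b.width));
  return {
    selector,
    shifted: maxDeltaX > 0 || maxDeltaY > 0 || maxDeltaWidth > 0,
    max_delta_x: maxDeltaX,
    max_delta_y: maxDeltaY,
    max_delta_width: maxDeltaWidth,
  };
}

// Made-up measurements at mobile, tablet, and desktop widths.
const heroCta: Box[] = [
  { x: 16, y: 600, width: 343, height: 48 },
  { x: 84, y: 420, width: 600, height: 48 },
  { x: 440, y: 300, width: 400, height: 56 },
];

console.log(summarizeReflow(".hero-cta", heroCta));
// { selector: '.hero-cta', shifted: true, max_delta_x: 424, max_delta_y: 300, max_delta_width: 257 }
```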

What the agent can do with this

Without this MCP server, an AI agent writing or debugging Playwright tests operates blind to rendering. It can tell you the button has role="button" and aria-label="Checkout". It cannot tell you where the button is on screen.

With spatial data in context, the agent can:

Detect that a passing test covers a button the user can't actually click
Identify which elements are off-screen on mobile before a test suite runs
Verify that a CSS refactor didn't break the layout without running visual regression diffs
Catch z-index wars where one component silently slides under another after a merge

The shift is from "does this element exist in the DOM" to "can a real user reach this element on this device."

Installation

npm install -g playwright-spatial-layout-mcp
npx playwright install chromium

Add it to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "playwright-spatial-layout-mcp": {
      "command": "npx",
      "args": ["-y", "playwright-spatial-layout-mcp"]
    }
  }
}

Then ask your agent:

"Check if the cookie banner is blocking the checkout button on a 375px viewport"

"Verify that nav is above the hero section and sidebar doesn't overlap main content"

"Which elements shift the most when resizing from desktop to mobile?"

Part of a larger ecosystem

This is the fifth MCP server in a series of open-source tools for the Playwright/TypeScript testing ecosystem:

playwright-trace-decoder-mcp — root-cause analysis from Playwright trace.zip files
flakiness-knowledge-graph-mcp — knowledge graph of flaky test patterns over time
ast-impact-mapper-mcp — TypeScript AST-based test impact analysis
zod-contract-mock-forge-mcp — deterministic mock generation from Zod schemas

Each one addresses a specific blind spot — what the agent can't reason about without structured tool access. Spatial layout was the most visible one.

npm: https://www.npmjs.com/package/playwright-spatial-layout-mcp

GitHub: https://github.com/vola-trebla/playwright-spatial-layout-mcp