Roger Rajaratnam

Posted on Apr 22 • Originally published at sourcier.uk

The playwright-explore-website Copilot skill

#engineering #tooling #automation

This is the standalone write-up behind the workflow I referenced in
my JavaScript London talk companion post.

The short version is that playwright-explore-website is a small GitHub
Copilot skill that tells Copilot to use the Playwright MCP server to open a
site, explore a few important user flows, document what it found, and suggest
candidate test cases.

In the latest version, it also treats Playwright as the source of truth for
rendered UI checks. If the task involves a visible change or regression, the
agent should inspect the current UI first, make the change, verify the updated
state afterwards, and clean up temporary screenshots unless the user asked to
keep them.

I did not start from scratch. The initial version was based on the original
playwright-explore-website example in awesome-copilot. What I changed was the
setup guidance and the execution rules, so that it behaves better in real
browser work, especially regression checks and visual QA, instead of acting
like a generic prompt stub.

This post stays focused on the skill itself: where it came from, how I set it
up locally, and the guardrails I added. The talk companion post covers the
broader Playwright testing workflow around it.

[Diagram not shown here. The canonical article has the full version: https://sourcier.uk/blog/playwright-explore-website]

What the skill is for

This skill sits in the gap between "open the browser and poke around" and
"write the final Playwright test file".

It is useful when you need to:

understand an unfamiliar product area quickly
smoke test a staging or preview deployment
reproduce a bug report that is missing exact steps
verify rendered UI before and after a visible change
identify candidate locators before writing a Playwright test
turn exploratory browsing into a short list of candidate scenarios

That is why I like it. It makes exploration explicit.

The original awesome-copilot example

The original version in awesome-copilot is deliberately small.

At a high level, it tells Copilot to:

navigate to the provided URL with Playwright MCP
identify and interact with 3 to 5 core user flows
document the interactions, locators, and expected outcomes
close the browser context afterwards
summarise the findings and propose test cases

That is already a good baseline because it forces exploration before test
generation. The model has to look at the real site first instead of guessing
what the UI probably looks like.

My local setup

I keep this as a personal skill rather than a repo-specific one.

That means the file lives at:

~/.copilot/skills/playwright-explore-website/SKILL.md

I prefer that because the same workflow is useful across multiple projects.
Once the skill is there, Copilot can discover it in any repo where browser
exploration makes sense.

The frontmatter stays intentionally simple:

---
name: playwright-explore-website
description: 'Website exploration for testing using Playwright MCP'
---

The important part is not the name. It is that the description contains the
right trigger words, so Copilot can load the skill when the task is about
website exploration, Playwright, or browser-based testing.

Playwright MCP setup

One of the biggest gaps in the original example is that it assumes the
Playwright MCP tools are already available.

That is fine once your machine is configured. It is less helpful the first time
you try to use the skill and the tools are missing.

I added an explicit setup section so the skill can bootstrap the missing piece
instead of failing vaguely.

The CLI path is the shortest route:

code --add-mcp '{"name":"playwright","command":"pnpm","args":["dlx","@playwright/mcp@latest"]}'

If you prefer to wire it up in settings.json, the equivalent config is:

"mcp": {
  "servers": {
    "playwright": {
      "command": "pnpm",
      "args": ["dlx", "@playwright/mcp@latest"]
    }
  }
}

After that, reload VS Code or the Copilot extension and accept the prompt to
start the MCP server.

This matters because a good skill should not only describe the happy path. It
should also help the agent recover when a prerequisite is missing.

The enhancements I added

I kept the core purpose the same, but I tightened how the skill runs.

1. Self-bootstrapping setup instructions

The added ## MCP Server Setup section gives the agent a concrete fallback when
the Playwright tools are unavailable.

That turns a dead end into a fixable setup problem.

2. A serial exploration rule

I added a rule for multi-page and multi-breakpoint work:

If you need to compare multiple pages or breakpoints, inspect them serially or
in separate tabs. Do not queue parallel navigations and screenshots against the
same Playwright page context.

This is a practical guardrail. If several navigations and screenshots race on
one live page, a screenshot can end up showing a different page or viewport
than the one you meant to inspect, and the findings become hard to trust.
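As a rough illustration of what "serially" means here, the sketch below walks a list of URLs and breakpoints one step at a time. It is not taken from the skill. The PageLike interface is an assumption that borrows method names from Playwright's Page API (setViewportSize, goto, screenshot), and captureSerially, the Breakpoint type, and the tmp- file naming are all hypothetical.

```typescript
// Minimal shape of the page methods this sketch relies on. The method names
// mirror Playwright's Page API, but the interface itself is an assumption.
interface PageLike {
  setViewportSize(size: { width: number; height: number }): Promise<void>;
  goto(url: string): Promise<unknown>;
  screenshot(options: { path: string }): Promise<unknown>;
}

// Hypothetical description of one viewport to check.
interface Breakpoint {
  name: string;
  width: number;
  height: number;
}

// Visit every URL at every breakpoint one at a time. Each step is awaited
// before the next one starts, so no two navigations share the page at once.
async function captureSerially(
  page: PageLike,
  urls: string[],
  breakpoints: Breakpoint[],
): Promise<string[]> {
  const saved: string[] = [];
  for (let i = 0; i < urls.length; i++) {
    for (const bp of breakpoints) {
      await page.setViewportSize({ width: bp.width, height: bp.height });
      await page.goto(urls[i]);
      const path = `tmp-page${i}-${bp.name}.png`; // hypothetical naming scheme
      await page.screenshot({ path });
      saved.push(path);
    }
  }
  return saved;
}
```

The pattern the rule warns against would be wrapping those same calls in Promise.all against a single page. The loop is slower, but every screenshot then corresponds to exactly one known URL and viewport.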

3. Rendered UI review

The newer version also makes the rendered browser state the thing to trust.

Use Playwright to review the rendered UI directly. For implementation or
regression checks, inspect the current state first, then verify the updated
state after changes instead of relying on code inspection alone.

This is the rule that changed the skill most in practice. CSS, JSX, or Astro
templates are not the final UI; what the browser renders is. If the job is
about a visible change, the skill now pushes Copilot to validate the actual
rendered result before it signs off.

4. Visual audit prompts

I also added an explicit visual review prompt:

For visual audits, explicitly note supporting-label readability,
hero-to-first-section spacing, footer divider spacing, and
last-section-to-footer separation when relevant.

That came from real UI review work. I wanted the skill to be useful not just
for functional exploration, but also for browser-based design checks.

5. Screenshot cleanup

I also added a cleanup rule:

Delete any temporary screenshots you created during the session unless the
user explicitly asked to keep them.

This sounds small, but it matters. Exploration and regression review can leave
behind a pile of disposable screenshots very quickly. If the skill creates
artifacts to reason about the UI, it should also leave the workspace tidy when
those artifacts are no longer needed.
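One way to picture the cleanup rule is a small tracker that remembers which screenshots were temporary and removes only those at the end. This is a sketch under stated assumptions, not how the skill or the MCP server handles files: ScreenshotTracker and its methods are hypothetical, and the delete function is passed in so the example does not depend on any particular file API.

```typescript
// Tracks screenshots created during a session so they can be removed later.
// The class and its method names are hypothetical, not part of the skill.
class ScreenshotTracker {
  private temporary: string[] = [];

  // Record a screenshot path. Pass keep = true when the user explicitly
  // asked to keep the file, in which case it is never scheduled for removal.
  track(path: string, keep: boolean = false): void {
    if (!keep && this.temporary.indexOf(path) === -1) {
      this.temporary.push(path);
    }
  }

  // Remove every temporary screenshot using the supplied delete function
  // and return the paths that were removed.
  cleanup(remove: (path: string) => void): string[] {
    const removed: string[] = [];
    for (const path of this.temporary) {
      remove(path);
      removed.push(path);
    }
    this.temporary = [];
    return removed;
  }
}
```

In a Node script the remove argument could be something like (p) => fs.rmSync(p, { force: true }). Keeping the "keep" decision at the point where the screenshot is taken means the cleanup step never has to guess which files the user wanted.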

6. Stronger output requirements

The local version is stricter about what the exploration should produce.

It does not stop at "I clicked around and it looked fine". It asks for:

the user interactions that were performed
the relevant UI elements and likely locators
the expected outcomes for each flow
a concise summary of findings
proposed test cases based on the exploration

That output maps much more cleanly to a future Playwright test file.
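To make that mapping concrete, here is a minimal sketch of how the requested output could be modelled as data and turned into candidate test titles. The skill itself asks for prose, not this structure: the FlowFinding and ExplorationReport types, the proposeTestCases function, and the example values are all assumptions for illustration.

```typescript
// One explored user flow, mirroring the items the skill asks for.
// These type names and fields are hypothetical.
interface FlowFinding {
  name: string;
  interactions: string[]; // user interactions that were performed
  locators: string[]; // relevant UI elements and likely locators
  expectedOutcome: string; // expected outcome for this flow
}

interface ExplorationReport {
  url: string;
  flows: FlowFinding[];
  summary: string; // concise summary of findings
}

// Turn each explored flow into a candidate test title.
function proposeTestCases(report: ExplorationReport): string[] {
  return report.flows.map((flow) => `${flow.name}: ${flow.expectedOutcome}`);
}

// Example report with made-up values.
const exampleReport: ExplorationReport = {
  url: "https://example.com",
  flows: [
    {
      name: "Sign in",
      interactions: ["fill email", "fill password", "click Sign in"],
      locators: ["getByLabel('Email')", "getByRole('button', { name: 'Sign in' })"],
      expectedOutcome: "dashboard heading is visible",
    },
  ],
  summary: "Sign-in works; no console errors observed.",
};

const candidateTitles = proposeTestCases(exampleReport);
```

Each candidate title pairs a flow with its expected outcome, which is roughly the information a Playwright test name and its final assertion need.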

Where this fits

This post is intentionally about the skill itself.

If you want the wider workflow around it, including how I use it to explore a
risky journey, compare it with codegen, and turn the findings into a real test,
that is in the companion post, "Playwright E2E testing AI skills: JavaScript
London talk".

That boundary is deliberate. The skill helps you explore a real browser session
and capture candidate flows, locators, and outcomes. The Playwright test is
still the final artifact.
