DEV Community

Petri Lahdelma

I Gave My AI Agent a UX Audit Superpower: CLI + MCP in 5 Minutes

Most AI coding agents can generate a full page of UI in seconds. None of them can tell you whether the result is actually usable.

Missing alt text. Broken tab order. Contrast ratios that fail WCAG. Focus traps that don't trap. These are the things that ship when accessibility is an afterthought, and AI compounds the problem because it ships faster.

I built VertaaUX to close that gap. It runs a deep UX and accessibility audit on any URL and returns scored, actionable findings across seven categories: usability, clarity, information architecture, accessibility, conversion, semantic markup, and keyboard navigation.

Today I want to show you two ways to use it: the CLI for your terminal workflow, and the MCP server for AI agents like Claude, Cursor, and Copilot.

The CLI: One Command, Full Audit

Install globally or run with npx:

npx @vertaaux/cli audit https://your-site.com

That's it. You get a scored report in your terminal with severity-ranked findings.

Here's a real run against vertaaux.ai itself:

$ vertaa audit https://vertaaux.ai --mode basic

[21:01:24] Running audit... (1/3) | 0 issues
[21:01:37] Audit Complete  score=72  issues=36  (12s)

Scores
──────────────────────────────────────────────
Overall: 72/100

Category              Score
──────────────────────────────────────────────
clarity               95
semantic              98
keyboard              82
usability             73
ia                    66
conversion            63
accessibility         62

Seven categories, scored in 12 seconds. We eat our own dogfood — and yes, we have work to do on our own accessibility score.

Audit Modes

Three levels of depth depending on your needs:

# Fast broad check
vertaa audit https://your-site.com --mode basic

# Standard analysis (default)
vertaa audit https://your-site.com --mode standard

# Deep WCAG-focused audit
vertaa audit https://your-site.com --mode deep

JSON Output for CI/CD

The --format json flag gives you structured output for piping:

vertaa audit https://your-site.com --format json | jq '.data.scores'
{
  "ia": 66,
  "clarity": 95,
  "keyboard": 82,
  "semantic": 98,
  "usability": 73,
  "conversion": 63,
  "accessibility": 62
}
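That structured output makes a quality gate a few lines of shell. Here's a minimal sketch: `audit.json` stands in for a saved `vertaa audit <url> --format json` run (its shape mirrors the scores object above), and the threshold of 80 is an arbitrary example, not a product default.

```shell
# Stand-in for saved audit output; the shape matches .data.scores above.
cat > audit.json <<'EOF'
{"data":{"scores":{"accessibility":62,"clarity":95,"keyboard":82}}}
EOF

# Fail the build if accessibility drops below our (example) threshold of 80.
score=$(jq '.data.scores.accessibility' audit.json)
if [ "$score" -lt 80 ]; then
  echo "Accessibility gate failed: score $score is below 80"
  # exit 1   # uncomment in CI to actually fail the job
fi
```

In a real pipeline you'd pipe the CLI straight into `jq` instead of going through a file.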

Drilling Into Findings

Each issue comes with severity, business impact, and a recommended fix:

vertaa audit https://your-site.com --format json | jq '.data.issues[0]'
{
  "title": "Too many navigation links",
  "severity": "warning",
  "description": "Found 17 links in navigation. Consider grouping or reducing to improve scanability.",
  "businessImpact": "Cognitive overload reduces navigation efficiency by 40%",
  "recommendedFix": "<!-- Group into dropdown menus -->...",
  "estimatedEffort": "medium"
}

Not just "this is broken" — but why it matters and how to fix it.
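Because every issue carries a `severity` field, you can slice the report however you like with `jq`. A quick sketch, using a hand-written `audit.json` with the same shape as the output above (the second issue is made-up sample data for illustration):

```shell
# Sample findings in the .data.issues shape shown above (second issue is invented).
cat > audit.json <<'EOF'
{"data":{"issues":[
  {"title":"Too many navigation links","severity":"warning"},
  {"title":"Missing alt text on hero image","severity":"critical"}
]}}
EOF

# List only the critical findings.
jq -r '.data.issues[] | select(.severity == "critical") | .title' audit.json
```

Swap `critical` for `warning` to triage the lower-severity backlog the same way.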

GitHub Actions

Drop this into .github/workflows/a11y-gate.yml:

name: Accessibility Gate

on:
  pull_request:
    branches: [main]

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: vertaaux/audit-action@v1
        with:
          url: ${{ vars.STAGING_URL }}
          api-key: ${{ secrets.VERTAAUX_API_KEY }}
          fail-on: critical

PRs with critical accessibility violations don't merge. No manual review needed for the things machines can catch.

The MCP Server: Give Your AI Agent Eyes

The CLI covers your terminal and CI. But what about the AI agents you're already using to write code?

This is where MCP (Model Context Protocol) comes in. MCP lets AI agents call external tools. The VertaaUX MCP server exposes 38 tools that turn any MCP-compatible agent into a UX auditor.

Install

npm install -g @vertaaux/mcp-server

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "vertaaux": {
      "command": "npx",
      "args": ["-y", "@vertaaux/mcp-server"],
      "env": {
        "VERTAAUX_API_KEY": "vx_live_..."
      }
    }
  }
}

Cursor / VS Code

Same config pattern — add the MCP server to your editor's MCP settings.
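For example, Cursor reads the same shape from an `mcp.json` file (project-level `.cursor/mcp.json` or global `~/.cursor/mcp.json` at the time of writing; check your editor's docs, as file locations vary by version):

```json
{
  "mcpServers": {
    "vertaaux": {
      "command": "npx",
      "args": ["-y", "@vertaaux/mcp-server"],
      "env": {
        "VERTAAUX_API_KEY": "vx_live_..."
      }
    }
  }
}
```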

What Can the Agent Do?

Once connected, your AI agent can:

  • Audit a URL — "Audit staging.myapp.com and tell me the top 5 issues"
  • Explain findings — "Why does this contrast ratio fail? Show me the fix"
  • Generate patches — "Create a PR that fixes the critical accessibility issues"
  • Compare competitors — "Audit our site and competitor.com side by side"
  • Track regressions — "Compare this audit to last week's baseline"
  • Gate deployments — "Does this pass our vertaa.policy.yml?"

The agent doesn't hallucinate these capabilities. It calls real tools with real browser-based analysis.

Real Workflow: Audit → Fix → Verify

Here's what a typical session looks like in Claude Desktop:

You: "Audit staging.myapp.com for accessibility issues"

The agent calls audit_url, waits for the browser-based audit to complete, and returns scored findings.

You: "Fix the critical issues in the signup form"

The agent calls suggest_fix for each finding, generates framework-aware patches (it detects React/Vue/Svelte from your package.json), and shows you the diffs.

You: "Open a draft PR with those fixes"

The agent calls generate_pr, which applies the patches atomically via the Git Trees API. Draft PR — human review before merge.

You: "Verify the fixes landed on staging"

The agent calls verify_fixes against the baseline audit. Fixed issues, still-broken issues, and new regressions — all categorized.

One conversation. No context switching. No separate tooling.

CLI + MCP: Better Together

The CLI and MCP server share the same engine and scoring. Use them together:

Scenario                                        Surface
──────────────────────────────────────────────
Local dev — quick check before pushing          CLI
CI/CD — automated quality gate                  CLI + GitHub Action
Code review — "is this accessible?"             MCP in your editor
Bug triage — investigate a reported issue       MCP in Claude Desktop
Competitor analysis — how do we compare?        MCP or CLI
Sprint planning — what's our UX debt?           CLI with --format json output

Getting Started

  1. Get an API key at vertaaux.ai/settings/api
  2. Run your first audit: npx @vertaaux/cli audit https://your-site.com
  3. Connect the MCP server to your AI agent of choice
  4. Set up a CI gate to catch regressions automatically

The free tier gives you enough audits to evaluate. Pro unlocks all seven scoring categories and advanced fix generation.

Ready to Take It to the Next Level? Agent Skills.

The CLI and MCP server give your agent the tools. But tools without context produce generic results. What if your agent knew which audit profile to pick, how to interpret the scores, which CI thresholds to set, and how to chain audits into fix plans — without you spelling it out every time?

That's what VertaaUX Agent Skills do. They're published in the open Agent Skills format and work with Claude Code, Cursor, Codex, GitHub Copilot, Gemini CLI, and any host that supports the format.

Install with one command:

npx skills add VertaaUX/agent-skills

Or install just the VertaaUX skill:

npx skills add VertaaUX/agent-skills --skill vertaaux

What the skill gives your agent:

  • Audit profile selection — built-in profiles like quick-ux, wcag-aa, and ci-gate with a decision tree so the agent picks the right one for the task
  • Deterministic task recipes — step-by-step sequences for accessibility investigations, competitive reviews, CI setup, and remediation workflows
  • Guardrails — prevents the agent from hallucinating CLI flags or API parameters that don't exist
  • Skill composition contracts — explicit handoff conventions so the vertaaux skill chains cleanly into a11y-review, create-analyzer, and architecture-review

The difference: without the skill, your agent runs audit_url and dumps raw findings. With the skill, it picks the right profile, runs the audit, triages by severity, generates a fix plan, and sets up a CI gate — in one conversation.

Browse the skill on Smithery: petri-lahdelma/vertaaux — or just run npx skills add VertaaUX/agent-skills and start auditing.


If you're shipping UI with AI assistance, you need something checking whether that UI actually works for everyone. Lighthouse covers performance. VertaaUX covers the rest.


Top comments (1)

Rhumb

Superpower is the right framing.

The onboarding model that seems to work best for agent tooling is capability-first, not connector-first. “One key, many superpowers” is easier for both the operator and the agent to reason about than a long list of named tools.

If one key lets the agent suddenly audit, extract, summarize, or generate useful output, you get to first value fast. Then you bring in the operator’s own systems only when the workflow actually needs them.

That is also why “38 tools” can be true without being the real product surface. The compounding value is usually one visible capability with clear structured output, not a tool graveyard the model has to rediscover every run.