Junior Diaz

Posted on Jun 19 • Originally published at blog.juniordiazbriceno.dev

Automating Web Accessibility Testing: Combining axe-core and Playwright

#a11y #playwright #testing #typescript

Introduction

In the previous article of this series, we covered how to detect unintended visual changes with visual regression testing in Playwright. This time we're tackling a different kind of bug that rarely shows up in a traditional functional test: accessibility violations.

Web accessibility isn't just a legal requirement — it's a fundamental principle of inclusive design. As an application grows, manually testing for accessibility compliance becomes impractical and error-prone. In this article we build an automated testing framework using Playwright and axe-core, and run it against AutoCatalog, my car catalog demo built specifically for this series (Next.js + TypeScript).

By the end of this article, you'll have a helper you can drop into your own project and start using right away.

Why Automate Accessibility Testing?

Adding automated accessibility testing to your workflow brings several concrete advantages:

Early detection: identify and fix issues before they reach production, significantly reducing remediation costs.
Consistent standards: automated tests apply the same criteria across every page and feature.
Regression prevention: a UI change that breaks accessibility gets caught before the merge, not after.
Compliance documentation: generated reports serve as evidence of WCAG 2.0/2.1 Level A and AA compliance efforts.
Developer empowerment: devs get immediate feedback on accessibility issues without needing to be accessibility experts.

The Tech Stack

Playwright

Playwright is a modern end-to-end testing framework with multi-browser automation capabilities. For accessibility testing it brings:

Reliable execution across Chromium, Firefox, and WebKit.
Built-in auto-waiting and web-first assertions that help you reach a stable page state before analysis.
Natural integration with the rest of the suite (same fixtures, same page objects).
Real browser engine interaction, making tests reflect the DOM a user actually sees.

axe-core

Developed by Deque Systems, axe-core is the industry-standard engine for accessibility testing. It detects WCAG 2.0, WCAG 2.1, and other accessibility standard violations with very few false positives. Among what it identifies:

Missing alternative text on images
Insufficient color contrast
Incorrect heading hierarchies
Keyboard navigation issues
Missing form labels
Incorrect or missing ARIA attributes

@axe-core/playwright

This package connects Playwright and axe-core with a fluent API designed specifically for Playwright's Page object. It handles injecting axe-core into the browser context and extracting results, so you can focus on the test logic rather than the integration mechanics.

Architecture Overview

The accessibility testing framework relies on three layers:

Utility layer: reusable functions for analyzing pages and generating reports.
Test layer: concrete test cases for each page, modal, or state.
Report layer: transforms raw axe-core results into readable HTML reports, with detail on each violation.

Implementation: Building the Utility Layer

Core Analysis Function

analyzeAccessibility is the main interface for running the analysis:

import { Page } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';
import type { AxeResults } from 'axe-core';

export interface A11yAnalyzeOptions {
    tags?: string[];
    rules?: string[];
    disableRules?: string[];
    include?: string | string[];
    exclude?: string | string[];
}

export async function analyzeAccessibility(
    page: Page,
    options: A11yAnalyzeOptions = {}
): Promise<AxeResults> {
    const {
        tags = ['wcag2a', 'wcag2aa'],
        rules = [],
        disableRules = [],
        include = [],
        exclude = []
    } = options;

    const builder = new AxeBuilder({ page }).withTags(tags);

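    // Note: withRules() and withTags() both set axe's runOnly filter, so when
    // rules are passed, only those rules run and the tag filter no longer applies.
    if (rules.length > 0) {
        builder.withRules(rules);
    }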

    if (disableRules.length > 0) {
        builder.disableRules(disableRules);
    }

    // Include specific elements/regions for testing (accepts a string or an array)
    const includeSelectors = Array.isArray(include) ? include : [include];
    includeSelectors.filter(Boolean).forEach(selector => {
        builder.include(selector);
    });

    // Exclude specific elements/regions from testing (accepts a string or an array)
    const excludeSelectors = Array.isArray(exclude) ? exclude : [exclude];
    excludeSelectors.filter(Boolean).forEach(selector => {
        builder.exclude(selector);
    });

    return await builder.analyze();
}
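The string-or-array handling for include and exclude could also live in a small standalone function. Here's a minimal sketch of that idea; toSelectorList is a hypothetical name that isn't part of the AutoCatalog helper, and trimming whitespace is my own assumption about what you'd want:

```typescript
// Hypothetical helper (not part of the original suite): normalizes a selector
// option that may be undefined, a single string, or an array of strings.
function toSelectorList(value?: string | string[]): string[] {
    if (value === undefined) {
        return [];
    }
    const list = Array.isArray(value) ? value : [value];
    // Trim each selector and drop empty ones so they never reach the builder.
    return list
        .map(selector => selector.trim())
        .filter(selector => selector.length > 0);
}

// A single selector and a list both come back as a clean array.
const dialogOnly = toSelectorList("[role='dialog']");
const cleaned = toSelectorList([' #sidebar ', '', '.banner']);
```

With something like this, the analysis function would only need to loop over the returned array and call builder.include or builder.exclude for each entry.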

Key design principles:

WCAG compliance by default: the default tags are ['wcag2a', 'wcag2aa'] (WCAG 2.0 Level A and AA), so without configuring anything you're already covering the baseline.
Flexible scoping: include and exclude let you test specific components in isolation.
Builder pattern: delegates to AxeBuilder for more complex analysis scenarios without reinventing the wheel.

💡 On the tags default: ['wcag2a', 'wcag2aa'] isn't an example value — it's the project's compliance target, defined once here in the helper. No need to repeat it in every test; you'd only override it when a specific test needs to deviate from that baseline (for example, to validate a particular rule with rules).
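One detail worth keeping in mind: in axe-core, the WCAG 2.1 rules carry their own tags (wcag21a, wcag21aa) and are not implied by the 2.0 tags. If a test needs to extend the baseline instead of replacing it, a small merge function is one way to do it. This is only a sketch; withExtraTags and the two constants are hypothetical names, not part of the helper shown above:

```typescript
// Project baseline, matching the default used by analyzeAccessibility.
const BASELINE_TAGS: string[] = ['wcag2a', 'wcag2aa'];

// axe-core tags for WCAG 2.1 Level A and AA rules.
const WCAG_21_TAGS: string[] = ['wcag21a', 'wcag21aa'];

// Hypothetical helper: appends extra tags to the baseline, skipping duplicates.
function withExtraTags(extra: string[], baseline: string[] = BASELINE_TAGS): string[] {
    return [...baseline, ...extra].filter(
        (tag, index, all) => all.indexOf(tag) === index
    );
}

// Baseline plus WCAG 2.1, ready to pass as the `tags` option.
const wcag21Tags = withExtraTags(WCAG_21_TAGS);
```

The result can then be passed as tags in the options of analyzeAccessibility or analyzeAndReport for the tests that need the wider scope.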

Report Generation

generateA11yReport transforms raw axe-core results into an HTML report:

import { createHtmlReport } from 'axe-html-reporter';
import * as fs from 'node:fs';
import * as path from 'node:path';

export interface A11yReportOptions {
    projectKey?: string;
    outputDir?: string;
    reportFileName?: string;
}

export function generateA11yReport(
    results: AxeResults,
    options: A11yReportOptions = {}
): void {
    const {
        projectKey = 'playwright-a11y',
        outputDir = path.join('test-results', 'a11y-reports'),
        reportFileName = 'a11y-report.html'
    } = options;

    // Ensure the output directory exists
    fs.mkdirSync(outputDir, { recursive: true });

    createHtmlReport({
        results,
        options: {
            projectKey,
            outputDir,
            reportFileName,
        }
    });
}

Each report includes the violation description, affected elements, suggested fixes, and links to the relevant guideline — very handy for sharing the report directly with a dev without having to translate axe-core's raw output.

Convenience Wrapper

analyzeAndReport combines analysis and reporting in a single call:

export interface A11yOptions extends A11yAnalyzeOptions, A11yReportOptions { }

export async function analyzeAndReport(
    page: Page,
    options: A11yOptions = {}
): Promise<AxeResults> {
    const { tags, rules, disableRules, include, exclude, projectKey, outputDir, reportFileName } = options;

    const results = await analyzeAccessibility(page, { tags, rules, disableRules, include, exclude });
    generateA11yReport(results, { projectKey, outputDir, reportFileName });

    return results;
}

Test Implementation Patterns

Full-Page Tests

A full-page accessibility test ensures the entire view meets accessibility standards:

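import { test, expect } from '@playwright/test';
// Import path follows the suite structure shown later in this article.
import { analyzeAndReport } from '../utils/AccessibilityHelper';

// Assumes a beforeEach hook (not shown) has already navigated to the Product Management page.
test('Product Management - Default page - Accessibility validation', async ({ page }) => {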
    await test.step('Validate accessibility', async () => {
        const results = await analyzeAndReport(page, {
            projectKey: 'autocatalog-a11y',
            reportFileName: 'manage-default.html',
        })
        expect(results.violations).toEqual([])
    })
})

If the page has accessibility violations, this test will fail — and the HTML report generated at manage-default.html details each one with its selector, description, impact, and suggested fix.
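When the assertion fails, the diff Playwright prints for the full violations array can get long. One option is to assert on a compact summary instead, so the terminal output stays readable while the HTML report keeps the detail. A minimal sketch, assuming only the id, impact, help, and nodes fields of each axe-core violation; summarizeViolations and the local interfaces are hypothetical and just mirror the fields used here:

```typescript
// Minimal local shapes mirroring the axe-core violation fields used below.
interface ViolationNode {
    target: unknown[];
}

interface Violation {
    id: string;
    impact?: string | null;
    help: string;
    nodes: ViolationNode[];
}

// Hypothetical helper: one short line per violation.
function summarizeViolations(violations: Violation[]): string[] {
    return violations.map(
        v => `${v.id} [${v.impact ?? 'unknown'}]: ${v.help} (${v.nodes.length} node(s))`
    );
}

// Example with hand-written data shaped like an axe-core violation.
const sampleSummary = summarizeViolations([
    {
        id: 'label',
        impact: 'critical',
        help: 'Form elements must have labels',
        nodes: [{ target: ['#price'] }, { target: ['#brand'] }],
    },
]);
```

In a test, this would replace the raw comparison with something like expect(summarizeViolations(results.violations)).toEqual([]).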

Modal Dialog Tests (with `include`)

Modals are a good candidate for automated accessibility testing because they concentrate several criteria axe-core can verify statically: dialog ARIA attributes (role, aria-label, aria-modal), labels associated with each form field, and color contrast. But there's a problem with testing a modal through a full-page scan: it picks up everything else that's broken on the page too. Scoping the scan with include keeps the result limited to the dialog itself.

// managePage is the ManagePage page object, created in the test setup (not shown).
test('Product Management - Add Product modal open - Accessibility validation', async ({ page }) => {
    await test.step('Open Add Product modal', async () => {
        await managePage.openAddProductModal()
        await managePage.expectAddProductModalVisible()
    })

    await test.step('Validate accessibility', async () => {
        const results = await analyzeAndReport(page, {
            include: "[role='dialog']",
            projectKey: 'autocatalog-a11y',
            reportFileName: 'manage-add-product-modal.html',
        })
        expect(results.violations).toEqual([])
    })
})

📸 3 violations detected — full-page scan

📸 0 violations — scoped to [role='dialog']

Best Practices and Patterns

Test Organization

Organizing accessibility tests to reflect the application's structure helps a lot when reading reports and running subsets:

Use describe blocks to group by page/feature.
Name each test so it's clear which state or component is being validated.
Keep separate test files per feature.
Use tags like @A11y and feature-specific tags (@ProductManagement) for selective execution.

Example structure, based on AutoCatalog's real suite:

test.describe('@ProductManagement @A11y - Product Management accessibility tests', () => {
    test('Product Management - Default page - Accessibility validation', /* ... */)
    test('Product Management - Add Product modal open - Accessibility validation', /* ... */)
})

The same pattern repeats for HomePage (with tag @Home @A11y) and CartPage (@Cart @A11y), each with their own relevant states.

Selective Test Execution

The tag system lets you run specific subsets depending on what you need to validate:

# Run all accessibility tests
npx playwright test --grep "@A11y"

# Run only Product Management tests
npx playwright test --grep "@ProductManagement"

# Run only Home tests
npx playwright test --grep "@Home"

This gives you the flexibility to run everything on each PR, or split by feature for parallelization.
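The --grep value is a regular expression, so you can also require several tags at once with lookaheads, a pattern that Playwright's own docs show for combining tags. Here's a small sketch that builds such a pattern; grepAllTags is a hypothetical helper, and it assumes the tags contain no regex metacharacters:

```typescript
// Hypothetical helper: builds a regex source that matches only titles
// containing ALL of the given tags, in any order.
function grepAllTags(tags: string[]): string {
    return tags.map(tag => `(?=.*${tag})`).join('');
}

const a11yAndManage = grepAllTags(['@A11y', '@ProductManagement']);

// Quick check against titles shaped like the describe blocks above.
const matcher = new RegExp(a11yAndManage);
const matchesManage = matcher.test(
    '@ProductManagement @A11y - Product Management accessibility tests'
);
const matchesHome = matcher.test('@Home @A11y - Home accessibility tests');
```

The resulting string is what you'd pass on the command line, for example npx playwright test --grep "(?=.*@A11y)(?=.*@ProductManagement)".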

Understanding the Limits of Automated Testing

Automated accessibility tests catch a lot of issues, but not all of them. By commonly cited estimates, automated tools identify roughly 30% to 50% of accessibility problems; the exact share depends on the application and on how issues are counted.

Issues That Require Manual Testing

Content quality: whether the alt accurately describes the image and its context.
Focus order: whether the tab order makes logical sense.
Cognitive accessibility: whether the content and navigation are comprehensible.
Meaningful sequence: whether content retains meaning when read linearly.
Complex interactions: whether multi-step flows work well with assistive technology.
Real screen reader experience: how content is announced and navigated.
Keyboard shortcuts: whether custom shortcuts conflict with those of assistive technology.

A Comprehensive Strategy

Automated testing (30-50% coverage): catches common violations early.
Manual testing with keyboard and screen reader (adds 20-30%).
Research with real assistive technology users (remaining issues + UX insights).
Ongoing developer education.

💡 Need to complement with visual testing? Some WCAG criteria, like 2.4.7 (Focus Visible), require visual verification that axe-core can't fully automate. Article 3 of this series covers how to validate focus indicators with visual regression in Playwright.

Tracking Progress

Some metrics worth keeping an eye on as the suite grows:

Violation trends: count and severity over time.
Test coverage: what percentage of the app is covered by accessibility tests.
Remediation time: how long it takes for a detected violation to get fixed.
Prevention rate: violations caught in PR vs. those that reach production.
Manual findings: issues automation missed (they tell you where the real gaps are).
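For the violation-trend metric, the raw material is already in the results object: each violation has an impact and a list of affected nodes. A minimal sketch of how a per-run count could be derived; countNodesByImpact and the local types are hypothetical, and where you store the numbers over time is up to you:

```typescript
type Impact = 'minor' | 'moderate' | 'serious' | 'critical';

// Minimal local shape mirroring the axe-core violation fields used below.
interface ViolationLike {
    id: string;
    impact?: Impact | null;
    nodes: unknown[];
}

// Hypothetical helper: counts affected nodes per impact level for one run.
function countNodesByImpact(
    violations: ViolationLike[]
): Record<Impact | 'unknown', number> {
    const counts = { minor: 0, moderate: 0, serious: 0, critical: 0, unknown: 0 };
    for (const violation of violations) {
        // Violations without an impact value are grouped under 'unknown'.
        const key = violation.impact ?? 'unknown';
        counts[key] += violation.nodes.length;
    }
    return counts;
}

// Example with hand-written data shaped like axe-core violations.
const runCounts = countNodesByImpact([
    { id: 'color-contrast', impact: 'serious', nodes: [{}, {}, {}] },
    { id: 'label', impact: 'critical', nodes: [{}] },
    { id: 'image-alt', impact: 'critical', nodes: [{}, {}] },
]);
```

Logging one such object per run (for example, alongside the commit hash) is enough to plot count and severity over time.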

Suite Structure

Here's how the accessibility suite is organized in autocatalog-testing:

tests/
├── accessibility/
│   ├── ProductManagementA11yTest.ts
│   │   └── @ProductManagement @A11y
│   │       ├── Default page
│   │       └── Add Product modal (include: [role='dialog'])
│   ├── HomeA11yTest.ts
│   │   └── @Home @A11y
│   └── CartA11yTest.ts
│       └── @Cart @A11y
│
├── utils/
│   └── AccessibilityHelper.ts
│       ├── analyzeAccessibility()
│       ├── generateA11yReport()
│       ├── analyzeAndReport()
│       └── generateReportFileName()
│
└── pageObject/
    ├── POManager.ts
    ├── HomePage.ts
    ├── ManagePage.ts
    └── CartPage.ts
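The tree lists generateReportFileName(), which isn't shown in this article. As a rough idea of what a helper with that name might do, here's a sketch that turns a test title into a safe file name. This is a guess for illustration, not the actual AutoCatalog implementation, and the fallback name is my own assumption:

```typescript
// Hypothetical sketch: derives an HTML report file name from a test title.
function generateReportFileName(testTitle: string): string {
    const slug = testTitle
        .toLowerCase()
        // Collapse every run of non-alphanumeric characters into one dash.
        .replace(/[^a-z0-9]+/g, '-')
        // Trim dashes left at the start or end.
        .replace(/^-+|-+$/g, '');
    // Fall back to a generic name if nothing usable remains.
    return `${slug || 'a11y-report'}.html`;
}

const defaultPageReport = generateReportFileName('Product Management - Default page');
// defaultPageReport === 'product-management-default-page.html'
```

The output could be passed as reportFileName so each test writes its own report without hand-picking a name.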

Common Troubleshooting

Tests timing out

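// Playwright's docs discourage relying on 'networkidle' alone,
// so also wait for a concrete element before scanning.
await page.waitForLoadState('networkidle');
await page.waitForSelector('.main-content');
const results = await analyzeAndReport(page, options);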

Flaky tests due to dynamic content

// Wait until the dynamic content is actually visible, then scan.
await page.waitForSelector('[data-testid="product-table"]', { state: 'visible' });
const results = await analyzeAndReport(page, options);

Third-party components with many violations

If you're working with a third-party component you can't fix and need to exclude it from the analysis, @axe-core/playwright supports exclude for that (see the official docs). AutoCatalog doesn't have that scenario, so we don't cover it in detail here — but it's good to know the option exists if you need it.

Fun Facts

Three things I learned while building this suite that change how you read (or configure) an axe-core analysis:

axe-core evaluates what assistive technologies are exposed to, not just the raw markup. It runs inside the page against the rendered DOM, but many of its rules model accessibility semantics (roles, accessible names, whether an element is hidden from assistive technology) rather than checking tags in isolation. That's why two tools can scan the same page and report different things without either being "wrong": tools like WAVE may report more directly on the markup. An element with aria-hidden="true" may have markup-level issues (a missing label, for example) that another tool flags, while axe-core skips it for rules that only apply to content exposed to assistive technology, because for a screen reader that element simply doesn't exist.
color-contrast doesn't evaluate disabled elements. axe-core completely excludes from this rule any element with disabled or aria-disabled="true" — it doesn't mark it as "pass", "fail", or "incomplete", it simply doesn't evaluate it, following the exemption WCAG 1.4.3 makes for inactive components. A disabled button with terrible contrast will never show up in the report.
axe-core's tags go beyond WCAG. Playwright's docs only show four (wcag2a, wcag2aa, wcag21a, wcag21aa), but axe-core supports many more — wcag22aa, best-practice, section508, EN-301-549, among others. It's not something you'll need to touch every day, but the available surface is wider than it first appears, in case you ever need to align your tests with a standard other than WCAG.

The tool isn't being lenient — it's being precise. But precision only helps if you know what you're measuring.
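A practical way to check what a scan actually measured is to look beyond violations. The axe-core results object also has passes, incomplete, and inapplicable arrays, so you can see where a given rule ended up. A minimal sketch; ruleOutcomes and the local types are hypothetical, and the sample data is hand-written:

```typescript
// Minimal local shapes mirroring the four result arrays of an axe-core run.
interface RuleResult {
    id: string;
}

interface ResultsLike {
    violations: RuleResult[];
    passes: RuleResult[];
    incomplete: RuleResult[];
    inapplicable: RuleResult[];
}

type Outcome = 'violations' | 'passes' | 'incomplete' | 'inapplicable';

// Hypothetical helper: lists every result array a rule appears in.
// An empty list means the rule did not run at all (for example, filtered out by tags).
function ruleOutcomes(results: ResultsLike, ruleId: string): Outcome[] {
    const buckets: Outcome[] = ['violations', 'passes', 'incomplete', 'inapplicable'];
    return buckets.filter(bucket =>
        results[bucket].some(rule => rule.id === ruleId)
    );
}

// Hand-written sample: color-contrast found nothing it could evaluate.
const sampleResults: ResultsLike = {
    violations: [{ id: 'label' }],
    passes: [{ id: 'image-alt' }],
    incomplete: [],
    inapplicable: [{ id: 'color-contrast' }],
};
const contrastOutcome = ruleOutcomes(sampleResults, 'color-contrast');
```

If color-contrast only shows up under inapplicable for a view you expected it to cover, that's a hint that the elements you care about were never evaluated, rather than a sign that they passed.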

Conclusion

Automated accessibility testing with Playwright and axe-core is a solid foundation for building more inclusive applications, and it catches a good share of problems well before they reach production.

Key takeaways:

Start small, iterate: begin with the most-used flows and expand coverage over time.
Combine automated and manual: automation covers 30-50%; manual testing is still necessary.
Use include to isolate the component you actually want to test, without the result getting mixed up with the rest of the page.
Understand what the tool measures (what assistive technologies are exposed to vs. raw markup, color-contrast exclusions on disabled elements) before interpreting (or trusting) its results.
Use tags for selective execution, so you can run the full accessibility suite or just the subset you care about.

Does your team have automated accessibility coverage? If you want to explore how to implement this kind of testing in your project, tell me about your team here.

Additional Resources

This is the second article in the series — coming up next: **WCAG 2.4.7 Focus Visible: Visual Regression Testing with Playwright**
