DEV Community

Zdenek Spacek

I Built a Free WCAG Accessibility Scanner — Here's What I Learned

As a solo developer building in public, I recently launched AccessiGuard — a free WCAG accessibility scanner. What started as a side project to help developers catch accessibility issues early has taught me more about web standards, automated testing, and edge cases than I ever expected.

Here's the technical journey, the challenges I faced, and what I learned along the way.

Why Another Accessibility Tool?

The accessibility landscape is changing fast. The EU's European Accessibility Act (EAA) is already in effect, and US state and local government entities face an April 2026 deadline for WCAG compliance. Just last year, the FTC ordered accessiBe to pay $1 million over misleading accessibility claims.

I wanted to build something honest: a tool that tells you what it actually checks, doesn't make inflated promises, and remains free for developers who want to catch issues before they become lawsuits.

The Tech Stack

I kept it intentionally simple:

  • Next.js 14 (App Router) — for the web interface and API routes
  • Cheerio — for fast, server-side HTML parsing
  • TypeScript — because accessibility checks require precision
  • React — for the frontend dashboard
  • Vercel — for hosting and edge functions

The advantage of Cheerio over a full browser automation tool like Puppeteer is speed: it parses and analyzes HTML in milliseconds rather than seconds. The tradeoff? It can't check anything that requires JavaScript execution or visual rendering.

What We Actually Check

I'll be honest: automated tools can only catch about 30-40% of WCAG issues. The rest require human judgment. But that 30-40% matters — those are the low-hanging fruit that plague most websites.

Here's what AccessiGuard scans for:

1. Missing Alt Text on Images

This is the most common issue. Here's a simplified version of the check:

function checkImageAlt($) {
  const issues = [];

  $('img').each((i, elem) => {
    const $img = $(elem);
    const alt = $img.attr('alt');
    const role = $img.attr('role');

    // Decorative images should have empty alt or role="presentation"
    if (role === 'presentation' || role === 'none') {
      return; // Skip
    }

    // Non-decorative images must have alt text
    if (alt === undefined) {
      issues.push({
        type: 'missing-alt',
        element: $.html($img),
        wcag: '1.1.1',
        level: 'A'
      });
    }
  });

  return issues;
}

Challenge faced: Distinguishing between truly missing alt attributes and intentionally empty ones (alt=""). Empty alt is valid for decorative images, but missing alt is always an error.
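The missing-versus-empty distinction can be made explicit with a small three-way classifier. This is a minimal sketch rather than AccessiGuard's actual code: the name `classifyAlt` and its return labels are hypothetical, and it assumes the caller passes the raw attribute value, which is `undefined` when the attribute is absent (as Cheerio's `.attr()` returns). Treating a whitespace-only value as empty is also an assumption.

```javascript
// Hypothetical helper: classify the alt value of an <img>.
// `alt` is the raw attribute value, or undefined when the attribute is absent.
function classifyAlt(alt) {
  if (alt === undefined) return 'missing';     // no alt attribute at all
  if (alt.trim() === '') return 'decorative';  // alt="" (whitespace-only treated the same, by assumption)
  return 'described';                          // a real text alternative
}

console.log(classifyAlt(undefined)); // missing
console.log(classifyAlt(''));        // decorative
console.log(classifyAlt('Logo'));    // described
```

Keeping the three outcomes separate means only the `missing` case needs to be reported as an error.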

2. Form Labels

Forms are critical for accessibility. Every input needs a label:

function checkFormLabels($) {
  const issues = [];

  $('input, select, textarea').each((i, elem) => {
    const $input = $(elem);
    const id = $input.attr('id');
    const ariaLabel = $input.attr('aria-label');
    const ariaLabelledby = $input.attr('aria-labelledby');
    const type = $input.attr('type');

    // Skip hidden inputs and buttons
    if (type === 'hidden' || type === 'submit' || type === 'button') {
      return;
    }

    // Explicit association: a <label for="..."> pointing at this input's id.
    // Comparing attribute values avoids selector errors on unusual ids.
    const hasLabelFor =
      Boolean(id) && $('label').filter((j, label) => $(label).attr('for') === id).length > 0;
    // Implicit association: the input is nested inside a <label>
    const hasWrappingLabel = $input.closest('label').length > 0;
    const hasAriaLabel = Boolean(ariaLabel || ariaLabelledby);

    if (!hasLabelFor && !hasWrappingLabel && !hasAriaLabel) {
      issues.push({
        type: 'missing-form-label',
        element: $.html($input),
        wcag: '3.3.2',
        level: 'A'
      });
    }
  });

  return issues;
}

Challenge faced: Modern frameworks like React often use aria-label or wrap inputs in labels without for attributes. I had to account for multiple valid labeling patterns.

3. Heading Hierarchy

Headings should follow a logical order (h1 → h2 → h3), not skip levels:

function checkHeadingOrder($) {
  const issues = [];
  const headings = [];

  $('h1, h2, h3, h4, h5, h6').each((i, elem) => {
    const level = parseInt(elem.name.substring(1), 10);
    headings.push({ level, text: $(elem).text().trim() });
  });

  for (let i = 1; i < headings.length; i++) {
    const current = headings[i].level;
    const previous = headings[i - 1].level;

    // Check if we skip levels (e.g., h2 → h4)
    if (current - previous > 1) {
      issues.push({
        type: 'heading-skip',
        message: `Heading level ${current} appears after level ${previous}`,
        wcag: '1.3.1',
        level: 'A'
      });
    }
  }

  return issues;
}

Challenge faced: Some modern designs intentionally use CSS to style headings differently than their semantic level. I had to decide whether to flag semantic issues or trust the developer's intent.
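To make the semantic rule concrete, here is the same skip logic as a pure function over a list of heading levels. It is a sketch for illustration only; `findHeadingSkips` is a hypothetical name, and it looks purely at semantic levels, never at how a heading is styled.

```javascript
// Hypothetical pure helper: given heading levels in document order,
// return the places where a level is skipped on the way down.
function findHeadingSkips(levels) {
  const skips = [];
  for (let i = 1; i < levels.length; i++) {
    // Only a jump of more than one level deeper counts as a skip;
    // moving back up (for example h3 -> h2) is fine.
    if (levels[i] - levels[i - 1] > 1) {
      skips.push({ index: i, from: levels[i - 1], to: levels[i] });
    }
  }
  return skips;
}

console.log(findHeadingSkips([1, 2, 3, 2, 4]));
// [ { index: 4, from: 2, to: 4 } ]
```

Note that the step from h3 back to h2 is not reported; only the later h2 → h4 jump is.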

4. Color Contrast

This is where it gets tricky. Without rendering the page, we can't truly measure contrast. So AccessiGuard:

  • Parses inline styles and <style> tags
  • Flags suspicious color combinations
  • Recommends manual testing with browser DevTools

function checkColorContrast($) {
  const warnings = [];

  $('[style*="color"]').each((i, elem) => {
    const $elem = $(elem);
    const style = $elem.attr('style');

    // Simple regexes to extract colors (not production-ready).
    // The (?:^|;) anchor keeps "color:" from matching inside "background-color:".
    const colorMatch = style.match(/(?:^|;)\s*color\s*:\s*([^;]+)/i);
    const bgMatch = style.match(/(?:^|;)\s*background(?:-color)?\s*:\s*([^;]+)/i);

    if (colorMatch && bgMatch) {
      warnings.push({
        type: 'contrast-warning',
        message: 'Manual contrast check recommended',
        element: $.html($elem),
        wcag: '1.4.3',
        level: 'AA'
      });
    }
  });

  return warnings;
}

Challenge faced: Accurate contrast calculation requires computed styles from a rendered page. I chose to flag potential issues and recommend tools like WAVE or Axe DevTools for final verification.
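For reference, once a foreground and background color are actually known (for example from computed styles in a rendered page), the ratio itself is straightforward to compute with the WCAG 2.x relative luminance formula. The sketch below is not part of AccessiGuard; the function names are hypothetical, and it assumes opaque colors written as `#rrggbb`.

```javascript
// Convert one 8-bit sRGB channel to its linear value (WCAG 2.x definition).
function linearize(channel8bit) {
  const c = channel8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color.
function relativeLuminance(hex) {
  const m = /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) throw new Error(`Expected #rrggbb, got: ${hex}`);
  const [r, g, b] = m.slice(1).map((h) => linearize(parseInt(h, 16)));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(hexA, hexB) {
  const a = relativeLuminance(hexA);
  const b = relativeLuminance(hexB);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG 1.4.3 (AA) asks for at least 4.5:1 for normal-size text.
console.log(contrastRatio('#000000', '#ffffff').toFixed(2)); // 21.00
console.log(contrastRatio('#777777', '#ffffff') >= 4.5);     // false
```

The hard part remains finding the effective colors: inheritance, stylesheets, transparency, and background images all require a rendered page.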

5. Language Declaration

Simple but critical:

function checkLanguage($) {
  const issues = [];
  const htmlLang = $('html').attr('lang');

  if (!htmlLang || htmlLang.trim() === '') {
    issues.push({
      type: 'missing-lang',
      message: 'HTML element missing lang attribute',
      wcag: '3.1.1',
      level: 'A'
    });
  }

  return issues;
}

The Architecture

Here's how a scan works:

  1. User submits URL via the Next.js frontend
  2. API route (/api/scan) receives the request
  3. Fetch HTML using native fetch() with a 10-second timeout
  4. Parse with Cheerio — convert HTML string to queryable DOM
  5. Run checks — all check functions execute in parallel
  6. Aggregate results — combine issues by severity (A, AA, AAA)
  7. Return JSON — frontend displays results

The entire scan typically takes 500ms to 2 seconds, depending on page size.
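The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production route: `fetchHtml` and `runChecks` are hypothetical names, the parsing step (`cheerio.load(html)`) is left out to keep the snippet dependency-free, and because the checks shown in this post are synchronous, the sketch simply runs them in a loop.

```javascript
// Hypothetical sketch of the scan pipeline; names are illustrative.

// Step 3: fetch the page HTML, aborting after 10 seconds.
async function fetchHtml(url) {
  const response = await fetch(url, { signal: AbortSignal.timeout(10_000) });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Steps 5-6: run every check against the parsed document and
// group the resulting issues by WCAG level.
function runChecks($, checks) {
  const grouped = { A: [], AA: [], AAA: [] };
  for (const check of checks) {
    for (const issue of check($)) {
      // Create a bucket on the fly if a check reports an unexpected level.
      (grouped[issue.level] ??= []).push(issue);
    }
  }
  return grouped;
}

// Usage with stub checks standing in for the real ones:
const stubChecks = [
  () => [{ type: 'missing-alt', level: 'A' }],
  () => [{ type: 'contrast-warning', level: 'AA' }],
];
const result = runChecks(null, stubChecks);
console.log(result.A.length, result.AA.length, result.AAA.length); // 1 1 0
```

In a real route, the value passed as `$` would be the Cheerio instance built from the fetched HTML, and `checks` would be the functions shown earlier.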

Edge Cases and Gotchas

SVGs with <title> elements

Screen readers handle SVG accessibility differently across browsers. I initially flagged SVGs without aria-label, but missed that <title> elements inside SVGs are valid accessible names.
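A check that accepts `<title>` as a valid name might look like the sketch below. The helper name is hypothetical, and the node shape (`name`, `attribs`, `children`, with text nodes carrying `type: 'text'` and `data`) is an assumption modeled on the parsed elements Cheerio exposes; plain objects are used here so the snippet runs without dependencies.

```javascript
// Hypothetical helper. `node` is assumed to be shaped like a parsed element:
// { name, attribs, children }, with text nodes shaped like { type: 'text', data }.
function svgHasAccessibleName(node) {
  const attribs = node.attribs || {};
  if ((attribs['aria-label'] || '').trim() !== '') return true;
  if ((attribs['aria-labelledby'] || '').trim() !== '') return true;

  // A direct child <title> with non-empty text also names the SVG.
  return (node.children || []).some((child) => {
    if (child.name !== 'title') return false;
    const text = (child.children || [])
      .filter((n) => n.type === 'text')
      .map((n) => n.data)
      .join('');
    return text.trim() !== '';
  });
}

const withTitle = {
  name: 'svg',
  attribs: {},
  children: [{ name: 'title', attribs: {}, children: [{ type: 'text', data: 'Search' }] }],
};
const unnamed = { name: 'svg', attribs: {}, children: [] };

console.log(svgHasAccessibleName(withTitle)); // true
console.log(svgHasAccessibleName(unnamed));   // false
```

This sketch does not decide whether an SVG is decorative; it only answers whether a name is present.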

Dynamic Content

Single-page apps (SPAs) often render content client-side. Cheerio only sees the initial HTML. Solution: I added a notice recommending browser-based tools (Axe DevTools, Lighthouse) for SPA testing.

Iframe Content

Iframes are separate documents, so their content isn't part of the HTML the scanner fetches. I can detect their presence, but the scanner doesn't follow them into cross-origin content, so I flag this limitation in the report.
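Working out which frames point at another origin needs only their `src` values and the standard `URL` class. The following is a minimal sketch; `classifyIframeSources` and its status labels are hypothetical, and the example URLs are placeholders.

```javascript
// Hypothetical helper: classify iframe sources relative to the scanned page.
function classifyIframeSources(pageUrl, iframeSrcs) {
  const pageOrigin = new URL(pageUrl).origin;
  return iframeSrcs.map((src) => {
    let resolved;
    try {
      resolved = new URL(src, pageUrl); // resolves relative URLs against the page
    } catch {
      return { src, status: 'invalid' };
    }
    const sameOrigin = resolved.origin === pageOrigin;
    return { src, status: sameOrigin ? 'same-origin' : 'cross-origin' };
  });
}

console.log(classifyIframeSources('https://example.com/page', [
  '/embed/form',
  'https://www.youtube.com/embed/abc',
]));
```

The first entry resolves to the page's own origin, while the second belongs to a different one and would be listed in the report as unscanned.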

ARIA Overrides

If an element has aria-hidden="true", it's invisible to screen readers — even if it has other accessibility issues. I had to adjust checks to respect ARIA states.
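Because `aria-hidden="true"` applies to an element's whole subtree, respecting it means walking up the ancestors, not just inspecting the element itself. Here is a minimal sketch; the helper name is hypothetical, and the node shape (`attribs` plus a `parent` pointer that is `null` at the top) is an assumption modeled on parsed elements, with plain objects standing in for them.

```javascript
// Hypothetical helper. `node` is assumed to be shaped like a parsed element:
// { attribs, parent }, where `parent` is null at the top of the tree.
function isHiddenFromAssistiveTech(node) {
  // aria-hidden="true" on the element or any ancestor hides the whole subtree.
  for (let current = node; current; current = current.parent) {
    if (current.attribs && current.attribs['aria-hidden'] === 'true') {
      return true;
    }
  }
  return false;
}

const wrapper = { attribs: { 'aria-hidden': 'true' }, parent: null };
const icon = { attribs: { src: 'icon.png' }, parent: wrapper };
const logo = { attribs: { src: 'logo.png' }, parent: null };

console.log(isHiddenFromAssistiveTech(icon)); // true
console.log(isHiddenFromAssistiveTech(logo)); // false
```

A check such as the alt text one could call a helper like this first and skip elements that assistive technology will never announce.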

What I'd Do Differently

Use a headless browser for premium scans. Cheerio is fast but limited. For a paid tier, I'd add Playwright or Puppeteer to check rendered styles, computed contrast, and JavaScript-generated content.

Add axe-core integration. The axe-core library is battle-tested and catches issues I haven't coded for yet. I wanted to build the core myself first to learn, but I'll likely integrate it soon.

More granular reporting. Right now, results are grouped by WCAG level. I should add filtering by issue type, element, and page section.

Lessons Learned

  1. Accessibility is hard. Even automated checks require nuance. There's no one-size-fits-all rule.
  2. Be honest about limitations. Users trust tools that admit what they can't do.
  3. Speed matters. Developers won't use a slow tool. Cheerio's simplicity pays off.
  4. Edge cases are infinite. Every new scan reveals a pattern I didn't anticipate.

What's Already Live

Since the initial build, I've shipped:

  • Continuous monitoring — scheduled scans with email alerts (paid tiers at $29/$79/$199/month)
  • Historical tracking — see how your accessibility score changes over time
  • Multi-page scans — crawl entire sites, not just one page
  • AI-powered fix suggestions — actionable code snippets to resolve each issue

What's Next

What I'm working on now:

  • PDF compliance reports — downloadable, shareable with clients or legal
  • CI/CD integration — GitHub Action to catch accessibility regressions before deploy
  • EU localization — Czech/German landing pages for the European market

Try It Yourself

AccessiGuard is free and always will be for single scans. No signup required.

👉 accessiguard.app

Scan your site, get actionable feedback, and fix issues before they become problems.


Building in public as a solo founder. If you have feedback, questions, or want to discuss accessibility testing, drop a comment below. I read and respond to everything.


Want more technical deep-dives? Follow me here on Dev.to. Next up: "How I Built Continuous Accessibility Monitoring with Cron Jobs and Serverless Functions."
