Chris

Your accessibility score is lying to you

Automated accessibility testing tools such as axe-core by Deque, WAVE, and Lighthouse are a bit like a spellcheck for web accessibility. They are really useful for quickly identifying and resolving many common accessibility issues.

There is a whole range of tools that provide similar services: a way to detect some of the most common accessibility issues across a page, as in the sketch below.
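As a rough illustration, here's what one of these scans can look like using Playwright with the @axe-core/playwright package. The URL and test setup are assumptions for the sketch, not a recommendation of any particular stack.

```typescript
// Minimal sketch: run the axe-core ruleset against a rendered page.
// Assumes a Playwright test project with @axe-core/playwright installed.
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('homepage has no detectable axe violations', async ({ page }) => {
  await page.goto('https://example.com'); // placeholder URL

  const results = await new AxeBuilder({ page }).analyze();

  // An empty list only means no *automatically detectable* violations,
  // which is exactly the limitation this post is about.
  expect(results.violations).toEqual([]);
});
```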

The problem with automated accessibility scores

The problem is the way their reporting presents a score out of 100%. It gives the uninitiated the impression that once an automated score reaches 80 or 90%, the site is pretty good. In reality, these scores can be deeply misleading.

Automated tests typically detect only 20% to 40% of real accessibility issues. "What about AI?" I hear you scream. I'm sure that figure will increase, but let's set AI aside for this post. Like a spell-checker that flags spelling mistakes but cannot understand meaning or context, an automated tool can't tell you whether the book makes sense. These tools identify technical errors but miss many barriers that only humans can detect.

Deque's own marketing materials claim their tooling can detect up to 57% of issues, although at the time of writing I find it hard to verify how they arrived at this figure. Which websites? How was it tested? Are there user testing videos?

How this scoring misleads those in power

I was sat in a presentation recently, cringing, as a Product Owner and a Lead Designer proudly asserted that their automated score of 70% meant they were "almost there", when they were so far away from the reality...

Suddenly there was another epic piece of work: educating certain stakeholders about the misleading nature of this score.

A site scoring 70% might appear nearly compliant, but if we accept the marketing claim of 57% detection, a "70%" score equates to roughly 39.9% of actual issues covered (0.70 × 0.57 ≈ 0.399). This discrepancy leads people to believe that accessibility work is largely complete, when in fact the majority of blockers remain unresolved.

Automated score (%)    Approx. % of actual issues detected (at a 57% ceiling)
30                     17.1
40                     22.8
50                     28.5
60                     34.2
70                     39.9
80                     45.6
90                     51.3
100                    57.0
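The arithmetic behind the table is simple enough to sanity-check yourself. A tiny sketch, taking Deque's 57% marketing figure as the ceiling:

```typescript
// The automated score only applies to the subset of issues
// the tooling can detect at all (57% at best, per Deque's claim).
const MAX_DETECTABLE = 0.57;

function actualCoveragePercent(automatedScorePercent: number): number {
  // e.g. a "70%" automated score: 70 * 0.57 = 39.9% of real issues
  return automatedScorePercent * MAX_DETECTABLE;
}

console.log(actualCoveragePercent(70).toFixed(1)); // "39.9"
```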

The wider consequences

When teams focus on improving their automated score, accessibility becomes a checkbox exercise rather than a genuine effort to create accessible experiences. Developers start "fixing for the tool" instead of fixing for disabled users. The whole goal becomes simply getting the tooling to show a green light.

This has several negative effects:

  • Teams make superficial, somewhat performative changes to satisfy tooling rather than to unblock disabled people.
  • Businesses suddenly believe they are compliant when they are not, giving them a false sense of confidence.
  • Leadership tend to use these scores to justify reducing investment in accessibility.
  • Most importantly, disabled users remain unable to complete tasks such as checking out, navigating menus, or using interactive features.

Why automated tools still matter

Don't get me wrong: automated accessibility tools should not be dismissed. They are excellent for identifying obvious issues and ensuring consistency across large codebases. However, they are only a starting point, not a comprehensive solution, and they are not a replacement for testing with real disabled users.

The things below can't be skipped

  • Manual testing with assistive technologies
  • User testing with people with disabilities

Without these, even a “perfect” automated score is somewhat meaningless.

Time to get uncomfortable

The uncomfortable truth is that, in many organisations, accessibility isn't treated as a commitment to unblocking people; it's a risk management exercise. For some leaders, it's not about people, it's about protection.

They invest in automated tools and chase high accessibility scores because, if they're ever challenged legally, they can point to those numbers as "evidence" of compliance, hoping no one looks too closely.

Sometimes the companies selling these accessibility testing tools also have a vested interest in keeping those scores high. Their products are compared against other platforms, and a higher "score" looks better in sales demos. They collect their subscription fees whether or not disabled people can actually use the product or service.

Update the metrics

I would love for these tools to update their scoring metrics.

Imagine if axe-core or Lighthouse had a maximum score of 57%, with no way to ever reach 100%. That would shift the understanding instantly.

Misunderstanding these scores can give an organisation a dangerous illusion of compliance without actually improving the experience for disabled people.


Further reading

Cover image alt
[Two large circular graphics are shown side by side on a light background. The left circle is green with "100%" inside and labelled "Automated accessibility score." The right circle is orange with "57%" inside and labelled "Actual issues detected." Below the circles, a caption reads: "Automated testing tools only catch a fraction of real accessibility issues."]

Top comments (11)

Archit Mittal

This is an important distinction that too many teams miss. A Lighthouse 100 gives a false sense of security because automated tools can only catch about 30% of real accessibility issues. The ones they miss - logical tab order, meaningful focus management, screen reader announcement timing - are often the ones that actually block users. I've started adding manual keyboard-only navigation testing as a CI gate in my projects. It's not automated accessibility testing, but spending 2 minutes tabbing through your critical flows catches more real issues than any score.
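A hypothetical sketch of what such a keyboard-only gate might look like in Playwright; the URL and data-testid are invented for illustration and are not the commenter's actual setup:

```typescript
// Hypothetical keyboard-only CI gate: tab through the page and fail
// if focus never reaches the checkout button. Selectors are made up.
import { test } from '@playwright/test';

test('checkout button is reachable by keyboard alone', async ({ page }) => {
  await page.goto('https://example.com/cart'); // placeholder URL

  for (let i = 0; i < 50; i++) {
    await page.keyboard.press('Tab');
    const focused = await page.evaluate(
      () => document.activeElement?.getAttribute('data-testid'),
    );
    if (focused === 'checkout-button') return; // reachable: test passes
  }
  throw new Error('Checkout button was never reachable via Tab');
});
```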

Elmar Chavez

This is an eye-opener for me. I've been using Lighthouse to check the accessibility of most of my projects. Good read, I'll keep this in mind.

Chris

I'm glad! This is what worries me. It's not your fault, it's down to the Google team - why set up the scoring like this?

AgentKit

This matches what we found running weekly scans on AI product landing pages. We audited 29 of them over a few months -- every single one passed axe-core's color contrast and alt-text checks. Perfect scores on those specific rules. But then we did keyboard-only walkthroughs and screen reader testing, and all 29 had structural WCAG failures that the automated tools either missed entirely or flagged as "needs review" (which nobody reviews).

The 30-40% detection range you mention feels right from our data too. The scary part isn't the score itself, it's that teams use it as a stopping point. "We got 95, ship it."

Chris

😔 Exactly, so many disabled people are still blocked.

Victor Okefie

The table is the evidence. A 90% automated score is actually 51% of real issues. That's not a measurement gap. That's a lie told in percentages. The tools keep the scale because 57% doesn't sell. 100% does. The real fix isn't better automation. It's admitting that accessibility can't be reduced to a number. But that doesn't fit on a dashboard. So the illusion persists.

Chris

"That's a lie told in percentages" Exactly you've nailed the problem around what "sells"

Bhavin Sheth

I have seen teams celebrate a 90% Lighthouse score while basic keyboard navigation was completely broken 😅

Automated tools are great for quick wins, but real issues only show up when you actually try using the product like a user. Score ≠ accessibility.

Chris

Exactly. Why does Google Lighthouse do this? They're not selling people on scores; they're just miseducating a whole generation of developers about accessibility.

ShaynaProductions

I use axe-jest to ensure my components pass minimal testing. It's useful for catching common programming mistakes as a sanity check. I do think a tool like Accessibility Insights (free) can help guide developers and testers toward better accessibility. While it has an automated segment, most of it requires manual interactions, and each test scenario maps to WCAG.
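For anyone unfamiliar with that setup, a minimal sketch of a component-level check, assuming the jest-axe package paired with React Testing Library; the Button component is hypothetical:

```tsx
// Component-level sanity check with jest-axe and React Testing Library.
import React from 'react';
import { render } from '@testing-library/react';
import { axe, toHaveNoViolations } from 'jest-axe';
import { Button } from './Button'; // hypothetical component

expect.extend(toHaveNoViolations);

test('Button has no detectable axe violations', async () => {
  const { container } = render(<Button label="Checkout" />);
  const results = await axe(container);
  // Catches common programming mistakes; it is not proof of accessibility.
  expect(results).toHaveNoViolations();
});
```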

Chris

They're great tools, but they're only part of the solution. The worry is that people fix for the tool and think they're done, when this is not the case.