Last year we inherited a project from another agency. The client told us it was "fully accessible" because they'd run a Lighthouse audit and scored 96. When we tested with a screen reader, the main navigation was completely unusable. The modal for creating new records trapped focus with no escape. Every table had identical "Edit" link text, so a screen reader user would hear "Edit, Edit, Edit, Edit" with no idea which record each link referred to. A 96 Lighthouse score, and a screen reader user couldn't complete a single core workflow.
What You Need to Know
- Automated accessibility scanners catch approximately 30% of WCAG failures. They're necessary but nowhere near sufficient
- Manual keyboard testing takes 15 minutes per page and catches issues no scanner can detect
- Screen reader testing is the single highest-value manual test you can run
- Build accessibility checks into CI rather than saving them for a pre-launch audit
- User testing with disabled people finds problems that even expert manual testing misses
The 30% Problem
Automated tools are good at checking things that can be measured by reading HTML. Does the image have alt text? Does the text meet contrast requirements? Does the form input have an associated label? Is the HTML valid?
These are real issues and they're worth catching. But they're the easy ones. The harder problems are structural, behavioural, and contextual. No scanner can tell you that your keyboard focus order jumps from the sidebar to the footer, skipping the main content. No scanner can tell you that your drag-and-drop interface has no keyboard alternative. No scanner can tell you that your alt text says "image" instead of describing what the image actually shows.
30%: the approximate share of WCAG issues detectable by automated tools alone (Source: UK Government Digital Service, Accessibility Audit Research, 2020)
That 30% figure comes up consistently across research. The UK Government Digital Service found similar numbers when they audited their own services. Deque, who make axe, are open about the limitations. Automated scanning is a floor, not a ceiling.
Layer 1: Automated Scanning in CI
Start with automation. It's cheap and it catches the low-hanging fruit.
axe-core integrates into most testing frameworks. Run it against every page in your integration tests. When it finds a failure, the build fails. This prevents regressions. You won't ship a form without labels because the CI pipeline won't let you.
```javascript
// Example: axe with Playwright
const { test, expect } = require('@playwright/test')
const AxeBuilder = require('@axe-core/playwright').default

test('dashboard has no accessibility violations', async ({ page }) => {
  await page.goto('/dashboard')
  const results = await new AxeBuilder({ page }).analyze()
  expect(results.violations).toEqual([])
})
```
Lighthouse CI is another option, more general purpose. It runs accessibility audits alongside performance and SEO checks. Useful for a broad view but less detailed than axe for accessibility specifically.
Set a threshold. Don't just report the score. Fail the build if accessibility violations appear. A report that nobody reads is the same as no report. A build that blocks on violations actually prevents issues from reaching production.
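A failing build is only useful if the log tells the developer what to fix. One way to do that is a small helper that flattens axe's violations array into a readable message; this is a sketch assuming axe-core's documented result shape (each violation has an `id`, `impact`, `help` text, and `nodes` with CSS selector `target`s), and `summarizeViolations` is a name invented here:

```javascript
// Sketch: turn axe-core's violations array into a readable failure
// message for CI logs. Assumes axe's documented result shape.
function summarizeViolations(violations) {
  return violations
    .map(v => [
      `${v.impact}: ${v.id} (${v.help})`,
      ...v.nodes.map(n => `  at ${n.target.join(' ')}`),
    ].join('\n'))
    .join('\n')
}

// Hand-written violation in axe's shape, for illustration:
const sample = [{
  id: 'label',
  impact: 'critical',
  help: 'Form elements must have labels',
  nodes: [{ target: ['#search input'] }],
}]
// summarizeViolations(sample) produces:
// critical: label (Form elements must have labels)
//   at #search input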
Run scans against your most-used pages first. Dashboard, main list view, primary form, detail page. Cover the core workflows before sweeping every route.
Layer 2: Manual Keyboard Testing
This takes 15 minutes per page and catches problems that automated tools structurally cannot detect.
The test is simple. Put your mouse in a drawer. Open the page. Try to complete the primary task using only your keyboard. Tab through the interface. Activate buttons with Enter. Navigate menus with arrow keys. Open and close modals. Fill in forms. Submit them.
What you'll find:
Focus order problems. The tab sequence jumps around the page instead of following the visual layout. You tab from the header straight to the footer, missing the main content entirely.
Focus traps. You open a dropdown or modal and can't tab out of it. You're stuck. The only escape is clicking the mouse or refreshing the page, neither of which a keyboard-only user can do.
Missing keyboard handlers. Custom components that respond to clicks but not to Enter or Space. A settings toggle that works with a mouse but does nothing when you press Enter on it.
Invisible focus. You're tabbing through the page but you can't see where you are. The focus indicator has been removed or is too subtle to see against the background.
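The missing-keyboard-handler case is usually a one-function fix. A minimal sketch, assuming a custom switch built from a `div` (markup, element names, and the `flip` helper are hypothetical):

```javascript
// Sketch: give a click-only custom control the keyboard behaviour a
// native button gets for free. Assumes markup like:
//   <div class="toggle" role="switch" tabindex="0" aria-checked="false">
// (tabindex="0" is what makes the element reachable by Tab at all).
function activateOnKeys(event, activate) {
  // Native buttons respond to both Enter and Space; mirror that,
  // and stop Space from scrolling the page.
  if (event.key === 'Enter' || event.key === ' ') {
    event.preventDefault()
    activate()
  }
}

// Wiring, in the browser (toggle and flip are hypothetical):
// toggle.addEventListener('click', () => flip(toggle))
// toggle.addEventListener('keydown', e => activateOnKeys(e, () => flip(toggle)))
// where flip() updates aria-checked as well as the visual state.
```

Better still, use a real `<button>` and get all of this for free; the handler is for the cases where you're stuck with a custom element.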
John and I disagree on how often this should happen. He thinks every sprint. I think every significant UI change, which ends up being roughly the same cadence. The point is that it needs to be regular and attached to the development cycle, not saved for a quarterly audit.
Layer 3: Screen Reader Spot-Checks
This is where most teams stop. And it's where the most impactful issues live.
You don't need to be an expert screen reader user. You need to turn one on and attempt a basic task. VoiceOver comes free on every Mac (Cmd+F5). NVDA is free on Windows.
What to listen for:
Does the page have a logical structure? When you navigate by headings (VO+Cmd+H on Mac), do you get a meaningful outline of the page? Or do you hear "heading level 2, heading level 2, heading level 2" with no hierarchy?
Are interactive elements announced correctly? When you focus a button, does the screen reader say "Submit application, button" or just "Submit"? When you focus a link, does it say "View case 2847" or "Click here"?
Do dynamic changes get announced? When you submit a form and a success message appears, does the screen reader announce it? If not, the user has no idea their action worked. Use aria-live="polite" for status updates and aria-live="assertive" for errors.
Do modals work? When a dialog opens, does the screen reader announce the dialog title? Can you navigate within it? When it closes, does focus return to the button that opened it?
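The announcement pattern is small enough to sketch. This assumes the page renders an empty live region on initial load (the `#status` id and `announce` helper are hypothetical names):

```javascript
// Sketch: announce a dynamic change to screen readers. Assumes the
// page already contains an empty live region, e.g.
//   <div id="status" aria-live="polite"></div>
// The region must exist before the update: injecting aria-live at the
// same moment as the message is often not announced at all.
function announce(message, region = document.getElementById('status')) {
  // Changing the text content of an existing aria-live region is what
  // triggers the screen reader announcement.
  region.textContent = message
}

// After a successful form submit, in the browser:
// announce('Record saved')
```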
"The first time I used a screen reader on something I'd built, I was embarrassed: visually polished on screen, but through a screen reader it was chaos. That experience changed how I build things." (Rainui Teihotua, Chief Creative Officer)
Layer 4: User Testing with Disabled People
This is the layer that catches what even expert testers miss. A person who uses assistive technology daily has workflows, mental models, and expectations that a sighted developer testing with VoiceOver for 15 minutes won't replicate.
We've done this on three enterprise projects now. Every time, the testers found issues that none of us caught. Not edge cases. Core workflow problems. Things like: the filter sidebar requires too many keystrokes to use efficiently, so they ignore it entirely and use browser find instead. Or: the notifications badge updates visually but doesn't announce, so they miss time-sensitive alerts.
Finding participants in New Zealand takes effort. Blind Low Vision NZ and the Deaf Society of New Zealand are good starting points. Budget for it. Pay participants properly. Treat it like any other user research, because that's exactly what it is.
Making It Stick
The pattern that works is layered and continuous:
Every build: Automated axe-core scans in CI against core pages. Build fails on violations.
Every significant UI change: 15-minute keyboard test by the developer who built it. Part of the PR checklist, same as code review.
Every release: Screen reader spot-check on new or changed workflows. Doesn't need to be exhaustive. 30 minutes covering the main changes.
Every quarter: Full manual audit of core workflows. Ideally including at least one external tester who uses assistive technology.
This isn't onerous. The total time investment is maybe two hours per sprint for a team of five. The alternative is shipping inaccessible software to hundreds of daily users who can't complain because the software is mandated, running a panicked audit before a government contract renewal, and spending three times longer on remediation than prevention would have cost.
Accessibility testing that works is the same as any testing that works. It's automated where possible, manual where necessary, continuous rather than periodic, and built into the development process rather than bolted on at the end.

