
Video Accessibility: Captions, Transcripts & WCAG Compliance (2026)

WCAG Level AA requires captions, audio descriptions, and a keyboard-accessible player for your website videos. Here's exactly what you need to do — with the April 2026 ADA deadline approaching.


The ADA Title II compliance deadline for state and local government websites is April 24, 2026 — weeks away. If your site has video content without captions, you're not just losing viewers. You're potentially facing legal action in a landscape where digital accessibility lawsuits hit 4,187 in 2024 and are tracking 37% higher in 2025 (UsableNet, 2024).

But here's the thing most compliance guides won't tell you: making your videos accessible isn't just about avoiding lawsuits. It's a practical business decision. 80% of people who use captions aren't hearing impaired — they're watching in noisy environments, in quiet offices, or in their non-native language (Verizon Media & Publicis Media, 2019). Accessible video is better video for everyone.

  • WCAG Level AA requires: Synchronized captions for pre-recorded video, audio descriptions (not just transcripts), and a keyboard-accessible player.
  • Captions are not subtitles: Captions include speaker ID, sound effects, and music cues. Subtitles only translate dialogue and do NOT satisfy WCAG requirements.
  • April 2026 ADA deadline: Large public entities must meet WCAG 2.1 AA for web content, including video. Private businesses face active litigation now.
  • Business case: Viewers are 80% more likely to finish a video when captions are available, and captions lifted views by 12% in Facebook's research. Accessibility is a growth lever, not just a compliance checkbox.
What is video accessibility? Video accessibility means ensuring that video content can be perceived, understood, and interacted with by people of all abilities — including those who are deaf, hard of hearing, blind, or have motor impairments. In practice, this involves captions, transcripts, audio descriptions, and an accessible video player (W3C WAI, 2024).

Most guides frame video accessibility as a legal requirement you need to check off. That's true — but it's also incomplete. The business case is strong enough to justify the work even without the legal risk.

Reach. Over 70% of Americans now watch content with subtitles or captions enabled (Kapwing, 2024). That includes people in noisy commuter environments, non-native English speakers, and anyone who just prefers reading along. If your video doesn't have captions, you're underserving the majority of viewers, not a minority.

Engagement. Viewers are 80% more likely to watch a video to completion when closed captions are available (3Play Media). Facebook's internal research found captions increased video views by 12% (Lemonlight). For product videos, training content, or explainer videos on your website, that's a measurable impact on conversion and retention.

SEO. Search engines can't watch your video, but they can index your captions and transcripts. Adding a text transcript to your video page gives Google crawlable content that matches search queries. This is particularly valuable for online course videos and training content where long-tail search traffic matters.

Legal risk. The enforcement landscape is real and growing. 4,187 digital accessibility lawsuits were filed in 2024, and 2025 is pacing 37% higher. The DOJ's April 2024 final rule establishes WCAG 2.1 Level AA as the standard for ADA Title II compliance, with the first deadline hitting April 24, 2026 for public entities serving populations of 50,000 or more (DOJ, 2024).

Person using assistive technology on a computer, representing digital accessibility and inclusive web design
Accessible video design benefits everyone — not just users with disabilities. Photo on Unsplash

Closed Captions vs. Subtitles: The Difference That Matters for Compliance

These terms are often used interchangeably, but they mean different things — and the distinction matters for WCAG compliance.

| Feature | Closed Captions | Subtitles |
| --- | --- | --- |
| Purpose | Designed for deaf/hard-of-hearing viewers | Designed for language translation |
| Dialogue | Yes | Yes |
| Speaker identification | Yes, identifies who is speaking | Usually not |
| Sound effects | Yes: [door slams], [phone rings] | No |
| Music cues | Yes: [upbeat music], [suspenseful score] | No |
| WCAG compliant? | Yes, satisfies SC 1.2.2 | No, missing required non-dialogue audio |
| Can be toggled? | Yes (closed) or burned in (open) | Usually toggleable |

The bottom line: If you have subtitles on your video but no proper closed captions, you're not WCAG compliant. Captions must include all meaningful audio — dialogue, speaker identification, sound effects, and music cues — not just a translation of the spoken words.

WCAG Video Requirements Explained

WCAG (Web Content Accessibility Guidelines) is the technical standard that defines what "accessible" means for web content. Here's what it requires for video, broken down by conformance level:

| Level | SC # | Requirement | Applies To |
| --- | --- | --- | --- |
| A | 1.2.1 | Text alternative for audio-only or video-only content | Pre-recorded audio-only and video-only |
| A | 1.2.2 | Synchronized captions | Pre-recorded video with audio |
| A | 1.2.3 | Audio description OR full text transcript | Pre-recorded synchronized media |
| A | 2.1.1 | All player controls keyboard accessible | All media players |
| A | 4.1.2 | Name and role exposed for all player controls (typically via ARIA labels) | All media player controls |
| AA | 1.2.4 | Live captions | Live video with audio |
| AA | 1.2.5 | Audio description (transcript alone is NOT sufficient) | Pre-recorded synchronized media |
| AAA | 1.2.6 | Sign language interpretation | Pre-recorded synchronized media |
| AAA | 1.2.8 | Full text alternative for all media | Pre-recorded synchronized media |

Most organizations target Level AA, which also includes every Level A criterion. That's what the DOJ requires for ADA Title II and what courts reference in ADA Title III cases. Level AAA is aspirational but rarely required by law.

The most commonly misunderstood requirement is SC 1.2.5 (audio descriptions at Level AA). At Level A, you can provide a full text transcript as an alternative. But at Level AA, a transcript alone isn't enough — you must provide actual audio descriptions that narrate visual content for blind users. This catches many organizations off guard because they assume their transcript covers everything.

How to Make Your Videos Accessible

Here's the practical implementation, step by step:

1. Add Closed Captions

Captions are the highest-impact accessibility feature. Here's the workflow:

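Generate a first draft. Use an auto-captioning service (YouTube, Rev, Otter.ai, or Descript) to create an initial caption file. Auto-generated captions have improved significantly, but they're still not compliant on their own: the benchmark usually cited, and often attributed to the FCC's caption-quality rules, is 99% accuracy, and auto-captions typically hit 85-95%. WCAG itself sets no numeric threshold, but captions that misstate the audio don't provide an equivalent experience.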
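To make the accuracy gap concrete, here is a minimal sketch of how you might score an auto-generated draft against a human-corrected reference. It computes word-level accuracy as one minus the word error rate. The function name and sample sentences are illustrative assumptions, and real scoring tools also normalize punctuation and numbers, which this sketch does not.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - word error rate, using word-level edit distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Dynamic-programming edit distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the draft
                curr[j - 1] + 1,     # extra word in the draft
                prev[j - 1] + cost,  # wrong word (or a match)
            ))
        prev = curr
    return 1.0 - prev[-1] / max(len(ref), 1)


# Hypothetical sample: one wrong word out of seven.
reference = "welcome to the smart video onboarding guide"
auto_draft = "welcome to the smart radio onboarding guide"

accuracy = word_accuracy(reference, auto_draft)
meets_benchmark = accuracy >= 0.99  # the 99% benchmark cited above
```

A single wrong word in a seven-word line already drops accuracy to roughly 86%, which is why a line-by-line review matters even when a draft looks mostly right.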

Edit for accuracy. Review every line. Fix proper nouns, technical terms, and homophones. Add speaker identification ([James:], [Customer:]) and non-dialogue audio ([background music], [applause], [door closes]).

Save as WebVTT (.vtt) or SRT (.srt). These are the two standard caption formats supported by virtually all video players. VTT is the web-native format and supports more styling options. For a detailed walkthrough, see our guide on how to create a VTT caption file.
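As an illustration of what a caption file contains, the following sketch builds WebVTT text from a list of cues. The `Cue` class, the helper names, and the sample cues are assumptions made for this example. The bracketed speaker and sound labels follow the convention used in this article.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float
    text: str


def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Serialize cues: a WEBVTT header, then blank-line-separated cue blocks."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{vtt_timestamp(cue.start)} --> {vtt_timestamp(cue.end)}")
        lines.append(cue.text)
        lines.append("")
    return "\n".join(lines)


# Hypothetical cues mixing dialogue with non-dialogue audio.
cues = [
    Cue(0.0, 2.5, "[upbeat music]"),
    Cue(2.5, 6.0, "[James:] Welcome to the product tour."),
    Cue(6.0, 7.2, "[door closes]"),
]
vtt_text = to_webvtt(cues)
```

Writing `vtt_text` to a file with a `.vtt` extension gives you something you can upload to most platforms or reference from a self-hosted player.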

Upload to your video platform. How you attach captions depends on your hosting setup. Most dedicated video hosting platforms accept VTT/SRT uploads through their dashboard. If you're self-hosting with an HTML5 video player, you add a <track> element pointing to your VTT file.
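For the self-hosted case, this sketch generates the markup for a `<video>` element with a captions track. The `kind`, `src`, `srclang`, `label`, and `default` attributes are standard HTML, and browsers' native `<track>` support expects WebVTT. The function name and file paths are placeholders for illustration.

```python
from html import escape


def video_with_captions(video_src: str, vtt_src: str,
                        srclang: str = "en", label: str = "English") -> str:
    """Return HTML for a <video> with one captions track enabled by default."""
    return (
        f'<video controls src="{escape(video_src)}">\n'
        f'  <track kind="captions" src="{escape(vtt_src)}" '
        f'srclang="{escape(srclang)}" label="{escape(label)}" default>\n'
        f"</video>"
    )


# Hypothetical paths; substitute your own files.
snippet = video_with_captions("/media/tour.mp4", "/media/tour.en.vtt")
print(snippet)
```

Using `kind="captions"` rather than `kind="subtitles"` signals that the track includes non-dialogue audio, which matches the distinction drawn earlier in this guide.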

2. Provide Transcripts

A transcript is a text version of all audio and visual content in your video. Unlike captions (which are time-synchronized), transcripts are a standalone document — typically placed below the video on the same page.

For Level A compliance, a transcript can serve as your "media alternative" (SC 1.2.3). At Level AA, you still need audio descriptions in addition to any transcript.

Transcripts also serve a practical SEO function: they give search engines full-text content to index on your video page, which is particularly valuable for embedded video content that would otherwise be invisible to crawlers.
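If you already have a reviewed caption file, a first-draft transcript can be pulled from it. The sketch below strips the timing lines from WebVTT text and keeps one line per cue. The function name and sample are assumptions, and the output covers audio only: a transcript meant to serve as a media alternative still needs descriptions of important visual content added by hand.

```python
def vtt_to_transcript(vtt_text: str) -> str:
    """Collapse WebVTT cues into plain text, one cue per line."""
    transcript_lines = []
    blocks = vtt_text.replace("\r\n", "\n").strip().split("\n\n")
    for block in blocks:
        rows = block.split("\n")
        for index, row in enumerate(rows):
            if "-->" in row:
                # Everything after the timing line is the cue's text.
                text = " ".join(r.strip() for r in rows[index + 1:] if r.strip())
                if text:
                    transcript_lines.append(text)
                break
        # Blocks without a timing line (the header, notes) are skipped.
    return "\n".join(transcript_lines)


# Hypothetical caption file content.
sample_vtt = """WEBVTT

00:00:00.000 --> 00:00:02.500
[upbeat music]

00:00:02.500 --> 00:00:06.000
[James:] Welcome to
the product tour.
"""
transcript = vtt_to_transcript(sample_vtt)
```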

3. Add Audio Descriptions (When Required)

Audio descriptions are a narrated track that describes important visual content — actions, scene changes, on-screen text, charts — during natural pauses in dialogue. They're designed for blind and visually impaired users.

When are they required?

  • Level A: You can provide a text transcript instead (SC 1.2.3)
  • Level AA: Audio descriptions are required — a transcript alone doesn't satisfy SC 1.2.5
  • Practical exemption: If all visual information is already conveyed through the audio track (e.g., a talking-head video where the speaker describes everything), no additional audio description is needed
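The rules above can be sketched as a small decision function. This is a simplified reading of the three bullets, not a legal test, and the function and parameter names are hypothetical.

```python
from typing import Optional


def audio_description_gap(level: str,
                          described_in_audio: bool,
                          has_audio_description: bool,
                          has_transcript: bool) -> Optional[str]:
    """Return None when the video meets the rule above, else what to add."""
    # Practical exemption, or an audio description track already exists.
    if described_in_audio or has_audio_description:
        return None
    if level == "A":
        if has_transcript:
            return None  # SC 1.2.3 allows a full text transcript instead
        return "Add an audio description or a full text transcript (SC 1.2.3)."
    # Level AA: a transcript does not substitute for audio description.
    return "Add an audio description; a transcript alone does not satisfy SC 1.2.5."


# A screen-recorded demo with a transcript but no narration of the visuals.
gap_at_a = audio_description_gap("A", False, False, True)
gap_at_aa = audio_description_gap("AA", False, False, True)
```

The same video passes at Level A and fails at Level AA, which is the gap that tends to surprise teams relying on transcripts.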

For most corporate and marketing videos, the practical approach is to script your videos with accessibility in mind. If the narrator verbally describes what's on screen ("As you can see in the chart, costs dropped 40% in Q3"), you've effectively built audio description into the original content.

4. Ensure Player Accessibility

This is the requirement most guides skip, and where many organizations fail compliance audits. Conforming to WCAG 2.1 Level AA (which includes every Level A criterion) means the player must provide:

  • Keyboard navigation (SC 2.1.1, Level A): Play, pause, volume, seek, captions toggle, and fullscreen must all be operable with keyboard only (no mouse required)
  • ARIA labels (SC 4.1.2, Level A): Every player control must expose its name and role, typically through ARIA attributes, so screen readers can announce what each button does
  • Focus indicators (SC 2.4.7, Level AA): When tabbing through controls, users must see a visible focus indicator showing which control is active
  • No keyboard traps (SC 2.1.2, Level A): Users must be able to tab into and out of the video player without getting stuck
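Some of these checks can be approximated with a static scan of the player markup. The sketch below uses Python's standard `html.parser` to flag controls with no accessible name and custom `role="button"` elements that are not in the tab order. The class name, the sample markup, and the rule that only `tabindex="0"` counts are simplifying assumptions. It cannot detect focus indicators or keyboard traps, so it supplements manual keyboard testing rather than replacing it.

```python
from html.parser import HTMLParser


class ControlAudit(HTMLParser):
    """Flag player controls lacking an accessible name or keyboard focus."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self._open = None  # [tag, identifier, has_name] for the current control

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        native = tag == "button"
        custom = attr.get("role") == "button" and not native
        if not (native or custom):
            return
        ident = attr.get("id", tag)
        # Native buttons are focusable; custom ones need an explicit tabindex.
        if custom and attr.get("tabindex") != "0":
            self.problems.append(f"{ident}: custom control is not in the tab order")
        named = bool(attr.get("aria-label") or attr.get("aria-labelledby"))
        self._open = [tag, ident, named]

    def handle_data(self, data):
        # Visible text inside the control also counts as a name.
        if self._open and data.strip():
            self._open[2] = True

    def handle_endtag(self, tag):
        if self._open and tag == self._open[0]:
            if not self._open[2]:
                self.problems.append(f"{self._open[1]}: no accessible name")
            self._open = None


# Hypothetical player markup with two defects.
player_html = """
<div class="player">
  <button id="play" aria-label="Play"></button>
  <button id="mute"><svg></svg></button>
  <div id="cc" role="button">CC</div>
</div>
"""
audit = ControlAudit()
audit.feed(player_html)
problems = audit.problems
```

Here the icon-only mute button has no name for a screen reader to announce, and the captions toggle built from a `div` cannot be reached with the Tab key.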
Your video player is part of your compliance posture.
When you embed video on your website, the player's keyboard navigation, ARIA labels, and caption support directly affect your WCAG compliance. SmartVideo delivers video through an accessible player with caption track support, with no YouTube branding, no ads, and no third-party compliance gaps.

Why Your Video Hosting Choice Affects Compliance

Here's something the W3C documentation and compliance guides don't address: your video hosting platform is part of your accessibility stack.

If you embed a YouTube video on your site, YouTube's embedded player becomes part of your page. And YouTube's embedded iframe has known accessibility limitations:

  • Keyboard navigation in the embedded player can be inconsistent — tabbing through controls doesn't always follow a logical order
  • Caption controls in embedded mode behave differently than on youtube.com — users may not be able to toggle captions reliably
  • Auto-generated captions on YouTube are enabled by default but rarely meet the 99% accuracy benchmark commonly used for compliance
  • You don't control the player — YouTube can change their player's accessibility features at any time, and you inherit those changes on your site

This doesn't mean YouTube embeds are inherently non-compliant. But it means you're outsourcing a critical compliance component to a platform you don't control. For organizations serious about accessibility — particularly those subject to ADA Title II — using a dedicated video platform where you control the player, caption tracks, and delivery gives you more predictable compliance.

Team collaborating on digital content, representing inclusive web design and accessibility planning
Building accessible video into your workflow benefits your entire audience. Photo on Unsplash

Common Video Accessibility Mistakes

These are the failures that come up most often in compliance audits:

Relying on auto-generated captions without review. YouTube and other platforms generate captions automatically, but they regularly miss proper nouns, technical terms, and speaker changes. Auto-captions also don't include sound effects or music cues. They're a starting point, not a finished product.

Providing subtitles instead of captions. If your caption file only contains translated dialogue — no speaker identification, no sound effects, no music cues — it doesn't satisfy WCAG SC 1.2.2. This is the most common "we thought we were compliant" failure.
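A rough way to spot this failure is to scan a caption file for any non-dialogue markers at all. The sketch below assumes the bracket convention used in this guide (for example `[James:]` or `[door closes]`). The function name and samples are hypothetical, and a file that passes may still be incomplete, so treat a clean result as a prompt for review rather than proof.

```python
import re

BRACKETED = re.compile(r"\[[^\]]+\]")


def missing_non_speech(vtt_text: str) -> bool:
    """True when no cue contains a bracketed speaker label or sound cue."""
    cue_lines = [
        line for line in vtt_text.splitlines()
        if line.strip()
        and "-->" not in line              # skip timing lines
        and not line.startswith("WEBVTT")  # skip the header
    ]
    return not any(BRACKETED.search(line) for line in cue_lines)


# Hypothetical files: dialogue only versus dialogue with a speaker label.
subtitles_only = "WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nWelcome to the tour.\n"
with_speaker = "WEBVTT\n\n00:00:00.000 --> 00:00:02.000\n[James:] Welcome to the tour.\n"

flag_subtitles = missing_non_speech(subtitles_only)
flag_captions = missing_non_speech(with_speaker)
```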

Assuming a transcript replaces audio descriptions at Level AA. At Level A, yes. At Level AA (the standard courts and the DOJ reference), you need actual audio descriptions for videos where visual content isn't described in the dialogue. A transcript on the page isn't sufficient.

Ignoring the video player itself. You can have perfect captions and transcripts, but if your video player can't be operated by keyboard alone, you fail SC 2.1.1. This is particularly common with custom-built players and some embedded third-party players.

Not captioning background music and sound effects. If a dramatic music swell signals a mood change, or a notification sound indicates an action, that audio carries meaning. Captions should note it: [suspenseful music], [notification chime], [audience laughter].

Publishing live video without real-time captions. Level AA (SC 1.2.4) requires live captions for live video with audio. If you're live streaming events, webinars, or presentations, you need a real-time captioning service or a qualified CART provider.

Video Accessibility Checklist

Use this to audit your existing video content or plan new videos:

| Requirement | Level | What to check |
| --- | --- | --- |
| Synchronized closed captions on all pre-recorded video | A | Captions include dialogue, speaker ID, sound effects, music |
| Caption accuracy reviewed (99% target) | A | Auto-captions edited for proper nouns, technical terms |
| Text alternative for audio-only content | A | Podcast episodes have transcripts |
| Audio description or transcript for visual content | A | Visual-only info is described in audio or transcript |
| Audio descriptions provided (not just transcript) | AA | Visual content narrated for blind users |
| Live captions for live video | AA | Real-time captioning service in place |
| Player controls keyboard accessible | A | Play, pause, volume, seek, CC all work via keyboard |
| Player controls have ARIA labels | A | Screen readers can announce each control's function |
| No keyboard traps in the player | A | Users can tab in and out of the player |
| Caption file format is VTT or SRT | Best practice | Web-standard formats ensure cross-player compatibility |
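If you track audit results in a script or spreadsheet export, the checklist can be reduced to a small lookup. This sketch uses hypothetical short keys and assigns each item the level its success criterion carries in WCAG 2.1 (SC 2.1.1, 2.1.2, and 4.1.2 are Level A). It covers only the video items listed here, so a result of "AA" means the video checklist is clear, not that the whole site conforms.

```python
# Checklist items mapped to WCAG 2.1 levels (keys are illustrative).
CHECKLIST = {
    "captions_prerecorded": "A",       # SC 1.2.2
    "audio_only_alternative": "A",     # SC 1.2.1
    "description_or_transcript": "A",  # SC 1.2.3
    "keyboard_operable": "A",          # SC 2.1.1
    "no_keyboard_trap": "A",           # SC 2.1.2
    "controls_named": "A",             # SC 4.1.2
    "captions_live": "AA",             # SC 1.2.4
    "audio_description": "AA",         # SC 1.2.5
}


def conformance(results: dict) -> tuple:
    """Return (level, failures): the highest level fully met and what failed."""
    failures = [item for item in CHECKLIST if not results.get(item, False)]
    failed_levels = {CHECKLIST[item] for item in failures}
    if "A" in failed_levels:
        level = "none"
    elif "AA" in failed_levels:
        level = "A"
    else:
        level = "AA"
    return level, failures


# Example audit: everything passes except audio descriptions.
audit = dict.fromkeys(CHECKLIST, True)
audit["audio_description"] = False
level, failures = conformance(audit)
```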

The regulatory framework can be confusing. Here's a quick orientation:

ADA Title II applies to state and local government entities. The DOJ's April 2024 final rule requires WCAG 2.1 Level AA compliance for web content. Deadline: April 24, 2026 for entities serving populations of 50,000 or more; April 26, 2027 for smaller entities.

ADA Title III applies to private businesses ("places of public accommodation"). There's no explicit federal deadline, but courts have increasingly ruled that websites are covered, and WCAG 2.1 AA is the standard courts reference. If you're a business with a website that includes video, you're in scope — and 4,187 lawsuits in 2024 show enforcement is active.

Section 508 applies to federal agencies and their contractors. It directly references WCAG 2.0 Level AA (with ongoing alignment to 2.1). If you sell to the federal government or produce content for federal agencies, your videos must comply.

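All three frameworks converge on the same practical standard: WCAG 2.1 Level AA. Section 508 formally references WCAG 2.0, but content that meets 2.1 Level AA also meets 2.0 Level AA. If you meet 2.1 AA, you're covered for virtually all U.S. regulatory contexts.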

What's the difference between closed captions and subtitles?

Closed captions include all meaningful audio in a video: dialogue, speaker identification, sound effects (door slams, phone rings), and music cues (suspenseful music, upbeat score). They're designed for deaf and hard-of-hearing viewers. Subtitles only translate spoken dialogue and are designed for viewers who speak a different language. For WCAG compliance, you need closed captions — subtitles alone don't satisfy the captioning requirements because they omit non-dialogue audio information.

Do I need audio descriptions for every video?

Not necessarily. Audio descriptions are required at WCAG Level AA (SC 1.2.5) for pre-recorded video where important visual content is not already described in the audio track. If your video is a talking-head presentation where the speaker verbally describes everything on screen, including charts, slides, and on-screen text, no additional audio description is needed. But if your video shows visual demonstrations, text overlays, or actions that aren't narrated, you need an audio description track for Level AA compliance.

Are YouTube auto-generated captions WCAG compliant?

No, not on their own. YouTube's auto-generated captions typically achieve 85 to 95 percent accuracy, which falls short of the 99 percent accuracy standard the FCC uses as a benchmark. Auto-captions also miss speaker identification, don't include sound effect descriptions, and frequently mishandle proper nouns and technical terms. They're a useful starting point for generating a draft caption file, but you need to review and edit them for accuracy before relying on them for compliance.

What's the difference between open and closed captions?

Open captions are permanently burned into the video — they're always visible and can't be turned off. Closed captions are delivered as a separate text track (usually a VTT or SRT file) that viewers can toggle on or off. Both satisfy WCAG captioning requirements. Closed captions are generally preferred because they give users control and can be styled differently by different players. Open captions are useful for social media platforms that don't support caption tracks natively.

Does the April 2026 ADA deadline apply to my business?

The April 24, 2026 deadline specifically applies to ADA Title II, which covers state and local government entities with populations of 50,000 or more. Smaller public entities have until April 26, 2027. Private businesses fall under ADA Title III, which has no specific federal deadline but faces active litigation. Over 4,000 digital accessibility lawsuits were filed in 2024 alone. Even if the April 2026 deadline doesn't technically apply to your private business, courts increasingly reference WCAG 2.1 Level AA as the standard, making proactive compliance the safer path.

What caption file format should I use?

WebVTT (VTT) is the recommended format for web video. It's the native caption format for HTML5 video players, supports styling options, and is widely supported across browsers and video platforms. SRT (SubRip) is the other common format and works well for most platforms but has fewer styling options than VTT. Both formats are simple text files with timestamps and caption text. Most video hosting platforms accept both. If you're embedding video on a website using an HTML5 player, VTT is the standard choice.
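Because the two formats are so similar, converting an existing SRT file to VTT is mostly mechanical: add the `WEBVTT` header, drop the numeric cue indexes, and switch the millisecond separator from a comma to a period. The sketch below shows that conversion under those assumptions. The function name and sample are illustrative, and it does not handle SRT styling tags or positioning.

```python
import re

# SRT writes 00:00:02,500 where WebVTT writes 00:00:02.500.
SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT caption text to WebVTT text."""
    body = srt_text.replace("\r\n", "\n").strip()
    out = ["WEBVTT", ""]
    for block in body.split("\n\n"):
        rows = block.split("\n")
        # Drop the numeric cue index that SRT puts before each timing line.
        if rows and rows[0].strip().isdigit():
            rows = rows[1:]
        if not rows or "-->" not in rows[0]:
            continue  # not a cue block
        rows[0] = SRT_TIME.sub(r"\1.\2", rows[0])
        out.extend(rows)
        out.append("")
    return "\n".join(out)


# Hypothetical SRT input.
sample_srt = """1
00:00:00,000 --> 00:00:02,500
[upbeat music]

2
00:00:02,500 --> 00:00:06,000
[James:] Welcome.
"""
converted = srt_to_vtt(sample_srt)
```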

Can I just add a transcript instead of captions?

A transcript is not a substitute for captions. WCAG SC 1.2.2 (Level A) specifically requires synchronized captions for pre-recorded video with audio. A transcript doesn't provide the same experience because it's not time-synchronized with the video content. However, transcripts serve a different purpose: they satisfy SC 1.2.3 at Level A as a media alternative, they're useful for search engine optimization, and they help users who prefer reading over watching. Best practice is to provide both captions and a transcript.

How do I test my video player for keyboard accessibility?

Unplug your mouse and try to use your video player with only the keyboard. Tab to the player and verify you can reach all controls: play/pause, volume, seek bar, captions toggle, and fullscreen. Each control should show a visible focus indicator (usually an outline) when selected. Press Enter or Space to activate controls. Most importantly, verify you can Tab out of the player after interacting with it. Keyboard traps, where focus gets stuck inside the player, are a common failure. If any control is unreachable, the player fails WCAG SC 2.1.1; if it traps focus, it fails SC 2.1.2.

Is YouTube's embedded player WCAG compliant?

YouTube's player on youtube.com meets many accessibility standards, but the embedded iframe version that appears on your website has known limitations. Keyboard navigation through embedded player controls can be inconsistent, caption toggling behavior differs from the native YouTube experience, and you have limited control over the player's accessibility features since YouTube controls the embedded iframe code. This doesn't automatically make YouTube embeds non-compliant, but it means you're depending on a third party to maintain accessibility features on your site. For organizations with strict compliance requirements, using a dedicated video player where you control the caption tracks and keyboard navigation provides more predictable compliance.

How much does video captioning cost?

Costs vary widely depending on your approach. Auto-captioning services like Otter.ai or Descript are included in subscriptions starting around $10 to $25 per month, but you'll still need to manually review and edit for accuracy. Professional human captioning services typically charge $1 to $3 per minute of video through providers like Rev, 3Play Media, or Verbit. For a 10-minute video, that's $10 to $30 for professional captions. Given that viewers are 80 percent more likely to finish a captioned video and that captions lifted views by 12 percent in Facebook's research, the ROI on captioning is strong even without the compliance angle.
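The per-minute arithmetic is easy to apply to a whole video library. This sketch uses the $1 to $3 per minute range quoted above as default rates. The function name is hypothetical and actual vendor pricing will differ.

```python
def captioning_cost(minutes: float,
                    low_rate: float = 1.0,
                    high_rate: float = 3.0) -> tuple:
    """Return the (low, high) cost estimate in dollars for human captioning."""
    return minutes * low_rate, minutes * high_rate


# The 10-minute example from the text.
low, high = captioning_cost(10)

# A hypothetical library of 40 videos averaging 6 minutes each.
library_low, library_high = captioning_cost(40 * 6)
```

For the 240-minute library, the estimate works out to between $240 and $720.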