Video and audio: accessibility for developers

  • Author: Access iQ ®
  • Date: 1 Feb 2013
  • Access: Premium

Quick facts

The proliferation of online multimedia content has the potential to make web content more accessible to all people in convenient and easy to absorb formats, but there is also the potential to render content even less accessible if appropriate steps are not taken.

  • Provide a transcript for audio content
  • Provide captions on video content
  • Provide an audio equivalent or full text alternative for video content
  • Ensure media player accessibility
  • Ensure keyboard accessibility and avoid keyboard traps
  • Give the user control to start, pause and stop the video or audio content

The web has seen a proliferation of video and audio content in recent years, as bandwidth has increased, and browsers and media players have improved their ability to deliver such content smoothly.

Multimedia once referred to CD-ROMs and highly interactive applications that let users choose among a variety of content across different types of media, from video and sound through to moving images, text and other interactive elements. More recently, multimedia usually refers to audio and/or video content played within a browser, and that is where this topic will focus.

The growth of online multimedia has the potential to make web content more accessible to all people in convenient and easy to absorb formats, but there is also the potential to render content even less accessible if appropriate steps are not taken.

WCAG 2.0 refers to multimedia content as time-based media because both audio content and video content depend on the passing of time to deliver the full content. It includes both pre-recorded and live-streamed content, allowing for slight differences in the requirements for the live video/audio content.

Guideline 1.2 Time-based media states:

Provide alternatives for time-based media.

This has ramifications for the provision of all online video and audio content, including:

  • audio-only
  • video-only
  • audio-video
  • audio and/or video with interaction

There are nine success criteria involved — three that establish Level A conformance, two that take conformance to Level AA and another four to reach Level AAA conformance.

WCAG 2.0 notes that where time-based media presents a version of content that is also provided in text form, the time-based media does not need to be made accessible. This is because the “main” text version provides the accessible alternative to the video already. For example, if a webpage contains a complete text transcript of a speech, and a video of the speech being made is also provided, then the video does not need to be made accessible by adding captions.

Note: There is no restriction whether a single version of multimedia is enhanced to be made accessible in whatever ways required (for example by the inclusion of open or closed captions), or two versions are made available: one standard (without captions) and one accessible (with captions). In some circumstances, the former approach may be more appropriate, as it means maintaining only one resource.

You may also come across the term synchronised media. This refers to content where audio and video are synchronised to present meaningful content, for example a video slideshow with an audio track. Such content not only needs to be made accessible, but made accessible in a way that maintains the synchronisation.

It is permissible to pause an element of the presentation in order to include accessible details, but the synchronisation must remain intact. For example, if the visual element presented requires audio descriptions, the previously existing audio may need to be paused or extended in order to fit the description at an appropriate place that does not change the existing synchronisation.
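The pausing approach described above comes down to simple arithmetic: a description that is longer than the gap it sits in needs the video held for the difference. The following is a minimal TypeScript sketch of that calculation, not a prescribed method. The `DescriptionSlot` and `PausePoint` shapes and the function name are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape: one audio description and the gap in the existing audio it must fit into.
interface DescriptionSlot {
  gapStart: number;           // seconds into the video where the gap begins
  gapLength: number;          // seconds of natural silence available
  descriptionLength: number;  // seconds the recorded description lasts
}

interface PausePoint {
  at: number;        // video time (seconds) at which playback is held
  pauseFor: number;  // seconds the video is held while the description finishes
}

// Returns the points where the video must pause so that each description
// finishes before the existing audio resumes. Descriptions that fit inside
// their gap need no pause, so the original synchronisation is untouched.
function computePausePoints(slots: DescriptionSlot[]): PausePoint[] {
  const pauses: PausePoint[] = [];
  for (const slot of slots) {
    const overflow = slot.descriptionLength - slot.gapLength;
    if (overflow > 0) {
      // Hold the video at the end of the gap, just before the existing audio resumes.
      pauses.push({ at: slot.gapStart + slot.gapLength, pauseFor: overflow });
    }
  }
  return pauses;
}
```

Because the video and its original audio are paused together, their relationship to each other does not change; only the total running time grows.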

At all points, the aim is to ensure that multimedia content is accessible to people with disabilities. WCAG 2.0 is designed to offer pragmatic, testable solutions to help verify this goal is achieved.

Making video and audio content accessible

To meet WCAG 2.0 Level A requirements for pre-recorded audio and video content, you must provide:

  • For audio only content: full-text transcripts
  • For video only content: audio equivalent or full text alternative
  • For pre-recorded audio-video content: audio description or full text alternative, as well as captions, either closed or open

To meet WCAG 2.0 Level AA, you must additionally provide captions for live content and audio descriptions for pre-recorded audio-video content. At Level AA the audio description is mandatory; the choice between audio description and a full text alternative applies only at Level A.
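The requirements above can be summarised as a small lookup. The sketch below is one way to encode them in TypeScript, covering pre-recorded content only (the Level AA requirement for live captions is not modelled). The type and function names are hypothetical.

```typescript
type MediaKind = "audio-only" | "video-only" | "audio-video";
type Level = "A" | "AA";

// Summarises the pre-recorded requirements listed above. Each inner array
// is a set of options; providing any one option satisfies that requirement.
function requiredAlternatives(kind: MediaKind, level: Level): string[][] {
  if (kind === "audio-only") {
    return [["full-text transcript"]];
  }
  if (kind === "video-only") {
    return [["audio equivalent", "full text alternative"]];
  }
  // Audio-video: captions are always required. The choice between audio
  // description and a full text alternative exists only at Level A.
  const description = level === "A"
    ? ["audio description", "full text alternative"]
    : ["audio description"];
  return [["captions (open or closed)"], description];
}
```

For example, `requiredAlternatives("audio-video", "AA")` returns two requirements, captions and audio description, with no alternative offered for the second.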

Transcripts (audio-only)

Transcripts are required for all pre-recorded audio-only content. A transcript is a text version of all dialogue and all meaningful sounds in an audio file.

For example, if a character in a video hears a dog bark off screen and turns her head in reaction, the transcript should include something like [dog barks off-screen to the right, Jane looks towards the dog]. Sounds that are not meaningful do not need to be described. For instance, if two people are talking on a park bench while children play in the background, and the children have nothing to do with the content, their noise can be left out.

We recommend that transcripts and full text alternatives be provided in HTML so they can be accessed via a web browser. Since the person is already using a browser, it is not good practice to make them download a file and open it in another application to continue with your content. Formats such as PDF and DOC/DOCX (Word) often require the user to download the file and open it in separate software, which may need to be installed first.
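As an illustration of an HTML transcript, the sketch below turns a list of transcript lines into a simple HTML fragment, writing meaningful sounds in square brackets as in the example above. The `TranscriptLine` shape, the function names and the choice of elements are assumptions made for this example, not a required structure.

```typescript
interface TranscriptLine {
  speaker?: string;  // omitted for descriptions of meaningful sounds
  text: string;
}

// Escape characters that have special meaning in HTML.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a transcript as a heading followed by one paragraph per line.
function renderTranscript(title: string, lines: TranscriptLine[]): string {
  const paragraphs = lines.map((line) => {
    const text = escapeHtml(line.text);
    return line.speaker
      ? `  <p><strong>${escapeHtml(line.speaker)}:</strong> ${text}</p>`
      : `  <p>[${text}]</p>`;
  });
  return ["<section>", `  <h2>${escapeHtml(title)}</h2>`]
    .concat(paragraphs, ["</section>"])
    .join("\n");
}
```

The resulting fragment can be placed on the same page as the audio player, or on a page linked directly beside it, so the user never has to leave the browser.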

Audio equivalent or full text alternative (video-only)

Either an audio equivalent or full text alternative is required for all pre-recorded video-only content.

An audio equivalent is an audio track that describes the information in the video. A full text alternative is a text transcript that describes the same information that is displayed visually in the video.

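Either option satisfies Success Criterion 1.2.1: Audio-only and video-only (pre-recorded), which states:

For pre-recorded audio-only and pre-recorded video-only media, the following are true, except when the audio or video is a media alternative for text and is clearly labelled as such: (Level A)

  • Pre-recorded audio-only: An alternative for time-based media is provided that presents equivalent information for pre-recorded audio-only content.
  • Pre-recorded video-only: Either an alternative for time-based media or an audio track is provided that presents equivalent information for pre-recorded video-only content.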

Audio description or full text alternative (pre-recorded audio-video)

Either an audio description or a full text alternative (described above) is required for all pre-recorded audio-video content. Any events that are conveyed only visually must be described, in the same way as the text descriptions mentioned at “Transcripts (audio-only)” above.

Audio description is descriptive narration of all the important visual elements of a video for the benefit of people who are blind or vision impaired. Audio description is written and recorded so that it is presented in between dialogue and other important audio elements. For example, if a scene shows a character jogging in the park, a narrator speaking in a neutral voice should say something like [John jogs in the park] to describe the meaningful action to people who cannot see it.

To meet WCAG 2.0 Level A for pre-recorded audio-video content, there is a choice between audio description and a full text alternative, as outlined in Success Criterion 1.2.3 Audio description or media alternative (pre-recorded):

An alternative for time-based media or audio description of the pre-recorded video content is provided for synchronised media, except when the media is a media alternative for text and is clearly labelled as such. (Level A)

To meet WCAG 2.0 Level AA, the option for full text alternative is not available and audio description must be provided for all pre-recorded audio-video content, as outlined in Success Criterion 1.2.5 Audio description (pre-recorded):

Audio description is provided for all pre-recorded video content in synchronised media. (Level AA)


Captions

Captions are required for all pre-recorded audio-video content whose audio does not already have a non-auditory alternative. Just as video requires visual content to be available in an alternate medium, like text or sound, audio is required to be in an alternate format for people who cannot hear it.

Captions are a transcription of all audio elements of a video, including dialogue, sound effects, song lyrics, and descriptions of the lyrics or characteristics of music if they are important in the context of the video. Captions generally appear at the bottom of the screen and are synchronised with the video's audio track.
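To make the idea of synchronised captions concrete, the sketch below writes a list of timed cues out as a WebVTT file, a text caption format that HTML5 video players can read. WebVTT is an assumption here, as this topic does not prescribe a caption format, and the `CaptionCue` shape and function names are hypothetical. Cue text is written as-is, so text containing blank lines or the sequence `-->` is not handled.

```typescript
interface CaptionCue {
  start: number;  // seconds from the start of the video
  end: number;    // seconds from the start of the video
  text: string;   // dialogue, or a bracketed sound description
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format seconds as a WebVTT timestamp: hh:mm:ss.ttt
function toTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Build the file: a "WEBVTT" header, then one block per cue separated by blank lines.
function toWebVtt(cues: CaptionCue[]): string {
  const blocks = cues.map(
    (cue) => `${toTimestamp(cue.start)} --> ${toTimestamp(cue.end)}\n${cue.text}`
  );
  return "WEBVTT\n\n" + blocks.join("\n\n") + "\n";
}
```

Each cue's start and end times are what keep the caption text aligned with the audio track.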

To satisfy WCAG 2.0 Level A, captions may be either 'open' or 'closed'. Open captions are encoded onto a video (i.e. they are part of the video image itself) and will be seen by anyone who watches it. Closed captions are a text overlay and can be turned on or off by the viewer. The means to control the captions must also be accessible.
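One way to offer closed captions in an HTML5 player is the `track` element with `kind="captions"`, which lets the browser's own controls switch the captions on or off. The sketch below only assembles that markup as a string. The function name and the `CaptionTrack` shape are hypothetical, and the attribute values are assumed to be trusted, since no escaping is performed.

```typescript
interface CaptionTrack {
  src: string;      // URL of a caption file (for example, a WebVTT file)
  srclang: string;  // language tag such as "en"
  label: string;    // name shown in the player's caption menu
}

// Build markup for a video with native controls and closed caption tracks.
// There is no autoplay attribute, so the user decides when playback starts.
function videoMarkup(videoSrc: string, tracks: CaptionTrack[]): string {
  const trackTags = tracks.map(
    (t) => `  <track kind="captions" src="${t.src}" srclang="${t.srclang}" label="${t.label}">`
  );
  return [`<video controls src="${videoSrc}">`]
    .concat(trackTags, ["</video>"])
    .join("\n");
}
```

Whether the native caption control is itself keyboard and screen reader accessible varies between browsers, so it still needs to be tested.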

WCAG 2.0 Level AA conformance requires that captions also be provided for the audio component of live video. This requires the use of a live captioning service, something most organisations will outsource.

Media player accessibility

As well as providing alternatives to the original media, the media player itself must also be accessible.

You may, for instance, have provided audio description for people who are blind or vision impaired. That is unhelpful, however, if that person cannot access the controls of the media player to play the video, or turn the audio description on or off.

Some of the common accessibility issues found in media players are:

  • Keyboard accessibility
  • Screen reader compatibility
  • Keyboard traps

Keyboard accessibility

Keyboard accessibility is the ability to operate the media player and its controls using only a keyboard or keyboard alternative. This is particularly important for people who are blind or vision impaired and navigate a website using only a keyboard, and for people with a physical disability who may use a keyboard alternative, such as a switch input system.

There are two steps involved in assessing keyboard accessibility:

  1. Determine every function a user can perform using the mouse, the keyboard, or both together. In the case of a media player, that may be playing or pausing the video, stopping the video, turning closed captions on or off, changing the volume of the audio, or moving through the video via the video scrubber.
  2. Check that each function identified can be performed using only the keyboard. Every button and control the user is presented with must be easy to operate from the keyboard.
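The two steps above can be recorded as a simple inventory and check. The sketch below assumes the inventory is filled in by hand while testing a player; the `PlayerFunction` shape, the function name and the sample data are hypothetical.

```typescript
interface PlayerFunction {
  name: string;
  mouse: boolean;     // can be performed with the mouse
  keyboard: boolean;  // can be performed with the keyboard alone
}

// Step 2: list the functions that cannot be performed with the keyboard alone.
function keyboardGaps(functions: PlayerFunction[]): string[] {
  return functions.filter((f) => !f.keyboard).map((f) => f.name);
}

// Step 1: an inventory for a hypothetical player, filled in during manual testing.
const inventory: PlayerFunction[] = [
  { name: "play/pause", mouse: true, keyboard: true },
  { name: "stop", mouse: true, keyboard: true },
  { name: "toggle captions", mouse: true, keyboard: false },
  { name: "change volume", mouse: true, keyboard: true },
  { name: "scrub", mouse: true, keyboard: false },
];
```

For this sample inventory, `keyboardGaps(inventory)` reports "toggle captions" and "scrub" as the functions that still need keyboard support.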

Success Criterion 2.1.1 Keyboard covers this requirement, and a separate topic discusses keyboard accessibility in more detail.

Screen reader compatibility

Another common barrier in media players occurs when the buttons and menus cannot be accessed, and therefore cannot be read out, by a screen reader. Screen reader compatibility goes hand in hand with keyboard accessibility.

A person who is blind or vision impaired needs to be able to operate the media player controls via a keyboard, and a screen reader must be able to read out each button label so they know what that button or control does and how to operate it. For example, a volume control must communicate its state ("60% ... 70%") as it is being operated.
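For a custom-built volume control, one common way to expose its name and state to a screen reader is the WAI-ARIA slider pattern. The sketch below computes the attributes such a control might carry at a given volume. The function name and the "Volume" label are assumptions for this example, and a native range input, where it can be used, exposes equivalent information without this extra work.

```typescript
// Attributes a custom volume slider could expose so assistive technology
// can announce its name and current state. Values are strings because
// HTML attribute values are strings.
function volumeSliderAttributes(percent: number): { [name: string]: string } {
  // Keep the reported value inside the 0 to 100 range.
  const clamped = Math.max(0, Math.min(100, Math.round(percent)));
  return {
    role: "slider",
    tabindex: "0",               // makes the control reachable with the Tab key
    "aria-label": "Volume",
    "aria-valuemin": "0",
    "aria-valuemax": "100",
    "aria-valuenow": String(clamped),
    "aria-valuetext": clamped + "%",  // read out as the volume changes
  };
}
```

The attributes need to be updated every time the volume changes, otherwise the screen reader will keep announcing a stale value.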

Keyboard traps

Keyboard traps occur when a user moves to a component in a webpage with the keyboard (e.g. the video player) but then cannot move away from it, thus 'trapping' them. This is a common problem with media players that use plug-ins like Adobe Flash Player. The Flash player can be made more accessible, but there are alternatives for playing video, like YouTube's HTML5 video player, that avoid some of the common pitfalls by remaining in HTML.

To satisfy Success Criterion 2.1.2 No keyboard trap, users must be able to move away from the media player (or any other component) using the keyboard only. Furthermore, if moving away requires anything other than the arrow or tab keys (or another standard exit method), users must be told, in an accessible way, how to do it.
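In an HTML player with custom key handling, a trap is often created by a handler that swallows every key press, including Tab. The sketch below shows one way to keep the decision explicit: only the keys the player actually uses are consumed, and everything else is left to the browser. The key names follow the standard `KeyboardEvent.key` values; the action names, the key assignments and the function itself are hypothetical.

```typescript
type PlayerAction = "toggle-play" | "volume-up" | "volume-down" | "none";

interface KeyDecision {
  action: PlayerAction;
  preventDefault: boolean;  // true only when the player consumes the key
}

// Decide what a focused player control does with a key press.
// Tab is never consumed, so keyboard focus can always leave the player.
function decideKey(key: string): KeyDecision {
  switch (key) {
    case " ":
    case "Enter":
      return { action: "toggle-play", preventDefault: true };
    case "ArrowUp":
      return { action: "volume-up", preventDefault: true };
    case "ArrowDown":
      return { action: "volume-down", preventDefault: true };
    default:
      // Includes "Tab": leave the browser's default focus movement alone.
      return { action: "none", preventDefault: false };
  }
}
```

In a browser, a keydown listener would call this with `event.key` and call `event.preventDefault()` only when the returned flag is true.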

Other considerations

This is not an exhaustive list of how to make a media player accessible, but it does identify some of the common accessibility barriers found in media players.

There are other considerations around media players and time-based media, such as:
