Skip to 0 minutes and 4 seconds PROFESSOR MIKE WALD: The sound in a video, film, or TV programme can present barriers for people with impaired hearing, such as Susan or Lars. To illustrate this, how well can you follow what's happening in this clip when we remove the sound? [VIDEO PLAYING WITH NO SOUND] Like Susan or Lars, you can understand much more when captions are provided.

Skip to 1 minute and 35 seconds Susan and Lars can get by with captions, but some deaf people who know sign language may prefer on-screen signing. But this is not always readily available.

Skip to 2 minutes and 29 seconds Captions, or subtitles for deaf and hard of hearing people as they're called in some countries, provide text information on the screen that can show what people are saying, explain which character is speaking, and describe background noise, such as a telephone ringing, a dog barking, or children playing. The picture in a video or TV programme can present barriers for people with impaired vision, such as Carol or Maria. How well can you follow what's happening in this clip when we remove the picture? [PHONE RINGING]

Skip to 3 minutes and 11 seconds TAMSYN: Hello.

Skip to 3 minutes and 12 seconds ANNA: Hi. I am really sorry. I bumped into a friend, and I got talking.

Skip to 3 minutes and 16 seconds TAMSYN: I've been waiting 20 minutes. Hurry up.

Skip to 3 minutes and 19 seconds ANNA: I will. I'm on my way. I promise I won't be too long.

Skip to 3 minutes and 21 seconds TAMSYN: OK.

Skip to 3 minutes and 33 seconds ANNA: Hi. I'm so sorry.

Skip to 3 minutes and 36 seconds TAMSYN: Don't worry about it. You're here now.

Skip to 3 minutes and 42 seconds MIKE WALD: Audio description can help by providing some of the missing visual information through speech.

Skip to 3 minutes and 49 seconds AUDIO DESCRIPTION: Tamsyn is waiting for Anna outside the theatre. She looks annoyed and checks her watch. She takes her phone out of one of her pockets and looks at the phone. It's Anna calling.

Skip to 4 minutes and 2 seconds TAMSYN: Hello.

Skip to 4 minutes and 3 seconds ANNA: Hi. I am really sorry. I bumped into a friend, and I got talking.

Skip to 4 minutes and 7 seconds TAMSYN: I've been waiting 20 minutes. Hurry up.

Skip to 4 minutes and 10 seconds ANNA: I will. I'm on my way. I promise I won't be too long.

Skip to 4 minutes and 13 seconds TAMSYN: OK.

Skip to 4 minutes and 15 seconds AUDIO DESCRIPTION: Tamsyn puts the phone away. 15 minutes later, Tamsyn is still waiting and is visibly annoyed. Anna finally arrives.

Skip to 4 minutes and 25 seconds ANNA: Hi. I'm so sorry. I'm sorry.

Skip to 4 minutes and 27 seconds TAMSYN: Don't worry about it. You're here now.

Skip to 4 minutes and 30 seconds AUDIO DESCRIPTION: They both start walking into the theatre.

Skip to 4 minutes and 33 seconds MIKE WALD: Here's a quick run through of the barriers that can be overcome by using captions, subtitles, sign language, audio descriptions, and transcripts. Captions are normally a two-line block of text that is overlaid on the screen. When people speak very fast, the block of text may disappear off the screen faster than you are able to read it. And so captioners often simplify rather than present every word. Captioners have time to optimise block captions for recorded video, but for live TV, verbatim captioning of very fast speech needs to be created using chorded phonetic keyboards, whereas average speech rates can be captioned by repeating and simplifying what is being said using speech recognition.

Skip to 5 minutes and 28 seconds Live captions are harder to read than block captions, because they scroll upwards and may contain errors. Subtitles are a translation of a language spoken in the video into a target language. Sign language is preferred by deaf viewers whose first language is sign language and who find reading captions too difficult. Deaf people in different countries can have their own sign language. Transcripts can be static or interactive. Static transcripts only provide the text transcription of the words that have been spoken and so cannot convey which text corresponds with which image in a video. An interactive transcript is not restricted to two lines, as it is presented next to a video and highlights the words as they are spoken.

Skip to 6 minutes and 25 seconds Audio description is used to provide a spoken description of important information that would otherwise be only available visually. Audio descriptions can only be inserted when no other speech or sounds occur. And so the short time available often means that only descriptions that are absolutely essential to understanding can be provided. Web systems are available that can pause the video to allow extra time for the audio description, for example, YouDescribe. We hope you have found this brief overview of captions, subtitles, sign language, transcripts, and audio descriptions helpful.

Skip to 7 minutes and 10 seconds Each country has its own regulations for captions or audio descriptions for both TV and internet, and these can vary in terms of the accuracy, number of hours, types of broadcasts, and the times they are shown. We look forward to your comments about your experiences of text or sign language captioning, and audio description, and the regulations in your country.

Video and audio barriers

The video in this step helps you understand the barriers video and audio can present to people with hearing or vision impairments.

  • A clip is shown with no sound. How much did you understand without sound?

  • The same clip is shown again but with captions and sign language. How much more did you understand now?

  • A second clip is presented with no images shown. How much did you understand?

  • The same clip is presented with audio description. How much more did you now understand?

  • The same clip is presented with images as well as audio description. How much more did you now understand?


What can subtitles and text do, and not do?

Subtitles or captions for hard of hearing people provide on-screen text for information that would otherwise be available only through sound. As well as providing a text transcription of what is being said, captions can explain which character is speaking and indicate, for example, that a telephone is ringing or a dog is barking. Text cannot, however, convey tone of voice.
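As a rough illustration of how a caption file can carry speaker names and sound descriptions, the Python sketch below builds a small WebVTT document. WebVTT is not mentioned in this step; it is assumed here as one common web caption format. The cue timings are illustrative, and the names `Cue`, `format_timestamp` and `to_webvtt` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    start: float                    # start time in seconds
    end: float                      # end time in seconds
    text: str                       # spoken words or a sound description
    speaker: Optional[str] = None   # None marks a non-speech sound


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: List[Cue]) -> str:
    """Serialise cues, tagging speech with a voice span and sounds with brackets."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        if cue.speaker:
            lines.append(f"<v {cue.speaker}>{cue.text}")
        else:
            lines.append(f"[{cue.text}]")
        lines.append("")
    return "\n".join(lines)


# Illustrative timings only; they are not taken from a real caption file.
cues = [
    Cue(188.0, 191.0, "PHONE RINGING"),
    Cue(191.0, 192.0, "Hello.", speaker="Tamsyn"),
    Cue(192.0, 196.0, "Hi. I am really sorry.", speaker="Anna"),
]
vtt_text = to_webvtt(cues)
print(vtt_text)
```

The voice span (`<v Tamsyn>`) identifies the speaker, and the bracketed cue describes a sound. Nothing in the file records tone of voice.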

TV captioning

TV captions provide only two lines of text, so when people are speaking very quickly the text may disappear from the screen too quickly to be read.

TV captioners therefore often simplify the captions rather than present them verbatim. Pre-prepared block captions can be used for recorded videos, whereas live captioning uses scrolling captions, which are harder to read.

Highly trained captioners using chorded phonetic keyboards are necessary for verbatim captioning of very fast speech (e.g. 200 words per minute), whereas slower speech can be captioned by repeating and simplifying what is being said using speech recognition.
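The arithmetic behind this trade-off can be sketched in Python. The functions below estimate the reading rate a caption block demands from its word count and its time on screen. The 160 words per minute threshold and the function names are illustrative assumptions, not a published standard.

```python
def words_per_minute(text: str, seconds_on_screen: float) -> float:
    """Reading rate a viewer needs in order to finish the caption in time."""
    if seconds_on_screen <= 0:
        raise ValueError("seconds_on_screen must be positive")
    return len(text.split()) * 60.0 / seconds_on_screen


def is_too_fast(text: str, seconds_on_screen: float, max_wpm: float = 160.0) -> bool:
    """True if the caption demands a faster rate than the assumed threshold."""
    return words_per_minute(text, seconds_on_screen) > max_wpm


caption = "I am really sorry. I bumped into a friend, and I got talking."

# 13 words shown for 3 seconds demand 260 words per minute.
print(words_per_minute(caption, 3.0), is_too_fast(caption, 3.0))
# The same caption shown for 6 seconds demands 130 words per minute.
print(words_per_minute(caption, 6.0), is_too_fast(caption, 6.0))
```

A caption that is too fast can be given more time on screen, or simplified so that it contains fewer words.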

Accuracy measures normally just count the number of errors in captions, without taking into account the relative importance of the words in conveying meaning. This makes it very difficult to measure the accuracy of simplified captions and compare them with verbatim captions.
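A minimal Python sketch of such an unweighted count follows. It computes the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and a caption, divided by the reference length, a figure often called word error rate. The text does not name a specific metric, so using word error rate here is an assumption, and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, caption: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = caption.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j caption words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from the caption
                curr[j - 1] + 1,     # extra word in the caption
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "i have been waiting twenty minutes hurry up"
simplified = "been waiting twenty minutes hurry up"           # meaning kept
misheard = "i have been waiting twenty minutes worry up"      # meaning changed

print(word_error_rate(reference, simplified))  # two deletions out of 8 words
print(word_error_rate(reference, misheard))    # one substitution out of 8 words
```

Here the simplified caption keeps the meaning but scores worse than the caption containing a misleading substitution, because every word counts equally.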

Web captioning

On the web it is possible to provide many more lines of text than on a TV, and also to show automatically which word is being spoken by underlining or highlighting it.
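One way a web player might decide which word to highlight is to look up the current playback time in a sorted list of word start times. The Python sketch below assumes per-word timings are already available. The timings and the names `current_word_index` and `highlight` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional


def current_word_index(start_times: List[float], playback_time: float) -> Optional[int]:
    """Index of the latest word whose start time is at or before playback_time.

    start_times must be sorted in ascending order. Returns None if playback
    has not yet reached the first word.
    """
    index = bisect_right(start_times, playback_time) - 1
    return index if index >= 0 else None


def highlight(words: List[str], index: Optional[int]) -> str:
    """Render the transcript with the current word wrapped in brackets."""
    return " ".join(
        f"[{word}]" if i == index else word for i, word in enumerate(words)
    )


# Illustrative word timings in seconds; not taken from the real video.
words = ["Don't", "worry", "about", "it."]
start_times = [216.0, 216.4, 216.8, 217.1]

print(highlight(words, current_word_index(start_times, 216.5)))
```

A binary search keeps the lookup fast even for long transcripts, which matters because a player would repeat it many times per second.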

Foreign language subtitles

Foreign language subtitles present a text translation of the language being spoken in the video into a target language. This is cheaper than providing a spoken soundtrack in the target language.

Sign language

Sign language is required by Deaf viewers whose first language is sign language and who find reading captions too difficult. Different countries can have their own sign languages.

Audio description

Audio description is used to provide a spoken description of important information that is only available visually.

The description is inserted when no other information is being presented through speech or sound, so the short time available often means that only descriptions that are absolutely essential to understanding can be provided.

Web systems are available that pause the video to allow extra time for the audio description.
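The timing decision can be sketched in Python: a description fits in a gap between dialogue if it is no longer than the gap, and otherwise the player has to pause the video for the remainder. The function name and the durations are illustrative assumptions and do not describe how any particular web system works.

```python
def pause_needed(gap_start: float, gap_end: float, description_seconds: float) -> float:
    """Seconds the video must pause so the description fits in the silent gap.

    Returns 0.0 when the description already fits between gap_start and gap_end.
    """
    gap = gap_end - gap_start
    if gap < 0:
        raise ValueError("gap_end must not be earlier than gap_start")
    return max(0.0, description_seconds - gap)


# A 2-second silence with a 5-second description: pause for 3 seconds.
print(pause_needed(253.0, 255.0, 5.0))
# A 2-second silence with a 1.5-second description: no pause needed.
print(pause_needed(253.0, 255.0, 1.5))
```

Without the option to pause, the longer description would have to be cut down to fit the gap.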

Transcripts

For audio with no video, a transcript can provide all the spoken information required. It is much faster to read a transcript than to listen to spoken information presented at normal speed.

Unlike captions, however, a static transcript for a video cannot convey which text corresponds with which sounds and which images.


Captions are also very popular with many people who have no disability, as they can help viewers understand what is being said when the sound level is low and/or the background noise level is high.

Regulations for captions or audio descriptions, for both TV and internet broadcasters, can differ between countries and can vary in terms of the accuracy, number of hours, types of broadcasts and the times they are shown.

What are your experiences of text or sign language captioning and audio description and the regulations in your country?


© This work is created by the University of Southampton and licensed under CC-BY 4.0 International Licence. Erasmus + MOOCs for Accessibility Partnership.

This video is from the free online course:

Digital Accessibility: Enabling Participation in the Information Society

University of Southampton