Multimedia Transcripts & Captioning

Multimedia Transcripts & Captioning Guidelines

Transcripts: Basic transcripts are a text version of the speech and non-speech audio information needed to understand the content.

Captions: Captions (called “subtitles” in some areas) provide content to people who are Deaf and hard-of-hearing. Captions are a text version of the speech and non-speech audio information needed to understand the content. They are synchronized with the audio and usually shown in a media player when users turn them on.
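As an illustration of what "synchronized with the audio" means in practice, the sketch below writes a few timed captions in WebVTT, a common caption file format for web video. It is only a minimal example: the `Cue` class, the helper names, and the sample captions are hypothetical and are not taken from any platform mentioned in these guidelines.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption: text shown between start and end (in seconds)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Render cues as the text of a WebVTT caption file."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(cue.text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical cues; non-speech audio is shown in square brackets.
example = [
    Cue(0.0, 2.5, "Welcome to the lecture."),
    Cue(2.5, 4.0, "[door closes]"),
]
print(to_webvtt(example))
```

A basic transcript would contain only the text of each cue. The start and end times are what turn that text into captions a media player can display in step with the audio.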

The Accessibility for Ontarians with Disabilities Act, 2005 (AODA) sets out the compliance requirements for alternative formats, transcripts, and captioning of multimedia content including pre-recorded audio and video content. It states in section 14.4:

  • 1. By January 1, 2014, new internet websites and web content on those sites must conform with WCAG 2.0 Level A.
  • 2. By January 1, 2021, all internet websites and web content must conform with WCAG 2.0 Level AA, other than,
    1. success criteria 1.2.4 Captions (Live), and
    2. success criteria 1.2.5 Audio Descriptions (Pre-recorded).

The Web Content Accessibility Guidelines (WCAG) set out the implementation requirements for multimedia transcripts and captioning. It states in Guideline 1.2, Time-based Media: Provide alternatives for time-based media.


Automatically generated captions are produced by a computer shortly (typically within minutes) after you upload your video to either Echo360 or Microsoft Stream. All videos uploaded directly to Echo360, or uploaded to Sakai through the Echo360 Embed Media button, will automatically generate a transcript. All recordings of Microsoft Teams meetings are processed through and hosted on Microsoft Stream, where transcripts are also generated automatically.

Automatic Speech Recognition (ASR) is not perfect, and transcripts can contain errors in word recognition, spelling, and grammar. With both Echo360 and Microsoft Stream, the transcripts can be edited and turned into captions. See below for links to step-by-step help articles on editing transcripts in these platforms.

How to edit a transcript in Echo360

How to edit a transcript in Microsoft Stream

Human-Assisted Captions

Human-assisted captions use ASR captions as a starting point and add a human editor to raise caption accuracy to 99% or higher. They take longer to produce but may be the most appropriate option for someone who needs caption-related accommodations in your course. Brock University Student Accessibility Services will follow up with instructors who are teaching courses with an identified student who requires accommodation.
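The guidelines do not say how the 99% accuracy figure is measured. One common convention, assumed here purely for illustration, is word-level accuracy: one minus the word error rate of the automatic transcript when compared against a human-corrected version. The function names and sample sentences below are hypothetical.

```python
def word_error_count(reference: list[str], hypothesis: list[str]) -> int:
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn the hypothesis into the reference (edit distance over words)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,          # word missing from the hypothesis
                current[j - 1] + 1,       # extra word in the hypothesis
                previous[j - 1] + cost,   # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference_text: str, hypothesis_text: str) -> float:
    """Word-level accuracy (assumed metric): 1 - errors / reference word count."""
    reference = reference_text.lower().split()
    hypothesis = hypothesis_text.lower().split()
    if not reference:
        raise ValueError("reference transcript is empty")
    errors = word_error_count(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: human-corrected text versus ASR output.
corrected = "captions are synchronized with the audio"
automatic = "captions are synchronised with the audio"
print(f"{word_accuracy(corrected, automatic):.1%}")
```

Under this assumed measure, a 99% target allows roughly one word error per hundred words of speech, which is why a human editing pass is usually needed on top of ASR output.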