Transcripts from Captions?


The subject of automatic captioning continues to be debated, but Gerald Ford Williams has produced a really helpful “guide to the visual language of closed captions and subtitles” on UX Collective as a “user-centric guide to the editorial conventions of an accessible caption or subtitle experience.” It offers a series of tips with examples and, at the bottom of the page, several very useful links for those adding captions to videos. There is also a standard for the presentation of different types of captions across multimedia: ISO/IEC 20071-23:2018(en).

In this article, however, the focus is on transcripts, which also need further discussion. They are often used as notes gathered from a presentation, produced by lecture capture or by an online conference with automatic captioning. They may be copied from the side of the presentation, downloaded after the event, or presented to the user as a PDF, HTML or plain-text file, depending on the system used. Some automated outputs indicate speaker changes and timings, but there are no hints as to content accuracy prior to download.

The problem is that there seem to be many different ways to measure the accuracy of automated captioning processes, which in many cases become transcriptions. Discussing caption quality, 3PlayMedia suggest that there is a standard: “The industry standard for closed caption accuracy is 99% accuracy rate. Accuracy measures punctuation, spelling, and grammar. A 99% accuracy rate means that there is a 1% chance of error or a leniency of 15 errors total per 1,500 words”.
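As a worked illustration of the quoted figure (a sketch of the arithmetic only, not 3PlayMedia’s actual measurement process), the relationship between error count and accuracy rate can be expressed in a few lines of Python:

```python
def accuracy(word_count: int, error_count: int) -> float:
    """Accuracy rate as a percentage: the share of words with no
    punctuation, spelling or grammar error."""
    return 100.0 * (word_count - error_count) / word_count

# 15 errors in 1,500 words gives the quoted 99% figure.
print(accuracy(1500, 15))  # 99.0
```

The same formula shows why the standard is lenient in practice: at 99%, a ten-minute talk of roughly 1,500 words may still contain 15 errors, any one of which could change the meaning of a sentence.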

The author of the 3PlayMedia article goes on to illustrate many other aspects of ‘quality’ that need to be addressed, but the lack of detailed standards for the range of quality checks means that comparisons between the various offerings are hard to achieve. Users are often left with several other types of errors besides punctuation, spelling and grammar. The Nlive project team have been looking into these challenges when considering transcriptions rather than captions and have begun to collect a set of additional issues likely to affect understanding. So far, the list includes:

  • Number of extra words added that were not spoken
  • Number of words changed in ways that affect meaning – more than just grammar
  • Number of words omitted
  • Contractions – e.g. “he is” becomes “he’s”, “do not” becomes “don’t”, and “I’d” could mean either “I had” or “I would”!

The question is whether these checks could be automated to support collaborative manual checks when correcting transcriptions.

Below is a sample of text from an interview that we are working on, used to demonstrate the differences between three commonly used automatic captioning systems for videos.

Sample 1

So stuck. In my own research, and my own teaching. I’ve been looking at how we can do the poetry’s more effectively is one of the things so that’s more for structuring the trees, not so much technology, although technology is possible

Sample 2

so starting after my own research uh my own teaching i’ve been looking at how we can do laboratories more effectively is one of the things so that’s more for structuring laboratories not so much technology although technology is part of the laboratory

Sample 3

so stop. In my own research on my own teaching, I’ve been looking at how we can do the ball trees more effectively. Is one thing, so that’s more for structuring the voluntary is not so much technology, although technology is part little bar tree

Having looked at the sentences presented in transcript form, Professor Mike Wald pointed out that Rev.com (who provide automated and human transcription services) state that we should not “try to make captions verbatim, word-for-word versions of the video audio. Video transcriptions should be exact replications, but not captions.” The author of the article “YouTube Automatic Captions vs. Video Captioning Services” highlights several issues with automatic closed captioning and reasons humans offer better outcomes. Just in case you want to learn more about the difference between a transcript and closed captions, 3PlayMedia wrote about the topic in August 2021: “Transcription vs. Captioning – What’s the Difference?”.

AudioNote for iPad & iPhone

AudioNote is a fantastic note taking app. The official description from the App Store tells you how you can synchronise notes and audio, with each key point being linked to the moment when the lecturer talks about that subject. Because it works on a tablet or phone, there is no need to wait for a laptop to boot up.

Bookmarks can be created throughout the audio recording to highlight important points for easy referencing. It allows you to take pictures and insert them into your notes, and AudioNotes can be exported to Evernote, saved and organised there. A yellow background can be used instead of white for those with visual stress/sensitivity. It costs £2.99 and is available from the iTunes store.

This YouTube video is a good introduction to AudioNote

This comes with thanks to the Disability Advisory Service at Imperial College

Taking notes on an iPad and using iCloud

“I don’t like pens and papers! Too much waste and extra cost. I’ve been using my iPad with a stylus and several note taking/drawing apps so far. I synchronise all my notes with iCloud (it was iWorld before Apple introduced iCloud) and I’m perfectly happy – so is my room as it doesn’t have stacks of paper and pens around :>)” Trinity – computer scientist

There are so many note taking and drawing apps that it is hard to advise which ones are best, but a combination of Evernote and Skitch is a good one – the Appadvice site has a note taking advice page with many more apps, and the University of Exeter has a blog reviewing some more useful time management, maps, social network and note taking iPad apps. They mention WritePad, which has handwriting recognition.

WritePad for iPad YouTube Video


Listening to a webcast and taking notes with DraftPad on the iPhone or iPad

“I can listen to a webcast and take notes. Previously, this required getting transportation to the presentation and lugging a Braille notetaker. Now I use my netbook for the webcast and my phone with an external keyboard and the DraftPad app to take notes.”

DraftPad is free and offers a very accessible interface that can be used with VoiceOver or, once the text has been copied, with the ‘Speak’ option. It allows you to send or share your notes via email, SMS and social networking sites, as well as open them in other apps that may be on your device, such as DocsToGo for more formatting, Evernote for linking with other notes, or Dropbox for sharing or backing up files. The app also links with text-speaking apps such as SpeakText Free.

This strategy came thanks to Pat Pound on My Life Simplified via Accessible Web and Apps!

LiveScribe for note taking, planning and diagrams


“I find if I use the LiveScribe with the ear buds just hanging loosely round my neck, then the microphone from the pen does not pick up the scratching when writing, but still records the lecture or meeting.” (You need the digital pen with the special paper notepads and the software for transferring notes to the computer or tablet, and it can be used with Evernote.)

Ursula

Audio Notetaker used with an Olympus Digital Recorder

“I have found Audio Notetaker very useful for gathering quotes from audio recordings when interviewing people, as well as recording and making notes from lectures. You can also see PowerPoint slides alongside the audio sections and jump to each section. It has saved so much time compared with when I used to have to work through the recordings on my Olympus recorder.”

Taking Notes Live with Audio Notetaker – YouTube Video