Speech recognition (also known as “automatic speech recognition”, “computer speech recognition”, or just “speech to text”) enables computers to convert spoken language into text.
If automatic captions are available (on YouTube, for example), they’ll automatically be published on the video.
However, as we all know, automatic captions can misrepresent the spoken content for various reasons, such as mispronunciations, accents, dialects, or background noise. This can be slightly irritating because the listener must always keep in mind that the transcription is unreliable. On the other hand, it is also a great learning opportunity for a language class.
The other day, I played this video to a group of pre-intermediate students.
I had listened to the video beforehand, with the automatic captions on, and concentrated on the discrepancies between the spoken and the transcribed versions.
I copied the parts with the problematic areas like this:
I first played the video to the class with the captions off so that Sts could get a general idea of what the story was about. We discussed it a bit as a class. Before I played the video again with the captions on, I pointed out that there might be some discrepancies. I asked the students to make a note of any errors they came across.
After Sts compared their notes, I projected the slips above one by one. I asked Sts to look at their notes and tell me what the problems were. These are the corrected parts:
- catch one bush calf
- do you not have your spear and your arrow
- looked at her husband
- bush calf
- she tore its throat open
- she tore its throat … bush calf
- bush calf
- bush calf … bush calf
- bush calf … a stick through the bush calf’s body
- he said: uu uh hh
- come eat
- he ate and he
Later in the computer laboratory:
Sts worked in pairs. Each pair got a short section of the video to transcribe (with the captions off!). Note: You can also let Sts transcribe with the captions on, because there’s still a lot to work on in terms of punctuation.
We put the story together and practiced retelling it as a chain activity (each pair read only their part).
But that’s another story! 🙂