Want to learn English through YouTube? Hearing impaired? Often watch videos in a noisy environment or on mute? You will definitely benefit from YouTube automatic captions. This feature quickly generates real-time subtitles to make videos more accessible.
Even though the feature keeps getting better, it's not perfect yet, because accents, slang, and background noise can hurt its accuracy. This article puts YouTube's automatic English captions to the test for accuracy and offers practical suggestions.
Manual captions: Uploaded by video creators for better accuracy and a more professional result. They can be in .srt, .vtt, .scc, and other formats, and can be provided in multiple languages.
Auto-generated captions: Created with YouTube's ASR (Automatic Speech Recognition) technology, which converts the speech in a video to text without human intervention, saving time and effort.
Auto-translated captions: Created with the GNMT (Google Neural Machine Translation) system, which generates text in a variety of languages, including English. It translates from the existing manual or automatic subtitles, so the result depends on their quality. If manual subtitles are available, they are given priority.
I chose 7 YouTube video samples that all include the 3 types of English subtitles mentioned above. I downloaded the subtitles with subtitle downloaders for comparison.
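One way to score a caption track against the manual one is word-level matching. The sketch below is only an illustration of that idea, not necessarily the method behind the percentages in this article. It assumes the manual captions are the reference, ignores case and punctuation, and uses Python's standard `difflib`. The function names `normalize` and `word_accuracy` and the sample sentences are made up for the example.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and keep only word tokens, so case and punctuation are not scored."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Share of reference words that can be matched, in order, in the hypothesis."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        return 0.0
    # autojunk=False keeps difflib from discarding very frequent words on long transcripts
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


# Invented sentences, not taken from the test videos
manual = "Marie Curie won two Nobel Prizes."
auto = "mary curry won to nobel prizes"
print(f"{word_accuracy(manual, auto):.0%}")  # prints 50%
```

Because punctuation is stripped before matching, this score says nothing about missing punctuation. That kind of error would need a separate check.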
Sample | Description | Duration | Auto-generated caption accuracy | Auto-translated caption accuracy |
---|---|---|---|---|
Video 1 | From the TED-Ed channel, a slow-paced animated video | 5:19 | 79% | 88% |
Video 2 | A fast-paced conversation between 2 people from TED | 10:06 | 76% | 93% |
Video 3 | A solo talk in Indian-accented English from TEDWomen | 10:31 | 77% | 93% |
Video 4 | A song by my favorite rapper, Eminem: My Name Is | 4:08 | 93% | 88% |
Video 5 | An educational video with instrumental background music | 7:17 | 93% | 86% |
Video 6 | 3 people commenting on a soccer game, with cheers and whistles | 17:51 | 87% | 82% |
Video 7 | Introduce Hematology: Types of Anemia, many technical medical terms included | 35:59 | 77% | 90% |
Note: While I was testing, I noticed something interesting. For the same TED talk, the transcript on TED's website and the manual subtitles uploaded to YouTube are not the same. My guess is that the YouTube manual subtitles were based on the ASR draft and the uploaders fixed only the obvious mistakes. So, if you want the most accurate subtitles for a TED video, copy the transcript from the official TED website.
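To make the table above easier to read at a glance, here is a small sketch that averages the two accuracy columns and lists the samples where the auto-generated track scored higher than the auto-translated one. The numbers are copied from the table. The variable names are just for illustration.

```python
from statistics import mean

# (auto-generated %, auto-translated %), copied from the table above
results = {
    "Video 1": (79, 88),
    "Video 2": (76, 93),
    "Video 3": (77, 93),
    "Video 4": (93, 88),
    "Video 5": (93, 86),
    "Video 6": (87, 82),
    "Video 7": (77, 90),
}

auto_generated = [generated for generated, _ in results.values()]
auto_translated = [translated for _, translated in results.values()]

print(f"Auto-generated mean:  {mean(auto_generated):.1f}%")   # 83.1%
print(f"Auto-translated mean: {mean(auto_translated):.1f}%")  # 88.6%

# Samples where the auto-generated captions beat the auto-translated ones
asr_wins = [name for name, (generated, translated) in results.items() if generated > translated]
print(asr_wins)  # ['Video 4', 'Video 5', 'Video 6']
```

With only 7 samples, these averages are a rough indication rather than a benchmark.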
Since automatically generated subtitles are the most commonly used, I'll go on to show the types of errors they often make in transcription. These errors not only affect the viewing experience but can also distort key information. Here are some examples, with the original subtitles on the left and the automatically generated subtitles on the right.
1. Incorrect recognition of proper nouns
2. Mixing up words that sound similar
3. Spelling mistakes
4. Missing punctuation
5. Number + Unit mishandling
You can go to https://www.diffchecker.com/ or install Subtitle Edit (Windows only) to compare the original subtitles with the automatically generated ones.
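If you prefer a script to a website, the same kind of comparison can be sketched with Python's standard `difflib`. This is a minimal example that assumes both subtitle tracks are already plain text. The function name `caption_changes` and the sample sentences are invented for illustration.

```python
from difflib import SequenceMatcher


def caption_changes(reference: str, auto: str) -> list[tuple[str, str, str]]:
    """List word-level differences as (operation, reference words, auto words)."""
    ref_words, auto_words = reference.split(), auto.split()
    matcher = SequenceMatcher(None, ref_words, auto_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replace / delete / insert operations
            changes.append((tag, " ".join(ref_words[i1:i2]), " ".join(auto_words[j1:j2])))
    return changes


# Invented sentences, not taken from the test videos
original = "the patient has iron deficiency anemia"
generated = "the patient has iron deficiency enemy a"
for change in caption_changes(original, generated):
    print(change)  # ('replace', 'anemia', 'enemy a')
```

Each tuple shows what the reference said and what the automatic captions produced in its place, which makes it easier to sort errors into the categories listed above.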
Limitations of the speech recognition model: It cannot reliably handle proper nouns, homophones, punctuation, sentence segmentation, jargon, slang, etc.
Background noise: The video contains music, cheers, whistles, car horns, applause, or other sounds in the background.
Various accents: Speakers from India, Australia, Japan, Germany, or other countries may speak English with local accents.
Personal speaking habits: The speaker talks quickly, drops sounds, slurs words, etc.
Multiple people speaking simultaneously: Overlapping conversations interfere with recognition.
Poor audio quality: The video's audio has low volume or intermittent sound.
To ensure the best sound quality, record your YouTube videos in a quiet environment, use an external microphone, and speak at a steady pace. If you think the YouTube subtitles are too inaccurate for viewers to understand, it is better to correct them manually.
If you're just watching a video... Click on your avatar in the upper right corner, select “Send feedback”, describe the inaccuracy of the video's automatic subtitles, and send it. Or contact the video uploader and ask them to adjust the subtitles.
If you need the subtitles for further use… YouTube automatic captions are not always accurate enough, and many users need direct access to subtitle files for translation, editing, or study. You can download the subtitles and make your own adjustments and corrections.
To download YouTube subtitles, use DownSub.com. It is the best online YouTube subtitle downloader, allowing you to download manual, auto-generated, and auto-translated captions without installing software.
To download YouTube videos with subtitles, use Cisdem Video Converter. It can download all types of subtitles in .vtt, .srt, .ttml, and other formats. It also supports saving videos from YouTube, Facebook, Twitter, and Instagram and embedding .srt subtitles into them.
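Once you have an .srt file, recurring mistakes such as misrecognized proper nouns can be corrected in bulk. The sketch below is a minimal illustration that assumes a well-formed .srt file. The helper names (`parse_srt`, `apply_corrections`, `to_srt`), the sample cues, and the correction list are all hypothetical, so build your own list from the errors you actually find.

```python
import re

# Invented sample; in practice, read the text from your downloaded .srt file
SRT_SAMPLE = """1
00:00:01,000 --> 00:00:03,500
welcome to ted ed

2
00:00:03,500 --> 00:00:06,000
today we talk about you tube captions
"""

TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def parse_srt(srt_text: str) -> list[tuple[int, str, str]]:
    """Split SRT text into (index, timing line, caption text) cues."""
    cues = []
    # Cues are separated by blank lines; drop a possible byte-order mark first
    for block in re.split(r"\n\s*\n", srt_text.lstrip("\ufeff").strip()):
        lines = block.splitlines()
        if len(lines) >= 3 and TIMING.match(lines[1]):
            cues.append((int(lines[0]), lines[1], "\n".join(lines[2:])))
    return cues


def apply_corrections(cues, corrections: dict[str, str]):
    """Replace each wrong phrase (whole words, any case) with its correction."""
    fixed = []
    for index, timing, text in cues:
        for wrong, right in corrections.items():
            text = re.sub(rf"\b{re.escape(wrong)}\b", right, text, flags=re.IGNORECASE)
        fixed.append((index, timing, text))
    return fixed


def to_srt(cues) -> str:
    """Serialize cues back to SRT text."""
    return "\n\n".join(f"{index}\n{timing}\n{text}" for index, timing, text in cues) + "\n"


corrections = {"ted ed": "TED-Ed", "you tube": "YouTube"}  # hypothetical fixes
fixed_cues = apply_corrections(parse_srt(SRT_SAMPLE), corrections)
print(to_srt(fixed_cues))
```

The timing lines are left untouched, so the corrected file stays in sync with the video. A phrase that is split across two lines of one cue will not be matched by this simple approach.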
I still have great confidence in YouTube's automatic subtitles. They are not quite 90% accurate at the moment and fall short on details such as punctuation, formatting, tone, jargon, and dialects, but they retain the key information and the core meaning. As YouTube improves its AI technology, the accuracy should improve as well. If YouTube automatic captions are not available, see troubleshooting and settings.
Emily is a girl who loves to review various multimedia software. She enjoys exploring cutting edge technology and writing how-to guides. Hopefully her articles will help you solve your audio, video, DVD and Blu-ray issues.