Automatic Speech Recognition (ASR) is the task of automatically transcribing the speech in audio or video files into text.
Guess what, Facebook AI & Hugging Face are here to our rescue.
Facebook AI recently open-sourced the Wav2Vec 2.0 model, which can be fine-tuned for Automatic Speech Recognition with as little as 10 minutes of labeled speech, and Hugging Face integrated it into Transformers v4.3.0.
In this video, I'll show you how to use the lethal combination of Facebook AI’s Wav2Vec2 & the Hugging Face Transformers library to generate text from audio.
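If you'd like a rough idea of what that looks like in code, here is a minimal sketch rather than the exact code from the video. It assumes a few things not stated above: the `facebook/wav2vec2-base-960h` checkpoint, `librosa` for loading and resampling the audio, and a Transformers release recent enough to include `Wav2Vec2Processor`. The `transcribe` function name is just an illustrative helper.

```python
# Assumption: this checkpoint is one common English Wav2Vec2 model on the
# Hugging Face Hub; the description above does not name a specific one.
MODEL_ID = "facebook/wav2vec2-base-960h"
# The assumed checkpoint expects 16 kHz mono audio.
TARGET_SAMPLE_RATE = 16_000


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the text spoken in the audio file at `audio_path`."""
    # Heavy dependencies are imported lazily so that merely importing this
    # file does not require them to be loaded.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Downloads the weights from the Hub on first use.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Load the file as a mono float waveform, resampled to 16 kHz.
    waveform, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE)

    # Normalize the waveform and wrap it in a batch of PyTorch tensors.
    inputs = processor(
        waveform, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )

    # Forward pass without gradients: one row of character logits per frame.
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy CTC decoding: take the most likely token per frame, then let the
    # processor collapse repeats and drop blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

With `torch`, `transformers` and `librosa` installed, calling `print(transcribe("speech.wav"))` should print the transcript, where `speech.wav` is a placeholder for your own recording. Greedy decoding is used here only because it is the simplest option.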
I hope you all like it 🙂