This article is part of a tutorial series on txtai, an AI-powered semantic search platform.
This article covers the transcription of audio files to text using models provided by Hugging Face.
Install dependencies
Install txtai and all dependencies. Since this article uses optional pipelines, we need to install the pipeline extras package.
pip install txtai[pipeline]
# Get test data
wget -N https://github.com/neuml/txtai/releases/download/v3.5.0/tests.tar.gz
tar -xvzf tests.tar.gz
Create a Transcription instance
The Transcription instance is the main entry point for transcribing audio to text. The pipeline reduces transcription to a one-line call.
Behind that call, the pipeline reads audio files into memory, runs the data through a machine learning model and returns the results as text.
from txtai.pipeline import Transcription
# Create transcription model
transcribe = Transcription("facebook/wav2vec2-large-960h")
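To give a feel for the "model output to text" step, here is a minimal sketch of greedy CTC decoding, the kind of decoding typically paired with wav2vec2-style models. It is not txtai's actual code. The token names (`<pad>` as the blank token, `|` as the word delimiter) and the hand-written frame list are assumptions for illustration; a real model produces one predicted token per audio frame.

```python
from itertools import groupby

# Assumed special tokens for a wav2vec2-style character vocabulary (illustrative)
BLANK = "<pad>"   # CTC blank token, emitted when no character is predicted
WORD_SEP = "|"    # token marking a word boundary


def ctc_greedy_decode(frame_tokens, blank=BLANK, word_sep=WORD_SEP):
    """Turn per-frame token predictions into text using greedy CTC rules."""
    # 1. Collapse consecutive repeats: the same character spans several frames
    collapsed = [token for token, _ in groupby(frame_tokens)]
    # 2. Drop blank tokens: they only separate genuine repeated characters
    chars = [token for token in collapsed if token != blank]
    # 3. Map word delimiters to spaces and normalize whitespace
    text = "".join(" " if token == word_sep else token for token in chars)
    return " ".join(text.split())


# Hypothetical per-frame predictions (the most likely token for each frame)
frames = ["<pad>", "H", "H", "<pad>", "I", "I", "|", "|",
          "<pad>", "A", "L", "<pad>", "L", "L"]
print(ctc_greedy_decode(frames))  # prints "HI ALL"
```

Note how the blank between the two runs of `L` is what keeps the double letter in "ALL". Without it, the repeats would collapse into a single character.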
Transcribe audio to text
The example below transcribes a list of audio files. For each file, it shows an audio player and prints the transcribed text so the two can be compared.
from IPython.display import Audio, display
# Audio files extracted from tests.tar.gz
files = ["Beijing_mobilises.wav", "Canadas_last_fully.wav", "Maine_man_wins_1_mil.wav", "Make_huge_profits.wav", "The_National_Park.wav", "US_tops_5_million.wav"]
files = ["txtai/%s" % x for x in files]
# Show a player for each file, then print its transcription
for x, text in enumerate(transcribe(files)):
    display(Audio(files[x]))
    print(text)
    print()
Baging mobilizes invasion craft along coast as tiwan tensions escalates
Canada's last fully intact ice shelf has suddenly collapsed forming a manhatten sized ice berg
Main man wins from lottery ticket
Make huge profits without working make up to one hundred thousand dollars a day
National park service warns against sacrificing slower friends in a bare attack
U s virus cases top a million
Overall, the results are solid. Each result sounds phonetically like the audio, even where the spelling is off. There is an open task with the Hugging Face models to decode the model outputs with a language model, which should further improve accuracy.
Keep an eye out for those updated models!
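One way to put a number on "solid" is word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration, not part of txtai. The reference sentence is an assumed "correct" transcript written for this example, not ground truth shipped with the test data.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from hypothesis
                current[j - 1] + 1,      # extra word in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


# Assumed reference transcript (hypothetical, for illustration only)
reference = "Canada's last fully intact ice shelf has suddenly collapsed forming a Manhattan sized iceberg"
# Model output taken from the results above
hypothesis = "Canada's last fully intact ice shelf has suddenly collapsed forming a manhatten sized ice berg"

print(f"{word_error_rate(reference, hypothesis):.2f}")  # prints 0.21
```

Here the three errors are the misspelled "manhatten" and "iceberg" split into two words (one substitution plus one extra word), against 14 reference words. WER penalizes near-miss spellings as full errors, so it understates how close these outputs are phonetically.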