DEV Community

kama-meshi

Posted on Jun 2, 2022

The awesome speech recognition toolkit: Vosk!

#python #machinelearning #node

What is Vosk?

Vosk is a speech recognition toolkit supporting over 20 languages.
The language model is lightweight at about 50MB and easy to embed, so you can easily do speech recognition completely offline.

Vosk provides bindings for Python, Java, C#, and also Node.js!

Supports 20+ languages and dialects
Works offline, even on lightweight devices - Raspberry Pi, Android, iOS

See Vosk's page for details.

Let's try!

Install Vosk

Now you can try Vosk with Python!
Vosk can be installed with pip. However, I prefer Poetry, so I'll install it with that.

⚠️ Poetry will try to install the latest version (0.3.38), but that version is not compatible with macOS. So I explicitly specified the version that pip installs (as of 2022-05-19).

You can download the Python example script from the Vosk examples.

Download the language model

The language model is available here. Extract the zip file and place the extracted folder in your working directory (the commands below rename it to model/).

Prepare an audio file

You will need an audio file in the correct format: 16 kHz, 16-bit, mono PCM.
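Before running recognition, it can help to confirm that a file really has this format. The following is a minimal sketch using only Python's standard wave module; the function name wav_format_problems is hypothetical and not part of Vosk.

```python
import wave


def wav_format_problems(path, expected_rate=16000):
    """Return a list of reasons why the file is not 16-bit mono PCM at expected_rate."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append(f"expected 1 channel (mono), got {wf.getnchannels()}")
        if wf.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {wf.getsampwidth() * 8}-bit")
        if wf.getframerate() != expected_rate:
            problems.append(f"expected {expected_rate} Hz, got {wf.getframerate()} Hz")
        if wf.getcomptype() != "NONE":
            problems.append(f"expected uncompressed PCM, got {wf.getcomptype()}")
    return problems
```

An empty list means the file matches the format. Note that wave.open raises wave.Error if the file has no WAV header at all.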

If you are an English speaker, you can get the test audio from the Vosk examples.

You can convert with ffmpeg.



ffmpeg -i my_voice.wav -ar 16000 -ac 1 -c:a pcm_s16le my_voice_16khz.wav
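If you would rather run the conversion from Python, here is a rough sketch that wraps ffmpeg with subprocess. It assumes ffmpeg is on your PATH, and the function names are hypothetical. This sketch selects the codec with -c:a pcm_s16le, which writes 16-bit PCM while keeping the WAV header that Python's wave module needs.

```python
import subprocess


def build_ffmpeg_command(src, dst, rate=16000):
    """Build an ffmpeg command that writes a 16-bit mono PCM WAV file at the given rate."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", src,
        "-ar", str(rate),     # resample to the target sample rate
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        dst,
    ]


def convert_to_vosk_wav(src, dst, rate=16000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst, rate), check=True)
```

For example, convert_to_vosk_wav("my_voice.wav", "my_voice_16khz.wav") would perform the same kind of conversion as the command line above.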

Run Vosk

Run the Python script...
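To give an idea of what such a script does, here is a minimal sketch of file-based recognition with the Vosk Python bindings. It is not a copy of the example script. It assumes the model was extracted to a folder named model, and the helper names result_text and transcribe are hypothetical.

```python
import json
import wave


def result_text(result_json):
    """Extract the recognized text from a JSON string returned by the recognizer."""
    return json.loads(result_json).get("text", "")


def transcribe(wav_path, model_dir="model"):
    """Transcribe a 16-bit mono PCM WAV file using a Vosk model directory."""
    # Imported here so result_text() can be used without Vosk installed.
    from vosk import Model, KaldiRecognizer

    pieces = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if len(data) == 0:
                break
            # AcceptWaveform returns a truthy value when an utterance is complete.
            if rec.AcceptWaveform(data):
                pieces.append(result_text(rec.Result()))
        # Flush whatever audio is still buffered in the recognizer.
        pieces.append(result_text(rec.FinalResult()))
    return " ".join(p for p in pieces if p)
```

With Vosk installed and a model in place, print(transcribe("my_voice_16khz.wav")) should print the recognized text.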

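Done! 🎉
The result differs from the original text in a few places, but Vosk also recognized Japanese kanji characters. 🀄

I'm a Japanese speaker, so I ran recognition on a Japanese audio file.
The text of the audio is "ご視聴ありがとうございました！グッドボタンとチャンネル登録よろしくお願いします！" ("Thank you for watching! Please hit the like button and subscribe to the channel!").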
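If you want to put a number on how close the recognized text is to the expected text, one rough option is difflib from the standard library. This is only a sketch: it assumes the recognizer output separates tokens with spaces and contains no punctuation, so both strings are normalized before comparison. The function names are hypothetical.

```python
import difflib


def normalize(text):
    """Drop whitespace and the punctuation used in the reference sentence."""
    return "".join(ch for ch in text if not ch.isspace() and ch not in "！!。、")


def similarity(expected, recognized):
    """Return a ratio in [0, 1]; 1.0 means the normalized strings are identical."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(recognized))
    return matcher.ratio()
```

Pass the expected sentence and the recognized text to similarity(); a value close to 1.0 means only small differences remain.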

The complete commands are below.



poetry add vosk@0.3.32
curl -O https://raw.githubusercontent.com/alphacep/vosk-api/v0.3.32/python/example/test_simple.py
curl -O https://alphacephei.com/vosk/models/vosk-model-small-ja-0.22.zip
unzip vosk-model-small-ja-0.22.zip
mv vosk-model-small-ja-0.22/ model/
poetry run python test_simple.py my_voice_16khz.wav

The code is on GitHub and Replit.
I hope you'll enjoy Vosk too! Thank you.

kama-meshi / HelloVosk

Sample Vosk repl with Python.

Hello Vosk

This is a sample repl for Vosk with Python.

Sample voice

Let's recognize this voice 🎤

"ご視聴ありがとうございました！グッドボタンとチャンネル登録よろしくお願いします！"

Usage

poetry install
poetry run python main.py

My repl is also on Replit.

https://replit.com/@kama-meshi/HelloVosk

Special Thanks

Voice: こえやさん
