What is Vosk?
Vosk is a speech recognition toolkit supporting over 20 languages.
The language model is lightweight at about 50 MB and easy to embed, so you can easily do speech recognition completely offline.
Vosk provides bindings for Python, Java, C#, and also Node.js!
- Supports 20+ languages and dialects
- Works offline, even on lightweight devices - Raspberry Pi, Android, iOS
See Vosk's page for details.
Now you can try Vosk with Python!
Vosk can be installed with pip. However, I prefer Poetry, so I'll install it with that.
⚠️ Poetry will try to install the latest version (0.3.38), but that version is not compatible with macOS, so I installed it by explicitly specifying the version that pip would install. (as of 2022-05-19)
You can download the Python script from the Vosk examples.
Download the language model
The language model is available here. Extract the zip file and place the extracted folder in your working directory under the name model.
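If you would rather do the extract-and-rename step from Python, here is a minimal sketch using only the standard library. The function name extract_model is my own, not part of Vosk, and it assumes the archive contains exactly one top-level folder.

```python
import zipfile
from pathlib import Path


def extract_model(zip_path: str, target_dir: str = "model") -> Path:
    """Unpack a model archive and rename its top-level folder to target_dir.

    Hypothetical helper: assumes the zip holds a single top-level folder.
    """
    target = Path(target_dir)
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    with zipfile.ZipFile(zip_path) as archive:
        # Collect the top-level folder names inside the archive.
        top_levels = {
            name.split("/")[0] for name in archive.namelist() if name.strip("/")
        }
        if len(top_levels) != 1:
            raise ValueError("expected exactly one top-level folder in the archive")
        archive.extractall(target.parent)

    # Rename the extracted folder, e.g. vosk-model-small-ja-0.22 -> model.
    extracted = target.parent / top_levels.pop()
    extracted.rename(target)
    return target
```

For example, extract_model("vosk-model-small-ja-0.22.zip") should leave a model folder next to your script.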
Prepare an audio file
You will need an audio file in the correct format: a WAV file with 16 kHz, 16-bit, mono PCM audio.
If you are an English speaker, you can get the test audio from the Vosk examples.
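You can convert an existing recording with ffmpeg. Select the 16-bit PCM codec with -c:a pcm_s16le so that the output keeps its WAV header (-f s16le would write headerless raw PCM instead).
ffmpeg -i my_voice.wav -ar 16000 -ac 1 -c:a pcm_s16le my_voice_16khz.wav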
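Before running recognition, it can help to confirm that the file really is in the expected format. This is a small sketch with Python's built-in wave module. The function name check_wav_format is my own, and the 16000 default simply mirrors the sample rate used in this post.

```python
import wave


def check_wav_format(path: str, expected_rate: int = 16000) -> list[str]:
    """Return a list of format problems; an empty list means the file looks fine."""
    problems = []
    # wave.open raises wave.Error if the file is not a PCM WAV file at all.
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append(f"expected mono, got {wf.getnchannels()} channels")
        if wf.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {8 * wf.getsampwidth()}-bit")
        if wf.getframerate() != expected_rate:
            problems.append(f"expected {expected_rate} Hz, got {wf.getframerate()} Hz")
    return problems
```

Calling check_wav_format("my_voice_16khz.wav") should return an empty list for a correctly converted file.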
Run the Python script...
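If you are curious what happens inside, here is a minimal sketch of the same idea using the Vosk Python bindings (Model, KaldiRecognizer, AcceptWaveform, Result, FinalResult). It is my own simplified version, not a copy of the example script. The names transcribe and result_text, the model folder default, and the 4000-frame chunk size are choices made for this illustration.

```python
import json
import wave


def result_text(result_json: str) -> str:
    """Pull the "text" field out of a JSON result string returned by Vosk."""
    return json.loads(result_json).get("text", "")


def transcribe(wav_path: str, model_dir: str = "model") -> str:
    """Feed a WAV file to Vosk chunk by chunk and return the recognized text."""
    # Imported here so result_text can be used even without Vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(model_dir)
    pieces = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if len(data) == 0:
                break
            # AcceptWaveform returns True when an utterance has been completed.
            if rec.AcceptWaveform(data):
                pieces.append(result_text(rec.Result()))
        # Flush whatever is still buffered at the end of the file.
        pieces.append(result_text(rec.FinalResult()))
    return " ".join(piece for piece in pieces if piece)
```

With the model folder in place, print(transcribe("my_voice_16khz.wav")) should print the recognized text.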
Done! 🎉
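I'm a Japanese speaker, so I ran recognition on a Japanese audio file.
The text of the audio is "ご視聴ありがとうございました！グッドボタンとチャンネル登録よろしくお願いします！" ("Thank you for watching! Please press the like button and subscribe to the channel!").
There are some differences from this text in the output, but Vosk also recognized Japanese kanji characters. 🀄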
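If you want a rough number for how close the output is to the original text, one simple option is a character-level similarity ratio from Python's difflib. This is only a sketch and not a proper speech recognition metric. The function name similarity is my own, and it assumes the recognizer may put spaces between tokens, so whitespace is removed before comparing.

```python
from difflib import SequenceMatcher


def similarity(reference: str, recognized: str) -> float:
    """Return a ratio between 0.0 and 1.0 describing how similar two texts are."""
    # Drop all whitespace so only the characters themselves are compared.
    ref = "".join(reference.split())
    hyp = "".join(recognized.split())
    return SequenceMatcher(None, ref, hyp).ratio()
```

Punctuation such as "！" in the reference text will count as a difference if the recognizer does not output it, so you may want to strip it first.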
The complete commands are below.
poetry add vosk@<version>
curl -O https://raw.githubusercontent.com/alphacep/vosk-api/v0.3.32/python/example/test_simple.py
curl -O https://alphacephei.com/vosk/models/vosk-model-small-ja-0.22.zip
unzip vosk-model-small-ja-0.22.zip
mv vosk-model-small-ja-0.22/ model/
poetry run python test_simple.py my_voice_16khz.wav
The code is on GitHub and Replit.
I hope you'll enjoy Vosk too! Thank you.
kama-meshi / HelloVosk
Sample Vosk repl with Python.
Let's recognize this voice
poetry install
poetry run python main.py
My repl is also on Replit.
- Voice: こえやさん