<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: kama-meshi</title>
    <description>The latest articles on DEV Community by kama-meshi (@kama_meshi).</description>
    <link>https://dev.to/kama_meshi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F867614%2Fc75fcdd0-5c61-464d-a70f-123e92517ef2.gif</url>
      <title>DEV Community: kama-meshi</title>
      <link>https://dev.to/kama_meshi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kama_meshi"/>
    <language>en</language>
    <item>
      <title>The awesome speech recognition toolkit: Vosk!</title>
      <dc:creator>kama-meshi</dc:creator>
      <pubDate>Thu, 02 Jun 2022 09:28:29 +0000</pubDate>
      <link>https://dev.to/kama_meshi/the-awesome-speech-recognition-toolkit-vosk-i6o</link>
      <guid>https://dev.to/kama_meshi/the-awesome-speech-recognition-toolkit-vosk-i6o</guid>
      <description>&lt;h2&gt;
  
  
  What is Vosk?
&lt;/h2&gt;

&lt;p&gt;Vosk is a speech recognition toolkit that supports more than 20 languages.&lt;br&gt;
The language model is lightweight at about 50 MB and easy to embed, so you can run speech recognition completely offline.&lt;/p&gt;

&lt;p&gt;Vosk provides bindings for Python, Java, C#, and also Node.js!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supports 20+ languages and dialects &lt;/li&gt;
&lt;li&gt;Works offline, even on lightweight devices - Raspberry Pi, Android, iOS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See &lt;a href="https://alphacephei.com/vosk/" rel="noopener noreferrer"&gt;Vosk's page&lt;/a&gt; for details.&lt;/p&gt;
&lt;h2&gt;
  
  
  Let's try!
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Install Vosk
&lt;/h3&gt;

&lt;p&gt;Now you can try Vosk with Python!&lt;br&gt;
Vosk can be installed with pip. However, I prefer &lt;a href="https://python-poetry.org/" rel="noopener noreferrer"&gt;Poetry&lt;/a&gt;, so I'll install it with that instead.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ Poetry tries to install the latest version (0.3.38), but that version is not compatible with macOS. So I explicitly specified the version that pip installs instead. (As of 2022-05-19)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can download the example Python script from the &lt;a href="https://github.com/alphacep/vosk-api/blob/v0.3.32/python/example/test_simple.py" rel="noopener noreferrer"&gt;Vosk examples&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Download the language model
&lt;/h3&gt;

&lt;p&gt;The language models are available &lt;a href="https://alphacephei.com/vosk/models" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Download one, extract the zip file, and rename the extracted folder to "model" in your working directory.&lt;/p&gt;
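&lt;p&gt;If you would rather script the extract-and-place step, here is a minimal Python sketch. The function name install_model is hypothetical, and the code assumes the archive contains exactly one top-level folder, which you should check for the model you download.&lt;/p&gt;

```python
import shutil
import zipfile
from pathlib import Path


def install_model(zip_path, target_dir="model"):
    """Extract a model archive and move its top-level folder to target_dir."""
    zip_path = Path(zip_path)
    target = Path(target_dir)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    # Extract into a scratch folder next to the archive first.
    extract_root = zip_path.parent / (zip_path.stem + "_extracted")
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds exactly one top-level folder.
        top_level = {name.split("/")[0] for name in archive.namelist()}
        if len(top_level) != 1:
            raise ValueError("expected exactly one top-level folder")
        archive.extractall(extract_root)
    # Move the extracted folder into place, then drop the scratch folder.
    shutil.move(str(extract_root / top_level.pop()), str(target))
    shutil.rmtree(extract_root)
    return target
```

&lt;p&gt;For example, install_model("vosk-model-small-ja-0.22.zip") should leave the model files in a folder named "model".&lt;/p&gt;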
&lt;h3&gt;
  
  
  Prepare an audio file
&lt;/h3&gt;

&lt;p&gt;You will need an audio file in the correct format: 16 kHz, 16-bit, mono PCM.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you are an English speaker, you can use &lt;a href="https://github.com/alphacep/vosk-api/blob/v0.3.32/python/example/test.wav" rel="noopener noreferrer"&gt;the test audio&lt;/a&gt; from the Vosk examples.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can convert an existing recording with &lt;a href="https://ffmpeg.org/" rel="noopener noreferrer"&gt;ffmpeg&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

ffmpeg &lt;span class="nt"&gt;-i&lt;/span&gt; my_voice.wav &lt;span class="nt"&gt;-ar&lt;/span&gt; 16000 &lt;span class="nt"&gt;-ac&lt;/span&gt; 1 &lt;span class="nt"&gt;-c:a&lt;/span&gt; pcm_s16le my_voice_16khz.wav


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
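&lt;p&gt;Before running recognition, it can help to confirm that the converted file really matches the format above. The following is a small sketch that uses only Python's standard wave module. The function name check_vosk_wav is hypothetical, and the checks simply mirror the requirement stated in this post.&lt;/p&gt;

```python
import wave


def check_vosk_wav(path, expected_rate=16000):
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    # wave.open raises wave.Error if the file is not a valid WAV container.
    with wave.open(str(path), "rb") as wav_file:
        if wav_file.getnchannels() != 1:
            problems.append("not mono")
        if wav_file.getsampwidth() != 2:
            problems.append("not 16-bit")
        if wav_file.getcomptype() != "NONE":
            problems.append("not uncompressed PCM")
        rate = wav_file.getframerate()
        if rate != expected_rate:
            problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    return problems
```

&lt;p&gt;Calling check_vosk_wav("my_voice_16khz.wav") should return an empty list when the conversion worked.&lt;/p&gt;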
&lt;h3&gt;
  
  
  Run Vosk
&lt;/h3&gt;

&lt;p&gt;Run the Python script...&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9crv5rjm4ywx4rs7lj4z.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9crv5rjm4ywx4rs7lj4z.gif" alt="Run in terminal"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Done!! 🎉&lt;br&gt;
The result differs slightly from the original text, but Vosk even recognized Japanese kanji characters. 🀄&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I'm a Japanese speaker, so I used a Japanese audio file.&lt;br&gt;
The text spoken in the audio is "ご視聴ありがとうございました！グッドボタンとチャンネル登録よろしくお願いします！" ("Thank you for watching! Please hit the like button and subscribe to the channel!").&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The complete commands are below.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

poetry add vosk@0.3.32
curl &lt;span class="nt"&gt;-O&lt;/span&gt; https://raw.githubusercontent.com/alphacep/vosk-api/v0.3.32/python/example/test_simple.py
curl &lt;span class="nt"&gt;-O&lt;/span&gt; https://alphacephei.com/vosk/models/vosk-model-small-ja-0.22.zip
unzip vosk-model-small-ja-0.22.zip
&lt;span class="nb"&gt;mv &lt;/span&gt;vosk-model-small-ja-0.22/ model/
poetry run python test_simple.py my_voice_16khz.wav


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
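&lt;p&gt;To show roughly what happens during recognition, here is a minimal sketch built on the Vosk Python binding (Model, KaldiRecognizer, AcceptWaveform, Result, FinalResult). It is not the exact content of test_simple.py. The helper names extract_text and transcribe are hypothetical, and the code assumes the model folder is named "model" as in the commands above.&lt;/p&gt;

```python
import json
import wave


def extract_text(result_json):
    """Pull the recognized text out of one Vosk JSON result string."""
    return json.loads(result_json).get("text", "")


def transcribe(wav_path, model_dir="model"):
    """Recognize a WAV file and return the text as one string."""
    # Imported here so the helper above also works without vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(model_dir)
    pieces = []
    with wave.open(wav_path, "rb") as wav_file:
        recognizer = KaldiRecognizer(model, wav_file.getframerate())
        while True:
            data = wav_file.readframes(4000)
            if len(data) == 0:
                break
            # AcceptWaveform returns a true value when an utterance has ended.
            if recognizer.AcceptWaveform(data):
                pieces.append(extract_text(recognizer.Result()))
        # Flush whatever audio is still buffered in the recognizer.
        pieces.append(extract_text(recognizer.FinalResult()))
    return " ".join(piece for piece in pieces if piece)
```

&lt;p&gt;With the model in place, print(transcribe("my_voice_16khz.wav")) should print the recognized text.&lt;/p&gt;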


&lt;p&gt;The code is on GitHub and Replit.&lt;br&gt;
I hope you'll enjoy Vosk too! Thank you.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/kama-meshi" rel="noopener noreferrer"&gt;
        kama-meshi
      &lt;/a&gt; / &lt;a href="https://github.com/kama-meshi/HelloVosk" rel="noopener noreferrer"&gt;
        HelloVosk
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Sample Vosk repl with Python.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Hello Vosk&lt;/h1&gt;

&lt;/div&gt;
&lt;p&gt;This is a sample repl for &lt;a href="https://alphacephei.com/vosk/" rel="nofollow noopener noreferrer"&gt;Vosk&lt;/a&gt; with Python.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Sample voice&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;Let's recognize this &lt;a href="https://www.koeyasan.com/voices/558" rel="nofollow noopener noreferrer"&gt;voice&lt;/a&gt; 🎤&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"ご視聴ありがとうございました！グッドボタンとチャンネル登録よろしくお願いします！"&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Usage&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;poetry install
poetry run python main.py&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;My repl is on &lt;a href="https://replit.com" rel="nofollow noopener noreferrer"&gt;Replit&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://replit.com/@kama-meshi/HelloVosk" rel="nofollow noopener noreferrer"&gt;https://replit.com/@kama-meshi/HelloVosk&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Special Thanks&lt;/h2&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Voice: &lt;a href="https://www.koeyasan.com/" rel="nofollow noopener noreferrer"&gt;こえやさん&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/kama-meshi/HelloVosk" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;



&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;a href="https://replit.com/@kama-meshi/HelloVosk" rel="noopener noreferrer"&gt;
      replit.com
    &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>node</category>
    </item>
  </channel>
</rss>
