Dilek Karasoy for Picovoice

End-to-End Speech Recognition with Python

Let's start with why you should use the Picovoice Python SDK when there are alternative libraries and in-depth tutorials on speech recognition with Python.

  1. Private - processes voice data on the device
  2. Cross-platform - Linux, macOS, Windows, Raspberry Pi, …
  3. Real-time - no network latency, because audio is processed on the device

I guess I do not need to say "accurate". I haven't seen any vendor claiming mediocre accuracy 🙃


Now, let's get started!

1 - Install Picovoice

pip3 install picovoice

2 - Create a Picovoice Instance
The Picovoice SDK consists of Porcupine Wake Word, which enables custom hotwords, and Rhino Speech-to-Intent, which enables custom voice commands. Together they enable hands-free experiences. Take this utterance as an example:
"Porcupine, set an alarm for 1 hour and 13 seconds."
Porcupine detects the hotword "Porcupine", then Rhino captures the user's intent and returns the intent and its details, as seen below:

{
    is_understood: true,
    intent: setAlarm,
    slots: {
        hours: 1,
        seconds: 13
    }
}
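Once Rhino returns an inference like the one above, the slots still have to be turned into something the application can use. Here is a minimal sketch, assuming slot values arrive as integers or numeric strings keyed by `hours`, `minutes`, and `seconds`. The `minutes` slot and the helper name `alarm_seconds` are illustrative and not part of the SDK.

```python
def alarm_seconds(slots):
    """Convert setAlarm slots into a total duration in seconds.

    Assumes each slot value is an integer or a numeric string; missing
    slots count as zero. `alarm_seconds` is a hypothetical helper.
    """
    hours = int(slots.get("hours", 0))
    minutes = int(slots.get("minutes", 0))
    seconds = int(slots.get("seconds", 0))
    return hours * 3600 + minutes * 60 + seconds


# Slots from the example inference above: 1 hour and 13 seconds.
print(alarm_seconds({"hours": "1", "seconds": "13"}))  # 3613
```

A helper like this could be called from the inference callback whenever the intent is `setAlarm`.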

To create a Picovoice instance we need an AccessKey, paths to the Porcupine and Rhino models, and callbacks for hotword detection and inference completion. For simplicity, we'll use pre-trained Porcupine and Rhino models; however, you can train custom ones on the Picovoice Console. While exploring the Picovoice Console, grab your AccessKey, too! Signing up for the Picovoice Console is free, with no credit card required.

from picovoice import Picovoice

keyword_path = ...  # path to Porcupine wake word file (.PPN)

def wake_word_callback():
    # Called when the wake word is detected.
    pass

context_path = ...  # path to Rhino context file (.RHN)

def inference_callback(inference):
    # Called when Rhino has finished inferring the intent.
    print(inference.is_understood)
    if inference.is_understood:
        print(inference.intent)
        for k, v in inference.slots.items():
            print(f"{k} : {v}")

pv = Picovoice(
    access_key="${YOUR_ACCESS_KEY}",  # replace with your AccessKey
    keyword_path=keyword_path,
    wake_word_callback=wake_word_callback,
    context_path=context_path,
    inference_callback=inference_callback)

Do not forget to replace model path and AccessKey placeholders.
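One way to avoid hard-coding these values is to read the AccessKey from the environment and check the model paths before constructing the engine. The sketch below is only a suggestion: the variable name `PICOVOICE_ACCESS_KEY` and both helper names are assumptions made for illustration, not something the SDK defines.

```python
import os


def load_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Return the AccessKey stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError(f"Set {env_var} to your Picovoice AccessKey")
    return key


def check_model_paths(*paths):
    """Raise FileNotFoundError for the first model file that is missing."""
    for path in paths:
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Model file not found: {path}")
```

With these in place, `access_key=load_access_key()` can replace the placeholder, and `check_model_paths(keyword_path, context_path)` fails early with a readable message if a path is wrong.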

3 - Process Audio with Picovoice
Pass frames of audio to the engine:

pv.process(audio_frame)
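If the audio comes from a buffer rather than a live recorder, it has to be cut into frames first. The sketch below assumes the engine wants fixed-length frames of single-channel 16-bit PCM and that the required length is available as `pv.frame_length`. The helper `split_into_frames` is hypothetical.

```python
def split_into_frames(pcm, frame_length):
    """Yield consecutive frames of exactly `frame_length` samples.

    A trailing partial frame is dropped, since this sketch assumes the
    engine only accepts full-length frames.
    """
    for start in range(0, len(pcm) - frame_length + 1, frame_length):
        yield pcm[start:start + frame_length]


# Ten samples cut into frames of four: the last two samples are dropped.
samples = list(range(10))
print([len(frame) for frame in split_into_frames(samples, 4)])  # [4, 4]
```

Each yielded frame could then be handed to `pv.process(frame)` in turn.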

4 - Read Audio from the Microphone
Install [pvrecorder](https://pypi.org/project/pvrecorder/) and read the audio:

from pvrecorder import PvRecorder

# `-1` is the default input audio device.
# The frame length should match what the engine expects.
recorder = PvRecorder(device_index=-1, frame_length=pv.frame_length)
recorder.start()

Read audio frames from the recorder and pass them to the .process method:

pcm = recorder.read()
pv.process(pcm)
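In practice the two calls above run repeatedly for as long as the app is listening. Here is a minimal sketch of such a loop, assuming the recorder has already been started and exposes `read()` and `stop()`, and the engine exposes `process()`. The function `run_audio_loop` and the `should_stop` callable are hypothetical names introduced for illustration.

```python
def run_audio_loop(recorder, engine, should_stop):
    """Feed frames from `recorder` to `engine` until `should_stop()` is true.

    `recorder` must already be started and provide read() and stop();
    `engine` must provide process(). The recorder is stopped even if
    processing raises an exception.
    """
    try:
        while not should_stop():
            engine.process(recorder.read())
    finally:
        recorder.stop()
```

Since a GUI main loop blocks the main thread, one option is to run this function in a `threading.Thread` and pass the `is_set` method of a `threading.Event` as `should_stop`.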

5 - Create a GUI with Tkinter
Tkinter is the standard GUI framework shipped with Python. Create a frame, add a label showing the remaining time to it, then launch:

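import tkinter as tk

def on_close():
    # Placeholder close handler: destroys the window.
    # Extend it to stop audio recording if a recorder is running.
    window.destroy()

window = tk.Tk()
time_label = tk.Label(window, text='00 : 00 : 00')
time_label.pack()

window.protocol('WM_DELETE_WINDOW', on_close)

window.mainloop()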
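To make the label actually count down, the remaining time has to be formatted and refreshed once per second. The sketch below shows one possible approach using Tkinter's `after` scheduling. The helpers `format_remaining` and `tick` are hypothetical names, and the `HH : MM : SS` layout is taken from the label text above.

```python
def format_remaining(total_seconds):
    """Format a non-negative number of seconds as 'HH : MM : SS'."""
    hours, rest = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d} : {minutes:02d} : {seconds:02d}"


def tick(window, label, remaining):
    """Show `remaining` seconds on `label`, then reschedule one second later."""
    label.configure(text=format_remaining(remaining))
    if remaining > 0:
        # `after` asks the Tk event loop to call tick again in 1000 ms.
        window.after(1000, tick, window, label, remaining - 1)


print(format_remaining(3613))  # 01 : 00 : 13
```

Calling `tick(window, time_label, 3613)` before `window.mainloop()` would start a countdown for the alarm from the earlier example.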

Some resources:
Source code for the tutorial
Original Medium Article
Picovoice SDK
Picovoice Console