DEV Community

Dilek Karasoy for Picovoice

Day 4: Voice Activity Detection with Python

When it comes to speech recognition, most people only know about Automatic Speech Recognition (ASR). Voice Activity Detection (VAD), which determines whether an audio signal contains human speech, is a fundamental piece of any speech-related product. Voice AI vendors integrate VAD into their ASR engines but do not offer it separately. Picovoice initially built Cobra for internal use, then made it public in response to market demand, since the only alternative was Google’s WebRTC VAD, which does not work on all platforms.

You can read more on what voice activity detection is elsewhere; today is the day to learn how to detect voice activity with the Cobra VAD Python SDK:

1. Install VAD SDK

pip3 install pvcobra

Sign up for Picovoice Console if you haven't already (it's free) to grab your AccessKey.

2. Implement in Python

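import pvcobra

# Replace the placeholder with the AccessKey from Picovoice Console
access_key = "${ACCESS_KEY}"
handle = pvcobra.create(access_key=access_key)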

Once initialized, the handle reports the sample rate it requires as handle.sample_rate and the number of samples it expects per frame as handle.frame_length. The engine accepts 16-bit linearly encoded PCM and operates on single-channel audio.
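Because the engine consumes audio one fixed-length frame at a time, raw PCM usually has to be cut into frames first. The following is a minimal sketch of that step, not part of the Cobra SDK: split_into_frames is a hypothetical helper, the input is assumed to be little-endian 16-bit mono PCM bytes, and the frame length of 512 is only an illustrative value (in real code, read it from handle.frame_length).

```python
import struct


def split_into_frames(pcm_bytes, frame_length):
    """Split raw 16-bit little-endian mono PCM bytes into full frames of samples.

    Hypothetical helper for illustration; trailing samples that do not
    fill a whole frame are dropped.
    """
    num_samples = len(pcm_bytes) // 2  # 2 bytes per 16-bit sample
    samples = struct.unpack("<%dh" % num_samples, pcm_bytes[:num_samples * 2])
    num_frames = num_samples // frame_length
    return [
        list(samples[i * frame_length:(i + 1) * frame_length])
        for i in range(num_frames)
    ]


# Illustrative input: 1000 silent samples. The frame length 512 is an
# assumption here; use handle.frame_length with a real Cobra handle.
silence = struct.pack("<1000h", *([0] * 1000))
frames = split_into_frames(silence, 512)
print(len(frames), len(frames[0]))
```

Each resulting frame is a list of integers that could then be passed to the engine one at a time.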

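def get_next_audio_frame():
    # Placeholder: return handle.frame_length samples of 16-bit mono PCM
    # recorded at handle.sample_rate
    pass

while True:
    # process() returns the probability (0.0 to 1.0) that the frame contains voice
    voice_probability = handle.process(get_next_audio_frame())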
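The loop above yields one probability per frame, so an application still has to decide what counts as voice. One common approach, sketched here under stated assumptions, is a threshold with a short "hangover" so that brief dips in probability do not cut speech off mid-word. The function detect_voice, the 0.5 threshold, and the hangover length are all hypothetical choices for illustration and are not part of the Cobra SDK.

```python
def detect_voice(probabilities, threshold=0.5, hangover_frames=3):
    """Return a per-frame True/False voice decision.

    A frame counts as voice if its probability reaches the threshold, or if
    a frame at or above the threshold occurred within the previous
    `hangover_frames` frames. All parameter values are illustrative.
    """
    decisions = []
    frames_since_voice = hangover_frames + 1  # start in the "no voice" state
    for p in probabilities:
        if p >= threshold:
            frames_since_voice = 0
        else:
            frames_since_voice += 1
        decisions.append(frames_since_voice <= hangover_frames)
    return decisions


# Made-up probabilities standing in for successive handle.process() results
probs = [0.1, 0.9, 0.2, 0.1, 0.1, 0.1, 0.8]
decisions = detect_voice(probs, threshold=0.5, hangover_frames=2)
print(decisions)
```

A lower threshold catches more speech at the cost of more false triggers, so the right value depends on the application and the noise in its environment.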

Congratulations!
