DEV Community

Divyanshu Shekhar

Speech Recognition in Python

Installing Speech Recognition Packages in Python

Many speech recognition packages are available on PyPI. Some of them are:

  1. apiai

  2. assemblyai

  3. google-cloud-speech

  4. pocketsphinx

  5. SpeechRecognition

  6. watson-developer-cloud

  7. wit

In this post, we will focus primarily on the SpeechRecognition package.

SpeechRecognition Library

$ pip install SpeechRecognition

This installs the SpeechRecognition package. We can now use its functions for speech recognition and move a step closer to building our voice assistant.

SpeechRecognition uses our machine's microphone to capture speech and then converts it to a string. Microphone access requires PyAudio, so we need to install that as well.

Running pip install pyaudio directly can fail with an error, so this time we will install PyAudio from a downloaded wheel file instead.

Download the PyAudio .whl file from the link, then change to the directory that contains the downloaded file.

PyAudio whl install

$ pip install .\PyAudio-0.2.11-cp39-cp39-win_amd64.whl

Another workaround is to install pipwin first and then use pipwin to install PyAudio.

$ pip install pipwin
$ pipwin install pyaudio

The packages needed for speech recognition are now installed, so we can start writing the speech recognition code in Python.
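Before writing any recognition code, it can help to confirm that both packages are importable. The following is a minimal sketch rather than part of the setup steps above. The helper name missing_packages is hypothetical, and the check uses only the standard library's importlib.util.find_spec. Note that the import names (speech_recognition and pyaudio) differ from the PyPI names.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the import names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in import_names if find_spec(name) is None]


# PyPI name -> import name: SpeechRecognition -> speech_recognition,
# PyAudio -> pyaudio
missing = missing_packages(["speech_recognition", "pyaudio"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All speech recognition packages are importable.")
```

If either name is reported as not installed, repeat the matching install step before moving on.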

Speech Recognition in Python

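import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    print("Listening...")
    # Calibrate the energy threshold against background noise
    recognizer.adjust_for_ambient_noise(source)
    # Record until the speaker pauses
    audio = recognizer.listen(source)
    try:
        print("Recognizing...")
        query = recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The speech was unintelligible
        print("Could not understand audio")
    except sr.RequestError as error:
        # The recognition service could not be reached
        print(f"Could not request results: {error}")
    else:
        # Only print when recognition succeeded, so query is always defined
        print(query.lower())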

Let’s understand the code line by line.

First, import the speech_recognition library. We import it under the alias sr because the full module name is quite long.

Recognizer class in Speech Recognition Library

Recognizer instance

recognizer = sr.Recognizer()

After importing, the first step is to create an instance of the Recognizer class provided by the speech_recognition library.

The recognizer variable now holds that Recognizer instance, and we will use it to call the class's methods.
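To make the idea of working through the instance more concrete, here is a small sketch that reads a few settings from a fresh Recognizer. It assumes the SpeechRecognition package is installed; if the import fails, the helper returns None instead of raising. The helper name recognizer_settings is hypothetical and not part of the library.

```python
try:
    import speech_recognition as sr
except ImportError:
    # Assumption: the package may be absent, so the sketch degrades gracefully
    sr = None


def recognizer_settings():
    """Return a few attributes of a new Recognizer, or None if unavailable."""
    if sr is None:
        return None
    recognizer = sr.Recognizer()
    return {
        # Minimum audio energy that is treated as speech
        "energy_threshold": recognizer.energy_threshold,
        # Whether the energy threshold adapts to ambient noise over time
        "dynamic_energy_threshold": recognizer.dynamic_energy_threshold,
        # Seconds of silence that mark the end of a phrase
        "pause_threshold": recognizer.pause_threshold,
    }


settings = recognizer_settings()
print(settings)
```

These are the same settings that adjust_for_ambient_noise and listen rely on in the full example, which is why a single instance is created once and reused.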

Read the whole post, Python Speech Recognition, at the original post.
