We all know Python is cheesy easy, and even when typing the code gets a little uncomfortable, there is always an awesome package that comes along to ease the process.
...okay okay enough talking
intro
A month ago I was working on a deep learning model that detects a list of words (trained on the Google Speech Commands dataset). I give the model a one-second .wav file and the model does its magic. While I was finishing the project, it came to my mind: why can't I give the model direct input from my microphone?
So I started searching the web for how to get input from the mic and save that same data to a .wav file. To be honest, I didn't find any easy, Pythonic way at all. Here is what I found, and how to get output from your microphone:
getting output from your microphone with python
First I'll show my way, and then you can see how it works under the hood (long, raw Python code).
- first, install pyaudio on your machine
- now clone the Recwpy repo and extract the .zip file
- in the extracted folder, open the main.py file and let's type
from record import Mic
mic = Mic() # initialize the Mic class
mic.record_toFile(duration=10) # record a 10-sec wav file
And done, as simple as that: you have now built a recorder app with Python that records a 10-second .wav file.
Really, no extra work at all. See the docs here.
Now, here is how these two lines work under the hood.
First, we import the numpy and pyaudio packages, then:
import numpy as np
import pyaudio

p = pyaudio.PyAudio() # initialize PyAudio
# then configure how pyaudio will handle the mic input and output stream of data
data_chunk = 1024 # number of frames read from the stream per buffer
stream = p.open(
    format=pyaudio.paInt16, # every sample from the mic is an int16 in [-2**15, 2**15 - 1]
    channels=1, # 1 = mono, 2 = stereo
    rate=44100, # number of samples recorded every second
    input=True, # get input from your mic
    output=True, # also open the stream for playback (not required just to record)
    frames_per_buffer=data_chunk)
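To get a feel for these settings, here is a small sketch of the arithmetic behind them. The helper name `buffer_stats` is made up for illustration, and the buffer size of 1024 frames is only an assumed example value, not something PyAudio requires.

```python
def buffer_stats(rate, frames_per_buffer, sample_width_bytes=2, channels=1):
    """Return (bytes per buffer, buffer duration in milliseconds)."""
    n_bytes = frames_per_buffer * sample_width_bytes * channels
    duration_ms = 1000 * frames_per_buffer / rate
    return n_bytes, duration_ms


# 16-bit mono audio at 44100 Hz with an assumed buffer of 1024 frames
n_bytes, duration_ms = buffer_stats(rate=44100, frames_per_buffer=1024)
print(n_bytes, round(duration_ms, 1))  # 2048 23.2
```

So with these values, each buffer holds 2048 bytes and covers roughly 23 ms of audio.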
Now you can get output from the microphone... right? Yes and no. Let me show you what I mean:
byte_data = stream.read(data_chunk) # read one buffer of frames from the mic
The line above gives us the data as bytes, which we need to convert to integers:
int_data = np.frombuffer(byte_data, dtype=np.int16)
print(int_data)
# output [12,0,-1,...,458,13547]
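If you want to see what this conversion does without a microphone, here is a minimal sketch that uses the standard library's `array` module instead of NumPy. The sample values are invented for the demo. Every two bytes are read back as one signed 16-bit integer, which is the same idea as `np.frombuffer(..., dtype=np.int16)`.

```python
from array import array

# Pretend these five 16-bit samples came from the mic (made-up values)
samples = array("h", [12, 0, -1, 458, 13547])
byte_data = samples.tobytes()  # raw bytes, like the ones stream.read() returns

# Interpret the raw bytes as signed 16-bit integers again
decoded = array("h")
decoded.frombytes(byte_data)

print(len(byte_data), list(decoded))  # 10 [12, 0, -1, 458, 13547]
```

Five samples take ten bytes because each int16 sample is two bytes wide.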
Yeah... you just got output from your mic!
Now let's continue. We just got a NumPy array of integers, but that is not yet something we can store in a file: we need to gather all the upcoming arrays for a whole second. Yes, my friend, a single read like the one above covers only a tiny fraction of a second.
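RATE = 44100 # sample rate used when opening the stream
CHUNK = data_chunk # frames read per call, as configured above
Frames = []
for _ in range(int(RATE / CHUNK * 1)): # enough reads to cover 1 second
    byte_data = stream.read(CHUNK)
    int_data = np.frombuffer(byte_data, dtype=np.int16)
    Frames.append(int_data) # append every array of data to the list
# now we need all these arrays to be one big array (stack them together)
Frames = np.concatenate(Frames, axis=0)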
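The loop count above is plain arithmetic, so here is a tiny sketch of it for two example buffer sizes. The function name `reads_per_recording` is hypothetical, and 1024 is just an assumed buffer size for comparison.

```python
def reads_per_recording(rate, chunk, seconds=1):
    """Number of whole buffers to read to cover `seconds` of audio."""
    return int(rate / chunk * seconds)


for chunk in (4, 1024):
    n_reads = reads_per_recording(44100, chunk)
    # buffer size, number of reads, total samples captured
    print(chunk, n_reads, n_reads * chunk)
# 4 11025 44100
# 1024 43 44032
```

Note that when the rate is not an exact multiple of the buffer size, `int()` rounds down, so the recording ends up slightly shorter than the requested duration.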
Now we have one big array containing all the microphone output for one second, so all we have to do is store it in a wave file and we are done.
import wave
output_file = "demo_recoderded.wav"
# configure the wav file by setting some parameters which we defined before
wf = wave.open(output_file, 'wb')
wf.setnchannels(1) # number of channels
wf.setsampwidth(p.get_sample_size(pyaudio.paInt16)) # bytes per sample
wf.setframerate(44100) # the frame rate
wf.writeframes(Frames.tobytes()) # write the samples we stored in the array
wf.close() # finish writing the file
# release the microphone when you are done
stream.stop_stream()
stream.close()
p.terminate()
The sample width is the number of bytes used to store each sample; for paInt16 that is 2 bytes (16 bits).
Now check your folder: you should see an audio file called "demo_recoderded.wav". Boom! You did it.
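If you don't have a microphone at hand, you can still try the file-writing part. The sketch below is only an illustration: it writes one second of a synthetic 440 Hz tone (a stand-in for mic data) with the same settings as above, then reads the header back to confirm what was stored. The file name `demo_tone.wav` and the tone itself are assumptions made for the demo.

```python
import math
import os
import tempfile
import wave
from array import array

RATE = 44100  # samples per second, same rate as the recording settings

# One second of a 440 Hz tone as signed 16-bit samples (stand-in for mic data)
tone = array(
    "h",
    (int(10000 * math.sin(2 * math.pi * 440 * n / RATE)) for n in range(RATE)),
)

path = os.path.join(tempfile.mkdtemp(), "demo_tone.wav")  # hypothetical file name
with wave.open(path, "wb") as wf:
    wf.setnchannels(1)   # mono
    wf.setsampwidth(2)   # 2 bytes = 16 bits per sample
    wf.setframerate(RATE)
    wf.writeframes(tone.tobytes())

# Read the header back to confirm what was stored
with wave.open(path, "rb") as wf:
    info = (wf.getnchannels(), wf.getsampwidth(), wf.getframerate(), wf.getnframes())

print(info)  # (1, 2, 44100, 44100)
```

The four numbers are the channel count, the sample width in bytes, the frame rate, and the number of frames, so one second of audio at 44100 Hz gives 44100 frames.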
Let's wrap up with the conclusion
Star and clone the repository and get the mic up and running in just two lines of code. What more could you expect from this section?
For more information and docs about the repo, click here.
This is the first time in my entire life that I have written an article, so tell me what you think and give me your feedback... Thank you.
Credits:
Photo by Robinson Recalde on Unsplash
and the awesome tutorial by Mark Jay