@michaeljolley I created a sample that sends audio directly from the microphone to Deepgram.
The repository for this sample is here: github.com/FrankPohl/DeepGram.NETS...
But there is one thing I do not understand. I resample the audio input to PCM with a sample rate of 16000, but the Deepgram options specify a sample rate of 44100. If I change that to 16000, I do not get a transcription. Why is that?
That's a REALLY good question @fp! That code isn't resampling; it's converting the audio from 32-bit to 16-bit. IeeeFloat is a 32-bit format. We're basically converting a C# float into a C# short. This blog post does a good job of describing the differences between the two.
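To make the distinction concrete, here is a minimal sketch of a bit-depth conversion on its own, written in Python rather than C# and not taken from the repository. The function name `float_to_pcm16` is hypothetical, and the sketch assumes the input samples are floats in the range -1.0 to 1.0. Note that the number of samples, and therefore the sample rate, does not change.

```python
import struct

def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes.

    Only the bit depth changes: one input sample yields one output sample.
    """
    out = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))                # clamp out-of-range values
        out += struct.pack("<h", int(s * 32767))  # scale to the 16-bit range
    return bytes(out)

pcm = float_to_pcm16([0.0, 0.5, -1.0])
print(len(pcm) // 2)  # number of 16-bit samples, same as the input count
```

Each output sample takes 2 bytes instead of 4, so the byte count halves while the sample count stays the same.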
@michaeljolley Which code is not resampling? The code in my example on GitHub converts from float to short, that's right. But in a second step it does resample, because I average 3 consecutive input values into one output value. I think this means that the sampling rate is reduced from 48000 to 16000. The WAV file that is written in parallel has this sampling rate and sounds all right.
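The averaging step described above can be sketched roughly as follows. This is an illustration in Python, not the code from the repository; the function name `downsample_by_averaging` and the constant `INPUT_RATE` are hypothetical, and the sketch assumes a 48000 Hz mono input with leftover samples at the end simply dropped.

```python
def downsample_by_averaging(samples, factor=3):
    """Reduce the sample rate by averaging each group of `factor` samples.

    Trailing samples that do not fill a complete group are dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]

INPUT_RATE = 48000              # assumed microphone sample rate
output_rate = INPUT_RATE // 3   # one output value per 3 inputs

one_second = [0.0] * INPUT_RATE
print(len(downsample_by_averaging(one_second)), output_rate)
```

Unlike a pure bit-depth conversion, this step does change the number of samples per second, so the output is a third of the input rate.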