Hi, I am trying to solve a very similar problem here.
I have an audio clip where a person says a particular mantra once.
For example: Om Namah Shivay. This is the reference (input) recording.
Now the person starts chanting the same mantra over and over, without stopping:
Om Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah ShivayOm Namah Shivay
Note that there is no fixed silence between repetitions.
I need to show a running count of how many times he has said it correctly, in real time, as he speaks.
How can I achieve this using Python or machine learning?
Note that the mantra can be completely different and can also be very long, and he may say it at various volumes and pitches.
Try using Whisper to transcribe the audio, then count how many times the mantra appears in the transcript.
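One way to act on the Whisper suggestion, as a rough sketch rather than a tested solution: transcribe the chanting to text, strip spaces and punctuation, and count non-overlapping occurrences of the mantra. This assumes the `openai-whisper` package is installed and that Whisper spells the mantra the same way every time, which may not hold for Sanskrit. The helper names `normalize`, `count_mantra`, and `transcribe`, and the file name `chant.wav`, are made up for illustration.

```python
import re


def normalize(text):
    """Lowercase and drop spaces/punctuation so word boundaries do not matter."""
    return re.sub(r"[\W_]+", "", text.lower())


def count_mantra(transcript, mantra):
    """Count non-overlapping occurrences of the mantra in the transcript."""
    target = normalize(mantra)
    if not target:
        return 0
    return normalize(transcript).count(target)


def transcribe(path, model_name="base"):
    """Transcribe an audio file with Whisper (assumes openai-whisper is installed)."""
    import whisper  # imported lazily so the counting helpers work without it

    model = whisper.load_model(model_name)
    return model.transcribe(path)["text"]


if __name__ == "__main__":
    # Counting works on any text, with or without gaps between repetitions.
    print(count_mantra("Om Namah ShivayOm Namah Shivay om namah shivay",
                       "Om Namah Shivay"))
    # With a real recording (hypothetical file name):
    # print(count_mantra(transcribe("chant.wav"), "Om Namah Shivay"))
```

For a live count, the same counting step could be run repeatedly on the transcript of the audio captured so far. Whisper works on recorded chunks rather than a continuous stream, so the count would lag behind the speaker by at least one chunk.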