As humans, we love to communicate. Even those of us who aren't social enough to talk to people are impressed by interactive technology and AI, and by how a machine can answer with a human-like response. Maybe that's why chatbots are everywhere on the internet now! I mean, who wouldn't like to talk to something and get an immediate, smart, human-like response?
Interactive technology allows a two-way flow of information through an interface between the user and the technology. Combine this with AI and you have the perfect way to communicate with your users. What's more human-like than speaking to a machine or a system, having it understand your words regardless of the language you speak, and hearing the information you requested with your own ears? It's like speaking to your TV, air conditioner or even your lighting system.
My Deepgram Use-Case
Imagine visiting a portfolio or a company's website when you just want to skip all the introductory stuff and speak with the person responsible. If it is a portfolio, maybe you want to ask questions like "What experience do you have?" or "Why would I choose to work with you?", and you want answers without having to wait for the owner to reply or read all the way down to that piece of information. This is where the idea popped into my mind.
Deepgram offers an amazing AI speech recognition service for multiple languages that takes the user's audio and transforms it into text. So why not use it to capture the user's request and send it to our system? By matching specific keywords with the right algorithm, the system can identify the request and immediately return the information the user wants, as audio.
Chatbots are cool for a reason: they help all kinds of businesses achieve their goals with very little effort. You can see how successful they are in the recent analysis of chatbots and AI customer support in this article from thrivemyway.com.
Being able to speak with a bot in your own language is much more fun and efficient than texting. Adding an AI bot to your website will definitely increase interactivity, and even if visitors are not interested in your business, they would still love to play around with your bot 🤖
Dive into Details
I'm planning to use Deepgram's voice recognition service to create an AI bot that can understand audio and respond with the requested information, also in audio.
This will eventually become an NPM package that you can install in your project and use for whatever you want. It will be especially helpful in portfolios.
This is how it will go:
- The user will speak out what they want.
- The audio will be handled by Deepgram to transform it into text.
- The text will be translated into English.
- The translation will run through the logic or algorithm to pick up specific keywords to identify the user's needs.
- The response will be text previously entered by the owner.
- The text response will be translated into the language the user selected.
- Eventually, the user will hear the response in their preferred language, with the information they asked for.
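The keyword step in the list above could look roughly like the sketch below. It is only an illustration under assumptions: the `ResponseEntry` shape, the `pickResponse` and `tokenize` names, and the sample entries are all hypothetical, and the transcript is assumed to be English text already (after the Deepgram and translation steps). Speech recognition, translation and audio playback are left out.

```typescript
// One owner-defined entry: keywords that signal a topic, plus the canned answer.
// The shape and field names are hypothetical, not part of any published package.
interface ResponseEntry {
  keywords: string[];
  answer: string;
}

// Lowercase the transcript and split it into plain words, dropping punctuation.
// Assumes the transcript has already been translated into English.
function tokenize(transcript: string): string[] {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

// Return the answer of the entry with the most keyword hits,
// or the fallback text when no keyword appears in the transcript.
function pickResponse(
  transcript: string,
  entries: ResponseEntry[],
  fallback: string
): string {
  const words = tokenize(transcript);
  let best = fallback;
  let bestHits = 0;
  for (const entry of entries) {
    const hits = entry.keywords.filter(
      (keyword) => words.indexOf(keyword.toLowerCase()) !== -1
    ).length;
    if (hits > bestHits) {
      bestHits = hits;
      best = entry.answer;
    }
  }
  return best;
}

// Placeholder entries an owner might configure.
const sampleEntries: ResponseEntry[] = [
  { keywords: ["experience", "worked"], answer: "Placeholder answer about experience." },
  { keywords: ["contact", "email"], answer: "Placeholder answer about contact details." },
];

console.log(
  pickResponse("What experience do you have?", sampleEntries, "Sorry, I did not catch that.")
);
```

Counting hits per entry, rather than stopping at the first match, lets a question that mentions several topics land on the entry it overlaps with most. Matching is on whole words only, so a keyword has to appear in the transcript exactly as the owner entered it, apart from letter case and punctuation.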
I'm working on the implementation of this project, but I highly doubt I can complete it before the Hackathon ends. This is my fourth project in this amazing Hackathon, and I really enjoyed working with Deepgram and combining it with so many technologies like Laravel, VueJS, LameJS, sound visualization, MarkedJS and many more.
Thank you for checking this out. Please let me know what you think and what could make it better. You can also check out my other submissions here 👇:
Top comments (6)
Looks like a great project to me! What happens if the user makes a mistake in writing the keywords? In that case, won't the program fail to see that the keyword corresponds to the one entered by the owner?
Well, I have created this bot using Deepgram speech to text technology. It can recognize some keywords to give you a previously entered response. dev.to/moose_said/add-ai-robot-to-...
If I understood correctly, it can recognize the theme of a message written by the user by reducing it to a keyword? Or can it recognize a word even if it has spelling or other errors?
It takes the user's audio and transforms it into words, and then looks for specific keywords to identify the user's needs.
OK, I understand, but how exactly does it manage to find the user's need? Does it need specific words written beforehand (in which case spelling mistakes could be annoying), or does the program understand the meaning of the keywords?
Yes. The bot looks through the transcript for previously entered keywords. Those keywords are stored with the correct spelling, so if something were spelled incorrectly, the bot would not be able to identify the user's needs. But it's very unlikely that the spelling will be incorrect, because the user isn't typing: they are talking, and Deepgram is transforming their speech into words.