Hey everyone!
Today I barely had any time to code, but I did spend some more time understanding how the personal assistant project I'm working on will work. I've started testing the tools I'll be using on their own before wiring them into a flow.
An application that makes phone calls for you has to handle several things:
Transcription of the other party's speech.
Generating responses to the other party's speech while following the user's wishes.
Synthesizing the generated response (sounding it out to the other party).
The transcription is certainly the part that needs the most fine-tuning, as it tends to be a major contributor to latency.
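To make the three steps above a bit more concrete, here's a rough sketch of how a single turn of a call could be chained together while timing each stage, so it's easy to see where the latency goes. Everything in it is an assumption for illustration: `run_turn`, `TurnResult`, and the `fake_*` functions are hypothetical stand-ins, not the actual tools I'm testing, and a real transcriber or synthesizer would work on audio rather than on text bytes.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class TurnResult:
    """Output of one call turn: the reply, its audio, and per-stage timings."""
    reply_text: str
    audio: bytes
    timings: Dict[str, float] = field(default_factory=dict)


def run_turn(
    audio_in: bytes,
    user_goal: str,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str, str], str],
    synthesize: Callable[[str], bytes],
) -> TurnResult:
    """Run transcription -> response generation -> synthesis, timing each stage."""
    timings: Dict[str, float] = {}

    def timed(name, fn, *args):
        # Record wall-clock time spent in a single stage.
        start = time.perf_counter()
        out = fn(*args)
        timings[name] = time.perf_counter() - start
        return out

    heard = timed("transcribe", transcribe, audio_in)
    reply = timed("generate", generate, heard, user_goal)
    audio = timed("synthesize", synthesize, reply)
    return TurnResult(reply_text=reply, audio=audio, timings=timings)


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already text


def fake_generate(heard: str, user_goal: str) -> str:
    return f"Regarding '{heard}': {user_goal}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is audio


if __name__ == "__main__":
    result = run_turn(
        b"What time works for you?",
        "I'd like a table at 7pm.",
        fake_transcribe,
        fake_generate,
        fake_synthesize,
    )
    print(result.reply_text)
    for stage, seconds in result.timings.items():
        print(f"{stage}: {seconds * 1000:.3f} ms")
```

Passing each stage in as a plain function means every tool can be tested stand-alone first and swapped in later without touching the flow itself.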
These are the components I'll be testing out before working on the architecture of the two servers: the one that will handle the phone call requests and the one that will actually make the phone calls. The two are different and have different routes, so that's some food for thought!
That's it for today,
Happy coding everyone!