Hey everyone!
Today I worked on the real-time transcription app. Implementing Diart was super easy, and it's working great. There are a few touch-ups left but nothing serious, so I'm fairly sure I'll open a pull request tomorrow against the library I'm contributing this to. I just need to finish the changelog, test Diart with a faster-whisper backend, and make the model configurable, and that's it!
I love the way this came out. It took a long time to make, but I learned a loooooot about streaming and working with Whisper. Some lessons were learned the hard way, like the fact that you can condition the model on past transcriptions, or that you should check the sample rate before trying to decode audio files. The sample rate turned out to be my problem, along with a wrong block size. I don't think I would've thought of the block size on my own, though; I've never had that problem before.
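For anyone hitting the same thing, here is a minimal sketch of the kind of check that would have caught it. It assumes WAV input and a 16 kHz target (the rate Whisper expects). The function names and the 0.5-second block duration are made up for illustration and are not from the actual project or from Diart.

```python
import wave

# Whisper models expect 16 kHz audio; adjust if your pipeline resamples elsewhere.
EXPECTED_SAMPLE_RATE = 16_000


def check_sample_rate(path, expected_rate=EXPECTED_SAMPLE_RATE):
    """Return the WAV file's sample rate, or raise if it is not the expected one."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
    if rate != expected_rate:
        raise ValueError(
            f"{path}: sample rate is {rate} Hz, expected {expected_rate} Hz"
        )
    return rate


def block_size_in_samples(block_seconds, sample_rate):
    """Convert a block duration in seconds into a number of samples."""
    return int(round(block_seconds * sample_rate))


# Hypothetical usage: derive the block size from the rate instead of hard-coding it.
# rate = check_sample_rate("input.wav")
# block = block_size_in_samples(0.5, rate)
```

Deriving the block size from the verified sample rate, rather than hard-coding a sample count, keeps the two values from drifting apart.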
I also worked on a text summarization project for a client, but I won't take on any more clients after this. I'll focus on my own projects :)
The next few days should be amazing in terms of projects, which will come in a lot faster now (I hope!)
That's it everyone,
Happy coding!