DEV Community

Sunder Kumar

Speech-to-Text Using AssemblyAI

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

I built a Speech-to-Text Application that showcases the power of Universal-2, AssemblyAI’s latest speech-to-text model. The application:

  1. Supports multilingual transcription: users can choose from multiple languages, which makes the app usable by a global audience.
  2. Outputs formatted text with timestamps: the application delivers well-structured transcripts with correctly rendered proper nouns, punctuation, and timestamps.
  3. Offers a user-friendly interface: the frontend is built with Streamlit and is designed for easy navigation and interaction.
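To make the timestamp feature concrete, here is a minimal sketch of how word-level timestamps might be rendered for display. It assumes each word entry carries `text`, `start`, and `end` fields with times in milliseconds, which is my understanding of AssemblyAI's transcript format. The helper names `format_ms` and `render_words` are hypothetical and are not taken from the project's repository.

```python
def format_ms(ms):
    """Convert a millisecond offset into an MM:SS.mmm string."""
    minutes, remainder = divmod(int(ms), 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def render_words(words):
    """Build one display line per word: '[start - end] text'.

    Assumes each item is a dict with 'text', 'start', and 'end' keys,
    where 'start' and 'end' are offsets in milliseconds.
    """
    return [
        f"[{format_ms(w['start'])} - {format_ms(w['end'])}] {w['text']}"
        for w in words
    ]


if __name__ == "__main__":
    sample = [
        {"text": "Hello", "start": 250, "end": 900},
        {"text": "world.", "start": 950, "end": 1400},
    ]
    for line in render_words(sample):
        print(line)
```

In a Streamlit app, the resulting lines could be shown with something like `st.text("\n".join(lines))`, though the actual layout used in the project may differ.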

Demo

Link to GitHub Repository

Journey

Incorporating Universal-2:
The application calls Universal-2 through AssemblyAI's API. The backend:

  1. Uploads audio files using AssemblyAI's upload endpoint.
  2. Submits transcription requests, including optional parameters such as language_code and punctuate.
  3. Polls the transcription status until it completes, then fetches the final transcript with timestamps and a word-by-word breakdown.
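The three backend steps can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: it uses only Python's standard library, and the helper names (`_call`, `upload_audio`, `submit_transcription`, `wait_for_transcript`) are hypothetical. The endpoint paths (`/v2/upload`, `/v2/transcript`), the `authorization` header, and the response fields (`upload_url`, `id`, `status`, `error`) reflect my understanding of AssemblyAI's REST API and should be checked against the official documentation.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"  # assumed base URL


def _call(method, path, api_key, data=None, content_type=None):
    """Send one request to the API and decode the JSON response."""
    headers = {"authorization": api_key}
    if content_type:
        headers["content-type"] = content_type
    req = urllib.request.Request(
        API_BASE + path, data=data, headers=headers, method=method
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def upload_audio(path, api_key):
    """Step 1: upload raw audio bytes and return the hosted audio URL."""
    with open(path, "rb") as f:
        body = f.read()
    reply = _call("POST", "/upload", api_key, body, "application/octet-stream")
    return reply["upload_url"]


def submit_transcription(audio_url, api_key, language_code="en", punctuate=True):
    """Step 2: request a transcript and return its id."""
    payload = json.dumps(
        {"audio_url": audio_url, "language_code": language_code, "punctuate": punctuate}
    ).encode("utf-8")
    reply = _call("POST", "/transcript", api_key, payload, "application/json")
    return reply["id"]


def wait_for_transcript(fetch, interval=3.0, max_polls=200, sleep=time.sleep):
    """Step 3: call fetch() repeatedly until the transcript is done.

    fetch is any zero-argument callable returning the transcript JSON as a dict.
    Returns the dict once status is 'completed', raises RuntimeError on
    status 'error', and raises TimeoutError after max_polls attempts.
    """
    for _ in range(max_polls):
        result = fetch()
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(interval)
    raise TimeoutError("transcript was not ready after polling")
```

A typical call sequence would be `url = upload_audio("talk.mp3", key)`, then `tid = submit_transcription(url, key, language_code="es")`, then `wait_for_transcript(lambda: _call("GET", f"/transcript/{tid}", key))`. Passing the fetch step in as a callable keeps the polling loop easy to test without network access.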

Screenshots

Home Page
Audio Processing
Final Results

Team Submission:
I, Sunder Kumar, worked on this project independently.
