This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.
https://youtu.be/81zSDsmsUVE?si=9tD_jVf6XeiNDlDk
What I Built
I built Lyricify, a web app that generates song lyrics from audio files. With this app, users can upload an audio file of a song, and it processes the file to extract lyrics using AssemblyAI’s Universal-2 Speech-to-Text model. The app is simple, efficient, and designed to help artists, producers, and music enthusiasts convert audio content into readable text effortlessly.
Key Features
Upload Audio Files: Supports multiple audio formats.
Lyric Generation: Extracts lyrics accurately, even in noisy environments.
Save Sessions Locally: Leveraging useLocalStorage, users can save and revisit their transcription results without re-uploading files.
User-Friendly Interface: Built with modern design principles for seamless user experience.
Responsive Design: Works smoothly on both desktop and mobile devices.
Source code: Lyrics Generator Source Code
Demo
Screenshots
How I Incorporated AssemblyAI
AssemblyAI’s Universal-2 Speech-to-Text model powers Lyricify’s lyric extraction. Its ability to transcribe speech accurately across various audio formats was critical to the app. I used its API to process uploaded audio files and retrieve clean, structured lyric data.
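A minimal sketch of what that flow could look like against AssemblyAI's REST API: submit a transcription job, then poll until it finishes. This is an illustration under assumptions, not Lyricify's actual code. The function name `transcribeLyrics`, the `HttpFetch` type, and the polling interval are hypothetical. The sketch also assumes the audio is already reachable by URL (a local file would first go through AssemblyAI's upload endpoint) and leaves the model choice at the API default.

```typescript
// Structural types so the sketch does not depend on DOM or Node typings.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<any>;
}
type HttpFetch = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<HttpResponse>;

// Only the transcript fields this sketch reads.
interface Transcript {
  id: string;
  status: string; // "queued" | "processing" | "completed" | "error"
  text?: string | null;
  error?: string;
}

const API_BASE = "https://api.assemblyai.com/v2";

// Hypothetical helper: submits audioUrl for transcription and resolves with the text.
async function transcribeLyrics(
  audioUrl: string,
  apiKey: string,
  httpFetch: HttpFetch, // pass the global fetch in a real app
  pollMs = 3000 // assumed polling interval
): Promise<string> {
  const headers = { authorization: apiKey, "content-type": "application/json" };

  // 1. Create the transcription job.
  const submit = await httpFetch(`${API_BASE}/transcript`, {
    method: "POST",
    headers,
    body: JSON.stringify({ audio_url: audioUrl }),
  });
  if (!submit.ok) throw new Error(`Submit failed with HTTP ${submit.status}`);
  let transcript: Transcript = await submit.json();

  // 2. Poll until the job reaches a terminal state.
  while (transcript.status !== "completed" && transcript.status !== "error") {
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
    const poll = await httpFetch(`${API_BASE}/transcript/${transcript.id}`, { headers });
    if (!poll.ok) throw new Error(`Polling failed with HTTP ${poll.status}`);
    transcript = await poll.json();
  }

  if (transcript.status === "error") {
    throw new Error(transcript.error ?? "Transcription failed");
  }
  return transcript.text ?? "";
}
```

In a Next.js app this would typically run in a server-side route handler, for example `transcribeLyrics(url, process.env.ASSEMBLYAI_API_KEY!, fetch)`, so the API key never reaches the browser. The environment variable name is an assumption.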
Journey
The app qualifies for additional prompts as I explored advanced features of AssemblyAI:
Speaker Diarization: Differentiating between lead singers and backup vocals to improve transcription quality.
Content Filtering: Filtering out instrumental segments or non-vocal parts to focus on lyrics.
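As a rough illustration of how these two ideas could combine on the client, the sketch below turns diarized utterances into lyric blocks. It assumes the transcript was requested with `speaker_labels: true`, so the response carries an `utterances` array with `speaker`, `text`, `start`, `end`, and `confidence` fields. The `toLyricLines` helper, the `LyricLine` shape, and the confidence threshold are hypothetical, and the filtering is a simple heuristic, not a built-in AssemblyAI feature.

```typescript
// Subset of an AssemblyAI utterance used here (times are in milliseconds).
interface Utterance {
  speaker: string;
  text: string;
  start: number;
  end: number;
  confidence: number;
}

// Hypothetical output shape: one block of lyrics per continuous run of a speaker.
interface LyricLine {
  speaker: string;
  text: string;
  startMs: number;
}

// Groups consecutive utterances by speaker and drops segments that are
// empty or below minConfidence (assumed here to be mostly non-vocal noise).
function toLyricLines(utterances: Utterance[], minConfidence = 0.5): LyricLine[] {
  const lines: LyricLine[] = [];
  for (const u of utterances) {
    const text = u.text.trim();
    if (text === "" || u.confidence < minConfidence) continue;

    const last = lines[lines.length - 1];
    if (last && last.speaker === u.speaker) {
      // Same voice as the previous kept utterance: extend the current block.
      last.text += "\n" + text;
    } else {
      lines.push({ speaker: u.speaker, text, startMs: u.start });
    }
  }
  return lines;
}
```

Each block could then be rendered under a label such as "Lead" or "Backing", chosen by the user, since the API only identifies speakers by letter.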
To enhance user experience, Lyricify uses the useLocalStorage hook to save transcription results locally. Users can return to the app and access their previously generated lyrics without the need to reprocess audio files. This feature ensures seamless session management and quick access to data.
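The storage logic behind such a hook can be sketched as two plain functions. This is an assumption-heavy illustration, not the hook Lyricify uses: the `LyricSession` shape, the `lyricify:sessions` key, and the `KeyValueStore` interface (a stand-in for the browser's `localStorage`, so the code can run outside a browser) are all hypothetical.

```typescript
// Minimal slice of the Web Storage API that this sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical shape of one saved transcription.
interface LyricSession {
  id: string;
  fileName: string;
  lyrics: string;
  createdAt: string; // ISO timestamp
}

const SESSIONS_KEY = "lyricify:sessions"; // assumed storage key

// Reads saved sessions, falling back to an empty list on missing or corrupt data.
function loadSessions(store: KeyValueStore): LyricSession[] {
  try {
    const raw = store.getItem(SESSIONS_KEY);
    const parsed: unknown = raw ? JSON.parse(raw) : [];
    return Array.isArray(parsed) ? (parsed as LyricSession[]) : [];
  } catch {
    return [];
  }
}

// Saves a session, newest first, replacing any earlier entry with the same id.
function saveSession(store: KeyValueStore, session: LyricSession): LyricSession[] {
  const next = [session, ...loadSessions(store).filter((s) => s.id !== session.id)];
  store.setItem(SESSIONS_KEY, JSON.stringify(next));
  return next;
}
```

In the browser, `window.localStorage` satisfies `KeyValueStore`. Because Next.js also renders on the server, where `window` is undefined, a hook wrapping these functions would read storage inside `useEffect` or another client-only path.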
Tech Stack
Frontend: Next.js and TypeScript for a modern, performant web application.
Backend: Integrated AssemblyAI’s APIs for speech-to-text processing.
Storage: Implemented useLocalStorage to store and retrieve transcription sessions.
Styling: Tailwind CSS for responsive and clean UI components.