DEV Community

Busayo Samuel


Media-Powered Blog Creator, Content Analyzer and Translation App

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

I built AudioWhisperer, a web application that includes five speech-to-text tools for everyday use:

  1. Blog Tutorial Generator: Transforms audio transcripts into Markdown-formatted tutorial articles using AssemblyAI's LeMUR. The front-end displays both the rendered and the raw Markdown.

  2. Fluency Analyzer: According to several research papers, one way to improve fluency and get better at impromptu speaking is to practice talking about a random topic for 1–3 minutes.
    This tool builds on that idea. It generates 5 random topics and lets users record or upload a 1-minute audio sample. It then uses LeMUR to analyze the user's fluency and returns an analysis of the transcript, with pain points and suggestions for improvement.

  3. Content Moderation Tool: Uses AssemblyAI's content moderation parameter to identify potentially harmful or sensitive words. The front-end visualizes the data through a bar chart and a table.

  4. Transcript Translation: Provides multilingual transcript translation capabilities.

  5. Subtitle Generation: Allows users to generate and download subtitle files in multiple languages.
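To make the Subtitle Generation tool more concrete, here is a minimal sketch of how timed transcript segments could be rendered as an SRT subtitle file. This is not the code from the AudioWhisperer server: the `Segment` class and both function names are hypothetical, and the sketch assumes each segment already carries start and end times in milliseconds along with its (possibly translated) text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One subtitle cue. Hypothetical shape, not AssemblyAI's response format."""
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to the SRT timestamp form HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render cues as SRT text: index, time range, cue text, blank separator."""
    if not segments:
        return ""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = (
            f"{format_timestamp(seg.start_ms)} --> {format_timestamp(seg.end_ms)}"
        )
        blocks.append(f"{index}\n{time_range}\n{seg.text}")
    return "\n\n".join(blocks) + "\n"
```

Calling `to_srt` once per target language, with the translated text in each segment, would give one downloadable file per language under these assumptions.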

Demo

Web link
Server code
Client code

Screenshots: Rendered Blog Page, Markdown Blog Page, Content Analyzer Data Page, Subtitle Generator, Translation Generator, Topics Generator, Fluency Analysis.

Journey

AssemblyAI's models let me build a user-friendly application that uses speech-to-text across all five tools. In my code, I enabled content moderation with content_safety: true to detect and list sensitive content in media files. I added disfluency tracking by setting disfluencies: true. I also set language_detection: true to detect the original language of each audio file, which I needed in order to translate the transcript into other languages.
For generating blog content and analyzing transcript fluency, I used LeMUR.
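The three parameters mentioned above could be combined into one transcription request roughly as follows. This is a hedged sketch rather than the project's actual server code: the function names are hypothetical, and the endpoint URL and the `authorization` header reflect my understanding of AssemblyAI's REST API, so check them against the official documentation. The code only builds the request; it does not send it.

```python
import json
import urllib.request

# Assumed transcript endpoint; verify against AssemblyAI's API reference.
TRANSCRIPT_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_payload(audio_url: str) -> dict:
    """Request body enabling the three features described in this post."""
    return {
        "audio_url": audio_url,
        "content_safety": True,       # content moderation
        "disfluencies": True,         # disfluency tracking
        "language_detection": True,   # detect the source language
    }


def build_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the payload as JSON."""
    body = json.dumps(build_transcript_payload(audio_url)).encode("utf-8")
    return urllib.request.Request(
        TRANSCRIPT_URL,
        data=body,
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_request` to `urllib.request.urlopen` would submit the job. Keeping the payload in its own function makes it easy to see which features are switched on.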

My use of LeMUR for the blog content generator and the fluency analysis qualifies my submission for the "No More Monkey Business" prompt.
