Media-Powered Blog Creator, Content Analyzer and Translation App

#devchallenge #assemblyaichallenge #ai #api

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

I built AudioWhisperer, a web application that includes five speech-to-text tools for everyday use:

Blog Tutorial Generator: Transforms audio transcripts into Markdown-formatted tutorial articles using AssemblyAI's LeMUR. The front-end displays both the rendered and the raw Markdown.
Fluency Analyzer: According to several research papers, one way to improve fluency and get better at impromptu speaking is to practice talking about a random topic for 1–3 minutes.
This tool builds on that idea. It generates 5 random topics, lets users record or upload a 1-minute audio sample, and uses LeMUR to analyze the transcript for fluency, returning pain points and suggestions for improvement.
Content Moderation Tool: Uses AssemblyAI's content moderation parameter to identify potentially harmful or sensitive words. The front-end visualizes the data through a bar chart and a table.
Transcript Translation: Provides multilingual transcript translation capabilities.
Subtitle Generation: Allows users to generate and download subtitle files in multiple languages.
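
To illustrate how the Content Moderation Tool's bar chart and table could be fed, here is a minimal sketch that counts flagged labels in a finished transcript. It is not the app's actual code. The function name `count_safety_labels` and the `min_confidence` threshold are hypothetical, and the response shape (a `content_safety_labels` object holding `results`, each with a list of `labels` carrying `label` and `confidence`) is an assumption about the transcript JSON that should be checked against the AssemblyAI documentation.

```python
from collections import Counter


def count_safety_labels(transcript: dict, min_confidence: float = 0.5) -> list[tuple[str, int]]:
    """Count how often each content-safety label was flagged.

    Assumes the transcript JSON has the shape
    {"content_safety_labels": {"results": [{"labels": [{"label": ..., "confidence": ...}]}]}}.
    Returns (label, count) pairs, most frequent first, ready for a bar chart or table.
    """
    results = (transcript.get("content_safety_labels") or {}).get("results", [])
    counts: Counter = Counter()
    for segment in results:
        for item in segment.get("labels", []):
            # Skip low-confidence flags so the chart is not dominated by noise.
            if item.get("confidence", 0.0) >= min_confidence:
                counts[item["label"]] += 1
    return counts.most_common()


# Hand-written sample data for illustration only, not real API output.
sample = {
    "content_safety_labels": {
        "results": [
            {"labels": [{"label": "profanity", "confidence": 0.9}]},
            {"labels": [{"label": "profanity", "confidence": 0.8},
                        {"label": "weapons", "confidence": 0.7}]},
            {"labels": [{"label": "gambling", "confidence": 0.2}]},
        ]
    }
}
rows = count_safety_labels(sample)
```

Each returned pair maps directly to one bar in the chart or one row in the table; the low-confidence `gambling` flag in the sample is filtered out by the threshold.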

Demo

Web link
Server code
Client code

Rendered Blog Page

Markdown Blog page

Content Analyzer Data Page

Subtitle Generator

Translation Generator

Topics Generator

Fluency Analysis

Journey

AssemblyAI's models allowed me to create a user-friendly application in which all five tools rely on the audio-to-text feature. In my code, I enabled content moderation (content_safety: true) to detect and list sensitive content in media files, and I added disfluency tracking by setting disfluencies: true. I also set language_detection: true to detect the original language of each audio file, which I needed in order to translate the transcript into other languages reliably.
To generate blog content and analyze transcript fluency, I used LeMUR.
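
As a rough illustration of the transcription parameters mentioned above, the sketch below builds a request body with all three flags enabled and posts it to AssemblyAI's REST transcript endpoint. It is not the app's actual code: the helper names `build_transcript_request` and `submit_transcript` are hypothetical, and the audio file is assumed to be reachable at a public URL.

```python
import json
import urllib.request

# AssemblyAI's REST endpoint for creating a transcript.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(audio_url: str) -> dict:
    """Build the JSON body with the three features described in the post."""
    return {
        "audio_url": audio_url,
        "content_safety": True,      # flag potentially sensitive content
        "disfluencies": True,        # keep filler words such as "um" in the text
        "language_detection": True,  # detect the spoken language automatically
    }


def submit_transcript(audio_url: str, api_key: str) -> dict:
    """Send the request; needs a valid API key and network access."""
    body = json.dumps(build_transcript_request(audio_url)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder URL for illustration only.
payload = build_transcript_request("https://example.com/audio.mp3")
```

The response to the initial request describes a queued job, so a real client would poll the transcript by its id until processing completes before reading the moderation labels or the detected language.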

Because the blog content generator and the fluency analyzer both use LeMUR, my submission qualifies for the "No More Monkey Business" prompt.
