Giovanni Improta


Transforming Interviews into Publishable Stories with AssemblyAI

This is a submission for the AssemblyAI Challenge: No More Monkey Business.

What I Built

Insightview is a modern web application that streamlines the interview workflow for journalists. By leveraging AssemblyAI's LeMUR and Universal-2 technology, it transforms raw interview recordings into structured, actionable content, dramatically reducing the time from recording to publication.

Key Features:

  • 🎥 Audio/video file upload with real-time preview
  • 🗣️ Advanced transcription with speaker identification
  • ⭐ Automatic highlight extraction of key moments
  • ✍️ AI-powered article draft generation
  • 📤 Export interview subtitles in VTT format

Demo

Experience Insightview in action at the Live Demo. Upload your own interview recordings to see how the platform handles transcription, highlight extraction, and article generation in real time.

Explore the project repository on GitHub.

Journey

Working for Italy's leading news organization, I interact daily with journalists who've expressed a common pain point: the time-consuming process of transcribing interviews and adding subtitles. When I discovered the AssemblyAI challenge, I saw an opportunity to build a solution that could transform their workflow.

Insightview was built using Next.js and TypeScript, with a clean and modern UI powered by Tailwind CSS and Shadcn UI components.
Insightview integrates AssemblyAI's technology in three powerful ways:

1. Transcription

I used AssemblyAI's Universal-2 speech-to-text model for accurate interview transcription. It renders proper nouns faithfully and handles text formatting and casing precisely. Advanced speaker identification and labeling are powered by LeMUR. Users can generate subtitles from the transcription and export them as a VTT file. For video uploads, subtitles are automatically added to the video preview.
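The post does not show how the VTT file is produced, so the following is only a minimal sketch of one way to do it. It assumes the transcript is available as a list of speaker-labeled utterances with start and end times in milliseconds; the `Utterance` shape and the function names `toVttTimestamp` and `utterancesToVtt` are hypothetical and not taken from the project.

```typescript
// Hypothetical shape for one transcribed utterance (an assumption, not the
// project's actual type). Times are in milliseconds.
interface Utterance {
  speaker: string;
  text: string;
  start: number;
  end: number;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format milliseconds as a WebVTT timestamp: HH:MM:SS.mmm
function toVttTimestamp(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(millis, 3)}`;
}

// Build a WebVTT document with one cue per utterance. The speaker label is
// carried in a WebVTT voice span (<v ...>).
function utterancesToVtt(utterances: Utterance[]): string {
  const cues = utterances.map(
    (u, i) =>
      `${i + 1}\n${toVttTimestamp(u.start)} --> ${toVttTimestamp(u.end)}\n<v Speaker ${u.speaker}>${u.text}`
  );
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}
```

The resulting string can be offered as a file download or attached to a video element through a `<track>` element for the preview. The sketch does not split long utterances into shorter cues or escape special characters in the text, which a production exporter would likely need.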

Transcription feature

2. Smart Highlights

I implemented LeMUR to automatically extract the most engaging and impactful quotes from interviews. The prompt is designed to identify powerful, insightful, or memorable segments between 5 and 15 seconds long. Each highlight is precisely linked to the original audio or video timestamp using Universal-2's accurate timestamp detection, enabling journalists to quickly preview the extracted segments.
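How the quotes are tied back to timestamps is not described in the post. One plausible approach, sketched here under assumptions, is to look up each extracted quote in the word-level transcript and keep only matches whose duration falls in the 5 to 15 second window. The `Word` and `Highlight` shapes, the millisecond units, and the functions `locateQuote` and `selectHighlights` are all hypothetical, and the sketch assumes the model returns quotes verbatim.

```typescript
// Hypothetical word-level transcript entry; times are in milliseconds.
interface Word {
  text: string;
  start: number;
  end: number;
}

// Hypothetical highlight: a quote plus its position in the recording.
interface Highlight {
  quote: string;
  start: number;
  end: number;
}

// Lowercase and strip common punctuation so "journalists." matches "journalists".
function normalize(token: string): string {
  return token.toLowerCase().replace(/[.,!?;:"()]/g, "");
}

// Find the first run of transcript words that matches the quote word for word.
function locateQuote(words: Word[], quote: string): Highlight | null {
  const target = quote.split(/\s+/).map(normalize).filter((t) => t.length > 0);
  if (target.length === 0) return null;
  const tokens = words.map((w) => normalize(w.text));
  for (let i = 0; i + target.length <= tokens.length; i++) {
    let matched = true;
    for (let j = 0; j < target.length; j++) {
      if (tokens[i + j] !== target[j]) {
        matched = false;
        break;
      }
    }
    if (matched) {
      return { quote, start: words[i].start, end: words[i + target.length - 1].end };
    }
  }
  return null;
}

// Keep only quotes that can be located and whose duration is within bounds.
function selectHighlights(
  words: Word[],
  quotes: string[],
  minMs: number = 5000,
  maxMs: number = 15000
): Highlight[] {
  const result: Highlight[] = [];
  for (const quote of quotes) {
    const h = locateQuote(words, quote);
    if (h !== null && h.end - h.start >= minMs && h.end - h.start <= maxMs) {
      result.push(h);
    }
  }
  return result;
}
```

Enforcing the duration in code, in addition to asking for it in the prompt, guards against a model returning segments outside the requested range. Exact matching is brittle if the model paraphrases, so a real implementation might need fuzzy matching.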

Highlights feature

3. Article Generation

One of the most intriguing applications of LeMUR is the article drafting feature. I crafted a sophisticated prompt that enables LeMUR to:

  • Grasp the interview context
  • Extract key highlights
  • Organize content in a journalistic style
  • Preserve the original tone and emotional nuances
  • Generate clean HTML with proper semantic markup

This feature isn’t meant to replace the expertise of seasoned journalists but serves as a valuable tool to inspire and accelerate their work. It provides a draft template as a starting point, allowing them to focus on refining and enhancing the final article.
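The actual prompt is not included in the post. As a rough illustration only, the goals listed above could be encoded in a prompt builder, paired with a lightweight check that the returned draft sticks to semantic tags. The tag allowlist, the prompt wording, and the functions `buildArticlePrompt` and `findDisallowedTags` are assumptions made for this sketch.

```typescript
// Hypothetical allowlist of semantic tags the draft may contain.
const ALLOWED_TAGS: string[] = [
  "article", "h1", "h2", "p", "blockquote", "ul", "ol", "li", "strong", "em",
];

// Assemble a prompt that mirrors the goals listed in the post.
// The wording is illustrative, not the author's prompt.
function buildArticlePrompt(context: string): string {
  return [
    "You are drafting a news article from an interview transcript.",
    `Interview context: ${context}`,
    "Identify the key highlights and organize them in a journalistic style.",
    "Preserve the speakers' original tone and emotional nuances.",
    `Return only HTML using these tags: ${ALLOWED_TAGS.join(", ")}.`,
  ].join("\n");
}

// Return the distinct tag names in the HTML that are not on the allowlist,
// in order of first appearance.
function findDisallowedTags(html: string): string[] {
  const found: string[] = [];
  const tagPattern = /<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(html)) !== null) {
    const name = match[1].toLowerCase();
    if (ALLOWED_TAGS.indexOf(name) === -1 && found.indexOf(name) === -1) {
      found.push(name);
    }
  }
  return found;
}
```

A non-empty result from `findDisallowedTags` could trigger a retry or a cleanup pass before the draft is shown to the journalist. The regular expression is a quick structural check and not a security sanitizer; HTML from a model should still go through a proper sanitizing library before being rendered.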

Article generation feature

Additional Prompt Submissions

This submission also qualifies for the Sophisticated Speech-to-Text prompt, showcasing Universal-2's strengths in accurate transcription, proper noun recognition, and precise timestamp detection, all integrated into the application.

Top comments (3)

Mike Talbot ⭐

Wow, this is great. Can definitely see why you won here. "Way to go" in solving a real problem.

EIO • Emmanuel Imolorhe

Can we get some sample audio/video to test the demo with?
I've failed marvelously at googling this.

Romolo Samuele Velati

Bravo Gio' !