🎙️ AudioIntel

Amit Wani · 2024-11-25T04:29:09Z

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text & No More Monkey Business. 🏆

What I Built 🛠️

I built AudioIntel, a platform that transforms audio content into actionable intelligence using AssemblyAI's APIs. It helps users extract insights from audio content through transcription, analysis, and AI-powered features. ✨

🔗 Live Demo: https://audiointel.amitwani.dev

🎥 Demo Video

Journey 🗺️

The Inspiration 💡

The idea for AudioIntel came from my own struggles with processing audio content efficiently. As someone who consumes a lot of podcasts, interviews, and video content, I often wanted to extract the key insights without listening to hours of content. I realized this was a common pain point for many content creators, researchers, and professionals. 🎧

Learning & Iterations 📚

- 🔄 Integration with AssemblyAI's APIs for transcription and analysis
- 🗣️ Leveraging AssemblyAI's speaker diarization and sentiment analysis features
- 🧠 Leveraging AssemblyAI with LeMUR for summarization, question answering, and intelligent content analysis
- ⚠️ Error handling in audio processing and real-time status updates
- 🔄 State management for handling complex UI interactions
- ⚡ Performance optimization for processing large audio files
- 💾 Database integration using Neon PostgreSQL with Drizzle ORM
- 🔒 User authentication implementation with Better Auth
- 🌐 Language translation features using Google Translate API
- 📤 File upload handling through UploadThing integration

Features Showcase ✨

Multiple Input Sources 📥
- 📁 File Upload: Support for various audio formats through UploadThing integration
- 🎙️ Browser Recording: Direct audio capture using the Web Audio API
- 📺 YouTube Integration: YouTube video to audio conversion and analysis

Real-time Analysis 📊
- 👥 Speaker diarization with timeline visualization
- 😊 Sentiment analysis with color-coded segments
- 🔍 Interactive transcript search and navigation
- 💬 Interactive chat with the transcript

Smart Content Generation 📝
- 🤖 AI-powered blog post creation
- 💭 Context-aware chat interface
- 📌 Key sections identification with timestamps

Language Translation 🌍
- 🔄 Translate transcript to multiple languages

Screenshots 📸

Multiple Sources - Audio file, Record file & YouTube 📱

Overview & Analysis 📊

Interactive Features ⚡

Tech Stack 💻

- 🔥 Framework: Next.js 14 with App Router
- 📝 Language: TypeScript
- 💾 Database: Neon PostgreSQL with Drizzle ORM
- 🎨 UI: Tailwind CSS + shadcn/ui
- 🎙️ Audio Processing: AssemblyAI
- 📤 File Upload: UploadThing
- 📊 Analytics: OpenPanel
- 🔒 Authentication: Better Auth
- 🌐 Translation: Google Translate
- 🚀 Deployment: Vercel

Technical Architecture 🏗️

Technical Implementation ⚙️

AssemblyAI Integration 🔌

I leveraged several features from AssemblyAI's SDK:

Transcription API

```typescript
// Client setup is assumed here; the original post shows only the calls.
import { AssemblyAI } from "assemblyai";

const assemblyai = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY! });

// Transcribe with speaker labels, a bullet summary, and sentiment analysis.
const transcript = await assemblyai.transcripts.transcribe({
  audio: fileUrl,
  speaker_labels: true,
  summarization: true,
  summary_model: "conversational",
  summary_type: "bullets",
  sentiment_analysis: true,
});
```

LeMUR for Content Generation

```typescript
// Generate blog post
const { response: blogPostResponse } = await assemblyai.lemur.task({
  transcript_ids: [transcript.id],
  prompt: `Generate a blog post from the transcript in markdown format`,
  final_model: "anthropic/claude-3-5-sonnet",
});

// Generate actionable insights
const { response: insights } = await assemblyai.lemur.task({
  transcript_ids: [transcript.id],
  prompt: `Provide actionable insights from the transcript`,
  final_model: "anthropic/claude-3-5-sonnet",
});
```

LeMUR for Interactive Chat

```typescript
// Answer a single user question against the stored transcript.
const { response: qas } = await assemblyai.lemur.questionAnswer({
  transcript_ids: [transcriptId],
  final_model: "anthropic/claude-3-5-sonnet",
  questions: [{ question: userMessage, answer_format: "short sentence" }],
});
```

Future Enhancements 🚀

- Multi-language support
- Advanced analytics dashboard
- API endpoints
- Custom templates
- Advanced search capabilities

Source Code 🔗

mtwn105 / audio-intel

AudioIntel - Audio/Video Intelligence, Transcripts, Summary, and much more

Submission 📝

This submission was made for the AssemblyAI Challenge for the "Sophisticated Speech-to-Text" & "No More Monkey Business" prompts.

Conclusion 🎉

I had a great time participating in the AssemblyAI Challenge and learned a lot from the experience. I'm looking forward to seeing what other developers come up with! 🚀

Thank you Dev.To & AssemblyAI for organizing this challenge and providing such a great platform for developers to showcase their skills! 🎉

Transform audio into actionable intelligence with our powerful AI platform. AudioIntel helps you extract valuable insights from audio content through transcription, analysis, and AI-powered features.

Live Demo: https://audiointel.amitwani.dev

✨ Features

🎵 Multiple Input Methods
- Upload audio files (MP3, WAV)
- Record directly in browser
- Analyze YouTube videos
🤖 AI-Powered Analysis
- Smart summaries and key takeaways
- Sentiment analysis
- Speaker identification
- Actionable insights generation
📝 Content Generation
- Automatic blog post creation
- Interactive chat with transcripts
- Key sections identification
🔍 Advanced Features
- Timeline view with precise timestamps
- Multi-speaker detection
- Searchable transcripts
- Real-time sentiment tracking
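
The searchable transcript and timeline view listed above can be illustrated with a small sketch. This is not the project's actual code: the `Utterance` shape is only loosely modeled on diarized utterances (speaker label plus start and end times in milliseconds), and `formatTimestamp` and `searchTranscript` are hypothetical helper names chosen for illustration.

```typescript
// Hypothetical shape, loosely modeled on diarized utterances (times in ms).
interface Utterance {
  speaker: string;
  start: number;
  end: number;
  text: string;
}

interface SearchHit {
  label: string; // timeline label such as "[01:05] Speaker B"
  utterance: Utterance;
}

// Left-pad a number to two digits without relying on newer string helpers.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format a millisecond offset as mm:ss for display on a timeline.
function formatTimestamp(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return pad2(minutes) + ":" + pad2(seconds);
}

// Case-insensitive substring search over utterances, in original order.
function searchTranscript(utterances: Utterance[], query: string): SearchHit[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") {
    return [];
  }
  return utterances
    .filter((u) => u.text.toLowerCase().indexOf(needle) !== -1)
    .map((u) => ({
      label: "[" + formatTimestamp(u.start) + "] Speaker " + u.speaker,
      utterance: u,
    }));
}

// Made-up sample data for demonstration only.
const sampleUtterances: Utterance[] = [
  { speaker: "A", start: 0, end: 4200, text: "Welcome to the show." },
  { speaker: "B", start: 65000, end: 69000, text: "Let's talk about pricing next." },
];

for (const hit of searchTranscript(sampleUtterances, "Pricing")) {
  console.log(hit.label + ": " + hit.utterance.text);
}
```

A plain substring match keeps the sketch short; a real search feature might also highlight matches or jump the audio player to `utterance.start`.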

🚀 Getting Started

Prerequisites

- Node.js 18+
- npm or yarn
- AssemblyAI API key

Installation

Clone the repository

git clone https://github.com/mtwn105/audio-intel.git
cd audio-intel

Install dependencies

npm install
# or
yarn install

Set up environment variables

cp .env.example .env

Required environment variables:

ASSEMBLYAI_API_KEY=your_api_key
NEXT_PUBLIC_APP_URL=http://localhost:3000
UPLOADTHING_TOKEN=your_uploadthing_token
GOOGLE_GENERATIVE_AI_API_KEY=your_google_generative_ai_api_key
GOOGLE_TRANSLATE_API_KEY=your_google_translate_api_key
BETTER_AUTH_SECRET=your_better_auth_secret
BETTER_AUTH_BASE_URL=http://localhost:3000
DATABASE_URL=your_database_url
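
Since the app depends on all of these variables, it can help to fail fast when one is missing. The following is a minimal sketch of such a check, not part of the repository as far as the README shows; `REQUIRED_ENV_VARS` and `findMissingEnv` are hypothetical names, and the function takes a plain record so it can be tried without a running app.

```typescript
// Variable names copied from the list above.
const REQUIRED_ENV_VARS = [
  "ASSEMBLYAI_API_KEY",
  "NEXT_PUBLIC_APP_URL",
  "UPLOADTHING_TOKEN",
  "GOOGLE_GENERATIVE_AI_API_KEY",
  "GOOGLE_TRANSLATE_API_KEY",
  "BETTER_AUTH_SECRET",
  "BETTER_AUTH_BASE_URL",
  "DATABASE_URL",
] as const;

type RequiredEnvName = (typeof REQUIRED_ENV_VARS)[number];

// Return the names of required variables that are unset or blank.
function findMissingEnv(env: Record<string, string | undefined>): RequiredEnvName[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Demonstration with a made-up, partially filled environment.
const exampleEnv: Record<string, string | undefined> = {
  ASSEMBLYAI_API_KEY: "your_api_key",
  DATABASE_URL: "",
};

const missingInExample = findMissingEnv(exampleEnv);
console.log("Missing variables: " + missingInExample.join(", "));
```

In a Node.js or Next.js setup, the same function could be called with `process.env` at startup and made to throw if the returned list is non-empty.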

Run the development server

npm run dev
# or
yarn dev

Open…
