DEV Community

Ramandeep Singh


🌟 Speech Assistant: AI-Powered Audio-to-Text & MoM Generator

Hi Everyone! 👋

I’m excited to share a new project I’ve been working on: Speech Assistant. This application leverages cutting-edge technologies like Generative AI, OpenAI, and Streamlit to simplify audio-to-text conversion and automate the creation of Minutes of Meeting (MoM).

With Speech Assistant, you can effortlessly convert meeting recordings, call logs, or any audio file into text in multiple languages. It doesn’t stop there: it can also generate insightful MoMs with key points, action items, and sentiment analysis, and even transform text back into audio!

You can check out the app here: Smart-Speech-Assistant

In this article, I’ll walk you through the features, tech stack, and how you can use or contribute to the project. Let’s dive in! 🚀

Speech Assistant is a cutting-edge tool powered by Generative AI to help you transform your audio files into actionable insights. The project leverages Python, OpenAI, and Streamlit to provide seamless audio-to-text conversion, text-to-audio synthesis, and automated Minutes of Meeting (MoM) creation.

✨ Features

  • 🎙️ Audio-to-Text Conversion: Convert meetings, call recordings, and other audio files into text in multiple languages.
  • 🔁 Text-to-Audio: Convert text back into audio in any language.
  • 📝 Minutes of Meeting Generator: Automatically generate MoM for your audio recordings, including:
    • Call Sentiment Analysis
    • Key Points
    • Summaries
    • Action Points
  • 📥 Download Options: Export MoM as a downloadable file for easy sharing.
  • 🌍 Multilingual Support: Handle multiple languages for both audio and text.
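
The MoM sections listed above map naturally onto a structured prompt. The following is a minimal sketch under stated assumptions, not the project's actual code: the function names `build_mom_messages` and `parse_mom`, the JSON keys, and the prompt wording are all hypothetical and chosen only for illustration. It builds chat messages that ask for a JSON object with the four sections, then checks that the reply contains them.

```python
import json
from typing import Dict, List

# Hypothetical section keys mirroring the MoM features listed above.
MOM_KEYS = ("sentiment", "key_points", "summary", "action_points")


def build_mom_messages(transcript: str) -> List[Dict[str, str]]:
    """Build chat messages that ask a model for a MoM as a JSON object."""
    system = (
        "You write minutes of meeting. Reply with a JSON object containing "
        "exactly these keys: sentiment (string), key_points (list of strings), "
        "summary (string), action_points (list of strings)."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": "Transcript:\n" + transcript.strip()},
    ]


def parse_mom(reply: str) -> Dict[str, object]:
    """Parse the model reply and fail loudly if a section is missing."""
    data = json.loads(reply)
    missing = [key for key in MOM_KEYS if key not in data]
    if missing:
        raise ValueError("MoM reply is missing sections: " + ", ".join(missing))
    # Keep only the expected sections, in a fixed order.
    return {key: data[key] for key in MOM_KEYS}
```

With the official `openai` Python package, these messages could be passed to `client.chat.completions.create(...)` with `response_format={"type": "json_object"}`, and `parse_mom` applied to `choices[0].message.content`. The model name is left open here because the article does not say which one the project uses.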

🛠️ Tech Stack

  • Language: Python
  • Frontend: Streamlit
  • Backend: OpenAI APIs for Generative AI tasks
  • Libraries:
    • Speech-to-text: OpenAI Whisper
    • Text-to-speech: OpenAI
    • Sentiment Analysis: OpenAI GPT
    • File Handling: Pandas, OS
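
To make the speech-to-text and text-to-speech entries concrete, here is a hedged sketch of how the two OpenAI audio endpoints are typically called from Python. It is not taken from the project: the helper names `transcribe` and `synthesize` are hypothetical, and the identifiers `whisper-1`, `tts-1`, and the `alloy` voice are OpenAI's published names rather than values the article confirms the app uses. The client is passed in as an argument so the helpers can be exercised without network access.

```python
from pathlib import Path
from typing import Any, Optional


def transcribe(client: Any, audio_path: str, language: Optional[str] = None) -> str:
    """Send one audio file to the transcription endpoint and return its text."""
    kwargs = {"model": "whisper-1"}
    if language:
        # Optional ISO-639-1 hint such as "en"; omit it to let the model detect the language.
        kwargs["language"] = language
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **kwargs)
    return result.text


def synthesize(client: Any, text: str, out_path: str, voice: str = "alloy") -> Path:
    """Turn text into speech and save the audio to out_path."""
    response = client.audio.speech.create(model="tts-1", voice=voice, input=text)
    response.write_to_file(out_path)
    return Path(out_path)


# Typical wiring (requires the `openai` package and an API key):
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   print(transcribe(client, "meeting.mp3", language="en"))
```

Passing the client in, rather than creating it inside each helper, also keeps the API key handling in one place.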

🚀 Getting Started

Prerequisites

  1. Python 3.8+
  2. OpenAI API Key
  3. Install dependencies using:
   pip install -r requirements.txt  
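
Before launching the app, the first two prerequisites can be checked with a few lines of Python. This is a sketch, and it assumes the key is supplied through the `OPENAI_API_KEY` environment variable (the default the official `openai` package reads); the article does not say how the app itself receives the key, and `missing_prerequisites` is a hypothetical helper name.

```python
import os
import sys
from typing import List, Mapping, Tuple

MIN_PYTHON = (3, 8)  # from the prerequisites above


def missing_prerequisites(env: Mapping[str, str], version: Tuple[int, int]) -> List[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            "Python %d.%d+ is required, found %d.%d" % (MIN_PYTHON + tuple(version))
        )
    # Assumption: the key lives in OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY", "").strip():
        problems.append("OPENAI_API_KEY is not set")
    return problems


# Check the current interpreter and environment.
for problem in missing_prerequisites(os.environ, sys.version_info[:2]):
    print("Prerequisite missing:", problem)
```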

### Installation  

1. Clone this repository:  

   git clone https://github.com/r123singh/speech-assistant.git
   cd speech-assistant


2. Run the Streamlit application (if your checkout has no `app.py`, use `main.py`, which the project structure below lists as the main application file):  

   streamlit run app.py


### Usage  

1. Upload an audio file (supported formats: `.wav`, `.mp3`, `.m4a`).  
2. Select the desired operation:  
   - Convert audio to text.  
   - Generate MoM.  
   - Convert text to audio.  
3. View and download the generated outputs.  
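
Step 1 implies a validation pass on the upload before any API call is made. A minimal sketch of that check follows; `validate_upload` is a hypothetical helper, and the 25 MB cap is an assumption based on the upload limit OpenAI documents for its transcription endpoint, not a limit stated in the article.

```python
from pathlib import Path

# Formats listed in step 1 above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}
# Assumption: 25 MB cap, matching OpenAI's documented transcription upload limit.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the normalized extension, or raise ValueError for an unusable upload."""
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        allowed = ", ".join(sorted(SUPPORTED_EXTENSIONS))
        raise ValueError("Unsupported format %r; expected one of: %s" % (extension, allowed))
    if size_bytes <= 0:
        raise ValueError("The uploaded file is empty")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File is larger than %d bytes" % MAX_UPLOAD_BYTES)
    return extension
```

In a Streamlit app, `st.file_uploader("Upload audio", type=["wav", "mp3", "m4a"])` already restricts the picker to these extensions, and the returned object exposes `.name` and `.size`, so the check could be called as `validate_upload(uploaded.name, uploaded.size)`.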

## 📂 Project Structure  

speech-assistant/  
│  
├── main.py              # Main Streamlit application  
├── requirements.txt     # Python dependencies  
├── utils.py             # Utility functions for processing  
├── assets/              # Example input/output files  
└── README.md            # Project documentation  

## 📖 Example  

### Input:  

- **Audio File**: Team meeting recording.  

### Output:  

- **Transcription**:  
  Welcome to our Q4 planning meeting. Today we'll discuss key objectives and allocate action items...  

- **Generated MoM**:  

  - **Call Sentiments**: Positive with constructive feedback.  
  - **Summary**: Discussed Q4 objectives, marketing strategy, and resource allocation.  
  - **Action Points**:  
    - Complete budget analysis by Dec 20.  
    - Finalize campaign design by Jan 5.  
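
For the download option, a MoM held in memory has to be turned into text. The sketch below renders a dictionary into the same bullet layout as the example output; the `render_mom` name and the dictionary keys are assumptions for illustration, and the project may store and export its MoM differently.

```python
from typing import Dict, List, Union

MomValue = Union[str, List[str]]


def render_mom(mom: Dict[str, MomValue]) -> str:
    """Render a MoM dictionary as the Markdown bullet layout used in the example."""
    lines = ["- **Call Sentiments**: %s" % mom["sentiment"]]
    lines.append("- **Summary**: %s" % mom["summary"])
    for title, key in (("Key Points", "key_points"), ("Action Points", "action_points")):
        items = mom.get(key) or []
        if items:  # skip empty sections, as the example above omits key points
            lines.append("- **%s**:" % title)
            lines.extend("  - %s" % item for item in items)
    return "\n".join(lines) + "\n"


# Data copied from the example output above.
example = {
    "sentiment": "Positive with constructive feedback.",
    "summary": "Discussed Q4 objectives, marketing strategy, and resource allocation.",
    "action_points": [
        "Complete budget analysis by Dec 20.",
        "Finalize campaign design by Jan 5.",
    ],
}
print(render_mom(example))
```

In Streamlit, the resulting string could be offered to the user with `st.download_button("Download MoM", data=text, file_name="mom.md", mime="text/markdown")`; the label and file name here are placeholders.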

## 🤝 Contributing  

We welcome contributions! Please follow these steps:  

1. Fork the repository.  
2. Create a feature branch: `git checkout -b feature-name`.  
3. Commit your changes: `git commit -m 'Add some feature'`.  
4. Push to the branch: `git push origin feature-name`.  
5. Open a pull request.  

⚡ **Transform your conversations into insights with Speech Assistant!**  