Hi everyone!
I'm excited to share a new project I've been working on: Speech Assistant. This application uses Generative AI, OpenAI, and Streamlit to simplify audio-to-text conversion and automate the creation of Minutes of Meeting (MoM).
With Speech Assistant, you can convert meeting recordings, call logs, or any other audio file into text in multiple languages. It doesn't stop there: it can also generate MoMs with key points, action items, and sentiment analysis, and even convert text back into audio!
You can check out the app here: Smart-Speech-Assistant
In this article, I'll walk you through the features, tech stack, and how you can use or contribute to the project. Let's dive in!
Speech Assistant is a cutting-edge tool powered by Generative AI to help you transform your audio files into actionable insights. The project leverages Python, OpenAI, and Streamlit to provide seamless audio-to-text conversion, text-to-audio synthesis, and automated Minutes of Meeting (MoM) creation.
## ✨ Features
- 🎙️ Audio-to-Text Conversion: Convert meetings, call recordings, and other audio files into text in multiple languages.
- Text-to-Audio: Convert text back into audio in multiple languages.
- Minutes of Meeting Generator: Automatically generate MoM for your audio recordings, including:
  - Call Sentiments Analysis
  - Key Points
  - Summaries
  - Action Points
- 📥 Download Options: Export MoM as a downloadable file for easy sharing.
- Multilingual Support: Handle multiple languages for both audio and text.
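To make the MoM generator more concrete, here is a minimal sketch of how a transcript could be turned into a structured MoM request and how the model's reply could be parsed. It is not the project's actual code: the field names in `MOM_FIELDS`, the prompt wording, and both function names are assumptions chosen to mirror the sections listed above.

```python
import json

# Hypothetical field names, chosen to mirror the MoM sections listed above.
MOM_FIELDS = ("sentiment", "key_points", "summary", "action_points")
LIST_FIELDS = ("key_points", "action_points")


def build_mom_messages(transcript: str) -> list:
    """Build chat messages that ask a model for a MoM as one JSON object."""
    system = (
        "You write minutes of meeting. Reply with a single JSON object "
        "with the keys: " + ", ".join(MOM_FIELDS) + ". "
        "Use a string for sentiment and summary, and a list of strings "
        "for key_points and action_points."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": transcript},
    ]


def parse_mom(reply: str) -> dict:
    """Parse the model reply, filling any missing section with an empty value."""
    data = json.loads(reply)
    mom = {}
    for field in MOM_FIELDS:
        default = [] if field in LIST_FIELDS else ""
        mom[field] = data.get(field, default)
    return mom
```

The messages list follows the role/content shape used by chat-style APIs, so it could be handed to whichever chat model the app calls. Asking for JSON keeps each MoM section separately addressable for display and download.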
## 🛠️ Tech Stack
- Language: Python
- Frontend: Streamlit
- Backend: OpenAI APIs for Generative AI tasks
- Libraries:
  - Speech-to-text: OpenAI Whisper
  - Text-to-speech: OpenAI
  - Sentiment Analysis: OpenAI GPT
  - File Handling: pandas, `os`
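As a rough illustration of how the speech-to-text and text-to-speech pieces could be wired to the OpenAI Python client (v1.x), here is a hedged sketch. The model names `whisper-1` and `tts-1`, the voice `alloy`, and the function names are assumptions, not details confirmed by the project. The client is passed in as a parameter so the functions stay easy to test.

```python
def transcribe(client, audio_path: str, model: str = "whisper-1") -> str:
    """Send an audio file to the transcription endpoint and return the text."""
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


def synthesize(client, text: str, out_path: str,
               model: str = "tts-1", voice: str = "alloy") -> str:
    """Turn text into speech and write the returned audio bytes to out_path."""
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    with open(out_path, "wb") as out_file:
        out_file.write(response.read())
    return out_path


# Example usage (needs the `openai` package and an API key; paths are placeholders):
#   from openai import OpenAI
#   client = OpenAI()
#   text = transcribe(client, "meeting.mp3")
#   synthesize(client, text, "meeting_readback.mp3")
```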
## Getting Started
### Prerequisites
- Python 3.8+
- OpenAI API Key
- Install dependencies (from the repository root, after cloning):
  ```bash
  pip install -r requirements.txt
  ```
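If you want to confirm the prerequisites before launching the app, a small check like the one below can help. It is a sketch under one assumption: that the API key is supplied through the `OPENAI_API_KEY` environment variable, which the official OpenAI Python client reads by default. The app itself may collect the key differently, and `check_prerequisites` is a hypothetical helper, not part of the repository.

```python
import os
import sys


def check_prerequisites(env=None, version=None) -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if tuple(version[:2]) < (3, 8):
        problems.append("Python 3.8 or newer is required")
    # Assumption: the key is provided through this environment variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and environment.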
### Installation
1. Clone this repository:
   ```bash
   git clone https://github.com/r123singh/speech-assistant.git
   cd speech-assistant
   ```
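2. Run the Streamlit application (the project structure below lists the entry point as `main.py`; use that filename if `app.py` is not present in your checkout):
   ```bash
   streamlit run app.py
   ```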
### Usage
1. Upload an audio file (supported formats: `.wav`, `.mp3`, `.m4a`).
2. Select the desired operation:
- Convert audio to text.
- Generate MoM.
- Convert text to audio.
3. View and download the generated outputs.
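The upload and operation-selection steps above can be sketched as two small helpers. The extension set comes from step 1. The operation names and the idea that MoM generation runs transcription first are assumptions made for illustration.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}  # formats listed in step 1
# Hypothetical operation names for the three choices in step 2.
OPERATIONS = ("transcribe", "generate_mom", "text_to_audio")


def is_supported_audio(filename: str) -> bool:
    """Check the file extension case-insensitively against the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def plan_steps(operation: str) -> list:
    """Return the processing steps needed for the chosen operation."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    if operation == "generate_mom":
        # Assumption: a MoM is built from the transcript, so transcribe first.
        return ["transcribe", "generate_mom"]
    return [operation]
```

In a Streamlit front end, the same extension list could also be passed to `st.file_uploader(..., type=["wav", "mp3", "m4a"])` so that unsupported files are rejected at upload time.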
## Project Structure
```
speech-assistant/
│
├── main.py              # Main Streamlit application
├── requirements.txt     # Python dependencies
├── utils.py             # Utility functions for processing
├── assets/              # Example input/output files
└── README.md            # Project documentation
```
## Example
### Input:
- **Audio File**: Team meeting recording.
### Output:
- **Transcription**:
  > Welcome to our Q4 planning meeting. Today we'll discuss key objectives and allocate action items...
- **Generated MoM**:
  - **Call Sentiments**: Positive with constructive feedback.
  - **Summary**: Discussed Q4 objectives, marketing strategy, and resource allocation.
  - **Action Points**:
    - Complete budget analysis by Dec 20.
    - Finalize campaign design by Jan 5.
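To show how a generated MoM like this one could become a downloadable file, here is a minimal sketch that renders it as plain text. The dictionary keys and the `format_mom` helper are hypothetical. The values are copied from the example output above.

```python
def format_mom(mom: dict) -> str:
    """Render a MoM dictionary as plain text suitable for a .txt download."""
    lines = ["Minutes of Meeting", ""]
    lines.append(f"Call Sentiments: {mom.get('sentiment', '')}")
    lines.append(f"Summary: {mom.get('summary', '')}")
    lines.append("Action Points:")
    for item in mom.get("action_points", []):
        lines.append(f"- {item}")
    return "\n".join(lines) + "\n"


# Values taken from the example output above; the key names are assumptions.
example = {
    "sentiment": "Positive with constructive feedback.",
    "summary": "Discussed Q4 objectives, marketing strategy, and resource allocation.",
    "action_points": [
        "Complete budget analysis by Dec 20.",
        "Finalize campaign design by Jan 5.",
    ],
}
report = format_mom(example)
```

In a Streamlit app, a string like `report` could be offered to the user with `st.download_button`, passing it as the `data` argument along with a `file_name`.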
## 🤝 Contributing
We welcome contributions! Please follow these steps:
1. Fork the repository.
2. Create a feature branch: `git checkout -b feature-name`.
3. Commit your changes: `git commit -m 'Add some feature'`.
4. Push to the branch: `git push origin feature-name`.
5. Open a pull request.
⚡ **Transform your conversations into insights with Speech Assistant!**