voicegptoss
π€ Voice Agent with gpt-oss-120b - Openai open source model
A lightning-fast voice AI agent powered by OpenAI's new gpt-oss-120b model, running locally with Cerebras AI acceleration and Vapi integration. Experience blazing-fast Time To First Token (TTFT) of 0.3-0.7 seconds for real-time conversational AI.
β¨ Features
- Ultra-Low Latency: TTFT of 0.3-0.7s using OpenAI's gpt-oss-120b model
- Local Deployment: Run your voice agent locally with public tunnel access
- Cerebras AI Acceleration: Leverages Cerebras AI's inference infrastructure for optimal performance
- Vapi Integration: Seamless voice interface through Vapi's telephony platform
- Real-time Processing: True real-time voice conversations with minimal delay
π Performance
This implementation achieves exceptional performance metrics:
- Time To First Token (TTFT): 0.3-0.7 seconds
- Model: OpenAI GPT-4o Realtime (OSS)
- Infrastructure: Cerebras AI + Local deployment
- Latency: Optimized for real-time voice interactions
π οΈ Tech Stack
- AI Model: OpenAI gpt-oss-120b
- Inference: Cerebras AI
- Voice Platform: Vapi
- Tunneling: ngrok
- Backend: Python
- Deployment: Local with public exposure
π Prerequisites
- Python 3.8+
- Git
- ngrok account and installation
- Cerebras AI API key
- Vapi account
π Quick Start
1. Clone the Repository
git clone git@github.com:Navashub/voicegptoss.git
cd voicegptoss
2. Set Up Environment
Create a .env
file in the project root:
touch .env
Add your Cerebras AI API key to the .env
file:
CEREBRAS_API_KEY=your_cerebras_api_key_here
3. Get Cerebras AI API Key
- Visit Cerebras AI
- Sign up for an account
- Navigate to API keys section
- Generate a new API key
- Copy the key to your
.env
file
4. Set Up ngrok
- Create an account at ngrok.com
- Install ngrok on your system:
# Windows (using chocolatey)
choco install ngrok
# macOS (using homebrew)
brew install ngrok/ngrok/ngrok
# Linux
curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null
echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list
sudo apt update && sudo apt install ngrok
- Authenticate ngrok with your token:
ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN
5. Install Dependencies
pip install -r requirements.txt
6. Run the Application
python main.py
The application will:
- Start the local server
- Create an ngrok tunnel
- Display the public URL in the console
7. Configure Vapi
- Copy the public ngrok URL from your console output
- Go to your Vapi dashboard
- Add the public URL as your webhook endpoint
- Configure your voice agent settings
8. Test Your Voice Agent
Your voice agent is now live and ready to handle calls through Vapi!
π§ Configuration
Environment Variables
-
CEREBRAS_API_KEY
: Your Cerebras AI API key for model inference -
NGROK_AUTHTOKEN
: Your ngrok authentication token (optional, can be set via ngrok config)
Customization
You can modify the voice agent behavior by editing the configuration in main.py
:
- Adjust model parameters
- Modify response formatting
- Configure webhook endpoints
- Set custom voice settings
π Performance Optimization
This setup is optimized for minimal latency:
- Cerebras AI: Provides fast inference for the GPT-4o model
- Local Deployment: Eliminates additional network hops
- ngrok Tunneling: Secure public access without complex networking
- Optimized Code: Streamlined request/response handling
π Troubleshooting
Common Issues
ngrok Authentication Error
# Make sure you're using the tunnel authtoken, not API key
ngrok config add-authtoken YOUR_TUNNEL_AUTHTOKEN
Cerebras API Key Issues
- Verify your API key is correctly added to
.env
- Check your Cerebras AI account has sufficient credits
- Ensure API key has proper permissions
Connection Issues
- Check firewall settings
- Verify ngrok tunnel is active
- Confirm webhook URL in Vapi matches ngrok public URL
π Monitoring
Monitor your voice agent performance:
- Check console logs for TTFT metrics
- Monitor Cerebras AI usage in their dashboard
- Track call quality in Vapi analytics
- Use ngrok dashboard for tunnel statistics
π€ Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
π License
This project is licensed under the MIT License - see the LICENSE file for details.
π Acknowledgments
- OpenAI for GPT-4o Realtime model
- Cerebras AI for high-performance inference
- Vapi for voice interface platform
- ngrok for secure tunneling solution
π Support
If you encounter any issues or have questions:
- Check the troubleshooting section above
- Open an issue on GitHub
- Review the logs for error details
β‘ Ready to build the future of voice AI? Get started now!
Top comments (0)