DEV Community

Cover image for Trying out the Openai new open-source models
Navas Herbert
Navas Herbert

Posted on

Trying out the Openai new open-source models

voicegptoss

🎀 Voice Agent with gpt-oss-120b - Openai open source model

A lightning-fast voice AI agent powered by OpenAI's new gpt-oss-120b model, running locally with Cerebras AI acceleration and Vapi integration. Experience blazing-fast Time To First Token (TTFT) of 0.3-0.7 seconds for real-time conversational AI.

✨ Features

  • Ultra-Low Latency: TTFT of 0.3-0.7s using OpenAI's gpt-oss-120b model
  • Local Deployment: Run your voice agent locally with public tunnel access
  • Cerebras AI Acceleration: Leverages Cerebras AI's inference infrastructure for optimal performance
  • Vapi Integration: Seamless voice interface through Vapi's telephony platform
  • Real-time Processing: True real-time voice conversations with minimal delay

πŸš€ Performance

This implementation achieves exceptional performance metrics:

  • Time To First Token (TTFT): 0.3-0.7 seconds
  • Model: OpenAI GPT-4o Realtime (OSS)
  • Infrastructure: Cerebras AI + Local deployment
  • Latency: Optimized for real-time voice interactions

πŸ› οΈ Tech Stack

  • AI Model: OpenAI gpt-oss-120b
  • Inference: Cerebras AI
  • Voice Platform: Vapi
  • Tunneling: ngrok
  • Backend: Python
  • Deployment: Local with public exposure

πŸ“‹ Prerequisites

  • Python 3.8+
  • Git
  • ngrok account and installation
  • Cerebras AI API key
  • Vapi account

πŸš€ Quick Start

1. Clone the Repository

git clone git@github.com:Navashub/voicegptoss.git
cd voicegptoss
Enter fullscreen mode Exit fullscreen mode

2. Set Up Environment

Create a .env file in the project root:

touch .env
Enter fullscreen mode Exit fullscreen mode

Add your Cerebras AI API key to the .env file:

CEREBRAS_API_KEY=your_cerebras_api_key_here
Enter fullscreen mode Exit fullscreen mode

3. Get Cerebras AI API Key

  1. Visit Cerebras AI
  2. Sign up for an account
  3. Navigate to API keys section
  4. Generate a new API key
  5. Copy the key to your .env file

4. Set Up ngrok

  1. Create an account at ngrok.com
  2. Install ngrok on your system:
   # Windows (using chocolatey)
   choco install ngrok

   # macOS (using homebrew)
   brew install ngrok/ngrok/ngrok

   # Linux
   curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null
   echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list
   sudo apt update && sudo apt install ngrok
Enter fullscreen mode Exit fullscreen mode
  1. Authenticate ngrok with your token:
   ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN
Enter fullscreen mode Exit fullscreen mode

5. Install Dependencies

pip install -r requirements.txt
Enter fullscreen mode Exit fullscreen mode

6. Run the Application

python main.py
Enter fullscreen mode Exit fullscreen mode

The application will:

  • Start the local server
  • Create an ngrok tunnel
  • Display the public URL in the console

7. Configure Vapi

  1. Copy the public ngrok URL from your console output
  2. Go to your Vapi dashboard
  3. Add the public URL as your webhook endpoint
  4. Configure your voice agent settings

8. Test Your Voice Agent

Your voice agent is now live and ready to handle calls through Vapi!

πŸ”§ Configuration

Environment Variables

  • CEREBRAS_API_KEY: Your Cerebras AI API key for model inference
  • NGROK_AUTHTOKEN: Your ngrok authentication token (optional, can be set via ngrok config)

Customization

You can modify the voice agent behavior by editing the configuration in main.py:

  • Adjust model parameters
  • Modify response formatting
  • Configure webhook endpoints
  • Set custom voice settings

πŸ“Š Performance Optimization

This setup is optimized for minimal latency:

  • Cerebras AI: Provides fast inference for the GPT-4o model
  • Local Deployment: Eliminates additional network hops
  • ngrok Tunneling: Secure public access without complex networking
  • Optimized Code: Streamlined request/response handling

πŸ› Troubleshooting

Common Issues

ngrok Authentication Error

# Make sure you're using the tunnel authtoken, not API key
ngrok config add-authtoken YOUR_TUNNEL_AUTHTOKEN
Enter fullscreen mode Exit fullscreen mode

Cerebras API Key Issues

  • Verify your API key is correctly added to .env
  • Check your Cerebras AI account has sufficient credits
  • Ensure API key has proper permissions

Connection Issues

  • Check firewall settings
  • Verify ngrok tunnel is active
  • Confirm webhook URL in Vapi matches ngrok public URL

πŸ“ˆ Monitoring

Monitor your voice agent performance:

  • Check console logs for TTFT metrics
  • Monitor Cerebras AI usage in their dashboard
  • Track call quality in Vapi analytics
  • Use ngrok dashboard for tunnel statistics

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenAI for GPT-4o Realtime model
  • Cerebras AI for high-performance inference
  • Vapi for voice interface platform
  • ngrok for secure tunneling solution

πŸ“ž Support

If you encounter any issues or have questions:

  1. Check the troubleshooting section above
  2. Open an issue on GitHub
  3. Review the logs for error details

⚑ Ready to build the future of voice AI? Get started now!

Top comments (0)