mamba

Posted on Mar 19

Building an AI-Powered Voicemail Assistant with OpenAI and Twilio

#twilio #openai

Introduction

Voicemail has remained largely unchanged for decades. In a world where real-time communication is key, users often find voicemail tedious and inefficient. What if AI could transcribe, summarize, and even rank voicemail messages to help users stay in control?

That’s exactly what Vernon AI does! In this guide, we’ll build an AI-powered voicemail assistant using Twilio for call forwarding and voicemail handling and OpenAI’s Whisper and GPT APIs for transcription and intelligent summaries.

By the end of this tutorial, you’ll know how to:
• Set up Twilio to receive and store voicemails.
• Use OpenAI’s Whisper API to transcribe voicemails into text.
• Leverage GPT-4 to generate concise voicemail summaries.
• Store and retrieve data using MongoDB.

Let’s dive in!

Prerequisites

To follow along, you’ll need:
• A Twilio account (sign up at Twilio).
• An OpenAI API key (get one from OpenAI).
• A MongoDB database for storing voicemails.
• Python 3.9+ with the flask, twilio, openai, pymongo, requests, and python-dotenv packages installed.
• Basic knowledge of REST APIs and webhooks.

Step 1: Setting Up Twilio to Receive Voicemails

1.1 Buy a Twilio Phone Number

Twilio provides virtual phone numbers that can receive calls and record voicemails. After signing up:
1. Go to Twilio Console > Phone Numbers.
2. Buy a local or toll-free number.
3. Under Voice & Fax, set the Webhook URL to your server (e.g., https://yourdomain.com/twilio/answer).

1.2 Twilio Webhook for Answering Calls

When a call is received, our webhook will play a greeting and start recording the voicemail.

from flask import Flask, request, Response
from twilio.twiml.voice_response import VoiceResponse

app = Flask(__name__)

@app.route("/twilio/answer", methods=["POST"])
def answer_call():
    response = VoiceResponse()
    response.say("Hi, you've reached Vernon AI. Please leave a message after the beep.")
    response.record(action="/twilio/voicemail_callback", max_length=120, finish_on_key="#")
    response.say("Thank you. Goodbye.")
    return Response(str(response), mimetype="text/xml")

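# Keep this block at the very bottom of the file, after every route is defined.
if __name__ == "__main__":
    app.run(port=5000, debug=True)  # debug=True is for local development only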

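This function:
• Greets the caller and instructs them to leave a message.
• Records the voicemail (up to 120 seconds, or until the caller presses #) and then requests /twilio/voicemail_callback with the result.
• Falls through to the final goodbye only if no audio was recorded. When a recording exists, Twilio continues the call with whatever TwiML the action URL returns, so the verbs after the recording are skipped.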
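
Because this webhook is publicly reachable, it is worth checking that each request really comes from Twilio. Twilio signs requests with an X-Twilio-Signature header: the full request URL, followed by the POST parameters sorted by name and concatenated as name+value, is signed with HMAC-SHA1 using your auth token and then Base64-encoded. The sketch below reimplements that check with the standard library for illustration only; the function names are hypothetical, and in practice the twilio package's RequestValidator does the same job.

```python
import base64
import hashlib
import hmac
from typing import Mapping, Optional


def compute_twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Build the signature Twilio is expected to send for a POST request."""
    # Full URL, then every POST parameter sorted by name, joined without delimiters.
    payload = url + "".join(f"{name}{params[name]}" for name in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: Mapping[str, str], signature: Optional[str]
) -> bool:
    """Compare the expected signature with the received header in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, signature or "")
```

Inside a Flask route, the inputs would be request.url, request.form, and request.headers.get("X-Twilio-Signature"). If the app sits behind a proxy or tunnel, the URL Flask sees may differ from the one Twilio signed, so the check may need the public URL instead.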

Step 2: Handling Voicemail Callbacks

Twilio will send the voicemail recording URL and caller information to our /twilio/voicemail_callback endpoint.


import os
import requests
from pymongo import MongoClient
from dotenv import load_dotenv

load_dotenv()
MONGODB_URI = os.getenv("MONGODB_URI")
TWILIO_ACCOUNT_SID = os.getenv("TWILIO_ACCOUNT_SID")
TWILIO_AUTH_TOKEN = os.getenv("TWILIO_AUTH_TOKEN")

client = MongoClient(MONGODB_URI)
db = client["voicemail_db"]
voicemails = db["voicemails"]

@app.route("/twilio/voicemail_callback", methods=["POST"])
def voicemail_callback():
    recording_url = request.form.get("RecordingUrl")
    caller_number = request.form.get("From")

    if not recording_url:
        return "No Recording URL", 400

    voicemail_entry = {
        "caller": caller_number,
        "audio_url": f"{recording_url}.mp3",
        "transcript": "",
        "summary": ""
    }
    voicemails.insert_one(voicemail_entry)
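    # Twilio expects TwiML back from the action URL, so say goodbye and hang up.
    response = VoiceResponse()
    response.say("Thank you. Goodbye.")
    response.hangup()
    return Response(str(response), mimetype="text/xml")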

This stores the voicemail metadata (caller and audio URL) in MongoDB, leaving the transcript and summary empty until the recording is processed.
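
To make the callback payload more concrete, here is a small sketch that turns the form fields Twilio posts into the document we store. It is written as a pure function so it can be tested without Flask or MongoDB. The helper name build_voicemail_entry, the extra call_sid and duration_seconds fields, and the sample values are assumptions added for illustration; RecordingUrl, RecordingDuration, From, and CallSid are the Twilio parameter names.

```python
from typing import Mapping, Optional


def build_voicemail_entry(form: Mapping[str, str]) -> Optional[dict]:
    """Map Twilio's callback form fields to a voicemail document.

    Returns None when the request carries no recording URL.
    """
    recording_url = form.get("RecordingUrl")
    if not recording_url:
        return None

    duration = form.get("RecordingDuration")
    return {
        "caller": form.get("From"),
        "call_sid": form.get("CallSid"),
        # Appending .mp3 asks Twilio for the MP3 rendition of the recording.
        "audio_url": f"{recording_url}.mp3",
        "duration_seconds": int(duration) if duration and duration.isdigit() else None,
        "transcript": "",
        "summary": "",
    }


# Hypothetical payload, shaped like the fields Twilio posts to the action URL.
example_form = {
    "From": "+15550001111",
    "CallSid": "CA0000000000000000000000000000000",
    "RecordingUrl": "https://api.twilio.com/example/Recordings/RE000",
    "RecordingDuration": "42",
}
entry = build_voicemail_entry(example_form)
```

In the Flask route, request.form can be passed straight to this helper, and a None result maps to the 400 response.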

Step 3: Transcribing Voicemails with OpenAI Whisper

Now, let’s transcribe the voicemail using OpenAI Whisper.


import openai

# The calls below use the openai>=1.0 Python SDK (module-level client).
openai.api_key = os.getenv("OPENAI_API_KEY")

def transcribe_voicemail(audio_url):
    # Twilio may require HTTP Basic auth to download recordings,
    # so pass the account credentials loaded in Step 2.
    response = requests.get(
        audio_url,
        auth=(TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN),
        timeout=30,
    )
    response.raise_for_status()
    with open("voicemail.mp3", "wb") as f:
        f.write(response.content)

    with open("voicemail.mp3", "rb") as f:
        transcript = openai.audio.transcriptions.create(model="whisper-1", file=f)
    return transcript.text

This function downloads the voicemail audio from Twilio and transcribes it.

Step 4: Generating a Summary Using GPT-4

After transcription, we can summarize the voicemail with OpenAI’s GPT-4.

def summarize_voicemail(transcript):
    prompt = f"""
    Summarize this voicemail in a professional and concise way:
    "{transcript}"
    """

    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content.strip()

Now, let’s update our MongoDB entry with the transcription and summary.


def process_voicemail(voicemail_id, audio_url):
    transcript = transcribe_voicemail(audio_url)
    summary = summarize_voicemail(transcript)

    voicemails.update_one(
        {"_id": voicemail_id},
        {"$set": {"transcript": transcript, "summary": summary}}
    )
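
Nothing above actually calls process_voicemail yet. Transcription and summarization can take several seconds, so running them inside the webhook would keep Twilio waiting for a response. One option, sketched below with the standard library, is to hand the work to a background thread and return immediately. The helper name process_in_background is hypothetical, and a task queue would be a sturdier choice in production because a daemon thread is lost if the server process exits.

```python
import threading
from typing import Any, Callable


def process_in_background(
    voicemail_id: Any,
    audio_url: str,
    worker: Callable[[Any, str], None],
) -> threading.Thread:
    """Run worker(voicemail_id, audio_url) on a daemon thread and return the thread."""
    thread = threading.Thread(
        target=worker,
        args=(voicemail_id, audio_url),
        daemon=True,  # do not block interpreter shutdown on unfinished work
    )
    thread.start()
    return thread
```

In voicemail_callback, this could be wired in by keeping the result of the insert, for example result = voicemails.insert_one(voicemail_entry), and then calling process_in_background(result.inserted_id, voicemail_entry["audio_url"], process_voicemail).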

Step 5: Displaying Voicemails in a Web Dashboard

You can now build a frontend to display the summarized voicemails, with:
• Caller ID
• Timestamp
• Transcript & Summary
• Audio Playback

You can use React, Next.js, or any frontend framework to fetch and display this data.
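
Before a frontend can fetch this data, each MongoDB document has to be turned into something JSON-friendly. The sketch below shows one way to do that. Since the stored entry has no explicit timestamp field, it derives one from the document's _id: MongoDB ObjectIds expose their creation time as generation_time. The function name serialize_voicemail and the output field names are assumptions for illustration.

```python
from typing import Any, Dict


def serialize_voicemail(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a voicemail document into a JSON-serializable dict."""
    object_id = doc.get("_id")
    # ObjectId.generation_time is a timezone-aware datetime; other id types yield None.
    created = getattr(object_id, "generation_time", None)
    return {
        "id": str(object_id),
        "caller": doc.get("caller"),
        "timestamp": created.isoformat() if created else None,
        "transcript": doc.get("transcript", ""),
        "summary": doc.get("summary", ""),
        "audio_url": doc.get("audio_url"),
    }
```

A Flask route could then return jsonify([serialize_voicemail(d) for d in voicemails.find().sort("_id", -1)]) to list the newest voicemails first.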

Conclusion

In this guide, we built an AI-powered voicemail assistant that:
✅ Answers calls and records voicemails using Twilio
✅ Transcribes messages with OpenAI Whisper
✅ Generates intelligent summaries with GPT-4
✅ Stores and retrieves voicemails using MongoDB

This is just the beginning! You can expand this project by:
• Adding voicemail categorization (urgent, spam, etc.).
• Enabling SMS/email notifications with summaries.
• Creating a voice-based chatbot to interact with callers.

Let me know what you think! Would you use an AI-powered voicemail assistant?
https://www.vernonaisolutions.com/
