
TJ Durnford
Connecting LLMs to Twilio: A Step-by-Step Guide

Integrating OpenAI's LLM with Twilio Using Vercel AI SDK

In this guide, we'll walk through the process of integrating OpenAI's language models with Twilio's Conversation Relay using the Vercel AI SDK. This integration lets you build a virtual voice assistant that can handle user queries and provide information over a phone call. We'll cover setting up the project, configuring Redis, and running the server. We'll also explain how the bufferTransform function sends larger chunks of text to Twilio, avoiding the inefficiency of sending one token at a time.

Prerequisites

  • Node.js and npm installed on your machine.
  • A Twilio account.
  • An OpenAI API key.
  • A Redis instance for managing conversation state.

Step 1: Setting Up the Project

First, create a new directory for your project and initialize it with npm:

mkdir twilio-openai-integration
cd twilio-openai-integration
npm init -y

Install the necessary dependencies:

npm install ai express express-ws redis twilio @ai-sdk/openai uuid ws dotenv
npm install --save-dev typescript @types/node @types/ws @types/express-ws @types/express

Step 2: Project Structure

Create the following file structure:

twilio-openai-integration/
│
├── managers/
│   └── ConversationManager.ts
│
├── types/
│   └── twilio.ts
│
├── utils/
│   └── bufferTransform.ts
│
├── .env
└── index.ts
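The types/twilio.ts module is referenced by the server but never shown in this guide. Conversation Relay exchanges JSON messages over the WebSocket, so a discriminated union works well here. The fields below are a sketch inferred from how the server code consumes these events; treat the optional fields as assumptions:

```typescript
// Hypothetical contents of types/twilio.ts — a sketch of the Conversation
// Relay message shapes the server handles. Only `type`, `voicePrompt`, and
// the three event names are relied on by the code in this guide.
export type SetupMessage = {
  type: "setup";
  sessionId?: string;
  callSid?: string;
};

export type PromptMessage = {
  type: "prompt";
  voicePrompt: string;
};

export type EndMessage = {
  type: "end";
};

export type EventMessage = SetupMessage | PromptMessage | EndMessage;
```

The type field lets TypeScript narrow the union inside the if/else if chain in index.ts.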

Step 3: Environment Configuration

Create a .env file in the root of your project and add your environment variables:

OPENAI_API_KEY=your-openai-api-key
PORT=5000
REDIS_URL=redis://localhost:6379
SERVER_DOMAIN=your-server-domain.example.com
TAVILY_API_KEY=your-tavily-api-key

Note that SERVER_DOMAIN is interpolated into a wss:// URL by the server, so it should be a bare, publicly reachable hostname (a tunneling tool such as ngrok provides one for local development), not an http:// URL. TAVILY_API_KEY is only needed if you later add a web-search tool; the server code in this guide never reads it.

Step 4: Implementing the Server

In index.ts, implement the server logic:

import express from "express";
import ExpressWs from "express-ws";
import VoiceResponse from "twilio/lib/twiml/VoiceResponse";
import { CoreMessage, streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { v4 as uuid } from "uuid";
import { type WebSocket } from "ws";
import "dotenv/config";

import { ConversationManager } from "./managers/ConversationManager";
import { EventMessage } from "./types/twilio";
import { bufferTransform } from "./utils/bufferTransform";

const app = ExpressWs(express()).app;
const PORT = parseInt(process.env.PORT || "5000");

const welcomeGreeting = "Hi there! How can I help you today?";
const systemInstructions =
  "You are a virtual voice assistant. You can help the user with their questions and provide information.";

app.use(express.urlencoded({ extended: false }));

app.post("/call/incoming", async (_, res) => {
  const response = new VoiceResponse();

  response.connect().conversationRelay({
    url: `wss://${process.env.SERVER_DOMAIN}/call/connection`,
    welcomeGreeting,
  });

  res.writeHead(200, { "Content-Type": "text/xml" });
  res.end(response.toString());
});

app.ws("/call/connection", (ws: WebSocket) => {
  const sessionId = uuid();

  ws.on("message", async (data: string) => {
    const event: EventMessage = JSON.parse(data);
    const conversation = new ConversationManager(sessionId);

    if (event.type === "setup") {
      // Add welcome message to conversation transcript
      const welcomeMessage: CoreMessage = {
        role: "assistant",
        content: welcomeGreeting,
      };

      await conversation.addMessage(welcomeMessage);
    } else if (event.type === "prompt") {
      // Add user message to conversation and retrieve all messages
      const message: CoreMessage = { role: "user", content: event.voicePrompt };
      await conversation.addMessage(message);
      const messages = await conversation.getMessages();

      const controller = new AbortController();

      // Stream text from OpenAI model
      const { textStream, text: completeText } = streamText({
        abortSignal: controller.signal,
        experimental_transform: bufferTransform,
        model: openai("gpt-4o-mini"),
        messages,
        maxSteps: 10,
        system: systemInstructions,
      });

      // Iterate over text stream and send messages to Twilio
      for await (const text of textStream) {
        if (controller.signal.aborted) {
          break;
        }

        ws.send(
          JSON.stringify({
            type: "text",
            token: text,
            last: false,
          })
        );
      }

      // Send last message to Twilio
      if (!controller.signal.aborted) {
        ws.send(
          JSON.stringify({
            type: "text",
            token: "",
            last: true,
          })
        );
      }

      // Add complete text to conversation transcript
      const agentMessage: CoreMessage = {
        role: "assistant",
        content: await completeText,
      };

      void conversation.addMessage(agentMessage);
    } else if (event.type === "end") {
      // Clear conversation transcript when call ends
      void conversation.clearMessages();
    }
  });

  ws.on("error", console.error);
});

app.listen(PORT, () => {
  console.log(`Local: http://localhost:${PORT}`);
  console.log(`Remote: https://${process.env.SERVER_DOMAIN}`);
});
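For reference, the TwiML document that /call/incoming returns looks roughly like this (with the hostname substituted from SERVER_DOMAIN; attribute names follow Twilio's ConversationRelay TwiML noun):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <ConversationRelay url="wss://your-server-domain.example.com/call/connection" welcomeGreeting="Hi there! How can I help you today?"/>
  </Connect>
</Response>
```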

Explanation

  • Express and WebSocket Setup: We use express-ws to handle WebSocket connections, which are essential for real-time communication with Twilio's Conversation Relay.
  • Twilio VoiceResponse: This sets up a Twilio call and connects it to our WebSocket endpoint.
  • WebSocket Handling: We handle different types of events (setup, prompt, end) to manage the conversation state and interact with the OpenAI model.
  • OpenAI Integration: We use the Vercel AI SDK to stream text from OpenAI's model, transforming it with bufferTransform to send larger chunks.
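The ConversationManager used above is also not shown in this guide. Here is a dependency-free sketch backed by an in-memory Map; in the real project each method would instead issue the Redis command noted in the comments, using a redis client connected via REDIS_URL from .env. The message shape mirrors the AI SDK's CoreMessage:

```typescript
// Sketch of managers/ConversationManager.ts with an in-memory Map standing
// in for Redis. Each session's transcript is a list of JSON-encoded messages
// keyed by session id.
type Message = { role: "system" | "user" | "assistant"; content: string };

const store = new Map<string, string[]>(); // stand-in for the Redis instance

export class ConversationManager {
  private key: string;

  constructor(sessionId: string) {
    this.key = `conversation:${sessionId}`;
  }

  // Redis equivalent: RPUSH key <json>
  async addMessage(message: Message): Promise<void> {
    const list = store.get(this.key) ?? [];
    list.push(JSON.stringify(message));
    store.set(this.key, list);
  }

  // Redis equivalent: LRANGE key 0 -1
  async getMessages(): Promise<Message[]> {
    return (store.get(this.key) ?? []).map((m) => JSON.parse(m));
  }

  // Redis equivalent: DEL key
  async clearMessages(): Promise<void> {
    store.delete(this.key);
  }
}
```

Serializing each message to JSON keeps the storage layer interchangeable: swapping the Map for Redis list commands changes no call sites in index.ts.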

Step 5: Implementing bufferTransform

In utils/bufferTransform.ts, implement the buffer transformation logic:

import { StreamTextTransform, TextStreamPart } from "ai";

export const bufferTransform: StreamTextTransform<any> = () => {
  let buffer = "";
  let threshold = 200;

  return new TransformStream<TextStreamPart<any>, TextStreamPart<any>>({
    transform(chunk, controller) {
      if (chunk.type === "text-delta") {
        buffer += chunk.textDelta;
        if (buffer.length >= threshold) {
          controller.enqueue({ ...chunk, textDelta: buffer });
          buffer = "";
          if (threshold < 5000) {
            threshold += 200;
          }
        }
      } else {
        controller.enqueue(chunk);
      }
    },
    flush(controller) {
      if (buffer.length > 0) {
        controller.enqueue({ type: "text-delta", textDelta: buffer });
      }
    },
  });
};

Explanation

  • Buffering: The bufferTransform function accumulates text tokens into a buffer. Once the buffer reaches a certain size (threshold), it sends the accumulated text as a single chunk.
  • Dynamic Threshold: The threshold increases gradually to optimize the size of the chunks being sent, improving efficiency by reducing the number of WebSocket messages.
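To see the buffering in action, here is a dependency-free sketch of the same idea, runnable on Node 18+ (where TransformStream is a global). The fixed 10-character threshold is just for demonstration; the article's version starts at 200 and grows it:

```typescript
// Minimal buffering transform: accumulate text deltas and emit them in
// larger chunks once a size threshold is crossed.
type Part = { type: "text-delta"; textDelta: string };

function makeBufferTransform(threshold: number): TransformStream<Part, Part> {
  let buffer = "";
  return new TransformStream<Part, Part>({
    transform(chunk, controller) {
      buffer += chunk.textDelta;
      if (buffer.length >= threshold) {
        controller.enqueue({ type: "text-delta", textDelta: buffer });
        buffer = "";
      }
    },
    flush(controller) {
      // Emit whatever is left when the stream ends.
      if (buffer.length > 0) {
        controller.enqueue({ type: "text-delta", textDelta: buffer });
      }
    },
  });
}

// Feed token-sized deltas in and collect the buffered chunks that come out.
async function runDemo(tokens: string[], threshold: number): Promise<string[]> {
  const ts = makeBufferTransform(threshold);
  const writer = ts.writable.getWriter();
  const feeding = (async () => {
    for (const t of tokens) {
      await writer.write({ type: "text-delta", textDelta: t });
    }
    await writer.close(); // triggers flush()
  })();

  const out: string[] = [];
  const reader = ts.readable.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done || value === undefined) break;
    out.push(value.textDelta);
  }
  await feeding;
  return out;
}

// Eight tiny tokens go in; three buffered chunks come out.
runDemo(["Hello", ", ", "world", "! ", "How", " are", " you", "?"], 10).then(
  (chunks) => console.log(chunks)
);
```

Eight WebSocket sends collapse into three, which is the whole point of the transform when tokens arrive a few characters at a time.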

Step 6: Running the Project

Ensure your Redis instance is running and accessible. Then, start your server:

npm run build
node dist/index.js

This assumes a build script in package.json that runs tsc (for example, "build": "tsc") and a tsconfig.json with outDir set to dist. Note that the compiled entry point is a .js file, not .ts.
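If your project doesn't have a build step yet, a minimal tsconfig.json along these lines would work (a sketch; adjust the target and includes to taste), together with adding "build": "tsc" under "scripts" in package.json:

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "strict": true,
    "skipLibCheck": true,
    "outDir": "dist"
  },
  "include": ["index.ts", "managers/**/*", "types/**/*", "utils/**/*"]
}
```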

Your server should now be running, ready to handle incoming calls and relay conversations through Twilio.

Conclusion

By following these steps, you've set up a system that integrates OpenAI's language models with Twilio's Conversation Relay, using the Vercel AI SDK. This setup allows for efficient communication by buffering text tokens and sending them in larger chunks, enhancing the performance of your virtual voice assistant.

Full Code on GitHub

You can view the full code for this project on GitHub: GitHub Repository
