DEV Community

Devops Den

Creating a Text-to-Speech AI Agent in JavaScript using OpenAI API

Introduction

Have you ever wanted to convert text into speech using AI? OpenAI’s Text-to-Speech (TTS) API lets developers generate natural-sounding speech from text. In this post, we will build a simple TTS agent in JavaScript on top of that API. By the end, you'll have a working Node.js script that sends text to the API, saves the returned audio as an MP3 file, and plays it back.

Prerequisites

Before we begin, ensure you have the following:

  • Node.js installed (Download here)
  • An OpenAI API Key (Get it here)
  • Basic knowledge of JavaScript

Step 1: Install Dependencies

We will use axios to interact with OpenAI’s API and play-sound to play the generated audio.

npm install axios play-sound

Step 2: Writing the TTS Function

We will create a function that:

  • Sends a request to OpenAI’s TTS API
  • Saves the generated audio
  • Plays the audio file

const axios = require('axios');
const player = require('play-sound')();
const fs = require('fs');

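// Prefer the OPENAI_API_KEY environment variable so the key stays out of source control;
// the placeholder below is only a fallback for quick local experiments.
const OPENAI_API_KEY = process.env.OPENAI_API_KEY || 'your-api-key';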

async function textToSpeech(text) {
    try {
        const response = await axios.post(
            'https://api.openai.com/v1/audio/speech',
            {
                model: 'tts-1',
                input: text,
                voice: 'alloy',
            },
            {
                headers: {
                    'Authorization': `Bearer ${OPENAI_API_KEY}`,
                    'Content-Type': 'application/json'
                },
                responseType: 'arraybuffer'
            }
        );

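        const filePath = 'output.mp3';
        // response.data is a Buffer because of responseType: 'arraybuffer'.
        fs.writeFileSync(filePath, response.data);
        console.log('Playing audio...');
        // play-sound shells out to an installed audio player (e.g. afplay, mpg123).
        player.play(filePath, (err) => {
            if (err) console.error('Playback error:', err);
        });
    } catch (error) {
        // Error bodies also arrive as a Buffer, so decode them before logging.
        const details = error.response
            ? Buffer.from(error.response.data).toString('utf8')
            : error.message;
        console.error('Error:', details);
    }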
}

textToSpeech("Hello, this is an AI-generated voice!");

Step 3: Running the Script

Save the file as tts.js and run it using:

node tts.js
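
Once the script works on a short sentence, longer inputs are a natural next test. OpenAI's documentation lists a 4096-character limit on the input field, so a long article has to be sent in pieces. The helper below is a minimal sketch of one way to do that; splitForTts and MAX_INPUT_CHARS are hypothetical names that are not part of the original script, and the sentence-splitting rule is a simple assumption rather than a full tokenizer.

```javascript
// Assumption: the speech endpoint rejects input longer than 4096 characters.
const MAX_INPUT_CHARS = 4096;

// Split text into chunks of at most maxLen characters, preferring sentence boundaries.
function splitForTts(text, maxLen = MAX_INPUT_CHARS) {
    // A "sentence" ends at ., !, or ? plus trailing whitespace; a final
    // fragment without punctuation is kept as well.
    const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
    const chunks = [];
    let current = '';
    for (const sentence of sentences) {
        // Start a new chunk when adding this sentence would exceed the limit.
        if (current && current.length + sentence.length > maxLen) {
            chunks.push(current.trim());
            current = '';
        }
        // A single sentence longer than maxLen is cut hard at the limit.
        let rest = sentence;
        while (rest.length > maxLen) {
            chunks.push(rest.slice(0, maxLen));
            rest = rest.slice(maxLen);
        }
        current += rest;
    }
    if (current.trim()) chunks.push(current.trim());
    return chunks;
}

// Small demo with an artificially low limit so the split is visible.
console.log(splitForTts('One. Two. Three.', 10));
```

Each chunk could then be passed to textToSpeech in turn. Because the function above always writes to output.mp3, a per-chunk file name would be needed to keep the pieces from overwriting each other.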

Customization

  • Change the Voice: OpenAI provides multiple voices, including alloy, echo, fable, onyx, nova, and shimmer. Change the voice field in the request to try them.
  • Integrate into a Web App: Use this in a frontend React/Next.js project by calling the API from a backend, so the API key never reaches the browser.
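
Both customizations can be sketched together. The snippet below is an illustrative outline, not a production server: buildSpeechRequest, createServer, the /speak route, and the VOICES list are hypothetical names and assumptions introduced here, and the code relies on the global fetch available in Node.js 18 and later instead of axios. The browser would post JSON such as { "text": "...", "voice": "nova" } to the backend, which holds the API key and forwards the request to OpenAI.

```javascript
const http = require('http');

// Assumption: voices available for tts-1 at the time of writing.
const VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];

// Build the JSON body for the speech endpoint, defaulting to the 'alloy' voice.
function buildSpeechRequest(text, voice = 'alloy') {
    if (typeof text !== 'string' || text.trim() === '') {
        throw new Error('text must be a non-empty string');
    }
    if (!VOICES.includes(voice)) {
        throw new Error(`Unknown voice: ${voice}`);
    }
    return { model: 'tts-1', input: text, voice };
}

// Hypothetical backend: POST /speak with { text, voice } returns MP3 audio.
function createServer(apiKey) {
    return http.createServer(async (req, res) => {
        if (req.method !== 'POST' || req.url !== '/speak') {
            res.writeHead(404);
            res.end();
            return;
        }
        try {
            // Collect and parse the JSON request body.
            req.setEncoding('utf8');
            let raw = '';
            for await (const chunk of req) raw += chunk;
            const { text, voice } = JSON.parse(raw);

            // Forward the validated request to OpenAI with the server-side key.
            const upstream = await fetch('https://api.openai.com/v1/audio/speech', {
                method: 'POST',
                headers: {
                    'Authorization': `Bearer ${apiKey}`,
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify(buildSpeechRequest(text, voice)),
            });
            const body = Buffer.from(await upstream.arrayBuffer());
            res.writeHead(upstream.status, {
                'Content-Type': upstream.ok ? 'audio/mpeg' : 'application/json',
            });
            res.end(body);
        } catch (err) {
            // Invalid JSON, a missing text field, or an unknown voice ends up here.
            res.writeHead(400, { 'Content-Type': 'application/json' });
            res.end(JSON.stringify({ error: err.message }));
        }
    });
}

// To run it: createServer(process.env.OPENAI_API_KEY).listen(3000);
```

Validating the voice before the request leaves the server gives the frontend a clear 400 error instead of a raw upstream failure, and keeping the key on the server is the main reason to route the call through a backend at all.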

Conclusion

With just a few lines of JavaScript, we have built a working text-to-speech agent on top of OpenAI's API. Whether for accessibility, automation, or just for fun, AI-driven voice synthesis is now easy to add to a project. Try it out and enhance your projects with realistic AI voices!
