Alex Lakatos 🥑 for Vonage

Posted on • Originally published at nexmo.com on

Build an Interactive Voice Response Menu using Node.js and Express

This article originally appeared on the Nexmo blog, but I wanted to add some more content to it especially for the dev.to community. If you don't want to follow along and just want to try it out, I've put my code on Glitch and set up a Nexmo application with a few Nexmo phone numbers for different countries. You can call +442038973497 or +19373652539 and play with the interactive voice response. If you want me to set up a number for your country, just tell me on Twitter; I'll provision it and update the post here for others to use.

We're going to build an interactive voice response menu, going through everything you need to know to set up a Node.js application that can receive inbound calls and capture user input entered via the keypad.

By following this tutorial you will end up with a simple application that can be extended to include more complex, interactive elements and give you a head start building interactive menus for your callers.

The code for this tutorial can be found on GitHub.

Prerequisites

  • A Nexmo account
  • Node.js installed on your machine
  • ngrok in order to make the code on our local machine accessible to the outside world
  • The Nexmo CLI: npm install -g nexmo-cli

Setup

When Nexmo receives a call on a number you have rented, it makes an HTTP request to a URL that you specify (a 'webhook'). The request contains all of the information needed to receive and respond to the call. This URL is commonly called the answer URL.

Nexmo sends all the information about the call progress to a webhook URL you'll specify when you create a Nexmo Application, called the event URL.

When a user presses a number on their keypad, you can collect it via DTMF (Dual-Tone Multi-Frequency). Whenever DTMF input is collected from the user, it is sent to a different webhook URL in your app, which you'll also have to specify.

So let's start writing this webhook server already! I'll use express as a web application framework, so I need to install it. I'll need to deal with JSON bodies, so I'll install body-parser as well. Run the following command inside the project folder in your terminal:

npm install express body-parser

Next up, in your main folder, create a new file called index.js and add a boilerplate express server, using body-parser, that listens on port 3000. For example:

const app = require('express')()
const bodyParser = require('body-parser')

app.use(bodyParser.json())

app.listen(3000)

Receiving a phone call

I need to create the answer URL. That is where Nexmo is going to make a GET request, and it expects to receive a Nexmo Call Control Object, or NCCO for short. It's nothing really fancy: a JSON array of pre-defined action objects. We'll use the talk action to greet the caller and ask them to press a digit, setting the bargeIn option to true so that the user can enter a digit without waiting for the spoken message to finish. We'll add an input action to the NCCO in order to capture the digit via DTMF. Set the maxDigits property to 1 and the eventUrl to a handler on your server that receives and handles the input. To achieve all this, you can add the following code to your index.js file:

app.get('/webhooks/answer', (req, res) => {
  const ncco = [{
      action: 'talk',
      bargeIn: true,
      text: 'Hello. Please enter a digit.'
    },
    {
      action: 'input',
      maxDigits: 1,
      eventUrl: [`${req.protocol}://${req.get('host')}/webhooks/dtmf`]
    }
  ]

  res.json(ncco)
})

Handle the user input

Let's add the code to handle incoming DTMF in index.js. Nexmo makes a POST request to our webhook, which we'll expose at /webhooks/dtmf. When we receive the request, we'll respond with another talk action that reads back the digits the caller pressed, taken from the request body:

app.post('/webhooks/dtmf', (req, res) => {
  const ncco = [{
    action: 'talk',
    text: `You pressed ${req.body.dtmf}`
  }]

  res.json(ncco)
})
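
To turn the read-back into an actual menu, one option is to map each digit to a spoken response. What follows is a minimal sketch rather than part of the original tutorial: the menuOptions entries and the buildMenuNcco helper are hypothetical names made up for illustration, and the messages are placeholders.

```javascript
// Hypothetical menu: each key is a digit the caller can press.
const menuOptions = new Map([
  ['1', 'Our opening hours are nine to five, Monday to Friday.'],
  ['2', 'Please hold while we find someone to help you.']
])

// Build the NCCO to return from the /webhooks/dtmf handler.
// eventUrl is the absolute URL of that same handler, so a caller
// who presses an unknown digit gets another attempt.
function buildMenuNcco(dtmf, eventUrl) {
  const text = menuOptions.get(dtmf)

  if (text) {
    return [{ action: 'talk', text }]
  }

  return [
    {
      action: 'talk',
      bargeIn: true,
      text: 'Sorry, that is not a valid option. Please press 1 or 2.'
    },
    { action: 'input', maxDigits: 1, eventUrl: [eventUrl] }
  ]
}
```

Inside the handler, this could be used as res.json(buildMenuNcco(req.body.dtmf, `${req.protocol}://${req.get('host')}/webhooks/dtmf`)).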

Log call events

We'll need to create another POST route in the app to log all the call related events coming from Nexmo. Add the following code to your index.js file:

app.post('/webhooks/events', (req, res) => {
  console.log(req.body)
  res.sendStatus(200)
})
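
Logging the whole body gets noisy on a busy number. As a rough sketch, the events could be condensed to one line each. The formatEvent helper is a hypothetical name, and the status, from, to and uuid fields are assumptions about the event payload that you should check against what your own console.log prints.

```javascript
// Summarise a call event in one line.
// Assumption: the body carries `status`, `from`, `to` and `uuid` fields;
// any field that is missing is shown as "unknown".
function formatEvent(body = {}) {
  const {
    status = 'unknown',
    from = 'unknown',
    to = 'unknown',
    uuid = 'unknown'
  } = body

  return `[${uuid}] ${from} -> ${to}: ${status}`
}
```

In the events handler, console.log(formatEvent(req.body)) would then replace console.log(req.body).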

For reference, your final index.js file should look something like this one.

Now you're set up and ready to run the code. You can do that by entering the following command in your terminal:

node index.js

This will start a server and route any traffic to http://localhost:3000 through to your index.js file.

Expose your app with ngrok

In order to allow Nexmo to make requests to your app, you need to expose the code running on your local machine to the world.

ngrok is our tool of choice for this, and we've provided a great introduction to the tool that you can read to get up to speed if you haven't used it before.

Once you have ngrok installed, run ngrok http 3000 to expose your application to the internet. You’ll need to make a note of the ngrok URL generated as we’ll need to provide it to Nexmo in the next step (it’ll look something like http://e83658ff.ngrok.io). I'll refer to it later as YOUR_NGROK_URL.

Buy a number and create an app

With the server running and available to the world, we now need to get a Nexmo phone number and link this code, that will be running locally, to it.

Let's start by purchasing a number via the Nexmo CLI:

nexmo number:buy --country_code US

You can use a different country code if you want to. Make a note of the number you purchase, as we'll need it for the next step.

We now need to create a Nexmo application, which is a container for all the settings required for your application. We need to specify the answer URL and the event URL so Nexmo can interact with the server we created.

Use the Nexmo CLI to create your application making sure you substitute YOUR_NGROK_URL with your own generated URL that ngrok gave you earlier:

nexmo app:create "IVR Menu" YOUR_NGROK_URL/webhooks/answer YOUR_NGROK_URL/webhooks/events

The response you'll get back will contain a huge private key output and, above that, an application ID. You can ignore the private key as it isn't necessary for handling inbound calls. Make a note of the application ID (which looks like this: aaaaaaaa-bbbb-cccc-dddd-abcdef123456).

We have an application that is connected to the server and a Nexmo phone number, but the phone number isn't connected to the application. So we'll need to link the number we just bought to the application we just created. You can do that using the Nexmo CLI to issue the following command, replacing YOUR_NEXMO_NUMBER and YOUR_APPLICATION_ID:

nexmo link:app YOUR_NEXMO_NUMBER YOUR_APPLICATION_ID

That's everything needed to associate the code above with your Nexmo application and number. You can test it out by dialing the number you purchased and pressing a digit on your keypad!

Conclusion

In about thirty lines of JavaScript, you now have an application that has an interactive voice response menu. How could you expand this from here?

If you want to learn more about what is possible with inbound voice calls, and how you can make them more complex by adding features such as recording audio or connecting callers to your mobile phone, you can learn more about these actions in the NCCO reference.

As always, if you have any questions about this post feel free to DM me on Twitter, I'm @lakatos88. You can also email the Developer Relations team at Nexmo, devrel@nexmo.com, or join the Nexmo community Slack channel, where we’re waiting and ready to help.

Extra content for dev.to

I wanted to take this a step further. I've built silly things in the past, so I figured why not make this interactive voice response a bit silly, while still exploring the concepts of building it further. I wanted to add more digits to the input, and make it reach out to an API for data depending on the input. Since I'm working with numbers here and I've seen Hui Jing using the NumbersAPI, I thought I'd give it a try as well. We'll also make the interactive voice response recursive, so you can keep sending DTMF after every fact and get another one, until you've had enough and hang up the call.

Change the Answer URL

I felt like one digit was going to be limiting for the NumbersAPI and would get boring soon, so let's change the maxDigits property of the input action to the maximum allowed by Nexmo, which is 20. Because we're allowing that many digits, the default timeOut of 3 seconds won't be enough, so let's set timeOut to 10 seconds, which is the maximum Nexmo allows. With that big of a timeout, we should add an alternate submission method as well, so we don't have to wait 10 seconds every time. submitOnHash should do the trick, so the DTMF is submitted either after 10 seconds or as soon as the user presses the # key.

app.get('/webhooks/answer', (req, res) => {
  const ncco = [{
      action: 'talk',
      voiceName: 'Chipmunk',
      bargeIn: true,
      text: '<speak>Hello. Please enter a number between 0 and <prosody rate="fast">99999999999999999999</prosody> followed by the # key.</speak>'
    },
    {
      action: 'input',
      maxDigits: 20,
      timeOut: 10,
      submitOnHash: true,
      eventUrl: [`${req.protocol}://${req.get('host')}/webhooks/dtmf`]
    }
  ]

  res.json(ncco)
})

Because I changed the input so much, I thought I'd change the talk action as well, to reflect the input parameters. I've added a voiceName just for the fun of it, Chipmunk being my favourite. There are a bunch you can use with Nexmo, depending on the language and persona you want, and you can check them all in the documentation. While I tested this, it took too long to speak 99999999999999999999, the biggest 20-digit number, so I needed to convert the text from plain text to SSML, or Speech Synthesis Markup Language. With SSML you can do things like mix multiple languages, control the speed, volume and pitch of synthesised text, and control the pronunciation of words and numbers. Here I'm using it to change the rate of speech for the big number, using the <prosody> tag in SSML.

Change the DTMF webhook

Now that we changed the answer webhook, we can accept a 20 digit number. Let's connect that to the NumbersAPI, get the random fact about that number, and then add it back to the talk action of the NCCO. We'll also add the same input action we used in the answer webhook, so the call keeps going and you can input another number to get more random facts about it.
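
Since both webhooks return the same input action, one way to avoid repeating it is to build it in a small helper. This is only a sketch: buildInputAction is a hypothetical name, and it assumes it is handed the Express request so it can derive the host the same way the handlers in this post do.

```javascript
// Hypothetical helper: returns the input action shared by both webhooks.
// `req` only needs Express's `protocol` property and `get()` method.
function buildInputAction(req) {
  return {
    action: 'input',
    maxDigits: 20,
    timeOut: 10,
    submitOnHash: true,
    eventUrl: [`${req.protocol}://${req.get('host')}/webhooks/dtmf`]
  }
}
```

Each NCCO could then end with buildInputAction(req) instead of the literal object.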

First, we'll need to install an HTTP request library, because I'm not a fan of the default http one in Node.js. Coincidentally, it's called request, so let's install it via npm:

npm install request

We'll make a request to http://numbersapi.com/${number} every time there is a POST request on the /webhooks/dtmf endpoint, where number is going to be the DTMF number from Nexmo. We'll need to sanitize it in case it comes up empty, which happens when the user doesn't input anything before the timeout. I'll default it to 42 instead of 0, because 42 is the meaning of life. We'll append our own message to the one that comes back from the Numbers API, telling the user to input a number again or just hang up the call. Because these come from two different sources, I wanted to make the distinction clearer on the call, so I'm using SSML again instead of plain speech. I'll separate the messages with the <s> (sentence) tag, which adds a pause between the two messages.
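
The sanitizing step can be made a little stricter than a plain fallback. Below is a minimal sketch with two hypothetical helpers, sanitizeDtmf and escapeForSsml. It assumes you want to keep only digit characters before building the Numbers API URL, and to escape XML special characters in the API's reply before placing it inside the <speak> markup.

```javascript
// Keep only digit characters; fall back to 42 when nothing usable is left
// (for example when the caller lets the input time out).
function sanitizeDtmf(dtmf) {
  const raw = dtmf == null ? '' : String(dtmf)
  const digits = raw.replace(/\D/g, '')
  return digits || '42'
}

// Escape characters that have a special meaning in XML, so text coming
// from an external API cannot break the surrounding SSML.
function escapeForSsml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}
```

In the handler, sanitizeDtmf(req.body.dtmf) would take the place of req.body.dtmf || 42, and escapeForSsml(body) would be applied before the message goes into the talk action.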

// Put this require at the top of index.js, next to the others
const request = require('request')

app.post('/webhooks/dtmf', (req, res) => {
  let number = req.body.dtmf || 42;
  let message = "";

  request(`http://numbersapi.com/${number}`, (error, response, body) => {
    if (error) {
      message = "The Numbers API has thrown an error."
    } else {
      message = body
    }

    const ncco = [{
        action: 'talk',
        bargeIn: true,
        voiceName: 'Chipmunk',
        text: `<speak><s>${message}</s> <s>Enter another number if you want to continue or just hang up the call if you've had enough.</s></speak>`
      },
      {
        action: 'input',
        maxDigits: 20,
        timeOut: 10,
        submitOnHash: true,
        eventUrl: [`${req.protocol}://${req.get('host')}/webhooks/dtmf`]
      }
    ]

    res.json(ncco)
  })
})

Try it out

For reference, your final index.js file should look something like this one. If you've followed along this far, you'll need to restart your server by running node index.js again in your terminal, and you're good to go. Call your Nexmo number and start interacting with your keypad.
