Petr Janik

Posted on Jul 28, 2022

RASA - Giving voice to web chatbot

#rasa #chatbot #nlp #ai

In this article, we will add text-to-speech to the web chat application created in the previous post. Nothing here is specific to Rasa, but you will learn how to improve the chatbot's integration into a website.

We will add a button that toggles text-to-speech on and off. When it is on, the browser reads every message received from the chatbot aloud.

HTML

We'll start by adding the button to the HTML.
At the beginning of the <body>, we define two symbols – one for volume on and the other for volume off.



...
<body>
  <svg aria-hidden="true" style="position: absolute; width: 0; height: 0; overflow: hidden;" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
    <defs>
      <symbol id="icon-volume_on" viewBox="0 0 24 24">
        <path d="M14.016 3.234q3.047 0.656 5.016 3.117t1.969 5.648-1.969 5.648-5.016 3.117v-2.063q2.203-0.656 3.586-2.484t1.383-4.219-1.383-4.219-3.586-2.484v-2.063zM16.5 12q0 2.813-2.484 4.031v-8.063q1.031 0.516 1.758 1.688t0.727 2.344zM3 9h3.984l5.016-5.016v16.031l-5.016-5.016h-3.984v-6z"></path>
      </symbol>
      <symbol id="icon-volume_off" viewBox="0 0 24 24">
        <path d="M12 3.984v4.219l-2.109-2.109zM4.266 3l16.734 16.734-1.266 1.266-2.063-2.063q-1.547 1.313-3.656 1.828v-2.063q1.172-0.328 2.25-1.172l-4.266-4.266v6.75l-5.016-5.016h-3.984v-6h4.734l-4.734-4.734zM18.984 12q0-2.391-1.383-4.219t-3.586-2.484v-2.063q3.047 0.656 5.016 3.117t1.969 5.648q0 2.203-1.031 4.172l-1.5-1.547q0.516-1.266 0.516-2.625zM16.5 12q0 0.422-0.047 0.609l-2.438-2.438v-2.203q1.031 0.516 1.758 1.688t0.727 2.344z"></path>
      </symbol>
    </defs>
  </svg>
  <header class="header">
    <p class="title">Chat with Rasa chatbot</p>
  </header>
...

Objects created inside a <defs> element are not rendered directly. To display them, we will reference them with a <use> element:



...
<form id="form">
  <input id="message-input" autocomplete="off" autofocus/>
  <svg id="icon-volume-on" style="display: none" class="button voice-icon">
    <use xlink:href="#icon-volume_on"></use>
  </svg>
  <svg id="icon-volume-off" class="button voice-icon">
    <use xlink:href="#icon-volume_off"></use>
  </svg>
  <button class="button">Send</button>
</form>
...

The <input> and <button> were already there; we only added the two <svg> elements, each of which contains a <use> element.
We want the feature to be disabled by default, so only the volume off icon should be visible. Hence we hide the volume on icon by adding style="display: none" to it.

Styles

Let's make the button nicer by adding this to the <style> tag in the <head>:



.voice-icon {
    fill: currentColor;
    font-size: 22px;
    width: 1em;
}

The form should now show the volume off icon between the message input and the Send button.

JavaScript

The last step is to add the behaviour.
Let's create a toggleVoice function and add it as a click listener to the two icons.



let voiceEnabled = false;
const iconVolumeOn = document.getElementById('icon-volume-on');
const iconVolumeOff = document.getElementById('icon-volume-off');

function toggleVoice() {
    if (voiceEnabled) {
        voiceEnabled = false;
        iconVolumeOn.style.display = 'none';
        iconVolumeOff.style.display = 'block';
    } else {
        if ('speechSynthesis' in window) {
            voiceEnabled = true;
            iconVolumeOn.style.display = 'block';
            iconVolumeOff.style.display = 'none';
        } else {
            alert("Sorry, your browser doesn't support text to speech.");
        }
    }
}

iconVolumeOn.addEventListener('click', toggleVoice);
iconVolumeOff.addEventListener('click', toggleVoice);

Depending on whether the voice is currently enabled, the function shows the appropriate icon, hides the other one, and updates voiceEnabled to match. The voice can only be enabled when the user's browser supports the SpeechSynthesis API (described below); otherwise an alert is shown and the state stays disabled.
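The same decision can also be written without touching the DOM, which makes it easier to reason about and to test. The following is a minimal sketch, not part of the original tutorial: nextVoiceState and iconDisplay are hypothetical helper names, and supported stands for the result of the 'speechSynthesis' in window check.

```javascript
// Hypothetical helpers (not in the original tutorial): they compute the
// next toggle state and the icon visibility without touching the DOM.

// enabled   - current value of voiceEnabled
// supported - result of `'speechSynthesis' in window`
function nextVoiceState(enabled, supported) {
    if (enabled) {
        // Switching the voice off always succeeds.
        return { enabled: false, error: null };
    }
    if (!supported) {
        // Stay disabled and report why.
        return {
            enabled: false,
            error: "Sorry, your browser doesn't support text to speech.",
        };
    }
    return { enabled: true, error: null };
}

// Maps the state to the `style.display` values of the two icons.
function iconDisplay(enabled) {
    return {
        volumeOn: enabled ? 'block' : 'none',
        volumeOff: enabled ? 'none' : 'block',
    };
}
```

Inside toggleVoice, the returned enabled value would be assigned to voiceEnabled, the two style.display values taken from iconDisplay, and alert called only when error is not null.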

Clicking the icon now switches between the two states.

For the text-to-speech itself, we'll use the Web Speech API. It lets us create a SpeechSynthesisUtterance instance and assign a text to it.
When this instance is passed to the window.speechSynthesis.speak function, the browser reads the text aloud.
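As a rough illustration of those two steps, here is a small sketch of a standalone helper. It is not the tutorial's code: the name speak is hypothetical, the 'en-US' language is an assumption about the bot's replies, and the synthesizer and utterance constructor are passed in as parameters only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (not in the original tutorial). In a web page the
// two optional parameters default to the Web Speech API globals.
function speak(
    text,
    synth = globalThis.speechSynthesis,
    Utterance = globalThis.SpeechSynthesisUtterance
) {
    if (!synth || !Utterance) {
        return false; // Web Speech API is not available here
    }
    const utterance = new Utterance(text); // the text can be passed to the constructor
    utterance.lang = 'en-US'; // assumption: the bot replies in English
    utterance.rate = 1;       // 1 is the default speaking rate
    synth.speak(utterance);   // queued behind any utterance still being read
    return true;
}
```

In the browser, speak('Hello!') would be enough; the return value tells the caller whether the text was handed to the synthesizer.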

We want every message from the chatbot to be read aloud as soon as it is received, so we add this code at the end of the appendMessage function:



// Read the message aloud only if it came from the chatbot and the voice is on.
if (voiceEnabled && type === "received") {
    const voiceMsg = new SpeechSynthesisUtterance();
    voiceMsg.text = msg;
    window.speechSynthesis.speak(voiceMsg);
}
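One thing this snippet leaves open: speak only queues utterances, so if the user switches the voice off while a long reply is being read, the browser keeps talking until the queue is empty. A minimal sketch of one way to handle that, assuming a hypothetical stopSpeaking helper called from the branch of toggleVoice that disables the voice (the synth parameter exists only so the function can be tried outside a browser):

```javascript
// Hypothetical helper (not in the original tutorial): stops any speech
// that is in progress or still waiting in the queue.
function stopSpeaking(synth = globalThis.speechSynthesis) {
    if (synth && (synth.speaking || synth.pending)) {
        synth.cancel(); // removes all queued utterances and stops the current one
        return true;
    }
    return false; // nothing to stop, or the API is unavailable
}
```

Whether to cut the bot off mid-sentence is a design choice; letting the current message finish is equally defensible.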

That's it!

Run the application

Serve the application on http://localhost:8080 by running npx http-server webchat.
In a new terminal window, run the Rasa server with CORS enabled for all origins: rasa run --cors "*".
In another terminal window, run the actions server: rasa run actions.
Open http://127.0.0.1:8080 and chat with the chatbot!

Repository for this tutorial:

petr7555 / rasa-dev-tutorial

You can check out the state of the repository at the end of this tutorial by running:



git clone --branch 23-text-to-speech git@github.com:petr7555/rasa-dev-tutorial.git
