DEV Community

Cover image for Building GenAI Application for Interactive Audiobooks using Lyzr’s VoiceBot Agent
Prajjwal Sule
Prajjwal Sule

Posted on

Building GenAI Application for Interactive Audiobooks using Lyzr’s VoiceBot Agent

In today's fast-paced digital world, storytelling has evolved beyond traditional mediums. With the emergence of innovative technologies, the way we experience narratives is undergoing a transformative shift. In this article, we'll explore how Lyzr's Interactive Audiobook is revolutionizing storytelling by seamlessly blending cutting-edge technology with the timeless art of storytelling.

Interactive Audiobooks with Lyzr’s VoiceBot Agent

Interactive storytelling represents a paradigm shift in the way we engage with narratives. Gone are the days of passive consumption; instead, audiences crave immersive experiences that allow them to actively participate in the story's evolution. VoiceBot, a versatile agent by Lyzr, utilizes OpenAI's powerful APIs to perform text-to-speech conversion, audio transcription, and text summarization into structured notes.

Lyzr’s Approach to Application Development

Lyzr offers an agent-centric approach to rapidly developing LLM (Large Language Model) applications with minimal code and time investment. Even if you're unfamiliar with the GenAI stack, Lyzr empowers you to build your AI applications effortlessly. It is the go-to solution for constructing GenAI apps without requiring an in-depth understanding of Generative AI.


Lyzr Open Source SDKs 🚀 | Lyzr Documentation

Welcome to the Lyzr Open Source Software Development Kit (SDK)!

favicon docs.lyzr.ai

Setting up the Project

Clone the "Interactive Audiobook" app repository.
Set up a virtual environment

python3 -m venv venv
source venv/bin/activate
Enter fullscreen mode Exit fullscreen mode

Create an environment file named .env and add your OpenAI API key

OPENAI_API_KEY = “Your_OpenAI_API_Key”
Enter fullscreen mode Exit fullscreen mode

Install the required dependencies

pip install lyzr streamlit
Enter fullscreen mode Exit fullscreen mode

Project Structure

The project includes directories for utilities, the main application file, a README, environment variables, and a Git ignore file.

Interactive-Audiobook

├── utils/
   ├── __init__.py
   └── utils.py


├── app.py

├── README.md

├── .env

├── .gitignore

└── requirements.txt

Enter fullscreen mode Exit fullscreen mode

Utility Functions

The utils.py file contains essential utility functions for the application, such as get_files_in_directory to retrieve file paths within a directory and prompt to generate specific prompts for creating children's stories.

import os
from dotenv import load_dotenv; load_dotenv()
from openai import OpenAI



def get_files_in_directory(directory):
    # This function help us to get the file path along with filename.
    files_list = []

    if os.path.exists(directory) and os.path.isdir(directory):
        for filename in os.listdir(directory):
            file_path = os.path.join(directory, filename)

            if os.path.isfile(file_path):
                files_list.append(file_path)

    return files_list


def prompt(user_input):
    prompt = f"""You are an expert to create kid's stories, create a complete story on this {user_input}. 
    Make sure story obeys these points: 
     1. Story should be short and precise.
     2. Story will cover from introduction to climax in 500-700 words. 
     3. Story will proivde valuable learning's for children's.
    """

    return prompt


def story_generator(prompt):
    API_KEY = os.getenv('OPENAI_API_KEY')
    ai = OpenAI(api_key=API_KEY)

    response = ai.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        temperature=0.1,
        max_tokens=1000)

    story = response.choices[0].text.strip()
    return story


Enter fullscreen mode Exit fullscreen mode

The story_generator function you provided is responsible for generating a story based on a given prompt using the OpenAI GPT-3.5 model.

Creating an Entry Point for the Application ‘app.py’

The entry point script initializes the application, sets up the environment, and defines the location of the OpenAI API key.

import os
from PIL import Image
from utils import utils
from pathlib import Path
import streamlit as st
from dotenv import load_dotenv; load_dotenv()
from lyzr import VoiceBot


# Interactive Audiobook Application


audio_directory = 'audio'
os.makedirs(audio_directory, exist_ok=True)
original_directory = os.getcwd()

# replace this with your openai api key or create an environment variable for storing the key.
API_KEY = os.getenv('OPENAI_API_KEY')

Enter fullscreen mode Exit fullscreen mode

Interactive Audiobook by VoiceBot Agent

This core component generates interactive audiobooks. It takes a user story as input, converts it into an audio file using Lyzr's VoiceBot Agent, and stores the audio file in the designated directory.

def audiobook_agent(user_story:str):
    vb = VoiceBot(api_key=API_KEY)
    try:
        os.chdir(audio_directory)
        vb.text_to_speech(user_story)
    finally:
        os.chdir(original_directory)

Enter fullscreen mode Exit fullscreen mode

Implementing an Entry Point for the Application

The main execution point prompts the user to input a brief about the story, generates a story based on the prompt, displays a shortened version of the story, converts it into an audiobook, and presents it as an audio player. If the user fails to provide input, a warning message prompts them to do so.

if __name__ == "__main__":
    topic = st.text_input('Write breif about the story')
    if st.button('Create'):
        if topic:
            prompt = utils.prompt(user_input=topic)
            story = utils.story_generator(prompt=prompt)
            st.subheader('Glimpse of Story')
            shorten_story = story[:450]
            st.write(shorten_story)
            st.markdown('---')
            st.subheader('Story into audiobook')
            audiobook_agent(user_story=story)
            files = utils.get_files_in_directory(audio_directory)
            audio_file = files[0]
            st.audio(audio_file)            
        else:
            st.warning("Provide the content for story, don't keep it blank")

Enter fullscreen mode Exit fullscreen mode

Interactive Audiobooks powered by Lyzr’s VoiceBot Agent offer an immersive and engaging storytelling experience. By leveraging advanced technologies, this application enables users to create dynamic narratives and captivate audiences in new and innovative ways.

References

For further exploration and engagement, visit Lyzr’s website, book a demo, or join the community channels on Discord and Slack.

This article provides a comprehensive guide for building interactive audiobooks using Lyzr’s VoiceBot Agent, highlighting its role in transforming traditional storytelling into immersive experiences.

Top comments (0)