DEV Community

Aleksander Obuchowski

Posted on • Originally published at chatbotslife.com

How to make your first chat bot in 50 lines of code — theory and practice

1. How do chat bots work?

Most modern chat bot platforms consist of three main components: intent recognition, slot filling and a dialog graph.

1.1 Intent recognition

Intent recognition is a text classification task whose goal is to
capture the specific intent behind a user query. Users tend to formulate
the same request in many different ways, so we need a system that can tell
whether two messages relate to the same thing or not. Let’s illustrate this
with an example of a bank chat bot, where users can ask the bot to withdraw
money:


Visualization of intent detection system

You can see that although the requests are formulated in many different ways
and styles, they all mean basically the same thing and the chat bot
should react in the same way. Therefore we need a **text classification** model
that captures the semantics behind user sentences and assigns them to a
specific predefined class.

1.2 Slot filling

Once we know what action the user wants to take, we need to capture the specific
parameters of that action. For example, if you want Alexa to play your
favorite song, you want her to play this specific song, not just any song. So
besides detecting the intent, chat bots also need to
perform a task called slot filling.
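
As a rough illustration of slot filling, the sketch below pulls a song title out of a "play ..." request with a regular expression. The pattern, the `fill_slots` helper and the slot name `song` are hypothetical and exist only for this example; real systems usually rely on trained sequence-labelling models rather than hand-written patterns.

```python
import re

# Hypothetical pattern for a "play song" intent:
# everything after the word "play" is treated as the song slot.
PLAY_PATTERN = re.compile(r"\bplay\s+(?P<song>.+)", re.IGNORECASE)

def fill_slots(message):
    """Return a dict of slots found in the message (empty if nothing matches)."""
    match = PLAY_PATTERN.search(message)
    if match is None:
        return {}
    # Strip trailing punctuation and whitespace from the captured title.
    return {"song": match.group("song").strip(" .!?")}

print(fill_slots("Alexa, play Bohemian Rhapsody!"))  # {'song': 'Bohemian Rhapsody'}
print(fill_slots("What time is it?"))                # {}
```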

1.3 Dialog graph

Visualization of dialog graph

Another requirement for chat bot functionality is a dialog graph. Its goal is to steer the conversation in the right direction. For example, when you say “Check the weather” the chat bot could then ask “What day should I check the weather for?” and next it will be looking for intents like ‘tomorrow’ or ‘today’. The important part here is that there would be no point in asking the second question without the first one, so we need a system that stores where we are in the conversation and what the possible next states are.
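
One simple way to picture a dialog graph is as a dictionary of states, where each state lists the intents it accepts and where they lead. The sketch below follows the weather example; the state names, intent names and the `step` helper are made up for illustration and are not part of the bot built later in this tutorial.

```python
# Hypothetical dialog graph: each state maps an intent to (next_state, reply).
DIALOG_GRAPH = {
    "start": {
        "check_weather": ("ask_day", "What day should I check the weather for?"),
    },
    "ask_day": {
        "today": ("start", "Checking the weather for today."),
        "tomorrow": ("start", "Checking the weather for tomorrow."),
    },
}

def step(state, intent):
    """Follow the edge for `intent` from `state`; stay in place if there is none."""
    transitions = DIALOG_GRAPH[state]
    if intent not in transitions:
        return state, "Sorry, I did not understand that."
    return transitions[intent]

state = "start"
# "tomorrow" only makes sense after the bot has asked for the day.
for intent in ["tomorrow", "check_weather", "tomorrow"]:
    state, reply = step(state, intent)
    print(f"{intent!r} -> {reply} (now in state {state!r})")
```

The first "tomorrow" is rejected because the graph is still in the `start` state; the same intent is accepted once the bot has moved to `ask_day`.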

1.4 Our chat bot

In this tutorial our goal is to create a simple chat bot, so we are going to focus only on the intent detection task and a simple dialog graph model. This is enough to make a chat bot that is able to answer FAQs and conduct a simple conversation.

2. Word and sentence embedding

Our goal in designing an intent detection system is to create a system that, given a few examples for each intent, can detect that a sentence given by the user is similar to these examples and therefore should have the same intent.

The underlying problem is that we need a way to check whether two sentences are similar. This could be achieved by e.g. counting how many words the new sentence shares with the sentences in the training dataset. This is, however, a naive approach, because a user can use a word that has a similar meaning but is different from the ones in the training examples.
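
The weakness of word counting is easy to see in a few lines. The sketch below is a minimal version of the overlap idea; the `overlap_score` helper and the example sentences are invented for illustration.

```python
def overlap_score(sentence_a, sentence_b):
    """Count the distinct lower-cased words the two sentences share."""
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    return len(words_a & words_b)

training_example = "i want to withdraw money"

# Almost the same wording: a high score.
print(overlap_score("i want to withdraw cash", training_example))  # 4
# Same intent, different words: the score drops to zero.
print(overlap_score("give me some cash", training_example))        # 0
```

Both messages ask for the same thing, but the second one shares no words with the training example, so a purely word-based comparison misses it.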

2.1 Word embedding

A solution here is to use word embedding.


Word vectors ref: https://ruder.io/word-embeddings-1/

Word embeddings are mathematical representations of words encoded as vectors in n-dimensional space. Similar words (those used in the same contexts) are close to each other in this space. This means that we can compare two or more words to each other not by e.g. the number of overlapping characters but by how close they are to each other in their embedded form.

2.2 Sentence embeddings

From word embeddings we can construct embeddings for the whole sentence. This can be done in a variety of ways: we can simply take the average of the word vectors, use a weighted average that reflects how important each word is (e.g. weighting by the tf-idf coefficient), or even use more advanced methods like transformer neural networks.
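
The first two options can be sketched in plain Python. The three-dimensional word vectors and the weights below are made up for illustration (real embeddings have hundreds of dimensions, and real tf-idf weights are computed from a corpus), and `weighted_average` is a hypothetical helper, not a flair function.

```python
# Toy 3-dimensional word vectors, made up for illustration only.
WORD_VECTORS = {
    "withdraw": [0.9, 0.1, 0.0],
    "some":     [0.1, 0.1, 0.1],
    "money":    [0.8, 0.0, 0.2],
}

def weighted_average(words, weights=None):
    """Average the vectors of `words`; with no weights this is plain mean pooling."""
    if weights is None:
        weights = {}
    dims = len(next(iter(WORD_VECTORS.values())))
    total = [0.0] * dims
    weight_sum = 0.0
    for word in words:
        if word not in WORD_VECTORS:
            continue  # skip out-of-vocabulary words
        w = weights.get(word, 1.0)
        for i, value in enumerate(WORD_VECTORS[word]):
            total[i] += w * value
        weight_sum += w
    if weight_sum == 0:
        return total  # no known words: return the zero vector
    return [value / weight_sum for value in total]

sentence = ["withdraw", "some", "money"]
mean_vector = weighted_average(sentence)
# Hypothetical tf-idf-style weights: the filler word "some" counts for less.
tfidf_vector = weighted_average(sentence, {"withdraw": 2.0, "some": 0.5, "money": 2.0})
print(mean_vector)
print(tfidf_vector)
```

With the weights applied, the sentence vector moves closer to the vectors of the informative words "withdraw" and "money".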

2.3 Similarity

Once we have prepared embeddings for the sentences, we have to design a way of comparing them. A simple and widely used method is cosine similarity, which measures the similarity between two vectors as the cosine of the angle between them: 1 for vectors pointing in the same direction and 0 for perpendicular ones.


Cosine similarity ref: https://bit.ly/2X5470I
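
For reference, cosine similarity is the dot product of the two vectors divided by the product of their lengths. A minimal sketch in plain Python, assuming non-zero vectors of equal length (the `cosine_similarity` helper is written here for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (both must be non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0, same direction
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # 0.0, perpendicular
```

Note that `scipy.spatial.distance.cosine`, which the bot's code uses later in this tutorial, returns the cosine distance, i.e. one minus this value, so there a smaller number means a closer match.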

3. Building the chat bot

To create the sentence embeddings we are going to use the flair library. Besides static word embeddings, flair also offers embeddings that analyse words character by character, which helps in dealing with out-of-vocabulary words; in this tutorial we use its static English word embeddings.

In our model we are going to embed the examples for each intent and then, while processing the user's message, find the most similar one. We take this approach mainly because it is fast and simple and illustrates how embeddings work. Most modern systems use neural networks (links to related articles can be found at the end); however, this approach can still be used if you want a system that is fast and doesn’t use a lot of resources.

We begin our program by creating the outline of the model.

outline of the program

import json
import os
import random
import time
from tqdm import tqdm
import pickle
from scipy.spatial.distance import cosine
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, DocumentPoolEmbeddings

embeddings = DocumentPoolEmbeddings([WordEmbeddings('en')], pooling='mean')

class chatbot:
    @staticmethod
    def prepare_embeddings(input_file, output_file):
        pass

    @staticmethod
    def answer(message, embeddings_file, anwsers_file):
        pass

Description

1–9: importing the necessary libraries
11: initialization of the flair model for creating sentence embeddings. We are using English word embeddings and the mean pooling method for creating sentence embeddings from word embeddings.
13–20: chatbot class. This class has two static methods: one for creating the embeddings and one for processing the user's message and answering it.

3.1 Preparing embeddings

First we need to prepare a file containing our intents and their examples. This is a JSON dictionary that uses intents as keys and lists of examples as values.

intents.json:

{
    "hello": [
        "Hi",
        "Hello",
        "Welcome",
        "Good morning"
    ],
    "bye": [
        "Bye",
        "Later"
    ],
    "whatsup": [
        "How are you?",
        "What's up?"
    ],
    "about": [
        "Tell me about yourself",
        "Who are you?"
    ]
}

Next we need to create a function that constructs embeddings for the
examples.

    @staticmethod
    def prepare_embeddings(input_file, output_file):
        global embeddings
        embedded_intent_dict = {}
        with open(input_file) as file:
            intent_dict = json.load(file)
        for intent, examples in tqdm(intent_dict.items()):
            embedded_intent_dict[intent] = []
            for example in examples:
                sentence = Sentence(example)
                embeddings.embed(sentence)
                embedded_intent_dict[intent].append(sentence.embedding.detach().cpu().numpy())
        # "or '.'" covers an output_file given without a directory part
        os.makedirs(os.path.dirname(output_file) or ".", exist_ok=True)
        with open(output_file, "wb") as file:
            pickle.dump(embedded_intent_dict, file)

Description:

4: Creating a new Python dictionary for the embeddings
5–6: Opening the input file and loading it into a Python dictionary
7–8: For each intent we create a list in the embeddings dictionary
9–12: For each example in the intent, we create a flair Sentence object and embed it using the model specified earlier. Finally we add the embedded sentence to the list
13–14: If the output directory doesn’t exist, we create it
15–16: We save the embedded dict. We use pickle instead of json because json cannot store numpy arrays

3.2 Answering the message

We also need a file with the possible answers for each intent.

answers.json:

{
    "hello": [
        "Hello, what can I help you with?",
        "Hi what can I do for you today?"
    ],
    "bye": [
        "See you later",
        "See you next time"
    ],
    "whatsup": [
        "I feel happy answering your questions"
    ],
    "about": [
        "I am bot created in 50 lines of code"
    ]
}

Now we can implement the method that answers the user's message:

    @staticmethod
    def answer(message, embeddings_file, anwsers_file):
        global embeddings
        with open(embeddings_file, 'rb') as file:
            embedded_dict = pickle.load(file)
        message_sentence = Sentence(message)
        embeddings.embed(message_sentence)
        message_vector = message_sentence.embedding.detach().cpu().numpy()
        best_intent = ""
        best_score = 1  # cosine distance: 0 = same direction, lower is better
        for intent, examples in embedded_dict.items():
            for example in examples:
                score = cosine(message_vector, example)
                if score < best_score:
                    best_score = score
                    best_intent = intent
        with open(anwsers_file) as file:
            anwsers_dict = json.load(file)
        if best_intent in anwsers_dict:
            return random.choice(anwsers_dict[best_intent])
        else:
            return "Error intent not in dict"

Description

3: We use the embeddings model
4–5: We load the embeddings file created earlier
6–8: Embedding of the user's message
9–10: Initializing the best intent and best score variables. Because the best score starts at 1, only examples with a positive cosine similarity to the message can be chosen
11–16: For each intent we loop through its embedded examples and compute the cosine distance between the user's message and each example. scipy's cosine function returns a distance (1 minus the cosine similarity), so we choose the intent whose example has the lowest distance, i.e. the highest similarity to the new message
17–18: Loading the answers dict
19: Checking if the intent chosen by the system is in the answers dict
20: Return a random answer from the ones assigned to the chosen intent
21–22: Otherwise we return an error message

Whole code

import json
import os
import random
import time
from tqdm import tqdm
import pickle
from scipy.spatial.distance import cosine
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, DocumentPoolEmbeddings

embeddings = DocumentPoolEmbeddings([WordEmbeddings('en')], pooling='mean')

class chatbot:
    @staticmethod
    def prepare_embeddings(input_file, output_file):
        global embeddings
        embedded_intent_dict = {}
        with open(input_file) as file:
            intent_dict = json.load(file)
        for intent, examples in tqdm(intent_dict.items()):
            embedded_intent_dict[intent] = []
            for example in examples:
                sentence = Sentence(example)
                embeddings.embed(sentence)
                embedded_intent_dict[intent].append(sentence.embedding.detach().cpu().numpy())
        # "or '.'" covers an output_file given without a directory part
        os.makedirs(os.path.dirname(output_file) or ".", exist_ok=True)
        with open(output_file, "wb") as file:
            pickle.dump(embedded_intent_dict, file)

    @staticmethod
    def answer(message, embeddings_file, anwsers_file):
        global embeddings
        with open(embeddings_file, 'rb') as file:
            embedded_dict = pickle.load(file)
        message_sentence = Sentence(message)
        embeddings.embed(message_sentence)
        message_vector = message_sentence.embedding.detach().cpu().numpy()
        best_intent = ""
        best_score = 1  # cosine distance: 0 = same direction, lower is better
        for intent, examples in embedded_dict.items():
            for example in examples:
                score = cosine(message_vector, example)
                if score < best_score:
                    best_score = score
                    best_intent = intent
        with open(anwsers_file) as file:
            anwsers_dict = json.load(file)
        if best_intent in anwsers_dict:
            return random.choice(anwsers_dict[best_intent])
        else:
            return "Error intent not in dict"

if __name__ == "__main__":
    # prepare_embeddings must have been run once beforehand to create the .pkl file
    while True:
        input_message = input("Message: ")
        reply = chatbot.answer(
            input_message,
            embeddings_file='embedded_intents/test1.pkl',
            anwsers_file='answers/test1.json',
        )
        print(f"Bot:{reply}")

4. Possible improvements

In this form the chat bot has to choose one of the intents provided. This means we have no way of detecting that the user said something that doesn't belong to any of the intents. A possible solution is to look at the numerical values of the cosine similarity and, based on those observations, set a threshold value below which the bot classifies the message as one it doesn’t know how to answer.
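
A minimal sketch of such a check is shown below. The bot's code compares vectors with scipy's `cosine`, which is a distance (1 minus the similarity), so the test is flipped: the message is rejected when even the best distance is above the threshold. The `pick_intent` helper and the threshold value of 0.4 are assumptions for illustration; a sensible threshold has to be found by looking at the scores of real messages.

```python
def pick_intent(distances, threshold=0.4):
    """Return the closest intent, or None if even the best match is too far away.

    `distances` maps each intent to its smallest cosine distance from the message.
    The default threshold is an arbitrary placeholder, not a tuned value.
    """
    if not distances:
        return None
    best_intent = min(distances, key=distances.get)
    if distances[best_intent] > threshold:
        return None
    return best_intent

print(pick_intent({"hello": 0.12, "bye": 0.55}))  # hello
print(pick_intent({"hello": 0.71, "bye": 0.64}))  # None
```

When `pick_intent` returns `None`, the bot can reply with a fallback message instead of guessing an intent.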
