
Rubyroid Labs

How to build an AI chatbot with Ruby on Rails and ChatGPT

Introduction

In today's fast-paced world, businesses are confronted with the daunting task of delivering accurate and prompt responses to user inquiries. Whether it's assisting customers, sharing technical documentation, or simply exchanging knowledge, the need for a dependable and efficient system to address user questions has become absolutely vital. And that's where the incredible power of an AI chatbot, fueled by a specialized knowledge base, comes into play.

With a track record of over 300 software development projects, including several that involved OpenAI integration, Rubyroid Labs has been a trusted name since 2013. If you're seeking a reliable partner to seamlessly integrate ChatGPT into your Ruby on Rails application, contact us today.

An AI chatbot that can answer questions based on a specific knowledge base is a valuable asset for organizations looking to automate customer interactions and improve overall user experiences. Unlike more general-purpose chatbots, these knowledge-based chatbots are designed to provide precise and contextually relevant responses by leveraging a curated knowledge base of information.

The beauty of this approach lies in the ability to tailor the chatbot's responses to a specific domain or topic. By creating a knowledge base that encompasses relevant information about products, services, policies, or any other subject, the chatbot becomes an invaluable resource for users seeking specific information.

Use-cases for such knowledge-based chatbots are plentiful. For instance, an e-commerce company can build a chatbot that assists customers with product inquiries, availability, and shipping details. Similarly, educational institutions can employ chatbots to answer frequently asked questions about courses, admissions, and campus facilities. In addition, there are many other cases, some of which are listed in our other blog post.

In our case, we were asked to develop a chatbot for legal consulting. It therefore had to base its answers only on the provided knowledge base, and the answers had to be very specific. The knowledge base consists of 1 billion words. We faced many challenges, and in this article we will show you how we solved them.

Throughout this article, we'll guide you through the process of setting up a Ruby on Rails project, integrating ChatGPT, and building the functionality to retrieve and use the knowledge base to answer user questions. By the end, you'll have the skills to develop your own knowledge-based chatbot, tailored to your organization's domain or topic, that gives users precise and relevant answers based on the knowledge base you provide.

For the sake of this example, we are going to embed information from the RubyroidLabs website, so the AI chatbot can answer questions about the company.

Here is what we are going to build:

(screenshot: the finished chat answering questions about Rubyroid Labs)

Let’s get to this solution step by step.

Setup

Initialize Ruby on Rails project with PostgreSQL

Check environment

ruby --version # ruby 3.2.2
rails --version # Rails 7.0.5

Initialize Rails project (docs)

rails new my_gpt --database=postgresql --css=tailwind
cd my_gpt

Setup database

The best way to install PostgreSQL on macOS is not to install it at all. Instead, just run a Docker container with the required PostgreSQL version. We will use the ankane/pgvector image, which comes with the pgvector extension preinstalled.

docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres --name my_gpt_postgres ankane/pgvector

Add this to the default or development section of config/database.yml:

default: &default
  host: localhost
  username: postgres
  password: postgres

Then initialize the database structure:

rake db:create
rake db:migrate

Run the app

./bin/dev

Setup PGVector

We will use the neighbor gem to work with pgvector. If you run PostgreSQL with Docker as described above, there is no need to install and build the pgvector extension, so you can move straight on to this:

bundle add neighbor
rails generate neighbor:vector
rake db:migrate

Setup OpenAI

To make OpenAI API calls, we will use the ruby-openai gem.

bundle add ruby-openai

Create a config/initializers/openai.rb file with the following content:

OpenAI.configure do |config|
  config.access_token = Rails.application.credentials.openai.access_token
  config.organization_id = Rails.application.credentials.openai.organization_id
end

Add your OpenAI API key and organization ID to the credentials. You can find them in your OpenAI account.

rails credentials:edit

Then add:

openai:
  access_token: xxxxx
  organization_id: org-xxxxx

Build a simple chat with Hotwired

Create Questions controller app/controllers/questions_controller.rb:

class QuestionsController < ApplicationController
  def index
  end

  def create
    @answer = "I don't know."
  end

  private

  def question
    params[:question][:question]
  end
end

Add routes to config/routes.rb:

resources :questions, only: [:index, :create]

Create chat layout in app/views/questions/index.html.erb:

<div class="w-full">
  <div class="h-48 w-full rounded mb-5 p-3 bg-gray-100">
    <%= turbo_frame_tag "answer" %>
  </div>

  <%= turbo_frame_tag "new_question", target: "_top" do %>
    <%= form_tag questions_path, class: 'w-full' do %>
      <input type="text"
             class="w-full rounded"
             name="question[question]"
             placeholder="Type your question">
    <% end %>
  <% end %>
</div>

Display the answer with a Turbo Stream. Create the file app/views/questions/create.turbo_stream.erb and fill it with:

<%= turbo_stream.update('answer', @answer) %>

Done 🎉 Open http://localhost:3000/questions and check it out.


Prototype

Chat API

Let’s start with the simplest and most obvious implementation: provide all our data to ChatGPT and ask it to base its answer only on the provided data. The trick here is the instruction “say ‘I don't know’ if the question can't be answered based on the context.”
So let’s copy all the data from the services page and attach it as the context.

context = <<~LONGTEXT
  RubyroidLabs custom software development services. We can build a website, web application, or mobile app for you using Ruby on Rails. We can also check your application for bugs, errors and inefficiencies as part of our custom software development services.

  Services:
  * Ruby on Rails development. Use our Ruby on Rails developers in your project or hire us to review and refactor your code.
  * CRM development. We have developed over 20 CRMs for real estate, automotive, energy and travel companies.
  * Mobile development. We can build a mobile app for you that works fast, looks great, complies with regulations and drives your business.
  * Dedicated developers. Rubyroid Labs can boost your team with dedicated developers mature in Ruby on Rails and React Native, UX/UI designers, and QA engineers.
  * UX/UI design. Rubyroid Labs can create an interface that will engage your users and help them get the most out of your application.
  * Real estate development. Rubyroid Labs delivers complex real estate software development services. Our team can create a website, web application and mobile app for you.
  * Technology consulting. Slash your tech-related expenses by 20% with our help. We will review your digital infrastructure and audit your code, showing you how to optimize it.
LONGTEXT

The message to ChatGPT is composed like this:

message_content = <<~CONTENT
  Answer the question based on the context below, and
  if the question can't be answered based on the context,
  say \\"I don't know\\".

  Context:
  #{context}

  ---

  Question: #{question}
CONTENT

Then make an API request to ChatGPT:

openai_client = OpenAI::Client.new
response = openai_client.chat(parameters: {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: message_content }],
  temperature: 0.5,
})
@answer = response.dig("choices", 0, "message", "content")




Deal-breaker

The thing is that both the Chat API and the Completions API have token limits.


For gpt-3.5-turbo, it’s 4,096 tokens by default. Let’s measure how many tokens our data consists of with the OpenAI Tokenizer:


It’s only 276 tokens, not a lot. However, that’s just one page; in total, we have 300K tokens of data.
What if we switch to gpt-4-32k? It can process up to 32,768 tokens! Let’s assume that’s enough for our purposes. What would one request cost? GPT-4 with 32K context costs $0.06 per 1K tokens, so a request that fills the context costs about 32.8 × $0.06 ≈ $2.
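If you prefer measuring tokens in code rather than in the web tokenizer, here is a minimal sketch. It assumes the tiktoken_ruby gem, an extra dependency not otherwise used in this project, and repeats the pricing arithmetic above:

require 'tiktoken_ruby' # gem 'tiktoken_ruby' in the Gemfile (our assumption, not part of this project)

text = File.read('services_page.txt') # the same services-page text as above

# Count tokens the way OpenAI models do
encoding = Tiktoken.encoding_for_model('gpt-3.5-turbo')
puts encoding.encode(text).length # => 276 for our services page

# One request that fills the gpt-4-32k context, at $0.06 per 1K tokens:
puts 32_768 / 1000.0 * 0.06 # => 1.96608, i.e. $2+ per request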


This is where embeddings come into play.

Embeddings

Data Chunks

To fit within the limits and avoid spending the whole budget on 32K-token requests, let’s provide ChatGPT with only the most relevant data. To do so, let’s split all the data into small chunks and store them in the PostgreSQL database:

(diagram: the data split into small chunks stored as database rows)
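How you split the data is up to you. Below is a minimal sketch of one possible approach; the split_into_chunks helper is hypothetical and not part of the final code. It breaks a page’s text on blank lines and caps each chunk at a rough word count:

# Hypothetical helper: split text into chunks of at most ~200 words,
# breaking on blank lines so paragraphs stay intact.
def split_into_chunks(text, max_words: 200)
  chunks = [[]]
  text.split(/\n{2,}/).each do |paragraph|
    words = paragraph.split
    chunks << [] if chunks.last.any? && chunks.last.size + words.size > max_words
    chunks.last.concat(words)
  end
  chunks.map { |chunk_words| chunk_words.join(' ') }
end

split_into_chunks(File.read('services_page.txt')).each do |chunk|
  # calculate an embedding for each chunk and save both to PostgreSQL,
  # as shown in the rake task below
end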

Now, based on the user’s question, we need to find the most relevant chunk in our database. This is where the Embeddings API can help us: it takes a text and returns a vector (an array of 1,536 numbers).


Thus, we generate a vector for each chunk via the Embeddings API and save it to the DB.

response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'Rubyroid Labs has been on the web and mobile...'
  }
)

response.dig('data', 0, 'embedding') # [0.0039921924, -0.01736092, -0.015491072, ...]

This is how our database looks now:

(screenshot: the items table with page_name, text and embedding columns)

Code

rails g model Item page_name:string text:text embedding:vector{1536}
rake db:migrate

Migration:

class CreateItems < ActiveRecord::Migration[7.0]
  def change
    create_table :items do |t|
      t.string :page_name
      t.text :text
      t.vector :embedding, limit: 1536

      t.timestamps
    end
  end
end

Model:

class Item < ApplicationRecord
  has_neighbors :embedding
end

Rake task (lib/tasks/index_data.rake):

DATA = [
  ['React Native Development', 'Rubyroid Labs has been on the web and mobile...'],
  ['Dedicated developers', 'Rubyroid Labs can give you a team of dedicated d...'],
  ['Ruby on Rails development', 'Rubyroid Labs is a full-cycle Ruby on Rails...'],
  # ...
]

desc 'Fills the database with data and calculates an embedding for each item.'
task index_data: :environment do
  openai_client = OpenAI::Client.new

  DATA.each do |item|
    page_name, text = item

    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )

    embedding = response.dig('data', 0, 'embedding')

    Item.create!(page_name:, text:, embedding:)

    puts "Data for #{page_name} created!"
  end
end


Run the rake task:

rake index_data

Vector

What is a vector? Simply put, a vector is a tuple, or in other words, an array of numbers, for example [2, 3]. In two-dimensional space, it can represent a point on the coordinate plane:

(figure: a 2D vector on the coordinate plane)

The same applies to spaces with three or more dimensions.


If we had 2D vectors rather than 1,536-dimensional ones, we could plot our chunk embeddings on the coordinate plane like this:

(figure: chunk embeddings plotted as points on the plane)

How to find the most relevant chunks

So, the app receives the following question: “How long has RubyroidLabs been on the mobile software market?”. Let’s calculate its vector as well.

response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'How long has RubyroidLabs been on the mobile software market?'
  }
)

response.dig('data', 0, 'embedding') # [0.009017303, -0.016135506, 0.0013286859, ...]


And plot it on the coordinate plane:

(figure: the question vector plotted among the chunk vectors)

Now we can find the nearest vectors mathematically; no AI is needed for this task. That’s what we set up pgvector for earlier.

nearest_items = Item.nearest_neighbors(
  :embedding, question_embedding,
  distance: "euclidean"
)
context = nearest_items.first.text
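Under the hood, “nearest” here simply means the smallest Euclidean distance between two vectors, which pgvector computes inside the database. For illustration only, the same measure in plain Ruby looks like this:

# Euclidean distance: sqrt((a1 - b1)^2 + ... + (an - bn)^2)
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

euclidean_distance([2, 3], [5, 7]) # => 5.0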

And now, just pass this context to the Chat API as we did previously.

message_content = <<~CONTENT
  Answer the question based on the context below, and
  if the question can't be answered based on the context,
  say \\"I don't know\\".

  Context:
  #{context}

  ---

  Question: #{question}
CONTENT

# a call to Chat API



Here it is 🎉

(screenshot: the chat answering a question based on the knowledge base)

Our chat now bases its answers on all the information we provided. Moreover, it spends almost no additional money per question yet provides a better answer. You do have to pay once for calculating embeddings when initializing the database, but for 300K tokens with Ada v2 (at $0.0001 per 1K tokens) that costs just $0.03.


Rubyroid Labs collaborates with businesses all around the world to integrate OpenAI into their activities. If you want to build or improve a chatbot or another conversational interface, please contact us.

Summary
Let’s wrap it up:

  1. Split your data into small chunks and calculate an embedding for each chunk.
  2. Save the chunks with their embeddings to a vector DB, e.g., PostgreSQL plus pgvector.
  3. The app initialization is done. Now you can receive a question from a user and calculate an embedding for it.
  4. Get the chunk from the DB whose vector is nearest to the question’s vector.
  5. Send the question to the Chat API, providing that chunk as context.
  6. Get an answer from the Chat API and display it to the user 🎉


Here is the complete chat logic, extracted into a separate class:

# frozen_string_literal: true

class AnswerQuestion
  attr_reader :question

  def initialize(question)
    @question = question
  end

  def call
    message_to_chat_api(<<~CONTENT)
      Answer the question based on the context below, and
      if the question can't be answered based on the context,
      say \\"I don't know\\".

      Context:
      #{context}

      ---

      Question: #{question}
    CONTENT
  end

  private

  def message_to_chat_api(message_content)
    response = openai_client.chat(parameters: {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: message_content }],
      temperature: 0.5
    })
    response.dig('choices', 0, 'message', 'content')
  end

  def context
    question_embedding = embedding_for(question)
    nearest_items = Item.nearest_neighbors(
      :embedding, question_embedding,
      distance: "euclidean"
    )
    nearest_items.first.text
  end

  def embedding_for(text)
    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )

    response.dig('data', 0, 'embedding')
  end

  def openai_client
    @openai_client ||= OpenAI::Client.new
  end
end

# AnswerQuestion.new("Your question...").call

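To wire this class into the chat we built earlier, the create action of QuestionsController just delegates to it, reusing the question helper already defined there:

def create
  @answer = AnswerQuestion.new(question).call
end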

What else can be done to improve answer quality:

  • Chunk size. Find the best size for a data chunk. You can try splitting the data into small chunks, retrieving the closest N from the database, and joining them into one context (see the sketch after this list). Conversely, you can create big chunks and retrieve only the single closest one.
  • Context length. With gpt-3.5-turbo you can send 4,096 tokens; with gpt-3.5-turbo-16k, 16,384 tokens; with gpt-4-32k, up to 32,768 tokens. Find whatever fits your needs.
  • Models. There is a slew of AI models that you can use for embeddings or chat. In this example, we used gpt-3.5-turbo for chat and text-embedding-ada-002 for embeddings. You can try different ones.
  • Embeddings. The OpenAI Embeddings API is not the only way to calculate embeddings. There are plenty of other open-source and proprietary models that can do it.
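For example, the first idea from the list could look like this: a minimal sketch of the context method from AnswerQuestion that takes the three nearest chunks instead of one and joins them into a single context:

def context
  question_embedding = embedding_for(question)
  nearest_items = Item.nearest_neighbors(
    :embedding, question_embedding,
    distance: "euclidean"
  )
  # take the 3 closest chunks instead of 1 and separate them
  # so the model can tell them apart
  nearest_items.first(3).map(&:text).join("\n\n---\n\n")
end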
