Andrew Panfilov

Posted on May 22

Chatbot Prototype: Architectural Proposal

#architecture #proposal #chatbot #llm

Intro

This article presents an example of an architectural proposal for creating an intelligent chatbot prototype with LLM. It can serve as useful learning material for those who have never written such documents, and for those who have written similar proposals, it may enrich their experience. In any case, I would be glad to hear readers' comments and discuss what should be added to or corrected in this document.

Task Definition

For a large university, it is necessary to create a chatbot to which employees can ask questions in free form regarding students, attendance, statistics, aggregation, and various database queries about students. The university attempted to create a system based on Text-to-SQL on its own. However, in three out of ten cases, the generated SQL query returned incorrect results, and in seven out of ten cases, it returned correct results. This accuracy level does not satisfy the university, so a prototype chatbot needs to be created to respond to university employees’ questions about students with 100% accuracy. This proposal describes a system capable of providing the most accurate information about students from the university database.

Analysis

Terms and Acronyms

Component refers to a runtime entity, it can be deployed independently.
Deployable artifact is the application code as it runs on production: compiled, built, bundled, minified, optimized, and so on. Most often, it's a single binary or a bunch of files compressed in an archive. One can store and version an artifact. An artifact should be configurable in order to be deployable in any environment. For example, if one needs to deploy to staging and production servers, one should be able to use the same artifact. Only the configuration must change, not the artifact itself.
Personal Identifiable Information (PII) refers to any data that could potentially identify a specific individual. This information can include direct identifiers, such as name, social security number, driver’s license number, and passport number, which can directly recognize an individual. It also encompasses indirect identifiers when combined with other information, such as date of birth, place of birth, and mother’s maiden name, that can be used to deduce an individual's identity.
Docker is a set of a platform as a service (PaaS) products that use OS-level virtualization to deliver software in packages called containers.
Container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.
Docker Compose is a tool designed for defining and orchestrating multi-container Docker applications. It allows users to use a YAML file to configure an application’s services, networks, and volumes. By utilizing a single command, users can then initiate and activate all the services outlined in their configuration. It streamlines the deployment process, enabling the management of the entire lifecycle of an application through straightforward commands.
SQL query is a request for data or information from a database table or combination of tables in SQL (Structured Query Language). This language is used for managing and manipulating relational databases. SQL queries can perform a variety of tasks, including retrieving specific information from a database, inserting new data, updating existing data, and deleting data. Queries are constructed using specific SQL syntax and can range from simple commands to retrieve all records from a database table to complex queries involving multiple operations like joins, subqueries, and aggregate functions to extract and manipulate data according to specific requirements.
Chatbot is a software application designed to simulate conversation with human users, especially over the Internet. Chatbots can function in a wide range of environments, including websites, mobile apps, messaging platforms, and telephone services. They are used for various purposes such as customer service, information acquisition, and entertainment. Chatbots can be simple, based on pre-defined scripts to handle specific tasks, or more complex, utilizing natural language processing and artificial intelligence to engage in more open-ended conversations and learn from interactions to improve their responses over time.
Snowflake database refers to Snowflake Inc.'s cloud-based data warehousing service. It is a fully managed service that allows organizations to store and analyze data using cloud-based hardware and software. Snowflake's architecture is unique because it separates storage and compute, enabling users to scale storage and compute independently. This means organizations can adjust their storage capacity and computational power based on their current needs without affecting the other.
SPA (Single Page Application) is a web application or website that interacts with the user by dynamically rewriting the current page rather than loading entire new pages from the server. This approach enables a more fluid and faster user experience, as it minimizes the amount of data transferred between the server and the client, reduces loading times, and provides a seamless interaction similar to desktop applications. SPAs use AJAX and HTML5 to asynchronously load content and update the webpage with new data from the web server, without the need for page refreshes.

Requirements

This article uses system-centric requirements, which are not the same as user-centric user stories and may not contain any business value. At the same time, it is not a contradiction or opposite approach to user-centric user stories – a user story that mentions business value may contain several system-centric requirements.

Additionally, it is essential to understand the difference between functional and non-functional requirements.

From https://en.wikipedia.org/wiki/Non-functional_requirement:

“In systems engineering and requirements engineering, a non-functional requirement (NFR) is a requirement that specifies criteria that can be used to judge the operation of a system, rather than specific behaviors. They are contrasted with functional requirements that define specific behavior or functions. The plan for implementing functional requirements is detailed in the system design. The plan for implementing non-functional requirements is detailed in the system architecture, because they are usually architecturally significant requirements.

Broadly, functional requirements define what a system is supposed to do and non-functional requirements define how a system is supposed to be. Functional requirements are usually in the form of "system shall do ", an individual action or part of the system, perhaps explicitly in the sense of a mathematical function, a black box description input, output, process and control functional model or IPO Model. In contrast, non-functional requirements are in the form of "system shall be ", an overall property of the system as a whole or of a particular aspect and not a specific function. The system's overall properties commonly mark the difference between whether the development project has succeeded or failed.

Non-functional requirements are often called "quality attributes" of a system. Other terms for non-functional requirements are "qualities", "quality goals", "quality of service requirements", "constraints", "non-behavioral requirements", or "technical requirements". Informally these are sometimes called the "ilities", from attributes like stability and portability. Qualities—that is no n-functional requirements—can be divided into two main categories:

1) Execution qualities, such as safety, security and usability, which are observable during operation (at run time).

2) Evolution qualities, such as testability, maintainability, extensibility and scalability, which are embodied in the static structure of the system.”

Principles:

Written once, read many times
1. Specifications of an IT system (like other texts), are written once and are read many times over the lifespan of the system. Hence two observations become clear: reading time matters more than writing time, and upfront quality is free. I.e. it makes sense to optimize the text for readability and comprehensibility, even if that means the author has to put in more effort.
No Nobel prize in literature
1. Specifications are technical texts written for one purpose only: the successful transfer of meaning from one person to another, over distance and time. The author will never win a Nobel prize in literature for it. Therefore it does not make sense to put the effort into an aesthetic, pleasurable text.
Specifications are for systems
1. System specifications describe what is required from the system. This should not at all prevent you from describing what the users do (in the sense of a business process description), but you want to do that in other parts of the documentation (use cases, for instance).
Detailed written documentation has very low communication effectiveness
1. Therefore writing detailed functional requirements might not be the best of ideas. Consider NOT doing it if you expect a high rate of requirements creeps. Assume a change rate of 2-3% per month if you do not know the specific change rate of your project (which actually IS high, though widely experienced).

Here is a well-structured explanation of a requirements format:

Large image: https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6pz78jmxtw2uki01afre.png

Links:

1. User's authentication

The system must provide a user with the capability to request a student related information in free-form textual communication with the chatbot if the following condition is true:

the user was successfully logged in with the provided login password

2. Free-form request's bucketing

The system must recognize to what predefined SQL query relates a free-form textual request about student information if all of the following conditions are true:

The user sent the request through the chatbot
The request semantically belongs to student information

3. Recognition failure

The system must reply to a user with recognition failure if all of the following conditions are true:

The user sent the request through the chatbot
The request semantically doesn't belong to any predefined SQL queries that existed in the system

4. Correctness confirmation recognition

The system must provide a user with the capability to confirm the correctness of predefined SQL query recognition if all of the following conditions are true:

The user sent the request through the chatbot
The request semantically belongs to a predefined SQL query that existed in the system
The chatbot sent a confirmation question to the user, and the user confirmed that the free-form request belongs to the recognized SQL query

5. Additional parameters input

The system must provide a user with the capability to input additional parameters for the recognized SQL query in a dialogue with the chatbot if the following condition is true:

Chatbot successfully recognized predefined SQL query based on a free-form textual request from the user

6. Predefined SQL query results

The system must reply to a user with a result of a predefined SQL query invocation to Snowflake in tabular textual format in a chatbot dialogue if all of the following conditions are true:

The user sent the request through the chatbot
The request was recognized as belonging to one of the predefined SQL queries
The predefined SQL query was sent to the Snowflake database
Snowflake database responded with non-empty data

Predefined queries to Snowflake database

Student missed 3 or more periods in prior instructional week
Student misses same day of the week multiple times
Student Suspended

Design

How to read a component diagram

For cognitive load reduction, a minimalist visual language is used to depict a static structure of deployable artifacts (called components) with links between them.

Here is a subset of visual means of the standard UML component diagram, with only four elements: component, interface, and two types of connections: one for "provides" semantics and one for "uses" semantics.

This type of diagram does not allow any visual means to be treated as flows. No arrows are permitted. A sequence diagram should depict time-related communications, protocols, and data flow.

Chatbot AI Prototype components diagram

Large image: https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9mprx3z9jm82rkihe9wc.png

To simplify the prototype creation and ensure an efficient setup process, all the back-end components of the system will be deployed on a single machine using Docker Compose.

Components:

User's Browser: The entry point for the user's interaction with the ChatBot SPA. It's where the user types their queries in free-form text.
Nginx: Acts as a reverse proxy, load balancer, and static asset server. It also handles basic authentication using login/password to ensure only authorized users can interact with the ChatBot.
ChatBot API: The core component that processes user requests. It's a stateless backend service, possibly written in Python or NodeJS, that communicates with both the OpenAI API for natural language understanding and the Snowflake database for data retrieval. It also interacts with the Postgres database to fetch SQL query templates.
Postgres: Stores SQL query templates with placeholders. These templates are used by the ChatBot to construct SQL queries based on the user's request and additional parameters provided during the dialogue.
OpenAI API: Provides the natural language processing capabilities required to understand the user's free-form text requests and map them to the appropriate SQL query template from the Postgres database.
Snowflake: The primary data storage where actual student attendance data is kept. Once the ChatBot constructs an SQL query, it's executed against this database to retrieve the requested information.

Chatbot AI Prototype process flow

Large image: https://dev-to-uploads.s3.amazonaws.com/uploads/articles/chxjvl5gegv4kbo4kwir.png

The sequence diagram outlines the process flow of a user interacting with a ChatBot to retrieve data from a Snowflake database, using predefined SQL query templates stored in a Postgres database. The interaction involves several phases, from the initial request to the display of results.

This sequence diagram does not contain a process of preparation and filling Postgres with predefined queries with templates with descriptions and embeddings.

Here's a breakdown of each phase:

1. Initial Phase

The user inputs the chatbot's web address into their browser. The browser requests the Single Page Application (SPA) for the chatbot from the Nginx server. Nginx returns the necessary SPA assets (HTML, JavaScript, CSS) to the browser. The user types a free-form textual message into the chatbot interface. The browser sends this message to the chatbot through Nginx, which forwards the authenticated request with the message to the ChatBot API.

2. Predefined Query Recognition Phase

The ChatBot API requests an embedding vector for the message from the OpenAI API. The ChatBot API component asks Postgres for the closest predefined query description based on the user's message. After receiving the query description, the ChatBot API requests a dialogue descriptor from Postgres. The API constructs a prompt using the dialogue descriptor, predefined query description, and user's message. The prompt is sent to the OpenAI API to generate a next dialogue replica, which is used to confirm the correctness of the query recognition with the user.

3. Predefined Query Parameters Fill Phase

Upon user confirmation, the ChatBot API fetches the SQL template for the recognized predefined query from Postgres. The API requests a specific dialogue descriptor related to the predefined query. Using this descriptor, the API builds another prompt and sends it to the OpenAI API to generate dialogue for requesting additional parameters from the user. The user supplies the additional parameters requested by the chatbot.

4. Predefined Query Invocation Phase

The ChatBot API component constructs the final SQL query for invocation in Snowflake using the template filled with parameters provided by the user. This SQL query is executed against the Snowflake database. Snowflake returns the query results, which the ChatBot API formats and sends back to the browser to be displayed to the user.

Tasks breakdown

Predefined Snowflake SQL queries preparation
1. Query templates composing.
2. Embeddings generation.
Predefined Snowflake SQL descriptions preparation.
Postgres DDL composing (example)
1. vector extension setup.
2. DDL for documents table.
3. DDL for document_sections table.
4. DDL for chatbot_dialogues table.
5. DDL for llm_conversations table.
Docker-compose.yml composing.
ChatBot API back-end
1. GitHub repository with CI/CD (GitHub Actions) preparation.
2. Initial codebase composing.
ChatBot SPA
1. GitHub repository with CI/CD (GitHub Actions) preparation.
2. Initial codebase composing.
Nginx configuration composing.
Provisioning
1. GitHub repository with CI/CD (GitHub Actions) preparation.
2. Terraform scripts composing.
3. Hetzner VPS setup.

DEV Community