Sai Kiran Gajjala

Posted on Jan 26

Generative AI Powered QnA & Visualization Chatbot

Generative AI, since the arrival of ChatGPT, has become an indispensable tool, streamlining my workflow and enhancing creativity. For me, it has become as essential as electricity or the internet, making it almost impossible to imagine life without it. The past year has been thrilling, filled with countless discoveries in the world of generative AI. It's been a bit overwhelming at times, but my interest in this technology has kept me going. I have delved into various tools, frameworks, design patterns, and approaches within the generative AI field. This blog post aims to share some of the learnings I've gained in 2024.

Movie AI Bot: A Generative AI Application
This blog post showcases a full-stack Proof-of-Concept (POC) application called Movie AI Bot that demonstrates the power of generative AI. Movie AI Bot allows users to interact with a movie database using natural language and receive informative responses, along with data visualizations.

This full-stack GenAI-powered application demonstrates how to leverage generative AI within a client-server architecture. It allows users to:

Ask questions about the movie database in natural language and receive responses in natural language, formatted in markdown.
View dynamic visualizations of data trends generated in real time.

This project serves as a concrete example of integrating generative AI capabilities into a client-server web application. It shows how AI can be used to query an operational database in natural language. The project uses the sample_mflix database from MongoDB Atlas as an example.

Tech Stack
This POC is developed using a simple client-server architecture.

Front End: A React application that uses the React-Chatbotify library to integrate a chatbot GUI. It also uses the Plotly library to display the charts/visualizations. The generative AI implementation and details are entirely abstracted from the front end. The front-end application depends on a single REST endpoint of the backend application.
Back End: A Python FastAPI project that uses LangGraph to run the AI agents.
Database: A MongoDB Atlas cluster with the sample_mflix database
Orchestration Framework: LangGraph
Foundation Models: OpenAI's GPT-4o as Large Language Model (LLM) and OpenAI's GPT-4o-mini as Small Language Model (SLM)
Debugging & Troubleshooting: LangSmith

Implementation
The project uses LangGraph to orchestrate AI agents for tasks like Q&A and Generative Business Intelligence (BI). The key agents and their functions are described below.

1. QnA Agent

This agent understands user queries in natural language, retrieves data from the MongoDB database, and generates natural language responses using the tool-calling agent feature of LangChain. It performs the following tasks:

Dynamically generates a MongoDB aggregation pipeline query using a predefined tool.
Executes the generated query on MongoDB Atlas.
Retrieves the relevant data from MongoDB.
Summarizes the data and responds to the user in natural language.

This agent is powered by an LLM, OpenAI's GPT-4o, chosen to improve the accuracy of the generated MongoDB pipeline queries.
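
To make the tool step more concrete, here is a minimal sketch of what a pipeline-running tool could look like. It is not the code from the repository: the function name run_pipeline_tool, the max_docs cap, and the rule of rejecting write stages are my own assumptions for illustration. The collection argument is anything with an aggregate method, such as a pymongo collection for sample_mflix.movies.

```python
import json

# Stages that write data; a read-only Q&A tool should refuse them.
WRITE_STAGES = {"$out", "$merge"}


def run_pipeline_tool(collection, pipeline_json, max_docs=20):
    """Parse an LLM-written pipeline, reject write stages, and run it.

    Hypothetical helper: `collection` only needs an `aggregate(pipeline)`
    method, which a pymongo Collection provides.
    """
    pipeline = json.loads(pipeline_json)
    if not isinstance(pipeline, list):
        raise ValueError("pipeline must be a JSON array of stages")
    for stage in pipeline:
        if not isinstance(stage, dict) or len(stage) != 1:
            raise ValueError("each stage must be an object with one operator")
        operator = next(iter(stage))
        if operator in WRITE_STAGES:
            raise ValueError(f"stage {operator} is not allowed")
    # Cap the result size so the summarization prompt stays small.
    pipeline.append({"$limit": max_docs})
    return list(collection.aggregate(pipeline))


# What the model might produce for the question
# "What are the five highest-rated movies released since 2000?"
# The $type filter assumes some documents may lack a numeric rating.
EXAMPLE_PIPELINE = json.dumps([
    {"$match": {"year": {"$gte": 2000}, "imdb.rating": {"$type": "number"}}},
    {"$sort": {"imdb.rating": -1}},
    {"$limit": 5},
    {"$project": {"_id": 0, "title": 1, "year": 1, "imdb.rating": 1}},
])
```

The returned rows would then go back to the LLM, which summarizes them into the markdown answer the user sees.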

QnA Flow

2. Visualization Agents

The main objective of these agents is to create visualizations for user queries related to movie trends. These agents perform various functions, including:

Rephrasing the user query to better suit visualization purposes.
Fetching data from MongoDB based on the rephrased query.
Dynamically writing Python code to generate Plotly design and data JSONs.
Executing the Python code and providing the front end with visualization data in Plotly format.

These agents are also powered by an LLM, OpenAI's GPT-4o, chosen to improve the accuracy of both the generated MongoDB pipeline queries and the generated Python code.
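
For readers who have not seen Plotly's JSON form, the sketch below shows the kind of output the generated Python code is expected to produce: a dictionary with a data list and a layout object that the front end can hand to Plotly. The helper rows_to_bar_figure and the sample rows are made up for illustration; in the application this code is written by the LLM at run time and will differ per query.

```python
import json


def rows_to_bar_figure(rows, x_key, y_key, title):
    """Turn aggregation rows into a Plotly-style figure dictionary."""
    return {
        "data": [{
            "type": "bar",
            "x": [row[x_key] for row in rows],
            "y": [row[y_key] for row in rows],
        }],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": x_key}},
            "yaxis": {"title": {"text": y_key}},
        },
    }


# Hypothetical rows, shaped like the output of a $group stage.
rows = [
    {"decade": 1990, "movies": 3},
    {"decade": 2000, "movies": 5},
]
figure = rows_to_bar_figure(rows, "decade", "movies", "Movies per decade")
payload = json.dumps(figure)  # the string the API would return to the front end
```

Keeping the result as plain JSON means the front end needs no knowledge of how the chart was produced.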

The following LangSmith link shows this flow.
Visualization Flow

3. Router Agent

This lightweight agent serves as a gateway to the backend, dynamically routing user queries to the appropriate downstream agent, such as the QnA or Visualization agents. It is powered by an SLM, specifically OpenAI's GPT-4o-mini, for efficient and cost-effective task routing.
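
The routing idea can be sketched in a few lines. This is an illustration, not the project's code: classify stands in for the GPT-4o-mini call, the labels "qna" and "visualization" are names I picked, and the keyword classifier exists only so the example runs without an API key.

```python
def route_query(query, classify, handlers, default="qna"):
    """Send `query` to the handler named by `classify(query)`.

    `classify` stands in for the small-model call. Unknown labels fall
    back to `default` so a malformed model reply never breaks a request.
    """
    label = str(classify(query)).strip().lower()
    if label not in handlers:
        label = default
    return label, handlers[label](query)


# Stand-ins for the real downstream agents.
handlers = {
    "qna": lambda q: f"answer for: {q}",
    "visualization": lambda q: {"data": [], "layout": {}},
}


def keyword_classifier(query):
    """Toy replacement for the SLM: looks for chart-related words."""
    chart_words = ("chart", "plot", "graph", "trend", "visual")
    is_chart = any(word in query.lower() for word in chart_words)
    return "visualization" if is_chart else "qna"


label, result = route_query("Plot movies per year", keyword_classifier, handlers)
```

The fallback to a default agent is a design choice worth keeping even with a real model, since a classifier can return unexpected text.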

Memory Management

To maintain conversational context and handle follow-up queries, short-term memory is enabled for all agents. This can be extended to long-term memory by persisting context in a database, enhancing interactions over time.
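
As a rough mental model of short-term memory, the sketch below keeps a bounded message history per conversation in process memory. The class name ShortTermMemory and the max_messages limit are assumptions of mine; in the project this role is played by LangGraph's own persistence, which, as far as I know, keys saved state by a thread ID in a similar way.

```python
from collections import defaultdict, deque


class ShortTermMemory:
    """Keep the last `max_messages` messages per conversation, in process."""

    def __init__(self, max_messages=10):
        # Each thread gets its own bounded queue; old messages drop off.
        self._threads = defaultdict(lambda: deque(maxlen=max_messages))

    def append(self, thread_id, role, content):
        self._threads[thread_id].append({"role": role, "content": content})

    def history(self, thread_id):
        """Return the retained messages for one conversation, oldest first."""
        return list(self._threads[thread_id])


memory = ShortTermMemory(max_messages=2)
memory.append("user-1", "user", "Top movies of 1999?")
memory.append("user-1", "assistant", "Here are the top movies of 1999.")
memory.append("user-1", "user", "And for 2000?")
```

Because everything lives in a dictionary, the history is lost on restart. Moving to long-term memory means swapping this store for one backed by a database.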

Improvements

Slow Response Times: While OpenAI's GPT-4o and GPT-4o-mini offer impressive accuracy, their inference can be slow, resulting in slow response times. To address this, consider alternative LLM providers like Groq Cloud, Fireworks AI, or Together AI. These providers offer faster inference with open-source models like Llama 3.3 and DeepSeek, which maintain accuracy close to OpenAI's models. This can significantly improve the user experience.
LLM Code Execution: Running LLM-generated code on the same server as the application is not recommended. A safer option is to run the code in a sandbox environment such as Modal or E2B.
Token Streaming: Currently, the chatbot waits for the backend API to generate all response tokens before displaying them to the user. This can lead to a poor user experience. To improve this, token streaming capabilities of the LLMs alongside WebSockets can be implemented. This allows for a more interactive experience by delivering the generated tokens incrementally as they are produced.
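
To illustrate the token streaming idea, here is a small asyncio sketch that forwards chunks to a client as they arrive instead of waiting for the full response. Everything in it is a stand-in: fake_token_stream replaces the streaming LLM call, and the send callback is where a WebSocket's send method would go in a real server.

```python
import asyncio


async def fake_token_stream(text, delay=0.0):
    """Stand-in for a streaming LLM call: yields one word at a time."""
    for word in text.split():
        await asyncio.sleep(delay)
        yield word + " "


async def relay(stream, send):
    """Forward each chunk to the client as soon as it arrives."""
    parts = []
    async for chunk in stream:
        await send(chunk)
        parts.append(chunk)
    # The full text is still returned, e.g. for saving to memory.
    return "".join(parts)


async def main():
    sent = []

    async def send(chunk):  # a WebSocket send call would go here
        sent.append(chunk)

    full = await relay(fake_token_stream("Top rated movies of 2010"), send)
    return sent, full


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The front end would append each chunk to the chat bubble as it arrives, so the user sees text almost immediately.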

Links

Source Code:
https://github.com/saikiran-gajjala/langgraph-agents

Instructions to run the application:
Refer to the following README.md file for instructions on running the application.
https://github.com/saikiran-gajjala/langgraph-agents/blob/main/README.md

Demo Video Link
Refer to the following link to watch the demo video recording.
https://github.com/saikiran-gajjala/langgraph-agents/blob/main/MovieBot-Demo.mp4

Future Scope

In upcoming blog posts, this POC can be extended with other features and technologies:

Integration of human-in-the-loop features within LangGraph, enabling human oversight and intervention in the AI decision-making process for certain actions or tasks.
Integration of Graph RAG for efficiently fetching unstructured data.
Integration with open-source models, such as DeepSeek and Llama small language models, by deploying them locally using Ollama.
