DEV Community

Chat with your CSV: Visualize Your Data with Langchain and Streamlit

Ngonidzashe Nzenze on May 17, 2023

Large language models (LLMs) have become increasingly powerful and capable. These models can be used for a variety of tasks, including generating t...
 
Gavin S

You might consider json.loads() instead of eval(). Otherwise you might let the model out 😉

Ngonidzashe Nzenze

Oh, thanks for the feedback! 🙂 I'll definitely consider using json.loads() instead of eval().
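
To illustrate the suggestion above: eval() executes whatever expression the model returns, while json.loads() only parses data. This is a minimal sketch, assuming the model is prompted to reply with a JSON object. The `parse_model_response` helper and the sample strings are hypothetical and not taken from the article.

```python
import json


def parse_model_response(text: str) -> dict:
    """Parse a model reply as JSON without executing it.

    Raises ValueError if the reply is not a JSON object.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data


# Hypothetical reply in the shape a charting app might ask for.
reply = '{"bar": {"columns": ["A", "B"], "data": [1, 2]}}'
parsed = parse_model_response(reply)

# A reply containing code is rejected instead of being run.
malicious = "__import__('os').system('echo pwned')"
try:
    parse_model_response(malicious)
    rejected = False
except ValueError:
    rejected = True
```

With eval(), the second string would have been executed as Python; here it only fails to parse.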

Raghavendra Samant

Nice article, Ngonidzashe!
Just ran into one small issue while following it: tabulate needs to be installed too.

Ngonidzashe Nzenze

Thanks for reading my article! I'm glad you found it helpful.

You're right, I forgot to add the installation instructions for the tabulate package. You can install it with the following command: pip install tabulate

Once you have installed the tabulate package, you should be able to follow the rest of the instructions in the article without any problems.

Raghavendra Samant

Right, but I'm running into OpenAI credit limit issues (0 of $18). Do you have a paid account, or do the trial account tokens suffice?

Ngonidzashe Nzenze

The tokens provided for your trial account are enough initially, but it appears that you have exhausted them. It would be advisable to think about upgrading to a paid account.

Femi Akinyemi

Nice and well written! Well done

Ngonidzashe Nzenze

Thank you for reading, I'm glad you liked it.

mingjun1120

I was thinking of doing something similar to your work. Instead of uploading a CSV file, I want to upload a PDF file. Do you have any idea how to do that?

Ngonidzashe Nzenze

You could make use of the UnstructuredPDFLoader and the load_qa_chain as follows:

from langchain.document_loaders import UnstructuredPDFLoader
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

# Replace with your own OpenAI API key
API_KEY = 'api-key'

# Load the PDF into LangChain Document objects
loader = UnstructuredPDFLoader("your_document.pdf")
data = loader.load()

# "stuff" passes all loaded documents to the model in a single prompt
chain = load_qa_chain(
    OpenAI(temperature=0.9, openai_api_key=API_KEY), chain_type="stuff"
)

# model response
response = chain.run(input_documents=data, question="<Input your query here>")


You can get more information here

talmoscovitz

Does the CSV have a size limitation?
Very nice work!

Ngonidzashe Nzenze

I'm glad you liked the article.

Although the file upload size limit in Streamlit is 200MB, the documentation for create_pandas_dataframe_agent does not explicitly state any size limit. However, it is important to note that larger dataframes will consume more memory.

pdkang

How can I load the CSV from a URL? I couldn't figure it out. If you have any ideas on how to handle that, that would be great!

Ngonidzashe Nzenze

You can load a CSV from a URL just like you would normally load a CSV in pandas:

import pandas as pd

url = "https://example.com/data.csv"
df = pd.read_csv(url)

Hope that helps 🙂

Ngonidzashe Nzenze

Glad you liked it!

Lilin Wang

Nice tutorial! Wondering if this is doable with the current JavaScript support in LangChain? The JavaScript support in LangChain is definitely limited.

Ngonidzashe Nzenze

Thank you, I'm happy you liked the tutorial!

While I'm not sure about the full capabilities of the current JavaScript support in LangChain, there should be a way to make it happen. It may be helpful to explore LangChain's documentation for more insights.

JmuneraCLQ

Hi Ngonidzashe

Good article and excellent app for testing chat with CSV.

I get an error when running the application in Streamlit:
InvalidRequestError: The model text-davinci-003 has been deprecated

What should I fix, and where?

Thank you very much

varun surana

Hi, can someone help with replacing text-davinci-003 with another model? There is also a token limitation that is causing issues.

lokesh

langchain.schema.output_parser.OutputParserException: Could not parse LLM output: Since the observation is not a valid tool, I will use the python_repl_ast tool to extract the required columns from the dataframe.
I am facing this error. Can anyone help?

Jean Bertin • Edited

Hi! Thanks a lot for the nice tutorial! :)
I would like to know if it is possible to adapt the agent.py script to use an "Azure OpenAI API key" instead of an "OpenAI API key".
The documentation for the basic switch is here, but I don't know precisely how to integrate it into your app:
learn.microsoft.com/en-us/azure/ai...

Thanks a lot for any help.

subhadipbhatta

Very nice article, and very well described. However, there is a glitch in my case. When I try to analyze, I get the following error: openai.error.APIError: HTTP code 401 from API (Unknown api key). Trust me, this is my personal API key and it works perfectly elsewhere.
Also, please suggest an implementation with Azure OpenAI:

openai.api_type = "azure"

openai.api_version = "...."

openai.api_key = "...."

openai.api_base = "....."

Much appreciated

Pritish Chugh

While running this prompt, the agent gets stuck in a loop. Can you see why this is happening?

Prompt: Create a bar graph for the first 5 rows in the data with Ordernumber on x-axis and quantity ordered on y-axis

Entering new AgentExecutor chain...
Thought: To create a bar graph, we need to specify the columns for the x-axis and y-axis. We also need to limit the data to the first 5 rows.

Action: Create a bar graph using the specified columns and limited data.
Action Input: df.head(5), "ORDERNUMBER", "QUANTITYORDERED"
Observation: Create a bar graph using the specified columns and limited data. is not a valid tool, try another one.
Thought:To create a bar graph, we can use the matplotlib library in Python. Let's import it and try again.

Action: Import the matplotlib library.
Action Input: import matplotlib.pyplot as plt
Observation: Import the matplotlib library. is not a valid tool, try another one.
Thought:To create a bar graph, we can use the matplotlib library in Python. Let's import it and try again.

Action: Import the matplotlib library.
Action Input: import matplotlib.pyplot as plt
Observation: Import the matplotlib library. is not a valid tool, try another one.
Thought:To create a bar graph, we can use the matplotlib library in Python. Let's import it and try again.
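
The loop in the log above comes from the `Action:` line. The agent expects the name of a registered tool there (the error quoted in an earlier comment mentions `python_repl_ast`), but the model wrote a free-text description, so every step is rejected and retried. The following is a simplified illustration of that check, not LangChain's actual parser. `parse_step`, `STEP_PATTERN` and `VALID_TOOLS` are hypothetical names.

```python
import re
from typing import Tuple

# Assumed tool set for illustration; the error message in this thread
# names python_repl_ast as the tool the pandas agent uses.
VALID_TOOLS = {"python_repl_ast"}

STEP_PATTERN = re.compile(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", re.DOTALL)


def parse_step(llm_output: str) -> Tuple[str, str, bool]:
    """Return (action, action_input, is_valid_tool) for one agent step."""
    match = STEP_PATTERN.search(llm_output)
    if match is None:
        raise ValueError("Could not find 'Action:' and 'Action Input:' lines")
    action = match.group(1).strip()
    action_input = match.group(2).strip()
    return action, action_input, action in VALID_TOOLS


# The step from the log: the action is a sentence, not a tool name.
bad_step = (
    "Action: Import the matplotlib library.\n"
    "Action Input: import matplotlib.pyplot as plt"
)
# What a well-formed step would look like under the assumed tool set.
good_step = (
    "Action: python_repl_ast\n"
    "Action Input: df.head(5).plot.bar(x='ORDERNUMBER', y='QUANTITYORDERED')"
)

bad = parse_step(bad_step)
good = parse_step(good_step)
```

Rephrasing the prompt, using a model that follows the expected format more reliably, or capping the number of agent iterations are common ways to mitigate this kind of loop.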

Mbulelo Lomo

Hi, nice article. I have two errors; please advise how I can resolve them:

  1. Image description

  2. Image description

ICreativeKid

Awesome article!

I was thinking of doing something similar for YouTube videos. Any thoughts?

Dil Ahdan

Thanks for the article. How would you approach this if you didn't want to use Streamlit, and instead wanted a Flask app that provides an API for a frontend?

Devanshu-17

Hey, @ngonidzashe

As @dilrajahdan asked above, is it possible to run it via Flask?

Ed1123

Nice article.

Is the code hosted on GitHub? I'd like to fork it. ^^

Ngonidzashe Nzenze

Glad you liked it. The code is available here

anumber8

Bravo @ngonidzashe, this is a very good article on AI. I enjoyed it, and I'm looking forward to more and to getting my hands dirty exploring it further.

phoenix_magicianGirl シェファード

Interesting application of GPT, nice article @ngonidzashe

1gamedevprod

It's a shame we can't pass custom tools to the prebuilt agent as an input argument (I think you need to create an agent from scratch).

vinitmodigit

Hi
Nice article very well explained.
If I have a CSV file with 16000 rows and 20 columns, would there be any limitations on using this file?

Raheem Ridwan Lekan

Thanks for the insightful article. I want to know how to use Flask as the endpoint instead of Streamlit.

Hamsa

Is there a way to load an XML file and extract data from it?
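
One possible approach, sketched with Python's standard library: parse the XML into a list of row dictionaries, which can then be handed to `pandas.DataFrame` and used like the CSV data in the article. The element names (`orders`, `order`, `id`, `quantity`) and the `xml_to_rows` helper are made up for the example and assume a flat, record-style XML layout.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample data with one element per record.
SAMPLE_XML = """
<orders>
  <order><id>10100</id><quantity>30</quantity></order>
  <order><id>10101</id><quantity>25</quantity></order>
</orders>
"""


def xml_to_rows(xml_text: str, record_tag: str) -> List[Dict[str, str]]:
    """Turn each <record_tag> element into a dict of child tag -> text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]


rows = xml_to_rows(SAMPLE_XML, "order")
# For a file on disk, ET.parse("your_file.xml").getroot() gives the root element.
```

All values come back as strings, so numeric columns would still need converting before analysis.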

MG007

How can I integrate this feature with bubble.io?

Gaurav Sharma

In the CSV file, should all columns have the object datatype, or can they be any datatype (int, float)?

rsk

Image description
I'm facing this error.
What could the issue be? Could I please get some help with this?

Sid Keesara • Edited

Image description

I'm getting the following error.

Can someone help me with it?

Sunil Kulkarni

Nice article.
I have a use case where a PDF contains the rules and the data is in a CSV. How do I combine the PDF and the CSV and chat with both?

abhishekkumarjjha

Nice work. But I'm having issues with tabulate. I've already installed it and can confirm that, but it still says "No module named 'tabulate'". Can't we bypass the use of tabulate?
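
A frequent cause of this symptom is that pip installed the package into a different Python environment than the one running the app. Below is a small diagnostic sketch using only the standard library. `is_importable` is a hypothetical helper name, and whether this is the actual cause here is an assumption.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can find the module."""
    return importlib.util.find_spec(module_name) is not None


# Shows which interpreter is running and whether it can see tabulate.
print("Interpreter:", sys.executable)
print("tabulate importable:", is_importable("tabulate"))
```

If the second line prints False, running `python -m pip install tabulate` with the interpreter shown in the first line usually puts the package in the right environment.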