For a while now, I have been wondering how LLMs can read and analyze files. So I decided to learn how they do it, and my research led me to tools. I've also been looking for a good anime to watch, so I figured: how about I learn to create a tool that can recommend some good shows? I grabbed an anime dataset from Kaggle and went about building the tool.
It took me a while to figure out the best way to make use of the tool, but hey, I think it was worth it. In this article, we’ll cover:
- Creating an agent using LangChain
- Tool calling through the agent
- Safety considerations when working with tools
- Adding memory to the agent
- Returning a structured response
- Caching in Streamlit
The idea behind the app is pretty simple: we have an Excel file with anime ratings. We make use of pandas, specifically the DataFrame query method, to filter through the data and return a result. This result is fed to the agent, and it returns a response based on the result. The agent's job, therefore, is to come up with queries that are fed to the tool and to give a response based on the result the tool returns.
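To make that concrete, here is a minimal sketch of the kind of filtering the tool will do, assuming the dataset file name and columns used later in the article (Name, Score, Genres, etc.):

```python
import pandas as pd

# Load the Kaggle dataset (same file the tool reads later in the article)
df = pd.read_excel("anime-dataset-2023.xlsx")

# A query expression like the ones the agent will generate for the tool
expression = 'Genres.str.contains("Action", case=False) and Score >= 8.5'
result = df.query(expression, engine="python")

print(result[["Name", "Score", "Genres"]].head())
```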
Initial setup
In a virtual environment, install the required packages with:
pip install langchain langchain-core langchain-community langchain-openai langgraph pandas openpyxl streamlit
- streamlit: For building the interactive web application.
- langchain, langgraph, langchain-core, langchain-community: The core libraries for building the agent, defining tools, and managing the application flow.
- langchain-openai: To use OpenAI models like GPT-4o-mini.
- pandas, openpyxl: For reading and querying the Excel data (.xlsx file).
I am using python-environ to manage my environment variables; if you wish to do the same, install it with pip install python-environ.

All the code is available on my GitHub.
Create two files:
- chat.py - for the Streamlit interface
- llm_utils.py - all our agent code

Inside llm_utils.py, add the following code:
import re, os
from langchain_core.tools import tool
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent
import pandas as pd
from langchain_core.messages import HumanMessage, SystemMessage
from typing import Union, List, Literal, Optional
import streamlit as st
from pydantic import BaseModel, Field
import environ
env = environ.Env()
environ.Env.read_env()
APIKEY = env("OPENAI_API_KEY")
if not os.environ.get("OPENAI_API_KEY"):
os.environ["OPENAI_API_KEY"] = APIKEY
We import all the necessary packages that we are going to be using in this project. Make sure you have a .env file with your OpenAI key.
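For reference, the .env file only needs your key (placeholder value shown):

```
OPENAI_API_KEY=sk-your-key-here
```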
Safety Considerations
Before we proceed, we need to talk a bit about safety when working with large language models. You see, the query method is capable of executing Pythonic code, which means that if the model were to provide a malicious query, there could be problems. If, for example, the model were to give an input query like
__import__('os').system('rm -rf /')
This query tries to delete everything. The result could be quite catastrophic; therefore, we have to take some safety precautions. Just telling the model not to return any malicious code is not enough. In this demonstration, I am going to create a function that determines whether a query is safe to execute, and have the tool raise an exception otherwise. This function won't cover every possibility, so I suggest you look at current best practices when implementing your own solution.
In the llm_utils.py file, add the following code:
def is_safe_query(query_string):
"""
Checks query safety before execution.
Returns True if the string is safe, False otherwise.
"""
dangerous_keywords = [
"import",
"eval",
"exec",
"execfile",
"globals",
"locals",
"open",
"compile",
"input",
]
if "__" in query_string or ";" in query_string:
return False
# Check for potentially dangerous words
for keyword in dangerous_keywords:
pattern = r"\b" + re.escape(keyword) + r"\b"
if re.search(pattern, query_string, re.IGNORECASE):
return False
return True
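A quick sanity check with a harmless query and the malicious one from above:

```python
print(is_safe_query('Score >= 8.5 and Type == "TV"'))         # True
print(is_safe_query("__import__('os').system('rm -rf /')"))   # False ("__" and "import" are rejected)
```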
Creating the tool
Creating a tool in LangChain is quite straightforward; we just need to add the @tool decorator to the function, which makes it callable by the agent:
@tool
def read_pandas_dataframe(expression: str, limit: Optional[int] = None) -> str:
"""Query a pandas DataFrame using a safe string expression and return filtered results as a JSON string.
Reads data from 'anime-dataset-2023.xlsx', evaluates a query expression,
optionally takes the first 'limit' rows using the .head() method,
and returns a JSON string (records orientation) representing the subset of rows that match the given conditions. Includes a basic security check.
"""
df = pd.read_excel("anime-dataset-2023.xlsx")
if not is_safe_query(expression):
raise ValueError("Invalid query string")
result = df.query(expression, engine="python", inplace=False)
if limit is not None and isinstance(limit, int) and limit > 0:
# Apply the head method to limit the number of rows
result = result.head(limit)
if result.empty:
return "[]"
elif isinstance(result, pd.Series):
return result.to_frame().to_json(orient="records")
else:
return result.to_json(orient="records")
The tool reads the anime dataset Excel file and executes the query method on the resulting data frame.

Note that the function also takes an optional argument, limit. Sometimes the query returns a large number of rows, which can be too large an input for the agent's context; in that case, only the first few results are returned by calling the head method.
When a user asks a question (e.g., "Recommend me an action anime with a score above 8.5"), the agent, guided by the system message, determines it needs data. It formulates a query expression (like 'Genres.str.contains("Action", case=False) and Score >= 8.5') and calls the read_pandas_dataframe tool with that expression as the argument.

The tool executes the query on the data frame and returns the resulting (filtered) data as JSON back to the agent. The agent then uses this data to formulate its final response to the user.

Note that the read_pandas_dataframe function first checks whether the query is safe to execute.
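Because the @tool decorator wraps the function in a LangChain tool object, you can also try it out directly (outside the agent) by passing the arguments as a dictionary; the exact output depends on your copy of the dataset:

```python
# Call the tool directly to see what the agent receives
sample = read_pandas_dataframe.invoke(
    {"expression": 'Type == "Movie" and Score >= 8.7', "limit": 5}
)
print(sample)  # JSON string with at most 5 matching rows
```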
Creating the agent
At the heart of the application, we have the agent. I am going to make use of a prebuilt agent offered by LangGraph. In llm_utils.py, add the following code to create the agent:
def create_llm_agent():
model = init_chat_model("gpt-4o-mini", model_provider="openai")
return create_react_agent(
model=model,
tools=[read_pandas_dataframe]
)
In this example, I am using the OpenAI API, but you can use any of the other providers; make sure to check the official documentation for details. The create_llm_agent function initializes the chat model first, then creates the agent with the create_react_agent function, which takes the model and the tools as arguments.
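Swapping in a different provider is just a different model name and model_provider (plus the matching integration package and API key, e.g. langchain-anthropic and ANTHROPIC_API_KEY); treat the model name below as a placeholder:

```python
# Hypothetical alternative provider (requires the langchain-anthropic package)
model = init_chat_model("claude-3-5-sonnet-latest", model_provider="anthropic")
```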
The system message
The system message serves to instruct the model on its responsibility, or the purpose it serves. In our case, it's going to help us find the hottest anime on the planet. Well, maybe not the hottest, since the dataset is from 2023 😅.
In llm_utils.py, add the following code:
def get_system_message():
message = SystemMessage(
"""
Role: You are a helpful Anime Recommendation Assistant.
Tool Access: You can query a large Excel dataset containing anime data using the `pandas` library and `df.query`. Key columns available are:
- `Name`
- `Score`
- `Genres`
- `Synopsis`
- `Type` (e.g., "TV", "Movie", "OVA")
- `Episodes`
- `Image URL`
Primary Task: Respond to user requests by querying this dataset and providing relevant anime recommendations or information.
CRITICAL LIMITATIONS:
1. Maximum of 20 Results Per Query: Due to the dataset's size, all queries must return 20 rows or fewer. Larger result sets will cause system failure.
2. Minimize Tool Usage: Plan queries to be specific and efficient. Avoid broad, vague searches that force you to retry multiple times. Strive to get accurate results in a single call.
3. Query Safety: Only generate valid, secure `df.query` strings. Invalid syntax or logic will break the tool.
Query Construction Rules:
- Be Specific: Use `and` to combine multiple filters and narrow results.
- Use Ranges: Apply ranges for scores and episode counts to limit query size (e.g., `Score >= 8.0 and Score < 8.5`).
- Targeted String Matching:
- Use `==` for exact values (`Name == "Naruto"`).
- Use `.str.contains(..., case=False)` for flexible matches in `Genres` or partial name searches — but always pair with other filters.
- Handling Comma-Separated Genres:
The `Genres` column contains comma-separated strings like `"Action, Adventure, Fantasy"`.
When filtering by genre:
- To find anime that match *both* genres (AND logic):
```python
'Genres.str.contains("Action", case=False) and Genres.str.contains("Adventure", case=False)'
```
- To find anime that match either genre (OR logic):
```python
'Genres.str.contains("Action", case=False) or Genres.str.contains("Adventure", case=False)'
```
- For better accuracy (avoiding partial word matches like "Action-Drama"), consider using word boundaries with regex:
```python
'Genres.str.contains(r"\\bAction\\b", case=False)'
```
- Quote Escaping: Use single quotes to wrap the whole query, and double quotes inside it (e.g., `'Name == "Bleach"'`).
If a user request is vague (e.g., “Show me action anime” or “What's popular?”) and will likely return more than 20 results — You are free to ask them for more specific details.
Good Query Examples (Safe and Specific):
- Exact Match
`'Name == "Death Note"'`
*(Very specific, 1 result expected)*
- High-Rated Movie
`'Type == "Movie" and Score >= 8.7'`
*(Highly rated films only)*
- Top-Scoring Action Shows
`'Genres.str.contains("Action", case=False) and Score >= 9.0'`
*(Very narrow score filter to limit results)*
- Multiple Genres (AND)
`'Genres.str.contains("Adventure", case=False) and Genres.str.contains("Fantasy", case=False) and Score >= 8.0'`
- Multiple Genres (OR)
`'Genres.str.contains("Romance", case=False) or Genres.str.contains("Drama", case=False)'`
- Short Action Series
`'Genres.str.contains("Action", case=False) and Episodes <= 13 and Score >= 7.5'`
*(Filters based on episode count and score)*
- Score Bracket Strategy
`'Score >= 8.0 and Score < 8.5'`
*(Prevents too broad a score range)*
- Compact Sci-Fi OVAs
`'Type == "OVA" and Genres.str.contains("Sci-Fi", case=False) and Episodes <= 6 and Score >= 7.5'`
"Top N Strategy" for Ranked Recommendations (e.g., Top 5 Romance Anime):
1. Start narrow:
`'Genres.str.contains("Romance", case=False) and Score >= 9.0'`
2. If result count < 5, expand range slightly:
`'Genres.str.contains("Romance", case=False) and Score >= 8.5 and Score < 9.0'`
3. Combine small result sets **locally**, sort by `Score`, return top 5.
Output Formatting:
If you are returning a dataframe, make sure to return it as a string that can be converted into a dictionary using json.loads. Use the tool to get information from the file,
whether it be the file columns or the actual data. When the user asks about a particular anime, return the image and description.
You can return a modified description to give a better explanation.
Final Guidelines:
- Always prioritize specific, narrow queries to stay within the 20-row limit.
- Never make multiple broad queries when one well-structured query will work.
- Ask users for more input when queries are too general.
- Only return results in safe JSON format or as **natural conversation summaries**.
"""
)
return message
The system message is pretty long, but that's to ensure that we get the results we want. Feel free to modify the prompt as you see fit.
Returning structured data
Langchain models can return structured data, which is useful since sometimes we want to be able to display images of the anime. In this case, I want the model to return a pydantic object:
# llm_utils.py
class ImageCard(BaseModel):
"""Represents an image card with metadata."""
title: str = Field(description="Title of the image card")
description: str = Field(description="Description of the image")
image_url: str = Field(description="URL of the image")
class ResponseClass(BaseModel):
"""Response to a prompt with dynamic data types."""
response_type: Literal["text", "image", "dataframe"] = Field(
description="Type of response (text, image, or dataframe)"
)
# Union type for response_data with discriminator based on response_type
response_data: Union[
str, # For "text" type
List[ImageCard], # For "image" type (list of image cards)
] = Field(..., description="Response data matching the response_type")
I have defined two pydantic classes. The ImageCard class will hold the data for an anime, that is, its title, description, and image URL. This will allow us to easily display each anime in a card.

The ResponseClass class has two fields:

- response_type: Specifies which kind of response the model is returning out of the three options (text, image, and dataframe). This helps us know how to render the response in the frontend.
- response_data: The actual data the model is returning, which can be a string or a list of ImageCard objects for when the model returns multiple anime as cards.
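One thing worth noting: for the agent to actually return a ResponseClass object, create_react_agent has to be told about the schema through its response_format parameter. The final version of create_llm_agent in the memory section below does exactly this; roughly:

```python
# Sketch: agent creation once the structured schema is wired in
return create_react_agent(
    model=model,
    tools=[read_pandas_dataframe],
    response_format=ResponseClass,  # ask the agent for a structured response
)
```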
Getting the model response
Now we define a function that allows us to get the response from the model:
def get_model_response(llm, user_input: str):
user_message = HumanMessage(user_input)
messages = [user_message]
return llm.invoke({"messages": messages})
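As a quick sketch of how this is used (once response_format is set, as noted above), the invoke call returns a state dictionary that contains both the raw messages and a structured_response entry, which is what the chat interface reads:

```python
result = get_model_response(llm, "Recommend me an action anime with a score above 8.5")
structured = result["structured_response"]  # a ResponseClass instance
print(structured.response_type)
print(structured.response_data)
```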
The chat interface
And now for the chat interface. I'm going to keep it pretty simple with a text input at the bottom of the page and a sidebar with only one button that will allow us to clear the chat:
In chat.py, add the following code:
import streamlit as st
from llm_utils import (
get_model_response,
ResponseClass,
create_llm_agent,
get_system_message,
)
import pandas as pd
import json
# --- Page Configuration (Optional) ---
st.set_page_config(page_title="LLM Chat", layout="wide")
llm = create_llm_agent()
@st.cache_resource
def invoke_system_message():
system_message = get_system_message()
llm.invoke({"messages": [system_message]})
invoke_system_message()
st.title("Anime Bot🤖")
st.caption("This chat can display text, dataframes, and image cards.")
def get_llm_response(user_prompt):
"""
Calls the LLM with the user prompt and handles any exceptions.
"""
try:
response = get_model_response(llm, user_prompt)
return response
except Exception as e:
# Return a basic error response in the expected format
return {"response_type": "text", "response_data": f"Error: {str(e)}"}
# --- Chat History Management ---
if "messages" not in st.session_state:
st.session_state.messages = []
# --- Display Previous Messages ---
for message in st.session_state.messages:
with st.chat_message(message["role"]):
# Check the type of content to display
if message["role"] == "user":
# User messages are always text
st.write(message["content"])
else:
# For assistant, handle different response types
content = message["content"]
if content.response_type == "text":
st.write(content.response_data)
elif content.response_type == "dataframe":
# Convert the DataFrameDict to a pandas DataFrame
if hasattr(content.response_data, "data"):
# It's a DataFrameDict object with .data attribute
df = pd.DataFrame(json.loads(content.response_data.data))
elif (
isinstance(content.response_data, dict)
and "data" in content.response_data
):
# It's a dict with "data" key
df = pd.DataFrame(content.response_data["data"])
else:
# Try to convert directly
df = pd.DataFrame(json.loads(content.response_data))
st.dataframe(df, use_container_width=True)
elif content.response_type == "image":
for card in content.response_data:
st.image(card.image_url, caption=card.title)
st.subheader(card.title)
st.write(card.description)
else:
# Fallback for unknown types
st.write(f"Unsupported content type: {content['response_type']}")
# --- Handle User Input ---
if prompt := st.chat_input(
"Ask the LLM something (e.g., 'show me anime with high ratings', 'recommend me an anime')"
):
# 1. Add user message to chat history and display it
st.session_state.messages.append({"role": "user", "content": prompt})
with st.chat_message("user"):
st.write(prompt)
# 2. Get LLM response
with st.spinner("LLM is thinking..."):
response = get_llm_response(prompt)
try:
llm_response = response["structured_response"]
except KeyError as e:
llm_response = ResponseClass(
response_type="text",
response_data="Failed to return a structured output, query could not be executed successfully.",
)
# 3. Add LLM response to chat history and display it
st.session_state.messages.append({"role": "assistant", "content": llm_response})
with st.chat_message("assistant"):
# Check the type of content to display
if llm_response.response_type == "text":
st.write(llm_response.response_data)
elif llm_response.response_type == "dataframe":
# Convert the DataFrameDict to a pandas DataFrame
df = pd.DataFrame(json.loads(llm_response.response_data))
st.dataframe(df, use_container_width=True)
elif llm_response.response_type == "image":
for card in llm_response.response_data:
# Handle both dictionary and ImageCard object formats
if hasattr(card, "image_url"):
# It's an ImageCard object
st.image(card.image_url, caption=card.title)
st.subheader(card.title)
st.write(card.description)
else:
# It's a dictionary
st.image(card["image_url"], caption=card["title"])
st.subheader(card["title"])
st.write(card["description"])
else:
# Fallback for unknown types
st.error(f"Unsupported response type: {llm_response['response_type']}")
# --- Optional: Add a button to clear history ---
if st.sidebar.button("Clear Chat History"):
st.session_state.messages = []
st.rerun() # Rerun the app to reflect the cleared state
In the next section we are going to add memory to the agent, but first, let me explain what's going on with @st.cache_resource.

The @st.cache_resource decorator tells Streamlit to store the result of the decorated function (here, the system-message invocation; later, the agent and memory objects too). On subsequent runs of the script (e.g., when the user sends a new message), if the function is called again with the same inputs (which it is, in this case), Streamlit reuses the stored object instead of re-executing the function. Benefits of this include:
- Speed: Avoids the time cost of re-creating the agent or memory object.
- Cost: Avoids potentially re-running initialization code that might call external APIs (though the agent creation itself might be cheap, the principle applies).
- State Preservation: Ensures you're interacting with the same memory object and agent instance across user interactions within a session. We also cache the initial system message invocation (invoke_system_message) to avoid sending it repeatedly.
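As a minimal, generic sketch of the pattern (not tied to this app), the decorated factory only runs once; later calls within the session return the same cached object:

```python
import streamlit as st

@st.cache_resource
def get_expensive_object():
    print("Creating the object...")  # printed only on the first call
    return object()

a = get_expensive_object()
b = get_expensive_object()
assert a is b  # same cached instance on every rerun
```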
Adding memory to the agent
For the last section, we are going to add memory to the agent. We are going to be making use of a simple in-memory checkpointer that comes with LangGraph. See its documentation for more detail, including how to use different persistence backends (e.g., SQLite or Postgres).
In llm_utils.py, modify the create_llm_agent and get_model_response functions as follows:
# llm_utils.py
@st.cache_resource
def create_llm_agent(_memory):
model = init_chat_model("gpt-4o-mini", model_provider="openai")
return create_react_agent(
model=model,
tools=[read_pandas_dataframe],
response_format=ResponseClass,
checkpointer=_memory,
)
def get_model_response(llm, user_input: str, config):
user_message = HumanMessage(user_input)
messages = [user_message]
return llm.invoke({"messages": messages}, config)
We add an argument to the create_llm_agent function: _memory. The leading underscore tells @st.cache_resource not to hash this argument when deciding whether to reuse the cached result.
We also add a config parameter to get_model_response. This config parameter is used to handle multiple conversations at the same time in an application.
Config looks like this:
config = {"configurable": {"thread_id": "abc123"}}
The thread_id is a unique identifier for each conversation.
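Each thread_id maps to its own conversation history in the checkpointer, so different ids behave like completely independent chats. For example (reusing the agent and the HumanMessage import from llm_utils.py):

```python
config_a = {"configurable": {"thread_id": "user-a"}}
config_b = {"configurable": {"thread_id": "user-b"}}

# These two calls keep separate histories in the checkpointer
llm.invoke({"messages": [HumanMessage("Recommend a romance anime")]}, config_a)
llm.invoke({"messages": [HumanMessage("Recommend a horror anime")]}, config_b)
```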
And now, let's make the modifications to chat.py.
Make the following import:
from langgraph.checkpoint.memory import MemorySaver
Then add the following after st.set_page_config:
#chat.py
config = {"configurable": {"thread_id": "abc123"}}
@st.cache_resource
def get_memory():
return MemorySaver()
memory = get_memory()
We use @st.cache_resource to avoid recreating the memory object on subsequent runs.
Now pass memory as the argument when we call create_llm_agent:
llm = create_llm_agent(memory)
Next, we modify invoke_system_message to pass the config as an argument:
@st.cache_resource
def invoke_system_message():
system_message = get_system_message()
llm.invoke({"messages": [system_message]}, config) # passing in config
Lastly, in the get_llm_response function, we pass the config to get_model_response:
def get_llm_response(user_prompt):
"""
Calls the LLM with the user prompt and handles any exceptions.
"""
try:
response = get_model_response(llm, user_prompt, config) # Passing config
return response
except Exception as e:
# Return a basic error response in the expected format
return {"response_type": "text", "response_data": f"Error: {str(e)}"}
Try it out
The application is now fully functional and ready to be tried out.
Run the application with streamlit run chat.py
Here are some example prompts that I gave it:
Prompt: Hey, I'm looking to watch a romance anime, with around 12 episodes and with a bit of action. Do you have any suggestions?

Result: (screenshot of the app's recommendations)

Prompt: Give me more detail about the second title you suggested

Result: (screenshot of the app's response)

Prompt: Are there any other anime similar to Elfen Lied?

Result: (screenshot of the app's response)
Hmm, I think I'll watch Berserk😁.
Conclusion
I hope this article has helped you get a better understanding of how to use tools with agents. Being able to use tools with large language models is definitely a game changer, and it was pretty fun to figure out. Happy tooling, folks!