
alisdairbr for Koyeb

Posted on • Originally published at koyeb.com

Building a LlamaIndex App with Streamlit to Query Custom Data

LlamaIndex is a data framework that makes it simple to build production-ready applications from your data using LLMs. Specifically, LlamaIndex specializes in context augmentation, a technique that provides custom data as context for queries to generalized LLMs, allowing you to inject your specific contextual information without the trouble and expense of fine-tuning a dedicated model.

In this guide, we will demonstrate how to build an application with LlamaIndex and Streamlit, a Python framework for building and serving data-based applications, and deploy it to Koyeb. The example web app allows users to ask questions about custom data. In our case, this custom text will be the story "The Gift of the Magi" by O. Henry.

You can deploy and preview the example application from this guide with our LlamaIndex One-Click app or by clicking the Deploy to Koyeb button below:

Deploy to Koyeb

Be sure to set the OPENAI_API_KEY environment variable during configuration. You can consult the repository on GitHub to find out more about the example application that this guide uses.

Requirements

To successfully follow and complete this guide, you need:

  • Python 3.11 installed on your local computer.
  • A GitHub account to host your LlamaIndex application.
  • A Koyeb account to deploy and run the application.
  • An OpenAI API key so that our application can send queries to OpenAI.

Steps

To complete this guide and deploy a LlamaIndex application, you'll need to follow these steps:

  1. Set up the project directory
  2. Install project dependencies and fetch custom data
  3. Create the LlamaIndex application
  4. Test the application
  5. Create a Dockerfile
  6. Publish the repository to GitHub
  7. Deploy to Koyeb

Set up the project directory

To get started, create and then move into a project directory that will hold the application and assets we will be creating:

mkdir example-llamaindex
cd example-llamaindex

Next, create and activate a new Python virtual environment for the project. This will isolate our project's dependencies from system packages to avoid conflicts and offer better reproducibility:

python -m venv venv
source venv/bin/activate

Your virtual environment should now be activated.
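
To confirm, you can check which interpreter the python command now resolves to. This is an optional sanity check, and the exact path will depend on where you created the project:

which python

The command should print a path inside the project's venv/ directory, for example one ending in example-llamaindex/venv/bin/python.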

Install project dependencies and fetch custom data

Now that we are working within a virtual environment, we can begin to install the packages our application will use and set up the project directory.

First, install the LlamaIndex and Streamlit packages so that we can use them to build the application. We can also take this opportunity to make sure that the local copy of pip is up-to-date:

pip install --upgrade pip llama-index streamlit

After installing the dependencies, record them in a requirements.txt file so that we can install the correct versions for this project at a later time:

pip freeze > requirements.txt

Next, we will download the story that we will be using as context for our LLM prompts. We can download a PDF copy of "The Gift of the Magi" by O. Henry from TSS Publishing, a platform for short fiction that hosts free short stories.

Create a data directory to hold the contextual data for our application and then download a copy of the story by typing:

mkdir data
curl -L https://theshortstory.co.uk/devsitegkl/wp-content/uploads/2015/06/Short-stories-O-Henry-The-Gift-of-the-Magi.pdf -o data/gift_of_the_magi.pdf

You should now have a PDF file that we can load into our application and attach as context to our LLM queries.
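
You can optionally confirm the download by listing the contents of the data directory:

ls -lh data/

The output should show gift_of_the_magi.pdf with a non-zero file size.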

Create the LlamaIndex application

We have everything in place to start writing our LlamaIndex application. Create an app.py file in your project directory and paste in the following content:

# app.py
import os.path
import streamlit as st

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)


# check if storage already exists
PERSIST_DIR = "./storage"
if not os.path.exists(PERSIST_DIR):
    # load the documents and create the index
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    # store it for later
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # load the existing index
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Define a simple Streamlit app
st.title('Ask Llama about "The Gift of the Magi"')
query = st.text_input("What would you like to ask? (source: data/gift_of_the_magi.pdf)", "What happens in the story?")

# If the 'Submit' button is clicked
if st.button("Submit"):
    if not query.strip():
        st.error("Please provide a search query.")
    else:
        try:
            response = query_engine.query(query)
            st.success(response)
        except Exception as e:
            st.error(f"An error occurred: {e}")

Let's go over what the application is doing.

It begins by importing the basic packages and modules it will use to create the application. This includes Streamlit (aliased as st) as well as functionality from LlamaIndex for reading documents from a directory, building vector indexes, and persisting and loading index storage.

Next, we set up some semi-persistent storage for the index files. This logic helps us avoid rebuilding the index from our data document every time we run the application by persisting the index information to a storage directory the first time the application runs. On subsequent runs, the application can load the index data directly from the storage directory to improve performance.
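
Note that the cached index does not update automatically if you change the files in the data directory. A simple way to force a rebuild (an optional housekeeping step) is to delete the storage directory before rerunning the application:

rm -rf storage/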

After the index is created or loaded, we create a query engine based on it and begin constructing the application frontend with Streamlit. We add an input field whose contents will be submitted as our query and then display the results upon submission.
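
The query engine here uses LlamaIndex's default settings. As a minimal, optional sketch, you could tune how much context is retrieved per query by passing retriever options when creating the engine; the value below is just an example:

# Retrieve the 3 most similar document chunks as context for each query
query_engine = index.as_query_engine(similarity_top_k=3)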

Test the application

We can now test that the application works as expected on our local machine.

First, set and export the OPENAI_API_KEY environment variable using your OpenAI API key as the value:

export OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"

Next, run the application by typing:

streamlit run app.py

This will start the application server. Navigate to http://127.0.0.1:8501 in your web browser to view the page prompting for your questions about "The Gift of the Magi". You can submit the default query or ask any other questions you have about the story.

When you are finished, press CTRL-C to stop the server.

Create a Dockerfile

During the deployment process, Koyeb will build our project from a Dockerfile. This gives us more control over the version of Python, the build process, and the runtime environment. The next step is to create this Dockerfile describing how to build and run the project.

Before we begin, create a basic .dockerignore file. This will define files and artifacts that we don't want to copy over into the Docker image. In this case, we want to avoid copying the venv/ and storage/ directories since the image will install dependencies and manage cached index files at runtime:

# .dockerignore
storage/
venv/

Next, create a Dockerfile with the following contents:

# Dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install --requirement requirements.txt && pip cache purge

ARG PORT
EXPOSE ${PORT:-8000}

CMD streamlit run --server.port ${PORT:-8000} app.py

The image uses the 3.11-slim tag of the official Python Docker image as its starting point. It defines a directory called /app as the working directory and copies all of the project files into it. Afterwards, it installs the dependencies from the requirements.txt file.

The PORT environment variable is also defined as a build variable. This allows us to set it at build time to adjust the port that the image will listen on. We use this value in the EXPOSE instruction and again in the main streamlit command we run with the CMD instruction. Both values will use port 8000 as a fallback if PORT is not defined explicitly.
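
If you would like to verify the image locally before deploying (an optional step that assumes Docker is installed on your machine), you can build and run it with an example port:

docker build --build-arg PORT=9000 -t example-llamaindex .
docker run -e PORT=9000 -e OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>" -p 9000:9000 example-llamaindex

The application should then be reachable at http://127.0.0.1:9000 in your browser.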

Publish the repository to GitHub

The application is almost ready to deploy. We just need to commit the changes to Git and push the repository to GitHub.

In the project directory, initialize a new Git repository by running the following command:

git init

Next, download a generic .gitignore file for Python projects from GitHub:

curl -L https://raw.githubusercontent.com/github/gitignore/main/Python.gitignore -o .gitignore

Add the storage/ directory to the .gitignore file so that Git ignores the cached vector index files:

echo "storage/" >> .gitignore

You can also optionally add the Python runtime version to a runtime.txt file if you want to try to build the repository from the Python buildpack instead of the Dockerfile:

echo "python-3.11.8" > runtime.txt

Next, add the project files to the staging area and commit them. If you don't have an existing GitHub repository to push the code to, create a new one and run the following commands to commit and push changes to your GitHub repository:

git add :/
git commit -m "Initial commit"
git remote add origin git@github.com:<YOUR_GITHUB_USERNAME>/<YOUR_REPOSITORY_NAME>.git
git branch -M main
git push -u origin main

Note: Make sure to replace <YOUR_GITHUB_USERNAME>/<YOUR_REPOSITORY_NAME> with your GitHub username and repository name.

Deploy to Koyeb

Once the repository is pushed to GitHub, you can deploy the LlamaIndex application to Koyeb. Any changes in the deployed branch of your codebase will automatically trigger a redeploy on Koyeb, ensuring that your application is always up-to-date.

To get started, open the Koyeb control panel and complete the following steps:

  1. Click Create App in the Koyeb control panel.
  2. Select GitHub as the deployment option.
  3. Choose the GitHub repository and branch containing your application code. Alternatively, you can enter our public LlamaIndex example repository into the Public GitHub repository field at the bottom of the page: https://github.com/koyeb/example-llamaindex.
  4. Name your service, for example llamaindex-service.
  5. Choose Dockerfile as the builder for your project.
  6. Choose an Instance of size micro or larger.
  7. Expand the Advanced section and click Add Variable to configure a new environment variable. Create a variable called OPENAI_API_KEY. Select the Secret type and choose Create secret as the value. In the form that appears, create a new secret containing your OpenAI API key.
  8. Name the App, for example example-llamaindex.
  9. Click the Deploy button.

Koyeb will clone the GitHub repository and use the Dockerfile to build a new container image for the project. Once the build is complete, a container will be started from the image to run your application.

Once the deployment is healthy, visit your Koyeb Service's subdomain (you can find this on your Service's detail page). It will have the following format:

https://<YOUR_APP_NAME>-<KOYEB_ORG_NAME>.koyeb.app

You should see your LlamaIndex application's prompt, allowing you to ask questions about the story and get responses from the OpenAI API.
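
If you prefer the command line, an optional way to confirm the Service is responding is to request the page and print only the HTTP status code, which should be 200:

curl -s -o /dev/null -w "%{http_code}\n" https://<YOUR_APP_NAME>-<KOYEB_ORG_NAME>.koyeb.app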

Conclusion

In this guide, we discussed how to build and deploy an LLM-based web app to Koyeb using LlamaIndex and Streamlit. The application loads a story from a PDF on disk and sends this as additional context when submitting user-supplied queries. This allows you to customize the focus of the query without having to fine-tune a model for the purpose.

This tutorial demonstrates a very simple implementation of these technologies. To learn more about how LlamaIndex can help you use LLMs to answer questions about your own data, take a look at the LlamaIndex documentation.
