raphiki for Technology at Worldline

RAG Tutorial: Exploring AnythingLLM and Vector Admin

This tutorial is designed with a dual purpose in mind. First, it introduces two innovative open-source projects from Mintplex Labs: AnythingLLM, an enterprise-grade solution for building custom chatbots with support for the RAG pattern, and Vector Admin, a sophisticated admin GUI for managing multiple vectorstores.

Logos

The second aim of this tutorial is to guide you through the deployment of local models, specifically for text embedding and generation, as well as a vectorstore, all designed to integrate seamlessly with the aforementioned solutions. For this, we'll be utilizing LocalAI in conjunction with Chroma.

So, strap in and let's embark on this informative journey!

Installing the Chroma Vectorstore

Chroma Logo

The process begins with cloning the official repository and initiating the Docker container.

git clone https://github.com/chroma-core/chroma.git
cd chroma
docker compose up -d --build

To verify the availability of the vectorstore, we connect to its API documentation located at: http://localhost:8000/docs

Chroma OpenAPI

Using this API, we proceed to create a new collection, aptly named 'playground'.

curl -X 'POST' 'http://localhost:8000/api/v1/collections?tenant=default_tenant&database=default_database' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "name": "playground", "get_or_create": false }'

Following this, we check the result to ensure proper setup.

curl http://localhost:8000/api/v1/collections

[
  {
    "name": "playground",
    "id": "0072058d-9a5b-4b96-8693-c314657365c6",
    "metadata": {
      "hnsw:space": "cosine"
    },
    "tenant": "default_tenant",
    "database": "default_database"
  }
]
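Note the `"hnsw:space": "cosine"` metadata: the collection's HNSW index ranks vectors by cosine distance. A minimal pure-Python sketch of that metric (the function name is ours, for illustration only):

```python
import math

def cosine_distance(a: list, b: list) -> float:
    """Cosine distance as used by Chroma's HNSW index: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Vectors pointing in the same direction have distance 0;
# orthogonal vectors have distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Because cosine distance ignores vector magnitude, it compares only the direction of embeddings, which is what we want for semantic similarity.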

Implementation of LocalAI

LocalAI Logo

Next, our focus shifts to establishing the LocalAI Docker container.

git clone https://github.com/go-skynet/LocalAI
cd LocalAI
docker compose up -d --pull always

Once the container is operational, we embark on downloading, installing, and testing two specific models.

s-BERT Logo

Our first model is a sentence-transformers embedding model based on BERT: MiniLM-L6.

curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{ "id": "model-gallery@bert-embeddings" }'

curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{ "input": "The food was delicious and the waiter...",
        "model": "bert-embeddings" }'

{
  "created": 1702050873,
  "object": "list",
  "id": "b11eba4b-d65f-46e1-8b50-38d3251e3b52",
  "model": "bert-embeddings",
  "data": [
    {
      "embedding": [
        -0.043848168,
        0.067443006,
    ...
        0.03223838,
        0.013112408,
        0.06982294,
        -0.017132297,
        -0.05828256
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
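Because LocalAI exposes an OpenAI-compatible API, the same embeddings call can also be scripted. Here is a minimal sketch using only the Python standard library; the `embedding_payload` and `get_embedding` helpers are our own names, and the base URL assumes the LocalAI container started above:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # LocalAI endpoint used in this tutorial

def embedding_payload(text: str, model: str = "bert-embeddings") -> dict:
    """Request body for the OpenAI-compatible /v1/embeddings endpoint."""
    return {"input": text, "model": model}

def get_embedding(text: str) -> list:
    """POST the payload to LocalAI and return the embedding vector.

    Requires the LocalAI container from the previous step to be running.
    """
    req = urllib.request.Request(
        f"{BASE_URL}/v1/embeddings",
        data=json.dumps(embedding_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["data"][0]["embedding"]
```

The same pattern works for any OpenAI-compatible backend, which is precisely why AnythingLLM can swap between LocalAI, OpenAI, and Azure OpenAI with only a URL change.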

Zephyr Logo

Subsequently, we explore the LLM: Zephyr-7B-β from Hugging Face, a fine-tuned version of the Mistral 7B foundation model.

curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{ "id": "huggingface@thebloke__zephyr-7b-beta-gguf__zephyr-7b-beta.q4_k_s.gguf", 
        "name": "zephyr-7b-beta" }'

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "zephyr-7b-beta", 
        "messages": [{
          "role": "user", 
          "content": "Why is the Earth round?"}], 
        "temperature": 0.9 }'

{
  "created": 1702050808,
  "object": "chat.completion",
  "id": "67620f7e-0bc0-4402-9a21-878e4c4035ce",
  "model": "thebloke__zephyr-7b-beta-gguf__zephyr-7b-beta.q4_k_s.gguf",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "\nThe Earth appears round because it is
actually a spherical body. This shape is a result of the 
gravitational forces acting upon it from all directions. The force 
of gravity pulls matter towards the center of the Earth, causing 
it to become more compact and round in shape. Additionally, the 
Earth's rotation causes it to bulge slightly at the equator, 
further contributing to its roundness. While the Earth may appear 
flat from a distance, up close it is clear that our planet is 
indeed round."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Deploying and Setting Up AnythingLLM

AnythingLLM Logo

Having successfully installed the models and vectorstore, our next step is to deploy the AnythingLLM application. For this, we will utilize the official Docker image provided by Mintplex Labs.

docker pull mintplexlabs/anythingllm:master
docker run -p 3001:3001 mintplexlabs/anythingllm:master

We access the application by navigating to http://localhost:3001, where we can begin the configuration process using the intuitive GUI.

In the configuration, we opt for the LocalAI backend, accessible via the http://172.17.0.1:8080/v1 URL, and integrate the Zephyr model. It's noteworthy that AnythingLLM also supports other backends such as OpenAI, Azure OpenAI, Anthropic Claude 2, and the locally available LM Studio.

AnythingLLM LLM Preference

Following this, we align our embedding model with the same LocalAI backend, ensuring a cohesive system.

AnythingLLM Embedding Preference

Next, we select the Chroma vector database, using the URL http://172.17.0.1:8000. It’s important to mention that AnythingLLM is also compatible with other vectorstores such as Pinecone, QDrant, Weaviate, and LanceDB.

Vector Database

Customization options for AnythingLLM include the possibility of adding a logo to personalize the instance. However, for the sake of simplicity in this tutorial, we will skip this step. Similarly, while there are options for configuring user and rights management, we will proceed with a streamlined, single-user setup.

We then proceed to create a workspace, aptly named "Playground," reflecting the name of our earlier Chroma collection.

Create Workspace

The AnythingLLM start page is designed to offer initial instructions to the user in a chat-like interface, with the flexibility to tailor this content to specific needs.

Start page

From our "Playground" workspace, we can upload documents, further expanding the capabilities of our setup.

Upload Documents

We monitor the logs to confirm that AnythingLLM is effectively inserting the corresponding vectors into Chroma.

Adding new vectorized document into namespace playground
Chunks created from document: 4
Inserting vectorized chunks into Chroma collection.
Caching vectorized results of custom-documents/techsquad-3163747c-a2e1-459c-92e4-b9ec8a6de366.json to prevent duplicated embedding.
Adding new vectorized document into namespace playground
Chunks created from document: 8
Inserting vectorized chunks into Chroma collection.
Caching vectorized results of custom-documents/techsquad-f8dfa1c0-82d3-48c3-bac4-ceb2693a0fa8.json to prevent duplicated embedding.
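As the logs show, each document is split into chunks before embedding. A naive fixed-size chunker with overlap illustrates the idea; the sizes below are arbitrary, and AnythingLLM's actual splitter is more sophisticated:

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list:
    """Split text into fixed-size character chunks, overlapping adjacent
    chunks so that sentences cut at a boundary still appear whole in one chunk."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

# A 2500-character document yields 3 chunks with these settings.
print(len(chunk_text("x" * 2500, size=1000, overlap=100)))  # 3
```

Each chunk is then embedded individually and inserted into the Chroma collection, which is why the logs report a chunk count per document.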

This functionality enables us to engage in interactive dialogues with the documents.

AnythingLLM Chat
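Under the hood, a RAG chat turn boils down to stuffing the retrieved chunks into the prompt sent to the LLM. A simplified sketch of that assembly step (this is not AnythingLLM's actual code; the prompt wording and sample chunks are illustrative):

```python
def build_rag_messages(question: str, chunks: list) -> list:
    """Assemble an OpenAI-style message list from retrieved document chunks."""
    context = "\n\n".join(chunks)
    system = (
        "Answer the user's question using only the context below.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Hypothetical retrieved chunks feeding a user question:
messages = build_rag_messages(
    "What does the document describe?",
    ["First retrieved chunk.", "Second retrieved chunk."],
)
```

The resulting message list can be posted to the same `/v1/chat/completions` endpoint we tested earlier, which is the essence of the RAG pattern: retrieval selects the context, and the LLM only has to answer within it.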

An interesting feature of AnythingLLM is its ability to display the content that forms the basis of its responses.

AnythingLLM Citation

In conclusion, AnythingLLM and each workspace within it offer a range of configurable parameters. These include the system prompt, response temperature, chat history, and the threshold for document similarity, among others, allowing for a customized and efficient user experience.
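The document-similarity threshold mentioned above works by discarding retrieved chunks whose score falls below a cutoff, so low-relevance text never reaches the prompt. A minimal sketch (the function and the sample scores are illustrative, not AnythingLLM's implementation):

```python
def filter_by_similarity(results: list, threshold: float = 0.25) -> list:
    """Keep only (chunk, score) pairs whose similarity meets the threshold."""
    return [(chunk, score) for chunk, score in results if score >= threshold]

# Hypothetical retrieval results: (chunk text, similarity score)
hits = [("relevant chunk", 0.82), ("borderline chunk", 0.30), ("noise", 0.05)]
print(filter_by_similarity(hits))  # drops the low-scoring chunk
```

Raising the threshold makes answers stricter but risks dropping useful context; lowering it does the opposite, which is why it is exposed as a per-workspace setting.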

AnythingLLM Settings

Workspace Settings

Installing and Configuring Vector Admin

Vector Admin Logo

To complete our architecture, we now focus on installing the Vector Admin GUI, which serves as a powerful tool for visualizing and managing the vectors stored by AnythingLLM in Chroma.

The installation process involves utilizing Docker containers provided by Mintplex Labs: one for the Vector Admin application and another for a PostgreSQL database, which stores the application's configuration and chat history.

git clone https://github.com/Mintplex-Labs/vector-admin.git
cd vector-admin/docker/
cp .env.example .env

We modify the .env file to adjust the server port from 3001 to 3002, avoiding a conflict with the port already in use by AnythingLLM. On Linux systems, it is also necessary to set the default Docker gateway IP address for the PostgreSQL connection string.

SERVER_PORT=3002
DATABASE_CONNECTION_STRING="postgresql://vectoradmin:password@172.17.0.1:5433/vdbms"

Additionally, we configure the SYS_EMAIL and SYS_PASSWORD variables to define credentials for the first GUI connection.

Given the change in the default port, we also reflect this modification in both the docker-compose.yaml and Dockerfile.

After configuring the backend, we turn our attention to the frontend installation.

cd ../frontend/
cp .env.example .env.production

In the .env.production file, we update the port to align with the Docker gateway.

GENERATE_SOURCEMAP=false
VITE_API_BASE="http://172.17.0.1:3002/api"

With these settings in place, we build and launch the Docker containers.

docker compose up -d --build vector-admin

Accessing the GUI is straightforward via http://localhost:3002. The initial connection utilizes the SYS_EMAIL and SYS_PASSWORD values specified in the .env file. These credentials are only required for the first login to create a primary admin user from the GUI and start configuring the tool.

The first step in the GUI is to create an organization, followed by establishing a Vector Database Connection. For the database type, we select Chroma, although Pinecone, QDrant, and Weaviate are also compatible options.

Vector Database Connection

After synchronizing workspace data, the documents and vectors stored in the "playground" collection within Chroma become visible.

Workspace Data

Details of these vectors are also accessible for in-depth analysis.

Vector Details

A note on functionality: editing vector content directly via Vector Admin currently relies on OpenAI's embedding model. Since we opted for s-BERT MiniLM, this capability is not available. Had we chosen OpenAI's model, uploading new documents and embedding vectors directly into Chroma would have been possible.

Vector Admin also boasts additional features like user management and advanced tools, including automatic drift detection in similarity searches, upcoming snapshots, and migration capabilities between organizations (and, by extension, vectorstores).

This tool is particularly admirable for its capacity to grant full control over vectors, simplifying their management considerably.


That concludes our exploration for today. As demonstrated, Mintplex Labs' tools, AnythingLLM and Vector Admin, facilitate the straightforward setup of the RAG pattern, empowering users to interact with documents conversationally. These projects are actively evolving, with new features on the horizon. Therefore, it is worthwhile to regularly check their roadmap and begin leveraging these tools to engage with your files interactively.
