DEV Community: Ashish Abraham

AI-Powered Social Search Using Qdrant DB

Ashish Abraham — Fri, 29 Dec 2023 05:38:57 +0000

In the digital age where data is the new oil, search engines have become the compass guiding us through an ocean of information. Google has long pursued its mission “to organize the world’s information and make it universally accessible and useful”. As the internet flooded the globe with data, people have increasingly wanted to focus on the subjects they care about and filter out everything else. From planning a holiday to self-diagnosing an illness, search plays a pivotal role in the everyday decisions people make.

With the advent of machine learning in this domain, search has evolved considerably. Keyword matching is increasingly complemented or replaced by neural embeddings, which improve relevance by capturing the context of a query rather than its exact words. People rely less on carefully chosen keywords and search tricks and more on natural-language queries and conversations. Instead of typing “Travel destinations in Thailand”, users now ask “Plan me a 3-day tour in Thailand”. Neural embeddings have also enabled search methods such as audio and visual search that would otherwise be difficult. Machines can connect an image with a word because both can be embedded in a shared vector space.

Social Search

Social media has seen a major part of this search boom as people wanted to tap into the latest trends, topics and updates from their friends, influencers and brands. Facebook came up with advanced algorithms for users to get real-time updates and personalized feed. Instagram came up with hashtags for people to stay close to topics of interest which later went to all other platforms including Twitter. To strike gold in the business, it was imperative to serve users with tailored content in their searches. As social media expanded along with the internet, advances in AI became more and more important to build better recommendation and search engines.

Qdrant

Qdrant is an open-source vector database and similarity search engine written in Rust. It adds vector search, matching, and recommendation functionality to your application through a straightforward API, which makes it well suited to building recommendation engines and fast semantic search. Client libraries are available for Python and several other languages, so you can work with Qdrant directly from your code. It can run locally or in the cloud, scales with your data, and supports a broad range of data types.

Prerequisites

The tutorial was tested with the following Python libraries. Use the pinned versions where they are given.

transformers: 4.36.1
qdrant-client: 1.7.0
sentence_transformers: 2.2.2
torch: 2.1.2
pandas
streamlit

Set Up Qdrant

Make sure you have docker installed and keep the docker engine running if you are using a local environment. Qdrant can be installed by downloading its docker image.

! docker pull qdrant/qdrant

Run the Qdrant docker container using the command.

! docker run -p 6333:6333 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant

Alternatively, you can start the container from the docker desktop console.

The Qdrant client can connect only once the container is running.

In this tutorial, we will be building a search engine designed to sift through tweets from the FIFA World Cup Qatar 2022.

Populating the Database

I will be fetching the tweets from a dataset in Kaggle. You can also use Twitter API to fetch the tweets in real time. Find the dataset here.

Import the dataset as a Pandas dataframe and isolate only the required columns.

import pandas as pd
import numpy as np


data=pd.read_csv("tweets1.csv")
data = data[["Tweet Id", "Tweet Content", "Username"]]
data

Import the required libraries to create the database. We will begin by creating a collection. In the context of a vector database, a collection is the designated storage area for vector embeddings, documents, and any associated metadata. It’s analogous to a table in a traditional relational database. These collections are specifically engineered to manage data where each entry is depicted as a vector in a multi-dimensional space.

from qdrant_client import QdrantClient
from qdrant_client.http import models
qdrant_client = QdrantClient(host="localhost", port=6333)
collection_name = "FIFA-22"
qdrant_client.recreate_collection(
    collection_name=collection_name,
    vectors_config=models.VectorParams(size=768, distance=models.Distance.COSINE)
)

We determine the dimensions of the vectors within this collection. In this case, each vector has 768 dimensions, mirroring the 768-dimensional output of the model we’re employing.
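A mismatch between the collection's configured size and the length of an embedding is an easy mistake to make when swapping models. The following is a minimal sketch of a guard against it, assuming the 768 used above; the helper name `assert_dimension` is hypothetical and the toy vectors stand in for real model output.

```python
from typing import Sequence

VECTOR_SIZE = 768  # assumed to equal the `size` passed to VectorParams


def assert_dimension(vector: Sequence[float], expected: int = VECTOR_SIZE) -> None:
    """Raise ValueError if a vector's length differs from the collection size."""
    if len(vector) != expected:
        raise ValueError(
            f"expected a {expected}-dimensional vector, got {len(vector)}"
        )


# Toy vectors standing in for real embeddings.
assert_dimension([0.0] * 768)  # correct size, passes silently
try:
    assert_dimension([0.0] * 384)  # wrong size for this collection
except ValueError as err:
    print(err)
```

With sentence-transformers, the expected size can also be read from the loaded model via `get_sentence_embedding_dimension()` instead of being hard-coded.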

Create Embeddings

Before storing our data in a vector database, we must convert it into embeddings. Embeddings are numerical vector representations of data that place each item in a multidimensional space while preserving its context and meaning.
The embedding model we will be using here is the all-mpnet-base-v2. It is a high-performance embedding model that maps text to a 768-dimensional vector space. This model is especially beneficial for text data, as it can perform functions such as semantic search, clustering, and assessing sentence similarity, thus offering superior quality embeddings for text data.

from sentence_transformers import SentenceTransformer


model = SentenceTransformer('all-mpnet-base-v2')

Add to Database

Qdrant stores the vectors as points, which have the following structure.

id: The ID of the document object.
vector: The embedding vector of the document object.
payload: The original content of the document object.

Iterate through the dataset and convert each tweet to an embedding.

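points_list = []
for i, (_, row) in enumerate(data.iterrows()):
    # Encode the tweet text into a 768-dimensional embedding
    embedding = model.encode(row['Tweet Content'])
    point = models.PointStruct(
        id=i,  # sequential integer ID for the point
        vector=embedding.tolist(),  # plain list of floats instead of a NumPy array
        payload={"tweet": row['Tweet Content'], "user": row['Username']},
    )
    points_list.append(point)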

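Check the data types of id, vector, and payload before pushing the points into the database: the id should be an integer (or a UUID string), the vector a list of floats, and the payload a JSON-serializable dictionary. If they are not right, Qdrant may report an invalid JSON response error. Store the extracted points in the vector database.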

qdrant_client.upsert(collection_name=collection_name, points=points_list)
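The data type check mentioned above can be made explicit before the upsert. This is a minimal sketch, not part of the Qdrant client: the helper `to_point` is hypothetical, it only handles integer IDs as used in this tutorial, and the sample tweet is made up for illustration.

```python
import json
from typing import Any, Dict, Iterable


def to_point(point_id: int, vector: Iterable[float], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a point whose fields are plain, JSON-serializable Python types."""
    # bool is a subclass of int, so it is rejected explicitly.
    if not isinstance(point_id, int) or isinstance(point_id, bool) or point_id < 0:
        raise TypeError("id must be a non-negative int")
    if not isinstance(payload, dict):
        raise TypeError("payload must be a dict")
    # float() also converts NumPy scalars, so an array becomes a list of floats.
    return {"id": point_id, "vector": [float(x) for x in vector], "payload": payload}


# Toy values standing in for a real embedding and tweet.
point = to_point(0, (0.1, 0.2, 0.3), {"tweet": "What a final!", "user": "fan01"})
json.dumps(point)  # succeeds because every field is a plain Python type
```

Running each point through a check like this surfaces type problems at the offending row instead of as a single opaque error from the server.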

Search Engine

Now, as the database is created, let’s build the search engine. Import the required libraries for the search engine and the UI.

import streamlit as st
import qdrant_client
from sentence_transformers import SentenceTransformer

Define the collection we will be searching for in the database. Use the collection we just created above.

# Connect to Qdrant
qdrant_client = qdrant_client.QdrantClient(host="localhost", port=6333)
collection_name = "FIFA-22"

Define the embedding model for converting the queries.

model = SentenceTransformer('all-mpnet-base-v2')

The query in the search bar will be converted to embeddings using the same model we used to embed the data.

query_vector = model.encode(search_query, convert_to_tensor=False)

Qdrant offers a range of fast, efficient APIs for search and retrieval. Use the client's search function to query the embeddings. It returns the stored points whose vectors are most similar to the query vector, scored with the distance metric configured for the collection, which is cosine similarity here.

results = qdrant_client.search(
      collection_name=collection_name,
      query_vector=query_vector.tolist(), 
      limit=3 
)

The parameter limit is set to 3 — to show only the top 3 matching results. The query embedding is converted to a Python list before using it in the search function.
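To make the ranking concrete, here is a minimal pure-Python sketch of a cosine-similarity top-k search over a handful of toy 2-dimensional vectors. It is an illustration only: Qdrant typically answers such queries through an index rather than the exhaustive loop shown here, and the function names and vectors are made up. Vectors are assumed to be non-zero.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    vectors: Dict[int, Sequence[float]],
    limit: int = 3,
) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and keep the best `limit`."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:limit]


# Toy point IDs mapped to toy vectors.
toy_vectors = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [-1.0, 0.0]}
ranked = top_k([1.0, 0.1], toy_vectors, limit=3)
print([pid for pid, _ in ranked])  # [0, 2, 1]
```

The point pointing in almost the same direction as the query ranks first, and the point pointing the opposite way is dropped by the limit.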

Combine the search engine into a simple UI using the Streamlit library in Python. Feel free to use your own approaches to display the search bar, results, and other elements in the UI.

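import streamlit as st
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

# Connect to Qdrant and load the same embedding model used at ingestion time
qdrant_client = QdrantClient(host="localhost", port=6333)
collection_name = "FIFA-22"
model = SentenceTransformer('all-mpnet-base-v2')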


st.title("Qdrant Search Engine")
st.subheader("Search through the voices from FIFA World Cup 2022 in Twitter ⚽🏆✨")


search_query = st.text_input("Enter your search query:")


if search_query:
    try:


        query_vector = model.encode(search_query, convert_to_tensor=False)


        results = qdrant_client.search(
            collection_name=collection_name,
            query_vector=query_vector.tolist(), 
            limit=3 
        )


        # Display results
        st.write("Search Results:")
        for result in results:
            st.write(f"- Point ID: {result.id}")
            st.write(f"- Tweet Content: {result.payload['tweet']}")


    except Exception as e:
        st.error(f"Error occurred during search: {e}")

Run the file using the command.

! streamlit run filename.py

Here are some results.

Wrapping Up

You have learned how to build a simple search engine using the Qdrant vector database. The powerful APIs allow for efficient and effective searching through large amounts of data. I encourage you to try out other functions and algorithms offered by the Qdrant discovery API for optimizing the search results. Hope you found it useful! Feel free to share your thoughts and feedback.😊
👉 LinkedIn
✏️ Check out my blog on Medium

References

Qdrant Documentation
Sentence Transformers
Streamlit Documentation

Finding Answers in Complex Standardization Documents Using Qdrant

Ashish Abraham — Sat, 23 Dec 2023 18:32:10 +0000

A recent survey by cnvrg.io found that 44% of up-and-coming firms and startups consider generative AI indispensable to their growth. At present, owing to their constraints, many of them opt for LLMaaS (LLM as a Service), which means purchasing AI technology from top-tier companies. Although many lack the infrastructure and expertise to construct their own LLMs, the report forecasts that 48% of these firms will adopt a build approach in the near future, using open-source resources and customizing them into models for specific applications. Large language models have become an essential part of enterprise solutions in a remarkably short time.

RAG

Despite the huge potential and value of LLMs, using them straight out of the box is not always advisable. When we query an LLM about a specialized topic with ordinary prompts, it may give irrelevant or incorrect responses, because its training data is broad and it lacks precise knowledge of that topic. Confident but unfounded answers of this kind are called “hallucinations”, and they can usually be reduced by giving the LLM relevant context along with the question. The model then draws its conclusions from that content, so its responses stay relevant to the context and closer to what the user expects. This is what we call Retrieval Augmented Generation (RAG).
Standardization documents such as the ISO or IEEE standards are a typical case. They are long and complex, and reading through them to find one specific piece of information is nearly impossible. That is where RAG can help. RAG’s capabilities are fundamentally rooted in the retrieval process, which lets the model reach beyond its pre-trained data and access a large pool of information that is current or specific to the context.

Qdrant

Qdrant, a vector database and search engine crafted in Rust, is an open-source platform. It serves as a comprehensive solution for developers aiming to incorporate vector similarity search, matching, and recommendations into their applications. Its user-friendly nature for recommendation systems and searching sets it apart from other vector databases, mainly due to its extensive API functionalities. In addition, it offers pre-built client libraries for Python and several other programming languages. Qdrant is highly scalable with cloud compatibility and accommodates a broad spectrum of data types.

In this tutorial, we will explore a question-answering RAG system for such documents using the Qdrant vector database.

Table of Contents

The System Workflow
Prerequisites
Set Up Qdrant
Creating the Knowledge Base
Retrieving Context
Generating Responses
Wrapping Up
References

The System Workflow

The workflow of the Qdrant-based RAG system will be as follows:

Obtain the user’s question.
Transform the user’s question into a semantically equivalent vector representation using an embedding model.
Use Qdrant’s built-in functions to calculate the similarity between the query vector and the content vectors in the database, and fetch the top-k related content.
Use the retrieved context and the user’s question as input for an LLM.
The LLM generates the appropriate response.
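The steps above can be sketched as one function that receives the embedding, search, and generation steps as parameters. This is only an outline under stated assumptions: `answer_question` and the three `fake_*` stand-ins are hypothetical names, and the real components are built in the sections that follow.

```python
from typing import Callable, List, Sequence


def answer_question(
    question: str,
    embed: Callable[[str], Sequence[float]],
    search: Callable[[Sequence[float], int], List[str]],
    generate: Callable[[str, List[str]], str],
    top_k: int = 3,
) -> str:
    """Run the workflow: embed the question, retrieve context, generate."""
    query_vector = embed(question)          # question -> vector
    context = search(query_vector, top_k)   # vector -> top-k related chunks
    return generate(question, context)      # question + context -> response


# Toy stand-ins for the embedding model, the Qdrant search, and the LLM.
def fake_embed(text: str) -> List[float]:
    return [float(len(text))]


def fake_search(vector: Sequence[float], k: int) -> List[str]:
    return ["chunk A", "chunk B", "chunk C"][:k]


def fake_generate(question: str, context: List[str]) -> str:
    return f"{question} -> {len(context)} chunks"


print(answer_question("What is tested?", fake_embed, fake_search, fake_generate, top_k=2))
```

Passing the steps in as callables keeps the outline independent of any particular model or database client.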

Prerequisites

The directory structure for the project is as shown.

.
├── app.py
├── docs
│ ├── ASM-standards.pdf
│ ├── ASM.pdf
├── .env
├── ingest.py
└── requirements.txt

The tutorial was tested with the following Python libraries. Use the pinned versions where they are given, and install them using requirements.txt.

transformers: 4.36.1
qdrant-client: 1.7.0
langchain: 0.0.350
sentence_transformers: 2.2.2
huggingface_hub: 0.19.4
PyPDF2
protobuf: 4.25.1
torch: 2.1.2

I’ve utilized large language models that operate efficiently on a CPU with decent performance and a minimum of 8GB RAM. However, superior specifications are recommended. If you’re considering using other large language models, a cloud-based environment might be necessary.

Set Up Qdrant

Make sure you have docker installed and keep the docker engine running if you are using a local environment. Qdrant can be installed by downloading its docker image.

!docker pull qdrant/qdrant

Run the Qdrant docker container using the command.

!docker run -p 6333:6333 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant

Alternatively, you can start the container from the docker desktop console.

The Qdrant client in the Python files can connect only once the container is running.

Creating the Knowledge Base

LLMs work on encoded vector representations of real-world content called embeddings. So, before we use or store the data from the document, we must convert them into embeddings. The retrieval knowledge base of the RAG system will be the vector database and it will store all content we have in the form of embeddings. We will use the ASM standardization document for the demo.
Import required libraries.

ingest.py

import os
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import DirectoryLoader
from langchain.document_loaders import PyPDFLoader
from qdrant_client import QdrantClient
from qdrant_client.http import models
import torch
from sentence_transformers import SentenceTransformer

We’ll begin by creating a collection. In database terms, a collection is a cluster of data, with each individual piece referred to as a document. We define the dimensions of the vectors within the collection. For this scenario, each vector has 384 dimensions, matching the 384-dimensional output of the embedding model we’re using.

ingest.py

qdrant_client = QdrantClient(host='localhost', port=6333)
my_collection = "ASM"
qdrant_client.recreate_collection(
    collection_name=my_collection,
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE)
)

Extract the text from the document using LangChain’s PyPDFLoader. To enhance the search process, the document is divided into several sections, which helps retrieve only the most pertinent information. We use another LangChain tool, RecursiveCharacterTextSplitter, to break the document into segments of up to 700 characters, with consecutive segments overlapping by up to 50 characters.

ingest.py

loader = PyPDFLoader("docs/ASM-standards.pdf")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)
texts = text_splitter.split_documents(documents)
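To illustrate what chunk size and overlap mean, here is a simplified fixed-window splitter in plain Python. It is not how RecursiveCharacterTextSplitter works internally: that class prefers to break on separators such as paragraph and line boundaries, so its chunks will differ. The function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 700, chunk_overlap: int = 50) -> List[str]:
    """Cut text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "x" * 1500  # toy text standing in for a PDF page
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])  # [700, 700, 200]
```

The overlap means a sentence that straddles a chunk boundary is likely to appear whole in at least one chunk, at the cost of storing some text twice.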

Define the embedding model. Here we are using all-MiniLM-L6-v2, a compact sentence-transformers model designed for efficient text encoding, which makes it well suited to generating dense vector representations of documents or text segments. It maps text to a 384-dimensional vector space.

ingest.py

model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

For each document object from texts, create the embedding vector. Qdrant stores the vectors as points which have the following structure.

id: The ID of the document object.
vector: The embedding vector of the document object.
payload: The original content of the document object.

ingest.py

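points_list = []
for i, text in enumerate(texts, start=1):
    # Encode the chunk text into a 384-dimensional embedding
    embedding = model.encode(text.page_content)
    point = models.PointStruct(
        id=i,  # sequential integer IDs starting at 1
        vector=embedding.tolist(),  # plain list of floats instead of a NumPy array
        payload={"text": text.page_content},
    )
    points_list.append(point)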

Store the extracted points to the vector database.

ingest.py

qdrant_client.upsert(collection_name=my_collection, points=points_list)
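For a long document, sending every point in one request may be slow or exceed request size limits. One option, sketched here as an assumption rather than a requirement of Qdrant, is to upsert in batches. The helper `batched` and the batch size of 100 are hypothetical choices.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Toy list standing in for points_list.
sizes = [len(batch) for batch in batched(list(range(250)), batch_size=100)]
print(sizes)  # [100, 100, 50]
```

Each batch could then be passed to the same call used above, for example `qdrant_client.upsert(collection_name=my_collection, points=batch)` inside a `for batch in batched(points_list)` loop.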

Now run the file ingest.py.

Retrieving Context

As we discussed earlier, the document is long and contains a lot of standardization rules. If we want to know about the standards of only one industry or commodity, we need to extract only the related content from the database and give it to the model to get the desired response. The model used for question answering here is a version of BERT fine-tuned on SQuAD; it extracts the answer from the context it is given.
Import required libraries.

app.py

import torch
from transformers import pipeline
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.http import models
from typing import List

Retrieve the collection we just created from the database.

app.py

collection_name = "ASM"
qdrant_client = QdrantClient(host='localhost', port=6333)
collections = qdrant_client.get_collections()

Define the models we are going to utilize for the question-answering and the embedding function.

app.py


model_name = "bert-large-uncased-whole-word-masking-finetuned-squad"
embedding_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

Now create the task pipeline. Setting the task to question-answering configures the pipeline to take a question and a context as input and return an answer.

app.py

reader = pipeline("question-answering", model=model_name, tokenizer=model_name)

Now, define the function which fetches relevant context from the database.

app.py


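def get_context(query: str, top_k: int) -> List[List[str]]:
    """
    Get the relevant context from the database for a given query

    Args:
        query (str): What do we want to know?
        top_k (int): Top K results to return

    Returns:
        context (List[List[str]]): One single-element list per hit,
            holding the stored chunk text. Empty if the search fails.
    """
    try:
        # Embed the query with the same model used at ingestion time
        encoded_query = embedding_model.encode(query).tolist()

        # Fetch the top_k most similar points from the collection
        result = qdrant_client.search(
            collection_name=collection_name,
            query_vector=encoded_query,
            limit=top_k,
        )

        # Wrap each chunk in a list; get_response reads it as c[0]
        context = [[x.payload["text"]] for x in result]
        return context

    except Exception as e:
        print(e)
        return []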

The function encodes the query into an embedding and searches the whole database for similarity using the search function of Qdrant. This feature from Qdrant is very efficient in computing similarities between vectors and fetching similar contents. The context is retrieved from the database and returned by the function.

Generating Responses

Define the function to generate responses based on context and the query.

app.py


def get_response(query: str, context: List[List[str]]):
    """
    Extract the answer from the context for a given query

    Args:
        query (str): The question to answer
        context (List[List[str]]): Context chunks as returned by get_context
    """
    results = []
    for c in context:
        answer = reader(question=query, context=c[0])
        results.append(answer)

    results = sorted(results, key=lambda x: x["score"], reverse=True)
    for i in range(len(results)):
        print(f"{i+1}", end=" ")
        print(
            "Answer: ",
            results[i]["answer"],
            "\n  score: ",
            results[i]["score"],
        )

In this process, the pipeline is supplied with the query and context. The LLM utilizes these inputs to derive and structure the responses. The outcomes are then arranged according to the scores obtained from the reader model, with a higher score indicating a more pertinent response.
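If the application needs the answer as a value rather than printed text, the same score-based ranking can be wrapped in a function that returns the best result. This is a sketch under stated assumptions: `best_answer` and `min_score` are hypothetical, and `fake_reader` is a toy stand-in that mimics only the `answer` and `score` keys of the question-answering pipeline output.

```python
from typing import Callable, Dict, List, Optional


def best_answer(
    query: str,
    context: List[List[str]],
    reader: Callable[..., Dict],
    min_score: float = 0.0,
) -> Optional[Dict]:
    """Return the highest-scoring answer, or None if none reaches min_score."""
    answers = [reader(question=query, context=c[0]) for c in context]
    answers = [a for a in answers if a["score"] >= min_score]
    return max(answers, key=lambda a: a["score"], default=None)


# Toy reader: first word as the answer, score grows with context length.
def fake_reader(question: str, context: str) -> Dict:
    return {"answer": context.split()[0], "score": len(context) / 100}


toy_context = [["tension test required"], ["flattening"]]
print(best_answer("Which test?", toy_context, fake_reader)["answer"])  # tension
print(best_answer("Which test?", toy_context, fake_reader, min_score=0.5))  # None
```

A threshold such as `min_score` lets the caller say "no confident answer found" instead of returning a weak match.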

Here are some sample queries and responses.

app.py

query = "What is a mandated mechanical test for Welded Ferritic-Martensitic Stainless-Steel Pipe?"
context = get_context(query, top_k=1)
print("Context: {}\n".format(context))
get_response(query, context)

Output:

query = "What is the minimum size of calibration hole in the reference standard?"

Output:

You can also use LangChain prompt templates to create better prompts that would yield better responses. It can be structured like this.

"""Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.


Context: {context}
Question: {question}


Only return the helpful answer below and nothing else.
Helpful answer:
"""

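As a minimal sketch of how such a template could be filled, the snippet below uses plain `str.format` rather than LangChain's prompt classes. The helper `build_prompt` is hypothetical, and the resulting prompt is meant for a generative LLM; the extractive BERT reader used above takes the question and context directly and does not use a prompt.

```python
from typing import List

PROMPT_TEMPLATE = """Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: {context}
Question: {question}

Only return the helpful answer below and nothing else.
Helpful answer:
"""


def build_prompt(question: str, context: List[List[str]]) -> str:
    """Join the retrieved chunks and insert them into the template."""
    joined = "\n\n".join(chunk[0] for chunk in context)
    return PROMPT_TEMPLATE.format(context=joined, question=question)


# Toy chunks in the nested shape returned by get_context.
prompt = build_prompt("What is tested?", [["first chunk"], ["second chunk"]])
print(prompt)
```

Keeping the template as a module-level constant makes it easy to adjust the wording without touching the retrieval code.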
I’ll leave the exploration up to you. Don’t forget to delete the collection after use as it consumes a lot of resources when run locally.

qdrant_client.delete_collection(collection_name=collection_name)

Find the complete code here.

Wrapping Up

Congratulations! You have learned how to convert your documents to vector embeddings and store them in the Qdrant vector database. You have also seen how we can query the document easily using the RAG approach without having to go through the whole complex content in the standardization document. Try scaling this using advanced chains from LangChain and frameworks like LlamaIndex based on the Qdrant vector database. I hope you enjoyed this tutorial and found it useful. Thank you for reading and happy coding!

References

Qdrant Documentation
Sentence Transformers
GitHub
