<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Niranjan Akella</title>
    <description>The latest articles on DEV Community by Niranjan Akella (@niranjanakella).</description>
    <link>https://dev.to/niranjanakella</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1243229%2F9ef598f5-02b2-406e-b2b0-7a7f52506c97.png</url>
      <title>DEV Community: Niranjan Akella</title>
      <link>https://dev.to/niranjanakella</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/niranjanakella"/>
    <language>en</language>
    <item>
      <title>Building a Reverse Image Search Engine Using Qdrant Vector Search</title>
      <dc:creator>Niranjan Akella</dc:creator>
      <pubDate>Tue, 30 Jan 2024 02:14:52 +0000</pubDate>
      <link>https://dev.to/niranjanakella/building-a-reverse-image-search-engine-using-qdrant-vector-search-2gba</link>
      <guid>https://dev.to/niranjanakella/building-a-reverse-image-search-engine-using-qdrant-vector-search-2gba</guid>
<description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vptnxa8bvz2h5bkh0ve.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vptnxa8bvz2h5bkh0ve.gif" alt="Reverse Image Search Engine GIF created by author." width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;I know… It’s cool, right? It’s like you have cracked Google!&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzy0nv8yiy1tay1oqj86.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzy0nv8yiy1tay1oqj86.gif" alt="Howwwww......." width="498" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, you are going to build your very own personalized reverse image search engine using an open-source image-embedding model and &lt;a href="https://qdrant.tech/"&gt;Qdrant vector database&lt;/a&gt;, which has been gaining serious attention in the AI community lately. &lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Buckle up, folks! Let’s begin.&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Most of us have used Google Images at least once, whether to search for similar images, identify an image’s source, or find a higher-resolution version by uploading an image instead of typing a text query.&lt;/p&gt;

&lt;p&gt;But have you ever wondered how Google Images is able to comprehend a given image and bring together all the similar images present on the Internet? &lt;br&gt;
Or, how does it know that all these images are related? &lt;br&gt;
Or, how does it achieve such accuracy in surfacing the matched images?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8o2kc4rxbjafo680f61.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8o2kc4rxbjafo680f61.gif" alt="Too much to think" width="504" height="322"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Yeah, I was there too. But not after what I discovered.&lt;/em&gt;
&lt;/h3&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As the world around us becomes increasingly visual, the demand for efficient ways to search and access information through images continues to grow rapidly. Today’s dominant text-based search methods have limitations when it comes to accurately describing complex scenes or objects (imagine describing a beautiful scene in front of you – it’s pretty difficult). &lt;/p&gt;

&lt;p&gt;That’s why reverse image search – the groundbreaking technology behind tools like Google Images that lets users upload or drop an image into a search engine to locate similar or identical photos – represents a major leap forward in the field of visual search. &lt;/p&gt;

&lt;p&gt;Harnessing the power of AI takes efficiency and convenience to a whole new level. By integrating AI capabilities with a reverse image search engine, you can elevate your search experience even further. Imagine having the ability to build your own personalized search engine right on your laptop, something that can be your own personalized Google. &lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49v086784lqsxto34k0s.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49v086784lqsxto34k0s.gif" alt="This is just awesome" width="320" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This doesn’t just streamline tasks but also empowers users with a tailored and efficient image search solution. I achieved it with the help of &lt;a href="https://qdrant.tech"&gt;Qdrant Vector DB&lt;/a&gt; and OpenAI’s open-source model.&lt;/p&gt;

&lt;p&gt;Let’s sail through the creation of this &lt;strong&gt;awesome&lt;/strong&gt; Reverse Image Search Engine using OpenAI’s latest and greatest &lt;a href="https://openai.com/research/clip"&gt;open-sourced CLIP model&lt;/a&gt; coupled with the sheer might of &lt;a href="https://cloud.qdrant.io/login"&gt;Qdrant’s Vector Database&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This process is divided into the following sections:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment setup.&lt;/li&gt;
&lt;li&gt;Data-Preprocessing &amp;amp; Populating the Qdrant Vector Database.&lt;/li&gt;
&lt;li&gt;Gradio Interface setup.&lt;/li&gt;
&lt;li&gt;Testing Reverse Image Search.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Setting the stage!&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Initiating the setup for this project begins with fetching the Docker container image and, subsequently, running it on your local Docker daemon. (Ensure you launch the Docker application beforehand.)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pull the Qdrant container image from Docker Hub. Then run the container using the following command, which will host the service at &lt;code&gt;localhost:6333&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull qdrant/qdrant
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 6333:6333 &lt;span class="nt"&gt;-p&lt;/span&gt; 6334:6334 &lt;span class="se"&gt;\&lt;/span&gt;
 &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/qdrant_storage:/qdrant/storage:z &lt;span class="se"&gt;\&lt;/span&gt;
   qdrant/qdrant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F21nc1lxrljzpju2pv5ge.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F21nc1lxrljzpju2pv5ge.png" alt="Image by author" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;NOTE: If you are running on Windows, kindly replace &lt;code&gt;$(pwd)&lt;/code&gt; with your local path.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next comes the most important step of all: ‘Use an Environment’. You need an independent environment when performing experiments. Let’s create a Python environment (I used Anaconda) and install the basic dependencies necessary to run the AI model from the &lt;code&gt;requirements.txt&lt;/code&gt; file provided in the Git gist.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;conda create &lt;span class="nt"&gt;-n&lt;/span&gt; qdrant python &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="c"&gt;# include python so pip is available in the env&lt;/span&gt;
conda activate qdrant
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt &lt;span class="c"&gt;# file provided in the Git gist&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Everything is set, let’s kick off the show.&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3wlcd64rml3ip9k8g1t.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3wlcd64rml3ip9k8g1t.gif" alt="Kick off!!!" width="500" height="278"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Data Pre-Processing and Populating the Vector Database
&lt;/h2&gt;

&lt;p&gt;For this project I’ve used the ‘&lt;a href="https://www.kaggle.com/datasets/alessiocorrado99/animals10"&gt;alessiocorrado99/animals10&lt;/a&gt;’ dataset from Kaggle, a collection of images spanning 10 different animal categories.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First, let’s install the Kaggle package: open a Jupyter Notebook (I used VS Code) and install it with &lt;code&gt;pip install kaggle&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Obtain your Kaggle API key: generate it on Kaggle by going to your account settings and, under the ‘API’ section, clicking ‘Create New API Token’. This downloads a file named ‘kaggle.json’ which holds the required credentials.&lt;/li&gt;
&lt;li&gt;Move the downloaded ‘kaggle.json’ file to your project directory.&lt;/li&gt;
&lt;li&gt;Open the terminal and run the following command to download the dataset mentioned above: &lt;code&gt;kaggle datasets download -d alessiocorrado99/animals10&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;After downloading, you may need to unzip or extract the contents of the downloaded file for further processing (the full sequence of commands is sketched after this list).&lt;/li&gt;
&lt;/ul&gt;
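&lt;p&gt;&lt;em&gt;For reference, here is a minimal sketch of those steps as terminal commands. It assumes the Kaggle CLI reads the token from its default location, &lt;code&gt;~/.kaggle/kaggle.json&lt;/code&gt;, and that you extract the archive into the &lt;code&gt;new_dataset&lt;/code&gt; folder referenced in the code below; adjust the paths to match your setup.&lt;/em&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip install kaggle
# Place the API token where the Kaggle CLI expects it by default
mkdir -p ~/.kaggle
mv kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json
# Download the dataset and extract it into the folder used by the code below
kaggle datasets download -d alessiocorrado99/animals10
unzip animals10.zip -d new_dataset
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;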

&lt;p&gt;In my &lt;a href="https://medium.com/@niranjanakella/semantic-search-over-satellite-images-using-qdrant-86b0fc34c2dc"&gt;previous article&lt;/a&gt;, I used OpenAI’s CLIP model, specifically ‘openai/clip-vit-base-patch32’. This model is tailored for zero-shot image classification and yields a (1, 512)-dimensional feature embedding for each image. And it doesn’t stop there. Being pre-trained on images and their corresponding captions, it aligns both textual and visual contexts within the same embedding space. This implies that whether you input text or an image, you will receive a (1, 512)-dimensional embedding tensor. So, I am using the same model here to generate my image embeddings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ehjlkhz3xzfd9b77c1o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ehjlkhz3xzfd9b77c1o.png" alt="Image from [OpenAI’s CLIP blog.](https://openai.com/research/clip)" width="800" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The beauty of the CLIP model lies in its ability to map both image and text data into the same embedding space, as shown in the image above.&lt;/p&gt;

&lt;p&gt;When we input a query in the form of text, we use a tokenizer to break it down into token_ids. Afterward, we utilize the get_text_features method from the model class to create an embeddings tensor, resulting in a feature tensor with dimensions (1, 512). On the other hand, if the input query is an image, we employ the processor to prepare and convert the image into a format suitable for the model. Then, by using the get_image_features method from the model class, we generate an image embedding tensor with dimensions (1, 512).&lt;/p&gt;
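
&lt;p&gt;&lt;em&gt;As a quick illustration, here is a minimal sketch of both paths; the query string and file name are placeholders:&lt;/em&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from transformers import AutoTokenizer, AutoProcessor, AutoModelForZeroShotImageClassification
from PIL import Image

model_name = "openai/clip-vit-base-patch32"
tokenizer = AutoTokenizer.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)
model = AutoModelForZeroShotImageClassification.from_pretrained(model_name)

# Text query -&gt; (1, 512) embedding tensor
tokens = tokenizer("a photo of a dog", return_tensors="pt")
text_emb = model.get_text_features(**tokens)

# Image query -&gt; (1, 512) embedding tensor
image = Image.open("example.jpeg")  # placeholder file name
pixels = processor(text=None, images=image, return_tensors="pt")["pixel_values"]
img_emb = model.get_image_features(pixels)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;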

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frusfg5seg04z0yd1qg8c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frusfg5seg04z0yd1qg8c.png" alt="Illustration by author" width="800" height="193"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we have an image processor that prepares images before feeding them to the model: it resizes, crops, and normalizes the pixel data into the tensor format the model consumes. Once the model is loaded, you'll have to create a Qdrant client that connects to the local Docker container running the Qdrant Vector DB. The vector size is set to 512, aligning with the model’s output embedding feature tensor shape of (1, 512).&lt;/p&gt;

&lt;p&gt;Fill the VectorDB by looping over each image in the dataset, extracting its features with the CLIP model (which maps them into a fixed-dimensional embedding space), and uploading the resulting embeddings to the 'animals_img_db' data collection in Qdrant's VectorDB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os

from PIL import Image
from qdrant_client import QdrantClient
from qdrant_client.http import models
from tqdm import tqdm
from transformers import AutoTokenizer, AutoProcessor, AutoModelForZeroShotImageClassification

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QdrantClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Client created...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;root_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_dataset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
image_dataset = []  # the loop below fills this list with PIL images


&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;subdir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dirs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;walk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root_dir&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="c1"&gt;#look only for image files with jpeg extension
&lt;/span&gt;       &lt;span class="k"&gt;if&lt;/span&gt;  &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.jpeg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; 
           &lt;span class="n"&gt;image_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subdir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
           &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
               &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
               &lt;span class="n"&gt;image_dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
           &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
               &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error loading image &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Loading the model...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/clip-vit-base-patch32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForZeroShotImageClassification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Creating qdrant data collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;animals_img_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;vectors_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VectorParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Distance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;


&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Creating a data collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_dataset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_dataset&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
   &lt;span class="n"&gt;processed_img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="n"&gt;img_embds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_image_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;processed_img&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="n"&gt;img_px&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getdata&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
   &lt;span class="n"&gt;img_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;
   &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;img_embds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel_lst&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;img_px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;img_size&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img_size&lt;/span&gt;&lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finished &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_records&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;animals_img_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;NOTE: If you notice, I'm not just preserving the image embeddings; I'm also storing the pixel values and image size in the vector payload. This information will come in handy later for reconstructing the image to display in the Gradio app. To gain a clearer understanding of the experiment's flow, refer to the 'Data Flow' illustration provided in the subsequent section.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkgvqisok5s29ydvj1cd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkgvqisok5s29ydvj1cd.jpg" alt="Illustration by author" width="800" height="342"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that our data is prepared and comfortably stored in Qdrant's VectorDB, let's develop an application to interact with it and retrieve information using Qdrant's Semantic Search functionality.&lt;/p&gt;
&lt;h2&gt;
  
  
  Gradio Interface Setup
&lt;/h2&gt;

&lt;p&gt;I’m using Gradio to quickly create a functional app with an appealing UI. You may ask, why? Well, because it provides a ready-made UI bundle that's easy to set up and perfect for swift demonstrations and, come on, it’s awesome. Honestly, coding with it is quite straightforward.&lt;/p&gt;

&lt;p&gt;To put it in simple terms, our application will take an image as input from the user. We'll then vectorize the image by generating image embeddings using the 'get_image_features' method from the CLIP model class.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import gradio as gr

from qdrant_client import QdrantClient
from transformers import AutoTokenizer, AutoProcessor, AutoModelForZeroShotImageClassification

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QdrantClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Client created...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;#loading the model
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Loading the model...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/clip-vit-base-patch32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForZeroShotImageClassification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Gradio Interface
&lt;/span&gt;&lt;span class="n"&gt;iface&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Interface&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Building a Reverse Image Search Engine Using Qdrant Vector Search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;by Niranjan Akella&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;process_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input Image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
   &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Gallery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Relevant Images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;iface&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Finally, we'll use the vectorized image as a query to perform semantic search over the ‘animals_img_db' collection present in Qdrant VectorDB and retrieve the top 5 matches using the search method from Qdrant's client class.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from PIL import Image  # needed to rebuild images from the stored pixel data

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="n"&gt;processed_img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="n"&gt;img_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_image_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;processed_img&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;animals_img_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;query_vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;img_embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;hit&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="n"&gt;img_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;img_size&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
       &lt;span class="n"&gt;pixel_lst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pixel_lst&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;


       &lt;span class="c1"&gt;# Create an image from pixel data
&lt;/span&gt;       &lt;span class="n"&gt;new_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="n"&gt;new_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;putdata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;pixel_lst&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
       &lt;span class="n"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;images&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fythpkyzgjt4rsywsnpn9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fythpkyzgjt4rsywsnpn9.gif" alt="That's impressive" width="500" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2ubcg09lr1y2ortu5o0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2ubcg09lr1y2ortu5o0.gif" alt="Reverse Image Search Engine GIF created by author." width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;NOTE: The complete code is shared at the end of this post along with the link to Git-gist.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You can run the Gradio application directly from the terminal with &lt;code&gt;python3 app.py&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkithfuodua1wfz8v66z9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkithfuodua1wfz8v66z9.gif" alt="Yeah!!!" width="314" height="179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a nutshell, I successfully integrated the capabilities of OpenAI's CLIP model for generating image embeddings with Qdrant's semantic search over its vector database, enabling efficient reverse image searches that replicate a widely used Google Images feature (which we all love, by the way). I showcased the extraordinariness (yup, I made up that word) of artificial intelligence in conjunction with an innovative vector database such as Qdrant, and presented a practical demonstration using Gradio, an intuitive application that lets users perform reverse image searches by submitting query images effortlessly. &lt;/p&gt;

&lt;p&gt;This article is a comprehensive tutorial, guiding you to construct your very own personalized Reverse Image Search Engine employing AI and Qdrant's VectorDB.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;



</description>
      <category>python</category>
      <category>datascience</category>
      <category>programming</category>
      <category>ai</category>
    </item>
    <item>
      <title>Semantic Search Over Satellite Images Using Qdrant</title>
      <dc:creator>Niranjan Akella</dc:creator>
      <pubDate>Fri, 29 Dec 2023 04:07:48 +0000</pubDate>
      <link>https://dev.to/niranjanakella/semantic-search-over-satellite-images-using-qdrant-2hpn</link>
      <guid>https://dev.to/niranjanakella/semantic-search-over-satellite-images-using-qdrant-2hpn</guid>
      <description>&lt;p&gt;Build your very own image search engine&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.niranjanakella.com/"&gt;Know about me&lt;/a&gt; &amp;amp; reach me out at: &lt;a href="https://www.linkedin.com/in/niranjanakella/"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/NiranjanAkella"&gt;X&lt;/a&gt; 🤝&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yUbmW-_g--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n6254fyb8qm99r868puy.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yUbmW-_g--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n6254fyb8qm99r868puy.jpeg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Do you ever wonder how Google Photos and Apple Photos are able to understand images?&lt;br&gt;
Or, how do they allow you to search for images based on what you ‘&lt;strong&gt;type&lt;/strong&gt;’? &lt;br&gt;
Or, how does Google’s very own &lt;strong&gt;image search&lt;/strong&gt; work? &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Well, I cracked it!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iAAdAyMO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gzgegkwoepl42qeqkdlu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iAAdAyMO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gzgegkwoepl42qeqkdlu.gif" alt="Image description" width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--g0Zt3ZzR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i26f0zmcq5me86hzwtig.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--g0Zt3ZzR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i26f0zmcq5me86hzwtig.gif" alt="Image description" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;More Than Just an Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In this new mind-boggling project, I was able to mimic this very ability of such powerful platforms right on my local system.&lt;/p&gt;

&lt;p&gt;Creativity and imagination go hand-in-hand. We should always indulge in imaginative thought experiments that spark creativity, and this is one such thought experiment that has been teasing me for quite some time. I am happy to share that I have succeeded to some extent in satisfying my intellectual thirst through the help of &lt;a href="https://qdrant.tech/"&gt;Qdrant&lt;/a&gt; &amp;amp; &lt;a href="https://platform.openai.com/docs/overview"&gt;OpenAI&lt;/a&gt;’s open-sourced model.&lt;/p&gt;

&lt;p&gt;In this article, I’ll be exploring the creation of a semantic image search engine using OpenAI's latest and greatest open-sourced CLIP model coupled with the sheer might of Qdrant’s Vector Database. &lt;/p&gt;

&lt;p&gt;This project is divided into the following sections:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment Setup&lt;/li&gt;
&lt;li&gt;Data Pre-processing &amp;amp; Populating Vector Database&lt;/li&gt;
&lt;li&gt;Embedding Feature-Vector-Driven Semantic Search Over Vector Database for Active Image Retrieval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VK7tAFq1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/19ud0avs7ipk6dum255u.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VK7tAFq1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/19ud0avs7ipk6dum255u.gif" alt="Image description" width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Environment Setup&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;I always love to organize my projects with a proper structure, which makes them easier to review later on. Similarly, I believe you also prefer to keep your projects straightforward and manageable.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Pro Tip:&lt;/strong&gt;&lt;/em&gt;&lt;br&gt;
I prefer to divide my AI projects this way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model_type/
|----project_title/
     |----demo/
     |    |----recorded_demo.mp4
     |    |----stable_build/
     |----exp_&amp;lt;experiment_number&amp;gt;/
          |----data/
          |    |----raw/
          |    |----processed_training_data/
          |----model/
          |----metrics/
          |    |----classification_reports/
          |    |----performance_scores/
          |----README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The first step in preparing the environment for this project involves pulling the Docker container image and then executing it on your local Docker daemon. (Don't forget to launch the Docker application first.)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pull the Qdrant container image from Docker Hub. Then run the container using the following command, which will host the service at &lt;code&gt;localhost:6333&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull qdrant/qdrant
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 6333:6333 &lt;span class="nt"&gt;-p&lt;/span&gt; 6334:6334 &lt;span class="se"&gt;\&lt;/span&gt;
 &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/qdrant_storage:/qdrant/storage:z &lt;span class="se"&gt;\&lt;/span&gt;
   qdrant/qdrant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;NOTE: If you are running on Windows, kindly replace &lt;code&gt;$(pwd)&lt;/code&gt; with your local path.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--snct0FM7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u3km3rk8olljey2fysg5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--snct0FM7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u3km3rk8olljey2fysg5.png" alt="Image description" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next comes the most important step of all: ‘Use an Environment’. You need to have an independent environment when performing experiments, or else you will surely fall into a black hole like Matthew McConaughey in the film ‘Interstellar’.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--peGYutFm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/pf9381guzcjd39k8ub36.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--peGYutFm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/pf9381guzcjd39k8ub36.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, let’s create a Python environment (I used Conda) and install the following basic dependencies necessary to run the AI model.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;conda create &lt;span class="nt"&gt;-n&lt;/span&gt; qdrant python &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="c"&gt;# include python so pip is available in the env&lt;/span&gt;
conda activate qdrant
pip &lt;span class="nb"&gt;install &lt;/span&gt;qdrant-client sentence-transformers accelerate tqdm datasets gradio
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Now that we are all set, Let’s begin the show!&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Data Pre-Processing and Populating Vector Database&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;For this project, I’ve used the ‘&lt;a href="https://huggingface.co/datasets/arampacha/rsicd"&gt;arampacha/rsicd&lt;/a&gt;’ dataset, a collection of diverse satellite images from Hugging Face. We leverage the datasets library from Hugging Face to load the training split of the dataset.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Loading dataset...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;arampacha/rsicd&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Now comes the AI.&lt;/p&gt;

&lt;p&gt;I browsed through a pile of models to find the one that best fits my need for generating feature-focused embeddings from satellite images, as well as creating text embeddings that can be utilized later for semantic search.&lt;/p&gt;

&lt;p&gt;I settled on OpenAI's CLIP model, specifically 'openai/clip-vit-base-patch32'. This model is tailored for zero-shot image classification and yields a (1,512)-dimensional feature embedding for each image. And it doesn’t stop there. Being pre-trained on images and their corresponding captions, it aligns both textual and visual contexts within the same embedding tensor space. This implies that whether you input text or an image, you will receive a (1,512)-dimensional embedding tensor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bTnzqYmC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h5aemsbkae2t0e7cv42j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bTnzqYmC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h5aemsbkae2t0e7cv42j.png" alt="Image description" width="800" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The elegance of the CLIP model lies in its ability to map both image data and textual data to the same embedding space as illustrated in the above image. &lt;/p&gt;

&lt;p&gt;If the input query is textual, we can use the tokenizer to tokenize it and create token_ids. Subsequently, we can generate an embeddings tensor using the get_text_features method from the model class. This process will result in an embedding feature tensor with the shape (1,512).&lt;/p&gt;

&lt;p&gt;If the input query is an image, we can use the processor to process and convert the image into a format suitable for the model. Following this, we can generate an image embedding tensor with the shape (1, 512) using the get_image_features method from the model class.&lt;/p&gt;

&lt;p&gt;Hence, it functions as a versatile model capable of generating either image embeddings or text embeddings depending on our specific use case. The key advantage is the consistent dimensionality of both embedding types, whether text or image. Pre-trained to understand the interconnected feature distribution between an image and its captions, the model stands as the optimal choice for text-to-image or image-to-image searches.&lt;/p&gt;

&lt;p&gt;OpenAI’s comment on &lt;a href="https://openai.com/research/clip"&gt;CLIP model&lt;/a&gt;:&lt;br&gt;
‘If the task of a dataset is classifying photos of dogs vs cats, we check for each image whether a CLIP model predicts the text description “a photo of a dog” or “a photo of a cat” is more likely to be paired with it.’&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LMxtGWmp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/psq56jrfpkyvh202l85z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LMxtGWmp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/psq56jrfpkyvh202l85z.png" alt="Image description" width="800" height="302"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoModelForZeroShotImageClassification&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Loading the model...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/clip-vit-base-patch32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForZeroShotImageClassification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Here, we have a tokenizer that is used to tokenize text and a processor that prepares images in a form the model can consume. &lt;/p&gt;
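
&lt;p&gt;To make the shared embedding space concrete, here is a minimal sketch of both paths, assuming the tokenizer, processor, and model objects loaded above; the prompt string and ‘sample.png’ are placeholders for your own query text and image:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import torch
from PIL import Image

# Text query -&gt; (1, 512) embedding
text_inputs = tokenizer("a satellite photo of a river", return_tensors="pt")
text_embeds = model.get_text_features(**text_inputs)

# Image query -&gt; (1, 512) embedding ("sample.png" is a placeholder path)
image = Image.open("sample.png")
image_inputs = processor(text=None, images=image, return_tensors="pt")['pixel_values']
image_embeds = model.get_image_features(image_inputs)

print(text_embeds.shape, image_embeds.shape)  # torch.Size([1, 512]) for both

# Both vectors live in the same space, so their cosine similarity is meaningful
print(torch.nn.functional.cosine_similarity(text_embeds, image_embeds))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;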

&lt;p&gt;After loading the model, you will need to instantiate a Qdrant client that connects to the local Docker container running the Qdrant service, then create a Qdrant data collection to host the vectorized data. We set the vector size to 512 since the output embedding feature tensor from the model is of shape (1, 512).&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;qdrant_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;QdrantClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;qdrant_client.http&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QdrantClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Client created...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Creating qdrant data collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;satellite_img_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;vectors_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VectorParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Distance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Populate the VectorDB by processing each image in the dataset, extracting its features with the CLIP model, and uploading the resulting embeddings to Qdrant’s ‘satellite_img_db’ data collection. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: If you observe closely, I am not only saving the image embeddings but also storing the image pixel values and image size in the vector payload. I will use this information later to reconstruct the image for display on the Gradio app. To better understand the flow of the experiment, do check out the ‘Data Flow’ illustration that I made in the following section.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tqdm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tqdm&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Creating a data collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="n"&gt;processed_img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;images&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;img_embds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_image_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;processed_img&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;img_px&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;getdata&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;img_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; 
    &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;img_embds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel_lst&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img_px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;img_size&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;captions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;captions&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]}))&lt;/span&gt;

&lt;span class="c1"&gt;#uploading the records to client
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Uploading data records to data collection...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;#It's better to upload chunks of data to the VectorDB 
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finished &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_records&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;satellite_img_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[INFO] Successfully uploaded data records to data collection!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
&lt;strong&gt;Embedding-Driven Semantic Search over the Vector Database for Image Retrieval&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now that we have our data ready and chilling in Qdrant’s VectorDB, let’s build an app to interact with it and retrieve information through &lt;a href="https://qdrant.tech/documentation/tutorials/search-beginners/"&gt;Qdrant’s Semantic Search&lt;/a&gt; functionality.&lt;/p&gt;

&lt;p&gt;I will be using &lt;a href="https://www.gradio.app/"&gt;Gradio&lt;/a&gt; to build a quick functional application with a beautiful UI. Why? Because it comes with a prebuilt UI bundle that is easy to set up and great for quick demos, and coding with it is a breeze. Just visit &lt;a href="https://huggingface.co/spaces"&gt;Hugging Face Spaces&lt;/a&gt; and you will understand what I mean. &lt;/p&gt;

&lt;p&gt;To put it in simple terms: the application consumes a text input from the user, vectorizes it by generating text embeddings with the ‘get_text_features’ method of the model class, and then uses that vector as the query for a semantic search over the VectorDB via the search method of Qdrant’s client class. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Qd2mt3tu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wsm6pejiyflq2g9f0ksj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Qd2mt3tu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wsm6pejiyflq2g9f0ksj.png" alt="Image description" width="800" height="368"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;inp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;text_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_text_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;satellite_img_db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;query_vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text_embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;hit&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;img_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;img_size&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;pixel_lst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pixel_lst&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;new_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;new_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;putdata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;pixel_lst&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;new_image&lt;/span&gt;

&lt;span class="n"&gt;iface&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Interface&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Semantic Search Over Satellite Images Using Qdrant Vector Database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;by Niranjan Akella&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;process_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Textbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pil&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Satellite Image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;iface&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The complete code is shared at the end along with the link to the GitHub Gist.&lt;/p&gt;

&lt;p&gt;You can run the Gradio application directly from the terminal using the Python runtime: &lt;code&gt;python3 app.py&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Scope&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The scope of this experiment doesn’t end here. In this project, I built a text-to-image search engine, but it is also possible to build an image-to-image search engine using the processor of the CLIP model. I highly recommend experimenting with that, and feel free to reach out to me on &lt;a href="https://www.linkedin.com/in/niranjanakella/"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/NiranjanAkella"&gt;X&lt;/a&gt; to discuss it further. &lt;/p&gt;
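
&lt;p&gt;As a starting point, here is a minimal sketch of that image-to-image variant, assuming the same processor, model, and client objects from above; the helper name search_by_image and the query image path are hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from PIL import Image

def search_by_image(image_path, top_k=3):  # hypothetical helper
    # Prepare the query image exactly like the indexed ones
    query_img = Image.open(image_path)
    processed = processor(text=None, images=query_img, return_tensors="pt")['pixel_values']
    img_embedding = model.get_image_features(processed).detach().numpy().tolist()[0]
    # The image embedding lives in the same 512-d space, so the same search applies
    return client.search(
        collection_name="satellite_img_db",
        query_vector=img_embedding,
        limit=top_k,
    )

hits = search_by_image("query_satellite_img.png")  # placeholder image path
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;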

&lt;p&gt;Image search demo:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Ag9XTWDc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/v76zy399eyk0lrdb4zby.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Ag9XTWDc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/v76zy399eyk0lrdb4zby.gif" alt="Image description" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In this project, I combined the power of OpenAI's CLIP model for image embeddings with Qdrant’s semantic search functionality over its vector database, mimicking a very popular Google Photos/Apple Photos feature. I demonstrated the power of AI coupled with a capable VectorDB like Qdrant through a working Gradio demo that provides a user-friendly interface for semantic image search based on textual queries. This article can serve as a guide for building your own image search engine by combining advanced open-source AI models with a scalable vector database.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Here's the code&lt;/em&gt;
&lt;/h2&gt;
&lt;/blockquote&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;



</description>
      <category>datascience</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
    </item>
    <item>
      <title>Exploring Personalized Shopping Experiences with Qdrant’s Discovery API</title>
      <dc:creator>Niranjan Akella</dc:creator>
      <pubDate>Thu, 28 Dec 2023 09:19:22 +0000</pubDate>
      <link>https://dev.to/niranjanakella/exploring-personalized-shopping-experiences-with-qdrants-discovery-api-jec</link>
      <guid>https://dev.to/niranjanakella/exploring-personalized-shopping-experiences-with-qdrants-discovery-api-jec</guid>
      <description>&lt;p&gt;Improve personalized product discovery with Qdrant’s Discovery API on a Streamlit web app&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In the rapidly growing e-commerce landscape, delivering personalized shopping experiences has become one of the key differentiators for success. In this article, we dive deep into the functionality of &lt;a href="https://qdrant.tech/documentation/concepts/explore/#discovery-api" rel="noopener noreferrer"&gt;Qdrant’s new Discovery API&lt;/a&gt;, mainly focusing on context search, to unlock tailored product recommendations. To demonstrate this new approach, I’ll walk you through an implementation of a Streamlit App that I have developed, seamlessly integrating with &lt;a href="https://qdrant.tech/" rel="noopener noreferrer"&gt;Qdrant’s vector database&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h2&gt;
  
  
  Trust me, it’s fun! ✌️
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Discovery API&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://qdrant.tech/documentation/concepts/explore/#discovery-api" rel="noopener noreferrer"&gt;Qdrant’s Discovery API&lt;/a&gt; introduces the concept of “context,” which is a powerful tool for splitting the vector space efficiently, where the context comprises positive-negative pairs, effectively dividing the space into sub-zones based on given user preferences. The search mechanism in this context gives priority to points that actually belong to the positive zones while avoiding the negative ones.&lt;/p&gt;

&lt;p&gt;There are mainly two types of search, both sketched in code just after this list:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://qdrant.tech/documentation/concepts/explore/#discovery-search" rel="noopener noreferrer"&gt;Discovery Search&lt;/a&gt;: Utilizes target and context pairs to find points that are closest to the target but constrained by the provided context pair. This is ideal for combining multimodal, vector-constrained searches.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://qdrant.tech/documentation/concepts/explore/#context-search" rel="noopener noreferrer"&gt;Context Search&lt;/a&gt;: Uses only the context pairs to identify points residing in the best zone possible while minimizing loss. Particularly effective when a target is absent, guiding the search based on areas with fewer negative examples.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
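
&lt;p&gt;Here is a minimal sketch of both call shapes, assuming the ‘e-shopping’ collection and the gte-small encoder that we build later in this article; the query strings are purely illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from qdrant_client import QdrantClient
from qdrant_client.http import models
from sentence_transformers import SentenceTransformer

qdrant = QdrantClient("localhost", port=6333)
encoder = SentenceTransformer('thenlper/gte-small')

# One positive-negative pair defining the context
context = [
    models.ContextExamplePair(
        positive=encoder.encode("gaming laptop").tolist(),
        negative=encoder.encode("office printer").tolist(),
    )
]

# Discovery Search: results close to a target vector, constrained by the context
with_target = qdrant.discover(
    collection_name="e-shopping",
    target=encoder.encode("portable workstation").tolist(),
    context=context,
    limit=3,
)

# Context Search: no target; results come from the most positive zone
context_only = qdrant.discover(
    collection_name="e-shopping",
    context=context,
    limit=3,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;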

&lt;h2&gt;
  
  
  &lt;strong&gt;Flexibility&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The arrangement of positive and negative examples in context pairs is quite flexible, offering the freedom to experiment with different pairings based on the model and data. Search speed is linearly related to the number of examples, which keeps exploration of vast datasets efficient and predictable.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  Let’s Get to Work…
&lt;/h2&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Environment Setup&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Start the local Docker application on your system and pull the latest Qdrant client container from Docker Hub. Then run the container on port &lt;code&gt;6333&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;NOTE: If you are running on Windows, kindly replace $(pwd) with your local path.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull qdrant/qdrant
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 6333:6333 &lt;span class="nt"&gt;-p&lt;/span&gt; 6334:6334 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/qdrant_storage:/qdrant/storage:z &lt;span class="se"&gt;\&lt;/span&gt;
    qdrant/qdrant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qrj7yamy2z5jycez2gf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qrj7yamy2z5jycez2gf.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You can also access a beautiful Qdrant server dashboard through &lt;code&gt;localhost:6333/dashboard&lt;/code&gt;, where you can browse through and interact with your data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I always prefer to create a Conda environment for most of my projects. It helps me differentiate between projects well and manage various environments easily.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;conda create &lt;span class="nt"&gt;-n&lt;/span&gt; qdrant &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Install the necessary dependencies for the project.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;qdrant-client sentence-transformers tqdm datasets streamlit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Populating Vector Data Collection&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;For populating the vector database, we create an “e-shopping” vector data-collection that hosts the vectorized form of our dataset cloned from Hugging Face hub through the datasets library.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Initialize a Qdrant client instance to communicate with the Qdrant service running on your local machine.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;qdrant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QdrantClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Download the desired dataset from the HF library. I have chosen ‘products-2017’ as my sample dataset for demonstration.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wdc/products-2017&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Process the dataset by extracting specific fields (‘title_left’, ‘description_left’) and create a list of dictionaries.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title_left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description_left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description_left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;To generate embedding vectors for the product descriptions, I am using the new “gte-small” encoder from the GTE collection, a pre-trained Sentence Transformer model that is currently trending in the AI community. It generates a 384-dimensional feature representation for a given input text and is quite fast at inference. If you wish to experiment with bigger models like RoBERTa or DeBERTa, please do; they might fetch better results.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thenlper/gte-small&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Create a Qdrant data collection named ‘e-shopping’ with COSINE as the distance metric.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;qdrant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recreate_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;     
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;e-shopping&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     
    &lt;span class="n"&gt;vectors_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VectorParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_sentence_embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Distance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COSINE&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Iterate through the processed data, encoding each text description and uploading the vectors along with their associated payloads to the newly created Qdrant collection.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; 
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;     
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description_left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;         
        &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description_left&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; 
        &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 
&lt;span class="n"&gt;qdrant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_records&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;e-shopping&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;NOTE: Complete code snippet is given at the end&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  Streamlit App
&lt;/h3&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b4ti0t7ifo11rfjzvu6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b4ti0t7ifo11rfjzvu6.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Streamlit is one of my favorites when it comes to quick prototyping of ideas. In this app, users can select their preferred gadgets and receive personalized recommendations utilizing Qdrant’s Discovery API.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;h2&gt;
  
  
  Let’s quickly walk through the app development.
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Initialize the Qdrant client, establish a link to the local docker container, and load the ‘gte-small’ transformer model for generating the embedding for the desired context.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;qdrant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QdrantClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thenlper/gte-small&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Then we simply create a quick multi-select to allow the end-user to choose their preferred gadgets. Subsequently, we conduct a discovery search, treating the provided selections as positives and the remaining, unselected gadgets as negatives. We zip these choices into context pairs, which lets us uncover the top three recommendations through the Discovery API.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Discovery API for Personalized Shopping Experience&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subheader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;By Niranjan Akella&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Multi-select checkbox for the first category
&lt;/span&gt;    &lt;span class="n"&gt;choice_of_gadget&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;multiselect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pick gadgets you like&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Gadget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Intel Processor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Graphics Card&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;disliked_gadgets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Gadget&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choice_of_gadget&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# "Personalize" button to trigger a function
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;button&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Personalize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;personalize_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choice_of_gadget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;disliked_gadgets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;personalize_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choice_of_gadget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;disliked_gadgets&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;contexts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ContextExamplePair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;positive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;negative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="nf"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choice_of_gadget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;disliked_gadgets&lt;/span&gt;&lt;span class="p"&gt;))]&lt;/span&gt;

    &lt;span class="n"&gt;discovered_products&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qdrant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;discover&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;e-shopping&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Top Recommended Products:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;discovered_products&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title_left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description_left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Sample Discovered Choices Based on User Preferences:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;User choice: [Intel Processor, Graphics Card]&lt;/p&gt;

&lt;p&gt;Top recommended products with a discovery limit of 2 on Streamlit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Intel Core i7–6900K 4.0 GHz: Intel Core i7–6900K, Intel Core i7–6xxx, LGA 2011-v3, PC, i7–6900K, DDR4-SDRAM, 64-bit”&lt;/li&gt;
&lt;li&gt;“AJA Kona 4 — video capture adapter PCIe 2.0 x8, AJA x8 KONA Design Graphic/Video Cards, CDWG.com”&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Qdrant’s new Discovery API opens up novel possibilities for creating personalized shopping experiences. You can seamlessly integrate it into your applications to offer your end-users a tailored experience that goes beyond the basic product recommendations we generally see. This article covers what you need to know about the Discovery API, with detailed explanations accompanied by relevant code snippets at every step. As a practicing AI/ML engineer, I am positive that Qdrant is the next big thing in vector databases and in pioneering new approaches in this domain.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Here’s the complete code, folks!!&lt;/strong&gt;
&lt;/h2&gt;
&lt;/blockquote&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;That’s a wrap, folks!&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Until next time, keep hustling and keep innovating. If you wanna catch up, feel free to reach out on &lt;a href="https://www.linkedin.com/in/niranjanakella/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/NiranjanAkella" rel="noopener noreferrer"&gt;X&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>python</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
