Aaron Ploetz for DataStax

Originally published at datastax.com

How to Build a Crystal Image Search App with Vector Search

There are lots of ways to leverage generative AI (GenAI) in a variety of business use cases at companies of all sizes. In this post, we will explore how a store selling crystals and precious stones can use DataStax’s RAGStack to help their customers to identify and find certain crystals. Specifically, we will walk through creating an application designed to help the customers of Healing House Energy Spa (owned by the author’s wife). This will also demonstrate how small businesses can take advantage of GenAI.

What is RAGStack?

RAGStack is DataStax’s Python library that’s designed to help developers build advanced GenAI applications based on retrieval-augmented generation (RAG) techniques. These applications require developers to configure and access data parsers, large language models (LLMs), and vector databases.

With RAGStack, developers can increase their productivity with GenAI toolsets by interacting with them through a single development stack. DataStax’s integrations with many commonly used libraries and providers enable developers to prototype and build applications faster than ever before. All of this happens on top of DataStax Astra DB, which is DataStax’s powerful, multi-region vector database (as shown in Figure 1).


Figure 1 - A high-level view of the Crystal Search application architecture, showing how it leverages RAGStack.

As Astra DB is a key component of RAGStack, we should spend some time discussing vector databases. These are special kinds of databases capable of storing vector data in native structures. When we build RAG applications, we interact with an LLM by using a “vectorized” version of our data. Essentially, the vectors returned are a numerical representation of the individual elements or “chunks” of our data. We will discuss this process in more detail below.

The Crystal Search application

Here we'll walk through how to build up a simple web application to search an inventory of crystals (and other precious stones). We’ll load our data from a CSV file, and then query it using a Flask-based web application with navigation drop-downs and a search-by-image function.

The crystals themselves have several properties:

  • Name - What the crystal is known as.
  • Image - The filename of the on-disk image of the crystal.
  • Chakras - One or more of the seven centers of spiritual power in the human body that the crystal can help attune.
  • Birth month - People with certain birth months will be more receptive to this crystal.
  • Zodiac sign - People born under certain zodiac signs will be more receptive to this crystal.
  • Mohs hardness - A measure of the crystal’s resistance to scratching.

For our drop-down navigation, we will use a crystal’s recommended chakras, birth month, and zodiac signs. The remaining properties will be added to the collection’s metadata (except for the image itself, which will be used to generate the crystal’s vector embedding).

We will use the CLIP model to generate our vector embeddings. CLIP (Contrastive Language-Image Pre-training) is a model developed by OpenAI that embeds both images and text in the same vector space; we will load it through the sentence-transformers library. The CLIP model is pre-trained on images paired with text descriptions, so similar images end up close together in that space, and we can retrieve them with an approximate nearest neighbor (ANN) search. Leveraging CLIP in this way allows us to support an “identify this crystal” function, where users can search with a picture from their device.

Requirements

Before building our application, let’s make sure that our development environment is properly configured. We will start by making sure that we are running Python 3.9 or later. We will also need the following libraries (and versions), as specified in our [requirements.txt](https://github.com/aar0np/crystalSearch/blob/main/requirements.txt) file.

  • Flask==2.3.2
  • Flask-WTF==1.2.1
  • sentence-transformers==2.2.2
  • ragstack-ai==0.8.0
  • python-dotenv==1.0.0

We can install them all with pip:

pip install -r requirements.txt

Flask directory structure

As we are working with a Flask web application, we will need the following directory structure, with crystalSearch as the “root” of the project:

crystalSearch/
      templates/
      static/
            images/
            input_images/
            web_images/
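The layout above can also be created programmatically. The following is a minimal sketch using pathlib; the helper name create_project_dirs is hypothetical and is not part of the project's code.

```python
from pathlib import Path

# Sub-directories from the layout above, relative to the project root.
PROJECT_DIRS = [
    "templates",
    "static/images",
    "static/input_images",
    "static/web_images",
]


def create_project_dirs(root):
    """Create the Flask directory layout under `root` (hypothetical helper)."""
    root_path = Path(root)
    created = []
    for sub_dir in PROJECT_DIRS:
        target = root_path / sub_dir
        # parents=True also creates "static"; exist_ok=True makes reruns safe.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling create_project_dirs("crystalSearch") from the parent directory would produce the structure shown above.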

DataStax Astra DB

First, we need to sign up for a free account with DataStax Astra DB, and create a new vector database. Once we have our Astra DB vector database, we will make note of the token and API endpoint. We will define those as environment variables in the next section.

Environment variables

For our application to run properly, we'll need to set some environment variables:

  • ASTRA_DB_API_ENDPOINT - Connection endpoint for our Astra DB vector database instance.
  • ASTRA_DB_APPLICATION_TOKEN - Security token used to authenticate to our Astra DB instance.
  • FLASK_APP - The name of the application’s primary Python file in a Flask web project.
  • FLASK_ENV - Indicates to Flask if the application is in development or production mode. Note that Flask deprecated this variable in version 2.2 in favor of FLASK_DEBUG.

The easiest way to do that is with an .env file. Our .env file should look something like this:

ASTRA_DB_API_ENDPOINT=https://notreal-blah-4444-blah-blah-region.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN=AstraCS:NotReal:ButYourTokenWillLookSomethingLikeThis
FLASK_APP=crystalSearch
FLASK_ENV=development

Setting the FLASK_APP variable to “crystalSearch” is important, as it tells Flask which Python module is the primary entrypoint to the application.
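Because a missing variable tends to surface later as a confusing connection error, it can help to check the settings up front. Below is a minimal sketch of such a check; the missing_settings helper and the REQUIRED_SETTINGS tuple are hypothetical names introduced here for illustration.

```python
import os

# Settings the application cannot run without (names from the list above).
REQUIRED_SETTINGS = (
    "ASTRA_DB_API_ENDPOINT",
    "ASTRA_DB_APPLICATION_TOKEN",
    "FLASK_APP",
)


def missing_settings(env=None, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are unset or empty."""
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name)]
```

One way to use it would be to call missing_settings() right after load_dotenv() and stop with a clear message if the returned list is not empty.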

crystalLoader.py

With our database and environment all set up, we can build our Python data loader. Create a new Python file named crystalLoader.py, and set up its imports like this:

import csv
import json

from os import path, environ
from dotenv import load_dotenv
from PIL import Image
from astrapy.db import AstraDB
from sentence_transformers import SentenceTransformer

We will start by bringing in the environment variables from our .env file:

basedir = path.abspath(path.dirname(__file__))
load_dotenv(path.join(basedir, '.env'))

Next, we will pull in the application endpoint and token, instantiate a database connection object, and then create a new collection named “crystal_data”:

# Astra connection
ASTRA_DB_APPLICATION_TOKEN = environ.get("ASTRA_DB_APPLICATION_TOKEN")
ASTRA_DB_API_ENDPOINT = environ.get("ASTRA_DB_API_ENDPOINT")

db = AstraDB(
    token=ASTRA_DB_APPLICATION_TOKEN,
    api_endpoint=ASTRA_DB_API_ENDPOINT,
)

# create "collection"
col = db.create_collection("crystal_data", dimension=512, metric="cosine")

Note that our collection’s vector has 512 dimensions, so that it matches the dimensions of the embeddings created by the clip-ViT-B-32 model. Astra DB supports ANN searches using a cosine, dot product, or Euclidean similarity metric. For our purposes, cosine similarity will be fine.
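To make the cosine metric concrete, here is a minimal sketch in plain Python of the exact (brute-force) search that an ANN index approximates. The vectors and file names are toy values invented for illustration, and the helper names are hypothetical; in the application itself, Astra DB performs the ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, candidates, limit=1):
    """Exact nearest neighbors by cosine similarity, best match first."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:limit]]


# Toy 3-dimensional vectors; the CLIP embeddings used here have 512 dimensions.
toy_vectors = {
    "obsidian.jpg": [0.9, 0.1, 0.0],
    "rose_quartz.jpg": [0.1, 0.9, 0.2],
    "amethyst.jpg": [0.2, 0.3, 0.9],
}
```

Cosine similarity depends only on the direction of the vectors, not their length, which is why two vectors that are scalar multiples of each other score 1.0.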

Next, we will define some constants to help our loader:

model = SentenceTransformer('clip-ViT-B-32')
IMAGE_DIR = "static/images/"
CSV = "gemstones_and_chakras.csv"

These lines instantiate the clip-ViT-B-32 model locally, define the location of our images, and name our data file, respectively.

Now let’s open the CSV file in a with block and initialize the data reader:

with open(CSV) as csvHandler:
    crystalData = csv.reader(csvHandler)
    # skip header row
    next(crystalData)

Our CSV file has a header row that we will skip at read-time. Calling Python’s built-in next() function on the reader consumes that first row, so the loop that follows starts with the first row of data.
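As an alternative to skipping the header by hand, csv.DictReader reads the header row itself and keys each row by column name. Below is a minimal sketch; the column names and sample rows are assumptions for illustration, since the actual header of gemstones_and_chakras.csv is not shown in this post.

```python
import csv
import io

# Stand-in for the real file; column names and values are made up.
SAMPLE_CSV = io.StringIO(
    "gemstone,image,chakras\n"
    'Black Obsidian,obsidian.jpg,"Root"\n'
    'Rose Quartz,rose_quartz.jpg,"Heart, Throat"\n'
)

# DictReader consumes the header row itself, so no explicit next() is needed.
rows = list(csv.DictReader(SAMPLE_CSV))

for row in rows:
    print(row["gemstone"], "->", row["image"])
```

The loader in this post keeps csv.reader and positional indexes; the named-column form is simply less sensitive to columns being reordered.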

With that complete, we can now use a for loop to work through the remaining lines in the file. We will first read the line’s image column. As our application is very image-centric, we do not want to spend time processing a line if it doesn’t have a valid image. We will use an if conditional to make sure that the image referenced by the image column is both:

  • not empty
  • a valid file that exists

    # still inside the "with open(CSV)" block
    for line in crystalData:

        image = line[1]
        # Only load crystals with images
        if image != "" and path.exists(IMAGE_DIR + image):
            # map columns
            gemstone = line[0]
            alt_name = line[2]
            chakras = line[3]
            phys_attributes = line[4]
            emot_attributes = line[5]
            meta_attributes = line[6]
            origin = line[7]
            description = line[8]
            birth_month = line[9]
            zodiac_sign = line[10]
            mohs_hardness = line[11]

If the line’s image is indeed valid, we will then map the remaining columns to local variables.
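The image check can be sketched as a small standalone function. This is a hypothetical helper (has_valid_image is not part of the project) and it differs slightly from the loader: it uses is_file(), which is stricter than path.exists() because it rejects directories.

```python
from pathlib import Path


def has_valid_image(image_name, image_dir):
    """True when the row names an image that exists as a file on disk."""
    # An empty name would otherwise resolve to the image directory itself.
    if not image_name:
        return False
    return (Path(image_dir) / image_name).is_file()
```

The explicit emptiness test matters: with an empty image name, the joined path is just the image directory, which does exist.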

Two of our variables, chakras and mohs_hardness, will require some extra processing before being written into Astra DB. Because a crystal can affect multiple chakras, our chakra data comes from the file as a comma-delimited list. We will need to reconstruct it into an array with each item wrapped in quotation marks, so that it is recognized as valid JSON. To do that, we will simply replace the commas with double-quoted commas:

            # reformat chakras to be more JSON-friendly
            chakras = chakras.replace(', ','","')

This will not make it valid JSON on its own, so we will account for that later when we write the chakra data.
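For comparison, here is a minimal sketch of another way to get from the comma-delimited string to a JSON array: split the string into a Python list and let json.dumps add the brackets and quotation marks. The parse_chakras helper is hypothetical and is not how the loader in this post does it.

```python
import json


def parse_chakras(raw):
    """Split a comma-delimited chakra string into a list of trimmed names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


chakra_list = parse_chakras("Heart, Throat, Crown")
# json.dumps adds the brackets and quotation marks for us.
chakra_json = json.dumps(chakra_list)
```

Working with a list also tolerates inconsistent spacing after the commas, which the string replacement does not.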

Every precious stone has a rating on the Mohs hardness scale, which indicates its resistance to scratching. While some crystals in our data set have a single value, several occupy a range on the scale (with the minimum listed first). We will split out these values and store them as mohs_min_hardness and mohs_max_hardness, respectively. Note that the mohs_hardness column sometimes has a value of “Variable” or “Varies,” so we will account for that possibility by falling back to default values of 1.0 and 9.0:

            # split out minimum and maximum Mohs hardness
            mh_list = mohs_hardness.split('-')
            mohs_min_hardness = 1.0
            mohs_max_hardness = 9.0
            if mh_list[0][0:4] != 'Vari':
                mohs_min_hardness = mh_list[0]
                mohs_max_hardness = mh_list[0]
                if len(mh_list) > 1:
                    mohs_max_hardness = mh_list[1]
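The same logic can be sketched as a standalone function, which makes it easier to test. Note that parse_mohs is a hypothetical helper, and unlike the loader code above it converts the values to floats so that both the defaults and the parsed values share one type.

```python
def parse_mohs(raw, default_min=1.0, default_max=9.0):
    """Return (min, max) Mohs hardness from values like "7" or "6.5-7"."""
    value = raw.strip()
    # "Variable" and "Varies" carry no numbers, so fall back to the defaults.
    if value.lower().startswith("vari"):
        return default_min, default_max
    parts = [float(part) for part in value.split("-")]
    # A single number serves as both the minimum and the maximum.
    return parts[0], parts[-1]
```

Any other non-numeric value raises a ValueError here, which may or may not be the behavior you want for a bulk load.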

With our data prepared, we can now build each crystal’s text and metadata properties:

            metadata = (f"gemstone: {gemstone}")

            text = (f"gemstone: {gemstone}| alternate name: {alt_name}| physical attributes: {phys_attributes}| emotional attributes: {emot_attributes}| metaphysical attributes: {meta_attributes}| origin: {origin}| maximum mohs hardness: {mohs_max_hardness}| minimum mohs hardness: {mohs_min_hardness}")

Next, we can load the crystal’s image using Pillow (Python’s image processing library) and generate a vector embedding for it with the encode() function from our CLIP model:

            img_emb = model.encode(Image.open(IMAGE_DIR + image))


With all that complete, we are ready to build our local JSON document as a string:

            strJson = (f' {{"_id":"{image}","text":"{text}","chakra":["{chakras}"],"birth_month":"{birth_month}","zodiac_sign":"{zodiac_sign}","$vector":{str(img_emb.tolist())}}}')

Finally, we can convert each crystal’s data to JSON and write it into Astra DB:

            doc = json.loads(strJson)
            col.insert_one(doc)
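Building JSON through string formatting works as long as none of the values contain characters that need escaping, such as a double quote in a description. A minimal sketch of an alternative is to assemble the document as a Python dict, which is the type that json.loads produces anyway. The build_document helper and the sample values are hypothetical; chakras is assumed to be the raw comma-delimited string, and vector a plain list of floats (for example, img_emb.tolist()).

```python
import json


def build_document(image, text, chakras, birth_month, zodiac_sign, vector):
    """Assemble one crystal document as a plain dict (hypothetical helper)."""
    return {
        "_id": image,
        "text": text,
        "chakra": [name.strip() for name in chakras.split(",")],
        "birth_month": birth_month,
        "zodiac_sign": zodiac_sign,
        "$vector": list(vector),
    }


doc = build_document(
    image="obsidian.jpg",
    text='gemstone: Black Obsidian| origin: "volcanic" glass',
    chakras="Root, Third Eye",
    birth_month="October",
    zodiac_sign="Libra",
    vector=[0.1, 0.2, 0.3],  # toy vector; CLIP embeddings have 512 values
)
# Quotation marks inside the text survive a JSON round trip unchanged.
round_trip = json.loads(json.dumps(doc))
```

A dict built this way could be passed to insert_one() in place of the parsed string.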

crystalSearch.py

To demonstrate the visual aspects of Crystal Search, we will stand up a simple web application using Flask. This interface will have a few simple components, including dropdowns (for navigation) and a way to upload an image for searching.

Note: As web front-end development is not the focus, we’ll skip the implementation details. For those who are interested, the code can be accessed in the project repository listed at the end of this post.

astraConn.py

Now that our data has been loaded, we can build the Crystal Search application. First, we will construct the astraConn module, which will act as an abstraction layer for our interactions with the Astra DB vector database. We will create a new file named astraConn.py and add the following two imports:

import os

from astrapy.db import AstraDB

Next, we will pull in our ASTRA_DB_APPLICATION_TOKEN and ASTRA_DB_API_ENDPOINT variables from our system environment, and store them in module-level variables:

ASTRA_DB_APPLICATION_TOKEN = os.environ.get("ASTRA_DB_APPLICATION_TOKEN")
ASTRA_DB_API_ENDPOINT = os.environ.get("ASTRA_DB_API_ENDPOINT")

This module will have a few different methods that will be called by our application, but we won’t want to rebuild our database connection each time. Therefore, we will create two global variables (db and collection) to keep data pertaining to our database cached:

db = None
collection = None

The first method that we will define is init_collection(). It is called by every other method in this module. It first declares db and collection as global, so that assignments inside the method update the module-level variables. Its primary job is to instantiate the db object if it is still None. This way, an existing connection object can be reused. The code for this method is shown below:

def init_collection(table_name):
    global db
    global collection

    if db is None:
        db = AstraDB(
            token=ASTRA_DB_APPLICATION_TOKEN,
            api_endpoint=ASTRA_DB_API_ENDPOINT,
        )

    collection = db.collection(table_name)

Note that the collection variable will be instantiated on every call. This allows us the flexibility to access different collections in Astra DB with the same database connection information.
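The caching behavior described above can be illustrated without a database. The sketch below uses a stand-in FakeClient class (hypothetical, invented here so that the example runs on its own) to show that the client is created once while a collection handle is produced on every call.

```python
class FakeClient:
    """Stand-in for a database client so the sketch runs without Astra DB."""

    instances = 0  # counts how many clients were ever created

    def __init__(self):
        FakeClient.instances += 1

    def collection(self, name):
        return f"collection:{name}"


_cache = {}  # holds the client after its first use


def get_collection(name, client_factory=FakeClient):
    """Create the client once, then hand out a collection handle per call."""
    if "client" not in _cache:
        _cache["client"] = client_factory()
    return _cache["client"].collection(name)
```

Calling get_collection() twice with different names returns two different handles, while FakeClient.instances stays at 1.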

For our application, there are three ways that we will perform reads on our data. We will search by vector, query by id, and then query by three additional properties that we are going to build into dropdowns in our web application.

First, we will build the get_by_vector() method. This asynchronous method accepts a collection name, a vector embedding, and a maximum number of results to return (limit, defaulting to 1). After initializing our database and collection, we invoke the vector_find() method with the vector embedding, the limit, and the fields from the collection that we want to receive. We then return the results to the calling method.

async def get_by_vector(collection_name, vector_embedding, limit=1):
    init_collection(collection_name)

    results = collection.vector_find(
        vector_embedding.tolist(),
        limit=limit,
        fields=["text", "chakra", "birth_month", "zodiac_sign", "$vector"],
    )
    return results

Our get_by_id() method looks similar to the previous one, but works quite differently under the hood. This method is also meant to be called asynchronously, and accepts a collection name as well as the identifier to be queried. As querying by a unique identifier is deterministic, we can invoke the find_one() method with a filter for the specific id, as shown below:

async def get_by_id(collection_name, id):
    init_collection(collection_name)

    result = collection.find_one(filter={"_id": id})
    return result

This method will return a single JSON document as the result.
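One design point worth noting: declaring a method with async def does not by itself make a blocking call inside it non-blocking. If the database call is synchronous (as the calls above appear to be), one option is to hand it to a worker thread with asyncio.to_thread, available from Python 3.9. The sketch below is an assumption-laden illustration, not the project's code; blocking_find_one is a hypothetical stand-in for a synchronous lookup.

```python
import asyncio
import time


def blocking_find_one(doc_id):
    """Hypothetical stand-in for a synchronous database lookup."""
    time.sleep(0.01)  # simulate network latency
    return {"_id": doc_id}


async def get_by_id_in_thread(doc_id):
    # to_thread runs the blocking call in a worker thread, so the
    # event loop stays free while the lookup is in flight.
    return await asyncio.to_thread(blocking_find_one, doc_id)


result = asyncio.run(get_by_id_in_thread("obsidian.jpg"))
```

For a small demo application the difference is unlikely to matter, but it can become noticeable when several requests are served concurrently.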

Finally, get_by_dropdowns() is an asynchronous method that returns all matching documents based on the values of three properties: chakras, birth month, and zodiac sign. First, we build an array to hold our conditions. This is necessary because not every dropdown will be used each time, so we build our conditions dynamically based on the state of the dropdowns at query time.

async def get_by_dropdowns(collection_name, chakra, birth_month, zodiac_sign):

    init_collection(collection_name)

    conditions = []

    if chakra != "--Chakra--":
        condition_chakra = {"chakra": {"$in": [chakra]}}
        conditions.append(condition_chakra)

    if birth_month != "--Birth Month--":
        condition_birth_month = {"birth_month": birth_month}
        conditions.append(condition_birth_month)

    if zodiac_sign != "--Zodiac Sign--":
        condition_zodiac_sign = {"zodiac_sign": zodiac_sign}
        conditions.append(condition_zodiac_sign)

    crystal_filter = ""

    if len(conditions) > 2:
        crystal_filter = {"$and": [{"$and": [conditions[0], conditions[1]]}, conditions[2]]}
    elif len(conditions) > 1:
        crystal_filter = {"$and": [conditions[0], conditions[1]]}
    elif len(conditions) > 0:
        crystal_filter = conditions[0]
    else:
        return 

    results = collection.find(crystal_filter)
    return results

Once the conditions array is built, we can then build crystal_filter to use as our query filter. To pass a filter with multiple conditions through Astra DB’s Data API, we need to build a nested conditional statement.

A single condition could be sent as a filter on its own. But two would need to use the $and operator. If we were to hard-code our filter, it would be similar to this example:

crystal_filter = {"$and": [{"birth_month": "October"}, {"zodiac_sign": "Libra"}]}

Of course, this also means that three conditions would require a nested $and (one $and inside of another), like this:

crystal_filter = {"$and": [{"$and": [{"birth_month": "October"}, {"zodiac_sign": "Libra"}]}, {"chakra": {"$in": ["Heart"]}}]}

Note that as each crystal’s chakra property is an array, we need to use the $in operator.
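The nesting pattern described above generalizes to any number of conditions. As a minimal sketch, functools.reduce can fold the conditions list into the same nested $and structure; build_filter is a hypothetical helper and is not part of astraConn.py.

```python
from functools import reduce


def build_filter(conditions):
    """Fold a list of conditions into nested $and clauses; None when empty."""
    if not conditions:
        return None
    # One condition is returned as-is; each further condition wraps the
    # result so far in another two-element $and.
    return reduce(lambda nested, cond: {"$and": [nested, cond]}, conditions)


example = build_filter([
    {"birth_month": "October"},
    {"zodiac_sign": "Libra"},
    {"chakra": {"$in": ["Heart"]}},
])
```

With the three conditions shown, example has the same shape as the hard-coded three-condition filter above.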

crystalServices.py

Next, we will create a new file named crystalServices.py with the following imports:

import json
import os

from astraConn import get_by_vector
from astraConn import get_by_id
from astraConn import get_by_dropdowns
from sentence_transformers import SentenceTransformer
from PIL import Image


We will also define some local variables for our image directory, the name of our collection in Astra DB, and our CLIP model:

INPUT_IMAGE_DIR = "static/input_images/"
DATA_COLLECTION_NAME = "crystal_data"
model = None

Our service layer will expose two asynchronous methods. The first, get_crystals_by_image(), accepts an image filename as a parameter. It is primarily responsible for generating a vector embedding from an image, using the embedding to invoke a vector similarity search, and returning the results to the view. This method needs the model global variable, and instantiates it if required:

async def get_crystals_by_image(file_path):
    global model

    if model is None:
        model = SentenceTransformer('clip-ViT-B-32')

Next, we will define our result set variable as an empty dictionary. Then we will load the image, generate an embedding for it, and use it to call the get_by_vector() method from astraConn.py:

    results = {}        
    img_emb = model.encode(Image.open(INPUT_IMAGE_DIR + file_path))
    crystal_data = await get_by_vector(DATA_COLLECTION_NAME, img_emb, 3)

    if crystal_data is not None:
        for crystal in crystal_data:
            id = crystal['_id']
            results[id] = parse_crystal_data(crystal)

    return results

Finally, we process and return the vector search results. Note that the parse_crystal_data() method does much of the heavy lifting of building the result set. We will construct that method toward the end of this module.

We will now move on to the get_crystals_by_facets() method. This method accepts the values taken from three dropdown lists containing data for chakras, birth month, and zodiac sign. Similar to the prior method, we will define an empty dictionary for the results and perform a query on our data, before processing and returning the results:

async def get_crystals_by_facets(chakra, birth_month, zodiac_sign):
    results = {}
    crystal_data = await get_by_dropdowns(DATA_COLLECTION_NAME, chakra, birth_month, zodiac_sign)

    if crystal_data is not None:
        for crystal in crystal_data['data']['documents']:
            id = crystal['_id']
            results[id] = parse_crystal_data(crystal)

    return results

There are also two additional code blocks required to more easily transfer our data back up to the view layer. The first is the parse_crystal_data() method. This method is fairly straightforward in that it takes the raw crystal data as a parameter, and processes each property into a new object of the Crystal class. The second is the Crystal class itself. Neither is shown here, but both definitions can be found at the end of the crystalServices.py module.
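To give a feel for what such a pair might look like, here is a purely illustrative sketch. It is not the repository's implementation: the field names of this Crystal dataclass are assumptions, and the parsing relies only on the “key: value| key: value” layout of the text field built by the loader earlier in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Crystal:
    """Hypothetical view object; the repository's Crystal class may differ."""
    image: str
    name: str = ""
    chakras: list = field(default_factory=list)
    birth_month: str = ""
    zodiac_sign: str = ""
    properties: dict = field(default_factory=dict)


def parse_crystal_data(doc):
    """Turn one stored document into a Crystal (illustrative only)."""
    properties = {}
    # The loader joins "key: value" pairs with "| " in the text field.
    for pair in doc.get("text", "").split("| "):
        key, sep, value = pair.partition(": ")
        if sep:
            properties[key.strip()] = value.strip()
    return Crystal(
        image=doc["_id"],
        name=properties.get("gemstone", ""),
        chakras=doc.get("chakra", []),
        birth_month=doc.get("birth_month", ""),
        zodiac_sign=doc.get("zodiac_sign", ""),
        properties=properties,
    )
```

Keeping the parsed key/value pairs in a dictionary means that new attributes added to the text field would be carried through without changing the class.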

Demo

Let’s see this in action. We will run the application with Flask. The complete code listed above (including all of the front-end components) can be found in the project’s GitHub repository (https://github.com/aar0np/crystalSearch).

To run the application, we will use the following command:

flask run -p 8080

If it starts correctly, Flask should display the application name, address and port that it is bound to:

 * Serving Flask app 'crystalSearch'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:8080

Press CTRL+C to quit

If we navigate to that address in a browser, we should see a simple web page with a search interface at the top, and three differently-colored dropdowns in the left navigation. If we select values for the dropdowns and click on the “Find Crystals” button, we should see crystals matching those values returned (Figure 2).

Figure 2 - Results for crystals matching the dropdown values where chakra is “Heart”, birth month is “October,” and zodiac sign is “Libra.”

Of course, we can also search with an image. Perhaps we have a picture of a crystal that we cannot identify. We can click on the “Choose File” button, select our image, and then click “Search” to see what the closest matches are. If our picture is of a black obsidian crystal, we will see results similar to Figure 3.

Figure 3 - Results for crystals matching our image of a black obsidian crystal.

Conclusion

In this article, we have demonstrated another possible use case for an image-based search built with RAGStack and Astra DB. We walked through this unique use case: how to configure the development environment, load and query data using CLIP, and build an application that leverages image-based vector embeddings. We also showed how to use the Astra DB Data API to implement a simple product faceting approach using dropdowns.

As the world continues to embrace GenAI, we will surely see more and more creative use cases spanning multiple industries. Searching by images using CLIP is one of the ways in which we are pushing the boundaries of conventional data applications. With solutions like RAGStack and Astra DB, DataStax continues to help you build the next generation of applications.

Do you have an idea for a great use of GenAI? Pull down RAGStack and start using Astra DB with a free account today!
