Sanjay Ghosh
From Image to Vector: Building Image Similarity Search with Python and MySQL

Modern applications increasingly rely on vector embeddings to search and compare data such as text, images, and audio.
For example:

  • finding visually similar images
  • semantic document search
  • recommendation systems

In this article we will:

  • Convert an image to a vector embedding
  • Store the vector in MySQL
  • Compare images using vector similarity

Stack used:

  • Python
  • sentence-transformers
  • PyTorch
  • MySQL vector support

Final Result

After storing the vectors in MySQL we can run a similarity query.
Example output:
+------------+------------+
| image_name | similarity |
+------------+------------+
| img/t3.jpg |       1.00 |
| img/t2.jpg |       0.96 |
| img/t1.jpg |       0.96 |
| img/t4.jpg |       0.92 |
+------------+------------+

System Architecture

The overall pipeline is simple:

  • An image is processed by a multimodal embedding model
  • The model converts the image into a numerical vector
  • The vector is stored in MySQL
  • SQL queries compute similarity between vectors

AI → ML → DL → Neural Networks → LLM

The embedding model in this pipeline sits at the end of a long chain of ideas. Let’s briefly explain each level.

Artificial Intelligence (AI)

Artificial Intelligence refers to systems designed to simulate aspects of human intelligence, such as:

  • learning
  • reasoning
  • decision making
  • problem solving

Early AI systems (often called Good Old-Fashioned AI, or GOFAI) did not involve learning. They relied on predefined rules.

Example:
IF humidity > 60% THEN alert
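Such a rule is just a hard-coded conditional; nothing is learned from data. A minimal Python sketch (the threshold and the `check_humidity` name are illustrative, not from any real system):

```python
def check_humidity(humidity: float) -> str:
    # A fixed, hand-written rule: no training data involved.
    if humidity > 60:
        return "alert"
    return "ok"

print(check_humidity(72.5))  # → alert
```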

Machine Learning (ML)

Machine Learning is a subset of AI where systems learn patterns from data rather than being explicitly programmed.

There are three main types of machine learning:

Supervised Learning

Uses labeled data, where each input has a known correct output.
Example: Predicting house prices.
Inputs: house size, location
Output: predicted price

Unsupervised Learning

The data is not labeled. The system tries to discover patterns or structure in the data on its own.
Example:
Customer purchase behavior analysis.
For instance, the system might detect that customers who buy Product P1 often also buy Product P2.

Reinforcement Learning

Works through interaction with an environment using a reward/penalty mechanism.
The system learns through trial and error, continuously improving its decisions based on feedback.
An agent interacts with the environment repeatedly to achieve the optimal outcome.

Deep Learning (DL)

Deep Learning is a subset of Machine Learning.
It uses multi-layer neural networks to learn complex patterns in large datasets.
Deep learning is especially powerful for:

  • images
  • speech
  • language processing

Neural Networks (NN)

Neural Networks are the core architecture used in deep learning.
They consist of nodes (neurons) organized into layers:

  • Input layer
  • Hidden layers
  • Output layer

Each neuron performs a mathematical operation based on the following equation:
y = Σ (xi * wi) + b
Where:

  • xi = input value
  • wi = weight (importance of the input)
  • b = bias (a learnable scalar parameter)
  • y = output

Data flows through the network in a feed-forward manner, and the network learns by adjusting weights during training.
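The feed-forward computation for a single neuron can be sketched in a few lines of plain Python (toy inputs, weights, and bias chosen purely for illustration):

```python
def neuron_output(inputs, weights, bias):
    # y = sum(x_i * w_i) + b  -- the weighted sum of inputs, plus bias
    return sum(x * w for x, w in zip(inputs, weights)) + bias

xs = [0.5, -1.0, 2.0]   # input values x_i
ws = [0.4, 0.3, 0.1]    # weights w_i
b = 0.25                # bias b
y = neuron_output(xs, ws, b)
print(y)  # 0.5*0.4 + (-1.0)*0.3 + 2.0*0.1 + 0.25 ≈ 0.35
```

During training, the learning algorithm adjusts `ws` and `b` so that `y` moves closer to the desired output.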

Types of Neural Networks

  1. Artificial Neural Networks (ANN)
    Basic layered neural networks where information flows in one direction.

  2. Convolutional Neural Networks (CNN)
    Designed for image data. CNNs extract spatial features such as:
    edges
    textures
    shapes

  3. Recurrent Neural Networks (RNN)
    These networks contain loops and can remember previous inputs. They are commonly used for sequence data such as text or speech.

Large Language Models (LLM)

Large Language Models are a type of deep learning model trained on massive text datasets.

They are capable of:

  • understanding language
  • generating human-like text
  • producing embeddings for semantic similarity

Process Overview with a Practical Example

  1. Convert an image to a vector embedding
  2. Store the vector in MySQL (version 9.0 or later)
  3. Run queries to compare vectors
  4. Identify which images are most similar

Environment Setup

Create a Python Virtual Environment

Install the virtual environment package:
sudo apt install python3.12-venv

Create the environment:
python3 -m venv venv

Activate it
Linux/macOS
source venv/bin/activate

Windows
venv\Scripts\activate

Install Required Libraries

pip install sentence-transformers pillow torch

Create the MySQL Table

Create a table with a vector column (createtable.sql).

    /* Creates the table with a VECTOR column (tested on MySQL 9.x).
     * The vector size is set to 512, matching the output dimension of
     * the CLIP model (clip-ViT-B-32) used below, so each INSERT
     * statement carries 512 floating-point numbers. */
    CREATE TABLE image_embeddings (
        id INT PRIMARY KEY AUTO_INCREMENT,
        image_name VARCHAR(255),
        embedding VECTOR(512) -- Example for a 512-dimensional float vector
    );

Python Script: Convert Image to Vector

Write a Python script to convert images into vectors.
(transformer_image-to-vector.py)

import glob
import sys
import os

# Parse command-line arguments before loading the (heavy) model.
if len(sys.argv) > 2:
    imageDir = sys.argv[1]
    sqlFilePath = sys.argv[2]
    # Start fresh: remove any previously generated SQL file.
    if os.path.exists(sqlFilePath):
        os.remove(sqlFilePath)
else:
    print("Usage: python3 transformer_image-to-vector.py <image_directory> <output_sql_file>")
    sys.exit(1)

from sentence_transformers import SentenceTransformer
from PIL import Image
import json


# 1. Load the CLIP model (a modern successor to imgbeddings)
# The model weights (about 600 MB) are downloaded on the first run.
model = SentenceTransformer('clip-ViT-B-32')

# Generate one INSERT statement for an image and append it to the output file.
def get_sql_insert(image_path, file_path_to_write):
    try:
        print("Will write the image {} to file: {}".format(image_path, file_path_to_write))
        # 2. Load and process the image
        img = Image.open(image_path)

        # 3. Generate the vector (embedding)
        # We use .tolist() to turn the math array into a Python list
        embedding = model.encode(img).tolist()

        # 4. Format the list as a SQL-friendly string: [0.1, 0.2, ...]
        vector_string = json.dumps(embedding)

        # 5. Create the SQL command
        sql = "-- This SQL was generated from the image using the model SentenceTransformer('clip-ViT-B-32')\n"
        sql += f"INSERT INTO image_embeddings (image_name, embedding) VALUES ('{image_path}', STRING_TO_VECTOR('{vector_string}'));\n"
        with open(file_path_to_write, "a") as file:
            file.write(sql)
        return "Successfully wrote the image {} to file: {}".format(image_path, file_path_to_write)

    except Exception as e:
        return f"Error: {e}"

# Take all .jpg files from the image directory (first argument)
# and append the generated SQL to the output file (second argument).
for lst in glob.glob(f"{imageDir}/*.jpg"):
    print(get_sql_insert(lst, sqlFilePath))

Usage:
python3 transformer_image-to-vector.py <image_directory> <output_sql_file>
The script generates SQL statements that insert vectors into the MySQL table. Run the generated SQL file against the database to store the vectors in MySQL.
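The key trick in the script is that `json.dumps` renders a Python list of floats exactly as the bracketed text `STRING_TO_VECTOR()` expects. A minimal round-trip check with a toy 4-dimensional vector (real CLIP embeddings have 512 dimensions):

```python
import json

embedding = [0.1, -0.25, 0.5, 1.0]     # toy vector; a real embedding would come from model.encode(img).tolist()
vector_string = json.dumps(embedding)  # '[0.1, -0.25, 0.5, 1.0]'
sql = (
    "INSERT INTO image_embeddings (image_name, embedding) "
    f"VALUES ('img/example.jpg', STRING_TO_VECTOR('{vector_string}'));"
)
print(sql)
```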

Comparing Image Vectors

Once the vectors are stored, we can compare them using vector similarity.
At the time of writing, some MySQL vector functions are still evolving.
For example, direct usage of VECTOR_DISTANCE() may have limitations in certain versions.
So we implement custom helper functions instead.

Custom MySQL Functions

Two helper functions are created:

my_dot_product(vector, vector)

DELIMITER //

CREATE FUNCTION my_dot_product(v1 VECTOR, v2 VECTOR)
RETURNS FLOAT DETERMINISTIC
BEGIN
    DECLARE total FLOAT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE dim INT;
    -- Convert Vector to String, then to JSON
    DECLARE j1 JSON DEFAULT CAST(VECTOR_TO_STRING(v1) AS JSON);
    DECLARE j2 JSON DEFAULT CAST(VECTOR_TO_STRING(v2) AS JSON);

    SET dim = JSON_LENGTH(j1);

    WHILE i < dim DO
        SET total = total + (JSON_EXTRACT(j1, CONCAT('$[', i, ']')) * JSON_EXTRACT(j2, CONCAT('$[', i, ']')));
        SET i = i + 1;
    END WHILE;

    RETURN total;
END //

DELIMITER ;

my_vec_norm(vector)

DELIMITER //

CREATE FUNCTION my_vec_norm(v VECTOR)
RETURNS FLOAT DETERMINISTIC
BEGIN
    DECLARE total FLOAT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE dim INT;
    DECLARE j JSON DEFAULT CAST(VECTOR_TO_STRING(v) AS JSON);
    DECLARE val FLOAT;

    SET dim = JSON_LENGTH(j);

    WHILE i < dim DO
        SET val = JSON_EXTRACT(j, CONCAT('$[', i, ']'));
        SET total = total + (val * val);
        SET i = i + 1;
    END WHILE;

    RETURN SQRT(total);
END //

DELIMITER ;

These functions allow us to compute cosine similarity between vectors.
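The same cosine-similarity computation the two SQL functions perform can be sketched in plain Python, which is handy for sanity-checking query results outside the database (toy vectors for illustration):

```python
import math

def dot(v1, v2):
    # Mirrors my_dot_product: sum of element-wise products
    return sum(a * b for a, b in zip(v1, v2))

def norm(v):
    # Mirrors my_vec_norm: Euclidean (L2) norm
    return math.sqrt(sum(a * a for a in v))

def cosine_similarity(v1, v2):
    # Dot product divided by the product of the norms
    return dot(v1, v2) / (norm(v1) * norm(v2))

a = [1.0, 2.0, 3.0]
print(cosine_similarity(a, a))                    # identical vectors → ≈ 1.0
print(cosine_similarity(a, [-1.0, -2.0, -3.0]))   # opposite vectors  → ≈ -1.0
```

Values close to 1.0 mean the vectors point in nearly the same direction, which for image embeddings translates to visual similarity.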

Querying Image Similarity

Now we run a query to compare stored images against a target image. The @target literal below is the 512-dimensional embedding of the query image, produced by the same Python script.

SET @target = STRING_TO_VECTOR('[0.07557862997055054, 0.3899604380130768, -0.149032324552536, 0.06481198966503143, 0.17365998029708862, 0.1696033775806427, -0.009633079171180725, 0.026114463806152344, 0.13283401727676392, -0.048670023679733276, 0.2853511571884155, -0.47713860869407654, 0.3193468451499939, -0.4782599210739136, 0.19979414343833923, -0.4020282030105591, -0.17893710732460022, -0.16028517484664917, 0.11166025698184967, -0.4441247880458832, 0.0721852108836174, -0.015377134084701538, 0.06420326232910156, -0.5317407846450806, 0.011108562350273132, 0.7078717350959778, -0.15541546046733856, 0.06636792421340942, -0.1235189437866211, 0.17564985156059265, -0.2749885618686676, -0.021165557205677032, 0.24020248651504517, 0.4338991641998291, 0.3174438774585724, -0.2427300065755844, -0.15270879864692688, -0.13448885083198547, -0.03258641064167023, 1.4516336917877197, 0.2088334858417511, -0.07482234388589859, -0.060038354247808456, -0.09696637094020844, 0.2288440763950348, -0.45773571729660034, -0.442022442817688, -0.1333482265472412, 0.20572619140148163, 0.022368118166923523, 0.2268296778202057, 0.0717138946056366, -0.18176983296871185, -0.08150641620159149, -0.3742990493774414, -0.08863221853971481, 0.24395513534545898, 0.14680302143096924, 0.45115935802459717, 0.07179692387580872, 0.13299871981143951, -0.6536737680435181, -0.10842005163431168, 0.46789559721946716, -0.09359163045883179, -0.029734283685684204, -0.4210904538631439, 0.44143062829971313, -0.2495618760585785, 0.16764293611049652, -0.0010016188025474548, 0.12750700116157532, -0.09728701412677765, -0.33462393283843994, -0.19563797116279602, 0.38896551728248596, -0.07021686434745789, 0.05845819413661957, -0.2159923017024994, 0.22142955660820007, 0.08275012671947479, 0.08141633123159409, 0.011051833629608154, -0.0470392107963562, 0.2281772345304489, 0.1281975507736206, -0.2673131823539734, 0.09869228303432465, -0.46927496790885925, 0.5890671014785767, -0.26286429166793823, 0.029046714305877686, 
-6.433596611022949, 0.8787387609481812, -0.3467825651168823, 0.3790345788002014, 0.04438466578722, 0.651203453540802, -0.34149351716041565, -0.8458610773086548, 0.034017741680145264, -0.2053813338279724, 0.35773321986198425, 0.04032140225172043, 0.30020737648010254, -0.20292143523693085, -0.6659493446350098, -0.14945490658283234, -0.08128952980041504, 0.037803277373313904, -0.24363872408866882, -0.5452983975410461, -0.15123391151428223, 0.06196553260087967, 0.0037244856357574463, 0.37201881408691406, 0.3830801248550415, 0.038900259882211685, 0.5477883815765381, -0.2165623903274536, -0.12358544021844864, 0.6665405035018921, 0.08692356944084167, -0.7012655138969421, -0.2776182293891907, -0.10287892073392868, -0.16323958337306976, 0.29088079929351807, 0.17843754589557648, -0.041389890015125275, -0.206215500831604, 0.8713011741638184, 0.10298433899879456, 0.8146358132362366, 0.5424628257751465, 0.2975153625011444, -0.013563252985477448, 0.13421256840229034, -0.3498159945011139, -0.3292694687843323, -0.2611740529537201, 0.026821553707122803, -0.4656805992126465, 0.733379602432251, 0.09556902945041656, 0.031042248010635376, 0.3466207981109619, 0.470894992351532, -0.24531006813049316, 0.09481766074895859, -0.07527203112840652, -0.3095394968986511, 0.6393522024154663, -0.09749913960695267, -0.11673454940319061, -0.17871969938278198, -0.1564812809228897, 0.2511930465698242, -0.13378417491912842, 0.17615115642547607, -0.06127139925956726, -0.04496309161186218, -0.35364609956741333, -0.08319900929927826, 0.14410080015659332, -0.2627147138118744, 0.9161733388900757, 0.37756675481796265, 0.1632954180240631, 0.0905647948384285, -0.10490548610687256, 0.33364954590797424, -0.05363786220550537, -0.6529805064201355, 0.26767054200172424, 0.26338034868240356, -1.0595732927322388, -0.22944216430187225, -0.26375362277030945, -0.1988820880651474, -0.12509888410568237, 0.24070680141448975, -0.2588941156864166, -0.33498746156692505, 0.126882404088974, 0.17642360925674438, 
0.040883928537368774, -0.0052263736724853516, -0.16343313455581665, 0.0562138706445694, 0.2096858024597168, 0.09584870934486389, -0.11106163263320923, 0.1729261875152588, 0.20461152493953705, 0.12263501435518265, 0.27892982959747314, -0.6255460381507874, -0.764202892780304, 0.14997950196266174, 0.16245627403259277, -0.37481117248535156, -0.06923999637365341, 0.6191018223762512, -0.2948871850967407, 0.261602520942688, -0.04764917492866516, -0.15079966187477112, -0.5841816067695618, 0.21257054805755615, -0.21506810188293457, -0.04135539382696152, 0.5817644596099854, 0.12262402474880219, 0.42174848914146423, 0.1265665739774704, 0.01666194200515747, 0.13379910588264465, -0.5832018256187439, -0.3252071738243103, 0.22216439247131348, 0.4046194851398468, 0.20799392461776733, -0.07600507140159607, -0.3153631389141083, -0.10026045888662338, 0.2151670753955841, -0.15998244285583496, -0.498496949672699, -0.04490187019109726, -0.3262564539909363, 0.2954603433609009, -0.07442381978034973, 0.24110326170921326, 0.17248889803886414, 0.45136961340904236, 0.13689523935317993, -0.4876601994037628, 0.050953567028045654, -0.5781422853469849, 0.39940330386161804, 0.08282990008592606, 0.12615060806274414, -0.29397282004356384, -0.1800389140844345, 0.6491589546203613, -0.46026864647865295, 0.35799282789230347, -0.2924331724643707, -0.43013110756874084, -0.3808341324329376, 0.55094313621521, 0.03232079744338989, -0.462185800075531, -0.313692569732666, 0.07338142395019531, 0.16393449902534485, -0.039127856492996216, 0.6098682880401611, -0.16981320083141327, -0.35463374853134155, -0.4267345666885376, -0.09102936089038849, 1.8269031047821045, 0.20107018947601318, 0.15080709755420685, 0.3966326117515564, 0.14836283028125763, 0.43229472637176514, -0.12651363015174866, 0.008955806493759155, -0.02638685703277588, 0.13089591264724731, 0.07453787326812744, -0.26478222012519836, -0.22489425539970398, 0.24618230760097504, 0.17845292389392853, -0.0707186758518219, -0.062373995780944824, 
-0.06923562288284302, 0.47168073058128357, -0.07979298382997513, -0.07240656018257141, 0.20423418283462524, 0.4307517111301422, -0.9086780548095703, 0.40964275598526, -0.23126669228076935, -0.28451400995254517, -0.020541366189718246, 0.2600404620170593, -0.10833749175071716, 0.036447957158088684, -0.36018478870391846, -0.12711986899375916, 0.2606499195098877, 0.2881711423397064, 0.7029435634613037, -0.1414911448955536, -0.0848516896367073, 0.3585105538368225, -0.04511311650276184, 0.24639412760734558, 0.4116198420524597, -0.4193349778652191, 0.5370025038719177, 0.4218825697898865, 0.03302404284477234, 0.04051119089126587, -0.4930671155452728, 0.12944979965686798, 0.8145080804824829, -0.49908819794654846, 0.19380804896354675, 0.09682479500770569, 0.4217352271080017, -0.5215371251106262, -0.06526830792427063, -0.26663315296173096, 0.2522737979888916, 1.213325023651123, 0.0012408196926116943, -0.2426946759223938, -0.28965428471565247, -0.3634025752544403, -0.13846488296985626, -0.3301095962524414, -0.12310884147882462, -0.03835451602935791, -0.7434549331665039, 0.24088844656944275, -0.17097029089927673, -0.13022029399871826, -0.007050663232803345, 0.06718186289072037, 0.45522329211235046, 0.10973918437957764, 0.012622185051441193, 0.16777338087558746, -0.05771467462182045, 0.19996578991413116, -0.24137984216213226, 0.24456733465194702, -0.11382054537534714, 0.010699808597564697, 0.19115638732910156, -0.0028515905141830444, -0.09564647823572159, 0.09341803193092346, 0.2944413125514984, 0.23297476768493652, -0.17356669902801514, 0.39457371830940247, -0.31496915221214294, 0.23409241437911987, -0.12645113468170166, -0.6129994988441467, -0.15042293071746826, -0.26337844133377075, 0.3586566150188446, 0.015362411737442017, -0.1516532152891159, 0.22309523820877075, -1.4553773403167725, 0.622140645980835, -0.5070102214813232, -0.7885435223579407, 0.1319749355316162, -0.446965754032135, -0.16092371940612793, 0.3529052436351776, 0.007851272821426392, 0.09744635224342346, 
0.20603105425834656, -0.08496123552322388, 0.3391745984554291, 0.0328497439622879, 1.0089284181594849, 0.3865002393722534, 0.8738011121749878, 0.06040278077125549, -0.015890859067440033, -0.21144962310791016, 0.5995208621025085, 0.32733339071273804, 0.9662830829620361, 0.3483368158340454, 0.08898067474365234, 0.887071430683136, 0.3696739375591278, 0.19864115118980408, 0.03314075618982315, 0.03536325320601463, -0.12863528728485107, 0.06904687732458115, -0.24006515741348267, -0.012378469109535217, -0.5082074403762817, -0.7163709402084351, -0.1123851016163826, 0.010127410292625427, 0.4355129599571228, -0.6237277984619141, 0.2890245020389557, 0.5209721326828003, -0.15595661103725433, 0.08429825305938721, 0.09569454938173294, -0.23599457740783691, -0.16896066069602966, -0.004924099892377853, 0.6319135427474976, 0.4765503704547882, 0.01589702069759369, -0.05348372459411621, 0.055996619164943695, -0.5671727657318115, 0.15062867105007172, -0.5092383623123169, -0.058769792318344116, 0.09755947440862656, 0.2507498860359192, 0.08485430479049683, 0.0793701708316803, 0.021087057888507843, -0.5461320281028748, 0.304355651140213, -0.7253096699714661, 0.6294398307800293, -0.3105747103691101, 0.3905089199542999, 0.2852843105792999, -0.07321953773498535, 0.023653872311115265, -0.37693509459495544, 0.2675110697746277, 0.04210440814495087, 0.09981885552406311, -0.20424491167068481, 0.3030926287174225, 0.026748113334178925, 0.5781193375587463, 0.00048617273569107056, -0.1131657287478447, 0.2036598026752472, -0.2140207141637802, -0.011482134461402893, -0.2234390377998352, -0.43609818816185, -0.504375696182251, -0.01408250629901886, -0.06268232315778732, -0.3341209888458252, 0.29185229539871216, -0.2411029040813446, 0.5522151589393616, -0.06291865557432175, -0.3783164620399475, -0.2328968197107315, -0.3904339075088501, -0.11180463433265686, -0.16056767106056213, -0.16029603779315948, -0.2223338484764099, 0.33838483691215515, -0.23742935061454773, 0.12314961105585098, -0.5309375524520874, 
-0.07631200551986694, 0.35636740922927856, 0.17236372828483582, 0.19029422104358673, -0.1397394984960556, 0.5870028734207153, -0.4097609221935272, 0.15144716203212738, 0.003988280892372131, -0.05234433710575104, -0.2421717643737793, -0.006953790783882141, -0.5565206408500671, 0.06222153455018997, -0.07391542196273804, 0.40786826610565186, -0.04986128211021423, -0.285273939371109, -0.1444988250732422, 0.024988561868667603, -0.3689330816268921, -0.40285810828208923, -0.10759894549846649, -0.0395619198679924, 0.01536903902888298, 0.0707881823182106, 0.22722786664962769, 0.2506854236125946, 0.023645572364330292, -0.14591404795646667, 0.09581179171800613, -0.12541484832763672, 0.03997009992599487, -0.02025282382965088, -0.23364081978797913, 0.3128877580165863, 0.12357562780380249, -0.1596986949443817, 0.22576884925365448, -0.14733293652534485, 0.24771124124526978, 0.0026986300945281982, 0.5146662592887878]');

SELECT image_name,
       (my_dot_product(embedding, @target) / (my_vec_norm(embedding) * my_vec_norm(@target))) AS similarity
FROM image_embeddings
ORDER BY similarity DESC
LIMIT 5;

Example output:
+------------+--------------------+
| image_name | similarity         |
+------------+--------------------+
| img/t3.jpg | 1.0000000040388952 |
| img/t2.jpg | 0.9667246382795552 |
| img/t1.jpg | 0.9642200115094365 |
| img/t4.jpg | 0.9213785475357513 |
+------------+--------------------+

In this example, the target image is most similar to t3.jpg (the value marginally above 1.0 is floating-point rounding in the FLOAT arithmetic).

Source Code

All source files are available on GitHub:
GIT Source Codes

Role of the Libraries Used

NumPy

Pulled in as a dependency of the libraries above, NumPy handles:

  • arrays
  • matrix operations
  • linear algebra

PyTorch

Used for:

  • building neural networks
  • defining layers (Linear, CNN, Transformer)
  • training models
  • GPU acceleration
  • backpropagation

sentence-transformers

This library converts text or images into dense vector embeddings using deep neural networks.
Internally the process is:
Input → Transformer Model → Embedding Vector

Pillow

Pillow is a Python library for image processing.
Before an image is passed to a neural network, it usually undergoes preprocessing such as:

  • loading
  • resizing
  • cropping
  • normalization
  • converting to tensors

Pillow helps perform these operations.
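A minimal sketch of such preprocessing with Pillow (the 224×224 target size is typical for CLIP-style models; sentence-transformers performs the real preprocessing internally, so this is only illustrative):

```python
from PIL import Image

# Create a small test image in memory (a stand-in for Image.open("img/t1.jpg"))
img = Image.new("RGB", (640, 480), color=(120, 30, 200))

# Resize to the model's expected input resolution
resized = img.resize((224, 224))
print(resized.size, resized.mode)  # (224, 224) RGB
```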

Real-World Use Cases

Images stored as vectors are useful in many real-world applications:

  • Image search: find visually similar images
  • E-commerce: find visually similar products
  • Duplicate detection: detect near-duplicate or identical images
  • Content moderation: identify prohibited or restricted images

Tested with:

- Python 3.12  
- MySQL 9.x  
- sentence-transformers  
- PyTorch

Conclusion

Vector embeddings enable powerful capabilities such as:

  • semantic search
  • image similarity detection
  • recommendation systems
  • multimodal AI applications

With modern databases like MySQL supporting vector storage, it becomes possible to build AI-powered search systems directly inside the database layer.
