Hello Devs👋
Have you ever wondered how Spotify knows the next song you’ll love, or how Google instantly finds answers that feel right, even when none of the results literally contain the exact phrase you typed? 🤔
When you search for information, get a product recommendation, or ask an AI assistant a question, something invisible is happening in the background. It’s not just large language models or massive recommendation engines; it’s something smaller but incredibly important called: embeddings.
_Embeddings_ are how machines represent meaning. They transform words, sentences, images, or even code into numbers (vectors) that capture relationships and similarities. Without embeddings, most of the smart search, recommendations, and question answering we see today would simply not work.
In this article, we’ll take a deep dive into embeddings:
- What they are
- How they measure similarity
- How embedding models work (with Qodo-Embed-1 as an example)
- Their applications
- The best embedding models today
What are Embeddings?
Embeddings are numerical vector representations of the input data. They transform high-dimensional, complex things like words, sentences, images, or even source code into dense vectors of numbers.
The key idea: embeddings capture semantic meaning and relationships between data points. Similar items are placed closer together in this vector space, while dissimilar ones are farther apart. This makes it easier for algorithms to work with complex data such as words, images, or audio, for example in a recommendation system.
Imagine you’re trying to explain relationships between words.
- Apple and Orange are both fruits.
- Dog and Cat are both pets.
- But Apple and Dog don’t really belong together. 😑
In the human brain, we intuitively understand these connections. Embeddings let computers do the same by mapping them into a mathematical space.
For example, embeddings might look like this:
- Apple: [0.21, -0.87, 0.45, …]
- Orange: [0.20, -0.88, 0.50, …]
- Dog: [0.91, 0.10, -0.34, …]
If we plotted these vectors on a 2D graph, you would notice how the Apple and Orange points sit close to each other, while Apple and Dog are far apart. That’s the magic of embeddings: they capture meaning, not just raw text.
Measuring Similarity
Since an embedding is just a vector (a list of floating-point numbers), we can measure the distance or angle between vectors to determine how related two items are:
- Small distance = high similarity
- Large distance = low similarity
We can use simple mathematical operations to quickly measure how alike two pieces of text are, regardless of their original length or structure. Some common similarity metrics include:
- Cosine Similarity: Measures the cosine of the angle between two vectors.
- Euclidean Distance: Measures the straight-line distance between two points.
- Dot Product: Measures the projection of one vector onto another.
The right metric depends on the model you’re using, so check which one the model was trained and evaluated with.
👉 In practice, cosine similarity is the most widely used.
Example: Apple vs Orange 🍏🍊
import numpy as np
def cosine_similarity(vec1, vec2):
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    return dot_product / (norm_vec1 * norm_vec2)

similarity = cosine_similarity([0.12, -0.03, 0.45], [0.11, -0.04, 0.47])
print("Cosine Similarity:", similarity)
Output:
Cosine Similarity: 0.9993630530517253
Example: Apple vs Dog 🍏🐶
import numpy as np
def cosine_similarity(vec1, vec2):
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    return dot_product / (norm_vec1 * norm_vec2)

similarity = cosine_similarity([0.12, -0.03, 0.45], [0.88, 0.20, -0.33])
print("Cosine Similarity:", similarity)
Output:
Cosine Similarity: -0.10904568968509583
From the above examples we can understand that:
- If two vectors point in almost the same direction, they’re very similar (close to 1).
- If they point in opposite directions, they’re very different (close to -1).
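The other two metrics are just as easy to compute. Here’s a quick sketch using plain NumPy, reusing the same illustrative Apple and Orange vectors from above:
import numpy as np

apple = np.array([0.12, -0.03, 0.45])
orange = np.array([0.11, -0.04, 0.47])

# Euclidean distance: straight-line distance between the two points (smaller = more similar)
euclidean_distance = np.linalg.norm(apple - orange)

# Dot product: projection of one vector onto the other (larger = more similar)
dot_product = np.dot(apple, orange)

print("Euclidean Distance:", euclidean_distance)
print("Dot Product:", dot_product)
One useful detail: if you normalize the vectors to unit length first, the dot product becomes equal to cosine similarity. That’s why several examples later in this article pass normalize_embeddings=True and then use a plain dot product or cosine distance.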
How Do Embedding Models Work?
So far, we’ve seen that embeddings are just vectors, but how do we actually get from raw text to numbers?
Let’s break it down into simple steps 👇
1. Tokenization
The input text is split into smaller chunks called tokens. These could be words or even pieces of words.
Example:
- Text: "My name is Kiran"
- Tokens: ["My", "name", "is", "Kiran"]
2. Encoding with Neural Networks
These tokens are passed through a neural network encoder (usually a Transformer). The encoder learns contextual meaning.
For example:
The word bank in river bank and money bank will have different embeddings, because the model uses surrounding context to understand meaning.
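Here’s a rough sketch of that idea, pulling the token-level vector for "bank" out of a Transformer encoder (again, bert-base-uncased is used purely for illustration):
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    # Encode the sentence and return the contextual vector for the token "bank"
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return outputs.last_hidden_state[0, tokens.index("bank")]

river = bank_vector("I sat on the river bank.")
money = bank_vector("I deposited cash at the bank.")

# Same word, different contexts -> different vectors, so similarity is well below 1.0
print(torch.cosine_similarity(river, money, dim=0).item())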
💡 You can read more about Transformers from this HuggingFace Guide
3. Vector Output
Finally, the model outputs a fixed-length vector (say 768 or 1536 dimensions). This becomes the embedding we can use for similarity, clustering, or search.
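A quick way to see this is to print the shape of what an embedding model returns. The snippet below uses the Qodo-Embed-1 model introduced in the next section, which produces 1536-dimensional vectors:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

vector = model.encode("My name is Kiran")
print(vector.shape)  # (1536,) — one fixed-length vector, no matter how long the input is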
Meet Qodo-Embed-1-1.5B 🚀
Now that we know the process, let’s look at a real embedding model: Qodo-Embed-1.
Qodo-Embed-1-1.5B is a lightweight (1.5B-parameter), state-of-the-art code embedding model designed for retrieval tasks in the software development domain.
It’s optimized for natural language-to-code and code-to-code retrieval, making it great for developers.
Core Capabilities:
🔍 Code Search: Enables efficient searching across large codebases
🧠 Retrieval-Augmented Generation (RAG): Enhances code generation with contextual understanding
🤓 Semantic Code Understanding: Captures complex relationships between code snippets
🌐 Multi-Language Support: Processes code from 9 major programming languages (Python, C++, C#, Go, Java, JavaScript, PHP, Ruby, TypeScript)
📈 High-Dimensional Embeddings: Generates rich 1536-dimensional representations
If you're interested in learning more about the model, you can check out this blog.
Example with Qodo-Embed-1
You can use the model via the Hugging Face Transformers or Sentence Transformers libraries. I'll be using the sentence-transformers library.
# Install required libraries
pip install sentence-transformers
Here’s a simple example to measure similarity between sentences:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
# Load the model
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")
# Source sentence and comparison list
source_sentence = "That is a very happy person"
sentences_to_compare = [
    "That is a happy person",
    "That is a happy dog",
    "Today is a sunny day",
    "The man is joyful and smiling"
]
# Encode source and comparison sentences
source_embedding = model.encode([source_sentence])
comparison_embeddings = model.encode(sentences_to_compare)
# Compute cosine similarity (returns a 2D array)
similarity_scores = cosine_similarity(source_embedding, comparison_embeddings)[0]
# Find most similar sentence
most_similar_idx = int(np.argmax(similarity_scores))
most_similar_sentence = sentences_to_compare[most_similar_idx]
similarity_score = similarity_scores[most_similar_idx]
# Print results
print(f"Source Sentence: \"{source_sentence}\"")
print(f"Most Similar Sentence: \"{most_similar_sentence}\"")
print(f"Similarity Score: {similarity_score:.4f}")
✅ Explanation:
- The source sentence is compared with multiple candidates.
- The model generates embeddings for all sentences.
- We calculate cosine similarity to find the closest match.
Output:
Source Sentence: "That is a very happy person"
Most Similar Sentence: "That is a happy person"
Similarity Score: 0.9795
This shows how embeddings can capture semantic meaning beyond exact words.
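And since Qodo-Embed-1 is tuned for natural language-to-code retrieval, the exact same pattern works for finding code by describing what it does. Here’s a small sketch (the snippets and the query are made up for illustration):
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

# A tiny "codebase" of snippets to search over
code_snippets = [
    "def add(a, b):\n    return a + b",
    "def read_json(path):\n    import json\n    with open(path) as f:\n        return json.load(f)",
    "def fibonacci(n):\n    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)",
]

query = "function that loads a JSON file from disk"

query_vec = model.encode([query])
code_vecs = model.encode(code_snippets)

# Rank the snippets by cosine similarity to the natural-language query
scores = cosine_similarity(query_vec, code_vecs)[0]
print("Best match:\n", code_snippets[scores.argmax()])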
Applications of Embeddings
Embeddings aren’t just abstract math, they power real-world AI applications we use every day. Let’s walk through some key use cases with code examples.
1. Semantic Search 🔍
Instead of keyword matching, embeddings let us search by meaning.
from sklearn.neighbors import NearestNeighbors
import numpy as np
from sentence_transformers import SentenceTransformer
# Download from the Hub
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")
docs = [
    "New York pizza is thin and crispy.",
    "I love making homemade Italian pasta.",
    "Best coffee shops in Brooklyn.",
    "Top-rated pizza restaurants in Manhattan.",
    "A guide to Italian restaurants in NYC."
]
# Embed docs
doc_embeds = model.encode(docs, normalize_embeddings=True)
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(doc_embeds)
query = "Where can I eat pizza in New York?"
qvec = model.encode([query], normalize_embeddings=True)
distances, indices = index.kneighbors(qvec)
print("Query:", query, "\n")
for i, d in zip(indices[0], distances[0]):
    print(f"→ {docs[i]} (distance={d:.3f})")
Output:
Query: Where can I eat pizza in New York?
→ Top-rated pizza restaurants in Manhattan. (distance=0.245)
→ A guide to Italian restaurants in NYC. (distance=0.303)
👉 Instead of relying on keywords like pizza or New York, embeddings let the model understand intent.
2. Recommendations 🎵🍿
Platforms like Spotify, Netflix, and YouTube use embeddings to recommend items similar to what you like.
Example: recommending movies:
from sklearn.neighbors import NearestNeighbors
from sentence_transformers import SentenceTransformer
# Download from the Hub
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")
movies = [
    "The Matrix - SciFi action with AI and virtual reality.",
    "Inception - Dream within dream, sci-fi thriller.",
    "The Notebook - Romantic love story.",
    "Titanic - Tragic romance on a ship.",
    "Avengers - Superheroes saving the world."
]
embs = model.encode(movies, normalize_embeddings=True)
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(embs)
query = "I like science fiction movies about AI"
qvec = model.encode([query], normalize_embeddings=True)
distances, indices = index.kneighbors(qvec)
print("Query:", query, "\n")
for i, d in zip(indices[0], distances[0]):
    print(f"→ {movies[i]} (distance={d:.3f})")
Output:
Query: I like science fiction movies about AI
→ The Matrix - SciFi action with AI and virtual reality. (distance=0.205)
→ Inception - Dream within dream, sci-fi thriller. (distance=0.245)
👉 Notice how the model recommends movies semantically close to the query.
3. Clustering & Topic Discovery 📊
Embeddings can group related items automatically.
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer
# Download from the Hub
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")
sentences = [
    "I love football",
    "Basketball is exciting",
    "Tennis players train hard",
    "Apples and oranges are fruits",
    "Bananas are tasty",
    "Strawberries are sweet"
]
X = model.encode(sentences, normalize_embeddings=True)
kmeans = KMeans(n_clusters=2, random_state=42, n_init=10).fit(X)
for i, label in enumerate(kmeans.labels_):
    print(f"{sentences[i]} → Cluster {label}")
Output:
I love football → Cluster 1
Basketball is exciting → Cluster 1
Tennis players train hard → Cluster 1
Apples and oranges are fruits → Cluster 0
Bananas are tasty → Cluster 0
Strawberries are sweet → Cluster 0
👉 The model naturally groups sports vs fruits, without us telling it!
4. Deduplication / Similarity Check ✅
Detect if two sentences mean the same thing.
import numpy as np
from sentence_transformers import SentenceTransformer
# Download from the Hub
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")
s1 = "AI is transforming the world."
s2 = "Artificial intelligence is changing our world."
s3 = "Bananas are yellow."
vecs = model.encode([s1, s2, s3], normalize_embeddings=True)
def cosine(v1, v2): return float(np.dot(v1, v2))
print("s1 vs s2:", cosine(vecs[0], vecs[1])) # high similarity
print("s1 vs s3:", cosine(vecs[0], vecs[2])) # low similarity
Output:
s1 vs s2: 0.8994590044021606
s1 vs s3: 0.471836119890213
👉 This is how embeddings help in duplicate detection or plagiarism checking.
5. RAG (Retrieval-Augmented Generation) 🔗
Embeddings power modern RAG systems, where models retrieve facts before generating answers.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

docs = [
    "Python is great for data science.",
    "Transformers are neural networks for sequences.",
    "Qodo-Embed-1 is a model for embeddings.",
]
query = "Which model can create embeddings?"
# Embed docs + query
doc_vecs = model.encode(docs, normalize_embeddings=True)
query_vec = model.encode([query], normalize_embeddings=True)
scores = np.dot(doc_vecs, query_vec.T).flatten()
best_doc = docs[scores.argmax()]
print("Query:", query)
print("Retrieved:", best_doc)
Output:
Query: Which model can create embeddings?
Retrieved: Qodo-Embed-1 is a model for embeddings.
👉 This is the backbone of many chatbots and AI assistants today.
Best Embedding Models Today (2025)
If you’re wondering which models to try in your next project, I’ve compiled a list here:

Embedding Models You Can Use in Your Next Project For Free 🚀
Embeddings may look like just numbers, but they are the real heroes of modern AI, powering everything from search engines and recommendations to chatbots and code assistants.
That's it. 🙏
Thank you for reading this far. If you found this article useful, please like and share it. Someone else might find it useful too. 💖