Transforming language into geometry.
Introduction
Embeddings are one of the most important building blocks of modern AI applications, yet they're often treated as a black box.
In this article, I'll demystify embeddings by exploring what they are, how they are created, and why they make semantic search possible.
The Problem With Traditional Search
Imagine searching for the phrase:
"How do I reset my password?"
A traditional keyword search looks for exact or similar words. If a document instead says:
"Steps to recover your account credentials"
the search may fail because the wording is completely different.
Humans immediately recognize that both sentences describe the same intent. Computers need a different way to represent meaning, and this is where embeddings come in.
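To make the failure concrete, here is a minimal sketch of keyword matching applied to the two sentences above. The `keywords` and `keyword_overlap` helpers are hypothetical names written for this illustration, and the Jaccard overlap they compute is a deliberate simplification of what real keyword search engines do.

```python
import re

def keywords(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def keyword_overlap(query: str, document: str) -> float:
    """Jaccard overlap: shared words divided by all distinct words."""
    q, d = keywords(query), keywords(document)
    union = q | d
    return len(q & d) / len(union) if union else 0.0

query = "How do I reset my password?"
document = "Steps to recover your account credentials"

# The two sentences share no words at all, so the score is zero.
print(keyword_overlap(query, document))  # 0.0
```

A score of zero means a purely keyword-based ranker has no signal connecting the question to the document that answers it.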
What Is an Embedding?
An embedding is a dense vector: a list of numbers that represents the semantic meaning of a piece of text. Put more simply, it is an array of numerical values, usually floating-point numbers, in which almost every position holds meaningful information.
Instead of treating text as a sequence of characters or words, an embedding model maps it into a high-dimensional vector space.
For example:
"cat"
↓
[0.18, -0.42, 0.91, ...]
The numbers themselves have no intuitive meaning.
What matters is where the vector is located relative to other vectors.
Meaning Comes From Position
Imagine a map where cities that are geographically close tend to share borders, climates, and transportation links.
Embeddings work similarly: texts with similar meanings are placed near one another in vector space.
For example:
Dog ●    Cat ●    Puppy ●

Car ●    Engine ●    Truck ●
The actual space may have hundreds or thousands of dimensions instead of two, but the intuition remains the same: the smaller the distance between two vectors, the more similar their meanings.
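The two-dimensional picture can be played with directly. In the sketch below, the coordinates are made up by hand to mirror the diagram (they do not come from any real embedding model), and `nearest` is a hypothetical helper that sorts the other words by Euclidean distance.

```python
import math

# Made-up 2-D coordinates, chosen by hand to mirror the diagram.
# Real embeddings have hundreds or thousands of dimensions.
points = {
    "dog": (1.0, 4.0),
    "cat": (1.5, 4.5),
    "puppy": (0.8, 3.6),
    "car": (6.0, 1.0),
    "engine": (6.5, 0.6),
    "truck": (5.6, 1.4),
}

def nearest(word: str) -> list[str]:
    """Return the other words sorted by Euclidean distance from `word`."""
    origin = points[word]
    others = [w for w in points if w != word]
    return sorted(others, key=lambda w: math.dist(origin, points[w]))

print(nearest("dog")[:2])  # the two closest neighbours of "dog"
print(nearest("car")[:2])  # the two closest neighbours of "car"
```

With these coordinates, the closest neighbours of "dog" are the other animals and the closest neighbours of "car" are the other vehicle terms, which is the clustering behaviour the diagram is meant to convey.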
Similar Meaning, Different Words
This is where the strength of embeddings becomes clear.
Consider the sentences below:
- "Reset my password"
- "Recover my account"
- "Can't log in"
- "Forgot my credentials"
They share very few keywords, yet an embedding model places them close together because they express similar ideas.
This enables semantic search, where results are retrieved based on meaning rather than exact wording.
How Similarity Is Measured
Once text has been converted into vectors, we need a way to compare them. The most common metric is cosine similarity.
Rather than comparing the individual numbers, cosine similarity measures the angle between two vectors.
- Small angle → highly similar
- Large angle → less similar
This works surprisingly well because embedding models are trained to organize semantically related content in nearby regions of the vector space.
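Cosine similarity is the dot product of two vectors divided by the product of their lengths, which gives the cosine of the angle between them. The sketch below implements that definition in plain Python. The three-dimensional vectors are hypothetical stand-ins for real embeddings, and the function assumes neither input is the zero vector.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """cos(theta) = (a . b) / (|a| * |b|); the result lies in [-1, 1].

    Assumes both vectors are non-zero and have the same length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-D vectors standing in for real embeddings.
reset_password = [0.9, 0.1, 0.2]
recover_account = [0.8, 0.2, 0.3]
truck_engine = [0.1, 0.9, 0.1]

# Vectors pointing in nearly the same direction score close to 1.
print(cosine_similarity(reset_password, recover_account))
# Vectors pointing in different directions score much lower.
print(cosine_similarity(reset_password, truck_engine))
```

Because only the angle matters, scaling a vector by a positive constant leaves its cosine similarity to every other vector unchanged.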
Why Embeddings Matter for RAG
Retrieval-Augmented Generation (RAG) depends heavily on embeddings. A typical pipeline looks like this:
Documents
│
▼
Embedding Model
│
▼
Vectors Stored in a Vector Database
│
▼
User Query
│
▼
Query Embedding
│
▼
Similarity Search
│
▼
Relevant Documents
│
▼
LLM
Notice something important:
The LLM never searches your documents directly. Instead, a retrieval step searches the embedding space for documents whose vectors are closest to the query, and only those documents are passed to the LLM as context.
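The retrieval half of the pipeline can be sketched in a few lines. Everything here is an assumption made for illustration: `embed` is a toy stand-in for a real embedding model (it only counts words from hand-picked topics), the in-memory `index` list stands in for a vector database, and the final prompt string is just one plausible way to hand the retrieved text to an LLM.

```python
import math

# Toy stand-in for a real embedding model (an assumption for illustration):
# each dimension counts how many words belong to one hand-picked topic.
TOPICS = [
    {"password", "credentials", "account", "login", "reset", "recover"},
    {"invoice", "billing", "refund", "payment"},
    {"shipping", "delivery", "package", "tracking"},
]

def embed(text: str) -> list[float]:
    words = text.lower().replace("?", "").split()
    return [float(sum(w in topic for w in words)) for topic in TOPICS]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

documents = [
    "Steps to recover your account credentials",
    "How to request a refund for an invoice",
    "Tracking your package delivery",
]

# Indexing: embed every document once and keep the vectors.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Embed the query and return the k most similar documents."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

query = "How do I reset my password?"
context = retrieve(query)

# Only the retrieved text reaches the LLM, as part of the prompt.
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: {query}"
print(prompt)
```

In a real system the toy `embed` function would be replaced by a trained embedding model and the list by a vector database, but the flow stays the same: embed the documents once, embed each query, rank by similarity, and pass the top results to the LLM.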
Conclusion
Now that I have scratched the surface of how these "numerical representations of text" work, the takeaway is this: understanding embeddings is essential for anyone building LLM applications, because they power everything from document retrieval to recommendation systems.
The real power of embeddings is not in storing vectors but in organizing them, and that is what makes them so effective.