Vaishali
The One Concept Behind RAG, Search, and AI Systems

If you’ve been exploring AI and stumbled across terms like RAG, vector search, or semantic similarity — there's one concept sitting quietly underneath all of them.

Embeddings.

You’ll see this term everywhere:

  • vector databases
  • semantic search
  • similarity matching

But most explanations stop at:

"Embeddings convert text into vectors."

That's true.

But it doesn't explain why they matter.
Or why everything in modern AI seems to depend on them.


🧠 What Embeddings Actually Are

At a basic level, embeddings represent text as numeric vectors — lists of numbers.

Why?

Because ML models can't process raw text.
They need numbers.

But that's not the interesting part.

Embeddings don’t just convert text into numbers.
They preserve meaning in those numbers.

Each piece of text becomes a point in a high-dimensional space.
In that space:

  • similar meaning → closer together
  • different meaning → farther apart

For example:

  • "king" and "queen" → close
  • "cat" and "tiger" → close
  • "cat" and "car" → far

The numbers themselves don’t really matter.

The relationships between them do.

That’s the part that makes everything else possible.
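As a toy illustration of "closer together," here are some hand-made 3-dimensional vectors (real embeddings come from a model and have hundreds or thousands of dimensions), compared with plain straight-line distance:

```python
import math

# Hand-crafted toy vectors -- purely illustrative, not real embeddings.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "car":   [0.1, 0.2, 0.9],
}

def distance(a, b):
    """Straight-line (Euclidean) distance between two points."""
    return math.dist(a, b)

# Similar meaning -> closer together
print(distance(vectors["king"], vectors["queen"]))  # small
# Different meaning -> farther apart
print(distance(vectors["king"], vectors["car"]))    # large
```

The absolute numbers are meaningless on their own; only the comparison between the two distances carries information.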


🧩 Why We Need Them

Without embeddings, text is just… text.

There’s no clean way to:

  • compare meaning
  • measure similarity
  • search semantically

Embeddings turn meaning into something that can be measured.
And once meaning becomes measurable, it becomes usable.


🧭 Types of Embeddings

Embeddings aren't just for text.
Images, audio, graphs — all of them can be represented as vectors.

  • Text → words, sentences, documents
  • Image → visual features
  • Audio → sound patterns
  • Graph → relationships between entities

I didn’t realize this at first.
I thought embeddings were only a “text thing”.

But in most AI applications like search and RAG,
text embeddings are the most relevant starting point.


🔗 Word vs Sentence Embeddings

Not all text embeddings work the same way.

Word embeddings:

  • Represent individual words
  • Do not consider context
  • Same word → same vector everywhere

Think of them like isolated puzzle pieces.

So a word like “bank” gets the same embedding whether you're talking about:

  • a riverbank
  • a savings account

Used in:

  • Named Entity Recognition (NER)
  • Part-of-Speech tagging
  • Word-level clustering

Sentence embeddings:

  • Represent full sentences or documents
  • Capture context and relationships
  • Same word → different meaning depending on usage

They look at the entire sentence and how words relate to each other.

So:

  • "I went to the bank to deposit money"
  • "I sat by the bank of the river"

…produce completely different embeddings.

Word embeddings capture meaning.
Sentence embeddings capture context.

Used in:

  • Semantic search
  • RAG (Retrieval-Augmented Generation)
  • Text similarity
  • Document classification

🌍 Where Embeddings Are Used

This is where things started making more sense to me.

Embeddings aren’t just a concept.
They show up everywhere:

  • Semantic search → find meaning, not just exact matches
  • RAG → retrieve relevant context for LLMs
  • Recommendations → find similar content
  • Memory in AI agents → store and retrieve past context
  • Text similarity & classification → measure and categorise meaning

All of these rely on one simple idea:

Find things that are close in meaning, not just exact matches.
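A minimal sketch of that idea, using made-up document vectors in place of a real embedding model (in a real system, both the documents and the query would go through an embeddings API instead of a lookup table):

```python
import math

# Pretend these vectors came from an embedding model.
DOC_VECTORS = {
    "How to open a savings account": [0.9, 0.1, 0.2],
    "Best hiking trails along the river": [0.1, 0.9, 0.3],
    "Comparing interest rates at local banks": [0.8, 0.2, 0.3],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vector, top_k=2):
    """Rank documents by similarity to the query -- the core of
    semantic search and of the retrieval step in RAG."""
    ranked = sorted(DOC_VECTORS.items(),
                    key=lambda item: cosine(query_vector, item[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# An imagined query vector for "where to deposit money".
query = [0.85, 0.15, 0.25]
print(search(query))  # the two finance documents rank highest
```

Note that nothing here matched on keywords: the hiking document loses purely because its vector points in a different direction.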


🧮 Vector Similarity — The Engine Behind It All

Once everything becomes vectors, the question becomes:

how do you measure which ones are similar?

This is done using distance and similarity metrics.
Similarity metrics decide what “similar” actually means.


1. Cosine Similarity

  • Measures the angle between vectors
  • Ignores magnitude
  • Focuses on direction

So even if two pieces of text are very different in length,
if they point in the same direction → they’re considered similar.

That’s why it works so well for text.

Example:

A short tweet and a long article about the same topic
will point in the same direction.

Cosine similarity is the default in most modern embedding-based systems.

Used in:

  • Semantic search
  • Document similarity
  • Recommendation systems
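The length-invariance is easy to see in code. In this sketch, the "article" vector is just a scaled-up copy of the "tweet" vector (hand-made numbers, purely illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity: the dot product divided by the product of
    the vector lengths, i.e. the cosine of the angle between them."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

tweet   = [0.2, 0.4, 0.4]   # short text, small vector
article = [2.0, 4.0, 4.0]   # same topic, 10x the magnitude

# Same direction -> similarity is 1.0 despite the size gap
# (up to floating-point rounding).
print(cosine(tweet, article))
```

Scaling a vector changes its length but not its angle, so cosine similarity ignores the difference entirely.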

2. Dot Product

Dot product considers both:

  • direction
  • magnitude (vector size)

So in theory, it’s more expressive.

Used in:

  • recommendation systems (like YouTube)
  • ranking models
  • trained embedding systems

3. Euclidean Distance

  • Measures straight-line distance
  • Works fine in low dimensions

But in high-dimensional spaces:

  • magnitude differences distort similarity
  • direction (meaning) matters more than distance

That’s why it’s less common in NLP.

Used in:

  • Clustering
  • Low-dimensional data
  • Classical ML systems
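To see how the three metrics can disagree, here is a toy pair of vectors that point in the same direction but differ in length (hand-made numbers, purely illustrative):

```python
import math

a = [1.0, 2.0, 2.0]
b = [10.0, 20.0, 20.0]  # same direction, 10x the magnitude

dot = sum(x * y for x, y in zip(a, b))
cos = dot / (math.hypot(*a) * math.hypot(*b))
euc = math.dist(a, b)

print(f"cosine:    {cos:.2f}")   # 1.00  -- identical direction
print(f"dot:       {dot:.2f}")   # 90.00 -- inflated by magnitude
print(f"euclidean: {euc:.2f}")   # 27.00 -- "far apart" despite same direction
```

Cosine calls these vectors identical, Euclidean distance calls them far apart, and dot product lands somewhere in between, scaling with vector length.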

Quick Comparison

| Method | What it focuses on | Usage |
| --- | --- | --- |
| Cosine | Direction | Most common |
| Dot Product | Direction + magnitude | Selective |
| Euclidean | Distance | Rare |

🤔 If Dot Product Is Better, Why Does Cosine Win?

This confused me for a while.

If dot product is more expressive —
and even used in recommendation systems —
then why does almost every modern application default to cosine?

Here’s what made it click:

1. Dot product only works when embeddings are learned to use magnitude

  • In some systems, embeddings are trained end-to-end
  • So magnitude becomes meaningful (e.g. preference strength)
  • Dot product can then use both direction and magnitude effectively

Systems like YouTube train their own embeddings.

In those systems:

  • magnitude = strength of preference
  • dot product becomes meaningful

But with off-the-shelf embeddings, you don’t get that.

2. In most embeddings, magnitude doesn’t mean anything

  • Most developers use pre-trained embeddings (APIs)
  • These encode meaning in direction, not length
  • So magnitude becomes unreliable

Which means: dot product ≈ cosine

Cosine is the default because most developers use pre-trained embeddings where magnitude means nothing.

Dot product is for teams who train their own models and design magnitude to mean something.
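The "dot product ≈ cosine" point is easy to verify: once vectors are normalized to unit length (which some embedding providers do by default; check your provider's docs), the two metrics produce the same number:

```python
import math

def normalize(v):
    """Scale a vector to unit length, keeping its direction."""
    norm = math.hypot(*v)
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.hypot(*a) * math.hypot(*b))

a = normalize([3.0, 1.0, 2.0])
b = normalize([1.0, 2.0, 2.0])

# On unit-length vectors the denominator in cosine is 1,
# so the two metrics collapse into one.
print(dot(a, b), cosine(a, b))
```

This is also why some vector databases let you pick dot product as the metric and simply normalize vectors at insert time.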


🌱 The Takeaway

At first, embeddings can feel like just a preprocessing step.
Something you do before the "real" work.

But that's not accurate.

Embeddings are what make meaning searchable, comparable, and usable.

Without them:

  • RAG doesn't work
  • semantic search doesn't exist
  • recommendations break

You don’t need to memorise every model or metric.

But once embeddings make sense,
higher-level concepts become easier to place.
