Vaishali
The One Concept Behind RAG, Search, and AI Systems

If you’ve been exploring AI and stumbled across terms like RAG, vector search, or semantic similarity — there's one concept sitting quietly underneath all of them.

Embeddings.

You’ll see this term everywhere:

  • vector databases
  • semantic search
  • similarity matching

But most explanations stop at:

"Embeddings convert text into vectors."

That's true.

But it doesn't explain why they matter.
Or why everything in modern AI seems to depend on them.


🧠 What Embeddings Actually Are

At a basic level, embeddings represent text as numeric vectors — lists of numbers.

Why?

Because ML models can't process raw text.
They need numbers.

But that's not the interesting part.

Embeddings don’t just convert text into numbers.
They preserve meaning in those numbers.

Each piece of text becomes a point in a high-dimensional space.
In that space:

  • similar meaning → closer together
  • different meaning → farther apart

For example:

  • "king" and "queen" → close
  • "cat" and "tiger" → close
  • "cat" and "car" → far

The numbers themselves don’t really matter.

The relationships between them do.

That’s the part that makes everything else possible.
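As a toy illustration of "closer together," here are some hand-made 3-dimensional vectors (real embeddings come from a model and have hundreds or thousands of dimensions), compared with plain straight-line distance:

```python
import math

# Hand-crafted toy vectors -- purely illustrative, not real embeddings.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "car":   [0.1, 0.2, 0.9],
}

def distance(a, b):
    """Straight-line (Euclidean) distance between two points."""
    return math.dist(a, b)

# Similar meaning -> closer together
print(distance(vectors["king"], vectors["queen"]))  # small
# Different meaning -> farther apart
print(distance(vectors["king"], vectors["car"]))    # large
```

The absolute numbers are meaningless on their own; only the comparison between the two distances carries information.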


🧩 Why We Need Them

Without embeddings, text is just… text.

There’s no clean way to:

  • compare meaning
  • measure similarity
  • search semantically

Embeddings turn meaning into something that can be measured.
And once meaning becomes measurable, it becomes usable.


🧭 Types of Embeddings

Embeddings aren't just for text.
Images, audio, graphs — all of them can be represented as vectors.

  • Text → words, sentences, documents
  • Image → visual features
  • Audio → sound patterns
  • Graph → relationships between entities

I didn’t realize this at first.
I thought embeddings were only a “text thing”.

But in most AI applications like search and RAG,
text embeddings are the most relevant starting point.


🔗 Word vs Sentence Embeddings

Not all text embeddings work the same way.

Word embeddings:

  • Represent individual words
  • Do not consider context
  • Same word → same vector everywhere

Think of them like isolated puzzle pieces.

So a word like “bank” gets the same embedding whether you're talking about:

  • a riverbank
  • a savings account

Used in:

  • Named Entity Recognition (NER)
  • Part-of-Speech tagging
  • Word-level clustering

Sentence embeddings:

  • Represent full sentences or documents
  • Capture context and relationships
  • Same word → different meaning depending on usage

They look at the entire sentence and how words relate to each other.

So:

  • "I went to the bank to deposit money"
  • "I sat by the bank of the river"

…produce completely different embeddings.

Word embeddings capture meaning.
Sentence embeddings capture context.

Used in:

  • Semantic search
  • RAG (Retrieval-Augmented Generation)
  • Text similarity
  • Document classification

🌍 Where Embeddings Are Used

This is where things started making more sense to me.

Embeddings aren’t just a concept.
They show up everywhere:

  • Semantic search → find meaning, not just exact matches
  • RAG → retrieve relevant context for LLMs
  • Recommendations → find similar content
  • Memory in AI agents → store and retrieve past context
  • Text similarity & classification → measure and categorise meaning

All of these rely on one simple idea:

Find things that are close in meaning, not just exact matches.
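A minimal sketch of that idea, using made-up document vectors in place of a real embedding model (in a real system, both the documents and the query would go through an embeddings API instead of a lookup table):

```python
import math

# Pretend these vectors came from an embedding model.
DOC_VECTORS = {
    "How to open a savings account": [0.9, 0.1, 0.2],
    "Best hiking trails along the river": [0.1, 0.9, 0.3],
    "Comparing interest rates at local banks": [0.8, 0.2, 0.3],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vector, top_k=2):
    """Rank documents by similarity to the query -- the core of
    semantic search and of the retrieval step in RAG."""
    ranked = sorted(DOC_VECTORS.items(),
                    key=lambda item: cosine(query_vector, item[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# An imagined query vector for "where to deposit money".
query = [0.85, 0.15, 0.25]
print(search(query))  # the two finance documents rank highest
```

Note that nothing here matched on keywords: the hiking document loses purely because its vector points in a different direction.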


🧮 Vector Similarity — The Engine Behind It All

Once everything becomes vectors, the question becomes:

how do you measure which ones are similar?

This is done using distance and similarity metrics.
Similarity metrics decide what “similar” actually means.


1. Cosine Similarity

  • Measures the angle between vectors
  • Ignores magnitude
  • Focuses on direction

So even if two pieces of text are very different in length,
if they point in the same direction → they’re considered similar.

That’s why it works so well for text.

Example:

A short tweet and a long article about the same topic
will point in the same direction.

Cosine similarity is the default in most modern embedding-based systems.

Used in:

  • Semantic search
  • Document similarity
  • Recommendation systems
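The length-invariance is easy to see in code. In this sketch, the "article" vector is just a scaled-up copy of the "tweet" vector (hand-made numbers, purely illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity: the dot product divided by the product of
    the vector lengths, i.e. the cosine of the angle between them."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

tweet   = [0.2, 0.4, 0.4]   # short text, small vector
article = [2.0, 4.0, 4.0]   # same topic, 10x the magnitude

# Same direction -> similarity is 1.0 despite the size gap
# (up to floating-point rounding).
print(cosine(tweet, article))
```

Scaling a vector changes its length but not its angle, so cosine similarity ignores the difference entirely.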

2. Dot Product

Dot product considers both:

  • direction
  • magnitude (vector size)

So in theory, it’s more expressive.

Used in:

  • recommendation systems (like YouTube)
  • ranking models
  • trained embedding systems

3. Euclidean Distance

  • Measures straight-line distance
  • Works fine in low dimensions

But in high-dimensional spaces:

  • magnitude differences distort similarity
  • direction (meaning) matters more than distance

That’s why it’s less common in NLP.

Used in:

  • Clustering
  • Low-dimensional data
  • Classical ML systems
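To see how the three metrics can disagree, here is a toy pair of vectors that point in the same direction but differ in length (hand-made numbers, purely illustrative):

```python
import math

a = [1.0, 2.0, 2.0]
b = [10.0, 20.0, 20.0]  # same direction, 10x the magnitude

dot = sum(x * y for x, y in zip(a, b))
cos = dot / (math.hypot(*a) * math.hypot(*b))
euc = math.dist(a, b)

print(f"cosine:    {cos:.2f}")   # 1.00  -- identical direction
print(f"dot:       {dot:.2f}")   # 90.00 -- inflated by magnitude
print(f"euclidean: {euc:.2f}")   # 27.00 -- "far apart" despite same direction
```

Cosine calls these vectors identical, Euclidean distance calls them far apart, and dot product lands somewhere in between, scaling with vector length.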

Quick Comparison

| Method | What it focuses on | Usage |
| --- | --- | --- |
| Cosine | Direction | Most common |
| Dot Product | Direction + magnitude | Selective |
| Euclidean | Distance | Rare |

🤔 If Dot Product Is Better, Why Does Cosine Win?

This confused me for a while.

If dot product is more expressive —
and even used in recommendation systems —
then why does almost every modern application default to cosine?

Here’s what made it click:

1. Dot product only works when embeddings are learned to use magnitude

  • In some systems, embeddings are trained end-to-end
  • So magnitude becomes meaningful (e.g. preference strength)
  • Dot product can then use both direction and magnitude effectively

Systems like YouTube train their own embeddings.

In those systems:

  • magnitude = strength of preference
  • dot product becomes meaningful

But with off-the-shelf embeddings, you don’t get that.

2. In most embeddings, magnitude doesn’t mean anything

  • Most developers use pre-trained embeddings (APIs)
  • These encode meaning in direction, not length
  • So magnitude becomes unreliable

Which means: dot product ≈ cosine

Cosine is the default because most developers use pre-trained embeddings where magnitude means nothing.

Dot product is for teams who train their own models and design magnitude to mean something.
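The "dot product ≈ cosine" point is easy to verify: once vectors are normalized to unit length (which some embedding providers do by default; check your provider's docs), the two metrics produce the same number:

```python
import math

def normalize(v):
    """Scale a vector to unit length, keeping its direction."""
    norm = math.hypot(*v)
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.hypot(*a) * math.hypot(*b))

a = normalize([3.0, 1.0, 2.0])
b = normalize([1.0, 2.0, 2.0])

# On unit-length vectors the denominator in cosine is 1,
# so the two metrics collapse into one.
print(dot(a, b), cosine(a, b))
```

This is also why some vector databases let you pick dot product as the metric and simply normalize vectors at insert time.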


🌱 The Takeaway

At first, embeddings can feel like just a preprocessing step.
Something you do before the "real" work.

But that's not accurate.

Embeddings are what make meaning searchable, comparable, and usable.

Without them:

  • RAG doesn't work
  • semantic search doesn't exist
  • recommendations break

You don’t need to memorise every model or metric.

But once embeddings make sense,
higher-level concepts become easier to place.
