Devanshu Biswas

Posted on Jun 15

Embeddings Explained Simply: How AI Turns Words Into a Map of Meaning

#webdev #beginners #ai #machinelearning

If you've heard the words "vector database," "semantic search," or "RAG" and nodded along while quietly panicking — this one's for you. All three sit on top of one idea: embeddings. And the idea is genuinely simple.

This is Day 3 of AIFromZero, my concept-a-day series that explains how AI actually works, in plain language, no math degree required.

The problem embeddings solve

Computers are great with numbers and clueless about meaning. To a raw program, "happy" and "joyful" are just different strings of letters — no more related than "happy" and "stapler."

Embeddings fix that by turning each word (or sentence, or image) into a list of numbers — a point in space — arranged so that things with similar meaning land close together.

Think of it as GPS coordinates for meaning. "Dog" and "puppy" get nearby coordinates. "Dog" and "democracy" get far-apart ones.

What a vector actually is

A vector is just an ordered list of numbers:

"king"  → [0.21, -0.44, 0.88, ... ]
"queen" → [0.19, -0.41, 0.85, ... ]

Real embedding models use hundreds or thousands of numbers (dimensions) per input, far too many to picture. But you don't have to picture all of them. You only care about one thing: how close are two vectors?

Measuring "closeness"

The most common measure is cosine similarity: how much two vectors point in the same direction, regardless of how long they are. You don't need the formula to get the intuition:

Pointing the same way → similarity near 1 → very related.
At right angles → near 0 → unrelated.
Opposite → near -1.
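
If you do want to see it in code, here is a minimal sketch of cosine similarity in plain JavaScript. The function name and the tiny 2-D vectors are made up for illustration; real embeddings would have far more dimensions.

```javascript
// Cosine similarity: dot product divided by the product of the two lengths.
// Note: an all-zero vector has length 0, so the result would be NaN.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let lengthSquaredA = 0;
  let lengthSquaredB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    lengthSquaredA += a[i] * a[i];
    lengthSquaredB += b[i] * b[i];
  }
  return dot / (Math.sqrt(lengthSquaredA) * Math.sqrt(lengthSquaredB));
}

// Made-up vectors, one for each of the three cases above.
console.log(cosineSimilarity([1, 2], [2, 4]));   // same direction: close to 1
console.log(cosineSimilarity([1, 0], [0, 1]));   // right angles: 0
console.log(cosineSimilarity([1, 2], [-1, -2])); // opposite: close to -1
```

Notice that [1, 2] and [2, 4] score as a match even though one is twice as long: only the direction counts.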

So "find me things similar to X" becomes "find the vectors closest to X's vector." That's the entire trick behind semantic search.
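
Here is a rough sketch of that "find the closest vectors" step. The 3-D vectors and the `nearest` helper are invented for this example; a real system would get its vectors from an embedding model.

```javascript
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const length = (v) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (length(a) * length(b));
}

// Hypothetical hand-written vectors, not output from a real model.
const items = [
  { label: "dog", vector: [0.9, 0.8, 0.1] },
  { label: "puppy", vector: [0.85, 0.9, 0.15] },
  { label: "democracy", vector: [0.1, 0.2, 0.95] },
  { label: "stapler", vector: [0.2, 0.05, 0.3] },
];

// Score every item against the query, sort best-first, keep the top k.
function nearest(queryVector, candidates, k) {
  return candidates
    .map((item) => ({
      label: item.label,
      score: cosineSimilarity(queryVector, item.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const results = nearest([0.88, 0.82, 0.12], items, 2);
console.log(results.map((r) => r.label)); // [ 'dog', 'puppy' ]
```

This checks every item one by one, which is fine for a few thousand vectors. Larger collections usually need an index that trades a little accuracy for speed.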

The famous party trick: meaning is math

Because meaning becomes geometry, you can do arithmetic on it:

vector("king") - vector("man") + vector("woman") ≈ vector("queen")

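Subtract "man-ness," add "woman-ness," and you land near "queen." (In practice, tools skip the three input words when picking the nearest match, since "king" itself often sits closest.) The model was never told this; it fell out of learning from huge amounts of text. That's the moment embeddings click for most people.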
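
To make the arithmetic concrete, here is a toy version. Big assumption: these 2-D vectors are hand-built so that the first number means roughly "royalty" and the second roughly "maleness." Real models learn their dimensions from text and they rarely have such clean meanings. The `analogy` helper is also just a name picked for this sketch.

```javascript
// Hand-built toy vectors: [royalty, maleness]. Not from a real model.
const words = {
  king: [0.95, 0.9],
  queen: [0.9, -0.85],
  man: [0.1, 0.9],
  woman: [0.1, -0.9],
  apple: [-0.6, 0.05],
};

function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const length = (v) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (length(a) * length(b));
}

// Computes vector(a) - vector(b) + vector(c), then returns the closest
// remaining word, skipping the three input words themselves.
function analogy(a, b, c) {
  const target = words[a].map((value, i) => value - words[b][i] + words[c][i]);
  let best = null;
  let bestScore = -Infinity;
  for (const [word, vector] of Object.entries(words)) {
    if (word === a || word === b || word === c) continue;
    const score = cosineSimilarity(target, vector);
    if (score > bestScore) {
      best = word;
      bestScore = score;
    }
  }
  return best;
}

console.log(analogy("king", "man", "woman")); // queen
```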

In the interactive demo I built, you get a 2-D "meaning map": click any word and watch its nearest neighbors light up, and see the king − man + woman example play out as arrows.

Where you've already used them

Semantic search — search by meaning, not exact keywords. ("affordable laptop" finds "cheap notebook computer.")
Recommendations — "similar songs / products / articles" = nearest vectors.
RAG (retrieval-augmented generation) — before an AI answers, it embeds your question, finds the closest chunks of your documents, and feeds those in. That's how "chat with your PDF" works.
Clustering & deduplication — group similar items, spot near-duplicates.
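
The RAG flow described above can be sketched in a few lines. Everything here is a stand-in: the chunk vectors and the question vector are typed in by hand (a real app would get them from an embedding model), and `buildPrompt` is a hypothetical helper that only assembles the text you would send to a language model.

```javascript
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const length = (v) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  return dot / (length(a) * length(b));
}

// Document chunks with made-up vectors standing in for real embeddings.
const chunks = [
  { text: "Refunds are processed within 5 business days.", vector: [0.9, 0.1, 0.1] },
  { text: "Our office is closed on public holidays.", vector: [0.1, 0.9, 0.2] },
  { text: "To request a refund, email support with your order number.", vector: [0.7, 0.1, 0.5] },
];

// Pick the k chunks closest to the question and paste them into a prompt.
function buildPrompt(question, questionVector, allChunks, k) {
  const context = allChunks
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(questionVector, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((chunk) => `- ${chunk.text}`)
    .join("\n");
  return `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
}

// Pretend this vector is the embedding of the question.
const prompt = buildPrompt("How do I get my money back?", [0.85, 0.15, 0.2], chunks, 2);
console.log(prompt);
```

The question never mentions the word "refund," but its (pretend) vector sits closest to the two refund chunks, so those are the ones that end up in the prompt.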

How you'd get them in real life

You don't compute embeddings by hand. You send text to an embedding model and get the vector back — one call:

// embed() is a placeholder for whatever embedding API or library you use
const vector = await embed("a fluffy golden retriever puppy");
// → [0.03, -0.51, 0.27, ...]  (hundreds of numbers)

Then you store those vectors in a vector database and search by closeness.
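
To get a feel for what "store and search by closeness" means, here is a tiny in-memory stand-in. `TinyVectorStore` is a made-up class for this post, not a real library, and the 3-number vectors are invented. Real vector databases add persistence and fast indexes on top of the same basic idea.

```javascript
class TinyVectorStore {
  constructor() {
    this.entries = [];
  }

  // Scale a vector to length 1 so a plain dot product equals cosine similarity.
  static normalize(vector) {
    const length = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
    return vector.map((v) => v / length);
  }

  // Normalize once at insert time instead of on every search.
  add(id, vector) {
    this.entries.push({ id, vector: TinyVectorStore.normalize(vector) });
  }

  // Return the k stored entries closest to the query, best first.
  search(queryVector, k) {
    const query = TinyVectorStore.normalize(queryVector);
    return this.entries
      .map((entry) => ({
        id: entry.id,
        score: entry.vector.reduce((sum, v, i) => sum + v * query[i], 0),
      }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }
}

const store = new TinyVectorStore();
store.add("a fluffy golden retriever puppy", [0.03, -0.51, 0.27]);
store.add("a small fluffy kitten", [0.1, -0.4, 0.45]);
store.add("how to file a tax form", [-0.6, 0.2, 0.1]);

// Pretend this is the embedding of the search text "cute young dog".
console.log(store.search([0.04, -0.5, 0.25], 2).map((hit) => hit.id));
```

Normalizing at insert time is a small design choice: it moves the square-root work out of the search loop, so each comparison is just multiply-and-add.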

The one takeaway

Embeddings turn meaning into coordinates, so "is this similar to that?" becomes "are these two points close?"

Once that lands, a big chunk of modern AI (search, recommendations, RAG, clustering) stops being mysterious and starts being geometry.

👉 Try the Meaning Map (click a word, see its neighbors, watch king − man + woman): https://dev48v.infy.uk/ai/days/day3-embeddings.html

🌐 All concepts: https://dev48v.infy.uk/aifromzero.php

Tomorrow: what a neural net does — the intuition, no math.
