Ólafur Aron Jóhannsson

Posted on • Originally published at olafuraron.is

What are Vector Embeddings?

It's just matrices

Vector embeddings are an important piece of technology today because they capture the meaning of natural language as numbers a computer can compare.

You've used them before

  • Netflix and Spotify use them in their recommendation systems (because you watched X...)
  • Duplicate detection
  • Retrieval Augmented Generation (RAG) systems, which retrieve relevant text from a corpus
  • Content moderation
  • Question answering, matching intent rather than keywords ("How do I reset my password" -> "Forgot login credentials")

Turn text into numbers in high-dimensional space. More dimensions = more detail about meaning. MiniLM uses 384. Bigger models go to 1024+, but cost more compute and memory.

This is how computers know "doctor" and "physician" mean the same thing despite sharing zero letters.
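
For a concrete picture, here is a minimal sketch of what an embedding looks like in code: just a fixed-length list of floats. The numbers below are made up for illustration, not real model output; a real all-MiniLM-L6-v2 embedding has 384 of them.

```rust
fn main() {
    // Toy 4-dimensional "embeddings" with made-up values.
    // A real MiniLM embedding would be 384 floats produced by the model.
    let doctor: Vec<f32> = vec![0.82, 0.10, 0.65, -0.12];
    let physician: Vec<f32> = vec![0.80, 0.12, 0.66, -0.10]; // points in nearly the same direction
    let banana: Vec<f32> = vec![-0.30, 0.91, -0.05, 0.40];   // points somewhere else entirely

    println!("doctor    ({} dims): {:?}", doctor.len(), doctor);
    println!("physician ({} dims): {:?}", physician.len(), physician);
    println!("banana    ({} dims): {:?}", banana.len(), banana);
}
```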

Try It Yourself

→ Live Interactive Demo


How It Works

Neural networks learn to map words to vectors by training on billions of text examples. Words that appear in similar contexts end up with similar vectors.

The network learns patterns like:

  • "The doctor prescribed medication"
  • "The physician prescribed medication"

Since "doctor" and "physician" appear in similar contexts, they get similar embeddings.

Why 384 Dimensions?

Each embedding is 384 numbers (for MiniLM-L6-v2). Why so many?

Each dimension captures a different aspect of meaning:

  • Dimension 1 might encode "is this a profession?"
  • Dimension 47 might encode "medical-related?"
  • Dimension 203 might encode "human-related?"

The model learns these automatically from data. We can't interpret individual dimensions, but the full vector captures nuanced meaning.

Measuring Similarity

Cosine similarity measures how "aligned" two vectors are:

  • 1.0 = identical direction (same meaning)
  • 0.0 = perpendicular (unrelated)
  • -1.0 = opposite direction (rare in practice; even antonyms tend to appear in similar contexts)

```rust
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```
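
A quick usage sketch, reusing the `cosine_similarity` function above. The vectors are tiny hand-made stand-ins for real 384-dimensional embeddings, so the exact scores are only illustrative:

```rust
fn main() {
    // Toy 3-dimensional vectors standing in for real embeddings.
    let doctor = [0.82_f32, 0.10, 0.65];
    let physician = [0.80_f32, 0.12, 0.66];
    let banana = [-0.30_f32, 0.91, -0.05];

    // Nearly parallel vectors score close to 1.0; unrelated ones score much lower.
    println!("doctor vs physician: {:.3}", cosine_similarity(&doctor, &physician));
    println!("doctor vs banana:    {:.3}", cosine_similarity(&doctor, &banana));
}
```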

What You Can Build With This

Once you have embeddings, you can:

Semantic Search - Find documents by meaning, not keywords

Clustering - Group similar items together

Recommendations - "Users who liked X also liked Y"

Duplicate Detection - Find similar content with different wording

Classification - Categorize text by meaning

All powered by comparing vectors.
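
As a sketch of the semantic-search case: embed the documents once, embed the query, then rank by cosine similarity. The vectors below are toy values rather than real model output, and `cosine_similarity` is repeated from above so the example is self-contained.

```rust
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Pre-computed document embeddings (toy values standing in for model output).
    let docs = [
        ("Forgot login credentials", vec![0.81_f32, 0.15, 0.55]),
        ("Pricing and plans",        vec![-0.20_f32, 0.88, 0.10]),
        ("Contact support",          vec![0.40_f32, 0.35, 0.70]),
    ];

    // Embedding of the query "How do I reset my password" (again, toy values).
    let query = vec![0.78_f32, 0.20, 0.58];

    // Rank documents by cosine similarity to the query and take the best match.
    let best = docs
        .iter()
        .max_by(|a, b| {
            cosine_similarity(&query, &a.1)
                .total_cmp(&cosine_similarity(&query, &b.1))
        })
        .unwrap();

    println!("Best match: {}", best.0);
}
```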

The Model Behind This Demo

This demo uses EdgeBERT, my pure Rust BERT model inference implementation that runs in browsers via WebAssembly.

  • Model: all-MiniLM-L6-v2 (384 dimensions)

Other Resources

Also see BM25 vs TF-IDF

