When I first started exploring question-answering on PDFs, one thing confused me:
👉 Why do we use Sentence-BERT and FAISS together?
Can’t FAISS just create embeddings on its own?
Here’s the simple breakdown 👇
🔹 Sentence-BERT
It’s a neural network model (a BERT variant fine-tuned to produce sentence embeddings).
Converts text into embeddings (vectors of numbers).
These vectors capture the meaning of the text.
Example: “dog” and “puppy” → end up with vectors close to each other.
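To make "close to each other" concrete, here's a tiny sketch using cosine similarity on hand-made 3-d vectors (illustrative numbers only — real Sentence-BERT embeddings have hundreds of dimensions, and these values are not actual model output):

```python
import numpy as np

# Toy 3-d vectors standing in for real Sentence-BERT embeddings.
dog = np.array([0.9, 0.8, 0.1])
puppy = np.array([0.85, 0.75, 0.2])
car = np.array([0.1, 0.2, 0.9])

def cosine(a, b):
    # Cosine similarity: 1.0 = same direction (same meaning),
    # near 0.0 = unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(dog, puppy))  # high — similar meaning
print(cosine(dog, car))    # much lower — different meaning
```

The whole trick of semantic search is that "how similar in meaning" becomes "how close in vector space."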
🔹 FAISS (Facebook AI Similarity Search)
It doesn’t create embeddings.
Instead, it’s an efficient search engine for vectors.
Given a query vector, it finds the nearest neighbors (most similar chunks of text) super fast.
👉 Think of it like this:
Sentence-BERT = Translator (text → coordinates on a “map of meaning”)
FAISS = GPS (finds the closest points on that map in milliseconds)
💡 Together, they make semantic search possible:
Sentence-BERT gives us the “language of meaning” (embeddings)
FAISS makes searching through thousands or millions of embeddings lightning fast
Without Sentence-BERT → FAISS has nothing meaningful to compare.
Without FAISS → you can still compare embeddings, but it’s painfully slow at scale.
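That last point can be sketched too — "without FAISS" just means a brute-force scan, comparing the query against every stored embedding (the 384 here matches a common Sentence-BERT output size, but the vectors are random stand-ins):

```python
import numpy as np

# Brute-force nearest-neighbor search: what "without FAISS" looks like.
# Every query is compared against every stored vector — fine for
# thousands of embeddings, painfully slow for millions.
rng = np.random.default_rng(0)
corpus = rng.random((5000, 384)).astype(np.float32)
query = corpus[7]  # reuse a known vector as the query

# Squared L2 distance to every corpus vector, then take the smallest.
dists = ((corpus - query) ** 2).sum(axis=1)
best = int(np.argmin(dists))
print(best)  # 7 — the query matches itself
```

This works, but it's O(N) per query. FAISS's indexes (and its optimized exact search) are what make the same lookup fast at scale.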