Ramya Perumal

Posted on May 31

RAG - Hybrid search and RAG pipeline using FAISS DB

#ai #beginners #rag #nlp

Hybrid Search

Hybrid search combines two retrieval methods: dense (vector) retrieval and sparse (keyword) retrieval.

Dense embeddings focus on semantic meaning, while sparse embeddings focus on exact keyword matching. By combining both approaches, hybrid search improves retrieval accuracy and relevance.
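One simple way to combine the two signals is to normalize each score list and take a weighted sum. The sketch below illustrates that idea only; the function names, the `alpha` weight, and the sample scores are assumptions made up for this example, and real search engines ship their own fusion methods.

```python
def min_max(scores):
    """Scale a dict of doc_id -> score into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(dense, sparse, alpha=0.5):
    """Weighted sum of normalized dense and sparse scores.

    alpha is a hypothetical tuning weight: 1.0 means dense only,
    0.0 means sparse only. A document missing from one result list
    contributes 0 for that signal.
    """
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    docs = set(dense_n) | set(sparse_n)
    return {
        doc: alpha * dense_n.get(doc, 0.0) + (1 - alpha) * sparse_n.get(doc, 0.0)
        for doc in docs
    }


# Made-up scores for illustration.
dense = {"doc1": 0.82, "doc2": 0.75, "doc3": 0.40}  # e.g. cosine similarity
sparse = {"doc2": 7.1, "doc3": 2.3, "doc4": 5.0}    # e.g. BM25

combined = hybrid_scores(dense, sparse, alpha=0.5)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

Here `doc2` ranks first because it scores well on both signals, while `doc1` and `doc4` each appear in only one result list.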

OpenSearch is commonly used as a search engine for:

Log analysis
Observability and monitoring

One of the key features of OpenSearch is hybrid search, which combines:

Vector search (dense retrieval)
BM25-based search (sparse retrieval)

BM25 scores each document using term statistics, including:

TF (Term Frequency)
IDF (Inverse Document Frequency)

This allows OpenSearch to retrieve documents based on both semantic meaning and exact keyword matches.
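To make the TF and IDF parts concrete, here is a minimal BM25 sketch in plain Python. It is a simplified illustration, not the code OpenSearch runs: the whitespace tokenizer and the sample documents are assumptions, and `k1=1.2`, `b=0.75` are commonly used default values.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # IDF: terms that appear in fewer documents get more weight.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # TF with saturation (k1) and document length normalization (b).
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "faiss is a library for vector similarity search",
    "bm25 ranks documents by keyword matches",
    "opensearch supports keyword search and vector search",
]
scores = bm25_scores("keyword search", docs)
best = scores.index(max(scores))
print(docs[best])
```

The third document scores highest because it contains both query terms, and one of them twice.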

RAG Cycle

A Retrieval-Augmented Generation (RAG) system consists of the following stages:

1. Document Ingestion

Documents are split into chunks using a chunking strategy.
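As one example of a chunking strategy, the sketch below uses fixed-size character chunks with overlap. The function name and the `chunk_size` and `overlap` values are assumptions chosen for illustration; other strategies split on sentences, paragraphs, or tokens instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


document = "RAG systems retrieve relevant chunks before generation. " * 10
chunks = chunk_text(document, chunk_size=200, overlap=50)
print(len(chunks), [len(c) for c in chunks])
```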

2. Embedding Generation

Each chunk is converted into an embedding vector using an embedding model.
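The sketch below only shows the shape of this step: text goes in and a fixed-length vector comes out. `toy_embed` is a made-up stand-in that hashes words into buckets, so it captures no semantic meaning; a real pipeline would call a trained embedding model here.

```python
import hashlib
import math


def toy_embed(text, dim=64):
    """Map text to a fixed-length unit vector by hashing each word into a bucket.

    This is a stand-in for a trained embedding model, not a replacement.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


chunks = ["FAISS performs vector similarity search", "BM25 ranks by keywords"]
vectors = [toy_embed(c) for c in chunks]
print(len(vectors), len(vectors[0]))  # prints: 2 64
```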

3. Storage

The generated vectors are stored in a vector database.

4. Retrieval

When a user submits a query:

The query is converted into an embedding vector
The chunks most similar to the query vector are retrieved from the vector database
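A minimal sketch of the retrieval step, assuming cosine similarity and a brute-force scan over an in-memory list. The `store` contents and the 3-dimensional vectors are hand-written placeholders, not real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, store, k=2):
    """Return the k (chunk, score) pairs most similar to query_vec.

    store is a list of (chunk_text, vector) pairs.
    """
    scored = [(chunk, cosine(query_vec, vec)) for chunk, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hand-written vectors stand in for real embeddings.
store = [
    ("FAISS is a similarity search library", [0.9, 0.1, 0.0]),
    ("BM25 is a keyword ranking function", [0.1, 0.9, 0.0]),
    ("OpenSearch supports hybrid search", [0.5, 0.5, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the user query
results = retrieve(query_vec, store, k=2)
for chunk, score in results:
    print(f"{score:.3f}  {chunk}")
```

A vector index such as FAISS replaces the brute-force loop, but the inputs and outputs stay the same: a query vector in, the top-k nearest chunks out.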

5. Augmentation

The augmentation step combines:

User query
Retrieved documents/chunks
Prompt instructions

This combined context is then sent to the LLM.
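In practice, augmentation is often just string formatting. The sketch below shows one possible prompt layout; the function name, the template wording, and the sample chunks are assumptions for illustration.

```python
def build_prompt(query, chunks, instructions):
    """Combine instructions, retrieved chunks, and the user query into one prompt."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        f"{instructions}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )


prompt = build_prompt(
    query="What is FAISS used for?",
    chunks=[
        "FAISS is an open-source library for vector similarity search.",
        "FAISS indexes can be stored in local files.",
    ],
    instructions="Answer using only the context below. If the answer is not there, say so.",
)
print(prompt)
```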

6. Generation

The LLM processes the augmented context and generates a human-readable response.

RAG Flow

Documents
↓
Chunking
↓
Embeddings
↓
Vector Database
↓
User Query
↓
Retrieval
↓
Augmentation
(Query + Retrieved Documents + Instructions)
↓
LLM
↓
Human Readable Response
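The flow above can be sketched end to end in a few lines. Everything here is a stand-in chosen so the example runs without external services: `embed` uses word counts instead of a real embedding model, the "vector database" is a Python list, and `fake_llm` is a placeholder for an actual LLM call.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: a word-count vector (a real system would use a model)."""
    return Counter(text.lower().split())


def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fake_llm(prompt):
    """Placeholder for an LLM call; it only reports the prompt size."""
    return f"(model response based on a {len(prompt)}-character prompt)"


def rag_answer(query, documents, llm=fake_llm, k=1, chunk_size=60):
    # Chunking: split every document into fixed-size pieces.
    chunks = [d[i:i + chunk_size] for d in documents for i in range(0, len(d), chunk_size)]
    # Embeddings + storage: keep (chunk, vector) pairs in a list.
    store = [(c, embed(c)) for c in chunks]
    # Retrieval: rank chunks by similarity to the query vector.
    q = embed(query)
    top = sorted(store, key=lambda cv: similarity(q, cv[1]), reverse=True)[:k]
    # Augmentation: query + retrieved chunks + instructions.
    context = "\n".join(c for c, _ in top)
    prompt = f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"
    # Generation: hand the prompt to the LLM.
    return llm(prompt), [c for c, _ in top]


documents = [
    "FAISS is a library for vector similarity search.",
    "BM25 ranks documents by keyword matches.",
]
answer, used_chunks = rag_answer("what is faiss", documents)
print(used_chunks)
print(answer)
```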

FAISS

FAISS (Facebook AI Similarity Search) is an open-source library used for efficient vector similarity search.

FAISS is commonly used to:

Store vector indexes locally
Perform similarity search efficiently
Build small to medium-scale RAG applications

Advantages

Fast similarity search
Open source
Easy to set up
Works well for local development and prototyping

Limitations

FAISS primarily stores indexes in memory or local files. Because of this:

It is not a full-fledged vector database
Managing very large datasets becomes challenging
Continuous streaming and real-time updates are more difficult compared to dedicated vector databases

When to Use FAISS

FAISS is a good choice when:

Building proof-of-concept projects
Developing small to medium-sized RAG applications
Running local experiments

When to Consider a Vector Database

Consider a dedicated vector database for large-scale applications that require:

Billions of vectors
Real-time updates
Continuous data ingestion
