Day 7 - Dense Embedding - RAG

#llm #machinelearning #nlp #rag

Dense embeddings hold continuous numeric values, i.e. numbers with digits after the decimal point. Each chunk is converted to an embedding, and each dimension of that embedding holds a number, e.g. [0.3455566, 0.6777779, ...]. The generated vectors are plotted in a space called the latent space. Almost none of the values in a dense embedding are exactly 0.

Sparse embeddings mostly hold discrete values such as 0 and 1, and most of their entries are 0. Rather than semantic meaning, they capture the frequency or importance of words in a text.
Ex: one-hot encoding
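
To make the dense vs. sparse contrast concrete, here is a minimal sketch in plain Python. The vocabulary, the helper names (one_hot, zero_fraction) and the dense values are all made up for illustration; a real dense vector would come from an embedding model and have hundreds of dimensions.

```python
# Toy comparison of a sparse (one-hot) vector and a dense vector.
# The vocabulary and the dense values below are hypothetical.

vocabulary = ["cat", "dog", "fish", "bird", "horse"]

def one_hot(word, vocab):
    """Return a vector with 1 at the word's position and 0 everywhere else."""
    return [1 if term == word else 0 for term in vocab]

def zero_fraction(vector):
    """Fraction of entries that are exactly zero."""
    return sum(1 for value in vector if value == 0) / len(vector)

sparse_vec = one_hot("dog", vocabulary)

# Hypothetical dense embedding: every dimension carries a non-zero real value.
dense_vec = [0.3455566, 0.6777779, -0.1204412, 0.0518823, -0.4431907]

print("sparse:", sparse_vec, "zero fraction:", zero_fraction(sparse_vec))
print("dense: ", dense_vec, "zero fraction:", zero_fraction(dense_vec))
```

The one-hot vector is mostly zeros, while the dense vector has no zeros at all.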

Models for Dense embedding
1. LLM

Embed-only LLMs are also available. The sole purpose of these LLMs is to generate embeddings. Ex: Nomic Embed, BGE.
We can also prompt a general-purpose LLM to generate embeddings, but this is a costly operation.

2. Transformers (encoder)
Ex: MiniLM, Nomic transformers

These models are available on Hugging Face and Ollama, both of which host many other models as well.

How can we evaluate the performance of a RAG system?
For a given user query, a RAG system returns a set of matching documents. If the returned documents match our expectations, we can say it is yielding good results. Say we expect the RAG system to return documents a, b, c, d, e for a user query, but in reality it returns only a, b, d. Then 3 of the 5 expected documents are returned, so it meets 60% of our expectation (a recall of 0.6). Just as we write unit test cases for software code, we need to write test cases for user queries to evaluate a RAG system.
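
The a, b, c, d, e example can be written as a small test case. This is only a sketch: the function names (recall, precision) and the test-case layout are assumptions for illustration, and in a real pipeline the "retrieved" list would come from the retriever instead of being hard-coded.

```python
def recall(expected, retrieved):
    """Share of the expected documents that were actually retrieved."""
    expected_set = set(expected)
    if not expected_set:
        return 0.0
    return len(expected_set & set(retrieved)) / len(expected_set)

def precision(expected, retrieved):
    """Share of the retrieved documents that were expected."""
    retrieved_set = set(retrieved)
    if not retrieved_set:
        return 0.0
    return len(set(expected) & retrieved_set) / len(retrieved_set)

# One hypothetical test case per user query, like a unit test for retrieval.
test_cases = [
    {
        "query": "example user query",
        "expected": ["a", "b", "c", "d", "e"],
        "retrieved": ["a", "b", "d"],  # hard-coded here; normally the RAG output
    },
]

for case in test_cases:
    r = recall(case["expected"], case["retrieved"])
    p = precision(case["expected"], case["retrieved"])
    print(f"{case['query']}: recall={r:.2f}, precision={p:.2f}")
```

Recall answers "how many of the expected documents did we get?", while precision answers "how many of the returned documents were wanted?". Tracking both over a set of queries gives a more complete picture than either number alone.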

Should the same embedding model be used throughout the RAG pipeline?
Yes. If we use nomic-embed-text for document vectorisation, then the same model should be used for query vectorisation as well. If we use different models (one for document vectorisation and another for query vectorisation), the document vectors may end up in one region of the space and the query vector in another, so similarity search will not find the right documents. To avoid this, we need to use the same embedding model throughout the pipeline.
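
A toy sketch of why mixing models breaks retrieval. The two "models" here are not real embedding models: both simply count words, but each assigns words to different dimensions. That is only a stand-in for the fact that real models learn different vector spaces (and often different dimensions). All names in the snippet are hypothetical.

```python
import math

# Two toy "embedding models" that map the same words to different dimensions.
MODEL_A_VOCAB = ["embedding", "vector", "query", "document"]
MODEL_B_VOCAB = ["query", "document", "embedding", "vector"]

def embed(text, vocab):
    """Count how often each vocabulary term occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

text = "embedding vector"

doc_vec = embed(text, MODEL_A_VOCAB)            # documents indexed with model A
same_model_query = embed(text, MODEL_A_VOCAB)   # query embedded with model A
other_model_query = embed(text, MODEL_B_VOCAB)  # query embedded with model B

same_score = cosine_similarity(doc_vec, same_model_query)
mixed_score = cosine_similarity(doc_vec, other_model_query)

print("same model:  ", same_score)
print("mixed models:", mixed_score)
```

The query text is identical to the document text, yet only the same-model pair scores as a near-perfect match. With mixed models the vectors no longer line up, so the matching document looks unrelated.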
