RAG is Retrieval-Augmented Generation.
What is a Model?
A model is, at its simplest, an equation that maps inputs to outputs.
Example:
y=mx+c
During training, pairs of x and y values are provided. The model has to find the values of m and c that produce the line that best fits those data points. The learned values of m and c depend on the training data, so they vary from one use case to another.
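As a minimal sketch of what "finding m and c" can look like, the snippet below fits a line with ordinary least squares. The function name `fit_line` and the sample data are made up for illustration, and least squares is only one of several ways to fit a line.

```python
def fit_line(xs, ys):
    """Return (m, c) that minimise the squared error of y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Hypothetical training data that lies exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
print(m, c)  # 2.0 1.0
```

With noisy data the points would not sit exactly on the line, and the same formulas would return the closest-fitting m and c.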
What is a Parameter?
A parameter is a value that the model learns during training.
In the above equation:
m is a parameter
c is a parameter
The more parameters a model has, the more complex the patterns it can learn.
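To give a feel for how quickly parameter counts grow, here is a rough sketch that counts the weights and biases of a fully connected network. The layer sizes are arbitrary, and `count_parameters` is a hypothetical helper, not part of any library.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Each layer has n_in * n_out weights plus one bias per output.
        total += n_in * n_out + n_out
    return total


# y = mx + c is one input and one output: one weight (m) and one bias (c).
line_params = count_parameters([1, 1])

# A small network with two hidden layers of 16 units each.
small_net_params = count_parameters([1, 16, 16, 1])

print(line_params, small_net_params)  # 2 321
```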
What is Temperature?
Temperature controls how random the model's output is. It typically ranges from 0 to 1, although some APIs accept values up to 2.
Lower temperature gives more focused and predictable answers.
Higher temperature gives more varied and imaginative answers.
Temperature is passed as a setting alongside the prompt.
A value around 0.5 is a common starting point for balanced output.
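One common way temperature is applied is by dividing the model's raw scores (logits) by the temperature before turning them into probabilities. The sketch below assumes that approach. The logits are invented numbers, and real APIs handle this step internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    max_scaled = max(scaled)
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # sharply favours the top word
high = softmax_with_temperature(logits, 1.5)  # spreads probability more evenly

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low setting almost all of the probability lands on the highest-scoring word, which is why the output feels more predictable.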
SLM
SLM stands for Small Language Model.
It usually has fewer parameters than an LLM, typically from a few million to a few billion, and is often trained for a particular domain or specific tasks.
Training cost is generally lower than for an LLM but can still be high, depending on the use case.
Example: smallest ai, which provides smaller, voice-based AI models.
LLM
LLM stands for Large Language Model.
It usually has billions of parameters and is trained on data from many domains, which is why it is called a general-purpose model.
Example: gpt-oss-120b.
How an LLM Works
The core task of an LLM is to predict the next token, which is roughly a word or a piece of a word.
It generates text one token at a time, with each prediction based on the tokens that came before it.
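The predict-then-append loop can be illustrated with a toy word-level model that simply counts which word follows which. This is only a sketch: a real LLM uses a neural network over tokens rather than a count table, and the corpus and function names here are made up.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_words):
    """Repeatedly append the most frequent follower of the last word."""
    output = [start]
    for _ in range(max_words - 1):
        followers = counts.get(output[-1])
        if not followers:
            break  # the last word was never followed by anything
        next_word = followers.most_common(1)[0][0]
        output.append(next_word)
    return " ".join(output)


# Hypothetical training text.
corpus = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigram(corpus)

sentence = generate(model, "the", 5)
print(sentence)  # the cat sat on the
```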
Sometimes LLMs generate incorrect information confidently. This is called hallucination.
Example:
If the model has seen a lot of training data about cats and dogs but very little about lions, it may still answer a question about lions fluently, with details that are irrelevant or wrong.
Hallucination can be reduced by writing clear prompts and providing accurate, relevant context.
What is RAG?
RAG stands for Retrieval-Augmented Generation.
It is a method used to provide private or external knowledge such as:
Company policies
HR policy documents
Internal business documents
When a question is asked, the relevant parts of this information are retrieved and given to the LLM along with the question, so it can generate a human-readable answer based on that content.
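The "augmented" part of RAG often comes down to placing the retrieved text in the prompt. Here is a minimal sketch of that step, assuming retrieval has already happened. The policy text, the prompt wording, and `build_rag_prompt` are all hypothetical.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Combine retrieved chunks and the user's question into one prompt."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical chunks that a retrieval step might have returned.
chunks = [
    "Employees receive 20 days of paid leave per year.",
    "Leave requests are submitted through the HR portal.",
]

prompt = build_rag_prompt("How many paid leave days do I get?", chunks)
print(prompt)
```

The resulting string is what would be sent to the LLM in place of the bare question.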
Where is Private Data Stored?
For RAG, private data is usually stored in a vector database, which is a database built to store and search numerical vectors.
How Documents are Stored
Documents are split into smaller parts called chunks.
Each chunk is converted into a numerical vector, called an embedding, by an embedding model, and the vectors are stored in the vector database.
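A rough sketch of both steps follows. The chunker splits on words with a small overlap, and `embed` is a deliberately simple stand-in that counts vocabulary words. A real system would call a trained embedding model instead, and all names and sizes here are assumptions.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into word-based chunks that share `overlap` words."""
    step = chunk_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(chunk, vocabulary):
    """Toy embedding: count how often each vocabulary word appears."""
    words = chunk.lower().split()
    return [float(words.count(term)) for term in vocabulary]


# Hypothetical document and vocabulary.
document = (
    "employees get paid leave each year and leave requests "
    "go through the hr portal"
)
vocabulary = ["leave", "portal", "year"]

chunks = chunk_text(document, chunk_size=6, overlap=2)
vectors = [embed(chunk, vocabulary) for chunk in chunks]

for chunk, vector in zip(chunks, vectors):
    print(vector, chunk)
```

The overlap keeps a sentence that straddles a chunk boundary from being cut off from its context entirely.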
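To find the chunks most relevant to a query, the vector database runs a nearest-neighbor search. Two common approaches are:
KNN (K-Nearest Neighbors): an exact search that compares the query vector with every stored vector.
ANN (Approximate Nearest Neighbors): a faster search that uses an index and trades a little accuracy for speed.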
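Here is a minimal sketch of exact nearest-neighbor search using cosine similarity, which is one common way to compare embeddings. The stored vectors are invented three-dimensional examples, and an approximate search would replace the full scan with an index lookup.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k):
    """Exact search: score every stored vector and keep the k best."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in stored
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical chunks with made-up embeddings.
stored = [
    ("leave policy", [0.9, 0.1, 0.0]),
    ("expense policy", [0.1, 0.9, 0.0]),
    ("office hours", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]

results = top_k(query, stored, k=2)
print(results)  # ['leave policy', 'expense policy']
```

Scanning every vector is fine for a handful of chunks, but it becomes slow for millions of them. That is the case approximate search is designed for.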