DEV Community

Kalpana R

🎓 Session 1: Hello World of RAG + Introduction & Need of RAG

What I Learned

Today’s session introduced me to Retrieval-Augmented Generation (RAG) and why it’s becoming essential in AI. The focus was on understanding the limitations of plain large language models (LLMs) and how RAG helps overcome them.

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that combines:

  • Retrieval → fetching relevant, external information (from documents, databases, or the web).
  • Generation → using a language model (LLM) to produce coherent, context-aware responses.

The term “Augmented” means the LLM’s generation process is enhanced by the retrieved information — instead of relying only on its internal training data, it’s augmented with fresh, factual context from external sources.

Together, RAG helps produce outputs that are more factual, relevant, and up-to-date by grounding responses in retrieved information.
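To make the retrieval + generation split concrete, here is a minimal sketch in Python. The document store, the word-overlap scoring, and the prompt format are all illustrative assumptions on my part, not a real vector database or LLM call:

```python
# Minimal RAG sketch: a toy keyword retriever plus a prompt builder.
# A real system would use embeddings and an actual LLM; this only
# illustrates the retrieve-then-generate flow.

DOCUMENTS = [
    "Lions are large cats native to Africa and India.",
    "A river bank is the land alongside a river.",
    "Our refund policy allows returns within 30 days.",
]

def retrieve(query, docs, top_k=1):
    """Score each document by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query, context):
    """Augment the user question with the retrieved context."""
    return f"Context: {' '.join(context)}\nQuestion: {query}\nAnswer:"

context = retrieve("Where do lions live?", DOCUMENTS)
print(build_prompt("Where do lions live?", context))
```

The prompt that comes out is what would be sent to the LLM: the model now generates an answer grounded in the retrieved sentence rather than only its training data.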


Key Concepts:

Large language models (LLMs) generate text by predicting the next word based on learned patterns and context.
They use probabilities and context to produce coherent and meaningful responses.
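Next-word prediction can be sketched with a toy probability table. The table below is a made-up stand-in for the distribution a real LLM computes over its whole vocabulary:

```python
import random

# Toy next-word "model": a fixed probability table standing in for the
# learned distribution a real LLM computes over its vocabulary.
NEXT_WORD_PROBS = {
    "hello": {"world": 0.6, "there": 0.3, "everyone": 0.1},
}

def most_likely(word):
    """Greedy decoding: always pick the highest-probability next word."""
    probs = NEXT_WORD_PROBS[word]
    return max(probs, key=probs.get)

def sample_next(word, rng=None):
    """Sampling: pick the next word according to its probability."""
    rng = rng or random.Random()
    probs = NEXT_WORD_PROBS[word]
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

print(most_likely("hello"))  # world
print(sample_next("hello"))  # usually "world", sometimes "there" or "everyone"
```

Greedy decoding always returns the top word, while sampling introduces the variability that the temperature setting (discussed later) controls.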

Limitations of LLMs:

  • Hallucinations (making up answers when unsure).
  • Outdated knowledge (training data has a cutoff).
  • No access to private or domain-specific documents.

  • RAG combines retrieval (fetching relevant info) with generation (LLM output).
  • RAG helps improve factuality, relevance, and freshness by grounding responses in retrieved information.


How Do LLMs Learn? (Weights & Parameters)

Language models can be thought of as huge equations with parameters. During training, the model adjusts its weights — internal values that decide how strongly one word or feature influences another.

  • Example: Just as changing m or c in y = mx + c changes the line, adjusting weights changes the model’s predictions.
  • These weights are what allow the model to learn patterns from massive datasets.
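The y = mx + c analogy can be run end to end: the snippet below fits the two "weights" m and c by gradient descent on a few points from the true line y = 2x + 1. The data, learning rate, and iteration count are my own toy choices:

```python
# Fit y = m*x + c by gradient descent: nudge the weights m and c in
# the direction that reduces the mean squared error on the data.
data = [(x, 2 * x + 1) for x in range(10)]  # points from the true line y = 2x + 1

m, c = 0.0, 0.0
lr = 0.01
for _ in range(2000):
    grad_m = sum(2 * (m * x + c - y) * x for x, y in data) / len(data)
    grad_c = sum(2 * (m * x + c - y) for x, y in data) / len(data)
    m -= lr * grad_m
    c -= lr * grad_c

print(round(m, 2), round(c, 2))  # → 2.0 1.0
```

An LLM does the same thing at a vastly larger scale: billions of weights instead of two, and a next-word loss instead of a line-fitting loss.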

SLM vs. LLM (General Note)

While this session focused mainly on Large Language Models (LLMs) and RAG, it’s useful to know the distinction:

  • LLMs (Large Language Models) → Very big models trained on massive datasets. They’re powerful, but resource-heavy.

  • SLMs (Small Language Models) → More compact models designed for efficiency. They can run faster, use less memory, and are easier to deploy on devices with limited resources.

In practice:

  • LLMs are great for complex reasoning and broad knowledge.
  • SLMs are often used for lightweight tasks, edge devices, or situations where speed and efficiency matter.

This is useful context to keep in mind as I continue learning about RAG and AI systems.


Why Do We Need RAG?

Plain language models are powerful but limited:

  • They hallucinate (make up answers when unsure).
  • They rely on static training data (no updates after cutoff).
  • They can’t access private or domain-specific documents.

RAG helps reduce these issues by grounding answers in retrieved context.

Key Examples from the Session

  • Dogs, Cats, and Lion

    • Without RAG: If a model has not seen enough relevant information about lions in its training data, it may generate incorrect or fabricated answers (hallucinations).
    • With RAG: Retrieval brings in factual information about lions from external sources, helping the model generate a more accurate and grounded response.
  • COVID vs. Current Events

    • Without RAG: The model may know about COVID (from training data) but struggle with recent events due to outdated knowledge.
    • With RAG: Retrieval pulls in recent articles or documents, allowing the model to respond with up-to-date context.
  • River Bank → Context confusion: “river bank” vs. “financial bank.”

    • Without RAG: The model may confuse “river bank” (geography) with “bank” (finance) depending on context.
    • With RAG: Retrieval provides relevant domain context, helping the model choose the correct meaning.
  • Company Docs → LLM alone can’t answer from private files, but RAG can.

    • Without RAG: The model cannot access private or internal company documents.
    • With RAG: Retrieval fetches relevant internal documents, enabling accurate answers based on company data.
  • Hello Predictions

    • Without RAG: With “Hello,” low temperature may produce “World,” while high temperature may produce “How are you?” or other creative outputs — but answers may drift.
    • With RAG: Even at high temperature, retrieval keeps outputs grounded in factual context.
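The “river bank” example above can be sketched in code: retrieval picks whichever context snippet best matches the question, and that choice fixes the word sense. The context snippets and the overlap scoring are simplified assumptions, not how production retrievers work:

```python
import string

# Toy disambiguation: the retrieved context decides which sense of
# "bank" the answer would use. The snippets below are made-up examples.
CONTEXTS = {
    "geography": "A river bank is the sloping land beside a river.",
    "finance": "A bank is an institution that holds deposits and makes loans.",
}

def tokens(text):
    """Lowercase and strip punctuation before splitting into words."""
    return set(text.lower().translate(
        str.maketrans("", "", string.punctuation)).split())

def best_context(query):
    """Return the context label with the largest word overlap."""
    q = tokens(query)
    return max(CONTEXTS, key=lambda k: len(q & tokens(CONTEXTS[k])))

print(best_context("How steep is a river bank?"))            # geography
print(best_context("Can a bank pay interest on deposits?"))  # finance
```

Whichever snippet is retrieved gets pasted into the prompt, so the model answers about the right kind of “bank.”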

Temperature Settings [Temperature in LLMs]

  • Low Temperature (~0) → More deterministic and consistent responses.
  • High Temperature (~1 or above) → More creative and varied responses.
  • Takeaway: Use low temperature for consistency and high temperature for creativity. Note that temperature controls randomness, not correctness.

  • My Note: Retrieval can help guide responses with relevant context, even when temperature increases variability.
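Temperature is usually applied by dividing the model’s raw scores (logits) before the softmax. The toy logits below are invented for illustration, but the math is the standard formulation:

```python
import math

# Softmax with temperature: low temperature sharpens the distribution
# toward the top choice; high temperature flattens it.
def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                         # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy scores for "World", "there", "everyone"
print(softmax_with_temperature(logits, 0.2))  # top choice near-certain
print(softmax_with_temperature(logits, 2.0))  # probabilities spread out
```

At temperature 0.2 the first word gets almost all the probability mass (deterministic “World”), while at 2.0 the alternatives become plausible — which is exactly the drift the “Hello Predictions” example describes.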


Real-World Applications of RAG

  • Customer Support → Answers from FAQs and manuals.
  • Healthcare → Grounded responses from medical databases.
  • Education → Fact-checked explanations for learners.
  • Enterprise Search → Unlocking insights from private organizational data.

Key Takeaways (Quick Reference)

  • RAG = Retrieval + Generation.
  • Helps reduce hallucinations, outdated knowledge issues, and lack of private context.
  • Temperature controls randomness and creativity, not correctness.
  • Real-world uses: support, healthcare, education, enterprise search.
  • Core idea: ground AI in facts before generating answers.

My Conclusion

Today’s session gave me a strong foundation in understanding the limitations of AI and how RAG helps overcome them.

Instead of relying only on memory, RAG allows AI to look up relevant information before answering—just like how we perform better when we can refer to notes.

This is just the beginning of my learning journey with RAG — I’ll continue documenting as I go.


📚 This post is part of my Learning Notes – RAG Series.

Next up: Session 2, where I’ll continue exploring and documenting my journey.
