Ryan Giggs

Posted on Jun 21

I Just Built an Agentic RAG System From Scratch — Here's What I Learned (LLM Zoomcamp 2026, Module 1)

#llm #ai #rag #datatalksclub

I just completed Module 1 of the LLM Zoomcamp 2026 by @DataTalksClub — and honestly, this is the most hands-on AI course I've taken.

No fluff. No hand-holding. Just real code, real concepts, and a working system by the end.

Here's everything I learned — and why it matters if you're a data engineer or software developer trying to break into LLM applications.

What Is LLM Zoomcamp?

LLM Zoomcamp is a free, open-source course by Alexey Grigorev and DataTalks.Club that teaches you how to build production-ready LLM applications from scratch. No GPU required. No expensive API bills. Just Python, curiosity, and a willingness to build.

Module 1 is called Agentic RAG — and it covers everything from what an LLM is to building a fully autonomous AI agent that decides when and what to search.

What I Built in Module 1

1. A RAG Pipeline From Scratch

RAG stands for Retrieval-Augmented Generation. The idea is simple: instead of asking an LLM a question and hoping it knows the answer, you first search a knowledge base for relevant documents, then pass those documents as context to the LLM.

The result? Grounded, accurate answers instead of hallucinations.

I built the full pipeline in ~30 lines of Python:

def rag(question):
    search_results = search(question)          # 1. find relevant docs
    prompt = build_prompt(question, search_results)  # 2. build context prompt
    return llm(prompt)                         # 3. generate answer

Simple. Powerful. The foundation of every production RAG system.

2. Document Indexing with minsearch

I indexed 1,242 FAQ documents from the DataTalks.Club Zoomcamp courses using minsearch — a lightweight keyword search library built by Alexey himself. It uses the same concepts as Elasticsearch (text fields, keyword fields, boosting, filtering) but runs in pure Python with zero infrastructure.

3. Document Chunking for Better Retrieval

Long documents hurt retrieval precision. A match deep inside a 10,000-character page still pulls the whole page into the LLM context — wasteful and noisy.

The fix is chunking: split each document into smaller overlapping pieces and index those instead.

from gitsource import chunk_documents
chunks = chunk_documents(documents, size=2000, step=1000)

The result? ~3× fewer input tokens sent to the LLM — smaller, faster, cheaper, and more accurate.

4. Turning RAG Into an Agent with Function Calling

This is where it gets exciting.

A standard RAG pipeline is fixed: question → search once → answer. The developer controls the flow.

An agentic RAG system puts the LLM in charge:

question → LLM thinks → search? → LLM thinks → search again? → LLM thinks → answer

The LLM decides:

Whether to search at all
What to search for
How many times to search
When it has enough context to answer

I implemented this using function calling — giving the LLM a search tool it can invoke on its own. In one test, the agent autonomously made 3 different searches with progressively refined queries before generating a final answer. No hardcoding. No fixed flow.

Tech Stack

LLM: Groq API (llama-3.1-8b-instant) — free tier, blazing fast
Search: minsearch — lightweight keyword search
Chunking: gitsource chunk_documents
Environment: uv + Python 3.12
Knowledge base: DataTalks.Club Zoomcamp lesson pages (72 markdown files, 295 chunks)

Key Takeaways

1. RAG is just 3 functions. search() + build_prompt() + llm(). Everything else is optimization.

2. Chunking matters more than you think. Going from full documents to 2,000-character chunks reduced input tokens by 3× and improved answer quality significantly.

3. Agents are just loops. The "magic" of agentic AI is literally a while loop that keeps calling the LLM until finish_reason == "stop". Understanding this demystifies 90% of agent frameworks like LangChain and LlamaIndex.

4. You don't need OpenAI. I ran everything on Groq's free tier using Llama 3.1. The OpenAI-compatible API means zero code changes.

My Homework Solution

All my code for Module 1 is open source on GitHub:

github.com/Derrick-Ryan-Giggs/llm-zoomcamp-2026

It includes:

rag-intro.ipynb — the full RAG pipeline
agents.ipynb — function calling and the agentic loop
homework.ipynb — Module 1 homework solutions

Want to Learn Too?

LLM Zoomcamp is completely free. No paywall, no certificate fees, just open-source learning.

Module 2 is up next — Vector Search. I'll be writing about that too.

If you're a data engineer, ML practitioner, or software developer who wants to build real LLM applications — not just call ChatGPT — this course is for you.

Are you taking LLM Zoomcamp 2026? Drop a comment — I'd love to connect and compare notes.

DEV Community