<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Threshika Vijayakumar</title>
    <description>The latest articles on DEV Community by Threshika Vijayakumar (@threshika_vs).</description>
    <link>https://dev.to/threshika_vs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3567341%2Fdf5261ab-7492-494d-a822-f9d8e7738f91.png</url>
      <title>DEV Community: Threshika Vijayakumar</title>
      <link>https://dev.to/threshika_vs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/threshika_vs"/>
    <language>en</language>
    <item>
      <title>I Thought My RAG Was Broken. The Real Problem Was Chunking.</title>
      <dc:creator>Threshika Vijayakumar</dc:creator>
      <pubDate>Wed, 10 Jun 2026 06:22:23 +0000</pubDate>
      <link>https://dev.to/threshika_vs/i-thought-my-rag-was-broken-the-real-problem-was-chunking-4b04</link>
      <guid>https://dev.to/threshika_vs/i-thought-my-rag-was-broken-the-real-problem-was-chunking-4b04</guid>
      <description>&lt;p&gt;When I started learning RAG, I assumed the difficult parts would be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embeddings&lt;/li&gt;
&lt;li&gt;Vector databases&lt;/li&gt;
&lt;li&gt;LLMs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I was wrong.&lt;/p&gt;

&lt;p&gt;My embeddings were working.&lt;/p&gt;

&lt;p&gt;My vector database was returning results.&lt;/p&gt;

&lt;p&gt;The LLM was generating answers.&lt;/p&gt;

&lt;p&gt;Yet the responses were often incomplete, irrelevant, or missing important context.&lt;/p&gt;

&lt;p&gt;After hours of debugging, I discovered the problem wasn't the model.&lt;/p&gt;

&lt;p&gt;It was how I was splitting my documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Chunking Matters More Than Most People Think
&lt;/h2&gt;

&lt;p&gt;A RAG system can only retrieve what it can find.&lt;/p&gt;

&lt;p&gt;And what it can find depends heavily on how your documents are chunked.&lt;/p&gt;

&lt;p&gt;Bad chunking leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing context&lt;/li&gt;
&lt;li&gt;Poor retrieval&lt;/li&gt;
&lt;li&gt;Irrelevant answers&lt;/li&gt;
&lt;li&gt;Hallucinations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even when everything else is configured correctly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbzwjf9nc50wravaeroxu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbzwjf9nc50wravaeroxu.png" alt=" " width="528" height="260"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 1: Good chunking improves retrieval quality, while bad chunking fragments context and hurts answer quality.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In many cases, the quality of your answers is decided before the LLM generates a single token.&lt;/p&gt;


&lt;h2&gt;
  
  
  Mistake #1: Chunks That Are Too Large
&lt;/h2&gt;

&lt;p&gt;Imagine storing an entire chapter as a single chunk.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;20-page chapter
        ↓
      1 chunk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now a user asks a question about one paragraph.&lt;/p&gt;

&lt;p&gt;The retrieval system has to bring back the entire chapter.&lt;/p&gt;

&lt;p&gt;This introduces a lot of irrelevant context and makes retrieval less precise.&lt;/p&gt;

&lt;p&gt;Bigger chunks don't always mean better answers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Mistake #2: Chunks That Are Too Small
&lt;/h2&gt;

&lt;p&gt;I then tried the opposite approach.&lt;/p&gt;

&lt;p&gt;Tiny chunks.&lt;/p&gt;

&lt;p&gt;Something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk 1:
The capital of France is

Chunk 2:
Paris
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The problem?&lt;/p&gt;

&lt;p&gt;Context gets destroyed.&lt;/p&gt;

&lt;p&gt;The retrieval system may find only part of the answer.&lt;/p&gt;

&lt;p&gt;The information exists, but the meaning is fragmented.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F87obuojtenss8v8vb673.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F87obuojtenss8v8vb673.png" alt=" " width="799" height="559"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 2: Effective chunking is a balance. Chunks that are too large introduce noise, while chunks that are too small lose context.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This was the first time I realized that chunk size isn't just a preprocessing setting—it directly impacts retrieval quality.&lt;/p&gt;


&lt;h2&gt;
  
  
  Mistake #3: No Chunk Overlap
&lt;/h2&gt;

&lt;p&gt;This was one of the most surprising lessons.&lt;/p&gt;

&lt;p&gt;Without overlap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk 1
--------
Embeddings
Vector Search

Chunk 2
--------
Retrieval
Generation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What happens if an important concept sits on the boundary between two chunks?&lt;/p&gt;

&lt;p&gt;You lose context.&lt;/p&gt;

&lt;p&gt;Adding overlap helps preserve information that naturally spans multiple chunks.&lt;/p&gt;
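
&lt;p&gt;To make overlap concrete, here is a minimal sketch of a sliding-window splitter. The function name &lt;code&gt;chunk_with_overlap&lt;/code&gt; and its defaults are my own illustrative choices, not something a particular library provides, and it counts characters rather than tokens to keep things simple.&lt;/p&gt;

```python
def chunk_with_overlap(text, chunk_size=500, overlap=50):
    """Split text into character windows where neighbours share `overlap` characters."""
    if overlap not in range(chunk_size):
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap  # how far each new window advances
    # A new window is only needed while it still adds text past the previous one.
    stop = max(len(text) - overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, stop, step)]


chunks = chunk_with_overlap("abcdefghij", chunk_size=4, overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

&lt;p&gt;Each chunk starts with the last character of the previous one, so anything sitting on a boundary shows up in both chunks.&lt;/p&gt;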




&lt;h2&gt;
  
  
  Mistake #4: Splitting by Character Count Alone
&lt;/h2&gt;

&lt;p&gt;A lot of tutorials do something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and stop there.&lt;/p&gt;

&lt;p&gt;The problem is that text doesn't naturally organize itself into 500-character blocks.&lt;/p&gt;

&lt;p&gt;You might accidentally split:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The vector database stores embeddings used for...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;...semantic search across documents.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sentence survives.&lt;/p&gt;

&lt;p&gt;The meaning doesn't.&lt;/p&gt;
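
&lt;p&gt;One way to avoid cutting sentences in half is to pack whole sentences into a chunk until the size limit is reached. The sketch below is only an illustration: &lt;code&gt;split_by_sentences&lt;/code&gt; is a hypothetical helper, and its sentence detection is deliberately naive (it will trip over abbreviations and decimals such as "3.5").&lt;/p&gt;

```python
import re


def split_by_sentences(text, max_chars=500):
    """Pack whole sentences into chunks, aiming for at most max_chars each."""
    # Naive sentence detection: a run of text followed by ., ! or ?
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]*", text)]
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = (
    "The vector database stores embeddings used for semantic search across documents. "
    "Chunking decides what gets retrieved. Overlap protects boundaries."
)
for chunk in split_by_sentences(text, max_chars=100):
    print(chunk)
```

&lt;p&gt;With a 100-character limit, the first sentence stays in one piece instead of being cut mid-phrase. A single sentence longer than the limit is kept whole, so the limit is a target rather than a hard cap.&lt;/p&gt;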




&lt;h2&gt;
  
  
  Mistake #5: Using the Same Strategy Everywhere
&lt;/h2&gt;

&lt;p&gt;Not every document should be chunked the same way.&lt;/p&gt;

&lt;p&gt;Documentation, codebases, contracts, and research papers all have different structures.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Documentation → section-based chunks&lt;/li&gt;
&lt;li&gt;Code → function or class-based chunks&lt;/li&gt;
&lt;li&gt;Research papers → section-based chunks&lt;/li&gt;
&lt;li&gt;Contracts → clause-based chunks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The document structure often provides better chunk boundaries than arbitrary token counts.&lt;/p&gt;
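
&lt;p&gt;As a rough illustration of structure-based chunking, the sketch below splits a Markdown document at its headings. The helper name &lt;code&gt;split_by_headings&lt;/code&gt; is an assumption for this example, and it treats any line starting with "#" as a heading, so it would misread "#" lines inside code blocks.&lt;/p&gt;

```python
def split_by_headings(markdown_text):
    """Return (heading, body) pairs, one per Markdown section that has text."""
    sections = []
    heading, body = "(no heading)", []
    for line in markdown_text.splitlines():
        if line.startswith("#"):
            # A new heading closes the section collected so far.
            if any(part.strip() for part in body):
                sections.append((heading, "\n".join(body).strip()))
            heading, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    if any(part.strip() for part in body):
        sections.append((heading, "\n".join(body).strip()))
    return sections


doc = "# Install\nRun the installer.\n\n# Usage\nCall the API.\nCheck the result."
for heading, body in split_by_headings(doc):
    print(heading, "=", repr(body))
```

&lt;p&gt;Keeping the heading next to each chunk also gives you a piece of metadata to store alongside the embedding.&lt;/p&gt;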




&lt;h2&gt;
  
  
  The Lesson That Changed My Thinking
&lt;/h2&gt;

&lt;p&gt;When I started learning RAG, I viewed chunking as a preprocessing step.&lt;/p&gt;

&lt;p&gt;Now I see it differently.&lt;/p&gt;

&lt;p&gt;Chunking is retrieval engineering.&lt;/p&gt;

&lt;p&gt;Because retrieval quality directly affects answer quality.&lt;/p&gt;

&lt;p&gt;Better chunks lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Better retrieval&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Better context&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Better answers&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without changing the LLM at all.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The biggest surprise in my RAG journey wasn't embeddings or vector databases.&lt;/p&gt;

&lt;p&gt;It was discovering how much impact document splitting has on retrieval.&lt;/p&gt;

&lt;p&gt;If your RAG system isn't performing well, don't immediately blame the model.&lt;/p&gt;

&lt;p&gt;Look at your chunks first.&lt;/p&gt;

&lt;p&gt;The problem might already exist before the LLM ever sees the question.&lt;/p&gt;




&lt;p&gt;💡 What's your preferred chunking strategy when building RAG systems?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>machinelearning</category>
      <category>llm</category>
    </item>
    <item>
      <title>The Day I Realized RAG Isn't an AI Problem</title>
      <dc:creator>Threshika Vijayakumar</dc:creator>
      <pubDate>Wed, 10 Jun 2026 05:30:10 +0000</pubDate>
      <link>https://dev.to/threshika_vs/the-day-i-realized-rag-isnt-an-ai-problem-23ac</link>
      <guid>https://dev.to/threshika_vs/the-day-i-realized-rag-isnt-an-ai-problem-23ac</guid>
      <description>&lt;p&gt;When I first started learning Retrieval-Augmented Generation (RAG), I thought the hardest part would be understanding Large Language Models.&lt;/p&gt;

&lt;p&gt;I was wrong.&lt;/p&gt;

&lt;p&gt;I thought I would spend most of my time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choosing the best LLM&lt;/li&gt;
&lt;li&gt;Writing better prompts&lt;/li&gt;
&lt;li&gt;Tweaking model parameters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead, I ended up spending most of my time thinking about &lt;strong&gt;search&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And that's when something clicked:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;RAG isn't primarily an AI problem.&lt;/p&gt;

&lt;p&gt;It's a search problem.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let me explain.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Mental Model Most Beginners Have
&lt;/h2&gt;

&lt;p&gt;When we first interact with ChatGPT or any AI assistant, we imagine something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Question
   ↓
  AI
   ↓
Answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple, right?&lt;/p&gt;

&lt;p&gt;Ask a question.&lt;/p&gt;

&lt;p&gt;Get an answer.&lt;/p&gt;

&lt;p&gt;But once you start building applications with your own data, this model breaks.&lt;/p&gt;

&lt;p&gt;Fast.&lt;/p&gt;




&lt;h2&gt;
  
  
  My First "Wait... Why Is This Wrong?" Moment
&lt;/h2&gt;

&lt;p&gt;Imagine asking an AI:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"What happened in yesterday's IPL match?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"What's the latest version of this framework?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"What does page 42 of this PDF say?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The model might answer confidently.&lt;/p&gt;

&lt;p&gt;The problem?&lt;/p&gt;

&lt;p&gt;It may not actually know.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because LLMs don't magically know everything.&lt;/p&gt;

&lt;p&gt;They only know what was available during training.&lt;/p&gt;

&lt;p&gt;Anything outside that knowledge is a problem.&lt;/p&gt;

&lt;p&gt;And that's exactly the problem RAG tries to solve.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Thought RAG Was
&lt;/h2&gt;

&lt;p&gt;When I first heard about RAG, I imagined something extremely complicated.&lt;/p&gt;

&lt;p&gt;Maybe:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple AI models&lt;/li&gt;
&lt;li&gt;Complex reasoning systems&lt;/li&gt;
&lt;li&gt;Fancy prompt engineering tricks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But after digging deeper, I realized RAG is surprisingly simple.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiv21f1tf10w4b06rsv1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiv21f1tf10w4b06rsv1.png" alt=" " width="580" height="342"&gt;&lt;/a&gt;&lt;br&gt;
The biggest surprise wasn't the "Generation" part.&lt;/p&gt;

&lt;p&gt;It was the "Retrieval" part.&lt;/p&gt;

&lt;p&gt;Notice how the answer isn't generated immediately. The system first searches for relevant information, gathers context, and only then asks the LLM to generate a response.&lt;/p&gt;

&lt;p&gt;This was the moment I started seeing RAG as a search problem rather than an AI problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Library Analogy That Made Everything Click 📚
&lt;/h2&gt;

&lt;p&gt;Imagine walking into a library with one million books.&lt;/p&gt;

&lt;p&gt;You ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How do black holes affect time?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What would a librarian do?&lt;/p&gt;

&lt;p&gt;Probably not this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Read 1,000,000 books
      ↓
Find answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Find relevant books
      ↓
Open relevant pages
      ↓
Read only what's needed
      ↓
Answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's basically RAG.&lt;/p&gt;

&lt;p&gt;The system first finds relevant information.&lt;/p&gt;

&lt;p&gt;Then the LLM uses that information to generate an answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Traditional Search Isn't Enough
&lt;/h2&gt;

&lt;p&gt;Let's say a document contains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Electric vehicles are becoming more popular.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now imagine the user searches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Why are battery-powered cars growing in popularity?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keyword matching struggles.&lt;/p&gt;

&lt;p&gt;Humans don't.&lt;/p&gt;

&lt;p&gt;We instantly understand both sentences are talking about the same thing.&lt;/p&gt;

&lt;p&gt;Machines need help understanding that connection.&lt;/p&gt;
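
&lt;p&gt;You can see the keyword problem with a few lines of Python. This is only a toy comparison of word sets (the &lt;code&gt;keywords&lt;/code&gt; helper is made up for this example); real keyword search engines are more sophisticated, but they face the same vocabulary gap.&lt;/p&gt;

```python
import re


def keywords(text):
    """Lowercase the text and return its set of words (hyphenated words stay whole)."""
    return set(re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower()))


document = "Electric vehicles are becoming more popular."
query = "Why are battery-powered cars growing in popularity?"

shared = keywords(document).intersection(keywords(query))
print(shared)  # {'are'}
```

&lt;p&gt;The only word the two sentences share is "are", which says nothing about the topic.&lt;/p&gt;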

&lt;p&gt;This is where embeddings enter the story.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Concept That Changed Everything: Embeddings
&lt;/h2&gt;

&lt;p&gt;Embeddings sounded scary when I first heard the term.&lt;/p&gt;

&lt;p&gt;In reality, the idea is beautiful.&lt;/p&gt;

&lt;p&gt;We convert text into numbers.&lt;/p&gt;

&lt;p&gt;Something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"car"
      ↓
[0.12, -0.55, 0.89, ...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"vehicle"
      ↓
[0.15, -0.51, 0.91, ...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exact numbers don't matter.&lt;/p&gt;

&lt;p&gt;What matters is this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Similar meanings produce similar vectors.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Which means:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;car
vehicle
automobile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;end up close together in vector space.&lt;/p&gt;

&lt;p&gt;Now the machine can search by meaning instead of exact words.&lt;/p&gt;

&lt;p&gt;That's huge.&lt;/p&gt;
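
&lt;p&gt;"Close together" is usually measured with cosine similarity. Here is a small sketch using only the first three numbers of the example vectors above. Real embeddings have hundreds or thousands of dimensions, and the &lt;code&gt;banana&lt;/code&gt; vector is invented purely for contrast.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


car = [0.12, -0.55, 0.89]      # truncated toy vector from the example above
vehicle = [0.15, -0.51, 0.91]  # truncated toy vector from the example above
banana = [0.90, 0.40, -0.10]   # hypothetical vector for an unrelated word

print(cosine_similarity(car, vehicle))  # very close to 1
print(cosine_similarity(car, banana))   # negative: pointing a different way
```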




&lt;h2&gt;
  
  
  The Most Underrated Part of RAG: Chunking
&lt;/h2&gt;

&lt;p&gt;When people talk about RAG, they usually talk about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;li&gt;Gemini&lt;/li&gt;
&lt;li&gt;Claude&lt;/li&gt;
&lt;li&gt;Vector Databases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But one of the most important decisions happens before any of that.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chunking
&lt;/h3&gt;

&lt;p&gt;Imagine storing an entire 200-page book as a single document.&lt;/p&gt;

&lt;p&gt;A user asks about one sentence.&lt;/p&gt;

&lt;p&gt;Good luck retrieving that efficiently.&lt;/p&gt;

&lt;p&gt;Instead we split content into smaller chunks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Document
   ↓
Chunk 1
Chunk 2
Chunk 3
Chunk 4
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now retrieval becomes much more precise.&lt;/p&gt;

&lt;p&gt;One thing I've learned:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Bad chunking can destroy a RAG system.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even when everything else is configured correctly.&lt;/p&gt;




&lt;h2&gt;
  
  
  So Why Do We Need Vector Databases?
&lt;/h2&gt;

&lt;p&gt;After creating embeddings, we need somewhere to store them.&lt;/p&gt;

&lt;p&gt;That's where vector databases come in.&lt;/p&gt;

&lt;p&gt;Traditional databases answer questions like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'John'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vector databases answer questions like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Find content most similar
to this question
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a completely different problem.&lt;/p&gt;

&lt;p&gt;And it's what makes semantic search possible.&lt;/p&gt;
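
&lt;p&gt;Conceptually, that similarity question looks like the sketch below: compare the query vector with every stored vector and keep the best matches. The vectors here are invented three-dimensional toys, and the full scan is only for illustration. Real vector databases typically rely on approximate nearest-neighbour indexes so they don't have to compare against every row.&lt;/p&gt;

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_vector, store, k=2):
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Hypothetical (text, embedding) pairs; a real system would get these from a model.
store = [
    ("Electric vehicles are becoming more popular.", [0.9, 0.1, 0.0]),
    ("The library opens at nine.", [0.0, 0.2, 0.9]),
    ("Battery prices keep falling.", [0.7, 0.6, 0.1]),
]
query = [0.8, 0.3, 0.0]  # pretend embedding of a question about battery-powered cars

print(top_k(query, store, k=2))
```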

&lt;p&gt;Popular options include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PostgreSQL + pgvector&lt;/li&gt;
&lt;li&gt;Pinecone&lt;/li&gt;
&lt;li&gt;Weaviate&lt;/li&gt;
&lt;li&gt;Qdrant&lt;/li&gt;
&lt;li&gt;Milvus&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Actually Happens Inside a RAG Pipeline?
&lt;/h2&gt;

&lt;p&gt;Here's the simplified flow:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22ae5qfcpsrfulbxhqhp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22ae5qfcpsrfulbxhqhp.png" alt=" " width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice something?&lt;/p&gt;

&lt;p&gt;The LLM appears near the end.&lt;/p&gt;

&lt;p&gt;Most of the work happens before generation.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Biggest Lesson From My RAG Journey
&lt;/h2&gt;

&lt;p&gt;When I started learning RAG, I thought:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Better model = better answers&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now I think:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Better retrieval = better answers&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because even the most powerful model can't answer questions if the relevant information never reaches it.&lt;/p&gt;

&lt;p&gt;That's why experienced engineers spend so much time improving:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chunking&lt;/li&gt;
&lt;li&gt;Embeddings&lt;/li&gt;
&lt;li&gt;Search quality&lt;/li&gt;
&lt;li&gt;Metadata filtering&lt;/li&gt;
&lt;li&gt;Reranking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The answer quality often depends more on retrieval than generation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The most surprising thing I've learned about RAG is that it changed the way I think about AI systems.&lt;/p&gt;

&lt;p&gt;I used to believe the intelligence lived entirely inside the model.&lt;/p&gt;

&lt;p&gt;Now I realize a huge part of the intelligence comes from finding the right information at the right time.&lt;/p&gt;

&lt;p&gt;And that's why I no longer see RAG as just an AI technique.&lt;/p&gt;

&lt;p&gt;I see it as a search problem that happens to use AI.&lt;/p&gt;

&lt;p&gt;And honestly?&lt;/p&gt;

&lt;p&gt;That realization taught me more about modern AI than any prompt engineering tutorial ever did.&lt;/p&gt;




&lt;p&gt;💡&lt;strong&gt;What's the most surprising thing you've learned while building or learning RAG? I'd love to hear your experience in the comments.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>llm</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Keycloak: The Open-Source Hero Behind Secure Logins</title>
      <dc:creator>Threshika Vijayakumar</dc:creator>
      <pubDate>Sun, 26 Oct 2025 16:11:40 +0000</pubDate>
      <link>https://dev.to/threshika_vs/keycloak-the-open-source-hero-behind-secure-logins-43n5</link>
      <guid>https://dev.to/threshika_vs/keycloak-the-open-source-hero-behind-secure-logins-43n5</guid>
      <description>&lt;p&gt;Every time you click “Login with Google” or “Sign in with GitHub,” a complex dance happens in the background: tokens are exchanged, your identity is verified, and permissions are granted, all in a matter of seconds.&lt;/p&gt;

&lt;p&gt;While many developers rely on cloud services like AWS Cognito or Firebase Authentication, there’s a powerful open-source alternative that gives you full control over authentication and user management: &lt;em&gt;Keycloak&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Keycloak?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Keycloak is an open-source Identity and Access Management (IAM) solution developed by Red Hat.&lt;br&gt;
It helps developers add authentication, authorization, and single sign-on (SSO) to their applications without writing security code from scratch.&lt;/p&gt;

&lt;p&gt;In simple terms:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Keycloak helps you manage who can access your application, how they log in, and what permissions they have.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Why Keycloak When There Are So Many Cloud Options?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You might wonder: why not just use AWS Cognito, Firebase Auth, or Azure AD instead?&lt;/p&gt;

&lt;p&gt;Here’s what makes Keycloak special:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open Source&lt;/li&gt;
&lt;li&gt;Self-hosted&lt;/li&gt;
&lt;li&gt;Easy Integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Keycloak in a Nutshell&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Realm – Your own isolated space managing users, roles, and clients. (You can have multiple realms like dev, test, prod.)&lt;/p&gt;

&lt;p&gt;User – Represents a person or service that can log in. Can be created manually, registered, or linked via external IdPs.&lt;/p&gt;

&lt;p&gt;Client – Any app using Keycloak for login (e.g., frontend, backend). Defines redirect URIs, access type, and permissions.&lt;/p&gt;

&lt;p&gt;Identity Provider (IdP) – External service verifying user identity (e.g., Google, GitHub, Azure AD, AWS Cognito, GCP). Keycloak connects them all in one place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hands-On: Run Keycloak Using Docker&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Step 1: Pull the Keycloak Image&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker pull quay.io/keycloak/keycloak:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 2: Run Keycloak in Development Mode&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -d \
  --name keycloak \
  -p 8080:8080 \
  -e KC_BOOTSTRAP_ADMIN_USERNAME=admin \
  -e KC_BOOTSTRAP_ADMIN_PASSWORD=admin \
  quay.io/keycloak/keycloak:latest start-dev

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 3: Log in to the Admin Console&lt;/p&gt;

&lt;p&gt;Go to:&lt;br&gt;
&lt;code&gt;http://localhost:8080&lt;/code&gt;&lt;br&gt;
Login using:&lt;br&gt;
Username: admin&lt;br&gt;
Password: admin&lt;/p&gt;

&lt;p&gt;You’ll see the Keycloak dashboard with options to manage realms, users, and clients.&lt;/p&gt;

&lt;p&gt;Step 4: Create a Realm&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Click on the top-left dropdown → Create Realm&lt;br&gt;
Name it (e.g., myapp-realm)&lt;br&gt;
Save&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Step 5: Add a Client&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Go to Clients → Create Client&lt;br&gt;
Name: react-app&lt;br&gt;
Root URL: &lt;a href="http://localhost:3000" rel="noopener noreferrer"&gt;http://localhost:3000&lt;/a&gt; (your app’s URL)&lt;br&gt;
Save and configure redirect URIs&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Step 6: Add a User&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Go to Users → Add User&lt;br&gt;
Set username (e.g., john)&lt;br&gt;
Go to Credentials tab → Set password&lt;br&gt;
Turn the Temporary toggle OFF&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;My Experience Working with Keycloak&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Recently, I came across Keycloak while exploring secure authentication. I started experimenting with it, and soon I was able to integrate the latest Keycloak Quarkus version (previously, it was based on WildFly). The new Quarkus-based version felt significantly lighter, started faster, and was easier to configure, which made the entire setup experience smoother.&lt;/p&gt;

&lt;p&gt;However, it wasn’t without challenges. One of the main issues I faced was with webhook-like event integrations, which weren’t available directly through the UI. I had to configure them manually using Keycloak’s event listener mechanism. Since Keycloak is open-source and fully extensible, I could add custom logic and workarounds, but it took some digging through the documentation to get it right.&lt;/p&gt;

&lt;p&gt;Another challenge was handling redirect URIs and token configurations for clients. A small mismatch in redirect URLs or access type (public vs. confidential) can cause authentication loops or token errors. Understanding how Keycloak issues tokens and how the client consumes them took some trial and error, but once it clicked, the flow made perfect sense.&lt;/p&gt;

&lt;p&gt;Despite these hurdles, the experience was amazing. Once the integration was complete, authentication and user management became seamless. It felt rewarding to see how flexible and powerful Keycloak can be when you really understand its structure and flow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq7hli7hj7s2tydm38onh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq7hli7hj7s2tydm38onh.png" alt=" " width="535" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you finally get Keycloak working after the setup struggle 😎&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Authentication is a complex but critical part of every application.&lt;br&gt;
Instead of building your own login system and handling tokens manually, Keycloak provides a ready-to-use, secure, and flexible identity management solution.&lt;/p&gt;

&lt;p&gt;Whether you’re securing a single web app or managing microservices in the cloud, Keycloak simplifies identity so you can focus on building your core product.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Start your Keycloak journey today because secure login doesn’t have to be hard.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>keycloak</category>
      <category>webdev</category>
      <category>programming</category>
      <category>security</category>
    </item>
    <item>
      <title>Redis Explained: The Secret Ingredient Behind Fast Apps &amp; Smooth DevOps</title>
      <dc:creator>Threshika Vijayakumar</dc:creator>
      <pubDate>Wed, 15 Oct 2025 18:18:49 +0000</pubDate>
      <link>https://dev.to/threshika_vs/redis-explained-the-secret-ingredient-behind-fast-apps-smooth-devops-1p9n</link>
      <guid>https://dev.to/threshika_vs/redis-explained-the-secret-ingredient-behind-fast-apps-smooth-devops-1p9n</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Have you ever wondered how apps like Instagram, Netflix, or GitHub handle millions of users without breaking a sweat?&lt;br&gt;
How do they make things load instantly, even when thousands of people are online at the same time?&lt;/p&gt;

&lt;p&gt;The secret sauce is often something hidden behind the scenes, a little hero called &lt;em&gt;&lt;strong&gt;Redis&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;What Exactly is Redis?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Redis stands for &lt;em&gt;Remote Dictionary Server&lt;/em&gt;, but don’t let the name scare you.&lt;br&gt;
Think of Redis as a super-speedy notepad that your app can use to remember things temporarily (or even permanently, if you want).&lt;/p&gt;

&lt;p&gt;It’s an open-source, &lt;strong&gt;in-memory data store&lt;/strong&gt;, which means it keeps your data in RAM instead of on disk. And since RAM is far faster than disk, Redis can typically respond in under a millisecond, which is why it’s so popular.&lt;/p&gt;

&lt;p&gt;It can act as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A database (for storing data)&lt;/li&gt;
&lt;li&gt;A cache (for speeding up responses)&lt;/li&gt;
&lt;li&gt;A message broker (for managing queues &amp;amp; communication)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How Does Redis Work?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Redis stores data as key-value pairs, much like a &lt;strong&gt;dictionary in Python&lt;/strong&gt;, a &lt;strong&gt;Map in JavaScript&lt;/strong&gt;, or a &lt;strong&gt;HashMap in Java&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Key: "user:101"
Value: "{name: 'Threshika', age: 22, country: 'India'}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can set, get, update, or delete values using simple commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET user:101 "Threshika"
GET user:101
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Setting Up Redis Using Docker&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now that you know what Redis is, let’s actually run it!&lt;br&gt;&lt;br&gt;
The easiest way to get started is by using &lt;strong&gt;Docker&lt;/strong&gt;: no installation headaches, just pull and run.&lt;/p&gt;

&lt;p&gt;If you have Docker installed, open your terminal and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker pull redis/redis-stack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pulls the Redis Stack image, which bundles Redis with RedisInsight (a UI tool) and modules for JSON, search, time series, and probabilistic data structures.&lt;/p&gt;

&lt;p&gt;Once pulled, start the container with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Port 6379 → Redis server&lt;/li&gt;
&lt;li&gt;Port 8001 → RedisInsight dashboard (&lt;a href="http://localhost:8001" rel="noopener noreferrer"&gt;http://localhost:8001&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis is now up and running inside Docker!&lt;br&gt;
You can connect to it with any Redis client, or open the CLI that ships inside the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker exec -it redis-stack redis-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Redis in Development: The Developer’s Superpower&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Redis isn’t just about speed; it helps developers build smarter and more efficient apps.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Caching for Instant Responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your backend can cache database queries or API results in Redis, so users don’t have to wait every time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;br&gt;
If a user opens your profile page 10 times, your app can query the database on the first visit only and serve the other nine from Redis.&lt;/p&gt;
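
&lt;p&gt;Here is a minimal sketch of that caching pattern (often called cache-aside) in Python. To keep it runnable without a server, FakeRedis is a hypothetical dict-backed stand-in that only mimics the get and set calls of a Redis client. The load_profile_from_db function and the profile:101 key format are also assumptions for illustration.&lt;/p&gt;

```python
import json


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None on a cache miss

    def set(self, key, value):
        self._data[key] = value


db_calls = 0


def load_profile_from_db(user_id):
    """Pretend database query; counts how often it is hit."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": "Threshika"}


def get_profile(cache, user_id):
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: skip the database
    profile = load_profile_from_db(user_id)  # cache miss: query once
    cache.set(key, json.dumps(profile))
    return profile


cache = FakeRedis()
for _ in range(10):
    profile = get_profile(cache, 101)

print("database calls:", db_calls)
```

&lt;p&gt;With a real Redis client you would normally also set an expiry on the cached key so stale profiles eventually refresh.&lt;/p&gt;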

&lt;ul&gt;
&lt;li&gt;Session Storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Web apps use Redis to store user sessions, those small pieces of data that remember you’re logged in.&lt;br&gt;
If a website keeps you logged in after you close your browser, the session ID lives in a cookie, and there’s a good chance the session data it points to is stored in Redis.&lt;/p&gt;
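
&lt;p&gt;Here is a rough sketch of session storage with an expiry time, in Python. FakeRedis is again a hypothetical in-memory stand-in, this time mimicking only the setex and get calls of a Redis client. The 30-minute lifetime, the session:ID key format, and the fake clock are assumptions chosen to keep the example deterministic.&lt;/p&gt;

```python
import secrets


class FakeRedis:
    """Hypothetical stand-in mimicking setex/get with key expiry."""

    def __init__(self, clock):
        self._clock = clock
        self._data = {}  # key: (value, expires_at)

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired keys disappear
            return None
        return value


now = [0.0]  # fake clock so the example does not need to sleep
store = FakeRedis(clock=lambda: now[0])

# On login: create a random session ID and store who it belongs to.
session_id = secrets.token_hex(16)
store.setex(f"session:{session_id}", 1800, "user:101")

logged_in_user = store.get(f"session:{session_id}")  # within 30 minutes
now[0] += 1801
expired_user = store.get(f"session:{session_id}")    # after expiry

print(logged_in_user, expired_user)
```

&lt;p&gt;The browser only holds the session ID; everything else stays on the server and vanishes once the key expires.&lt;/p&gt;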

&lt;ul&gt;
&lt;li&gt;Real-Time Applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis has a Pub/Sub (Publish/Subscribe) feature that makes it ideal for chat apps, live notifications, or multiplayer games, allowing updates to be sent to users instantly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Queues and Background Jobs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis Lists and Streams are great for managing background tasks like sending emails or processing payments asynchronously.&lt;/p&gt;
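
&lt;p&gt;Here is a small sketch of a List used as a job queue, in Python. FakeRedis is a hypothetical stand-in that mimics only the lpush and rpop calls of a Redis client. The jobs:email key and the example addresses are made up for illustration.&lt;/p&gt;

```python
import json
from collections import deque


class FakeRedis:
    """Hypothetical stand-in mimicking lpush/rpop on a Redis List."""

    def __init__(self):
        self._lists = {}

    def lpush(self, key, value):
        self._lists.setdefault(key, deque()).appendleft(value)

    def rpop(self, key):
        items = self._lists.get(key)
        return items.pop() if items else None  # None when the list is empty


queue = FakeRedis()

# Producer: the web request only enqueues the job and returns quickly.
for address in ["a@example.com", "b@example.com"]:
    queue.lpush("jobs:email", json.dumps({"to": address}))

# Worker: pops jobs from the other end, so the oldest job runs first.
sent = []
while True:
    raw = queue.rpop("jobs:email")
    if raw is None:
        break
    sent.append(json.loads(raw)["to"])

print(sent)
```

&lt;p&gt;Pushing on the left and popping on the right gives first-in, first-out order. A real worker would typically use a blocking pop so it waits for new jobs instead of exiting when the list is empty.&lt;/p&gt;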

&lt;p&gt;&lt;strong&gt;Redis in DevOps: The Backbone of Speed and Reliability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Developers use Redis in their code, but DevOps engineers rely on it to make entire systems faster and more reliable.&lt;/p&gt;

&lt;p&gt;Here’s how:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shared Caching Layer Across Microservices&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In large systems, Redis acts as a central cache, helping multiple microservices share data efficiently.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Message Broker for Smooth Communication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis Streams let services communicate with each other: one service sends a message, and another receives and processes it. This is how scalable systems hand off background work without blocking the sender.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD Optimization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In DevOps pipelines, Redis can hold temporary build data, job states, and cached dependencies, which helps shorten build and deployment times.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scalable Infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis runs well inside Docker, Kubernetes, or cloud platforms like AWS, Azure, and Google Cloud. With replication plus Redis Sentinel or Redis Cluster, it can be set up for high availability, so that losing one node doesn’t take the whole service down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrapping Up: Why Redis is Worth Learning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3rugovplzju0idtpdy9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3rugovplzju0idtpdy9.jpg" alt=" " width="512" height="384"&gt;&lt;/a&gt;&lt;br&gt;
Redis is more than just a tool; it’s a mindset of speed, simplicity, and scalability.&lt;br&gt;
Whether you’re building a small side project or managing a large-scale distributed system, Redis fits right in.&lt;/p&gt;

&lt;p&gt;Once you start using Redis, you’ll realize how much time and effort it saves and how much faster your apps feel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final Thought&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Redis isn’t just for developers or DevOps engineers; it’s for anyone who loves building things that feel instant.&lt;br&gt;
Learn it once, and you’ll find yourself using it everywhere.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>devops</category>
      <category>database</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
