<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harsh Pundhir</title>
    <description>The latest articles on DEV Community by Harsh Pundhir (@harsh_pundhir_61bb1fe5fbd).</description>
    <link>https://dev.to/harsh_pundhir_61bb1fe5fbd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3369485%2F17e7175d-491e-4de9-8fd3-853e8a7b1e12.jpg</url>
      <title>DEV Community: Harsh Pundhir</title>
      <link>https://dev.to/harsh_pundhir_61bb1fe5fbd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/harsh_pundhir_61bb1fe5fbd"/>
    <language>en</language>
    <item>
      <title>Still RAG‑ing in 2025? Use this periodic table instead</title>
      <dc:creator>Harsh Pundhir</dc:creator>
      <pubDate>Sun, 20 Jul 2025 21:55:12 +0000</pubDate>
      <link>https://dev.to/harsh_pundhir_61bb1fe5fbd/still-rag-ing-in-2025-use-this-periodic-table-instead-2h9f</link>
      <guid>https://dev.to/harsh_pundhir_61bb1fe5fbd/still-rag-ing-in-2025-use-this-periodic-table-instead-2h9f</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwsicpgn0ergrl2ldk58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwsicpgn0ergrl2ldk58.png" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Building a Retrieval‑Augmented‑Generation stack still feels like LEGO on hard‑mode—too many bricks, not enough labels.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
So I stole a page from Feynman: break the system into “atoms,” list one killer &lt;strong&gt;advantage&lt;/strong&gt; and one honest &lt;strong&gt;gotcha&lt;/strong&gt; per tool, and let you mix and match. Grab one block from each column and you’ve got production RAG.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧱 Data Sources &amp;amp; Content Prep
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://commoncrawl.org" rel="noopener noreferrer"&gt;Common Crawl&lt;/a&gt;&lt;/strong&gt; – nonprofit snapshot of the open web; &lt;strong&gt;+ free 250 B‑page corpus, – raw HTML is very noisy&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/apache/tika" rel="noopener noreferrer"&gt;Apache Tika&lt;/a&gt;&lt;/strong&gt; – Swiss‑army text extractor; &lt;strong&gt;+ handles 1 000+ file types, – Java dependencies add weight&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/Unstructured-IO/unstructured" rel="noopener noreferrer"&gt;Unstructured&lt;/a&gt;&lt;/strong&gt; – Python ETL that chunks complex docs; &lt;strong&gt;+ turns PDFs into LLM‑friendly bits, – API shifts fast&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🧭 Embeddings
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/UKPLab/sentence-transformers" rel="noopener noreferrer"&gt;Sentence‑Transformers&lt;/a&gt;&lt;/strong&gt; – SBERT family of open‑source encoders; &lt;strong&gt;+ can fine‑tune locally, – slower than one‑shot API vectors&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://platform.openai.com/docs/models/text-embedding-ada-002" rel="noopener noreferrer"&gt;OpenAI Embeddings&lt;/a&gt;&lt;/strong&gt; – pay‑per‑call vectors; &lt;strong&gt;+ one‑line rollout, – vendor lock‑in &amp;amp; cost&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.cohere.com/reference/embed" rel="noopener noreferrer"&gt;Cohere Embed&lt;/a&gt;&lt;/strong&gt; – multilingual / multimodal embeddings; &lt;strong&gt;+ strong non‑English support, – closed‑source API only&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🗄️ Vector Stores
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.pinecone.io" rel="noopener noreferrer"&gt;Pinecone&lt;/a&gt;&lt;/strong&gt; – managed vector database; &lt;strong&gt;+ zero ops, – gets pricey at scale&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/facebookresearch/faiss" rel="noopener noreferrer"&gt;Faiss&lt;/a&gt;&lt;/strong&gt; – C++/GPU brute‑force search; &lt;strong&gt;+ blazing local speed, – DIY sharding &amp;amp; infra&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/qdrant/qdrant" rel="noopener noreferrer"&gt;Qdrant&lt;/a&gt;&lt;/strong&gt; – Rust‑based engine with filters; &lt;strong&gt;+ fast &amp;amp; simple, – smaller community&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/weaviate/weaviate" rel="noopener noreferrer"&gt;Weaviate&lt;/a&gt;&lt;/strong&gt; – GraphQL‑native hybrid search; &lt;strong&gt;+ combines keyword+vector, – memory‑hungry JVM&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/milvus-io/milvus" rel="noopener noreferrer"&gt;Milvus&lt;/a&gt;&lt;/strong&gt; – cloud‑native cluster‑scale store; &lt;strong&gt;+ billion‑scale horizontal scaling, – helm‑chart complexity&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🧠 Large Language Models
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://openai.com/index/hello-gpt-4o/" rel="noopener noreferrer"&gt;GPT‑4o&lt;/a&gt;&lt;/strong&gt; – OpenAI’s flagship multimodal LLM; &lt;strong&gt;+ tops most benchmarks, – highest token cost&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/meta-llama/llama3" rel="noopener noreferrer"&gt;Llama 3&lt;/a&gt;&lt;/strong&gt; – Meta’s permissively‑licensed weights; &lt;strong&gt;+ self‑host friendly, – trails GPT‑4o on code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/mistralai/mistral-inference" rel="noopener noreferrer"&gt;Mixtral 8×7B&lt;/a&gt;&lt;/strong&gt; – sparse‑mixture MoE rocket; &lt;strong&gt;+ GPT‑3.5 quality at 6× speed, – big MoE weight files&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/deepseek-ai/DeepSeek-LLM" rel="noopener noreferrer"&gt;DeepSeek‑LLM&lt;/a&gt;&lt;/strong&gt; – bilingual 2 T‑token model; &lt;strong&gt;+ English‑Chinese fluency, – EU data‑privacy questions&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.x.ai/docs/models" rel="noopener noreferrer"&gt;Grok 4&lt;/a&gt;&lt;/strong&gt; – xAI’s real‑time reasoning model; &lt;strong&gt;+ built‑in web search, – premium subscription wall&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-3-family" rel="noopener noreferrer"&gt;Claude 3&lt;/a&gt;&lt;/strong&gt; – Anthropic’s alignment‑first family; &lt;strong&gt;+ 200 K context window, – rate‑limited free tier&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🔗 RAG Orchestration
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/langchain-ai/langchain" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt;&lt;/strong&gt; – LEGO‑style building blocks for chains &amp;amp; agents; &lt;strong&gt;+ huge ecosystem, – can feel heavyweight&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/run-llama/llama_index" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;&lt;/strong&gt; – data‑centric RAG graphs; &lt;strong&gt;+ 100+ data loaders, – API churn&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/deepset-ai/haystack" rel="noopener noreferrer"&gt;Haystack&lt;/a&gt;&lt;/strong&gt; – node‑based pipelines with built‑in eval; &lt;strong&gt;+ first‑class RAG evaluation, – verbose YAML configs&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🏷️ Re‑ranking &amp;amp; Evaluation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/FlagOpen/FlagEmbedding" rel="noopener noreferrer"&gt;BGE‑Reranker&lt;/a&gt;&lt;/strong&gt; – tiny cross‑encoder for relevance boost; &lt;strong&gt;+ SOTA recall, – adds latency&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.cohere.com/reference/rerank" rel="noopener noreferrer"&gt;Cohere Rerank&lt;/a&gt;&lt;/strong&gt; – drop‑in API reordering; &lt;strong&gt;+ one‑line integration, – paid quota&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/stanford-futuredata/ColBERT" rel="noopener noreferrer"&gt;ColBERT&lt;/a&gt;&lt;/strong&gt; – late‑interaction bi‑encoder; &lt;strong&gt;+ scales to 100 M docs, – GPU hungry&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/PrithivirajDamodaran/FlashRank" rel="noopener noreferrer"&gt;FlashRank&lt;/a&gt;&lt;/strong&gt; – super‑lite LLM ranker; &lt;strong&gt;+ CPU friendly, – very new project&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/explodinggradients/ragas" rel="noopener noreferrer"&gt;RAGAS&lt;/a&gt;&lt;/strong&gt; – automated RAG metrics dashboards; &lt;strong&gt;+ generates test sets, – metrics still evolving&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚢 Deploy &amp;amp; Safety
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/kubernetes/kubernetes" rel="noopener noreferrer"&gt;Kubernetes&lt;/a&gt;&lt;/strong&gt; – the de‑facto container OS; &lt;strong&gt;+ autoscaling pods, – steep ops learning curve&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/openfaas/faas" rel="noopener noreferrer"&gt;OpenFaaS&lt;/a&gt;&lt;/strong&gt; – serverless on your own K8s; &lt;strong&gt;+ function‑first DX, – cold‑start lag&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/guardrails-ai/guardrails" rel="noopener noreferrer"&gt;Guardrails AI&lt;/a&gt;&lt;/strong&gt; – schema‑based output validators; &lt;strong&gt;+ easy policy checks, – can over‑filter creativity&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/NVIDIA/NeMo-Guardrails" rel="noopener noreferrer"&gt;NeMo Guardrails&lt;/a&gt;&lt;/strong&gt; – Nvidia hallucination fence; &lt;strong&gt;+ programmable dialogue rules, – CUDA bias&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🎛️ UI / UX
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/streamlit/streamlit" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;&lt;/strong&gt; – data apps in five lines of Python; &lt;strong&gt;+ instant dashboards, – limited custom CSS&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/gradio-app/gradio" rel="noopener noreferrer"&gt;Gradio&lt;/a&gt;&lt;/strong&gt; – shareable ML demos; &lt;strong&gt;+ public links in seconds, – hefty JS bundle&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/reflex-dev/reflex" rel="noopener noreferrer"&gt;Reflex&lt;/a&gt;&lt;/strong&gt; – full‑stack web apps in pure Python; &lt;strong&gt;+ no JavaScript required, – pre‑1.0 API churn&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/PostHog/posthog" rel="noopener noreferrer"&gt;PostHog&lt;/a&gt;&lt;/strong&gt; – open‑source product analytics; &lt;strong&gt;+ autocapture events, – self‑hosting eats RAM&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  👩‍🍳 10‑Line Recipe
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Common Crawl ➜ Apache Tika ➜ Sentence‑Transformers ➜ Qdrant ➜ BGE‑Reranker ➜ Llama 3 ➜ LangChain ➜ Guardrails AI ➜ Streamlit ➜ Kubernetes&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
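
&lt;p&gt;To make the mix‑and‑match idea concrete, here is a minimal, standard‑library‑only Python sketch of the retrieve, rerank, generate core of such a chain. It is an illustration under loud assumptions, not the real stack: &lt;code&gt;toy_embed&lt;/code&gt;, &lt;code&gt;ToyStore&lt;/code&gt;, &lt;code&gt;toy_rerank&lt;/code&gt;, &lt;code&gt;toy_generate&lt;/code&gt; and &lt;code&gt;RagPipeline&lt;/code&gt; are hypothetical stand‑ins for an embedding model, a vector store, a reranker, an LLM and an orchestrator, and none of them is an API of the tools listed above.&lt;/p&gt;

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyStore:
    """Hypothetical stand-in for a vector store: (vector, text) pairs in a list."""

    def __init__(self, embed: Callable):
        self.embed = embed
        self.rows = []

    def add(self, docs):
        for doc in docs:
            self.rows.append((self.embed(doc), doc))

    def search(self, query: str, k: int):
        q = self.embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def toy_rerank(query: str, docs):
    """Hypothetical stand-in for a reranker: sort by number of shared words."""
    q_terms = set(query.lower().split())
    return sorted(
        docs,
        key=lambda d: len(q_terms.intersection(d.lower().split())),
        reverse=True,
    )


def toy_generate(query: str, context) -> str:
    """Hypothetical stand-in for an LLM call: returns the prompt it would send."""
    return "Context:\n" + "\n".join(context) + "\nQuestion: " + query


@dataclass
class RagPipeline:
    """Orchestrator: every field is one swappable block from the table."""
    store: ToyStore
    rerank: Callable
    generate: Callable

    def answer(self, query: str, k: int = 3, keep: int = 1) -> str:
        candidates = self.store.search(query, k)       # retrieve
        best = self.rerank(query, candidates)[:keep]   # rerank
        return self.generate(query, best)              # generate


store = ToyStore(toy_embed)
store.add([
    "Qdrant is a vector database written in Rust",
    "Streamlit builds data apps in Python",
    "Kubernetes schedules containers across a cluster",
])
pipeline = RagPipeline(store=store, rerank=toy_rerank, generate=toy_generate)
print(pipeline.answer("which vector database is written in Rust"))
```

&lt;p&gt;Swapping a block then means passing a different callable or store object with the same shape, which is the property a real orchestrator gives you with real components.&lt;/p&gt;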

&lt;p&gt;Swap any piece in its column to tweak cost, latency, or licensing — the chemistry still works. Happy RAG‑building!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>rag</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
