<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: A Aesthetic</title>
    <description>The latest articles on DEV Community by A Aesthetic (@a_aesthetic_dbd654c063b47).</description>
    <link>https://dev.to/a_aesthetic_dbd654c063b47</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3951475%2F9704b0ac-1626-484a-9d3c-fd5cb4939ec0.png</url>
      <title>DEV Community: A Aesthetic</title>
      <link>https://dev.to/a_aesthetic_dbd654c063b47</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/a_aesthetic_dbd654c063b47"/>
    <language>en</language>
    <item>
      <title>I built a local-first movie recommender with Corrective-RAG (cited explanations, hybrid retrieval, runs entirely on Ollama)</title>
      <dc:creator>A Aesthetic</dc:creator>
      <pubDate>Mon, 25 May 2026 22:50:24 +0000</pubDate>
      <link>https://dev.to/a_aesthetic_dbd654c063b47/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations-hybrid-retrieval-1iog</link>
      <guid>https://dev.to/a_aesthetic_dbd654c063b47/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations-hybrid-retrieval-1iog</guid>
      <description>&lt;p&gt;Hey, sharing a project I've been building for the last
few months. It's a movie recommendation system that runs entirely on
your laptop using Ollama, with a Corrective-RAG pipeline.&lt;/p&gt;

&lt;p&gt;Why I built it: existing streaming platforms only know what you
watched on them. Netflix can't see my Prime history, and none of them know
what I watched at the cinema. I wanted one system that learns from all of it.&lt;/p&gt;

&lt;p&gt;Stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;7-stage Corrective-RAG (LangGraph static graph, not autonomous agents)&lt;/li&gt;
&lt;li&gt;Hybrid retrieval: Chroma dense vectors + rank-bm25 sparse, fused via RRF&lt;/li&gt;
&lt;li&gt;BGE-small-en-v1.5 embeddings + BGE-reranker-base cross-encoder&lt;/li&gt;
&lt;li&gt;Grader-based correction loop with retry budget&lt;/li&gt;
&lt;li&gt;Cited explanations: every bullet must reference a real source field,
and bullets that fail validation are dropped (no hallucinated plot summaries)&lt;/li&gt;
&lt;li&gt;Ollama llama3 default, OpenAI/Anthropic pluggable per role&lt;/li&gt;
&lt;/ul&gt;
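
&lt;p&gt;For anyone who hasn't used RRF (Reciprocal Rank Fusion), here is a minimal sketch of the fusion step. It is not the code from the repo: the function name, the constant k=60 (a commonly used default) and the example ids are all assumptions for illustration.&lt;/p&gt;

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k=60 is an assumed default, not taken from the repo.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the dense (Chroma) and sparse (BM25) retrievers.
dense_hits = ["heat", "ronin", "collateral"]
sparse_hits = ["ronin", "thief", "heat"]
fused = rrf_fuse([dense_hits, sparse_hits])
```

&lt;p&gt;RRF only looks at ranks, so the dense similarity scores and the BM25 scores never need to be put on a common scale before fusing.&lt;/p&gt;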

&lt;p&gt;The interesting design choice was doing query expansion at INGEST time instead
of query time. The enrichment LLM generates 3-5 pseudo-queries per movie
and embeds them alongside the plot. Catalogues are bounded and user queries
aren't, so paying the LLM cost once per movie scales better than paying it once
per query.&lt;/p&gt;
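
&lt;p&gt;A rough sketch of what ingest-time expansion can look like. Everything named here is hypothetical: fake_enrichment_llm is a template-based stand-in for the real enrichment model, and storing each pseudo-query as its own record that points back to the movie is one assumed way to read "alongside the plot".&lt;/p&gt;

```python
def fake_enrichment_llm(title, n=3):
    """Stand-in for the enrichment LLM (hypothetical): returns n pseudo-queries."""
    templates = [
        "movies like {title}",
        "what to watch after {title}",
        "films with a similar mood to {title}",
    ]
    return [t.format(title=title) for t in templates[:n]]


def build_ingest_records(movies, llm=fake_enrichment_llm):
    """Expand each movie once at ingest: one record for the plot, one per pseudo-query."""
    records = []
    for movie in movies:
        texts = [movie["plot"]] + llm(movie["title"])
        for i, text in enumerate(texts):
            records.append({
                "id": "{}-{}".format(movie["id"], i),
                "movie_id": movie["id"],  # every record maps back to its movie
                "kind": "plot" if i == 0 else "pseudo_query",
                "text": text,  # this is the string that would be embedded
            })
    return records


# Hypothetical one-movie catalogue.
catalogue = [
    {"id": "m1", "title": "Heat", "plot": "A career thief and a detective collide."},
]
records = build_ingest_records(catalogue)
```

&lt;p&gt;The LLM is called once per movie here, so the number of calls is fixed by the catalogue size no matter how many user queries arrive later.&lt;/p&gt;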

&lt;p&gt;Latency on an M3 with 36GB, running Ollama llama3: ~90s/query (filter_extract and
explain dominate). llama3.2:1b drops that to ~15-20s. Hosted models take ~5-10s.&lt;/p&gt;

&lt;p&gt;Code + setup: github.com/meetgrewal7793-creator/personal-movie-recommender&lt;/p&gt;

&lt;p&gt;The 7-stage architecture diagram is in the README. Feedback welcome,
especially on the grader prompt calibration, which I had to relax for
local-LLM defaults because llama3 graders over-flag results as weak.&lt;/p&gt;
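
&lt;p&gt;For context on why calibration matters, here is a minimal sketch of a grader-driven correction loop with a retry budget. It is an illustration, not the repo's implementation: the function names, the "ok" label, the query-rewrite step and max_retries=2 are assumptions, and the stubs stand in for the real retriever and LLM grader.&lt;/p&gt;

```python
def corrective_retrieve(query, retrieve, grade, rewrite, max_retries=2):
    """Retrieve, grade, and retry with a rewritten query until the grade passes
    or the retry budget is spent. Returns (results, retries_used)."""
    attempts = 0
    results = retrieve(query)
    while grade(query, results) == "weak" and attempts != max_retries:
        attempts += 1
        query = rewrite(query)
        results = retrieve(query)
    return results, attempts


# Hypothetical stubs so the loop can run without a model.
def retrieve(q):
    return ["heat"] if "heist" in q else []


def grade(q, results):
    return "ok" if results else "weak"


def rewrite(q):
    return q + " heist"


results, attempts = corrective_retrieve("tense crime film", retrieve, grade, rewrite)
```

&lt;p&gt;A grader that flags too many results as weak burns the whole retry budget on every query, and with a local model each extra pass adds to the per-query latency.&lt;/p&gt;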

</description>
      <category>ai</category>
      <category>llm</category>
      <category>rag</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
