<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: JustATalentedGuy</title>
    <description>The latest articles on DEV Community by JustATalentedGuy (@justatalentedguy).</description>
    <link>https://dev.to/justatalentedguy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1212790%2F109d6513-fdf4-4015-8b7f-e26edf195ae8.jpg</url>
      <title>DEV Community: JustATalentedGuy</title>
      <link>https://dev.to/justatalentedguy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/justatalentedguy"/>
    <language>en</language>
    <item>
      <title>RAG Is Easy. Useful RAG Is the Hard Part</title>
      <dc:creator>JustATalentedGuy</dc:creator>
      <pubDate>Sat, 04 Jul 2026 13:55:57 +0000</pubDate>
      <link>https://dev.to/justatalentedguy/rag-is-easy-useful-rag-is-the-hard-part-32fd</link>
      <guid>https://dev.to/justatalentedguy/rag-is-easy-useful-rag-is-the-hard-part-32fd</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Everybody says “just add RAG” like it is a button in settings.&lt;br&gt;&lt;br&gt;
It is not. I checked. Very disappointing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Brief: Personalized News Feeds
&lt;/h2&gt;

&lt;p&gt;Pulse started as a personal AI intelligence feed.&lt;/p&gt;

&lt;p&gt;Not a chatbot with a search bar glued to it. Not another app where an LLM confidently explains an article it has never seen. I wanted something more useful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;collect AI engineering content from RSS, GitHub, arXiv, and Gmail newsletters&lt;/li&gt;
&lt;li&gt;summarize and classify articles&lt;/li&gt;
&lt;li&gt;store embeddings&lt;/li&gt;
&lt;li&gt;support exact, semantic, and hybrid search&lt;/li&gt;
&lt;li&gt;answer questions from my own corpus&lt;/li&gt;
&lt;li&gt;cite the articles it used&lt;/li&gt;
&lt;li&gt;say “I do not know” when the corpus has no answer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvtrag2iyt602o29c3huu.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fvtrag2iyt602o29c3huu.jpeg" alt="Home Page" width="610" height="1229"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That last part is important.&lt;/p&gt;

&lt;p&gt;A RAG system that cannot say “I do not know” is not intelligent. It is just overconfident autocomplete in formal clothes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fu9t4fu2clvvjfti206py.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fu9t4fu2clvvjfti206py.jpeg" alt="Retrieving suitable article for the question" width="790" height="1600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The simple version looked like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fcu0qnq1rxi11qe2y9y2v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fcu0qnq1rxi11qe2y9y2v.png" alt="Simple flow" width="800" height="152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Very clean. Very incomplete.&lt;/p&gt;

&lt;p&gt;The useful version needed much more.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Actual System Architecture
&lt;/h2&gt;

&lt;p&gt;Pulse uses a FastAPI backend, PostgreSQL with pgvector, Groq for generation, and an Expo Android app.&lt;/p&gt;

&lt;p&gt;At a high level:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fyknj2lda2w9dv2fxjn3s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fyknj2lda2w9dv2fxjn3s.png" alt="Flowchart" width="800" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For retrieval, the important database columns are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;mapped_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;enrichment_status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;hidden&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mapped&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The vector column uses pgvector, which supports vector similarity search inside Postgres, including cosine distance and approximate nearest-neighbor indexes (see the &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;pgvector README&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;PostgreSQL also gives full-text search, documented in the &lt;a href="https://www.postgresql.org/docs/current/textsearch.html" rel="noopener noreferrer"&gt;PostgreSQL full-text search docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So Pulse does not choose between SQL search and vector search.&lt;/p&gt;

&lt;p&gt;It uses both.&lt;/p&gt;

&lt;p&gt;Because of course one search mode was too peaceful.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why “Just Use Embeddings” Was Not Enough
&lt;/h2&gt;

&lt;p&gt;Embeddings are useful. They are not magic.&lt;/p&gt;

&lt;p&gt;If the user searches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;on-device foundation models
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;semantic search is great. It can find articles about local AI, small models, mobile inference, and related topics even if the exact words do not match.&lt;/p&gt;

&lt;p&gt;But if the user searches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Anthropic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;exact search is often better. The word itself matters. I do not need a poetic interpretation of Anthropic. I need articles that mention Anthropic.&lt;/p&gt;

&lt;p&gt;This is where pure vector search becomes annoying.&lt;/p&gt;

&lt;p&gt;Vector search is good at meaning. Full-text search is good at exact language. A useful product usually needs both.&lt;/p&gt;

&lt;p&gt;So Pulse supports three modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Exact      -&amp;gt; PostgreSQL full-text search
Semantic   -&amp;gt; pgvector cosine similarity
Hybrid     -&amp;gt; merge both result sets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Search Mode 1: Exact Search
&lt;/h2&gt;

&lt;p&gt;Exact search uses PostgreSQL full-text search.&lt;/p&gt;

&lt;p&gt;This works well for names, tools, companies, and terms that should match literally.&lt;/p&gt;

&lt;p&gt;It is also fast and boring.&lt;/p&gt;

&lt;p&gt;But boring is underrated. Many production systems are just boring things that work while exciting things are busy timing out.&lt;/p&gt;

&lt;h2&gt;
  
  
  Search Mode 2: Semantic Search
&lt;/h2&gt;

&lt;p&gt;Semantic search embeds the query and compares it with article embeddings using cosine distance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;call_embedder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cosine_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enrichment_status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;done&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hidden&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ingested_at&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
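&lt;p&gt;For intuition, cosine distance is one minus cosine similarity, so smaller values mean closer matches, which is why the query sorts by distance ascending. Here is a minimal pure-Python sketch of that arithmetic. The helper name &lt;code&gt;cosine_distance&lt;/code&gt; and the toy vectors are illustrative only; in Pulse the comparison happens inside Postgres.&lt;/p&gt;

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Toy 2-dimensional vectors; real embeddings here have 384 dimensions.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # longer, but identical direction
orthogonal = [0.0, 3.0]      # unrelated direction
```

&lt;p&gt;Vector length does not matter, only direction: &lt;code&gt;same_direction&lt;/code&gt; sits at distance 0 from the query, while &lt;code&gt;orthogonal&lt;/code&gt; sits at distance 1.&lt;/p&gt;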



&lt;h2&gt;
  
  
  Search Mode 3: Hybrid Search
&lt;/h2&gt;

&lt;p&gt;Hybrid search combines exact and semantic results using Reciprocal Rank Fusion.&lt;/p&gt;

&lt;p&gt;The idea is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;score = 1 / (k + rank)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If an article ranks well in exact search and semantic search, it rises. If it ranks well in only one, it still has a chance.&lt;/p&gt;

&lt;p&gt;We merge both result lists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;article_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;rrf_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exact_rank&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;article_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;rrf_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;semantic_rank&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This made hybrid the default.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because users do not wake up thinking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Today I shall formulate a query that is best served by cosine similarity.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They type words. The system should adapt.&lt;/p&gt;

&lt;p&gt;Hybrid search lets exact names win when they should, while semantic matches still catch broader ideas.&lt;/p&gt;
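&lt;p&gt;The merge step can be sketched end to end in a few lines. This is a simplified illustration, not Pulse's exact code: the function name &lt;code&gt;fuse&lt;/code&gt;, the sample IDs, and &lt;code&gt;k = 60&lt;/code&gt; (a commonly used default for Reciprocal Rank Fusion) are assumptions.&lt;/p&gt;

```python
from collections import defaultdict

RRF_K = 60  # commonly used default; the value Pulse uses is not stated


def rrf_score(rank: int, k: int = RRF_K) -> float:
    """Score for a 1-based rank; a better rank gives a higher score."""
    return 1.0 / (k + rank)


def fuse(exact_ids: list[str], semantic_ids: list[str], k: int = RRF_K) -> list[str]:
    """Merge two ranked ID lists into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranked in (exact_ids, semantic_ids):
        for rank, article_id in enumerate(ranked, start=1):
            scores[article_id] += rrf_score(rank, k)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda article_id: (-scores[article_id], article_id))


exact = ["a1", "a2", "a3"]     # hypothetical full-text results, best first
semantic = ["a3", "a1", "a4"]  # hypothetical vector results, best first
merged = fuse(exact, semantic)
```

&lt;p&gt;Articles that appear in both lists (&lt;code&gt;a1&lt;/code&gt;, &lt;code&gt;a3&lt;/code&gt;) end up ahead of articles that appear in only one, and only ranks are used, so the two searches never need comparable raw scores.&lt;/p&gt;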

&lt;h2&gt;
  
  
  Ask Mode: RAG With Brakes
&lt;/h2&gt;

&lt;p&gt;The Ask mode is where retrieval becomes generation.&lt;/p&gt;

&lt;p&gt;The user asks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What are the recent themes around AI coding tools?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pulse does this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjtisprotg30z4szzs2hm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjtisprotg30z4szzs2hm.png" alt="Answering Flow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, the rejection step matters.&lt;/p&gt;

&lt;p&gt;If the top retrieved articles are weak, Pulse does not call the LLM.&lt;/p&gt;

&lt;p&gt;This is not a failure.&lt;/p&gt;

&lt;p&gt;This is the product behaving responsibly.&lt;/p&gt;

&lt;p&gt;If I ask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What is the weather in Mumbai?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pulse should not produce meteorology fan fiction.&lt;/p&gt;

&lt;p&gt;It should say:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I do not have enough relevant context in the corpus.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
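&lt;p&gt;A minimal sketch of that rejection step, assuming retrieval returns a similarity per article. The cutoff &lt;code&gt;MIN_SIMILARITY&lt;/code&gt;, the &lt;code&gt;Hit&lt;/code&gt; type, and the &lt;code&gt;call_llm&lt;/code&gt; callback are hypothetical names for illustration; the real threshold would need tuning against actual queries.&lt;/p&gt;

```python
from dataclasses import dataclass

REFUSAL = "I do not have enough relevant context in the corpus."
MIN_SIMILARITY = 0.35  # hypothetical cutoff; tune against real queries
MIN_HITS = 1           # hypothetical: how many articles must clear the cutoff


@dataclass
class Hit:
    article_id: str
    similarity: float  # 1 - cosine distance, so higher is better


def relevant_hits(hits: list[Hit]) -> list[Hit]:
    """Keep only hits at or above the similarity cutoff."""
    return [hit for hit in hits if hit.similarity >= MIN_SIMILARITY]


def answer(question: str, hits: list[Hit], call_llm) -> str:
    """Generate only when retrieval is strong enough; otherwise refuse."""
    strong = relevant_hits(hits)
    if len(strong) >= MIN_HITS:
        return call_llm(question, strong)
    return REFUSAL  # no LLM call is made, so no quota is spent
```

&lt;p&gt;The gate runs before generation, so an off-topic question costs one retrieval query and nothing else.&lt;/p&gt;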



&lt;h2&gt;
  
  
  Prompting With Context, Not Hope
&lt;/h2&gt;

&lt;p&gt;The Ask prompt includes only controlled context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Article ID
Title
Summary
URL
Similarity score
Recent conversation messages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not raw HTML. Not full article bodies. Not the entire database. Not “please be accurate” as a magical spell.&lt;/p&gt;

&lt;p&gt;A simplified prompt shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_ask_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summary: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;URL: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Answer the user using only the context below.
If the context is not enough, say so.

Context:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Question:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The answer includes citations back to article IDs and URLs.&lt;/p&gt;

&lt;p&gt;This keeps the system grounded.&lt;/p&gt;

&lt;p&gt;Not perfectly. Nothing with an LLM is perfect. But much better than letting the model free-climb the truth.&lt;/p&gt;
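&lt;p&gt;One cheap extra check is to verify that every cited ID was actually in the prompt context. The sketch below assumes the model cites articles as &lt;code&gt;[id]&lt;/code&gt;, mirroring the bracketed IDs in the prompt; the function names and the sample reply are made up for illustration.&lt;/p&gt;

```python
import re

# Matches bracketed IDs such as [42] or [a1-b2]; the citation format is an assumption.
CITATION_PATTERN = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def cited_ids(answer: str) -> list[str]:
    """Return cited IDs in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in CITATION_PATTERN.finditer(answer):
        seen.setdefault(match.group(1), None)
    return list(seen)


def unknown_citations(answer: str, retrieved_ids: set[str]) -> list[str]:
    """IDs the model cited that were never part of the prompt context."""
    return [cid for cid in cited_ids(answer) if cid not in retrieved_ids]


reply = "Agents are getting longer context [42]. Tooling is consolidating [77] [42]."
```

&lt;p&gt;If &lt;code&gt;unknown_citations&lt;/code&gt; returns anything, the answer references an article the model was never shown, which is a good reason to drop the citation or the whole answer.&lt;/p&gt;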

&lt;h2&gt;
  
  
  Personalization: Ranking Is Also Retrieval
&lt;/h2&gt;

&lt;p&gt;Search is not the only retrieval problem.&lt;/p&gt;

&lt;p&gt;The feed itself is retrieval.&lt;/p&gt;

&lt;p&gt;Pulse learns from reading behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;short reads are weak signals&lt;/li&gt;
&lt;li&gt;longer reads are stronger signals&lt;/li&gt;
&lt;li&gt;read categories update category weights&lt;/li&gt;
&lt;li&gt;article keywords update interest terms&lt;/li&gt;
&lt;li&gt;bookmarks and hidden articles affect what should appear&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The engagement score is intentionally simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;engagement_signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;duration_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;duration_seconds&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;duration_seconds&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;duration_seconds&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No fake machine learning ceremony. No “neural preference engine” because I read one article for 14 seconds.&lt;/p&gt;

&lt;p&gt;Category weights use an exponential moving average:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;new_weight&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;old_weight&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signal&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;old_weight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
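&lt;p&gt;A short worked example of that update. The smoothing factor &lt;code&gt;ALPHA = 0.2&lt;/code&gt; and the starting weight are assumed values, since the real ones are not given here.&lt;/p&gt;

```python
ALPHA = 0.2  # hypothetical smoothing factor


def update_weight(old_weight: float, signal: float, alpha: float = ALPHA) -> float:
    """Move the weight a fraction alpha of the way toward the new signal."""
    return old_weight + alpha * (signal - old_weight)


weight = 0.5  # assumed starting weight for a category
for signal in (1.0, 1.0, 0.2):  # two long reads, then one short read
    weight = update_weight(weight, signal)
# weight moves 0.5, then 0.6, then 0.68, then 0.584
```

&lt;p&gt;Each read nudges the weight toward the latest signal without erasing history, so a single short read lowers a category's weight only slightly.&lt;/p&gt;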



&lt;p&gt;The feed score combines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;importance + category preference + recency + keyword overlap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
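&lt;p&gt;One way to combine those four signals is a weighted sum. Everything numeric below is an assumption for illustration: the weights, the 48-hour half-life for recency, and the use of Jaccard overlap for keywords are not taken from Pulse.&lt;/p&gt;

```python
# Hypothetical weights and half-life; the real values are not given in the article.
WEIGHTS = {"importance": 0.4, "category": 0.3, "recency": 0.2, "keywords": 0.1}
HALF_LIFE_HOURS = 48.0


def recency(age_hours: float) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def keyword_overlap(article_keywords: set[str], interest_terms: set[str]) -> float:
    """Jaccard overlap in [0, 1]; 0.0 when both sets are empty."""
    union = article_keywords | interest_terms
    if not union:
        return 0.0
    return len(article_keywords.intersection(interest_terms)) / len(union)


def feed_score(
    importance: float,
    category_weight: float,
    age_hours: float,
    article_keywords: set[str],
    interest_terms: set[str],
) -> float:
    """Weighted sum of the four signals, each assumed to lie in [0, 1]."""
    return (
        WEIGHTS["importance"] * importance
        + WEIGHTS["category"] * category_weight
        + WEIGHTS["recency"] * recency(age_hours)
        + WEIGHTS["keywords"] * keyword_overlap(article_keywords, interest_terms)
    )
```

&lt;p&gt;All inputs are stored values, so ranking the feed needs no model call at request time.&lt;/p&gt;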



&lt;h2&gt;
  
  
  Learning Features: RAG Was Only One Part Of The Loop
&lt;/h2&gt;

&lt;p&gt;Once articles are cleaned, summarized, embedded, and ranked, other AI features become easier.&lt;/p&gt;

&lt;p&gt;Pulse uses the same enriched corpus for:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Daily Digest
&lt;/h3&gt;

&lt;p&gt;The digest selects recent high-importance enriched articles and asks Groq for a three-paragraph briefing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhjhnqa1oe9yvnqb4578g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhjhnqa1oe9yvnqb4578g.png" alt="Daily Digest Flow" width="800" height="149"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is not just summarization. It is scheduled synthesis.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Trends
&lt;/h3&gt;

&lt;p&gt;Trend detection scans enriched entities from recent articles.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entity&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;mentions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;normalized_entity&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;trends&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;entity&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;article_ids&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;mentions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article_ids&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets the app show repeated topics like companies, models, tools, or research themes.&lt;/p&gt;
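&lt;p&gt;A self-contained version of the same idea, for anyone who wants to run it. The &lt;code&gt;normalize&lt;/code&gt; helper (trim and casefold) and the dictionary-shaped articles are assumptions for this sketch; the real normalization may be more involved.&lt;/p&gt;

```python
from collections import defaultdict


def normalize(entity: str) -> str:
    """Assumed normalization: trim and casefold so 'LangGraph ' and 'langgraph' merge."""
    return entity.strip().casefold()


def detect_trends(articles: list[dict], min_articles: int = 3) -> list[str]:
    """Return entities mentioned in at least min_articles distinct articles."""
    mentions: dict[str, set[str]] = defaultdict(set)
    for article in articles:
        for entity in article["entities"]:
            # A set of article IDs, so repeats within one article count once.
            mentions[normalize(entity)].add(article["id"])
    return sorted(
        entity for entity, article_ids in mentions.items()
        if len(article_ids) >= min_articles
    )


# Hypothetical sample data.
articles = [
    {"id": "a1", "entities": ["LangGraph", "pgvector"]},
    {"id": "a2", "entities": ["langgraph ", "Groq"]},
    {"id": "a3", "entities": ["LANGGRAPH", "pgvector"]},
]
trends = detect_trends(articles)
```

&lt;p&gt;Counting distinct articles rather than raw mentions keeps one long article that repeats a name from looking like a trend.&lt;/p&gt;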

&lt;h3&gt;
  
  
  3. LangGraph Quiz Agent
&lt;/h3&gt;

&lt;p&gt;For learning retention, Pulse generates three-question quizzes from an article summary and entities.&lt;/p&gt;

&lt;p&gt;LangGraph is useful for modeling multi-step agent flows.&lt;/p&gt;

&lt;p&gt;The quiz flow in Pulse looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftrq76tj34n6m6x3d8c2w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftrq76tj34n6m6x3d8c2w.png" alt="Quiz Flow" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Quiz sessions are stored server-side with expiry. The answer key is not trusted from the client.&lt;/p&gt;

&lt;p&gt;Because yes, even in a personal app, the client should not grade itself.&lt;/p&gt;
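&lt;p&gt;A rough sketch of what server-side grading with expiry can look like. The class name &lt;code&gt;QuizSessions&lt;/code&gt;, the in-memory dictionary, and the 15-minute TTL are all assumptions; a real deployment would more likely keep sessions in the database or a cache.&lt;/p&gt;

```python
import secrets
import time

SESSION_TTL_SECONDS = 900  # hypothetical expiry window


class QuizSessions:
    """In-memory stand-in for a server-side store; the answer key never leaves it."""

    def __init__(self, ttl: float = SESSION_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._sessions: dict[str, tuple[float, list[int]]] = {}

    def create(self, answer_key: list[int]) -> str:
        """Store the key with an expiry and return an opaque session ID."""
        session_id = secrets.token_urlsafe(16)
        self._sessions[session_id] = (self._clock() + self._ttl, list(answer_key))
        return session_id  # the client only ever sees this ID

    def grade(self, session_id: str, answers: list[int]) -> int:
        """Return the number of correct answers; each session can be graded once."""
        expires_at, answer_key = self._sessions.pop(session_id, (0.0, []))
        if not answer_key or self._clock() >= expires_at:
            raise KeyError("unknown or expired quiz session")
        return sum(1 for given, expected in zip(answers, answer_key) if given == expected)
```

&lt;p&gt;The client submits only a session ID and its chosen options, so there is nothing on the device worth tampering with.&lt;/p&gt;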

&lt;h2&gt;
  
  
  The Product Rule: Retrieval Before Generation
&lt;/h2&gt;

&lt;p&gt;The biggest design rule became:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Retrieve first. Generate second. Refuse when retrieval is weak.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That rule shows up everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search can run without Groq.&lt;/li&gt;
&lt;li&gt;Ask mode refuses unrelated questions before spending quota.&lt;/li&gt;
&lt;li&gt;Digest uses selected articles, not the entire database.&lt;/li&gt;
&lt;li&gt;Quiz generation only works on enriched articles.&lt;/li&gt;
&lt;li&gt;Feed ranking uses stored signals, not live model calls.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This made the system cheaper, faster, and less ridiculous.&lt;/p&gt;

&lt;p&gt;LLMs are powerful. They are also expensive, rate-limited, and occasionally very committed to being wrong.&lt;/p&gt;

&lt;p&gt;So Pulse uses them where they add value, and keeps boring deterministic code around them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Final Shape
&lt;/h2&gt;

&lt;p&gt;The final RAG architecture looked like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Feedgndjhmlpn90fonvju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Feedgndjhmlpn90fonvju.png" alt="Ingestion Pipeline" width="784" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhye9oamr5b2mgy7786lb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fhye9oamr5b2mgy7786lb.png" alt="Question Answering Pipeline" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is more work than:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;documents -&amp;gt; embeddings -&amp;gt; chatbot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;RAG is easy when the input data is clean, the query is friendly, and nobody asks anything weird.&lt;/p&gt;

&lt;p&gt;Useful RAG is different.&lt;/p&gt;

&lt;p&gt;Useful RAG needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;clean source data&lt;/li&gt;
&lt;li&gt;validated enrichment&lt;/li&gt;
&lt;li&gt;exact search&lt;/li&gt;
&lt;li&gt;semantic search&lt;/li&gt;
&lt;li&gt;hybrid ranking&lt;/li&gt;
&lt;li&gt;relevance thresholds&lt;/li&gt;
&lt;li&gt;citations&lt;/li&gt;
&lt;li&gt;refusal paths&lt;/li&gt;
&lt;li&gt;personalization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The hard part is not putting vectors in a database.&lt;/p&gt;

&lt;p&gt;The hard part is deciding when the vector result is not good enough.&lt;/p&gt;

&lt;p&gt;The hard part is not calling the LLM.&lt;/p&gt;

&lt;p&gt;The hard part is knowing when not to call it.&lt;/p&gt;

&lt;p&gt;That is what made Pulse useful.&lt;/p&gt;

&lt;p&gt;Not because it could answer everything.&lt;/p&gt;

&lt;p&gt;Because it knew when it could not.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>postgres</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
