Memory Palace Part 2: Agentic RAG, Chrome Extension, and Making AI Actually Understand You 🧠✨

From "dumb search" to intelligent reasoning, plus save anything with one click


Previously on Memory Palace...

A few weeks ago, I shared how I built Memory Palace, a RAG-powered knowledge management system that handles both external research (Pockets) and personal thoughts (Memories).

The feedback was amazing. But two things kept coming up:

"This is great, but sometimes the answers don't quite get what I'm asking..."

"I don't want to copy-paste URLs into a web app. Can I just... click a button?"

So I rebuilt the entire RAG pipeline. And built a Chrome Extension.


What's New in Part 2?

🧠 Agentic RAG Pipeline: AI That Actually Thinks

The biggest upgrade isn't visible: it's in how the system thinks. We went from "dumb retrieval" to a multi-step reasoning pipeline.

🔌 Chrome Extension: Save Anything With One Click

A full-featured browser extension that brings Memory Palace to every webpage.

Let's dive into both.


Part 1: The Agentic RAG Pipeline

The original RAG was simple:

  1. Embed query → Vector search → Get chunks → Send to LLM → Done

It worked. But it was... dumb. It didn't understand intent. It couldn't tell if you were asking a follow-up question. It treated "hello" the same as "compare the methodologies in my research papers."
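To make the contrast concrete, here's roughly what that single-shot flow looks like; a minimal sketch where embed, vectorSearch, and llm are assumed helpers, not the actual Memory Palace code:

// Naive single-shot RAG (illustrative; helper names are assumptions)
async function answerQuery(query: string): Promise<string> {
    const queryEmbedding = await embed(query);              // 1. embed the query
    const chunks = await vectorSearch(queryEmbedding, 10);  // 2. nearest chunks
    const context = chunks.map((c) => c.text).join("\n\n");
    // 3. stuff everything into one prompt, regardless of intent
    return llm(`Context:\n${context}\n\nQuestion: ${query}`);
}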

The New Pipeline: 7 Steps of Intelligence

┌─────────────────────────────────────────────────────────────────────────────┐
│                            AGENTIC RAG PIPELINE                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                   │
│   │  1. Query   │     │ 2. Adaptive │     │  3. Context │                   │
│   │   Router    │────▶│  Retrieval  │────▶│   Rewrite   │                   │
│   │             │     │   Params    │     │             │                   │
│   └─────────────┘     └─────────────┘     └─────────────┘                   │
│         │                                       │                           │
│         ▼                                       ▼                           │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                   │
│   │  Skip RAG?  │     │  4. Multi   │     │  5. Hybrid  │                   │
│   │  (Greeting) │     │   Query     │────▶│   Search    │                   │
│   └─────────────┘     │  Generation │     │             │                   │
│                       └─────────────┘     └─────────────┘                   │
│                                                 │                           │
│                                                 ▼                           │
│                       ┌─────────────┐     ┌─────────────┐                   │
│                       │  7. Answer  │     │  6. CRAG    │                   │
│                       │  Synthesis  │◀────│  Grading    │                   │
│                       │  + Stream   │     │             │                   │
│                       └─────────────┘     └─────────────┘                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Let me explain each step:


Step 1: Query Router - Intent Classification

Before doing anything, we ask: "What kind of question is this?"

type QueryIntent =
    | "no_retrieval" // "Hello!" - no sources needed
    | "simple_lookup" // "What is X?" - direct fact lookup
    | "comparison" // "Compare A and B" - needs multiple sources
    | "summarization" // "Summarize..." - needs aggregation
    | "analytical" // "Why does..." - deep reasoning required
    | "follow_up"; // "Tell me more" - needs conversation context

Why it matters: If someone says "Hi, how are you?", we don't need to search 1000 chunks. We can respond directly.

if (routerResult.skipRetrieval) {
    // Just respond, no RAG needed
    sendEvent({ type: "token", payload: "Hello! How can I help you today?" });
    return;
}
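How does the router decide? Classification can be a single small LLM call. Here's a minimal sketch; the prompt wording and the llm helper are assumptions, not the exact implementation:

// Illustrative intent classifier (prompt and llm() are assumptions)
async function classifyIntent(query: string): Promise<QueryIntent> {
    const prompt = `Classify the query into exactly one label:
no_retrieval | simple_lookup | comparison | summarization | analytical | follow_up
Query: "${query}"
Respond with the label only.`;
    return (await llm(prompt)).trim() as QueryIntent;
}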

Step 2: Adaptive Retrieval Parameters

Different questions need different retrieval strategies:

function getAdaptiveRetrievalParams(
    intent: QueryIntent
): AdaptiveRetrievalParams {
    switch (intent) {
        case "comparison":
            return {
                chunkCount: 20, // Need more chunks for comparison
                vectorWeight: 0.5, // Balance semantic + keyword
                ftsWeight: 0.5,
                expansionQueries: 5, // Generate more query variations
            };

        case "simple_lookup":
            return {
                chunkCount: 5, // Few chunks, high precision
                vectorWeight: 0.7, // Lean into semantic
                ftsWeight: 0.3,
                expansionQueries: 2,
            };

        case "analytical":
            return {
                chunkCount: 15, // Lots of context for analysis
                vectorWeight: 0.6,
                ftsWeight: 0.4,
                expansionQueries: 4,
            };
        // ...
    }
}

The insight: "Compare Apple and Google's AI strategy" needs way more chunks than "What is Apple's market cap?"


Step 3: Context-Aware Query Rewriting

Follow-up questions are the hardest. When you ask "What about their revenue?", what does "their" mean?

We rewrite ambiguous queries using conversation history:

const rewrittenQuery = await rewriteQueryWithContext(
    "What about their revenue?", // Original query
    [
        { role: "user", content: "Compare Apple and Google AI" },
        { role: "assistant", content: "Apple focuses on..." },
    ]
);

// Result:
// {
//   original: "What about their revenue?",
//   rewritten: "What is Apple and Google's revenue?",
//   extractedEntities: ["Apple", "Google", "revenue"],
//   needsContext: true
// }

Now the search actually finds what you meant.
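Under the hood, the rewriter can be one structured LLM call; a sketch, with the prompt and output shape assumed:

// Illustrative rewriter: resolve pronouns using conversation history
async function rewriteQueryWithContext(
    query: string,
    history: { role: string; content: string }[]
) {
    const transcript = history.map((m) => `${m.role}: ${m.content}`).join("\n");
    const prompt = `Conversation:\n${transcript}\n
Rewrite "${query}" as a standalone question, resolving pronouns.
Respond as JSON: {"original": string, "rewritten": string, "extractedEntities": string[], "needsContext": boolean}`;
    return JSON.parse(await llm(prompt)); // llm() is an assumed helper
}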


Step 4: Multi-Query Generation

One query isn't enough. We generate variations to catch different phrasings in your sources:

User asks: "What are the risks of AI?"

We search for:

  • "What are the risks of AI?"
  • "AI dangers and downsides"
  • "Negative impacts of artificial intelligence"
  • "AI safety concerns"
  • "Problems with AI adoption"

const searchQueries = await generateSearchQueriesStream(
    effectiveQuery,
    retrievalParams.expansionQueries // 2-5 queries based on intent
);
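The expansion itself can be a single LLM call that returns one query per line; a minimal sketch (prompt wording assumed):

// Illustrative query expansion: n paraphrases plus the original
async function generateSearchQueries(query: string, n: number): Promise<string[]> {
    const prompt = `Write ${n} diverse search queries covering different
phrasings and synonyms of: "${query}". One per line, no numbering.`;
    const variants = (await llm(prompt)) // llm() is an assumed helper
        .split("\n")
        .map((line) => line.trim())
        .filter(Boolean);
    return [query, ...variants.slice(0, n)];
}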

Result: 40% better recall on average. We find chunks that matter even if they don't use your exact words.


Step 5: Hybrid Search

Vector search is great for semantics. But sometimes you need exact matches.

We combine both:

-- Hybrid search: vector + full-text with adaptive weights
SELECT
  chunk_id,
  text,
  (
    ({vectorWeight} * (1 - (embedding <=> query_embedding))) +
    ({ftsWeight} * ts_rank(search_vector, websearch_to_tsquery(query)))
  ) as combined_score
FROM chunks
ORDER BY combined_score DESC
LIMIT {chunkCount};

Example: If you search for "NVIDIA earnings Q3", vector search finds semantically similar chunks. Full-text search finds chunks with those exact words. Combined = best results.


Step 6: CRAG - Corrective RAG (Chunk Grading)

Here's the key idea: not all retrieved chunks are relevant.

Before sending chunks to the LLM, we grade each one:

interface GradedChunk {
    chunk: any;
    relevance: "relevant" | "partially_relevant" | "irrelevant";
    score: number; // 0-1
    reasoning: string;
}

const cragResult = await gradeChunksRelevance(query, chunks);

// Result:
// {
//   decision: 'sufficient' | 'needs_expansion' | 'no_relevant_sources',
//   avgRelevanceScore: 0.73,
//   relevantChunks: [...],  // Only the good ones
// }

Three outcomes:

  1. sufficient: good chunks found, proceed to answer
  2. needs_expansion: chunks are borderline, try a broader search
  3. no_relevant_sources: nothing relevant, tell the user honestly

Why this matters: Without CRAG, the LLM gets noisy context and hallucinates. With CRAG, it only sees relevant chunks.
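Acting on the decision is then a simple branch. A sketch using the names above; hybridSearch and the exact expansion strategy are assumptions:

// Illustrative handling of the three CRAG outcomes
switch (cragResult.decision) {
    case "sufficient":
        chunks = cragResult.relevantChunks; // keep only the graded-relevant chunks
        break;
    case "needs_expansion":
        // Borderline results: retry with a wider net (assumed strategy)
        chunks = await hybridSearch(query, { ...retrievalParams, chunkCount: 30 });
        break;
    case "no_relevant_sources":
        sendEvent({ type: "token", payload: "I couldn't find anything relevant in your sources." });
        return;
}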


Step 7: Answer Synthesis with Streaming

Finally, we generate the answer with real-time streaming:

// Stream tokens as they're generated
for await (const token of chatGen) {
    sendEvent({ type: "token", payload: token });
}

// Include citations
sendEvent({
    type: "done",
    payload: {
        answer,
        citations: sources.map((s) => ({ id: s.source_id, title: s.title })),
        intent: routerResult.intent,
    },
});
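sendEvent itself is just a server-sent-events write; a minimal sketch over Node's raw response (the actual transport in Memory Palace may differ):

import type { ServerResponse } from "http";

// Illustrative SSE emitter: each event becomes one `data:` frame
function makeSendEvent(res: ServerResponse) {
    res.writeHead(200, { "Content-Type": "text/event-stream" });
    return (event: { type: string; payload: unknown }) =>
        res.write(`data: ${JSON.stringify(event)}\n\n`);
}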

Real-time status updates throughout:

[Status] Analyzing query intent...
[Routing] Intent: comparison, Confidence: 0.89
[Status] Rewriting query with context...
[Rewriting] "What about their approach?" → "What is Apple and Google's approach to AI?"
[Status] Generating 4 search queries...
[Queries] ["Apple Google AI approach", "tech giants AI strategy", ...]
[Status] Searching 4 queries...
[Status] Grading chunk relevance...
[Grading] Decision: sufficient, Avg Score: 0.78, 12/15 chunks relevant
[Sources] Found 3 relevant sources
[Status] Generating answer...
[Token] Apple...
[Token] 's approach...

Part 2: The Chrome Extension

Now onto the second major feature: save anything from anywhere.

Features

  • One-click save β€” Save any webpage as a published memory instantly
  • Built-in chat β€” Ask questions about your memories without leaving the page
  • Smart extraction β€” Pulls full content from any website
  • Secure login β€” Uses the same Supabase auth as the web app

The Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          Your Browser                           │
│  ┌─────────────────┐     ┌─────────────────┐                    │
│  │   Popup UI      │     │  Content Script │                    │
│  │  - Login        │     │  - Extracts DOM │                    │
│  │  - Chat         │     │  - Full content │                    │
│  │  - Save button  │     │  - No limits    │                    │
│  └────────┬────────┘     └────────┬────────┘                    │
│           │                       │                             │
│           ▼                       ▼                             │
│  ┌────────────────────────────────────────────┐                 │
│  │         Background Service Worker          │                 │
│  └─────────────────────┬──────────────────────┘                 │
│                        │                                        │
└────────────────────────┼────────────────────────────────────────┘
                         │
                         ▼
           ┌─────────────────────────┐
           │    Memory Palace API    │
           │       (Railway)         │
           └─────────────────────────┘

The Game Changer: Unlike the worker, which fetches URLs and parses HTML, the extension runs directly in your browser with full DOM access (see the sketch after this list):

  1. No content limits: get the entire article, not just 50KB
  2. JavaScript-rendered content: works on SPAs and dynamic sites
  3. Bypasses bot detection: you're a real browser, not a scraper
  4. Sees what you see: if you can read it, you can save it
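Those pieces talk over Chrome's standard messaging APIs. A minimal sketch of the save flow from the popup; the EXTRACT_CONTENT message name, API_URL, and token handling are assumptions:

// popup.ts (sketch): ask the content script for the page, then save it
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
const page = await chrome.tabs.sendMessage(tab.id!, { type: "EXTRACT_CONTENT" });

await fetch(`${API_URL}/memories`, { // API_URL and token are assumptions
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ title: page.title, content: page.content, status: "published" }),
});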

Smart Content Extraction

We don't just grab document.body.innerText. We intelligently extract content:

Platform-specific selectors:

const mainSelectors = [
    // Medium
    "article[data-testid='post']",
    ".meteredContent",

    // Substack
    ".post-content",
    ".available-content",

    // WordPress
    ".entry-content",
    ".article-body",

    // News sites
    ".story-body",
    ".article__body",

    // Dev blogs
    ".markdown-body",
    ".prose",

    // Generic fallbacks
    "article",
    "main",
    '[role="main"]',
];

Find the best element (most content):

let mainElement = null;
let maxContentLength = 0;

for (const selector of mainSelectors) {
    const element = document.querySelector(selector);
    if (element && element.innerText.length > maxContentLength) {
        mainElement = element;
        maxContentLength = element.innerText.length;
    }
}

Aggressive cleanup:

const removeSelectors = [
    "script",
    "style",
    "noscript",
    "iframe",
    "nav",
    "header",
    "footer",
    ".sidebar",
    ".comments",
    ".advertisement",
    ".ad",
    ".social-share",
    ".related-posts",
    ".newsletter",
    ".cookie-notice",
    "button",
    "form",
];
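Applying the cleanup is a loop over a clone of the chosen element, so the visible page stays untouched; a small sketch (cloning is my assumption, not necessarily what the extension does):

// Sketch: strip noise from a clone so the live DOM is left alone
const clone = mainElement.cloneNode(true) as HTMLElement;
for (const selector of removeSelectors) {
    clone.querySelectorAll(selector).forEach((el) => el.remove());
}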

Structure-preserving extraction:

let content = "";

const walkNode = (node) => {
    if (node.nodeType === Node.TEXT_NODE) {
        content += node.textContent;
    } else if (node.nodeType === Node.ELEMENT_NODE) {
        const tag = node.tagName.toLowerCase(); // normalize tag once per element
        if (["p", "div", "h1", "h2", "li"].includes(tag)) content += "\n";
        if (["h1", "h2", "h3"].includes(tag)) content += "\n## ";
        if (tag === "li") content += "• ";

        for (const child of node.childNodes) walkNode(child);
    }
};

Result: Clean, formatted text with headings and bullet points preserved.


Auto-Publish: Skip the Draft Stage

Previously: Create draft → Review → Publish → Chunk → Embed

Now: Save from extension → Immediately published and searchable

// Extension sends:
body: JSON.stringify({
    org_id: orgId,
    title,
    content,
    status: "published", // Skip the draft!
});

// API automatically queues for chunking:
if (body.status === "published") {
    await queue.add("chunk-memory", { memoryId: memory.id });
}

Result: Save an article → Immediately searchable in chat. No extra clicks.


GitHub Repositories

vedha-pocket-extension: Chrome Extension for one-click saves
👉 https://github.com/venki0552/vedha-pocket-extension ← NEW!


The Funny Bits (More Lessons Learned)

1. The "Why Is Everything Relevant?" Disaster

The first version of CRAG graded everything as "relevant" because the prompt was too lenient. The LLM was like "well, it could be related..."

Fix: Added strict grading criteria and asked for reasoning before the score.

2. The Query Router That Said "Hello" To Everything

Intent classification kept detecting greetings in legitimate questions because the prompt prioritized "be friendly."

Before: "Hello, can you compare the AI strategies?" → intent: 'no_retrieval'

After: Only no_retrieval for actual greetings with no substantive question.

3. The Timeout Cascade

Agentic RAG has 7 LLM calls. At 10 seconds each, that's 70 seconds worst case. Original code had no timeouts. Users waited. Forever.

Fix: 10-second timeout per step, graceful fallbacks:

const routerResult = await fetchWithTimeout(
    url,
    options,
    LLM_TIMEOUT_MS // 10 seconds max
);
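fetchWithTimeout itself can be a thin AbortController wrapper; a sketch of one way to implement it (the real helper may differ):

// Sketch: abort the request once the timeout elapses
async function fetchWithTimeout(url: string, options: RequestInit, ms: number) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), ms);
    try {
        return await fetch(url, { ...options, signal: controller.signal });
    } finally {
        clearTimeout(timer); // always clean up the timer
    }
}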

4. The "Multi-Query Made It Worse" Mystery

More queries = more results. But more results = more noise. CRAG was filtering out 90% of chunks.

Insight: Generate fewer, better queries. 3-5 is the sweet spot.

5. The "Why Is It All On One Line" Formatting Bug

Extension content extraction used:

content = content.replace(/\s+/g, " ");

Except \s+ matches newlines too. Every article became one giant paragraph.

Fix:

content = content.replace(/ +/g, " "); // Only spaces, preserve newlines

Updated Full Architecture

┌─────────────────┐
│                 │
│Chrome Extension │──────────────┐
│  - Save pages   │              │
│  - Chat         │              │
│                 │              │
└─────────────────┘              │
                                 ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│   Next.js Web   │────▶│   Fastify API   │────▶│   BullMQ Worker │
│   (Vercel)      │     │   (Railway)     │     │   (Railway)     │
│                 │     │  + Agentic RAG  │     │                 │
│                 │     │                 │     │                 │
└─────────────────┘     └────────┬────────┘     └────────┬────────┘
                                 │                       │
                                 ▼                       ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │                 │     │                 │
                        │    Supabase     │     │    OpenRouter   │
                        │  (PostgreSQL +  │     │   (LLM + Embed) │
                        │    pgvector)    │     │                 │
                        └─────────────────┘     └─────────────────┘

What's Next?

Completed ✅

  • [x] Agentic RAG Pipeline: query routing, CRAG, adaptive retrieval
  • [x] Chrome Extension: save pages with one click
  • [x] Auto-publish from extension
  • [x] Real-time status streaming

Coming Soon 🚀

  • [ ] Self-reflective answer grading: retry if the answer has hallucinations
  • [ ] Firefox Extension
  • [ ] Keyboard shortcuts: Ctrl+Shift+S to save
  • [ ] Offline queue: save when offline, sync when online

Try It Yourself

  1. Web App: https://vedha-pocket-web.vercel.app
  2. Extension: Clone from GitHub
  3. API: https://vedha-api-production.up.railway.app

All open source. All self-hostable.


Final Thoughts

Part 1 was about making RAG work. Part 2 is about making it smart.

The difference between "good enough" and "actually useful" is in the details:

  • Understanding what kind of question you're asking
  • Knowing when retrieval isn't needed
  • Filtering noise before it reaches the LLM
  • Meeting users where they are (in the browser)

The biggest insight: RAG isn't one thing. It's a pipeline. And every step in that pipeline is an opportunity to add intelligence.

Next up: making the system learn from feedback. If you downvote an answer, it should remember why.

Built with ❤️, even more ☕, and a deep appreciation for graceful timeouts.

- Venkat


Part 1: I Built a RAG-Powered Second Brain

Tags: #AI #AgenticRAG #CRAG #ChromeExtension #OpenSource #Supabase #TypeScript #RAG #KnowledgeManagement
