RajeevaChandra

๐๐ฎ๐ข๐ฅ๐๐ข๐ง๐  ๐š ๐ƒ๐ฒ๐ง๐š๐ฆ๐ข๐œ ๐‘๐€๐† ๐๐ข๐ฉ๐ž๐ฅ๐ข๐ง๐ž ๐ฐ๐ข๐ญ๐ก ๐‹๐š๐ง๐ ๐‚๐ก๐š๐ข๐ง (๐“๐ก๐š๐ญ ๐’๐ญ๐š๐ฒ๐ฌ ๐…๐ซ๐ž๐ฌ๐ก)

Most RAG (Retrieval-Augmented Generation) systems work fine for static knowledge bases, but the moment your documents start changing (new policies, updated financials, revised product specs), they quickly go stale.

We solved that with a dynamic RAG pipeline that keeps embeddings and context fresh without doing heavy full rebuilds. Here's how it works:

🧩 High-Level Flow

1๏ธโƒฃ ๐–๐š๐ญ๐œ๐ก๐ž๐ซ (๐…๐ข๐ฅ๐ž/๐’3 ๐œ๐ก๐š๐ง๐ ๐ž๐ฌ)
โ–ช Continuously listens for file changes (local folder or S3 bucket).
โ–ช Detects when a document is new, updated, or deleted.
2๏ธโƒฃ๐„๐ฆ๐›๐ž๐๐๐ข๐ง๐  (๐จ๐ง๐ฅ๐ฒ ๐ฎ๐ฉ๐๐š๐ญ๐ž๐ฌ)
โ–ช Instead of re-embedding everything, it re-embeds only the changed chunks.
โ–ช Saves time and compute costs while keeping the knowledge base fresh.
3๏ธโƒฃ ๐•๐ž๐œ๐ญ๐จ๐ซ ๐ƒ๐ (๐‚๐ก๐ซ๐จ๐ฆ๐š)
โ–ช Stores embeddings with metadata like updated_at.
โ–ช When conflicts arise (e.g., same document with old + new facts), retrieval logic can guide the LLM to trust the freshest snippet.
4๏ธโƒฃ ๐‹๐‹๐Œ (๐Ž๐ฅ๐ฅ๐š๐ฆ๐š/๐Ž๐ฉ๐ž๐ง๐€๐ˆ)
โ–ช Takes the top-k retrieved chunks and augments the query.
โ–ช Produces a contextualized answer with citations.
5๏ธโƒฃ ๐’๐ญ๐ซ๐ž๐š๐ฆ๐ฅ๐ข๐ญ ๐”๐ˆ
โ–ช Users simply ask questions.
โ–ช The UI calls the FastAPI backend, retrieves from Chroma, and passes to the LLM.
โ–ชResponses include answers + sources, so users know why the model said what it did.
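The "updates only" embedding step (2️⃣) boils down to a hash diff: keep a hash per chunk, and when the watcher fires, re-embed only the chunks whose hash changed. A minimal sketch of that logic, assuming naive fixed-size chunking and a hash index keyed by chunk position (the real repo's chunker and index layout may differ):

```python
import hashlib


def chunk_text(text, size=200):
    """Split a document into fixed-size character chunks (naive sketch)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def diff_chunks(new_chunks, stored_hashes):
    """Compare fresh chunks against stored hashes.

    Returns (changed, unchanged_count): only the chunks in `changed`
    need to be re-embedded and upserted into the vector DB.
    """
    changed = []
    unchanged = 0
    for idx, chunk in enumerate(new_chunks):
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if stored_hashes.get(idx) != digest:
            changed.append((idx, chunk, digest))  # re-embed this one
        else:
            unchanged += 1  # embedding is still valid, skip it
    return changed, unchanged
```

On a watcher event, only the `changed` chunks go to the embedder and get upserted into Chroma (with an updated_at stamp in their metadata); everything else is left untouched, which is what makes updates cheap.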

🚧 The Challenge (Simple Example)
One file said:
➡️ "All banks must maintain capital reserves of 10%."
Later, an update stated:
➡️ "All banks must maintain capital reserves of 12%."
When I asked: "What is the required capital reserve?"

Static RAG: "I don't know." (confused by conflicting facts)
Dynamic RAG: "12%" (trusts the most recent doc)
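One simple way to get the dynamic behavior above is to dedupe retrieved snippets per source document before they reach the LLM, keeping only the newest version of each. A minimal sketch, assuming every snippet carries doc_id and updated_at in its metadata and that updated_at is an ISO-8601 string (so lexical order matches time order):

```python
def resolve_conflicts(snippets):
    """Keep only the newest snippet per source document, freshest first.

    Assumes each snippet is a dict with 'doc_id' and 'updated_at' keys,
    where 'updated_at' is an ISO-8601 string (lexical order == time order).
    """
    newest = {}
    for s in snippets:
        current = newest.get(s["doc_id"])
        if current is None or s["updated_at"] > current["updated_at"]:
            newest[s["doc_id"]] = s  # newer version wins
    # Freshest snippets first, so the LLM sees the current facts up front.
    return sorted(newest.values(), key=lambda s: s["updated_at"], reverse=True)
```

With this filter in the retrieval path, the stale "10%" snippet never reaches the model when a newer "12%" version of the same document exists.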

๐“๐ก๐ž ๐’๐จ๐ฅ๐ฎ๐ญ๐ข๐จ๐ง โ€” ๐ƒ๐ฒ๐ง๐š๐ฆ๐ข๐œ ๐„๐ฆ๐›๐ž๐๐๐ข๐ง๐ ๐ฌ
๐Ÿ”„ Watches for new/updated docs in real time
โšก Re-embeds only what changes (no full rebuilds)
๐Ÿท๏ธ Tracks updated_at so the LLM knows the freshest fact
๐Ÿง  Guides the model to resolve conflicts by trusting the most recent snippet
Now, when a file is updated, the system re-embeds instantly and gives the right answer.
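Tracking updated_at only pays off if the model actually sees it. One way to surface it (a sketch of the idea, not the repo's exact prompt) is to stamp each retrieved snippet with its timestamp and instruct the model to prefer the most recent one when facts conflict:

```python
def build_prompt(question, snippets):
    """Render retrieved snippets with their updated_at stamps and tell
    the model to prefer the most recent snippet when facts conflict.

    Assumes each snippet is a dict with 'updated_at' and 'text' keys.
    """
    context = "\n".join(
        f"[updated {s['updated_at']}] {s['text']}" for s in snippets
    )
    return (
        "Answer using only the context below. If snippets conflict, "
        "trust the one with the most recent 'updated' date.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The resulting string is what gets sent to Ollama or OpenAI as the augmented query; the explicit instruction plus the visible timestamps is what lets the model answer "12%" instead of "I don't know."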

High-Level Architecture

For the full working codebase, check out my GitHub repo:
https://github.com/rajeevchandra/dynamic_embeddings

At the end of the day, AI systems are only as useful as the freshness of the knowledge they rely on. Building dynamic pipelines isn't just about better tech; it's about building assistants that can actually keep up with how fast the world changes.
