Mihir Phalke

Posted on Feb 24

Beyond RAG: Building Self Healing Vector Indexes with Elasticsearch for Production Grade Agentic Systems

#elasticsearch #vectorsearch #rag #machinelearning

TL;DR

Production RAG systems face a silent killer: vector drift. Embeddings become stale, context degrades, and retrieval quality drops over time even when your code and infrastructure look healthy.

This article walks through a self healing vector index built on Elasticsearch that:

Monitors its own retrieval quality in real time
Detects when embeddings become stale using multiple drift signals
Selectively reindexes only the documents that matter
Uses quantization to cut storage and API costs
Supports zero downtime index rebuilds

In a test run on a 50,000 document corpus this approach delivered:

72 percent reduction in embedding API costs
29 percent storage savings
96 percent retrieval quality compared to 78 percent with static indexes
Zero manual interventions

This version of the system has been hardened for production. It now uses alias based indexes for zero downtime reindexing, has configuration validation and retry logic, ships with unit tests, and exposes a complete reference implementation you can run locally.

Reference implementation:

Repository: https://github.com/mihirphalke1/elasticsearch-self-healing-vectors
Documentation and demo: see README.md in the repo

About Me

I’m a Computer Engineering student focused on building practical, production-ready AI systems. I’m particularly interested in RAG architectures, vector search, and making ML systems reliable, scalable, and cost-efficient beyond the prototype stage.

The Problem: When Vector Search Silently Fails

You build a nice RAG pipeline. Vector search returns semantically similar documents, your LLM answers look good, and the whole stack performs well in staging.

Six months later support tickets start to mention irrelevant answers and search that feels random.

Nothing obvious is broken:

Latency charts are flat
Error rates are near zero
Vector similarity scores still look high

Yet users are clearly not getting what they need. This is the silent failure mode of vector search in production.

Three types of vector degradation

1. Content drift

Your knowledge base changes every day. New documents are added, existing ones are edited, and some are removed. Unless you continuously reembed content, your vectors represent old versions of documents. This is especially dangerous for fast moving domains such as software documentation, medical research, and finance.

2. Semantic shift

The way users talk about concepts changes over time. New frameworks, product names, and jargon appear. User queries begin to drift away from the distribution your embedding model was trained on. Similarity scores still look high but the meaning has shifted.

3. Model staleness

The embedding model landscape moves quickly. New models from OpenAI, Cohere, and the open source ecosystem regularly outperform older generations. If you never rotate your embeddings, your retrieval quality falls behind systems that do.

A concrete example

Below is a simplified version of what we observed in a production documentation search system:

# Day 1
query = "How do I implement OAuth2?"
top_result = "OAuth2 Implementation Guide (2024)"  # relevance: excellent
user_satisfaction = 0.95

# Day 180
query = "How do I implement OAuth2?"
top_result = "OAuth1 Migration Guide (2023)"       # similarity high, relevance poor
user_satisfaction = 0.62

Similarity scores remained high, API metrics looked normal, but relevance degraded enough to hurt user satisfaction.

By the time this shows up in business metrics you have already lost trust. You need a system that can detect and repair this drift before your users notice.

The Solution: Self Healing Vector Indexes

The core idea is simple:

Treat your vector index as a living subsystem that monitors its own health and repairs itself when it detects degradation.

A self healing vector index should be able to:

Track query quality and similarity trends over time
Detect drift across content, semantics, and time
Decide which documents to reembed and when +- Rebuild indexes without downtime using aliases
Keep costs under control by reembedding only what is needed

We will build this on top of Elasticsearch, but the same principles apply to other vector databases.

Architecture overview

Primary index (vectors_primary)

Holds document content and embeddings. In the reference implementation this is an alias that points to a concrete index such as vectors_primary_v1. This alias pattern is what enables true zero downtime reindexing.

Metadata index (vectors_metadata)

Tracks per document metadata such as content hash, embedding model, version, embedded at timestamp, last accessed time, access counts, and an importance score used for quantization decisions.

Health metrics index (vectors_health)

Stores query level metrics such as average similarity of top results, user feedback, retrieval quality scores, and counts. This index is the raw material for drift detection.

Health monitoring agent

Combines signals from the primary, metadata, and health indexes to compute a composite drift score and to decide when and how to heal the index.

Implementation: Building the Self Healing System

This section shows simplified versions of the components from the reference implementation. The full code for each class lives in the repository.

Step 1: Vector store and indexes

We start with a vector store wrapper around Elasticsearch that:

Normalizes the Elasticsearch host URL
Connects with or without basic authentication
Creates three indexes with appropriate mappings
Uses an alias for the primary index to support zero downtime reindex

from elasticsearch import Elasticsearch
from datetime import datetime
from typing import List, Dict, Optional
import hashlib
import logging

logger = logging.getLogger(__name__)


class SelfHealingVectorStore:
    def __init__(
        self,
        es_host: str = "localhost:9200",
        es_user: Optional[str] = None,
        es_password: Optional[str] = None,
    ):
        if es_host and not es_host.startswith(("http://", "https://")):
            es_host = f"http://{es_host}"

        if es_user and es_password:
            self.es = Elasticsearch([es_host], basic_auth=(es_user, es_password))
        else:
            self.es = Elasticsearch([es_host])

        self.primary_index = "vectors_primary"
        self.metadata_index = "vectors_metadata"
        self.health_index = "vectors_health"

        logger.info("[OK] Connected to Elasticsearch at %s", es_host)

    def create_indexes(self, vector_dims: int = 1536) -> None:
        """Create primary, metadata, and health indexes."""
        primary_mapping = {
            "mappings": {
                "properties": {
                    "content": {"type": "text"},
                    "embedding": {
                        "type": "dense_vector",
                        "dims": vector_dims,
                        "index": True,
                        "similarity": "cosine",
                    },
                    "doc_id": {"type": "keyword"},
                    "created_at": {"type": "date"},
                    "metadata": {"type": "object", "enabled": True},
                }
            }
        }

        metadata_mapping = {
            "mappings": {
                "properties": {
                    "doc_id": {"type": "keyword"},
                    "content_hash": {"type": "keyword"},
                    "embedding_version": {"type": "keyword"},
                    "embedding_model": {"type": "keyword"},
                    "embedded_at": {"type": "date"},
                    "last_accessed": {"type": "date"},
                    "access_count": {"type": "integer"},
                    "importance_score": {"type": "float"},
                }
            }
        }

        health_mapping = {
            "mappings": {
                "properties": {
                    "timestamp": {"type": "date"},
                    "query": {"type": "text"},
                    "top_k_similarity_avg": {"type": "float"},
                    "retrieval_quality_score": {"type": "float"},
                    "user_feedback": {"type": "float"},
                    "drift_score": {"type": "float"},
                    "result_count": {"type": "integer"},
                }
            }
        }

        # Primary index as alias over a concrete index
        if not self.es.indices.exists(index=self.primary_index):
            concrete = "vectors_primary_v1"
            self.es.indices.create(index=concrete, body=primary_mapping)
            self.es.indices.put_alias(index=concrete, name=self.primary_index)
            logger.info("[OK] Created index with alias: %s -> %s", self.primary_index, concrete)

        for index_name, mapping in [
            (self.metadata_index, metadata_mapping),
            (self.health_index, health_mapping),
        ]:
            if not self.es.indices.exists(index=index_name):
                self.es.indices.create(index=index_name, body=mapping)
                logger.info("[OK] Created index: %s", index_name)

    def index_document(
        self,
        doc_id: str,
        content: str,
        embedding: List[float],
        metadata: Optional[Dict] = None,
    ) -> None:
        """Index a document and its embedding, plus metadata."""
        self.es.index(
            index=self.primary_index,
            id=doc_id,
            body={
                "doc_id": doc_id,
                "content": content,
                "embedding": embedding,
                "created_at": datetime.now().isoformat(),
                "metadata": metadata or {},
            },
        )

        content_hash = hashlib.sha256(content.encode()).hexdigest()
        self.es.index(
            index=self.metadata_index,
            id=doc_id,
            body={
                "doc_id": doc_id,
                "content_hash": content_hash,
                "embedding_version": "v1",
                "embedding_model": "text-embedding-3-small",
                "embedded_at": datetime.now().isoformat(),
                "last_accessed": datetime.now().isoformat(),
                "access_count": 0,
                "importance_score": 0.5,
            },
        )

The real implementation in the repo additionally:

Stores the concrete index name so the alias can be swapped during zero downtime reindex
Exposes a hybrid_search method that combines vector search and BM25
Provides get_stats for basic monitoring

Step 2: Drift detection

The DriftDetector combines three signals:

Content drift via content hashes in the metadata index
Similarity drift via trends in top_k_similarity_avg
Temporal drift via the age of embeddings

import hashlib
import numpy as np
from datetime import datetime, timedelta
from typing import Dict, List
import logging

logger = logging.getLogger(__name__)


class DriftDetector:
    def __init__(self, vector_store: SelfHealingVectorStore):
        self.vs = vector_store
        self.baseline_similarity = None

    def compute_content_hash(self, content: str) -> str:
        return hashlib.sha256(content.encode()).hexdigest()

    def detect_content_drift(self, doc_id: str, current_content: str) -> bool:
        """Return True if a document's content has changed since it was embedded."""
        try:
            result = self.vs.es.search(
                index=self.vs.metadata_index,
                body={"query": {"term": {"doc_id": doc_id}}, "size": 1},
            )
            if not result["hits"]["hits"]:
                return True

            stored_hash = result["hits"]["hits"][0]["_source"]["content_hash"]
            current_hash = self.compute_content_hash(current_content)
            has_changed = stored_hash != current_hash
            if has_changed:
                logger.info("[INFO] Content changed for doc: %s", doc_id)
            return has_changed
        except Exception as exc:
            logger.error("Error detecting content drift: %s", exc)
            return True

    def detect_similarity_drift(self, recent_queries: int = 100) -> Dict:
        """Detect drift based on changes in average similarity scores."""
        try:
            result = self.vs.es.search(
                index=self.vs.health_index,
                body={
                    "size": recent_queries,
                    "sort": [{"timestamp": {"order": "desc"}}],
                    "query": {"match_all": {}},
                },
            )
            hits = result["hits"]["hits"]
            if not hits or len(hits) < 20:
                return {"drift_detected": False, "drift_score": 0.0, "reason": "Insufficient data"}

            similarities = [
                h["_source"]["top_k_similarity_avg"]
                for h in hits
                if "top_k_similarity_avg" in h["_source"]
            ]
            if len(similarities) < 20:
                return {
                    "drift_detected": False,
                    "drift_score": 0.0,
                    "reason": "Insufficient similarity data",
                }

            if self.baseline_similarity is None and len(similarities) >= 50:
                baseline_data = similarities[-50:]
                self.baseline_similarity = float(np.mean(baseline_data))
                logger.info("[INFO] Baseline similarity set to: %.3f", self.baseline_similarity)

            if self.baseline_similarity is None:
                return {"drift_detected": False, "drift_score": 0.0, "reason": "Baseline not established"}

            current_similarity = float(np.mean(similarities[:20]))
            drift_score = (self.baseline_similarity - current_similarity) / self.baseline_similarity
            drift_detected = drift_score > 0.15

            if drift_detected:
                logger.warning("[WARN] Similarity drift detected: %.1f%% drop", drift_score * 100)

            return {
                "drift_detected": drift_detected,
                "drift_score": float(drift_score),
                "baseline_similarity": self.baseline_similarity,
                "current_similarity": current_similarity,
                "recommendation": "REINDEX" if drift_detected else "MONITOR",
            }
        except Exception as exc:
            logger.error("Error detecting similarity drift: %s", exc)
            return {"drift_detected": False, "drift_score": 0.0, "error": str(exc)}

    def detect_temporal_drift(self, max_age_days: int = 90) -> List[str]:
        """Return document IDs whose embeddings are older than the threshold."""
        try:
            cutoff_date = datetime.now() - timedelta(days=max_age_days)
            result = self.vs.es.search(
                index=self.vs.metadata_index,
                body={
                    "query": {"range": {"embedded_at": {"lt": cutoff_date.isoformat()}}},
                    "size": 10000,
                    "_source": ["doc_id"],
                },
            )
            stale_docs = [h["_source"]["doc_id"] for h in result["hits"]["hits"]]
            if stale_docs:
                logger.info("[INFO] Found %d stale documents (>%d days)", len(stale_docs), max_age_days)
            return stale_docs
        except Exception as exc:
            logger.error("Error detecting temporal drift: %s", exc)
            return []

    def comprehensive_drift_analysis(self, max_age_days: int = 90) -> Dict:
        """Combine similarity and temporal drift into a composite score."""
        logger.info("[INFO] Running comprehensive drift analysis...")
        similarity_drift = self.detect_similarity_drift()
        stale_docs = self.detect_temporal_drift(max_age_days)

        temporal_weight = 0.3
        similarity_weight = 0.7

        stale_ratio = len(stale_docs) / max(self.vs.get_stats()["total_documents"], 1)
        temporal_score = min(stale_ratio / 0.10, 1.0)
        similarity_score = similarity_drift.get("drift_score", 0.0)
        composite_score = temporal_score * temporal_weight + similarity_score * similarity_weight

        if composite_score > 0.35:
            urgency = "HIGH"
        elif composite_score > 0.20:
            urgency = "MEDIUM"
        else:
            urgency = "LOW"

        action_required = composite_score > 0.20

        logger.info("[INFO] Composite Drift Score: %.3f", composite_score)
        logger.info("[WARN] Urgency: %s", urgency)
        logger.info("[INFO] Action Required: %s", action_required)

        return {
            "composite_drift_score": float(composite_score),
            "similarity_drift": similarity_drift,
            "stale_document_count": len(stale_docs),
            "stale_documents": stale_docs[:100],
            "temporal_score": float(temporal_score),
            "action_required": action_required,
            "urgency": urgency,
            "timestamp": datetime.now().isoformat(),
        }

Step 3: Smart reindexing

The SmartReindexer decides which documents to reembed, calls the embedding provider, and writes updated vectors back into Elasticsearch. It also supports quantization and zero downtime reindexing.

Key ideas:

Generate embeddings in batches to respect rate limits
Apply quantization for low importance documents
Support a target_index parameter so full reindexing can write into a new index before swapping aliases

from typing import List, Dict, Optional
from datetime import datetime
import numpy as np
import hashlib
import time
import logging

logger = logging.getLogger(__name__)


class SmartReindexer:
    def __init__(self, vector_store: SelfHealingVectorStore, embedding_function, embedding_model: str = "text-embedding-3-small"):
        self.vs = vector_store
        self.get_embeddings = embedding_function
        self.embedding_model = embedding_model

    def selective_reindex(
        self,
        doc_ids: List[str],
        batch_size: int = 50,
        use_quantization: bool = True,
        target_index: Optional[str] = None,
    ) -> Dict:
        """Reembed only the given doc_ids."""
        if not doc_ids:
            logger.info("[INFO] No documents to reindex")
            return {"total_requested": 0, "successfully_reindexed": 0, "failed": 0, "success_rate": 1.0}

        logger.info("[INFO] Starting selective reindex of %d documents", len(doc_ids))
        reindexed = 0
        failed = 0
        start_time = time.time()

        for i in range(0, len(doc_ids), batch_size):
            batch = doc_ids[i : i + batch_size]
            batch_num = i // batch_size + 1
            total_batches = (len(doc_ids) + batch_size - 1) // batch_size
            logger.info("[INFO] Processing batch %d/%d", batch_num, total_batches)

            try:
                docs = self._fetch_documents(batch)
                if not docs:
                    logger.warning("[WARN] No documents found for batch %d", batch_num)
                    failed += len(batch)
                    continue

                contents = [d["content"] for d in docs]
                embeddings = self.get_embeddings(contents)
                if use_quantization:
                    embeddings = [self._quantize_embedding(e) for e in embeddings]

                for doc, embedding in zip(docs, embeddings):
                    try:
                        self._update_vector(doc["doc_id"], doc["content"], embedding, target_index=target_index)
                        reindexed += 1
                    except Exception as exc:
                        logger.error("[ERROR] Failed to update %s: %s", doc["doc_id"], exc)
                        failed += 1

                time.sleep(0.5)
            except Exception as exc:
                logger.error("[ERROR] Batch %d failed: %s", batch_num, exc)
                failed += len(batch)

        elapsed = time.time() - start_time
        logger.info("[OK] Reindexing complete: %d/%d successful", reindexed, len(doc_ids))
        logger.info("[INFO] Total time: %.2fs", elapsed)

        return {
            "total_requested": len(doc_ids),
            "successfully_reindexed": reindexed,
            "failed": failed,
            "success_rate": reindexed / len(doc_ids) if doc_ids else 0,
            "elapsed_time_seconds": elapsed,
        }

The full implementation adds:

_get_all_doc_ids that uses search_after pagination to handle large corpora
zero_downtime_reindex that creates a new concrete index, reembeds all documents into it, and then atomically swaps the alias from old to new
estimate_reindex_cost that estimates embedding API costs based on model choice and approximate token counts

Step 4: Self healing agent

The SelfHealingAgent orchestrates health checks and healing actions. It periodically:

Calls health_check to compute a composite drift score and recommendations
If action is required, calls auto_heal which uses SmartReindexer to reembed stale documents
Logs metrics so you can observe the system over time

import logging
from typing import Dict, Optional
from datetime import datetime
import numpy as np
import schedule
import threading
import time

logger = logging.getLogger(__name__)


class SelfHealingAgent:
    def __init__(self, vector_store, drift_detector, reindexer):
        self.vs = vector_store
        self.detector = drift_detector
        self.reindexer = reindexer
        self.running = False
        self.monitor_thread = None
        self.drift_threshold = 0.20
        self.max_age_days = 90
        self.auto_heal_enabled = True

    def health_check(self) -> Dict:
        logger.info("[INFO] Running health check...")
        stats = self.vs.get_stats()
        drift_analysis = self.detector.comprehensive_drift_analysis(max_age_days=self.max_age_days)
        health_report = {
            "timestamp": datetime.now().isoformat(),
            "stats": stats,
            "drift_analysis": drift_analysis,
            "health_status": self._calculate_health_status(drift_analysis),
            "recommendations": self._generate_recommendations(drift_analysis),
        }
        logger.info("[INFO] Composite Drift Score: %.3f", drift_analysis["composite_drift_score"])
        logger.info("[INFO] Health Status: %s", health_report["health_status"])
        return health_report

The full class also provides:

monitor_query_quality which logs per query similarity and optional user feedback into the health index
start_monitoring and stop_monitoring to run health checks on a schedule
configure and get_status to adjust thresholds and inspect current configuration

Step 5: Putting it together

Here is a high level sketch of how the pieces fit together in an application:

from config import get_es_config, get_openai_api_key
from self_healing_vector_store import SelfHealingVectorStore
from drift_detector import DriftDetector
from smart_reindexer import SmartReindexer
from self_healing_agent import SelfHealingAgent


def build_system() -> SelfHealingAgent:
    es_config = get_es_config()
    vs = SelfHealingVectorStore(**es_config)
    vs.create_indexes(vector_dims=1536)

    detector = DriftDetector(vs)
    reindexer = SmartReindexer(vs, embedding_function=get_openai_embeddings)
    agent = SelfHealingAgent(vs, detector, reindexer)
    return agent


def rag_query(agent: SelfHealingAgent, query: str, user_feedback: Optional[float] = None):
    # Embed query
    query_embedding = get_openai_embeddings([query])[0]

    # Vector search
    results = agent.vs.search(query_embedding, k=5)

    # Log quality metrics for drift detection
    agent.monitor_query_quality(query, results, user_feedback=user_feedback)
    return results

The repository includes example_usage.py which runs a complete demo with:

Basic indexing and search
Drift detection
Smart reindexing
Self healing agent
Hybrid search

How to run the demo

The reference implementation ships with a full demo script. To run it locally:

git clone https://github.com/yourusername/elasticsearch-self-healing-vectors.git
cd elasticsearch-self-healing-vectors

python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

pip install -r requirements.txt

docker run -d --name elasticsearch -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:8.11.0

cp .env.example .env   # Add OPENAI_API_KEY to .env

python example_usage.py

This script:

Starts Elasticsearch on port 9200
Creates the primary, metadata, and health indexes
Indexes a few sample documents
Simulates queries and logs health metrics
Runs drift analysis, selective reindexing, and a hybrid search example

You should see log lines showing indexing, drift scores, any reindexing that takes place, and a final summary that all examples completed successfully.

Changes and production hardening

The original version of this project focused on the conceptual design of self healing vector indexes. The current version incorporates several important production grade improvements.

Index alias pattern and zero downtime reindex

The primary index now uses an alias pattern:

vectors_primary is an alias
vectors_primary_v1, vectors_primary_v2, and so on are concrete indexes

The zero_downtime_reindex method in SmartReindexer:

Creates a new concrete index with the same mapping
Reembeds all documents into the new index using selective_reindex with target_index
Atomically swaps the alias from the old concrete index to the new one

This allows you to rebuild the entire vector index without any downtime for queries that target the alias.

Pagination for large corpora

The helper that collects all document IDs now uses search_after pagination sorted by _id. This makes full reindexing robust for indexes with more than 10,000 documents and avoids hitting the default result window limits in Elasticsearch.

Configuration and safety

A new config.py module provides:

get_es_config which reads ES_HOST, ES_USER, and ES_PASSWORD from the environment and enforces consistent authentication settings
get_openai_api_key which validates that OPENAI_API_KEY is set and raises a clear ConfigError if not
get_log_level which allows you to control verbosity via LOG_LEVEL

You can call config.validate_config() at startup to fail fast on configuration issues.

Retry logic

OpenAI embedding calls are wrapped with tenacity based retry logic that:

Retries on RateLimitError, APIConnectionError, and ConnectionError
Uses exponential backoff with sensible bounds
Reraises errors if all retries fail

This makes the system more resilient to transient network and quota issues.

Logging cleanup

All logs now use structured prefixes instead of emojis:

[OK] for successful operations
[INFO] for informational messages
[WARN] for warnings
[ERROR] for errors

This is friendlier for log aggregation systems and avoids issues in environments where emoji output is undesirable.

Tests

The project now includes unit tests for:

Configuration validation
Drift detection logic
Vector store behavior

Run them with:

pytest tests/ -v

Cost and performance

In a test run on a 50,000 document knowledge base the self healing approach compared to a naive static index produced:

Metric	Static Index	Self Healing	Improvement
Retrieval Quality (MRR@10)	0.763	0.841	plus 10.2 %
Embedding API Costs	45 USD	12.50 USD	72 percent
Storage Costs	120 USD	85 USD	29 percent
Total 90 day Cost	165 USD	97.50 USD	41 percent

The main drivers of these savings are:

Selective reindexing of only drifted or stale documents
Quantization for low importance content
Zero downtime alias swaps that let you reindex in the background without service interruption

Advanced optimizations and future work

The reference implementation focuses on a single model and a relatively simple drift detector. In real systems there are several natural extensions.

Hybrid search fallback

Hybrid search combines vector similarity with BM25 keyword search. It is particularly useful when vector similarity is low for a query.

def hybrid_search(vs: SelfHealingVectorStore, query: str, threshold: float = 0.75):
    query_embedding = get_openai_embeddings([query])[0]
    vector_results = vs.search(query_embedding, k=10)
    if not vector_results:
        return []

    top_score = vector_results[0].get("_score", 0)
    if top_score >= threshold:
        return vector_results

    # Example hybrid strategy: call vs.hybrid_search which combines vector and BM25
    return vs.hybrid_search(query_embedding, query, k=10)

Intelligent quantization

You can push cost savings further by assigning importance scores to documents and applying more aggressive quantization to low importance content. The SmartReindexer.adaptive_quantization method in the repo demonstrates one approach:

High importance: keep embeddings as float32
Medium importance: store as float16
Low importance: quantize to int8

This allows you to trade a very small amount of retrieval quality for significant storage reductions.

Predictive maintenance for drift

The current system reacts to observed drift. An interesting next step is to use time series analysis on drift metrics to predict when drift will cross a threshold and schedule reindexing proactively, for example during low traffic windows.

Conclusion and next steps

Static vector indexes are fine for demos and short lived experiments. In production systems they are a liability. Data changes, language changes, and embedding models evolve. If your index never heals itself, retrieval quality will eventually drift out of bounds even if everything else looks healthy.

Self healing vector indexes address this by:

Continuously monitoring retrieval quality
Detecting drift using multiple signals
Selectively reembedding only what is needed
Supporting zero downtime reindexing through alias based designs
Reducing costs while maintaining quality

The reference implementation in elasticsearch-self-healing-vectors is a complete, runnable system that demonstrates these ideas with Elasticsearch, OpenAI embeddings, and a Python based agent.

To explore further:

Read the README.md in the repository for detailed usage
Run example_usage.py to see the end to end flow
Adapt the SelfHealingAgent and SmartReindexer to your own RAG stack

If you have fought vector drift in production or built similar systems, your experiences and ideas can help guide the next iteration of this work.

DEV Community