Ahmed Rakan

Building MeridianDB: Solving AI's Memory Crisis with Multi-Dimensional RAG

Why I Built This

When exploring cloud platforms, I don't just read documentation—I build something substantial. Recently, I dove deep into Cloudflare Workers, and I wanted to tackle a problem that's becoming critical in today's AI landscape: catastrophic forgetting.

The Problem: AI Agents That Forget

Traditional RAG (Retrieval-Augmented Generation) systems use vector databases to enhance AI outputs by storing data as embeddings—multi-dimensional vectors that machines can understand. When you search, the system transforms your query into vectors and performs similarity searches using mathematical distance calculations.

This approach searches for meaning, not just text. But it fails to solve a fundamental problem in agentic AI: catastrophic forgetting—when AI systems learn new information, they often forget old knowledge.

Standard RAG mitigates this issue but doesn't fundamentally solve it. As user data grows exponentially, two critical questions emerge:

  1. How does retrieved data affect AI generation quality?
  2. How relevant is this data over time?

The Solution: Multi-Dimensional Memory

MeridianDB goes beyond traditional RAG by adding multiple dimensions on top of semantic search. Built entirely on Cloudflare's infrastructure (Workers, D1, Vectorize, KV, Queues, and R2), it provides Auto-RAG that's highly scalable, performant, and runs at the edge—near your users, without headaches.

The Four Dimensions of Memory

1. Semantic Search

Like any RAG database, MeridianDB uses Cloudflare Vectorize at its core. When your AI agent sends a query, it performs semantic search to retrieve meaningful data. We recommend over-fetching to allow other features to refine results.
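
Over-fetching can be sketched like this: pull more candidates than you need from the semantic search, then let the behavioral and temporal dimensions prune the set. The `Candidate` shape and thresholds below are illustrative, not MeridianDB's actual types:

```typescript
// Illustrative candidate shape: semantic similarity plus the extra dimensions.
interface Candidate {
  id: string;
  similarity: number;   // from the vector search
  stability: number;    // temporal score
  successRate: number;  // behavioral score
}

function refine(candidates: Candidate[], limit: number): Candidate[] {
  // The over-fetched set is pruned by the non-semantic dimensions first,
  // then trimmed to the number of memories the agent actually needs.
  return candidates
    .filter((c) => c.stability > 0.2 && c.successRate > 0.3)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}
```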

2. Behavioral Learning

When your agent retrieves data, you can attach like/dislike buttons to generated responses. User feedback creates behavioral signals: on a negative signal, every memory retrieved for that response is penalized. Combined with agent configuration, this filters out memories that produce poor results.
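
A minimal sketch of that shared-signal propagation, assuming per-memory success/failure counters (the field names are illustrative, not MeridianDB's schema):

```typescript
// Every memory retrieved for a response shares the outcome signal.
type Feedback = { success: boolean; memories: string[] };

function applyFeedback(
  stats: Map<string, { success: number; failure: number }>,
  fb: Feedback
): void {
  for (const id of fb.memories) {
    const s = stats.get(id) ?? { success: 0, failure: 0 };
    if (fb.success) s.success += 1;
    else s.failure += 1; // a negative signal penalizes every retrieved memory
    stats.set(id, s);
  }
}
```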

3. Temporal Decay

Facts become irrelevant over time. We provide temporal features where you can:

  • Mark data as factual (always included, no decay)
  • Mark data as irrelevant (always excluded)
  • Let intelligent active/passive learning determine inclusion based on smart filtering and access patterns

Our exponential decay algorithm with frequency boost ensures recent and frequently accessed memories stay relevant while old, unused memories naturally fade.
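
A sketch of that scoring, using the configuration fields shown later in the post (`halfLifeHours`, `timeWeight`, `frequencyWeight`, `decayFloor`). The saturation constant in the frequency boost is an assumption of this sketch, and the production calculation runs in SQL:

```typescript
interface DecayConfig {
  halfLifeHours: number;
  timeWeight: number;      // weight of pure recency
  frequencyWeight: number; // weight of the access-frequency boost
  decayFloor: number;      // score never drops below this
}

function recencyScore(
  hoursSinceAccess: number,
  accessCount: number,
  cfg: DecayConfig
): number {
  // Exponential half-life decay: the recency term halves every halfLifeHours.
  const decay = Math.pow(0.5, hoursSinceAccess / cfg.halfLifeHours);
  // Frequency boost saturates as access count grows (constant 10 is illustrative).
  const boost = accessCount / (accessCount + 10);
  const score = cfg.timeWeight * decay + cfg.frequencyWeight * boost;
  return Math.max(cfg.decayFloor, Math.min(1, score));
}
```

With the balanced defaults, a just-accessed memory scores 0.6, a week-old one 0.3, and anything long-unused bottoms out at the 0.15 floor unless frequent access boosts it back up.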

4. Contextual Filtering

Developers or other AI agents can describe memories for specific tasks. This additional metadata helps task-performing agents find precisely what they need.

The Science Behind It

We considered adding graph capabilities—giving agentic AI the ability to build knowledge graphs would be powerful. We could implement this with edge columns and JOIN queries, but decided against it for now to maintain simplicity and performance.

The core challenge is balancing stability and plasticity:

  • Stability: AI systems must retain and consolidate old knowledge while learning new things
  • Plasticity: AI agents must pick up new information quickly

This balance varies wildly by use case. A chatbot's stability-plasticity requirements differ dramatically from a coding agent, which needs longer memory consolidation and slower learning rates.

MeridianDB's federated database is extremely configurable, with passive/active learning controlled through agent configuration.

Architecture Decisions

Handling Consistency

Many developers overlook a critical question: when building RAG, your queries are federated (affecting multiple databases)—how do you handle consistency?

Data can go out of sync. Embeddings may succeed while record insertion fails. Lots can go wrong.

MeridianDB handles all of this out of the box.

Our white paper details our approach:

  • Queue-based writes ensure eventual consistency without manual orchestration
  • Data is stored redundantly: Vectorize holds only the ID of the memory's D1 record, while D1 holds the memory content, preserving multi-dimensional context
  • Automatic retries, failover, graceful degradation on retrieval, NewSQL-inspired transactions, and event-driven processing
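
As a rough sketch of a queue-consumed write path, writing D1 first and compensating if the vector upsert fails (the store interfaces are simplified stand-ins for the real D1/Vectorize bindings; the white paper describes the actual mechanism):

```typescript
interface MemoryWrite { id: string; content: string; embedding: number[] }

interface Stores {
  insertRecord(w: MemoryWrite): Promise<void>;           // D1: memory content
  upsertVector(id: string, v: number[]): Promise<void>;  // Vectorize: id + vector
  deleteRecord(id: string): Promise<void>;
}

async function consumeWrite(stores: Stores, w: MemoryWrite): Promise<void> {
  // D1 first: a vector without a record is a dangling pointer,
  // while a record without a vector is merely unsearchable until retry.
  await stores.insertRecord(w);
  try {
    await stores.upsertVector(w.id, w.embedding);
  } catch (err) {
    await stores.deleteRecord(w.id); // compensate, then let the queue redeliver
    throw err;
  }
}
```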

The Learning Phases

We recommend operating agents in two phases:

Phase 1: Passive Learning

Start with successRate: 0.0 and stabilityThreshold: 0.0. This prevents false positives when the system lacks sufficient data. The agent collects interaction data without aggressive filtering.

Phase 2: Active Learning

Once you've accumulated meaningful data, activate filtering by setting appropriate thresholds. The system automatically filters out:

  • Memories with low success rates (behavioral)
  • Memories with low stability scores (temporal)
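
An illustrative pair of configurations for the two phases, plus the filter check they drive (field names follow the thresholds above; the exact config shape may differ):

```typescript
// Phase 1: collect data without filtering anything out.
const passivePhase = {
  successRate: 0.0,        // no behavioral filtering yet
  stabilityThreshold: 0.0, // no temporal filtering yet
};

// Phase 2: thresholds chosen once enough interaction data exists.
const activePhase = {
  successRate: 0.3,        // drop memories that keep producing bad answers
  stabilityThreshold: 0.2, // drop memories whose decay score has faded
};

function passesFilters(
  m: { successRate: number; stability: number },
  cfg: { successRate: number; stabilityThreshold: number }
): boolean {
  return m.successRate >= cfg.successRate && m.stability >= cfg.stabilityThreshold;
}
```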

Temporal Configuration

We use exponential decay with frequency boost. Each agent has its own configuration:

Balanced (Default)

{
  halfLifeHours: 168,      // 7 days
  timeWeight: 0.6,
  frequencyWeight: 0.4,
  decayCurve: 'hybrid',
  decayFloor: 0.15
}

Aggressive Decay (for chatbots)

{
  halfLifeHours: 72,       // 3 days
  timeWeight: 0.7,
  frequencyWeight: 0.3,
  decayCurve: 'exponential'
}

Long-Term Memory (for knowledge bases)

{
  halfLifeHours: 720,      // 30 days
  timeWeight: 0.5,
  frequencyWeight: 0.5,
  decayCurve: 'polynomial'
}

The recency score calculation runs in SQL, keeping retrieval latency at 300-500ms.

Behavioral Configuration

Behavioral features use the Wilson score confidence interval—a statistically robust method for scoring with sparse data:

function wilsonScore(success: number, failure: number, z = 1.96) {
  // z is the standard normal quantile for the desired confidence level
  // (z = 1.96 corresponds to a 95% interval)
  const total = success + failure;
  if (total === 0) return 0;

  const p = success / total;

  const denominator = 1 + (z * z) / total;
  const center = p + (z * z) / (2 * total);
  const spread = z * Math.sqrt((p * (1 - p) + (z * z) / (4 * total)) / total);

  return Math.max(0, (center - spread) / denominator);
}

This prevents manipulation from sparse data and provides conservative scoring for new memories.
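
To see how conservative the lower bound is with sparse data, compare the same 80% success ratio at two evidence volumes (standalone sketch; z = 1.96 is the normal quantile for a 95% interval):

```typescript
// Standalone demo: identical 80% success ratio, different evidence volume.
function wilsonLower(success: number, failure: number, z = 1.96): number {
  const total = success + failure;
  if (total === 0) return 0;
  const p = success / total;
  const denominator = 1 + (z * z) / total;
  const center = p + (z * z) / (2 * total);
  const spread = z * Math.sqrt((p * (1 - p) + (z * z) / (4 * total)) / total);
  return Math.max(0, (center - spread) / denominator);
}

const sparse = wilsonLower(8, 2);   // ≈ 0.49: little evidence, cautious score
const dense = wilsonLower(80, 20);  // ≈ 0.71: same ratio, more evidence
```

A memory with only ten feedback events scores well below one with a hundred, even though both succeeded 80% of the time, so a handful of lucky retrievals cannot dominate ranking.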

Developer Experience

Simple SDK

Install via npm:

npm i meridiandb-sdk

Three core methods: storeMemory, retrieveMemoriesSingleAgent, and recordFeedback.

Example usage:

import { MeridianDBClient } from "meridiandb-sdk";

const client = new MeridianDBClient({
  url: "https://api.meridiandb.com",
  accessToken: "your-token"
});

// Retrieve memories
const memories = await client.retrieveMemoriesSingleAgent({
  query: "user preferences"
});

// Store new memory
await client.storeMemory({
  agentId: "chatbot-v1",
  content: "User prefers dark mode",
  isFactual: true,
  context: "UI preferences"
});

// Record feedback
await client.recordFeedback({
  success: true,
  memories: ["memory-id-1", "memory-id-2"]
});

Admin Portal

Built with React and Vite, deployable to Cloudflare Pages. The operator UI provides observability, data management, and debugging tools.

Technical Stack

For development/free tier, we provide cfw-poor-man-queue—a lightweight distributed queue implementation that lets you run MeridianDB on Cloudflare's free plan.

Performance & Scalability

  • <500ms retrieval latency including multi-dimensional filtering
  • Global edge deployment for low-latency access worldwide
  • SQL-based scoring for maximum scalability
  • Event-driven updates prevent write-on-read latency penalties
  • Horizontally scalable architecture

Limitations

Being transparent about trade-offs:

  • Eventual consistency: Reads may slightly lag behind writes
  • Manual context: Developers must supply contextual features (auto-generation coming)
  • Storage constraints: D1 has a 10GB limit per database
  • Platform coupling: Optimized for the Cloudflare ecosystem, though swapping D1 for SQLite, Workers for Node.js, Vectorize for ChromaDB, and Cloudflare Queues (or PMQ) for RabbitMQ or Kafka is entirely doable
  • Learning curve: Multi-dimensional retrieval differs from traditional vector search

Getting Started

  1. Clone the repository
   git clone https://github.com/ARAldhafeeri/MeridianDB
   cd MeridianDB
   npm install
  2. Set up Cloudflare resources
   # Create vectorize index
   npx wrangler vectorize create meridiandb --dimensions=768 --metric=cosine

   # Create metadata index for agent isolation
   npx wrangler vectorize create-metadata-index meridiandb --property-name=agentId --type=string
  3. Run migrations
   npm run server:migrations
   npm run server:migrate:local
  4. Start development
   npm run dev
  5. Initialize super admin: hit the /auth/init endpoint to set up admin access

Why This Matters

Cloudflare offers Auto-RAG as a product. But if you want state-of-the-art RAG that actively learns from user behavior, adapts over time, and balances stability with plasticity—try MeridianDB.

The future of AI agents depends on memory systems that don't just store and retrieve, but actively curate knowledge based on utility, recency, and performance. MeridianDB makes this vision practical and deployable today.


Interested in using MeridianDB for your team? Book a meeting to discuss your use case.

Scientific Foundation

MeridianDB's approach is grounded in established research on catastrophic forgetting and the stability-plasticity dilemma.

By combining neuroscience-inspired principles with modern vector databases and edge computing, MeridianDB offers a mathematically grounded solution to one of AI's most challenging problems: building agents that learn continuously without forgetting what matters.
