DEV Community

Arman

I'm building CortexDB — an agent-native context database for AI agents

Most modern RAG systems follow the same pattern:

  1. Split documents into chunks
  2. Compute embeddings
  3. Store them in a vector database
  4. Retrieve top-k similar chunks
  5. Send them to an LLM
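To make the five steps concrete, here is a minimal sketch of that pipeline in plain Python. Everything in it is an illustrative assumption: the bag-of-words `embed` function stands in for a real embedding model, a Python list stands in for the vector database, and `build_prompt` only assembles the text that would be sent to an LLM.

```python
import math
from collections import Counter

def chunk(text, size=8):
    """Step 1: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Step 2: toy bag-of-words 'embedding' (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Step 3: store (chunk, vector) pairs; a list stands in for the vector database."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]

def retrieve(index, query, k=2):
    """Step 4: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

def build_prompt(chunks, question):
    """Step 5: assemble the prompt that would be sent to an LLM."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"

docs = [
    "The solar plant budget is 1.2B KZT",
    "Wind farm construction starts in spring",
]
index = build_index(docs)
top = retrieve(index, "solar plant budget", k=1)
prompt = build_prompt(top, "solar plant budget")
```

Note that nothing in this flow tracks where a chunk came from, who is allowed to see it, or whether it is still current.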

It works. But as AI agents become more autonomous, a clear problem emerges:

Agents don't just need similar chunks.

They need bounded, permission-safe, evidence-aware, and verifiable context.

That's why I'm building CortexDB.

GitHub: https://github.com/AubakirovArman/CortexDB


What is CortexDB?

CortexDB is an experimental agent-native context database.

It's not a traditional vector database.

It's not a key-value store.

It's not just another memory layer on top of embeddings.

The core idea is to store knowledge and agent memory in a way that allows the system to compile a structured Context Pack — a ready-to-use, evidence-aware package of context.


Why classic RAG is often not enough

Classic retrieval often returns raw chunks. This leads to several problems:

  • Duplication
  • Weak provenance
  • Token budget overruns
  • Potential data leakage
  • Ignored contradictions

Example:

  • Document 1: Solar Plant budget is 1.2B KZT
  • Document 2: Solar Plant budget was updated to 1.4B KZT

A classic pipeline may return only the first document, and the agent confidently answers with an outdated number.

CortexDB is designed to detect such conflicts rather than ignore them.
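As a rough illustration of what conflict detection could look like, the sketch below groups claims by subject and flags subjects with more than one distinct value. This is not CortexDB's actual algorithm: the `Claim` type, the `revision` field, and `find_conflicts` are hypothetical names invented for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    doc_id: str
    subject: str
    value: str
    revision: int  # hypothetical ordering field; higher means newer

def find_conflicts(claims):
    """Return subjects that carry more than one distinct value, oldest claim first."""
    by_subject = {}
    for claim in claims:
        by_subject.setdefault(claim.subject, []).append(claim)
    conflicts = {}
    for subject, group in by_subject.items():
        if len({c.value for c in group}) > 1:
            conflicts[subject] = sorted(group, key=lambda c: c.revision)
    return conflicts

claims = [
    Claim("doc-1", "solar_plant_budget", "1.2B KZT", revision=1),
    Claim("doc-2", "solar_plant_budget", "1.4B KZT", revision=2),
]
conflicts = find_conflicts(claims)
latest = conflicts["solar_plant_budget"][-1]  # newest claim for the disputed subject
```

With both claims visible, the agent can be given the newer value together with a note that an older, contradicting one exists.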


Core Feature: ContextPack

The main output of CortexDB is a ContextPack — a structured context package:


```json
{
  "token_budget_tokens": 4000,
  "estimated_tokens": 2500,
  "truncated": false,
  "citations_required": true,
  "cells": [...],
  "anomalies": [...]
}
```
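To show how the budget fields might interact, here is a small sketch that fills a pack until the token budget is reached. The top-level field names mirror the JSON above; everything else is an assumption for illustration, including the shape of a cell (`text`, `source`), the word-count token estimate, and the `compile_pack` function, none of which are taken from CortexDB itself.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ContextPack:
    token_budget_tokens: int
    estimated_tokens: int = 0
    truncated: bool = False
    citations_required: bool = True
    cells: list = field(default_factory=list)
    anomalies: list = field(default_factory=list)

def estimate_tokens(text):
    """Crude estimate: one token per whitespace-separated word (assumption)."""
    return len(text.split())

def compile_pack(candidates, budget):
    """Add candidate cells in order until the next one would exceed the budget."""
    pack = ContextPack(token_budget_tokens=budget)
    for cell in candidates:
        cost = estimate_tokens(cell["text"])
        if pack.estimated_tokens + cost > budget:
            pack.truncated = True  # at least one candidate was left out
            break
        pack.cells.append(cell)
        pack.estimated_tokens += cost
    return pack

# Hypothetical cell shape: a text snippet plus the document it came from.
candidates = [
    {"text": "Solar Plant budget was updated to 1.4B KZT", "source": "doc-2"},
    {"text": "Solar Plant budget is 1.2B KZT", "source": "doc-1"},
]
pack = compile_pack(candidates, budget=10)
print(json.dumps(asdict(pack), indent=2))
```

Keeping `truncated` and `estimated_tokens` in the output lets the caller see at a glance whether anything was dropped to stay within budget.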
