I'm building CortexDB — an agent-native context database for AI agents

#rust #database #ai #rag

I'm building CortexDB — an agent-native context database for AI agents

Most modern RAG systems follow the same pattern:

Split documents into chunks
Compute embeddings
Store them in a vector database
Retrieve top-k similar chunks
Send them to an LLM

It works. But as AI agents become more autonomous, a clear problem emerges:

Agents don't just need similar chunks.

They need bounded, permission-safe, evidence-aware, and verifiable context.

That's why I'm building CortexDB.

GitHub: https://github.com/AubakirovArman/CortexDB

What is CortexDB?

CortexDB is an experimental agent-native context database.

It's not a traditional vector database.

It's not a key-value store.

It's not just another memory layer on top of embeddings.

The core idea is to store knowledge and agent memory in a way that allows the system to compile a structured Context Pack — a ready-to-use, evidence-aware package of context.

Why classic RAG is often not enough

Classic retrieval often returns raw chunks. This leads to several problems:

Duplication
Weak provenance
Token budget overruns
Potential data leakage
Ignored contradictions

Example:

Document 1: Solar Plant budget is 1.2B KZT
Document 2: Solar Plant budget was updated to 1.4B KZT

A classic pipeline may return only the first document, and the agent confidently answers with an outdated number.

CortexDB is designed to handle such conflicts properly.

Core Feature: ContextPack

The main output of CortexDB is a ContextPack — a structured context package:


json
{
  "token_budget_tokens": 4000,
  "estimated_tokens": 2500,
  "truncated": false,
  "citations_required": true,
  "cells": [...],
  "anomalies": [...]
}