I'm building CortexDB — an agent-native context database for AI agents
Most modern RAG systems follow the same pattern:
- Split documents into chunks
- Compute embeddings
- Store them in a vector database
- Retrieve top-k similar chunks
- Send them to an LLM
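To make the five steps above concrete, here is a minimal sketch of that pipeline in plain Python. Everything in it is an illustrative assumption: the bag-of-words `embed` function stands in for a real embedding model, a Python list stands in for the vector database, and the final step only builds the prompt instead of calling an LLM. The second document is made up for the demo.

```python
import math
from collections import Counter


def chunk(text, size=8):
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Store (chunk, embedding) pairs; the list plays the role of the vector DB."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Assemble the text that would be sent to an LLM."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "Solar Plant budget is 1.2B KZT",
    "Wind Farm construction starts next spring",  # invented filler document
]
index = build_index(docs)
query = "What is the Solar Plant budget"
top = retrieve(index, query, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

Note that nothing in this flow tracks where a chunk came from, how many tokens it costs, or whether another chunk contradicts it.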
It works. But as AI agents become more autonomous, a clear problem emerges:
Agents don't just need similar chunks.
They need bounded, permission-safe, evidence-aware, and verifiable context.
That's why I'm building CortexDB.
GitHub: https://github.com/AubakirovArman/CortexDB
What is CortexDB?
CortexDB is an experimental agent-native context database.
It's not a traditional vector database.
It's not a key-value store.
It's not just another memory layer on top of embeddings.
The core idea is to store knowledge and agent memory so that the system can compile a structured ContextPack: a ready-to-use, evidence-aware package of context.
Why classic RAG is often not enough
Classic retrieval often returns raw chunks. This leads to several problems:
- Duplication
- Weak provenance
- Token budget overruns
- Potential data leakage
- Ignored contradictions
Example:
- Document 1: Solar Plant budget is 1.2B KZT
- Document 2: Solar Plant budget was updated to 1.4B KZT
A classic pipeline may return only the first document, and the agent confidently answers with an outdated number.
CortexDB is designed to surface such conflicts to the agent rather than silently ignore them.
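As a rough illustration of what conflict handling could look like for the budget example, the sketch below groups facts by subject and attribute, keeps the most recent value, and records an anomaly when the values disagree. This is not CortexDB's actual implementation: the `Fact` type, the `resolve` function, the anomaly shape, and the dates are all hypothetical and exist only to show the idea.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    """Hypothetical record shape: one claim extracted from one document."""
    subject: str
    attribute: str
    value: str
    source: str
    as_of: date  # invented dates; the original example gives none


def resolve(facts):
    """Keep the newest fact per (subject, attribute) and report disagreements."""
    grouped = defaultdict(list)
    for fact in facts:
        grouped[(fact.subject, fact.attribute)].append(fact)

    current, anomalies = [], []
    for key, group in grouped.items():
        group.sort(key=lambda f: f.as_of)
        current.append(group[-1])  # latest value wins
        if len({f.value for f in group}) > 1:
            anomalies.append({
                "key": key,
                "kind": "conflicting_values",
                "values": [f.value for f in group],
                "sources": [f.source for f in group],
            })
    return current, anomalies


facts = [
    Fact("Solar Plant", "budget", "1.2B KZT", "Document 1", date(2024, 1, 10)),
    Fact("Solar Plant", "budget", "1.4B KZT", "Document 2", date(2024, 6, 1)),
]
current, anomalies = resolve(facts)
print(current[0].value)
print(anomalies)
```

The point of the design is that the older value is not discarded silently: the agent receives the current number together with a record that an earlier source said something different.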
Core Feature: ContextPack
The main output of CortexDB is a ContextPack — a structured context package:
```json
{
  "token_budget_tokens": 4000,
  "estimated_tokens": 2500,
  "truncated": false,
  "citations_required": true,
  "cells": [...],
  "anomalies": [...]
}
```
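One way to picture how such a package might be assembled is a greedy packer that drops duplicates and stops adding cells once the token budget is reached. The sketch below reuses the field names from the JSON above, but the rest is assumed: the cell shape (`text` plus `source`), the `compile_pack` function, and the one-token-per-word estimate are hypothetical and do not describe CortexDB's real API.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ContextPack:
    """Mirrors the fields of the JSON example; cell contents are assumed."""
    token_budget_tokens: int
    estimated_tokens: int = 0
    truncated: bool = False
    citations_required: bool = True
    cells: list = field(default_factory=list)
    anomalies: list = field(default_factory=list)


def estimate_tokens(text):
    """Crude assumption: about one token per whitespace-separated word."""
    return len(text.split())


def compile_pack(candidates, budget):
    """Greedily fill the pack from candidates assumed to be sorted by relevance."""
    pack = ContextPack(token_budget_tokens=budget)
    seen = set()
    for cell in candidates:
        key = cell["text"].strip().lower()
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        cost = estimate_tokens(cell["text"])
        if pack.estimated_tokens + cost > budget:
            pack.truncated = True  # something relevant did not fit
            continue
        pack.cells.append(cell)
        pack.estimated_tokens += cost
    return pack


candidates = [
    {"text": "Solar Plant budget was updated to 1.4B KZT", "source": "Document 2"},
    {"text": "Solar Plant budget was updated to 1.4B KZT", "source": "Document 2"},
    {"text": "Solar Plant budget is 1.2B KZT", "source": "Document 1"},
]
pack = compile_pack(candidates, budget=10)
print(json.dumps(asdict(pack), indent=2))
```

With a budget of 10, the duplicate is dropped, the first cell is kept, and the third no longer fits, so `truncated` is set to tell the agent that its context is incomplete.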