Mohammed Khan

I Built an Open Source AI Memory Layer. The Legacy File System Will Eventually Die.

Your file system is from 1970.

Files. Folders. Drives. Keyword search. That's it. That's what we're still using to manage our entire digital lives in 2026.

Meanwhile, video and audio are completely unsearchable. Photos you took five years ago might as well not exist. Documents from old projects? Good luck. The moment you stop remembering where you put something, it's gone.

That's the problem I built Omnex to solve.

What Is Omnex?

Omnex is a self-hosted, local-first AI memory layer. It indexes everything — documents, images, audio, video, code — using purpose-built embedding models for each file type. Then you query it in plain language.

Not keyword search. Memory.

"Find the contract I signed around the time we moved."
"Show me photos with my sister from the Cape Town trip."
"Pull up the authentication code I wrote last year."

Everything stays on your machine. No cloud required for indexing or search. Your data never leaves.

How It Works

Every file goes through a pipeline:

  • Text/docs → MiniLM-L6-v2 sentence embeddings
  • Images → CLIP ViT-B/32 visual embeddings + moondream2 captions (images become text-searchable by what they actually depict)
  • Audio/video → Whisper transcription — every spoken word indexed
  • Code → CodeBERT semantic embeddings
  • Faces → InsightFace ArcFace + DBSCAN clustering. Name a cluster once, recall every photo of that person forever.
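The routing step above can be sketched as a simple extension-to-model dispatch. The model names come from the list; the table and helper below are illustrative, not Omnex's actual code:

```python
from pathlib import Path

# Hypothetical dispatch table -- maps file extensions to the embedding
# model named in the pipeline above (not Omnex's real internals).
EMBEDDERS = {
    ".txt": "MiniLM-L6-v2",
    ".md": "MiniLM-L6-v2",
    ".pdf": "MiniLM-L6-v2",
    ".jpg": "CLIP ViT-B/32 + moondream2",
    ".png": "CLIP ViT-B/32 + moondream2",
    ".mp3": "Whisper",
    ".mp4": "Whisper",
    ".py": "CodeBERT",
    ".ts": "CodeBERT",
}

def pick_embedder(path: str, default: str = "MiniLM-L6-v2") -> str:
    """Route a file to its embedding model by extension."""
    return EMBEDDERS.get(Path(path).suffix.lower(), default)

print(pick_embedder("trip/beach.jpg"))      # CLIP ViT-B/32 + moondream2
print(pick_embedder("auth/middleware.py"))  # CodeBERT
```

Unknown extensions fall back to the text embedder, so nothing is left unindexed.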

Chunks land in a USearch i8-quantised vector index + MongoDB. Queries run through a LangGraph state graph — score-based routing, no LLM classifier, no misrouting.
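i8 quantisation stores each embedding dimension as a signed byte instead of a float32, cutting the index to roughly a quarter of the size at a small recall cost. USearch handles this internally; the sketch below only illustrates the idea:

```python
def quantise_i8(vec):
    """Scale a float vector by its largest component and round each
    dimension into the signed 8-bit range [-127, 127]."""
    scale = max(abs(x) for x in vec) or 1.0
    return [round(127 * x / scale) for x in vec], scale

def dequantise_i8(qvec, scale):
    """Approximate reconstruction of the original floats."""
    return [scale * q / 127 for q in qvec]

q, s = quantise_i8([0.12, -0.98, 0.45])
print(q)  # [16, -127, 58]
```

Cosine similarity is fairly tolerant of this rounding, which is why quantised indexes work well for semantic search.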

The query engine also extracts structured signals before searching: date ranges, file type hints, device names, GPS regions. These become MongoDB match conditions alongside the vector search.
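A minimal sketch of that extraction step, assuming hypothetical field names ("mtime", "ext") rather than Omnex's actual schema:

```python
import re
from datetime import datetime

def extract_filters(query: str) -> dict:
    """Pull structured signals out of a natural-language query and turn
    them into MongoDB match conditions (illustrative only)."""
    filters = {}
    # Year mention -> date-range condition on modification time.
    m = re.search(r"\b(19|20)\d{2}\b", query)
    if m:
        year = int(m.group(0))
        filters["mtime"] = {
            "$gte": datetime(year, 1, 1),
            "$lt": datetime(year + 1, 1, 1),
        }
    # "photo(s)" -> restrict to image extensions.
    if re.search(r"\bphotos?\b", query, re.I):
        filters["ext"] = {"$in": [".jpg", ".png", ".heic"]}
    return filters

print(extract_filters("photos from the 2021 trip"))
```

The resulting dict drops straight into a MongoDB `find()` or aggregation `$match` stage alongside the vector hits, narrowing the candidate set before ranking.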

It's Not Just a Personal Tool

Omnex has a full agent memory API. AI agents — Claude, GPT-4, Cursor, Windsurf, any MCP-compatible tool — can read from and write to the same index you use.

POST /agents/observe
{
  "agent_id": "research-agent",
  "content": "User prefers concise technical summaries.",
  "broadcast": true
}

Multiple Omnex instances can federate: a single query fans out to all active peers, results are merged by cosine score, and each result is annotated with its origin instance. The effect is a swarm of agents sharing distributed semantic memory across machines, users, or organisations.
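The fan-in step is essentially a top-k merge over per-peer result lists. A sketch, assuming each peer returns (score, document) pairs; the payload shape and function name are illustrative:

```python
import heapq

def merge_federated(peer_results: dict[str, list[tuple[float, str]]], k: int = 5):
    """Tag each hit with its origin instance, then keep the global top-k
    by cosine score across all peers."""
    tagged = [
        (score, doc, origin)
        for origin, hits in peer_results.items()
        for score, doc in hits
    ]
    return heapq.nlargest(k, tagged, key=lambda t: t[0])

peers = {
    "laptop": [(0.91, "contract.pdf"), (0.72, "notes.md")],
    "nas":    [(0.88, "scan_0042.png")],
}
print(merge_federated(peers, k=2))
# [(0.91, 'contract.pdf', 'laptop'), (0.88, 'scan_0042.png', 'nas')]
```

Because every hit carries its origin, the UI can show which machine or organisation answered without a second round trip.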

The MCP server exposes: omnex_search, omnex_remember, omnex_search_federated, and more.

The Stack

Layer         Tech
------------  ----------------------------------------------------
Backend       Python 3.11, FastAPI
Vector index  USearch (i8 quantised)
Metadata      MongoDB 7
Query engine  LangGraph StateGraph
Frontend      Next.js 14, React 18, TypeScript
Voice         Whisper input + Chatterbox Turbo / Kokoro ONNX output
Virtual FS    FUSE — mounts at /mnt/omnex
LLM           Anthropic / OpenAI / Ollama — swappable at runtime
Infra         Docker Compose

One docker compose up and it's running.

Why Build This?

Big tech is building its own versions: Google Photos. iCloud. Microsoft Recall. Cloud-first. Walled gardens. Trained on your data.

The open alternative needs to exist.

Omnex is local. Private. Agentic. Federated. AGPL-3.0 — stays open forever.

AI agents are about to operate at scale on our behalf. The data layer underneath them needs to be rebuilt from scratch. Not another search engine with an AI veneer. A genuine memory substrate that humans and agents share equally.

The file system had a good run. It's done.

Get Started

git clone https://github.com/sup3rus3r/omnex
cd omnex
# Add your .env with ANTHROPIC_API_KEY
docker compose up --build -d

Open localhost:3007, drop files into Ingest, start querying in Recall.

If this solves a problem you have — star it on GitHub. If you want to contribute (Python, TypeScript, Go, ML) — read docs/ARCHITECTURE.md and open an issue.
