DEV Community

Discussion on: I Built a Local AI Agent That Actually Remembers You — Here's How the River Algorithm Works

Harsh

This is literally the future I've been waiting for someone to build. 🌟 The 'data never leaves your device' part is what every privacy-conscious dev dreams about. I've been thinking about this problem too — how do you handle the vector database size over time? Like if someone uses this for 2-3 years, doesn't the local storage become massive? Really curious about how the River Algorithm tackles that. Following this project closely — please keep posting updates! 🔥

collen w

Storage isn't really a concern here. The vector database only holds embeddings for active data — current profile
facts, recent observations (capped at 500), and the latest 200 conversation turns. When a fact gets closed or an event
expires, its embedding is cleaned up automatically. So the vector DB size scales with how complex your life is, not
how long you've been using it.
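
To make the bounded-size idea concrete, here is a minimal sketch assuming a simple in-memory stand-in for the vector DB. The class and method names (`ActiveEmbeddingIndex`, `close_fact`, and so on) are hypothetical and are not the project's actual API; only the two caps, 500 observations and 200 turns, come from the comment above.

```python
from collections import deque

MAX_OBSERVATIONS = 500  # cap quoted in the comment above
MAX_TURNS = 200         # cap quoted in the comment above


class ActiveEmbeddingIndex:
    """Hypothetical stand-in for the vector DB: holds embeddings for active data only."""

    def __init__(self):
        # fact_id -> embedding, for currently open profile facts only
        self.facts = {}
        # deque(maxlen=...) silently evicts the oldest entry once the cap is hit
        self.observations = deque(maxlen=MAX_OBSERVATIONS)
        self.turns = deque(maxlen=MAX_TURNS)

    def add_fact(self, fact_id, embedding):
        self.facts[fact_id] = embedding

    def close_fact(self, fact_id):
        # Closing a fact removes its embedding, so closed facts cost nothing
        self.facts.pop(fact_id, None)

    def add_observation(self, embedding):
        self.observations.append(embedding)

    def add_turn(self, embedding):
        self.turns.append(embedding)

    def size(self):
        # Total number of embeddings currently stored
        return len(self.facts) + len(self.observations) + len(self.turns)
```

In this sketch, feeding in thousands of turns and observations never pushes `size()` past 700 plus the number of open facts, which is the "scales with how complex your life is, not how long you've been using it" property in miniature.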

The raw conversation archive does grow indefinitely (append-only by design), but that's just plain text in PostgreSQL
— 10,000 sessions is maybe 10-20MB. Even after 2-3 years of daily use, you're probably looking at a few hundred MB
total. Not exactly "massive" by modern standards.
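
As a rough back-of-envelope check on those numbers: 10-20MB for 10,000 sessions works out to about 1-2KB of text per session. The sketch below projects that forward. The sessions-per-day figure is purely an assumption for illustration, and the helper name `archive_mb` is hypothetical.

```python
def archive_mb(sessions, kb_per_session):
    """Estimated raw-text archive size in MB (taking 1 MB = 1000 KB)."""
    return sessions * kb_per_session / 1000


# Derived from the comment: 10-20 MB per 10,000 sessions
KB_PER_SESSION_LOW = 1.0
KB_PER_SESSION_HIGH = 2.0

# Assumption, not from the comment: how often someone actually chats
SESSIONS_PER_DAY = 5
YEARS = 3

sessions = SESSIONS_PER_DAY * 365 * YEARS
low = archive_mb(sessions, KB_PER_SESSION_LOW)
high = archive_mb(sessions, KB_PER_SESSION_HIGH)
print(f"{sessions} sessions: roughly {low:.1f} to {high:.1f} MB of raw text")
```

Under these assumptions the raw text stays in the low tens of MB at most, well inside the "few hundred MB total" ceiling mentioned above.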