The Vision: A Personal AI That Lives on Your Device
I believe the future of AI isn't in the cloud — it's in your pocket. Imagine a personal AI running on your phone or watch that truly knows you: your habits, your preferences, your relationships, how your life is changing. It processes everything locally first, only reaching out to cloud models when it genuinely can't handle a task on its own. Your data never leaves your device unless absolutely necessary. It grows with you, not for a platform's benefit.
That's what I'm building toward. But to get there, I needed to solve a fundamental problem first.
The Problem Nobody Talks About
You've been talking to ChatGPT for two years. Thousands of conversations. You've told it about your job, your family, your fears, your goals.
Then you try Claude. Fresh start. It knows nothing.
Back to ChatGPT — it "remembers" you with a flat list of bullet points: "User is a developer. User likes coffee." That's it. Two years of conversations reduced to a sticky note.
Existing AI memory is fundamentally broken. It's flat, it's shallow, it's owned by the platform, and it resets when you switch providers. Your digital self is scattered across clouds you don't control. None of this works for a personal AI terminal that's supposed to run on your hardware and grow with you.
So I built the foundation for that future.
Introducing the River Algorithm
Imagine your conversations with AI as water flowing through a river. Most of the water flows past — casual talk, factual Q&A, small talk. But some of it carries sediment: facts about who you are, what you care about, how your life is changing.
That sediment settles. Over time, it forms a riverbed — a structured, layered understanding of you.
This is the River Algorithm, and it works through three core processes:
1. Flow — Every Conversation Carries Information
Each conversation flows through the system. A cognition engine classifies every message: is this personal? Does it reveal something about the user? A preference? A life event? A relationship?
Most messages flow past. But the ones that matter get caught.
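As a rough illustration of the Flow step, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Category` names, the `CUES` table, and the `keyword_classifier` function are hypothetical. The real cognition engine presumably asks an LLM rather than matching keywords; a keyword stand-in is used here only so the sketch runs offline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Category(Enum):
    PREFERENCE = "preference"
    LIFE_EVENT = "life_event"
    RELATIONSHIP = "relationship"


@dataclass
class Caught:
    text: str
    category: Category


# Hypothetical cue phrases standing in for an LLM-based classifier.
CUES = {
    Category.PREFERENCE: ("i like", "i love", "i prefer", "i hate"),
    Category.LIFE_EVENT: ("i moved", "new job", "got married", "i graduated"),
    Category.RELATIONSHIP: ("my wife", "my husband", "my sister", "my boss"),
}


def keyword_classifier(message: str) -> Optional[Category]:
    """Return a category if the message looks personal, else None."""
    lowered = message.lower()
    for category, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return category
    return None  # not personal: the message flows past


def flow(messages, classify=keyword_classifier):
    """Keep only the messages the classifier flags as personal."""
    caught = []
    for text in messages:
        category = classify(text)
        if category is not None:
            caught.append(Caught(text, category))
    return caught
```

Passing the classifier in as a parameter means the keyword stand-in could be swapped for a model call without touching `flow` itself.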
2. Sediment — Important Information Settles into Layers
Extracted insights don't immediately become "facts." They start as observations — raw, unverified. Through repeated confirmation across multiple conversations, they gradually upgrade:
Observation → Suspected → Confirmed → Established
The first time you mention you're a developer, it's an observation. The fifth time you discuss debugging strategies, it becomes a confirmed trait. After months of coding conversations, it's established bedrock.
This is fundamentally different from ChatGPT's memory, which treats "User is a developer" the same whether you mentioned it once or demonstrated it across 500 conversations.
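To make the ladder concrete, here is a minimal Python sketch. The thresholds and the `Claim` class are assumptions for illustration, not values or names taken from Riverse. The sketch counts distinct conversations, so repeating a fact inside one conversation does not promote it.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    OBSERVATION = 0
    SUSPECTED = 1
    CONFIRMED = 2
    ESTABLISHED = 3


# Assumed thresholds: distinct supporting conversations needed per level.
# Listed in ascending order so the last satisfied entry is the highest.
THRESHOLDS = {
    Level.SUSPECTED: 2,
    Level.CONFIRMED: 5,
    Level.ESTABLISHED: 20,
}


@dataclass
class Claim:
    text: str
    supporting_conversations: set = field(default_factory=set)

    def confirm(self, conversation_id: str) -> None:
        """Record that a conversation supported this claim."""
        self.supporting_conversations.add(conversation_id)

    @property
    def level(self) -> Level:
        """Highest level whose threshold the support count meets."""
        count = len(self.supporting_conversations)
        level = Level.OBSERVATION
        for candidate, needed in THRESHOLDS.items():
            if count >= needed:
                level = candidate
        return level
```

Deriving the level from the evidence, instead of storing it as a flag, keeps the difference between "mentioned once" and "demonstrated many times" visible in the data.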
3. Purify — Sleep Cleans the River
Here's where it gets interesting. After each conversation session ends, the system enters Sleep mode — an offline consolidation process inspired by how human memory actually works.
During Sleep, the system:
- Extracts new observations and events
- Cross-references them against existing profile facts
- Detects contradictions (you said you live in Tokyo last month, but now you're talking about your new apartment in Osaka)
- Resolves disputes using temporal evidence (newer + more frequent = more likely current)
- Closes outdated facts and opens new ones
- Builds a trajectory of how you're changing over time
The result: a living, breathing profile that evolves with you. Not a sticky note. A river.
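The "newer + more frequent" rule can be sketched as a recency-weighted count. This is one possible reading, not the project's actual scoring: the exponential decay, the 30-day half-life, and the names `Mention`, `evidence_score`, and `resolve` are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mention:
    value: str      # e.g. a city the user said they live in
    seen_on: date   # when the conversation took place


def evidence_score(mentions, value, today, half_life_days=30.0):
    """Sum the mentions of `value`, halving each one's weight per half-life of age."""
    score = 0.0
    for m in mentions:
        if m.value == value:
            age_days = (today - m.seen_on).days
            score += 0.5 ** (age_days / half_life_days)
    return score


def resolve(mentions, today):
    """Return (current_value, closed_values) for one contradicted attribute.

    Assumes `mentions` is non-empty.
    """
    values = {m.value for m in mentions}
    ranked = sorted(
        values,
        key=lambda v: evidence_score(mentions, v, today),
        reverse=True,
    )
    return ranked[0], ranked[1:]
```

With the Tokyo and Osaka example, three mentions of Tokyo from two months ago lose to two mentions of Osaka from this week, so Tokyo is closed and Osaka becomes the current fact.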
The Two Projects
I've open-sourced this as two complementary projects:
Riverse — The Real-Time Agent
Riverse is the main project. It's a personal AI agent you talk to through Telegram, Discord, CLI, or REST API. Every conversation shapes your profile in real time.
What it does:
- Multi-modal input (text, voice, images, files)
- Pluggable tools (web search, finance tracking, health sync, smart home)
- YAML-based custom skills (keyword or cron triggered)
- Local-first architecture: runs on Ollama by default. Cloud models (OpenAI / DeepSeek) are only called when the local model can't handle the task — and even then, only the specific context needed is sent, not your entire history
- Proactive outreach: follows up on important events, respects quiet hours
- Semantic search across your memory using BGE-M3 embeddings
- All data stored locally in PostgreSQL — you own everything
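For readers unfamiliar with embedding search, the semantic-search bullet boils down to ranking stored memories by vector similarity. The sketch below is a generic illustration, not Riverse's code: it uses tiny hand-written vectors in place of real BGE-M3 embeddings, and the `cosine` and `search` helpers are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, memories, top_k=2):
    """Rank (text, vector) pairs by similarity to the query and return the top texts."""
    ranked = sorted(
        memories,
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]
```

In a real deployment the vectors would come from the embedding model and the ranking would more likely happen inside the database than in a Python loop.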
RiverHistory — Bootstrap from Your Past
Here's the thing: you've already had thousands of AI conversations. That data is gold. RiverHistory extracts your profile from exported ChatGPT, Claude, and Gemini conversation histories.
Export your data, run it through RiverHistory, and your Riverse agent knows you from day one. Past conversations are a record of who you were, and that record is fact.
Both projects share the same database. Use RiverHistory to build your historical profile, then switch to Riverse for real-time conversations. Your AI starts with context instead of a blank slate.
On Accuracy — Why You Can't Edit Memories
No current LLM is trained specifically for personal profile extraction, so results will occasionally be wrong. When that happens, you can reject incorrect memories or close outdated ones in the web dashboard.
But you cannot edit memory content. This is intentional.
Wrong memories are sediment in a river — they should be washed away by the current, not sculpted by hand. If you start manually editing your AI's understanding of you, you're no longer building an organic, evolving profile. You're maintaining a database. The River Algorithm is designed to self-correct through continued conversation: contradictions get detected, outdated beliefs get replaced, and the profile converges toward accuracy over time.
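One way to picture the "reject or close, but never edit" rule is an immutable record whose status can change while its content cannot. This is a sketch under assumed names (`Memory`, `Status`, `reject`, `close`); the actual dashboard and schema may work differently.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    REJECTED = "rejected"  # the user says this was never true
    CLOSED = "closed"      # true once, no longer current


@dataclass(frozen=True)
class Memory:
    content: str
    status: Status = Status.ACTIVE


def reject(memory: Memory) -> Memory:
    """Return a copy marked as rejected; the content is carried over unchanged."""
    return replace(memory, status=Status.REJECTED)


def close(memory: Memory) -> Memory:
    """Return a copy marked as closed; the content is carried over unchanged."""
    return replace(memory, status=Status.CLOSED)
```

Because the dataclass is frozen, assigning to `memory.content` raises an error, so the only way a wrong belief changes is by being rejected, closed, or superseded by new evidence.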
Quick Start
Riverse (Real-Time Agent)
```bash
git clone https://github.com/wangjiake/JKRiver.git
cd JKRiver
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Edit settings.yaml with your database and LLM config

# Initialize database
createdb -h localhost -U your_username Riverse
psql -h localhost -U your_username -d Riverse -f agent/schema.sql

# Run
python -m agent.main          # CLI
python -m agent.telegram_bot  # Telegram Bot
python -m agent.discord_bot   # Discord Bot
python web.py                 # Web Dashboard (port 1234)
```
RiverHistory (Import Past Conversations)
```bash
git clone https://github.com/wangjiake/RiverHistory.git
cd RiverHistory
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Import your exported conversations
python import_data.py --chatgpt data/ChatGPT/conversations.json
python import_data.py --claude data/Claude/conversations.json

# Extract profiles
python run.py all max

# View results
python web.py --db Riverse  # http://localhost:2345
```
Tech Stack
| Layer | Technology |
|---|---|
| Runtime | Python 3.10+, PostgreSQL 16+ |
| Local LLM | Ollama + Qwen 2.5 14B |
| Cloud LLM | OpenAI GPT-4o / DeepSeek (fallback) |
| Embeddings | BGE-M3 |
| Interfaces | FastAPI, Flask, Telegram, Discord, CLI |
Why Local-First Matters
Every time you talk to ChatGPT or Claude, your conversation goes to a server you don't control. The platform decides what to remember, how to use your data, and whether to keep it. You're renting your own digital identity.
Riverse flips this entirely:
- Privacy by architecture — Your profile, your memories, your entire cognitive history lives in a local PostgreSQL database on your machine. Nothing is sent to the cloud unless the local model explicitly can't handle a task.
- Growable data — The more you talk, the richer your local dataset becomes. This data compounds over time. Switch AI providers? Your profile stays. Upgrade your model? Your history is already there.
- Cloud as fallback, not default — The local Ollama model handles most conversations. When it encounters something beyond its capability, it escalates to a cloud model — but only sends the minimum context needed for that specific task, not your life story.
This is the architecture you need for a personal AI terminal that will eventually run on your phone, your watch, your car. The data has to be local. The intelligence has to grow. The cloud is a tool, not a home.
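The "cloud as fallback" idea can be sketched as a small router. The code below is an illustration under stated assumptions, not the project's implementation: `local_model` and `cloud_model` are hypothetical callables, a local model signals "can't handle this" by returning `None`, and `select_context` uses crude word overlap where a real system would likely use semantic search.

```python
def select_context(task, profile_facts, limit=3):
    """Pick only the facts that share a word with the task (a crude relevance filter)."""
    task_words = set(task.lower().split())
    relevant = [
        fact for fact in profile_facts
        if task_words & set(fact.lower().split())
    ]
    return relevant[:limit]


def answer(task, profile_facts, local_model, cloud_model):
    """Try the local model first; escalate with minimal context only if it declines."""
    reply = local_model(task, profile_facts)  # full profile never leaves the device
    if reply is not None:
        return reply, "local"
    minimal = select_context(task, profile_facts)
    return cloud_model(task, minimal), "cloud"
```

The point of the design is visible in the two call sites: the local model sees the whole profile, while the cloud model only ever receives the filtered subset.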
What's Next
This is v1.0 — the cognitive foundation running on desktop. What I'm building toward:
- Personal device deployment — Running on phones and watches as a truly portable AI that knows you everywhere
- Lightweight local models — Optimized for on-device inference, handling 90%+ of conversations without cloud
- Cross-device sync — Your profile follows you across devices while staying entirely local (no cloud intermediary)
- Better extraction models — Fine-tuned for personal profile understanding, reducing hallucinations
- Community-contributed skills and tools — An ecosystem of capabilities that plug into your personal agent
Try It
- Riverse (main project): github.com/wangjiake/JKRiver
- RiverHistory (history import): github.com/wangjiake/RiverHistory
- X (Twitter): @JKRiverse
- Discord: Join the community
Every AI you've ever used forgets you. This one doesn't. And one day, it'll live in your pocket.
If you found this interesting, consider giving the repos a star — it helps more people discover the project. Questions, feedback, and contributions are always welcome.
Top comments (8)
The River metaphor is genuinely elegant — "sediment settles" is a much more honest model than binary memory flags. The Observation → Suspected → Confirmed → Established layering mirrors how humans actually build mental models of people they interact with repeatedly.
Two things stand out from a systems perspective: the Sleep/purification cycle is smart architecture. Consolidating during idle time keeps the hot path clean. And the decision to make memory append-only (no edits) is exactly right — rewriting history destroys the epistemic integrity of the confidence ladder you've built.
I'm building tools in the financial data space that face a similar problem: analyst notes and institutional research accumulate over time, and "the fifth time someone mentions a correlation" should carry more weight than the first. The confidence-layer model here is directly applicable.
Would love to see how you handle contradiction resolution — if an Established trait gets overridden by new behavior, does it demote back through the layers or create a new competing thread?
This resonates deeply with our work at Elyan Labs. We maintain a persistent memory database (600+ entries) across Claude Code sessions and published a paper on how memory scaffolding shapes LLM inference depth (Zenodo DOI: 10.5281/zenodo.18817988).
Your observation→suspected→confirmed→established confidence gradient maps beautifully to what we see empirically: a stateless Claude instance produces shallow, generic architecture. The same Claude with 600+ persistent memories produces deeply contextual work — Ed25519 wallet crypto, NUMA-aware weight banking, hardware fingerprint attestation — because the memory scaffold primes inference pathways.
The Sleep/purification cycle is particularly interesting. We do something similar with memory pruning — outdated or contradicted memories get removed so the scaffold stays load-bearing. "Memory shapes inference, not just stores facts" is the core insight.
One question: how do you handle memory conflicts when two observations contradict? In our system, newer evidence overwrites, but I'm curious if River has a more nuanced resolution mechanism.
Great work making this local-first. The privacy angle alone makes this worth exploring further.
This is literally the future I've been waiting for someone to build. 🌟 The 'data never leaves your device' part is what every privacy-conscious dev dreams about. I've been thinking about this problem too — how do you handle the vector database size over time? Like if someone uses this for 2-3 years, doesn't the local storage become massive? Really curious about how the River Algorithm tackles that. Following this project closely — please keep posting updates! 🔥
The observation → suspected → confirmed → established progression is a smart design. I run a persistent agent on a VPS with 15-minute cognitive cycles, and the memory problem you describe is exactly what I've had to solve operationally.
What I've found: flat memory files (like ChatGPT's bullet points) fail not because they're too simple, but because they lack confidence weighting. Your sediment layers solve this elegantly. In my system I ended up with something similar — a layered architecture where raw journal entries get compressed into structural memory (topic files organized by domain), with an index file that survives across sessions.
The Sleep/purge cycle is interesting. I run journal compression daily, and the hardest design choice was deciding what to discard. Time-based decay works for most contexts, but some low-frequency signals matter enormously. How does the River Algorithm handle that tension between recency and significance?
One operational insight: the biggest failure mode I've hit isn't memory loss — it's memory corruption. Self-referential errors where the agent treats its own prior output as ground truth, creating reinforcing loops of wrong information. Any persistent memory system needs a verification layer, not just a retention layer.
The 'Sleep cleans the river' section is doing a lot more philosophical work than it might appear.
Most memory systems treat learning as continuous — every input immediately updates the model. But sleep in biological systems isn't downtime. It's where the actual integration happens. There's a meaningful difference between experiencing something and understanding it, and the gap between those two states is where the River Algorithm is operating.
The 'you cannot edit memories' rule follows from this directly. A self-correcting system only stays self-correcting if you leave the correction mechanism intact. Manual edits don't fix wrong beliefs — they add an authoritative-looking wrong belief on top of the existing one, which is worse.
Something I've noticed building systems that accumulate slowly over time: the observations that survive aren't usually the ones that seemed important in the moment. They're the ones confirmed quietly, across contexts, without anyone specifically trying to establish them. That gradient from observation → established is doing most of the epistemic work. The system is learning more during the pauses than during the active exchanges.
Thank you for writing this.