DEV Community

prabhdeep


OpenKairos: Open Implementation of the Leaked KAIROS Architecture

The Context

The most interesting part of the leak wasn’t model weights or APIs—it was architecture.
Specifically, the idea of a persistent daemon: a system that observes, reacts, and schedules actions without explicit user prompts. Think less “chatbot,” more “background intelligence layer.”
That concept stuck with me.

The Build Timeline

I started building on April 19 with a simple constraint:
No massive infra. No hidden magic. Just reproducible components.
The goal wasn’t to copy anything—it was to see if the pattern could be rebuilt from scratch.
Nine days later, I had a working prototype.

The Stack (What actually matters)

Python + Asyncio → event loop for continuous execution
Watchdog → filesystem + environment triggers
Ollama → local model inference (no external API dependency)
Task Scheduler Layer → priority + interrupt handling
3-Layer Memory System:
Short-term (context window)
Mid-term (session logs)
Long-term (vector store)
Everything runs as a daemon process—not a request/response server.
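The task scheduler layer can be sketched with nothing more than the stdlib. This is an illustrative toy, not the OpenKairos code: tasks carry a numeric priority (lower = more urgent), and a `"shutdown"` sentinel stops the loop, so an interrupt queued late still runs before routine work.

```python
import asyncio

async def scheduler(queue: asyncio.PriorityQueue, results: list) -> None:
    """Drain tasks in priority order; a stand-in for the scheduler layer."""
    while True:
        priority, name = await queue.get()
        if name == "shutdown":            # sentinel: stop the daemon loop
            queue.task_done()
            break
        results.append((priority, name))  # stand-in for actually executing the task
        queue.task_done()

async def main() -> list:
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    results: list = []
    # Routine work is queued first, but the interrupt (priority 0) runs first.
    await queue.put((5, "summarize-session"))
    await queue.put((0, "handle-file-change"))
    await queue.put((9, "shutdown"))
    worker = asyncio.create_task(scheduler(queue, results))
    await queue.join()
    await worker
    return results

print(asyncio.run(main()))
# → [(0, 'handle-file-change'), (5, 'summarize-session')]
```

A real version would replace the `results.append` stub with dispatch into Watchdog handlers or an Ollama inference call, but the priority-plus-sentinel shape is the same.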

Core Design Idea

Instead of:

User → Prompt → Response

It works like:

System Loop → Observe → Decide → Act → Store → Repeat

That shift changes everything:

  1. Latency expectations
  2. Memory handling
  3. Failure modes
  4. Resource management
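The loop shape above can be sketched in a few lines. Names here are illustrative, not the actual OpenKairos API: each tick observes an event source, decides whether there is anything to act on, acts, and stores the result before repeating.

```python
import asyncio

async def run_loop(events: list, memory: list, max_ticks: int) -> list:
    """Observe → Decide → Act → Store, repeated for a fixed number of ticks."""
    for _ in range(max_ticks):
        observation = events.pop(0) if events else None  # Observe
        if observation is None:                          # Decide: nothing to do
            await asyncio.sleep(0)                       # yield, then Repeat
            continue
        action = f"handled:{observation}"                # Act (stub)
        memory.append(action)                            # Store
    return memory

events = ["file_changed", "timer_fired"]
print(asyncio.run(run_loop(events, [], max_ticks=4)))
# → ['handled:file_changed', 'handled:timer_fired']
```

Two of the four ticks are idle here, which is the normal case for a daemon: most iterations observe nothing and just yield back to the event loop.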

The Weird Part: “AutoDream”

The hardest problem wasn’t inference—it was memory.

I ended up building something I call AutoDream:

  • Runs periodically (or during idle windows)
  • Compresses recent interactions
  • Promotes useful patterns into long-term memory
  • Drops noise

The constraint:

Must complete within ~15 seconds or get killed by the scheduler.

This forced aggressive tradeoffs:

  • summarization vs fidelity
  • frequency vs cost
  • stability vs adaptability

Still not fully solved.
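The deadline-enforcement part of this is straightforward with `asyncio.wait_for`. This is a hypothetical sketch, not the actual AutoDream code, with the budget shrunk from ~15 seconds to fractions of a second for demonstration: the consolidation pass is cancelled if it overruns, and memory is left unchanged.

```python
import asyncio

async def consolidate(interactions: list, delay: float) -> list:
    """Stand-in for summarization: keep 'useful' items, drop noise."""
    await asyncio.sleep(delay)  # simulates the real summarization cost
    return [item for item in interactions if item.startswith("useful")]

async def auto_dream(interactions: list, budget: float, delay: float):
    """Run consolidation under a hard time budget, as the scheduler would."""
    try:
        return await asyncio.wait_for(consolidate(interactions, delay),
                                      timeout=budget)
    except asyncio.TimeoutError:
        return None  # killed by the scheduler: promote nothing this cycle

items = ["useful:pattern-a", "noise:chatter", "useful:pattern-b"]
print(asyncio.run(auto_dream(items, budget=0.5, delay=0.0)))   # within budget
print(asyncio.run(auto_dream(items, budget=0.01, delay=0.2)))  # overruns → None
```

The all-or-nothing fallback on timeout is the simplest policy; a real system might checkpoint partial summaries instead, which is exactly where the fidelity/cost tradeoffs above start to bite.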

What Broke (and why it matters)

  • Long-running loops drift without strong constraints
  • Memory systems become garbage collectors if unmanaged
  • Background agents need interruptibility, not just intelligence

This isn’t just “LLM engineering”—it’s closer to OS design.
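Interruptibility in particular maps directly onto asyncio task cancellation. A hedged sketch, with illustrative names: the background task must treat every `await` as a cancellation point and clean up before dying, rather than blocking the daemon.

```python
import asyncio

async def background_work(progress: list) -> None:
    """A long-running loop that honors cancellation promptly."""
    try:
        while True:
            progress.append("step")
            await asyncio.sleep(0.01)  # every await is a cancellation point
    except asyncio.CancelledError:
        progress.append("cleaned-up")  # flush state before the task dies
        raise                          # re-raise so cancellation completes

async def main() -> list:
    progress: list = []
    task = asyncio.create_task(background_work(progress))
    await asyncio.sleep(0.05)
    task.cancel()                      # the interrupt
    try:
        await task
    except asyncio.CancelledError:
        pass
    return progress

print(asyncio.run(main()))
```

Re-raising `CancelledError` after cleanup matters: swallowing it makes the task uncancellable from the scheduler's point of view, which is one way long-running loops drift.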

Call to Action
The full implementation is open source:

If you’re exploring persistent agents, daemonized LLMs, or memory systems—I’d be interested in what approaches you’re taking.

Top comments (1)

prabhdeep

Maker here. The "AutoDream" memory consolidation was the hardest part to get stable without breaking execution windows.
Curious how others are handling long-term memory in always-on agents—especially under strict time or resource constraints.