What I Learned Building APX’s Memory Broker and Active Threads
When I started shaping APX’s memory layer, I kept trying to make it do one thing: remember the project well enough that I would stop repeating myself.
That idea was too vague. It mixed two different jobs that only look similar from far away.
One job is semantic memory: what facts about the project matter enough to surface later. The other is recency: what just happened on another channel and might still be in flight. APX works better when it treats those as separate problems.
That split sounds small, but it changed the whole design.
The mistake I kept making
If you build memory as one giant bucket, you get a mess. Old facts, recent chatter, tool output, and notes from different surfaces all blur together. The agent sees a lot of text, but not much structure.
That is especially bad in a system like APX, because the same project can move across Telegram, desktop, web, and direct CLI calls. The useful question is not just "what do we know?" It is also "what thread are we already in somewhere else?"
Those are not the same.
What the memory broker actually does
The memory broker is the semantic side.
Before each super-agent turn, it builds a [RELEVANT MEMORY] block. The implementation is deliberately bounded:
- it reads the last 10 entries from ~/.apx/memory.md
- it races RAG retrieval against an 800 ms budget
- it deduplicates bullets by normalized text
- it formats compact bullets with date, channel, and excerpt
- if nothing useful appears, it returns nothing at all
That last part matters. Silence is better than noise. If memory has nothing relevant, I do not want the model to hallucinate a relationship just because a block exists.
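The steps listed above can be sketched roughly as follows. This is a minimal illustration, not APX's actual code: `build_relevant_memory`, `normalize`, the `(date, channel, excerpt)` tuple shape, and the `retrieve` callable are all hypothetical names, and parsing ~/.apx/memory.md into entries is assumed to happen elsewhere.

```python
import concurrent.futures as cf
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace (used as the dedup key)."""
    no_punct = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", no_punct).strip()


def build_relevant_memory(recent, retrieve, budget_s=0.8):
    """Return a [RELEVANT MEMORY] block, or "" when there is nothing to show.

    recent:   already-parsed (date, channel, excerpt) tuples, oldest first
    retrieve: hypothetical RAG callable returning tuples of the same shape
    """
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(retrieve)
    try:
        # Race retrieval against the budget (800 ms by default).
        retrieved = future.result(timeout=budget_s)
    except cf.TimeoutError:
        retrieved = []  # over budget: continue without RAG results
    finally:
        # Do not block the reply on a slow retrieval. The worker thread is
        # not killed; it simply finishes on its own and its result is dropped.
        pool.shutdown(wait=False, cancel_futures=True)

    seen, bullets = set(), []
    for date, channel, excerpt in list(recent[-10:]) + list(retrieved):
        key = normalize(excerpt)
        if not key or key in seen:
            continue  # skip empty excerpts and duplicates
        seen.add(key)
        bullets.append(f"- {date} [{channel}] {excerpt.strip()}")

    if not bullets:
        return ""  # silence instead of an empty block
    return "[RELEVANT MEMORY]\n" + "\n".join(bullets)
```

The empty-string return is the important part of the sketch: the caller can skip the block entirely instead of handing the model an empty header to reason about.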
The same restraint shows up in compaction. APX keeps conversation history from exploding by compacting older turns in the background, out of the reply hot path. The current turn uses whatever compacted summary already exists; the next turn benefits from the new one. That choice keeps the work bounded and avoids turning memory maintenance into user-visible latency.
The defaults are boring on purpose:
{
  "memory": {
    "broker_budget_ms": 800,
    "compact_threshold": 60,
    "keep_recent": 40,
    "active_threads": {
      "enabled": true,
      "window_hours": 6,
      "max_lines": 3
    }
  }
}
I like that shape because it admits a simple truth: memory is not magic. It is a budgeted retrieval system.
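To make the background compaction concrete, here is a small sketch. It assumes that `compact_threshold` is the number of turns that triggers a compaction and that `keep_recent` is the number of raw turns left untouched; the post does not spell out either meaning, so treat both as guesses. The `History` class and the `summarize` callable are hypothetical.

```python
import threading


class History:
    """Conversation history that compacts old turns off the reply path."""

    def __init__(self, summarize, compact_threshold=60, keep_recent=40):
        self.summarize = summarize  # hypothetical: callable(list[str]) -> str
        self.compact_threshold = compact_threshold
        self.keep_recent = keep_recent
        self.summary = ""           # compact produced by earlier passes
        self.turns = []
        self._lock = threading.Lock()
        self._worker = None

    def context(self):
        """What the current turn sees: the existing compact plus raw turns."""
        with self._lock:
            return self.summary, list(self.turns)

    def append(self, turn):
        with self._lock:
            self.turns.append(turn)
            busy = self._worker is not None and self._worker.is_alive()
            if len(self.turns) <= self.compact_threshold or busy:
                return
            # Snapshot everything except the most recent turns.
            old = self.turns[:-self.keep_recent]
            self._worker = threading.Thread(
                target=self._compact, args=(self.summary, old), daemon=True
            )
            self._worker.start()

    def _compact(self, prior, old):
        # The slow part runs outside the lock, so replies are never blocked.
        new_summary = self.summarize(([prior] if prior else []) + old)
        with self._lock:
            self.summary = new_summary
            del self.turns[:len(old)]  # later appends sit after these
```

A call to `context()` made while the worker is still running returns the old summary and the full list of raw turns, which matches the behavior described above: the current turn uses what exists, and the next one picks up the new compact.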
Why active threads needed its own block
The recency side is different.
Active threads does not try to be smart about meaning. It reads the raw cross-channel message log and surfaces the most recent meaningful turn from each other channel within a short time window. In practice, that means the agent can notice a Telegram thread, a desktop prompt, or a web chat that is still warm even if the current message uses different words.
That block is bounded too:
- only other channels, never the current one
- only recent activity, with a default 6 hour window
- only a few lines, with a default cap of 3 bullets
- only on interactive channels, not routine runs
That matters because recency and relevance fail in different ways.
Semantic memory can miss something that is still hot but not semantically similar. Recency can catch the immediate thread, but it should not pretend to be durable knowledge. If the model has to choose, I would rather it see both, separately, than one blended blob that confuses the signal.
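The bounds on the active-threads block can be sketched like this. It is an illustration under assumptions: the message shape (a dict with `ts`, `channel`, and `text`), the `interactive` flag, and the "non-empty text" stand-in for a meaningful turn are all hypothetical, not APX's real log format or filter.

```python
from datetime import datetime, timedelta


def active_threads(log, current_channel, now, interactive=True,
                   window_hours=6, max_lines=3):
    """Return up to max_lines bullets, one per other channel, newest first.

    log: iterable of dicts with 'ts' (datetime), 'channel', 'text' (assumed shape)
    """
    if not interactive:
        return []  # routine runs get no active-threads block

    cutoff = now - timedelta(hours=window_hours)
    latest = {}
    for msg in log:
        if msg["channel"] == current_channel:
            continue  # only other channels, never the current one
        if msg["ts"] < cutoff:
            continue  # outside the recency window
        if not msg["text"].strip():
            continue  # crude stand-in for "meaningful turn"
        prev = latest.get(msg["channel"])
        if prev is None or msg["ts"] > prev["ts"]:
            latest[msg["channel"]] = msg  # keep the newest turn per channel

    newest_first = sorted(latest.values(), key=lambda m: m["ts"], reverse=True)
    return [
        f"- {m['ts']:%H:%M} [{m['channel']}] {m['text'].strip()}"
        for m in newest_first[:max_lines]
    ]
```

Nothing here looks at meaning. The function only filters by channel and time, which is why it can surface a warm thread that semantic retrieval would miss.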
Why the split feels right in daily use
The first real benefit was not better benchmarks. It was less repetition.
When I bounce between surfaces, APX can now recover the shape of the work without making me restate the whole setup every time. If I leave a thought on one channel and come back from another, the agent has a chance to notice the thread instead of acting blind.
The second benefit is easier debugging.
If something feels wrong, I can ask a sharper question: did the broker miss a durable fact, or did active threads fail to surface the current thread? That is much easier to diagnose than "memory feels off".
The tradeoff is complexity. Two blocks mean two mental models. More knobs. More places where I need to decide what belongs in the repo, what stays local, and what should only exist for a few hours.
I think that cost is worth it.
The part I would keep if I had to simplify
If I had to reduce APX memory to one principle, it would be this:
durable context and live context are different assets.
Durable context belongs in the broker and the repo-side notebook. Live context belongs in the active-thread view of the message logs. One answers "what is true about this project?" The other answers "what are we doing right now?"
That distinction is the real improvement.
APX gets less chatty, less forgetful, and less vague when it respects that split. It does not try to be a single giant memory bank. It acts more like a working system: one part remembers, one part keeps up.
That is the direction I want for the rest of the stack too.