Manuel Bruña

Posted on Jun 15

What I Learned Building APX’s Memory Broker and Active Threads

#ai #opensource #devtools #tutorial

When I started shaping APX’s memory layer, I kept trying to make it do one thing: remember the project well enough that I would stop repeating myself.

That idea was too vague. It mixed two different jobs that only look similar from far away.

One job is semantic memory: what facts about the project matter enough to surface later. The other is recency: what just happened on another channel and might still be in flight. APX works better when it treats those as separate problems.

That split sounds small, but it changed the whole design.

The mistake I kept making

If you build memory as one giant bucket, you get a mess. Old facts, recent chatter, tool output, and notes from different surfaces all blur together. The agent sees a lot of text, but not much structure.

That is especially bad in a system like APX, because the same project can move across Telegram, desktop, web, and direct CLI calls. The useful question is not just "what do we know?" It is also "what thread are we already in somewhere else?"

Those are not the same.

What the memory broker actually does

The memory broker is the semantic side.

Before each super-agent turn, it builds a [RELEVANT MEMORY] block. The implementation is deliberately bounded:

- it reads the last 10 entries from ~/.apx/memory.md
- it races RAG retrieval against an 800 ms budget
- it deduplicates bullets by normalized text
- it formats compact bullets with date, channel, and excerpt
- if nothing useful appears, it returns nothing at all

That last part matters. Silence is better than noise. If memory has nothing relevant, I do not want the model to hallucinate a relationship just because a block exists.
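The bounds above can be sketched in a few lines of Python. This is a minimal illustration, not APX's actual code: `MemoryEntry`, `normalize`, `build_relevant_memory`, the `retrieve` callable, and the 120-character excerpt length are all assumptions, and parsing ~/.apx/memory.md is left out because the post does not describe the file format. The sketch also treats any non-empty entry as useful, where a real broker would apply a relevance filter.

```python
import asyncio
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:  # hypothetical shape for one parsed memory entry
    date: str
    channel: str
    text: str


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace so near-identical bullets match."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", text.lower())).strip()


async def build_relevant_memory(recent, retrieve, budget_ms=800, excerpt_len=120):
    """Return a [RELEVANT MEMORY] block, or "" when there is nothing to show."""
    try:
        # Race retrieval against the budget; on timeout, continue without it.
        retrieved = await asyncio.wait_for(retrieve(), timeout=budget_ms / 1000)
    except asyncio.TimeoutError:
        retrieved = []

    seen, bullets = set(), []
    for entry in [*recent[-10:], *retrieved]:  # only the last 10 notebook entries
        key = normalize(entry.text)
        if not key or key in seen:
            continue  # skip empty or duplicate bullets
        seen.add(key)
        excerpt = entry.text.strip()[:excerpt_len]
        bullets.append(f"- {entry.date} [{entry.channel}] {excerpt}")

    if not bullets:
        return ""  # no block at all rather than an empty header
    return "[RELEVANT MEMORY]\n" + "\n".join(bullets)
```

Because a timeout degrades to "no retrieval results" instead of an error, the reply path never waits longer than the budget for memory.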

The same restraint shows up in compaction. APX keeps conversation history from exploding by compacting older turns in the background, out of the reply hot path. The current turn uses whatever compacted summary already exists; the next turn benefits from the new one. That choice keeps the work bounded and avoids turning memory maintenance into user-visible latency.
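One way to picture that off-the-hot-path behavior is the sketch below. It assumes `compact_threshold` and `keep_recent` count turns (the post does not say what unit they use), and `History`, `split_for_compaction`, and the `summarize` callable are hypothetical names introduced only for illustration.

```python
import asyncio


def split_for_compaction(turns, compact_threshold=60, keep_recent=40):
    """Return (older, recent); older stays empty until the threshold is passed."""
    if len(turns) <= compact_threshold:
        return [], list(turns)
    cut = len(turns) - keep_recent
    return list(turns[:cut]), list(turns[cut:])


class History:
    def __init__(self, summarize):
        self.turns = []
        self.summary = ""            # whatever compacted summary already exists
        self._summarize = summarize  # assumed async callable: (old_summary, turns) -> str
        self._task = None

    def context_for_reply(self):
        """Hot path: read the current state, never wait on compaction."""
        return self.summary, list(self.turns)

    def add_turn(self, turn):
        """Record a turn; must be called from inside a running event loop."""
        self.turns.append(turn)
        # Allow at most one compaction in flight at a time.
        if self._task is None or self._task.done():
            older, _ = split_for_compaction(self.turns)
            if older:
                self._task = asyncio.create_task(self._compact(len(older)))

    async def _compact(self, n):
        new_summary = await self._summarize(self.summary, self.turns[:n])
        # Swap in the result only once it is ready. Turns added meanwhile
        # sit at the tail of the list, so dropping the first n is safe.
        self.summary = new_summary
        del self.turns[:n]
```

The reply that triggers compaction still sees the old summary plus the full turn list; only a later call to `context_for_reply` picks up the shorter history.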

The defaults are boring on purpose:

{
  "memory": {
    "broker_budget_ms": 800,
    "compact_threshold": 60,
    "keep_recent": 40,
    "active_threads": {
      "enabled": true,
      "window_hours": 6,
      "max_lines": 3
    }
  }
}

I like that shape because it admits a simple truth: memory is not magic. It is a budgeted retrieval system.
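If a user config only overrides a few of these values, the rest have to fall back to the defaults. A small sketch of that merge, assuming the JSON shape shown above: `DEFAULTS`, `merge`, and `load_config` are hypothetical names, and the sanity check on `keep_recent` is my own assumption, not something the post states.

```python
import copy
import json

# Defaults copied from the config shown in the post.
DEFAULTS = {
    "memory": {
        "broker_budget_ms": 800,
        "compact_threshold": 60,
        "keep_recent": 40,
        "active_threads": {"enabled": True, "window_hours": 6, "max_lines": 3},
    }
}


def merge(defaults, overrides):
    """Recursively overlay overrides on defaults without mutating either."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def load_config(raw_json: str) -> dict:
    """Parse user JSON (possibly empty) and fill in any missing defaults."""
    overrides = json.loads(raw_json) if raw_json.strip() else {}
    config = merge(DEFAULTS, overrides)
    memory = config["memory"]
    # Assumed sanity check: compaction must leave fewer turns than its trigger.
    if memory["keep_recent"] >= memory["compact_threshold"]:
        raise ValueError("keep_recent must be smaller than compact_threshold")
    return config
```

Overriding only `window_hours`, for example, leaves `enabled` and `max_lines` at their default values.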

Why active threads needed its own block

The recency side is different.

The active threads block does not try to be smart about meaning. It reads the raw cross-channel message log and surfaces the most recent meaningful turn from each other channel within a short time window. In practice, that means the agent can notice a Telegram thread, a desktop prompt, or a web chat that is still warm even if the current message uses different words.

That block is bounded too:

- only other channels, never the current one
- only recent activity, with a default 6 hour window
- only a few lines, with a default cap of 3 bullets
- only on interactive channels, not routine runs

That matters because recency and relevance fail in different ways.

Semantic memory can miss something that is still hot but not semantically similar. Recency can catch the immediate thread, but it should not pretend to be durable knowledge. If the model has to choose, I would rather it see both, separately, than one blended blob that confuses the signal.
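The bounds listed above for active threads are simple enough to sketch as a pure filter over the message log. Everything named here is an assumption for illustration: the `Message` shape, the `active_threads` function, the `interactive` flag, and especially `min_chars`, which stands in for whatever "meaningful" means in the real implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Message:  # hypothetical shape for one row of the cross-channel log
    at: datetime
    channel: str
    text: str


def active_threads(log, current_channel, now, *, interactive=True,
                   window_hours=6, max_lines=3, min_chars=12):
    """Return bullet lines for the latest warm turn on each other channel."""
    if not interactive:
        return []  # routine runs get no block at all
    cutoff = now - timedelta(hours=window_hours)
    latest = {}
    for msg in log:
        if msg.channel == current_channel or msg.at < cutoff:
            continue  # never the current channel, never stale activity
        if len(msg.text.strip()) < min_chars:
            continue  # crude stand-in for "meaningful": skip very short turns
        prev = latest.get(msg.channel)
        if prev is None or msg.at > prev.at:
            latest[msg.channel] = msg
    newest_first = sorted(latest.values(), key=lambda m: m.at, reverse=True)
    return [f"- {m.at:%H:%M} [{m.channel}] {m.text.strip()}"
            for m in newest_first[:max_lines]]
```

Nothing in the function compares message text to the current prompt, which is the point: it answers "what is still warm?" and leaves "what is relevant?" to the broker.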

Why the split feels right in daily use

The first real benefit was not better benchmarks. It was less repetition.

When I bounce between surfaces, APX can now recover the shape of the work without making me restate the whole setup every time. If I leave a thought on one channel and come back from another, the agent has a chance to notice the thread instead of acting blind.

The second benefit is easier debugging.

If something feels wrong, I can ask a sharper question: did the broker miss a durable fact, or did active threads fail to surface the current thread? That is much easier to diagnose than "memory feels off".

The tradeoff is complexity. Two blocks mean two mental models. More knobs. More places where I need to decide what belongs in the repo, what stays local, and what should only exist for a few hours.

I think that cost is worth it.

The part I would keep if I had to simplify

If I had to reduce APX memory to one principle, it would be this:

durable context and live context are different assets.

Durable context belongs in the broker and the repo-side notebook. Live context belongs in the active-thread view of the message logs. One answers "what is true about this project?" The other answers "what are we doing right now?"

That distinction is the real improvement.

APX gets less chatty, less forgetful, and less vague when it respects that split. It does not try to be a single giant memory bank. It acts more like a working system: one part remembers, one part keeps up.

That is the direction I want for the rest of the stack too.

Top comments (2)

Ekong Ikpe • Jun 22

The physical reality of where a file lives on the disk should be its true identity (standards), and creating external tracking layers just adds brittle complexity.

Manuel Bruña • Jun 25

Partially agree. Filesystem path is the real locator, and standards matter. Where I differ: runtime memory needs an identity that survives moves, forks, and tool changes. A small reviewable repo marker can reduce brittleness if it stays boring and does not become another opaque registry.