DEV Community

sheawinkler
How I built a private-by-default HTTP-first MCP memory/context/task orchestrator

If you’ve run agents on long-horizon work, you’ve probably seen the same failure mode: the agent
forgets prior decisions, repeats mistakes, and gradually degrades output quality.

Finite context windows guarantee this over time unless you add durable memory and retrieval.

I built ContextLattice to solve that problem with a local-first architecture that is
private by default and MCP-compatible by design.

## 1) Problem Framing

Long-running agent workflows need more than prompt history.

Without a durable memory/context layer:

  • prior decisions are lost
  • the same debugging loops repeat
  • retrieval quality drifts over time
  • operators keep manually re-injecting context

ContextLattice addresses this with one ingress path for writes and one orchestrated retrieval
path for reads.

## 2) Architecture

The runtime pattern is:

  • HTTP-first ingress for write/search
  • durable outbox queue
  • fanout to specialized sinks
  • federated retrieval + rerank
  • learning feedback loop

### Write flow

  1. Client posts to /memory/write
  2. Payload is validated and normalized
  3. Durable outbox stores the job
  4. Fanout writes to:
    • memory-bank (canonical)
    • Qdrant (semantic vectors)
    • Mongo (raw ledger)
    • MindsDB (+ optional Letta path)
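The write flow above can be sketched in a few lines of Python. This is a hedged illustration, not ContextLattice's actual implementation: `validate`, `memory_write`, and the toy sinks are hypothetical names, and an in-memory deque stands in for the real durable outbox.

```python
from collections import deque

OUTBOX = deque()  # in-memory stand-in for the durable outbox (step 3)

def validate(payload: dict) -> dict:
    """Step 2: reject malformed writes before they reach the queue."""
    if "text" not in payload:
        raise ValueError("memory write requires a 'text' field")
    return {"text": payload["text"], "tags": payload.get("tags", [])}

def memory_write(payload: dict) -> str:
    """Steps 1-3: accept a POST body, normalize it, persist it durably."""
    OUTBOX.append(validate(payload))
    return "accepted"

def drain_outbox(sinks: dict) -> int:
    """Step 4: fan each job out to every sink; dequeue only after all
    succeed, so a flaky sink causes a replay instead of a dropped write."""
    drained = 0
    while OUTBOX:
        job = OUTBOX[0]
        for write_fn in sinks.values():  # memory-bank, Qdrant, Mongo, MindsDB
            write_fn(job)
        OUTBOX.popleft()
        drained += 1
    return drained

# Toy sinks: each "store" is just a list standing in for a real backend.
stores = {name: [] for name in ("memory-bank", "qdrant", "mongo", "mindsdb")}
sinks = {name: stores[name].append for name in stores}
memory_write({"text": "decision: keep one ingress contract"})
drain_outbox(sinks)  # every store now holds the normalized job
```

The key design point is the ordering: the write is acknowledged once it is in the outbox, and a sink is only considered done when its write succeeds, which is what separates "sink is down" from "data is lost".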

### Retrieval flow

  1. Orchestrator issues parallel recall across sources
  2. Results are merged + reranked
  3. Composed context is returned to the caller
  4. Feedback signals are written for retrieval-quality learning
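Steps 1–3 of the retrieval flow can be sketched as parallel recall followed by merge-and-rerank. The lambda sources and the max-score rerank below are assumptions for illustration; the real orchestrator's sources and reranker will differ.

```python
from concurrent.futures import ThreadPoolExecutor

def recall_all(sources: dict, query: str, k: int = 5) -> list[tuple[float, str]]:
    """Step 1: parallel recall across sources; step 2: merge + rerank."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, query) for fn in sources.values()]
        hits = [hit for f in futures for hit in f.result()]
    best: dict[str, float] = {}  # dedupe on text, keep the best score
    for score, text in hits:
        best[text] = max(score, best.get(text, 0.0))
    ranked = sorted(((s, t) for t, s in best.items()), reverse=True)
    return ranked[:k]            # step 3: composed context for the caller

# Toy sources returning (score, text) pairs; real ones would query Qdrant,
# the memory-bank, and so on.
sources = {
    "qdrant": lambda q: [(0.9, "writes go through the outbox"),
                         (0.4, "retry on 503")],
    "memory-bank": lambda q: [(0.8, "bind to loopback by default"),
                              (0.7, "writes go through the outbox")],
}
top = recall_all(sources, "how are writes handled?", k=2)
```

Step 4 (feedback signals) would hang off the caller's acceptance or rejection of `top`, feeding back into how sources are weighted on future queries.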

This keeps ingress simple while allowing storage/retrieval specialization behind the
orchestrator.

## 3) Operational Controls

This stack is designed for bursty real traffic, not just demos.

Key controls:

  • backpressure at fanout boundaries
  • retry + replay semantics from durable queue
  • retention and pruning policies for storage pressure
  • strict secret-handling modes (redact, block, allow)
  • local-first defaults (loopback binding, auth-required production posture)
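The three secret-handling modes can be illustrated with a short Python sketch. Only the mode names (redact, block, allow) come from the list above; the regex and function name are hypothetical, and a real scanner would cover far more token shapes.

```python
import re

# Naive pattern for illustration only; real scanners match many token shapes.
SECRET_RE = re.compile(r"(?:sk|api[_-]?key)[-_][A-Za-z0-9]{8,}", re.IGNORECASE)

def apply_secret_policy(text: str, mode: str = "redact") -> str:
    """Enforce one of the three modes on an inbound write."""
    if mode == "allow":
        return text                                   # pass through unchanged
    if mode == "block" and SECRET_RE.search(text):
        raise ValueError("write rejected: payload contains a secret")
    return SECRET_RE.sub("[REDACTED]", text)          # redact (the default)

safe = apply_secret_policy("token sk-abcdef123456 leaked")
# -> "token [REDACTED] leaked"
```

Redact is the sensible default for a memory layer: the write still lands (so context isn't silently lost), but the secret never reaches any sink.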

The result is a system that can function as both:

  • a long-horizon memory/context layer, and
  • a telemetry-grade write backend.

## 4) What Actually Mattered in Practice

The most important implementation outcomes were:

  • One ingress contract reduced client integration complexity.
  • Durable queueing prevented sink instability from dropping writes.
  • Federated retrieval outperformed single-store recall on long tasks.
  • Local-first defaults reduced deployment friction and security risk.

## 5) Quickstart


```bash
cp .env.example .env
ln -svf ../../.env infra/compose/.env
gmake quickstart
```

Then verify:

```bash
ORCH_KEY="$(awk -F= '/^MEMMCP_ORCHESTRATOR_API_KEY=/{print substr($0,index($0,"=")+1)}' .env)"
curl -fsS http://127.0.0.1:8075/health | jq
curl -fsS -H "x-api-key: ${ORCH_KEY}" http://127.0.0.1:8075/status | jq
```
## 6) Closing

If you’re building long-horizon agent systems, I’d value feedback on:

- retrieval quality over long task horizons
- operator ergonomics under sustained write pressure
- practical tradeoffs between local-only and optional BYO cloud sinks

Links:

- Docs: https://contextlattice.io/installation.html
- Troubleshooting: https://contextlattice.io/troubleshooting.html
- Repo: https://github.com/sheawinkler/ContextLattice