<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: sheawinkler</title>
    <description>The latest articles on DEV Community by sheawinkler (@sheawinkler).</description>
    <link>https://dev.to/sheawinkler</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F222218%2F49a3da29-c583-4742-a1f9-68ba8b32e90c.jpeg</url>
      <title>DEV Community: sheawinkler</title>
      <link>https://dev.to/sheawinkler</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sheawinkler"/>
    <language>en</language>
    <item>
      <title>ContextLattice v3.2.3: Faster Agent Memory with Go/Rust Runtime and Staged Retrieval</title>
      <dc:creator>sheawinkler</dc:creator>
      <pubDate>Fri, 20 Mar 2026 23:53:00 +0000</pubDate>
      <link>https://dev.to/sheawinkler/contextlattice-v323-faster-agent-memory-with-gorust-runtime-and-staged-retrieval-i1o</link>
      <guid>https://dev.to/sheawinkler/contextlattice-v323-faster-agent-memory-with-gorust-runtime-and-staged-retrieval-i1o</guid>
      <description>&lt;p&gt;I shipped ContextLattice v3.2.3, a local-first memory/context layer for apps and agents. I improved retrieval speed across all database backends and deep-source coverage for ContextLattice.&lt;/p&gt;

&lt;p&gt;Key runtime design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go handles ingress/orchestration&lt;/li&gt;
&lt;li&gt;Rust handles retrieval/memory hot paths&lt;/li&gt;
&lt;li&gt;interfaces with CLI, desktop &amp;amp; web apps; claw/messaging platform-ready&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Retrieval design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fast lane: topic_rollups + qdrant + postgres_pgvector&lt;/li&gt;
&lt;li&gt;deep continuation lane: mindsdb + mongo_raw + letta + memory_bank&lt;/li&gt;
&lt;/ul&gt;
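&lt;p&gt;A minimal Go sketch of the two-lane idea above (function names are hypothetical stubs, not the ContextLattice API): the fast lane answers first, and the deep continuation lane is only consulted when the fast lane comes up short.&lt;/p&gt;

```go
package main

import "fmt"

// Result is a minimal stand-in for a retrieved memory record.
type Result struct {
	Source string
	Score  float64
}

// queryFastLane and queryDeepLane are hypothetical stubs for the
// topic_rollups/qdrant/pgvector lane and the mindsdb/mongo_raw/letta lane.
func queryFastLane(q string) []Result {
	return []Result{{Source: "qdrant", Score: 0.92}}
}

func queryDeepLane(q string) []Result {
	return []Result{{Source: "mongo_raw", Score: 0.71}, {Source: "letta", Score: 0.64}}
}

// stagedRetrieve answers from the fast lane alone when it has enough hits,
// and tops up from the deep continuation lane otherwise.
func stagedRetrieve(q string, minHits int) []Result {
	hits := queryFastLane(q)
	if len(hits) >= minHits {
		return hits
	}
	return append(hits, queryDeepLane(q)...)
}

func main() {
	// Fast lane returns one hit; minHits of 2 forces the deep continuation lane.
	fmt.Println(len(stagedRetrieve("agent memory", 2)))
}
```

&lt;p&gt;The staging is what keeps the common case fast: the deep sources only pay their latency cost when the cheap lane cannot satisfy the request.&lt;/p&gt;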

&lt;p&gt;Performance highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;v3.2.0 Go lane mean 0.157s vs Python lane 0.255s (38.547% faster)&lt;/li&gt;
&lt;li&gt;earlier runtime cutover benchmark: ~4.94x faster mean vs legacy lane&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/sheawinkler/ContextLattice" rel="noopener noreferrer"&gt;https://github.com/sheawinkler/ContextLattice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://contextlattice.io" rel="noopener noreferrer"&gt;https://contextlattice.io&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next phase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;data compression, data expansion control&lt;/li&gt;
&lt;li&gt;interactive monitoring dashboard&lt;/li&gt;
&lt;li&gt;debugging agent &amp;amp; subagent orchestration challenges&lt;/li&gt;
&lt;li&gt;future:

&lt;ul&gt;
&lt;li&gt;decrease personal computer requirements&lt;/li&gt;
&lt;li&gt;Obsidian plug-in or a similar, more performant knowledge visualizer&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>openclaw</category>
      <category>microservices</category>
      <category>devops</category>
    </item>
    <item>
      <title>How I built a private-by-default HTTP-first MCP memory/context/task orchestrator</title>
      <dc:creator>sheawinkler</dc:creator>
      <pubDate>Mon, 23 Feb 2026 03:43:12 +0000</pubDate>
      <link>https://dev.to/sheawinkler/how-i-built-a-private-by-default-http-first-mcp-memorycontexttask-orchestrator-4m29</link>
      <guid>https://dev.to/sheawinkler/how-i-built-a-private-by-default-http-first-mcp-memorycontexttask-orchestrator-4m29</guid>
      <description>&lt;p&gt;If you’ve run agents on long-horizon work, you’ve probably seen the same failure mode: the agent&lt;br&gt;
  forgets prior decisions, repeats mistakes, and gradually degrades output quality.&lt;/p&gt;

&lt;p&gt;Finite context windows guarantee this over time unless you add durable memory and retrieval.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;ContextLattice&lt;/strong&gt; to solve that problem with a &lt;strong&gt;local-first architecture&lt;/strong&gt; that is&lt;br&gt;
  &lt;strong&gt;private by default&lt;/strong&gt; and &lt;strong&gt;MCP-compatible by design&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Docs: &lt;a href="https://contextlattice.io/installation.html" rel="noopener noreferrer"&gt;https://contextlattice.io/installation.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/sheawinkler/ContextLattice" rel="noopener noreferrer"&gt;https://github.com/sheawinkler/ContextLattice&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;## 1) Problem Framing&lt;/p&gt;

&lt;p&gt;Long-running agent workflows need more than prompt history.&lt;/p&gt;

&lt;p&gt;Without a durable memory/context layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prior decisions are lost&lt;/li&gt;
&lt;li&gt;the same debugging loops repeat&lt;/li&gt;
&lt;li&gt;retrieval quality drifts over time&lt;/li&gt;
&lt;li&gt;operators keep manually re-injecting context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ContextLattice addresses this with one ingress path for writes and one orchestrated retrieval&lt;br&gt;
  path for reads.&lt;/p&gt;

&lt;p&gt;## 2) Architecture&lt;/p&gt;

&lt;p&gt;The runtime pattern is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP-first ingress for write/search&lt;/li&gt;
&lt;li&gt;durable outbox queue&lt;/li&gt;
&lt;li&gt;fanout to specialized sinks&lt;/li&gt;
&lt;li&gt;federated retrieval + rerank&lt;/li&gt;
&lt;li&gt;learning feedback loop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;### Write flow&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client posts to &lt;code&gt;/memory/write&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Payload is validated and normalized&lt;/li&gt;
&lt;li&gt;Durable outbox stores the job&lt;/li&gt;
&lt;li&gt;Fanout writes to:

&lt;ul&gt;
&lt;li&gt;memory-bank (canonical)&lt;/li&gt;
&lt;li&gt;Qdrant (semantic vectors)&lt;/li&gt;
&lt;li&gt;Mongo (raw ledger)&lt;/li&gt;
&lt;li&gt;MindsDB (+ optional Letta path)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
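&lt;p&gt;The four write-flow steps can be sketched in Go as follows (a simplified sketch with hypothetical names; the real fanout is asynchronous, and failed deliveries are retried from the durable outbox):&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// WriteJob is a hypothetical normalized payload stored in the durable outbox.
type WriteJob struct {
	ID   int
	Text string
}

// validate mirrors step 2: reject bad payloads before they reach the queue.
func validate(text string) error {
	if text == "" {
		return errors.New("empty payload")
	}
	return nil
}

// outbox stands in for the durable queue (step 3); a real system persists it
// so writes survive sink outages and process restarts.
var outbox []WriteJob

func enqueue(job WriteJob) {
	outbox = append(outbox, job)
}

// fanout delivers a queued job to each sink (step 4), recording which job
// reached which sink; failures would be replayed from the outbox.
func fanout(job WriteJob, sinks []string) map[string]int {
	delivered := make(map[string]int, len(sinks))
	for _, s := range sinks {
		delivered[s] = job.ID
	}
	return delivered
}

func main() {
	text := "decision: use staged retrieval"
	if err := validate(text); err != nil {
		panic(err)
	}
	enqueue(WriteJob{ID: 1, Text: text})
	delivered := fanout(outbox[0], []string{"memory-bank", "qdrant", "mongo", "mindsdb"})
	fmt.Println(len(delivered)) // one delivery record per sink
}
```

&lt;p&gt;The key property is that the client's write is acknowledged once it is durably queued, so an unstable sink never costs the caller data.&lt;/p&gt;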

&lt;p&gt;### Retrieval flow&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Orchestrator issues parallel recall across sources&lt;/li&gt;
&lt;li&gt;Results are merged + reranked&lt;/li&gt;
&lt;li&gt;Composed context is returned to the caller&lt;/li&gt;
&lt;li&gt;Feedback signals are written for retrieval-quality learning&lt;/li&gt;
&lt;/ol&gt;
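&lt;p&gt;Steps 1 and 2 of the retrieval flow can be pictured with a small Go sketch (hypothetical in-memory sources; a simple score sort stands in for the real reranker):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Hit is a minimal stand-in for one recalled record.
type Hit struct {
	Source string
	Score  float64
}

// recall issues parallel recall across all sources (step 1), merges the
// results, and reranks them by score (step 2).
func recall(sources map[string][]Hit) []Hit {
	var mu sync.Mutex
	var wg sync.WaitGroup
	var merged []Hit
	for _, hits := range sources {
		wg.Add(1)
		go func(hs []Hit) {
			defer wg.Done()
			mu.Lock() // guard the shared slice across parallel recalls
			merged = append(merged, hs...)
			mu.Unlock()
		}(hits)
	}
	wg.Wait()
	// Rerank: highest score first.
	sort.Slice(merged, func(i, j int) bool { return merged[i].Score > merged[j].Score })
	return merged
}

func main() {
	sources := map[string][]Hit{
		"qdrant":    {{Source: "qdrant", Score: 0.9}},
		"mongo_raw": {{Source: "mongo_raw", Score: 0.5}, {Source: "mongo_raw", Score: 0.8}},
	}
	fmt.Println(recall(sources)[0].Source) // highest-scoring hit after rerank
}
```

&lt;p&gt;Because each source is queried concurrently, total recall latency tracks the slowest source rather than the sum of all of them.&lt;/p&gt;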

&lt;p&gt;This keeps ingress simple while allowing storage/retrieval specialization behind the&lt;br&gt;
  orchestrator.&lt;/p&gt;

&lt;p&gt;## 3) Operational Controls&lt;/p&gt;

&lt;p&gt;This stack is designed for bursty real traffic, not just demos.&lt;/p&gt;

&lt;p&gt;Key controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;backpressure at fanout boundaries&lt;/li&gt;
&lt;li&gt;retry + replay semantics from durable queue&lt;/li&gt;
&lt;li&gt;retention and pruning policies for storage pressure&lt;/li&gt;
&lt;li&gt;strict secret-handling modes (&lt;code&gt;redact&lt;/code&gt;, &lt;code&gt;block&lt;/code&gt;, &lt;code&gt;allow&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;local-first defaults (loopback binding, auth-required production posture)&lt;/li&gt;
&lt;/ul&gt;
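&lt;p&gt;For the secret-handling modes specifically, the behavior can be sketched like this (a toy example assuming a single regex detector; the real modes presumably cover far more patterns):&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// secretPattern is a hypothetical detector for illustration only.
var secretPattern = regexp.MustCompile(`sk-[A-Za-z0-9]+`)

// applySecretMode mirrors the three controls: redact masks matches,
// block rejects the write outright, allow passes it through unchanged.
func applySecretMode(mode, text string) (string, error) {
	switch mode {
	case "redact":
		return secretPattern.ReplaceAllString(text, "[REDACTED]"), nil
	case "block":
		if secretPattern.MatchString(text) {
			return "", errors.New("write blocked: secret detected")
		}
		return text, nil
	case "allow":
		return text, nil
	default:
		return "", fmt.Errorf("unknown mode %q", mode)
	}
}

func main() {
	out, _ := applySecretMode("redact", "token sk-abc123 in payload")
	fmt.Println(out) // the matched token is masked before storage
}
```

&lt;p&gt;Applying the policy at the ingress boundary means no sink, local or remote, ever sees a secret the operator chose to withhold.&lt;/p&gt;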

&lt;p&gt;The result is a system that can function as both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a long-horizon memory/context layer, and&lt;/li&gt;
&lt;li&gt;a telemetry-grade write backend.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;## 4) What Actually Mattered in Practice&lt;/p&gt;

&lt;p&gt;The most important implementation outcomes were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One ingress contract&lt;/strong&gt; reduced client integration complexity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durable queueing&lt;/strong&gt; prevented sink instability from dropping writes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Federated retrieval&lt;/strong&gt; outperformed single-store recall on long tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local-first defaults&lt;/strong&gt; reduced deployment friction and security risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;## 5) Quickstart&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp .env.example .env
ln -svf ../../.env infra/compose/.env
gmake quickstart
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Then verify:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ORCH_KEY="$(awk -F= '/^MEMMCP_ORCHESTRATOR_API_KEY=/{print substr($0,index($0,"=")+1)}' .env)"
curl -fsS http://127.0.0.1:8075/health | jq
curl -fsS -H "x-api-key: ${ORCH_KEY}" http://127.0.0.1:8075/status | jq
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;## 6) Closing&lt;/p&gt;

&lt;p&gt;If you’re building long-horizon agent systems, I’d value feedback on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retrieval quality over long task horizons&lt;/li&gt;
&lt;li&gt;operator ergonomics under sustained write pressure&lt;/li&gt;
&lt;li&gt;practical tradeoffs between local-only and optional BYO cloud sinks&lt;/li&gt;
&lt;/ul&gt;

&lt;ul&gt;
&lt;li&gt;Docs: &lt;a href="https://contextlattice.io/installation.html" rel="noopener noreferrer"&gt;https://contextlattice.io/installation.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Troubleshooting: &lt;a href="https://contextlattice.io/troubleshooting.html" rel="noopener noreferrer"&gt;https://contextlattice.io/troubleshooting.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/sheawinkler/ContextLattice" rel="noopener noreferrer"&gt;https://github.com/sheawinkler/ContextLattice&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mcp</category>
      <category>backend</category>
      <category>ai</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
