<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Highpass Studio</title>
    <description>The latest articles on DEV Community by Highpass Studio (@highpass_studio_382ce5641).</description>
    <link>https://dev.to/highpass_studio_382ce5641</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3834240%2F1f62459d-6ea5-4825-a012-8a7c395b46c5.png</url>
      <title>DEV Community: Highpass Studio</title>
      <link>https://dev.to/highpass_studio_382ce5641</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/highpass_studio_382ce5641"/>
    <language>en</language>
    <item>
      <title>AI memory is broken. We built one that forgets.</title>
      <dc:creator>Highpass Studio</dc:creator>
      <pubDate>Sun, 05 Apr 2026 01:35:33 +0000</pubDate>
      <link>https://dev.to/highpass_studio_382ce5641/ai-memory-is-broken-we-built-one-that-forgets-dmc</link>
      <guid>https://dev.to/highpass_studio_382ce5641/ai-memory-is-broken-we-built-one-that-forgets-dmc</guid>
      <description>&lt;p&gt;Every agent framework has the same problem with memory: it doesn't forget.&lt;/p&gt;

&lt;p&gt;Context windows reset between sessions. RAG and vector DBs store everything with equal weight and grow until they're noisy. So when your project changes direction two weeks in, the AI still pulls up week-one decisions like they're current.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this actually looks like
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Week 1:&lt;/strong&gt; You tell the agent "we're using React for the frontend."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 2:&lt;/strong&gt; You switch. "Moving to Svelte, React bundle is too big."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 4:&lt;/strong&gt; You ask "what's our frontend stack?"&lt;/p&gt;

&lt;p&gt;A normal retrieval system hands back both answers. React and Svelte sit side by side with equal weight. Nothing in the system knows one replaced the other. So the agent might reference React, Svelte, or some confused mix of both.&lt;/p&gt;

&lt;p&gt;We kept running into this while building agent tooling, and it became clear the issue isn't retrieval quality — it's that these systems have no concept of time or obsolescence.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;We ran a 4-week simulated project through both systems. 24 events total — decisions, corrections, errors, repeated observations. Two major direction changes mid-project.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Naive&lt;/th&gt;
&lt;th&gt;Sparsion&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Top result correct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pruned stale memories&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retrievable at week 4&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Naive retrieval puts a stale entry on top. Sparsion puts the correction first — salience 1.65 vs 0.55 for the outdated original.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Sparsion actually does
&lt;/h2&gt;

&lt;p&gt;It treats memory as a lifecycle instead of a log.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Events → Salience Scoring → Hot → Warm → Cold → Forgotten
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
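&lt;p&gt;The post doesn't give the tier boundaries, but the mapping is easy to picture. A minimal sketch with made-up cutoffs:&lt;/p&gt;

```python
# Hypothetical tier cutoffs: illustrative only, the post doesn't state
# Sparsion's actual thresholds.
def tier(salience: float) -> str:
    """Map a salience score onto the lifecycle stages in the pipeline above."""
    if salience >= 2.0:
        return "Hot"
    if salience >= 0.5:
        return "Warm"
    if salience >= 0.1:
        return "Cold"
    return "Forgotten"  # below the salience floor: dropped from retrieval entirely
```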



&lt;ul&gt;
&lt;li&gt;Old memories weaken over time (exponential decay, configurable half-life)&lt;/li&gt;
&lt;li&gt;Repeated events get stronger (log-frequency)&lt;/li&gt;
&lt;li&gt;You can flag things as critical — those survive 4x longer&lt;/li&gt;
&lt;li&gt;Corrections score 3x higher than observations by default&lt;/li&gt;
&lt;li&gt;Anything below a salience floor gets dropped from retrieval entirely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A critical correction enters the system at salience 13.18. A throwaway observation enters at 0.77. After six weeks with no reinforcement, the observation is gone. The correction is still there.&lt;/p&gt;
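&lt;p&gt;Numbers like those come out of a formula shaped roughly like the following sketch (hypothetical weights and constants, not Sparsion's actual values):&lt;/p&gt;

```python
import math

# Hypothetical weights and constants: illustrative only, not Sparsion's actual values.
TYPE_WEIGHTS = {"correction": 3.0, "decision": 2.0, "error": 1.5,
                "action": 1.2, "observation": 1.0}
IMPORTANCE = {"low": 0.5, "normal": 1.0, "high": 2.0, "critical": 4.0}

def salience(event_type, importance, age_days, repeats=1, half_life_days=14.0):
    """Type weight times importance, reinforced by log-frequency, eroded by decay."""
    # Critical memories survive 4x longer: stretch their half-life.
    hl = half_life_days * (4.0 if importance == "critical" else 1.0)
    base = TYPE_WEIGHTS[event_type] * IMPORTANCE[importance]
    reinforcement = 1.0 + math.log(repeats)  # repeated events get stronger
    decay = 0.5 ** (age_days / hl)           # exponential decay, configurable half-life
    return base * reinforcement * decay

# A fresh critical correction towers over a six-week-old plain observation:
print(salience("correction", "critical", age_days=0))   # 12.0
print(salience("observation", "normal", age_days=42))   # 0.125
```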

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sparsion&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Runtime&lt;/span&gt;

&lt;span class="n"&gt;rt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Runtime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_memory.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Week 1
&lt;/span&gt;&lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decision&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Frontend framework: React&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;importance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Week 2
&lt;/span&gt;&lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;correction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Switching to Svelte — React bundle too large&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;importance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Query
&lt;/span&gt;&lt;span class="n"&gt;memories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;frontend&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (salience: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salience&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# [Hot] Switching to Svelte — React bundle too large (salience: 13.18)
# [Hot] Frontend framework: React (salience: 4.39)
&lt;/span&gt;
&lt;span class="c1"&gt;# Age everything
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sweep&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Forgot &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;forgotten&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; stale memories&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Under the hood
&lt;/h2&gt;

&lt;p&gt;Rust core, Python bindings via PyO3/maturin, SQLite for storage. No model dependency — salience scoring is heuristic for now.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rust core
  ├── Event store (SQLite)
  ├── Salience scorer
  ├── Tier manager (hot/warm/cold)
  ├── Decay engine
  └── Ranked retrieval
       ↓
  PyO3 → Python SDK (pip install sparsion)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tests: 12 Rust unit, 5 integration (deterministic time via MockClock), 4 Python end-to-end.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's in v0.1
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Temporal decay with configurable half-life&lt;/li&gt;
&lt;li&gt;Reinforcement through repetition&lt;/li&gt;
&lt;li&gt;Importance hints (low/normal/high/critical)&lt;/li&gt;
&lt;li&gt;Event type weighting — corrections &amp;gt; decisions &amp;gt; errors &amp;gt; actions &amp;gt; observations&lt;/li&gt;
&lt;li&gt;Tier migration and a forgetting sweep through storage&lt;/li&gt;
&lt;li&gt;Python SDK&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's coming
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Plugging into real agent workflows&lt;/li&gt;
&lt;li&gt;Bigger benchmarks, longer time horizons&lt;/li&gt;
&lt;li&gt;Contradiction-aware updates&lt;/li&gt;
&lt;li&gt;LangChain memory backend&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're building agents and keep hitting stale context problems, I'd like to hear about your use case.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Sparsion Runtime&lt;/strong&gt; — &lt;a href="https://github.com/HighpassStudio/sparsion-runtime" rel="noopener noreferrer"&gt;github.com/HighpassStudio/sparsion-runtime&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>rust</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Your logs are still a text file</title>
      <dc:creator>Highpass Studio</dc:creator>
      <pubDate>Sun, 22 Mar 2026 04:52:12 +0000</pubDate>
      <link>https://dev.to/highpass_studio_382ce5641/your-logs-are-still-a-text-file-3gl4</link>
      <guid>https://dev.to/highpass_studio_382ce5641/your-logs-are-still-a-text-file-3gl4</guid>
      <description>&lt;h2&gt;
  
  
  Your logs are still a text file
&lt;/h2&gt;

&lt;p&gt;Every incident investigation starts the same way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;zgrep &lt;span class="s2"&gt;"user_id=51013"&lt;/span&gt; logs/&lt;span class="k"&gt;*&lt;/span&gt;.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;...and you wait.&lt;/p&gt;

&lt;p&gt;30 seconds. A minute.&lt;/p&gt;

&lt;p&gt;You tweak the query. Run it again. Another minute.&lt;/p&gt;

&lt;p&gt;Same files. Same decompression. Same full scan.&lt;/p&gt;

&lt;p&gt;After ten queries, you've spent ten minutes rereading the same data.&lt;/p&gt;

&lt;h2&gt;
  
  
  What if grep could remember?
&lt;/h2&gt;

&lt;p&gt;I built xgrep for that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-time: build index (~2 min for 1.7GB)&lt;/span&gt;
xgrep &lt;span class="nt"&gt;--build-index&lt;/span&gt; logs/&lt;span class="k"&gt;*&lt;/span&gt;.gz

&lt;span class="c"&gt;# Every query after that&lt;/span&gt;
xgrep &lt;span class="s2"&gt;"user_id=51013"&lt;/span&gt; logs/&lt;span class="k"&gt;*&lt;/span&gt;.gz    &lt;span class="c"&gt;# 25ms&lt;/span&gt;
xgrep &lt;span class="s2"&gt;"ERROR"&lt;/span&gt; logs/&lt;span class="k"&gt;*&lt;/span&gt;.gz            &lt;span class="c"&gt;# 25ms&lt;/span&gt;
xgrep &lt;span class="s2"&gt;"timeout.*conn"&lt;/span&gt; logs/&lt;span class="k"&gt;*&lt;/span&gt;.gz    &lt;span class="c"&gt;# 25ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of decompressing everything every time, xgrep:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;splits logs into 64KB blocks&lt;/li&gt;
&lt;li&gt;builds a bloom filter per block&lt;/li&gt;
&lt;li&gt;only reads blocks that might match&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything else is skipped.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The result: read 1% of the data instead of 100%.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's the whole idea.&lt;/p&gt;
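&lt;p&gt;A rough picture of the mechanism, token-level and simplified (the real index surely differs in hashing and layout):&lt;/p&gt;

```python
import hashlib

BLOCK = 64 * 1024  # 64KB blocks, matching the post

def bloom_bits(token, num_bits=8192, k=4):
    """k bit positions for one token (slices of a sha256 digest)."""
    digest = hashlib.sha256(token.encode()).digest()
    return [int.from_bytes(digest[i * 4:(i + 1) * 4], "big") % num_bits
            for i in range(k)]

def build_index(data: bytes):
    """Split into fixed-size blocks; record a bloom bitset per block."""
    index = []
    for off in range(0, len(data), BLOCK):
        bits = set()
        # Whitespace tokens for simplicity; a real index would also handle
        # tokens that straddle block boundaries.
        for tok in data[off:off + BLOCK].split():
            bits.update(bloom_bits(tok.decode(errors="ignore")))
        index.append((off, bits))
    return index

def candidate_blocks(index, query):
    """A block can match only if every hash bit of the query is set."""
    want = bloom_bits(query)
    return [off for off, bits in index if all(b in bits for b in want)]
```

Only the candidate blocks ever get read and verified; everything else is skipped without touching the bytes.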

&lt;h2&gt;
  
  
  Benchmarks on real production logs
&lt;/h2&gt;

&lt;p&gt;Datasets from &lt;a href="https://github.com/logpai/loghub" rel="noopener noreferrer"&gt;LogHub&lt;/a&gt;: Hadoop (HDFS), Blue Gene/L (BGL), and Spark.&lt;/p&gt;

&lt;h3&gt;
  
  
  HDFS — 7.5GB decompressed
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;xgrep&lt;/th&gt;
&lt;th&gt;zgrep&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Block ID&lt;/td&gt;
&lt;td&gt;30ms&lt;/td&gt;
&lt;td&gt;27s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;913x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WARN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;28ms&lt;/td&gt;
&lt;td&gt;25s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;907x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;INFO&lt;/code&gt; (very common)&lt;/td&gt;
&lt;td&gt;23ms&lt;/td&gt;
&lt;td&gt;28s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,217x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  BGL — 5.0GB decompressed
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;xgrep&lt;/th&gt;
&lt;th&gt;zgrep&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Node ID&lt;/td&gt;
&lt;td&gt;26ms&lt;/td&gt;
&lt;td&gt;17s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;655x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FATAL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;25ms&lt;/td&gt;
&lt;td&gt;17s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;708x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Spark — 3,852 files
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;xgrep&lt;/th&gt;
&lt;th&gt;zgrep&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Executor&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;5s&lt;/td&gt;
&lt;td&gt;10m&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;118x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ERROR&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;2.7s&lt;/td&gt;
&lt;td&gt;10m&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;220x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These are repeated-query (cached) results.&lt;/p&gt;

&lt;p&gt;First query is still ~18x faster than zgrep (parallel decompression), but the real win is every query after that — which is how incident debugging actually works.&lt;/p&gt;




&lt;h2&gt;
  
  
  JSON logs
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;zcat logs.json.gz | jq 'select(.user_id == "42042")'&lt;/code&gt; is the standard workflow. It works. It's also brutally slow — full decompression, full JSON parse, zero skipping, on every query.&lt;/p&gt;

&lt;p&gt;xgrep's &lt;code&gt;-j&lt;/code&gt; flag does field-aware search on NDJSON/JSONL logs:&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmark
&lt;/h3&gt;

&lt;p&gt;1M NDJSON lines, 244MB uncompressed, 22MB gzip.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;Matches&lt;/th&gt;
&lt;th&gt;Baseline&lt;/th&gt;
&lt;th&gt;xgrep -j&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;th&gt;Block skip&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;user_id=42042&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;40.6s&lt;/td&gt;
&lt;td&gt;0.22s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;188x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;97%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;status=503&lt;/td&gt;
&lt;td&gt;111,130&lt;/td&gt;
&lt;td&gt;40.6s&lt;/td&gt;
&lt;td&gt;1.75s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;level=error status=503&lt;/td&gt;
&lt;td&gt;15,838&lt;/td&gt;
&lt;td&gt;40.6s&lt;/td&gt;
&lt;td&gt;1.71s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Baseline: &lt;code&gt;zcat logs.json.gz | jq 'select(...)'&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Every count matches jq exactly. 9/9, 111,130/111,130, 15,838/15,838. No approximations, no missed lines.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;p&gt;During index build, xgrep hashes three things per JSON field into each block's bloom filter: the field name, the value, and the field-value pair. When you query &lt;code&gt;user_id=42042&lt;/code&gt;, the bloom can distinguish "42042 appears in the user_id field" from "42042 appears somewhere in the line." That precision is what drives the skip rate.&lt;/p&gt;
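&lt;p&gt;That tokenization scheme is simple to sketch (illustrative Python, not xgrep's actual Rust internals):&lt;/p&gt;

```python
import json

def field_tokens(line: str):
    """Hash inputs per JSON line: the field name, the value, and the pair."""
    tokens = []
    for key, val in json.loads(line).items():
        v = str(val)
        tokens.extend([key, v, f"{key}={v}"])
    return tokens

# "42042 in the user_id field" and "42042 somewhere in the line"
# now produce different index entries:
print(field_tokens('{"user_id": "42042", "status": 200}'))
# ['user_id', '42042', 'user_id=42042', 'status', '200', 'status=200']
```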


&lt;h3&gt;
  
  
  Why it's still fast at 0% skip
&lt;/h3&gt;

&lt;p&gt;The selective query (188x) is the classic block-pruning win — 97% of blocks never get read. But the broad queries are the interesting result. At 0% skip, every block is searched, and xgrep is still 23x faster than &lt;code&gt;zcat | jq&lt;/code&gt;. That's because jq parses every line into a full JSON AST and evaluates an expression tree. xgrep does a targeted field lookup — no AST, no expression evaluator, just hash check then verify.&lt;/p&gt;

&lt;p&gt;Two advantages compound: I/O avoidance (skip blocks) and CPU avoidance (lighter evaluation). Even when the first one doesn't apply, the second one still delivers.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works (short version)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Index&lt;/strong&gt;: decompress once, split into blocks, build bloom filters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query&lt;/strong&gt;: check filters, read only candidate blocks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution&lt;/strong&gt;: memory-mapped, OS loads only what's needed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key metric isn't speed. It's &lt;strong&gt;bytes touched per query.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;zgrep: 100% every time&lt;/li&gt;
&lt;li&gt;xgrep: 0.1-1%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's why the gap grows with data size.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs (honest)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cache size&lt;/strong&gt;: ~5x compressed size (stores decompressed data)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;First run&lt;/strong&gt;: ~2 min index build (amortized quickly)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not universal grep&lt;/strong&gt;: built for compressed logs + repeated search&lt;/li&gt;
&lt;li&gt;For plain text: use ripgrep.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Who this is for
&lt;/h2&gt;

&lt;p&gt;If you've ever:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;waited on &lt;code&gt;zgrep&lt;/code&gt; during an incident&lt;/li&gt;
&lt;li&gt;rerun the same search 10 times&lt;/li&gt;
&lt;li&gt;dealt with rotated &lt;code&gt;.gz&lt;/code&gt; logs&lt;/li&gt;
&lt;li&gt;wanted log-platform speed without log-platform overhead&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...then xgrep was built for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install &lt;/span&gt;xgrep-cli
xgrep &lt;span class="s2"&gt;"ERROR"&lt;/span&gt; logs/&lt;span class="k"&gt;*&lt;/span&gt;.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/HighpassStudio/xgrep" rel="noopener noreferrer"&gt;github.com/HighpassStudio/xgrep&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deep dive
&lt;/h2&gt;

&lt;p&gt;Architecture + benchmark methodology: &lt;a href="https://github.com/HighpassStudio/xgrep/blob/main/ARCHITECTURE.md" rel="noopener noreferrer"&gt;ARCHITECTURE.md&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;xgrep is Apache-2.0 licensed. Built with Rust, rayon, memchr, and flate2.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cli</category>
      <category>performance</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Stop decompressing entire archives to get one file — introducing ARCX</title>
      <dc:creator>Highpass Studio</dc:creator>
      <pubDate>Thu, 19 Mar 2026 20:43:35 +0000</pubDate>
      <link>https://dev.to/highpass_studio_382ce5641/stop-decompressing-entire-archives-to-get-one-file-introducing-arcx-5dhn</link>
      <guid>https://dev.to/highpass_studio_382ce5641/stop-decompressing-entire-archives-to-get-one-file-introducing-arcx-5dhn</guid>
      <description>&lt;p&gt;Most archive formats make a simple task unnecessarily expensive: you need one file, so you download and decompress everything.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;ARCX&lt;/strong&gt;, a compressed archive format designed to fix that.&lt;/p&gt;

&lt;p&gt;ARCX combines cross-file compression (like tar+zstd) with indexed random access (like zip), so you can retrieve a single file from a large archive in milliseconds without decompressing the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/getarcx/arcx" rel="noopener noreferrer"&gt;https://github.com/getarcx/arcx&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Install:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install &lt;/span&gt;arcx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Benchmark results
&lt;/h2&gt;

&lt;p&gt;Across 5 real-world datasets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~7ms to retrieve a file from a ~200MB archive&lt;/li&gt;
&lt;li&gt;up to 200x less data read vs tar+zstd&lt;/li&gt;
&lt;li&gt;compression within ~3% of tar+zstd&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;ARCX Bytes Read&lt;/th&gt;
&lt;th&gt;TAR+ZSTD Bytes Read&lt;/th&gt;
&lt;th&gt;Reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Python ML&lt;/td&gt;
&lt;td&gt;326 KB&lt;/td&gt;
&lt;td&gt;63.1 MB&lt;/td&gt;
&lt;td&gt;198x less&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build Artifacts&lt;/td&gt;
&lt;td&gt;714 KB&lt;/td&gt;
&lt;td&gt;140.4 MB&lt;/td&gt;
&lt;td&gt;202x less&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Modern systems don't need entire archives. They need one file, immediately.&lt;/p&gt;

&lt;p&gt;This shows up in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD pipelines (artifacts)&lt;/li&gt;
&lt;li&gt;cloud storage (partial retrieval)&lt;/li&gt;
&lt;li&gt;large codebases&lt;/li&gt;
&lt;li&gt;package registries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ARCX reduces archive access to a manifest lookup, one block read, and one block decompress.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;ARCX uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;block-based compression&lt;/li&gt;
&lt;li&gt;a binary manifest index&lt;/li&gt;
&lt;li&gt;direct offset reads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of scanning or decompressing the full archive:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Look up the file in the index&lt;/li&gt;
&lt;li&gt;Seek to the relevant block&lt;/li&gt;
&lt;li&gt;Decompress only that block&lt;/li&gt;
&lt;/ol&gt;
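&lt;p&gt;Those three steps map onto a format sketch like this toy layout (one file per block for brevity; it is not the real ARCX spec, which also compresses across files):&lt;/p&gt;

```python
import io, json, struct, zlib

# Toy layout, not the real ARCX format:
#   [block 0][block 1]...[JSON manifest][8-byte manifest length]

def write_archive(files: dict) -> bytes:
    out, manifest, off = io.BytesIO(), {}, 0
    for name, data in files.items():
        block = zlib.compress(data)            # one file per block, for brevity
        manifest[name] = {"offset": off, "size": len(block)}
        out.write(block)
        off += len(block)
    m = json.dumps(manifest).encode()
    out.write(m)
    out.write(struct.pack(">Q", len(m)))       # manifest goes at the end (see Tradeoffs)
    return out.getvalue()

def read_one(archive: bytes, name: str) -> bytes:
    (mlen,) = struct.unpack(">Q", archive[-8:])
    manifest = json.loads(archive[-8 - mlen:-8])
    entry = manifest[name]                        # 1. look up the file in the index
    start = entry["offset"]
    block = archive[start:start + entry["size"]]  # 2. seek to the relevant block
    return zlib.decompress(block)                 # 3. decompress only that block
```

Retrieval cost here is a manifest parse plus one block decompress, independent of how many other files the archive holds.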

&lt;h2&gt;
  
  
  Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Compression&lt;/th&gt;
&lt;th&gt;Selective Access&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ZIP&lt;/td&gt;
&lt;td&gt;weaker&lt;/td&gt;
&lt;td&gt;fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tar+zstd&lt;/td&gt;
&lt;td&gt;strong&lt;/td&gt;
&lt;td&gt;slow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ARCX&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;strong&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;fast&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;p&gt;Unlike tar, ARCX is not designed for streaming: the archive must be complete before reading, because the manifest is written at the end.&lt;/p&gt;

&lt;h2&gt;
  
  
  Current limitations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Remote/S3 range-read workflows not fully benchmarked yet&lt;/li&gt;
&lt;li&gt;Metadata/index overhead still being optimized for very large file counts&lt;/li&gt;
&lt;li&gt;Full extraction benchmarks in Rust are still in progress&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Feedback
&lt;/h2&gt;

&lt;p&gt;Still early — feedback welcome.&lt;/p&gt;

</description>
      <category>performance</category>
      <category>rust</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
  </channel>
</rss>
