<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mario Gutierrez</title>
    <description>The latest articles on DEV Community by Mario Gutierrez (@terrizoaguimor).</description>
    <link>https://dev.to/terrizoaguimor</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3880854%2F56d8a0a4-d491-4869-8f35-908d1b3d2b95.jpeg</url>
      <title>DEV Community: Mario Gutierrez</title>
      <link>https://dev.to/terrizoaguimor</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/terrizoaguimor"/>
    <language>en</language>
    <item>
      <title>Why I'm building Hyphae: provenance over prediction (and the 3-line baseline that tied it)</title>
      <dc:creator>Mario Gutierrez</dc:creator>
      <pubDate>Fri, 29 May 2026 14:55:20 +0000</pubDate>
      <link>https://dev.to/terrizoaguimor/why-im-building-hyphae-provenance-over-prediction-and-the-3-line-baseline-that-tied-it-2e32</link>
      <guid>https://dev.to/terrizoaguimor/why-im-building-hyphae-provenance-over-prediction-and-the-3-line-baseline-that-tied-it-2e32</guid>
      <description>&lt;p&gt;A few months ago I set out to build a cognitive substrate without a large language model in the answering path. I had a thesis I liked, a Rust workspace, and a lot of conviction.&lt;/p&gt;

&lt;p&gt;Then I wrote a three-line baseline that tied it on every metric I cared about.&lt;/p&gt;

&lt;p&gt;This is the story of why that was the best thing that happened to the project — and why I'm still building it, just pointed at a sharper target.&lt;/p&gt;

&lt;h2&gt;The problem I actually care about&lt;/h2&gt;

&lt;p&gt;When a language model answers a grounded question, it &lt;em&gt;paraphrases&lt;/em&gt; its sources. That paraphrase is fluent, often correct, and — this is the part that bothers me — &lt;strong&gt;impossible to bind back to its source byte-for-byte&lt;/strong&gt;. You can cite a document. You cannot prove, after the fact, that the words in the answer are the words that were stored, unaltered, at a known position.&lt;/p&gt;

&lt;p&gt;For a chatbot that doesn't matter. For anything that has to be &lt;em&gt;audited&lt;/em&gt; — a compliance trail, a medical or legal memory, an agent acting on your behalf over months — it matters a lot. "Trust me, I read the docs" is not a property you can verify.&lt;/p&gt;

&lt;p&gt;So I started building &lt;strong&gt;Hyphae&lt;/strong&gt;: a substrate that answers by emitting &lt;strong&gt;byte-identical quotations&lt;/strong&gt; of stored memory fragments, over a SHA-256 hash-chained journal, with no LLM in the cognition path. Rust, CPU-only, a single binary.&lt;/p&gt;

&lt;p&gt;The shape of it is simple:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Every stored fragment is appended verbatim to a hash chain.&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;journal&lt;/span&gt;&lt;span class="nf"&gt;.append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"memory_op"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fragment&lt;/span&gt;&lt;span class="nf"&gt;.bytes&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// An answer span is a byte-identical quotation of a stored fragment.&lt;/span&gt;
&lt;span class="c1"&gt;// Tamper with any historical entry and the recomputed chain breaks&lt;/span&gt;
&lt;span class="c1"&gt;// at the next link — verify() localises exactly where.&lt;/span&gt;
&lt;span class="n"&gt;journal&lt;/span&gt;&lt;span class="nf"&gt;.verify&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing here is cryptographically novel. Hash-chained logs are old and well understood — Haber &amp;amp; Stornetta in 1991, Merkle before that, Certificate Transparency, git. I want to be honest about that up front, because the interesting part isn't the chain.&lt;/p&gt;

&lt;h2&gt;The day a three-line baseline tied me&lt;/h2&gt;

&lt;p&gt;I wanted to show Hyphae was &lt;em&gt;better&lt;/em&gt; than an LLM+RAG pipeline at grounded answering. So I built the comparison properly: a real retriever, reranking, six models across three retrieval modes, two corpora, twelve metrics.&lt;/p&gt;

&lt;p&gt;Then a reviewer asked the obvious question: &lt;em&gt;what does a trivial baseline score?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So I wrote &lt;code&gt;echo&lt;/code&gt; — a few lines that just print the retrieved fragment back. It tied Hyphae on every correctness and grounding metric. So did &lt;code&gt;echo + journal&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That stings for about a day. Then it becomes the whole point.&lt;/p&gt;

&lt;p&gt;The measured correctness and grounding were never properties of &lt;em&gt;my system&lt;/em&gt;. They are properties of &lt;strong&gt;verbatim quotation&lt;/strong&gt; itself. If you emit a stored span unchanged, of course it's "grounded" — it &lt;em&gt;is&lt;/em&gt; the source. Hyphae's seventeen subsystems weren't what made the answers auditable. The verbatim-emission-over-a-journal layer was. And that layer is &lt;strong&gt;addable to any extractive retrieval system&lt;/strong&gt; — it isn't Hyphae-specific at all.&lt;/p&gt;

&lt;p&gt;So I stopped claiming Hyphae was a better brain and started claiming something narrower and, I think, truer:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Verifiable provenance is a property you can add to grounded retrieval. A paraphrase destroys byte-level bindability to its source; a verbatim quotation preserves it, and a hash chain makes that binding independently auditable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The contribution isn't the hash chain. It's the &lt;em&gt;observation&lt;/em&gt;, and the &lt;em&gt;measurement&lt;/em&gt; of it against eighteen LLM configurations and a tamper-detection benchmark.&lt;/p&gt;

&lt;h2&gt;Closing the gaps, honestly&lt;/h2&gt;

&lt;p&gt;Once you claim "tamper-evident," people who know what they're doing immediately ask where it breaks. Good. The threat model is the product.&lt;/p&gt;

&lt;p&gt;A bare hash chain catches a store-only attacker who edits a record in place. It does &lt;strong&gt;not&lt;/strong&gt; catch a &lt;em&gt;chain-aware&lt;/em&gt; attacker who recomputes every hash forward and rewrites the head, because the head lives in the same store. So I anchor the head with an Ed25519 signature whose signing key lives outside the store; an attacker who only controls the store can't re-sign a rewritten head. That closes it.&lt;/p&gt;

&lt;p&gt;But a single signature pins &lt;em&gt;a&lt;/em&gt; valid head, not &lt;em&gt;the latest&lt;/em&gt; one. Every head the journal ever had was, at its time, legitimately signed. An attacker can roll back to an earlier state and replay its genuine-but-stale anchor, and a lone signature check accepts it. So the heads get published to an &lt;strong&gt;append-only, hash-chained ledger&lt;/strong&gt;, and an auditor checks the current head against the ledger's &lt;em&gt;tail&lt;/em&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// A single signature pins *a* valid head.&lt;/span&gt;
&lt;span class="c1"&gt;// An append-only ledger pins *the latest* one.&lt;/span&gt;
&lt;span class="nf"&gt;verify_fresh_head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;current_head&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ledger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;verifying_key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// rollback rejected&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the pattern from Certificate Transparency and git, applied to memory provenance: the value isn't the chain, it's publishing the head to a monotonic log that third parties can compare.&lt;/p&gt;

&lt;p&gt;And I keep a column for what I &lt;em&gt;haven't&lt;/em&gt; closed: a store that withholds later ledger entries is only caught once an auditor gets the true tail from an external witness (a timestamp authority, a gossiped tree head). That's deployment work, and I'd rather write it down than pretend it's solved.&lt;/p&gt;

&lt;h2&gt;Where this is going&lt;/h2&gt;

&lt;p&gt;The direction got clearer the moment the echo baseline humbled me. I'm not building a better answer engine. I'm building &lt;strong&gt;provenance as a first-class, measurable property of grounded AI&lt;/strong&gt;, in the open. Concretely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A provenance benchmark.&lt;/strong&gt; Correctness benchmarks compare RAG systems on answer quality. There's no standard way to compare &lt;em&gt;verifiable-generation&lt;/em&gt; systems on whether tampering is detectable and localisable. So I built one: a tampering taxonomy, an adversary-capability matrix, and a scoring protocol any system can plug into. That's the axis I think actually matters for AI you have to trust over time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provenance as an addable layer.&lt;/strong&gt; The layer doesn't depend on which realizer produces the answer, and that independence is the feature, not a caveat. The goal is for any extractive retriever to be able to adopt the layer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External witnessing and key rotation&lt;/strong&gt; to harden the ledger for real deployments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The bigger picture&lt;/strong&gt; — this is one piece of &lt;a href="https://celiums.ai" rel="noopener noreferrer"&gt;Celiums&lt;/a&gt;, where the bet is that &lt;em&gt;memory&lt;/em&gt; is the foundation for AI agents you can actually audit, not an afterthought bolted onto a model.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;It's all open&lt;/h2&gt;

&lt;p&gt;The substrate, the LLM+RAG comparator, every result envelope, the tamper-detection experiment, the provenance benchmark, and the full preprint are public. Code is Apache-2.0; the docs, corpora, and preprint are CC-BY-4.0.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Code:&lt;/strong&gt; &lt;a href="https://github.com/terrizoaguimor/hyphae-v2" rel="noopener noreferrer"&gt;https://github.com/terrizoaguimor/hyphae-v2&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preprint (Zenodo DOI):&lt;/strong&gt; &lt;a href="https://doi.org/10.5281/zenodo.20436643" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.20436643&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'm a solo, self-taught founder building this in public, which means the dead ends are public too — the echo baseline being the best example. If you work on retrieval, tamper-evident logs, or grounded generation, I'd genuinely like to hear where you think this breaks. The threat model only gets better when someone smarter than me attacks it.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>ai</category>
      <category>opensource</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>I Have ADHD and I Keep Losing Context — So I Taught My AI to Remember for Me</title>
      <dc:creator>Mario Gutierrez</dc:creator>
      <pubDate>Fri, 17 Apr 2026 13:38:29 +0000</pubDate>
      <link>https://dev.to/terrizoaguimor/i-have-adhd-and-i-keep-losing-context-so-i-taught-my-ai-to-remember-for-me-afl</link>
      <guid>https://dev.to/terrizoaguimor/i-have-adhd-and-i-keep-losing-context-so-i-taught-my-ai-to-remember-for-me-afl</guid>
      <description>&lt;p&gt;I'm going to be honest about something that most people in tech don't talk about openly.&lt;/p&gt;

&lt;p&gt;I have ADHD. I'm 40. I'm a self-taught developer with no CS degree. And I lose context constantly.&lt;/p&gt;

&lt;p&gt;Not in the "oh I forgot where I put my keys" way. In the "I just spent 3 hours deep in a codebase, got interrupted by a Slack message, and now I genuinely cannot remember what I was doing or why" way.&lt;/p&gt;

&lt;p&gt;If you have ADHD, you know exactly what I'm talking about. If you don't — imagine your browser crashing and losing 47 tabs. That panic? That's every Tuesday for me.&lt;/p&gt;

&lt;h2&gt;The AI Paradox for ADHD Brains&lt;/h2&gt;

&lt;p&gt;Here's the thing nobody warned me about: &lt;strong&gt;AI coding tools made my ADHD worse.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not better. Worse.&lt;/p&gt;

&lt;p&gt;Why? Because they're stateless. Every session is a blank slate. And for someone who already struggles with working memory, having to rebuild context from scratch every time I open Claude Code or ChatGPT is genuinely exhausting.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Me at 9am: "I'm working on a Next.js app with Supabase auth, 
           JWT refresh tokens, the user table has these columns..."

Me at 2pm: "I'm working on a Next.js app with Supabase auth,
           JWT refresh tokens, the user table has these columns..."

Me at 9pm: "I'm working on a Next.js app with Supabase auth..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I type the same context paragraph 5-8 times a day. Every. Single. Day.&lt;/p&gt;

&lt;p&gt;For a neurotypical developer, this is annoying. For me, each repetition drains the limited executive function I have. By the third time, I'm frustrated. By the fifth, I'm done for the day.&lt;/p&gt;

&lt;p&gt;The tool that's supposed to help me think is making me spend my mental energy on &lt;em&gt;remembering what to tell it&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;The Breaking Point&lt;/h2&gt;

&lt;p&gt;It happened on a Thursday at midnight. I was debugging a race condition. I'd been at it for two hours. Claude gave me a suggestion I'd already tried — because it didn't know I'd already tried it. Because it doesn't remember anything.&lt;/p&gt;

&lt;p&gt;I typed (and I'm not proud of this):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I ALREADY TRIED THAT. WE DISCUSSED THIS 20 MINUTES AGO IN THE PREVIOUS SESSION."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then I realized: I was yelling at a machine for not having memory. And the machine was right to not remember — it was designed that way. Every AI tool is designed that way.&lt;/p&gt;

&lt;p&gt;That's when something clicked. Not the bug (that took another hour). Something else:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem isn't AI intelligence. It's AI amnesia.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;What I Actually Needed&lt;/h2&gt;

&lt;p&gt;I sat down and wrote a list of what would actually help my ADHD brain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Don't make me repeat context.&lt;/strong&gt; Ever. If I told you once, you should know it forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remember what worked AND what didn't.&lt;/strong&gt; So you don't suggest the thing I already tried and rejected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Know when I'm frustrated.&lt;/strong&gt; If I've been debugging the same thing for 2 hours at midnight, maybe don't give me a 500-word explanation. Just give me the fix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Carry context across projects.&lt;/strong&gt; My coding style doesn't change when I switch repos. My preferences don't reset.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forget things naturally.&lt;/strong&gt; Not everything is worth remembering forever. That random CSS hack from 6 months ago? Let it fade.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I looked for tools that did this. I found vector databases pretending to be memory. I found RAG pipelines that retrieve documents but don't understand context. I found "memory" features that are just conversation logs with a search bar.&lt;/p&gt;

&lt;p&gt;Nothing actually modeled how memory works. How &lt;em&gt;my&lt;/em&gt; brain works (when it works).&lt;/p&gt;

&lt;h2&gt;So I Started Building&lt;/h2&gt;

&lt;p&gt;I'm not going to pitch you anything. But I will tell you what I learned, because maybe it's useful to someone else.&lt;/p&gt;

&lt;p&gt;I started reading neuroscience papers. Real ones. Not Medium articles — actual research on how human memory works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Ebbinghaus forgetting curve&lt;/strong&gt; — memories decay exponentially unless reinforced. Your brain doesn't delete things; they just fade. This is the opposite of how databases work (store forever, delete explicitly).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The PAD emotional model&lt;/strong&gt; — Pleasure, Arousal, Dominance. Psychologists use three dimensions to map any emotional state. Turns out, emotions are not decorations on memory — they're part of the retrieval mechanism. You remember emotional events better. Your brain literally prioritizes them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Circadian rhythms affect cognition&lt;/strong&gt; — you're not the same developer at 9am and 11pm. Your ability to focus, recall, and make decisions fluctuates. Any system that treats you the same at all times is ignoring biology.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
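
&lt;p&gt;The exponential decay in the first bullet can be sketched in a few lines of Rust. This is only a minimal illustration, assuming the common simplified form &lt;code&gt;R = exp(-t / S)&lt;/code&gt;; the function names, the stability parameter &lt;code&gt;S&lt;/code&gt;, and the reinforcement rule are hypothetical and are not taken from the tool described in this post.&lt;/p&gt;

```rust
/// Retention between 0 and 1 after `elapsed_days`, using the simplified
/// exponential form R = exp(-t / S).
/// `stability_days` (S) is a hypothetical per-memory parameter:
/// the larger it is, the slower the memory fades.
fn retention(elapsed_days: f64, stability_days: f64) -> f64 {
    (-elapsed_days / stability_days).exp()
}

/// Assumption for illustration: each reinforcement multiplies the
/// stability by `growth`, so a reinforced memory decays more slowly.
fn reinforce(stability_days: f64, growth: f64) -> f64 {
    stability_days * growth
}

/// A memory counts as faded once its retention is below `threshold`.
/// Nothing is deleted; a faded memory is simply ranked lower.
fn is_faded(elapsed_days: f64, stability_days: f64, threshold: f64) -> bool {
    threshold > retention(elapsed_days, stability_days)
}
```

&lt;p&gt;With a stability of 7 days, retention is roughly 0.37 after one week. Doubling the stability through one reinforcement raises it to roughly 0.61 for the same gap, which is the "fades unless reinforced" behaviour in miniature.&lt;/p&gt;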

&lt;p&gt;I started modeling these concepts in code. Not as a product — as a personal tool. Something that would sit between me and my AI tools and just... remember things for me.&lt;/p&gt;

&lt;p&gt;The first version was ugly. 200 lines of JavaScript and a JSON file. But it worked. I told it my preferences once, and the next day, they were still there. I told it about a debugging approach that failed, and it remembered that too.&lt;/p&gt;

&lt;p&gt;For the first time in years, I felt like my tools were adapting to me instead of the other way around.&lt;/p&gt;

&lt;h2&gt;What I Learned About Memory (and ADHD)&lt;/h2&gt;

&lt;p&gt;Building this taught me things about my own brain:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Context is not content.&lt;/strong&gt; There's a difference between "what was said" and "what matters." Most AI memory solutions store conversations. But what I need is &lt;em&gt;context&lt;/em&gt; — the decisions, the preferences, the patterns. Not the 400 lines of chat where we discussed them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Forgetting is a feature, not a bug.&lt;/strong&gt; My ADHD brain forgets things constantly, and yes, it's frustrating. But it also means I don't carry stale assumptions. I approach problems fresh. The best memory system isn't one that remembers everything — it's one that remembers the right things and lets the rest fade.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Emotions are metadata.&lt;/strong&gt; When I'm frustrated with a solution, that frustration is information. It means "this approach has a problem, even if the code technically works." Any memory system that strips emotional context is throwing away signal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. My ADHD is a design constraint, not a disability.&lt;/strong&gt; When I design for my own limitations — short working memory, need for external structure, sensitivity to context switches — I end up building things that work better for everyone. Turns out, neurotypical developers also hate re-explaining context to their AI tools. They're just more patient about it.&lt;/p&gt;

&lt;h2&gt;The Uncomfortable Truth&lt;/h2&gt;

&lt;p&gt;Here's what I think about late at night:&lt;/p&gt;

&lt;p&gt;Every major AI company is racing to build bigger models, longer context windows, better reasoning. And that's great. But none of them are building memory.&lt;/p&gt;

&lt;p&gt;Not real memory. Not the kind that persists, decays, carries emotion, and adapts to you over time.&lt;/p&gt;

&lt;p&gt;They're building incredibly smart assistants with permanent amnesia. And we've all just... accepted that?&lt;/p&gt;

&lt;p&gt;I didn't accept it. I couldn't. My brain wouldn't let me.&lt;/p&gt;

&lt;h2&gt;Where I Am Now&lt;/h2&gt;

&lt;p&gt;I'm a solo founder. Venezuelan, living in Medellín, Colombia. Zero employees. I run the company with AI agents (which is a whole other post).&lt;/p&gt;

&lt;p&gt;I've been working on this problem for months. Some days I feel like I'm building something important. Other days I feel like I'm a guy with ADHD who got hyperfocused on a niche problem and can't stop.&lt;/p&gt;

&lt;p&gt;Both are probably true.&lt;/p&gt;

&lt;p&gt;If you're a developer with ADHD — or honestly, any developer who's tired of being the memory system for your AI tools — I'd love to hear how you deal with it. What workarounds have you found? What do you wish your tools remembered?&lt;/p&gt;

&lt;p&gt;And if you're building AI tools: please, for the love of everything, add persistent memory. Your ADHD users will thank you. Your neurotypical users will thank you too; they just won't know why.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is my first post here. I'm Mario. I'll probably write about building things alone, neurodivergent entrepreneurship, and why AI tools should be designed for how brains actually work — not how we wish they worked.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>adhd</category>
      <category>productivity</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
