<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dhruv Sharma</title>
    <description>The latest articles on DEV Community by Dhruv Sharma (@illegalcall).</description>
    <link>https://dev.to/illegalcall</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3800167%2F190c7fae-6b6d-4fdf-9025-53b193057023.jpeg</url>
      <title>DEV Community: Dhruv Sharma</title>
      <link>https://dev.to/illegalcall</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/illegalcall"/>
    <language>en</language>
    <item>
      <title>How We Ensured API Keys Never Linger in RAM</title>
      <dc:creator>Dhruv Sharma</dc:creator>
      <pubDate>Wed, 25 Mar 2026 15:12:37 +0000</pubDate>
      <link>https://dev.to/illegalcall/how-we-ensured-api-keys-never-linger-in-ram-4ai6</link>
      <guid>https://dev.to/illegalcall/how-we-ensured-api-keys-never-linger-in-ram-4ai6</guid>
      <description>&lt;p&gt;Rust's ownership model cleans up memory automatically — but it doesn't &lt;em&gt;overwrite&lt;/em&gt; it. A dropped &lt;code&gt;String&lt;/code&gt; containing an API key still has its bytes sitting in physical RAM until something else claims that page. The &lt;code&gt;zeroize&lt;/code&gt; crate fixes that. Here's every pattern we used in a production secrets vault.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;When you store and retrieve API keys in a credentials vault, the sensitive bytes touch several places in memory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Argon2-derived encryption key (lives for the session)&lt;/li&gt;
&lt;li&gt;The raw key value as a &lt;code&gt;String&lt;/code&gt; (lives during add/retrieve operations)&lt;/li&gt;
&lt;li&gt;The master password from stdin (lives until validated)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rust's &lt;code&gt;drop&lt;/code&gt; frees the allocation, but nothing overwrites it: the allocator just marks the memory as reusable, and the bytes stay in place until something else writes over them. A memory dump, cold boot attack, or crash dump can recover the value seconds to minutes after &lt;code&gt;drop&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Patterns, Applied
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pattern 1 — &lt;code&gt;Zeroize&lt;/code&gt; on a custom struct with &lt;code&gt;Drop&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The encryption key is a fixed-size byte array stored in a struct that holds it for the lifetime of the vault session. We implement &lt;code&gt;Drop&lt;/code&gt; manually to ensure it's overwritten before the memory is released:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;LockedSecretboxKey&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;DERIVED_KEY_LEN&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;locked&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="nb"&gt;Drop&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;LockedSecretboxKey&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.key&lt;/span&gt;&lt;span class="nf"&gt;.zeroize&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// overwrite with zeros while the page is still locked&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.locked&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nn"&gt;libc&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;munlock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.key&lt;/span&gt;&lt;span class="nf"&gt;.as_ptr&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.cast&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.key&lt;/span&gt;&lt;span class="nf"&gt;.len&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



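&lt;p&gt;The &lt;code&gt;mlock&lt;/code&gt; call, made when the key is created, keeps the OS from swapping the page to disk; &lt;code&gt;Drop&lt;/code&gt; releases that lock with &lt;code&gt;munlock&lt;/code&gt;. &lt;code&gt;zeroize&lt;/code&gt; clears the key from RAM. Together they cover both swap and residual memory.&lt;/p&gt;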

&lt;p&gt;&lt;strong&gt;Pattern 2 — &lt;code&gt;Zeroizing&amp;lt;T&amp;gt;&lt;/code&gt; wrapper for automatic zeroing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For the decrypted credential returned to callers, we wrap the value type in &lt;code&gt;Zeroizing&amp;lt;String&amp;gt;&lt;/code&gt;. It implements &lt;code&gt;Drop&lt;/code&gt; internally — you get automatic zeroing without writing any &lt;code&gt;Drop&lt;/code&gt; code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DecryptedCredential&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Zeroizing&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// zeros itself on drop&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



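&lt;p&gt;Because &lt;code&gt;Zeroizing&lt;/code&gt; implements &lt;code&gt;Drop&lt;/code&gt;, the struct can never be &lt;code&gt;Copy&lt;/code&gt;. &lt;code&gt;Clone&lt;/code&gt; is not ruled out, though: &lt;code&gt;Zeroizing&amp;lt;T&amp;gt;&lt;/code&gt; implements &lt;code&gt;Clone&lt;/code&gt; whenever &lt;code&gt;T&lt;/code&gt; does, so we deliberately leave &lt;code&gt;Clone&lt;/code&gt; off &lt;code&gt;DecryptedCredential&lt;/code&gt; to avoid accidental duplication of secret values.&lt;/p&gt;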

&lt;p&gt;&lt;strong&gt;Pattern 3 — Explicit &lt;code&gt;.zeroize()&lt;/code&gt; before end of scope&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;During &lt;code&gt;add_credential&lt;/code&gt;, the raw key string lives as a local while we encrypt it. After encryption completes, we call &lt;code&gt;.zeroize()&lt;/code&gt; explicitly rather than waiting for the scope to end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="n"&gt;key_value&lt;/span&gt;&lt;span class="nf"&gt;.zeroize&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// explicit: zero now, not at brace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And during key derivation, we wrap the intermediate buffer in &lt;code&gt;Zeroizing::new()&lt;/code&gt; so even if &lt;code&gt;hash_password_into&lt;/code&gt; returns an error partway through, the partial derivation is wiped:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Zeroizing&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;vec!&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;DERIVED_KEY_LEN&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="n"&gt;argon2&lt;/span&gt;&lt;span class="nf"&gt;.hash_password_into&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;master_password&lt;/span&gt;&lt;span class="nf"&gt;.as_bytes&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;salt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;derived&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Pitfalls
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Drop order matters during error paths.&lt;/strong&gt; In &lt;code&gt;LockedSecretboxKey::new&lt;/code&gt;, if &lt;code&gt;mlock&lt;/code&gt; fails and &lt;code&gt;require_mlock&lt;/code&gt; is true, we call &lt;code&gt;key.zeroize()&lt;/code&gt; &lt;em&gt;before&lt;/em&gt; returning the error — because the key still exists in that stack frame and we would otherwise return with sensitive bytes uncleared.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;String&lt;/code&gt; is special.&lt;/strong&gt; The &lt;code&gt;Zeroize&lt;/code&gt; trait works on &lt;code&gt;String&lt;/code&gt; and &lt;code&gt;Vec&amp;lt;u8&amp;gt;&lt;/code&gt; because they own their heap allocation and can hand out mutable access to it. You cannot use it with &lt;code&gt;&amp;amp;str&lt;/code&gt;: a shared borrow is read-only, so there is nothing to zero through.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;Clone&lt;/code&gt;/&lt;code&gt;Copy&lt;/code&gt; on secret types is a footgun.&lt;/strong&gt; We assert in tests that &lt;code&gt;DecryptedCredential&lt;/code&gt; does not implement &lt;code&gt;Copy&lt;/code&gt; or &lt;code&gt;Clone&lt;/code&gt;. If it did, callers could silently duplicate the key into a plain &lt;code&gt;String&lt;/code&gt; that never gets zeroed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;zeroize&lt;/code&gt; is a one-crate solution to a real gap in Rust's memory model: ownership handles cleanup, but not sanitization. The three patterns cover the full lifecycle — long-lived session keys, short-lived plaintext values, and intermediate derivation buffers. Pair it with &lt;code&gt;mlock&lt;/code&gt; for anything that should never hit swap.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>security</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Tried Duplicating Layers in Qwen 3.5 to Reduce Hallucinations — Here's What Actually Happened</title>
      <dc:creator>Dhruv Sharma</dc:creator>
      <pubDate>Sun, 22 Mar 2026 11:32:41 +0000</pubDate>
      <link>https://dev.to/illegalcall/i-tried-duplicating-layers-in-qwen-35-to-reduce-hallucinations-heres-what-actually-happened-45fd</link>
      <guid>https://dev.to/illegalcall/i-tried-duplicating-layers-in-qwen-35-to-reduce-hallucinations-heres-what-actually-happened-45fd</guid>
      <description>&lt;p&gt;I read two papers about improving LLMs at inference time — no training, no fine-tuning, just architectural surgery. I tried applying these ideas to Qwen 3.5-9B. The initial results looked incredible (+245% reasoning!). Then I ran fair evaluations and discovered most of the improvement was an evaluation artifact. Here's the full story, including what I got wrong and what's genuinely new.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Research That Started This
&lt;/h2&gt;

&lt;p&gt;Two pieces of research motivated this experiment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;a href="https://dnhkng.github.io/posts/rys/" rel="noopener noreferrer"&gt;The RYS Method (David Ng)&lt;/a&gt;&lt;/strong&gt; — Transformers contain "reasoning circuits": contiguous blocks of 3-5 layers that act as indivisible cognitive units. Duplicate them in the GGUF file and the model gets a second pass through its reasoning pipeline. The &lt;a href="https://github.com/alainnothere/llm-circuit-finder" rel="noopener noreferrer"&gt;llm-circuit-finder&lt;/a&gt; toolkit validated this on Devstral-24B (+245% logical deduction on BBH) and Qwen2.5-32B (+23% reasoning). The boundaries are sharp — shift by one layer and the improvement vanishes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. &lt;a href="https://arxiv.org/abs/2512.01797" rel="noopener noreferrer"&gt;H-Neurons Paper (arXiv:2512.01797)&lt;/a&gt;&lt;/strong&gt; — Fewer than 0.1% of neurons in an LLM predict whether it will hallucinate. These neurons are baked in during pre-training and survive instruction tuning. Scaling their activations at inference time controls hallucination rates.&lt;/p&gt;

&lt;p&gt;Both papers point to the same idea: you can change model behavior at inference time by manipulating the architecture, without touching the weights. I wanted to try this on Qwen 3.5 — a newer, community-loved model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovery 1: Qwen 3.5's Hybrid Architecture Requires Cycle-Aligned Duplication
&lt;/h2&gt;

&lt;p&gt;Qwen 3.5 doesn't use standard transformer layers. It uses a repeating pattern of &lt;strong&gt;[DeltaNet, DeltaNet, DeltaNet, Attention]&lt;/strong&gt; — three linear attention layers followed by one full quadratic attention layer. This 4-layer cycle repeats 8 times for 32 total layers.&lt;/p&gt;

&lt;p&gt;I discovered this empirically. My first sweep tried duplicating 3-layer blocks. Every config crashed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Config (2,5) - 3 layers&lt;/span&gt;
llama_model_load: error: missing tensor &lt;span class="s1"&gt;'blk.6.ssm_conv1d.weight'&lt;/span&gt;

&lt;span class="c"&gt;# Config (4,7) - 3 layers&lt;/span&gt;
llama_model_load: error: missing tensor &lt;span class="s1"&gt;'blk.7.attn_q.weight'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The errors alternate: &lt;code&gt;ssm_conv1d&lt;/code&gt; (DeltaNet tensor) missing, then &lt;code&gt;attn_q&lt;/code&gt; (Attention tensor) missing. Duplicating 3 layers shifts the pattern, putting the wrong layer type at each position. But &lt;strong&gt;duplicating 4 layers (one complete cycle) works&lt;/strong&gt; — the pattern stays aligned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is new.&lt;/strong&gt; The original RYS work only tested standard transformers, where every layer has the same structure. As far as I can tell, nobody had tried it on a hybrid DeltaNet architecture before. The finding: layer duplication on hybrid models must respect the architectural cycle.&lt;/p&gt;
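
&lt;p&gt;A minimal sketch of why block length matters, assuming (as the error messages suggest) that the loader decides which tensors to expect from a layer's position in the file. The function names and the position-based model are illustrative assumptions, not part of llm-circuit-finder or llama.cpp:&lt;/p&gt;

```rust
// Illustrative model only: names and the position-based layout rule are
// assumptions, not code from llm-circuit-finder or llama.cpp.
const CYCLE_LEN: usize = 4; // [DeltaNet, DeltaNet, DeltaNet, Attention]
const NUM_LAYERS: usize = 32; // 8 cycles of 4 layers

/// The last slot of every 4-layer cycle is the full-attention layer.
fn is_attention(layer: usize) -> bool {
    layer % CYCLE_LEN == CYCLE_LEN - 1
}

/// Checks whether duplicating the half-open block (start, end) keeps every
/// layer type in the slot where the position-based pattern expects it.
fn duplication_keeps_pattern(start: usize, end: usize) -> bool {
    if start >= end || end > NUM_LAYERS {
        return false;
    }
    let len = end - start;
    // Positions before `end` are untouched. From `end` onward, the layer at
    // position `pos` is the original layer `pos - len`: first the copies of
    // the block, then the shifted remainder of the model.
    (end..NUM_LAYERS + len).all(|pos| is_attention(pos - len) == is_attention(pos))
}
```

&lt;p&gt;Under this model, &lt;code&gt;duplication_keeps_pattern(0, 4)&lt;/code&gt; holds, while the (2,5) and (4,7) blocks from the failed sweep do not: the inserted copies push an attention layer into a DeltaNet slot, or the reverse.&lt;/p&gt;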

&lt;h2&gt;
  
  
  Discovery 2: Initial Results Looked Amazing (But Were Wrong)
&lt;/h2&gt;

&lt;p&gt;I built custom probes (code generation, hallucination detection, reasoning) and swept all cycle-aligned configs. The initial results were dramatic:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Config&lt;/th&gt;
&lt;th&gt;Code Gen&lt;/th&gt;
&lt;th&gt;Hallucination Resistance&lt;/th&gt;
&lt;th&gt;Reasoning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;td&gt;7%&lt;/td&gt;
&lt;td&gt;54%&lt;/td&gt;
&lt;td&gt;29%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(0,4) layers 0-3 duplicated&lt;/td&gt;
&lt;td&gt;79%&lt;/td&gt;
&lt;td&gt;96%&lt;/td&gt;
&lt;td&gt;88%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Code generation went from 7% to 79%. Hallucination resistance nearly doubled. Reasoning tripled. I was convinced I'd found the reasoning circuit in Qwen 3.5.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovery 3: The Improvement Was an Evaluation Artifact
&lt;/h2&gt;

&lt;p&gt;Then I ran fair evaluations. The initial sweep used max_tokens of 512-1024. Qwen 3.5 emits its reasoning inside &lt;code&gt;&amp;lt;think&amp;gt;...&amp;lt;/think&amp;gt;&lt;/code&gt; tags before the answer, and those tokens count against the budget. With a limited budget:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base model&lt;/strong&gt;: Spent 500+ tokens thinking, ran out before producing an answer → empty response → scored 0&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RYS model&lt;/strong&gt;: Didn't use think tags, answered directly in 50-200 tokens → correct response → scored 1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "improvement" was measuring which model fits its answer within the token budget, not which model is smarter.&lt;/p&gt;
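
&lt;p&gt;The artifact can be reproduced with a toy scoring rule. This is a hedged sketch: the function name is hypothetical, and it assumes a response scores only when the full answer fits in the budget left after the reasoning tokens:&lt;/p&gt;

```rust
// Illustrative scoring model; the token counts used with it are assumed
// values chosen to match the ranges quoted above, not measured data.

/// Scores one probe: 1 for a correct answer that fits in the budget, else 0.
fn score(think_tokens: usize, answer_tokens: usize, max_tokens: usize, correct: bool) -> u32 {
    if max_tokens >= think_tokens + answer_tokens {
        u32::from(correct)
    } else {
        // Generation stops mid-thought: the answer is empty and scores 0,
        // even though the model may have been on track to answer correctly.
        0
    }
}
```

&lt;p&gt;For example, with a 512-token budget a response that needs 500 thinking tokens plus 100 answer tokens scores 0, while a direct 150-token answer scores 1. At 4096 tokens both score 1, so the gap disappears without either model changing.&lt;/p&gt;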

&lt;p&gt;When I re-ran with max_tokens=4096 (fair for both):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Config&lt;/th&gt;
&lt;th&gt;Code Gen&lt;/th&gt;
&lt;th&gt;Hallucination Resistance&lt;/th&gt;
&lt;th&gt;Reasoning&lt;/th&gt;
&lt;th&gt;Overall&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;73.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(0,4)&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;80.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(4,8)&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;73.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(8,12)&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;40.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(12,16)&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;td&gt;80%&lt;/td&gt;
&lt;td&gt;46.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(16,20)&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;46.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(20,24)&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;73.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The real improvement from (0,4) is &lt;strong&gt;+6.7 percentage points overall&lt;/strong&gt; (73.3% to 80.0%), not the +286% from the flawed evaluation. Of the six configs in the table, three hurt the model, two tie the baseline, and only (0,4) beats it. And the baseline reasoning score is 100%, not 29%.&lt;/p&gt;
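
&lt;p&gt;The Overall column in the table is consistent with an unweighted mean of the three probe scores. A small sketch of that arithmetic, with hypothetical function names, which also separates a gain in percentage points from a relative gain:&lt;/p&gt;

```rust
// Hypothetical helpers for reading the table; not part of the original sweep code.

/// Unweighted mean of the three probe scores, in percent.
fn overall(code_gen: f64, hallucination: f64, reasoning: f64) -> f64 {
    (code_gen + hallucination + reasoning) / 3.0
}

/// Absolute gain, in percentage points.
fn point_gain(baseline: f64, modified: f64) -> f64 {
    modified - baseline
}

/// Relative gain, as a percentage of the baseline score.
fn relative_gain(baseline: f64, modified: f64) -> f64 {
    (modified - baseline) / baseline * 100.0
}
```

&lt;p&gt;For the baseline (80, 40, 100) and the (0,4) config (60, 80, 100), this gives 73.3 and 80.0 overall: a gain of about 6.7 points, or roughly 9% relative to the baseline.&lt;/p&gt;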

&lt;h2&gt;
  
  
  Discovery 4: When Both Models Answer, They're Identical
&lt;/h2&gt;

&lt;p&gt;I tested both models on 10 hard hallucination prompts (fake APIs, version confusion, tricky Python behavior). Side by side, with identical settings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Both correctly rejected &lt;code&gt;list.add()&lt;/code&gt;, &lt;code&gt;dict.sort_by_value()&lt;/code&gt;, &lt;code&gt;json.parse()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Both correctly refused to name a 2028 World Cup winner&lt;/li&gt;
&lt;li&gt;Both correctly explained that &lt;code&gt;list.sort()&lt;/code&gt; returns None&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Both incorrectly said match/case works in Python 3.9&lt;/strong&gt; (it's 3.10+)&lt;/li&gt;
&lt;li&gt;Both correctly explained banker's rounding for &lt;code&gt;round(2.5)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The layer duplication doesn't change the model's knowledge. When both models respond, they give the same answers — same correct ones, same mistakes.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Original Author Actually Said
&lt;/h2&gt;

&lt;p&gt;Going back to the &lt;a href="https://dnhkng.github.io/posts/rys/" rel="noopener noreferrer"&gt;original RYS blog&lt;/a&gt;, David Ng explicitly noted:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Smaller models seem to be more complex...I never found a single area of duplication that generalised across tasks."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;His successful results were on &lt;strong&gt;72B+ parameter models&lt;/strong&gt;. I used 9B. He also said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Every architecture has its own neuroanatomy...The brains are different."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And critically: &lt;strong&gt;neither the original author nor anyone else had tested RYS on hybrid DeltaNet architectures.&lt;/strong&gt; The method was validated exclusively on standard transformers (Qwen2, Llama, Mistral, Phi). Qwen 3.5's hybrid architecture was untested territory.&lt;/p&gt;

&lt;p&gt;Even though the author warned about small models, I tried it anyway and quantified exactly what happens. Next up: running this on Qwen 3.5 122B, the scale where Ng saw real gains.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Genuinely New Here
&lt;/h2&gt;

&lt;p&gt;Despite the accuracy improvement not holding up, this experiment produced three findings nobody else has published:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid architectures require cycle-aligned duplication.&lt;/strong&gt; On Qwen 3.5's [D,D,D,A] pattern, only block-size-4 duplication works. Block-size-3 crashes. This constrains how RYS can be applied to next-generation architectures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer duplication can change output behavior.&lt;/strong&gt; The (0,4) config switched the model from using &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; tags to responding directly. This is an unexpected side effect — duplicating layers doesn't just affect accuracy, it can change the model's generation strategy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evaluation methodology on thinking models is treacherous.&lt;/strong&gt; Token budget, think-tag handling, and response parsing can swing results from "dramatic improvement" to "no improvement". Anyone evaluating thinking models needs to control for these factors.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to Reproduce
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone the circuit finder toolkit&lt;/span&gt;
git clone https://github.com/alainnothere/llm-circuit-finder.git
&lt;span class="nb"&gt;cd &lt;/span&gt;llm-circuit-finder
pip &lt;span class="nb"&gt;install &lt;/span&gt;gguf

&lt;span class="c"&gt;# Download Qwen3.5-9B GGUF (from unsloth on HuggingFace)&lt;/span&gt;
&lt;span class="c"&gt;# Then build the modified model:&lt;/span&gt;
python layer_path.py Qwen3.5-9B-Q4_K_M.gguf &lt;span class="se"&gt;\&lt;/span&gt;
    Qwen3.5-9B-RYS-0-4.gguf &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"0..3,0,1,2,3,4..31"&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt;

&lt;span class="c"&gt;# Run with llama.cpp&lt;/span&gt;
llama-server &lt;span class="nt"&gt;-m&lt;/span&gt; Qwen3.5-9B-RYS-0-4.gguf &lt;span class="nt"&gt;-c&lt;/span&gt; 8192 &lt;span class="nt"&gt;-ngl&lt;/span&gt; 99
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Always run fair evaluations first.&lt;/strong&gt; Same max_tokens, same conditions, same scoring for both models. My first sweep gave the two models different effective token budgets and produced wildly misleading results.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check what the original authors actually tested.&lt;/strong&gt; I assumed RYS works on all transformers. The author explicitly said small models are harder and every architecture is different.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Empty responses are not zero capability.&lt;/strong&gt; The base model returned empty strings on some prompts, but with enough tokens it answered correctly. Scoring empty as zero inflated the apparent improvement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid architectures are genuinely different.&lt;/strong&gt; Techniques proven on standard transformers don't transfer automatically. DeltaNet layers maintain recurrent state — duplicating them isn't the same as "thinking longer."&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References &amp;amp; Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://huggingface.co/illegalcall/Qwen3.5-9B-RYS-0-4-GGUF" rel="noopener noreferrer"&gt;RYS Model on HuggingFace&lt;/a&gt; — The modified GGUF with layers 0-3 duplicated&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/alainnothere/llm-circuit-finder" rel="noopener noreferrer"&gt;llm-circuit-finder&lt;/a&gt; — The sweep and GGUF surgery toolkit&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dnhkng.github.io/posts/rys/" rel="noopener noreferrer"&gt;RYS Method — David Ng&lt;/a&gt; — Original blog post and method&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arxiv.org/abs/2512.01797" rel="noopener noreferrer"&gt;H-Neurons Paper (arXiv:2512.01797)&lt;/a&gt; — Hallucination-associated neurons in LLMs&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://huggingface.co/Qwen/Qwen3.5-9B" rel="noopener noreferrer"&gt;Qwen 3.5 Architecture&lt;/a&gt; — Model card with hybrid DeltaNet details&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>llm</category>
      <category>research</category>
    </item>
    <item>
      <title>Agent Orchestrator vs T3 Code vs OpenAI Symphony vs Cmux: Hands-On Comparison</title>
      <dc:creator>Dhruv Sharma</dc:creator>
      <pubDate>Wed, 18 Mar 2026 21:49:18 +0000</pubDate>
      <link>https://dev.to/illegalcall/agent-orchestrator-vs-t3-code-vs-openai-symphony-vs-cmux-hands-on-comparison-1ba8</link>
      <guid>https://dev.to/illegalcall/agent-orchestrator-vs-t3-code-vs-openai-symphony-vs-cmux-hands-on-comparison-1ba8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Which tool fits your workflow? A cross-examined comparison.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmucz43l38retpivvkupo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmucz43l38retpivvkupo.png" alt="AI Agent Tool Stack — Which Layer Are You Missing?" width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Four tools shipped recently that keep getting compared as if they're competitors. They're not — they sit at different layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Orchestration layer&lt;/strong&gt; (full lifecycle): AO, Symphony&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interaction layer&lt;/strong&gt; (human-in-the-loop): T3 Code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Terminal layer&lt;/strong&gt; (agent-aware environment): Cmux&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AO and Symphony compete most directly. T3 Code and Cmux solve different problems entirely.&lt;/p&gt;

&lt;p&gt;I ran all four on real codebases. Here's what I found.&lt;/p&gt;




&lt;h2&gt;
  
  
  What each tool actually is
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agent Orchestrator (AO)&lt;/strong&gt; — Give it a GitHub/Linear/Jira issue. It spawns an agent in an isolated worktree, opens a PR, auto-fixes CI failures, routes review comments back. You intervene when it's done, stuck, or needs approval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;T3 Code&lt;/strong&gt; — Desktop app by Theo Browne. Chat with a coding agent, see visual diffs, stay close to every change before it lands. Currently wraps Codex; a Claude Code adapter is in progress.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI Symphony&lt;/strong&gt; — Its reference Elixir implementation polls your Linear board, auto-claims tickets, spawns Codex agents, delivers PRs with proof-of-work. Elixir/OTP for fault tolerance. Linear-only. Still an engineering preview.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cmux&lt;/strong&gt; — Native macOS terminal built for AI agents. Split panes, notification rings, scriptable in-app browser, Unix socket automation. Not an orchestrator — it's where you &lt;em&gt;run&lt;/em&gt; your agents. macOS only.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick pick
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If you want...&lt;/th&gt;
&lt;th&gt;Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fire-and-forget: issue in, PR out, CI fixes handled&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;AO&lt;/strong&gt; or &lt;strong&gt;Symphony&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Review every change before it lands&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;T3 Code&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autonomous agents + GitHub Issues or Jira&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;AO&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autonomous agents + Linear&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Symphony&lt;/strong&gt; (or AO — it has a Linear plugin too)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A better terminal for running any AI agent&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Cmux&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The easiest first experience&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;T3 Code&lt;/strong&gt; (&lt;code&gt;npx t3&lt;/code&gt;) or &lt;strong&gt;Cmux&lt;/strong&gt; (&lt;code&gt;brew install&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run 10+ agents on a backlog in parallel&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;AO&lt;/strong&gt; or &lt;strong&gt;Symphony&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maximum fault tolerance (restart recovery on crash)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Symphony&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Swap agents, runtimes, trackers, SCM via config&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;AO&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Feature matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;AO&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;T3 Code&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Symphony&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Cmux&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Orchestrator + dashboard&lt;/td&gt;
&lt;td&gt;GUI for coding agents&lt;/td&gt;
&lt;td&gt;Autonomous pipeline&lt;/td&gt;
&lt;td&gt;Agent-aware terminal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spawns agents on issues&lt;/td&gt;
&lt;td&gt;Yes (manual + auto-poller)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (polls Linear)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI failure → auto-fix&lt;/td&gt;
&lt;td&gt;Yes (retries, then escalates)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (verifies before landing)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Review comment handling&lt;/td&gt;
&lt;td&gt;Forwards to agent incrementally&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Restarts from scratch (ref. workflow)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto-merge&lt;/td&gt;
&lt;td&gt;Configurable&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (ref. workflow)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agents supported&lt;/td&gt;
&lt;td&gt;Claude Code, Codex, Aider, others&lt;/td&gt;
&lt;td&gt;Codex (Claude soon)&lt;/td&gt;
&lt;td&gt;Codex (community Claude port)&lt;/td&gt;
&lt;td&gt;Any CLI agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Issue trackers&lt;/td&gt;
&lt;td&gt;GitHub, Linear, Jira&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Linear only&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SCM&lt;/td&gt;
&lt;td&gt;GitHub&lt;/td&gt;
&lt;td&gt;GitHub&lt;/td&gt;
&lt;td&gt;GitHub&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extensibility&lt;/td&gt;
&lt;td&gt;8 plugin slots, swappable via config&lt;/td&gt;
&lt;td&gt;Provider adapters (early)&lt;/td&gt;
&lt;td&gt;Agent runtime swappable, rest fixed&lt;/td&gt;
&lt;td&gt;Unix socket API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboard / UI&lt;/td&gt;
&lt;td&gt;Web (Next.js)&lt;/td&gt;
&lt;td&gt;Electron desktop app&lt;/td&gt;
&lt;td&gt;No UI&lt;/td&gt;
&lt;td&gt;Native macOS terminal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Platform&lt;/td&gt;
&lt;td&gt;Cross-platform&lt;/td&gt;
&lt;td&gt;macOS, Windows, Linux&lt;/td&gt;
&lt;td&gt;Cross-platform&lt;/td&gt;
&lt;td&gt;macOS only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;td&gt;AGPL-3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Where the real differences show up
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk689mdj5a5taeo4lq5m4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk689mdj5a5taeo4lq5m4.png" alt="PR Lifecycle — Who Handles What?" width="800" height="416"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The differences surface when things go wrong.&lt;/p&gt;

&lt;h3&gt;
  
  
  When an agent crashes
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AO&lt;/strong&gt;: Lifecycle manager detects the dead session via polling (~30s). Recovery system classifies it, attempts automatic recovery, then escalates to human notification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Symphony&lt;/strong&gt; (reference impl): OTP supervisor handles restart recovery with error context — designed to be transparent to the user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;T3 Code&lt;/strong&gt;: The thread shows an error. You restart manually.&lt;/p&gt;

&lt;h3&gt;
  
  
  When CI fails
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AO&lt;/strong&gt;: Lifecycle manager detects the failure, fetches CI logs, sends them to the agent, agent fixes and pushes. Configurable retries, then escalates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Symphony&lt;/strong&gt; (reference workflow): Agent must provide proof-of-work — checks must pass before the PR is considered complete. If CI fails, the agent retries within its implementation run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;T3 Code&lt;/strong&gt;: No CI handling. That's on you.&lt;/p&gt;
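
&lt;p&gt;AO's retry-then-escalate behaviour described above can be pictured as a small decision function. This is only a sketch in Rust with hypothetical names (&lt;code&gt;CiReaction&lt;/code&gt;, &lt;code&gt;react_to_ci_failure&lt;/code&gt;); it is not AO's actual code, and the retry budget of 3 used in the example is an assumed setting:&lt;/p&gt;

```rust
/// What the orchestrator does next after a CI failure (hypothetical type).
#[derive(Debug, PartialEq)]
enum CiReaction {
    /// Send the CI logs back to the agent for another attempt.
    RetryWithLogs { attempt: u32 },
    /// Retry budget exhausted: notify a human.
    Escalate,
}

/// Decides the next step from the number of fix attempts already made.
fn react_to_ci_failure(previous_attempts: u32, max_retries: u32) -> CiReaction {
    if previous_attempts >= max_retries {
        CiReaction::Escalate
    } else {
        CiReaction::RetryWithLogs { attempt: previous_attempts + 1 }
    }
}

fn main() {
    // Assumed retry budget of 3; the real default may differ.
    for previous_attempts in 0..5 {
        println!("{:?}", react_to_ci_failure(previous_attempts, 3));
    }
}
```

&lt;p&gt;The point of the sketch is that escalation is the fallback once the budget is spent, so a human is only pulled in after the automated attempts have failed.&lt;/p&gt;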

&lt;h3&gt;
  
  
  When a reviewer requests changes
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AO&lt;/strong&gt;: Forwards the review comments to the agent on the existing branch. Agent addresses them incrementally and pushes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Symphony&lt;/strong&gt; (reference workflow): Closes the PR, creates a new branch, re-implements from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;T3 Code&lt;/strong&gt;: No automated handling — you manage reviews manually.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cross-examined: When to use what
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Use AO when...
&lt;/h3&gt;

&lt;p&gt;You want full lifecycle automation — issue in, PR out, CI fixes handled, review comments routed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;But can't Symphony do this too?&lt;/strong&gt;&lt;br&gt;
Yes. Both go from ticket to PR autonomously. Both handle CI verification. The differences: AO works with GitHub Issues, Linear, and Jira — Symphony is Linear-only. AO's plugin slots let you swap agent, runtime, tracker, SCM, and notifier independently. Symphony's reference implementation is more tightly integrated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But can't T3 Code also run autonomously?&lt;/strong&gt;&lt;br&gt;
T3 Code has a mode where the agent writes files without asking. But there's no CI failure handling, review routing, or auto-merge. T3 Code automates &lt;em&gt;coding&lt;/em&gt;. AO and Symphony automate the &lt;em&gt;entire PR lifecycle&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Use T3 Code when...
&lt;/h3&gt;

&lt;p&gt;You want to stay close to every change before it lands — visual diffs, structured chat.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;But can't AO also do human-in-the-loop?&lt;/strong&gt;&lt;br&gt;
AO has task-level approval gates and escalation notifications. But AO delegates code review to GitHub. T3 Code lets you stay close to individual changes before they touch the filesystem. Different granularity: AO is &lt;strong&gt;human-on-the-loop&lt;/strong&gt; (oversight at milestones), T3 Code is &lt;strong&gt;human-in-the-loop&lt;/strong&gt; (oversight at every edit).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Use Symphony when...
&lt;/h3&gt;

&lt;p&gt;You want fault-tolerant autonomous agents with strong concurrency guarantees, and you use Linear.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;But AO also has a Linear plugin?&lt;/strong&gt;&lt;br&gt;
Yes. If you use Linear, both work. The concurrency architecture differs: Symphony runs on Erlang/OTP with supervision trees, which are designed for stronger restart recovery, and its per-state concurrency limits cap how many agents run at once. AO detects dead agents via polling and recovers, but doesn't transparently restart mid-execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if I don't use Linear?&lt;/strong&gt;&lt;br&gt;
Symphony won't work for you today. AO supports GitHub Issues, Linear, and Jira.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Use Cmux when...
&lt;/h3&gt;

&lt;p&gt;You want a better terminal experience for running AI agents — any agent, any orchestrator.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;But AO already has a web dashboard with a terminal?&lt;/strong&gt;&lt;br&gt;
Yes. AO's dashboard has an xterm.js terminal via WebSocket. Cmux adds: native GPU rendering, lower latency, notification rings, drag-and-drop panes, scriptable browser. AO's dashboard adds: PR lifecycle cards, CI status, review comments, fleet overview. Different layers — Cmux is a terminal, AO's dashboard is a management plane.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Key architectural differences
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AO&lt;/strong&gt; — 8 plugin slots (runtime, agent, workspace, tracker, SCM, notifier, terminal, lifecycle). Reaction engine auto-handles CI failures, review comments, merge readiness — each with configurable retries and escalation. Session state is file-based. Polling-based detection (30s intervals).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Symphony&lt;/strong&gt; — Erlang/OTP supervision trees for process-level fault tolerance. Agent behavior defined in &lt;code&gt;WORKFLOW.md&lt;/code&gt; versioned with your code. Per-state concurrency limits. Review rework is destructive (full reset) in the reference workflow. Linear + Codex only (officially). Still an engineering preview.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;T3 Code&lt;/strong&gt; — Wraps coding agents with a conversational UI and visual diffs. Designed for focused 1-on-1 work where you want to see every change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cmux&lt;/strong&gt; — Unix socket IPC. Agents can programmatically create panes, send notifications, control browsers. GPU-rendered via libghostty. Notification rings, in-app browser, socket/CLI automation. No higher-level orchestration logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trust model
&lt;/h2&gt;

&lt;p&gt;Execution happens locally for all four tools. Code and metadata go to GitHub/Linear and LLM providers (Anthropic, OpenAI) depending on your agent and tracker config.&lt;/p&gt;

&lt;p&gt;All tools work via git branches. Agents push to feature branches and open PRs — your main branch is never touched until you merge.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;AO&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;T3 Code&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Symphony&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Cmux&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Prerequisites&lt;/td&gt;
&lt;td&gt;Node 20+, pnpm, tmux, git 2.25+&lt;/td&gt;
&lt;td&gt;Node, OpenAI API key&lt;/td&gt;
&lt;td&gt;Elixir, Linear workspace, OpenAI key&lt;/td&gt;
&lt;td&gt;macOS 14+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Install&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pnpm install &amp;amp;&amp;amp; pnpm build&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;npx t3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mix setup &amp;amp;&amp;amp; mix build&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;brew install --cask manaflow-ai/cmux/cmux&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to first run&lt;/td&gt;
&lt;td&gt;~10 min&lt;/td&gt;
&lt;td&gt;~2 min&lt;/td&gt;
&lt;td&gt;~30-60 min&lt;/td&gt;
&lt;td&gt;~1 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Free + LLM API costs&lt;/td&gt;
&lt;td&gt;Free + API costs&lt;/td&gt;
&lt;td&gt;Free + OpenAI costs&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Combos (honest assessment)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Combo&lt;/th&gt;
&lt;th&gt;Reality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AO + Cmux&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Complementary layers (orchestration + terminal), but no native integration yet. Manual tmux-attach inside Cmux panes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AO + T3 Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Aspirational. T3 Code has no "review existing PR" workflow today.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Symphony + Cmux&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same as AO + Cmux. Manual terminal attachment.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How many agents can run in parallel?&lt;/strong&gt;&lt;br&gt;
AO: no hard limit, defaults to 5 concurrent (configurable). Symphony: defaults to 10 with per-state limits. T3 Code: no hard limit but designed for focused work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which one will still exist in 6 months?&lt;/strong&gt;&lt;br&gt;
AO: backed by Composio, actively maintained. Symphony: backed by OpenAI, "engineering preview" — production commitment unclear. T3 Code: backed by Theo/Ping, active development. Cmux: backed by Manaflow (YC S24), actively maintained.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solo dev or team?&lt;/strong&gt;&lt;br&gt;
Solo: T3 Code or AO (manual spawn). Small team: AO with dashboard. Platform/infra team: AO (plugin arch) or Symphony (if on Linear). Enterprise: evaluate carefully — none are enterprise-hardened yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  The bottom line
&lt;/h2&gt;

&lt;p&gt;AO and Symphony compete at the orchestration layer. T3 Code and Cmux sit at different layers entirely.&lt;/p&gt;

&lt;p&gt;The question isn't "which is best." It's "which layer are you missing?"&lt;/p&gt;

&lt;p&gt;Full discussion with architecture deep-dive and security FAQ: &lt;a href="https://github.com/ComposioHQ/agent-orchestrator/discussions/526" rel="noopener noreferrer"&gt;GitHub Discussion&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Looking for volunteers
&lt;/h2&gt;

&lt;p&gt;We want hands-on, honest comparisons — not marketing, just data:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;AO vs Symphony&lt;/strong&gt; — Same backlog, same codebase. Time-to-PR, fix rate, human time spent, cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AO vs T3 Code&lt;/strong&gt; — Same issue. Autonomous vs human-in-the-loop when the agent gets something wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AO + Cmux&lt;/strong&gt; — Does Cmux actually improve the AO supervision experience?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Interested? Drop a comment on the &lt;a href="https://github.com/ComposioHQ/agent-orchestrator/discussions/526" rel="noopener noreferrer"&gt;GitHub Discussion&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Compiled March 2026. This space moves fast — comments welcome if anything's outdated.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>devtools</category>
      <category>productivity</category>
    </item>
    <item>
      <title>A Hybrid Key Architecture for Autonomous Agent Credential Management</title>
      <dc:creator>Dhruv Sharma</dc:creator>
      <pubDate>Mon, 09 Mar 2026 13:34:57 +0000</pubDate>
      <link>https://dev.to/illegalcall/a-hybrid-key-architecture-for-autonomous-agent-credential-management-269h</link>
      <guid>https://dev.to/illegalcall/a-hybrid-key-architecture-for-autonomous-agent-credential-management-269h</guid>
      <description>&lt;p&gt;AI agents that move money on-chain have a problem nobody talks about cleanly: who holds the keys?&lt;/p&gt;

&lt;p&gt;That's the problem I ran into building &lt;a href="https://github.com/fishnetproxy/fishnet" rel="noopener noreferrer"&gt;Fishnet&lt;/a&gt;, an AI agent transaction security proxy in Rust. Fishnet sits between the AI agent and the chain — a control plane that necessarily holds signing keys. You can't give it zero secrets. So the question becomes: how do you minimize blast radius when secrets are unavoidable?&lt;/p&gt;

&lt;p&gt;The naive answer is to pick one storage primitive and use it for everything. That breaks down as soon as your system has multiple cryptographic operations with different security requirements. Keychain is good for secret storage, but it is not the same thing as hardware-backed signing. In this flow, Secure Enclave gives me P-256, while Ethereum signing requires secp256k1. File storage is portable, but it relies mostly on filesystem permissions rather than hardware isolation.&lt;/p&gt;

&lt;p&gt;The answer I landed on: &lt;strong&gt;use the right storage primitive for each key's threat model, and compose them behind a clean trait abstraction.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture at a Glance
&lt;/h2&gt;

&lt;p&gt;Because every agent transaction passes through Fishnet, it holds three distinct cryptographic identities, each with a different threat model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsqrvt5k1gfwlafbh9hma.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsqrvt5k1gfwlafbh9hma.png" alt="Fishnet Architecture Overview" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three keys. Three blast radii. Vault compromise does not imply signing access. Signing compromise does not imply credential access. The approval key is hardware-backed when Secure Enclave mode is active.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Operations
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Key Type&lt;/th&gt;
&lt;th&gt;Threat Model&lt;/th&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Vault encryption&lt;/td&gt;
&lt;td&gt;Symmetric (256-bit)&lt;/td&gt;
&lt;td&gt;Credential exposure at rest&lt;/td&gt;
&lt;td&gt;Argon2id-derived key, optionally cached in Keychain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Onchain approval&lt;/td&gt;
&lt;td&gt;P-256 asymmetric&lt;/td&gt;
&lt;td&gt;Unauthorized permit approval and replay&lt;/td&gt;
&lt;td&gt;Secure Enclave in runtime; software signer type exists for tests and explicit construction paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ethereum signing&lt;/td&gt;
&lt;td&gt;secp256k1 asymmetric&lt;/td&gt;
&lt;td&gt;Unauthorized permit signing&lt;/td&gt;
&lt;td&gt;File (&lt;code&gt;.hex&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Layer 1: Vault Encryption (Argon2id + Keychain)
&lt;/h2&gt;

&lt;p&gt;The credential vault stores API keys encrypted at rest. Its encryption key is derived from a user password using Argon2id, a widely recommended memory-hard password KDF. Fishnet can also cache that derived 32-byte key in macOS Keychain when the operator opts in, so the security story has two paths: password-based unlock when the cache is absent, and Keychain-protected unlock when the cache is present.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;ARGON2_MEMORY_COST_KIB&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u32&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;262_144&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// 256 MiB (262,144 KiB)&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;ARGON2_TIME_COST&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u32&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;ARGON2_PARALLELISM&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u32&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;DERIVED_KEY_LEN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 256 MiB memory cost is intentional. When the password-based unlock path is used, it pushes brute-force cost into memory bandwidth as well as compute, which makes large-scale GPU cracking materially more expensive and less efficient. It does not make GPU attacks impossible; it raises their cost.&lt;/p&gt;
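
&lt;p&gt;To make the memory-cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, an attacker device with 24 GiB of memory; the helper names are hypothetical and not part of Fishnet. Memory capacity alone caps how many guesses can be in flight at once, before bandwidth is even considered:&lt;/p&gt;

```rust
// Same value as Fishnet's ARGON2_MEMORY_COST_KIB, widened to u64 for the arithmetic.
const ARGON2_MEMORY_COST_KIB: u64 = 262_144;

/// Converts a KiB figure to MiB.
fn memory_cost_mib(cost_kib: u64) -> u64 {
    cost_kib / 1024
}

/// Upper bound on simultaneous Argon2id evaluations that fit in device memory.
/// Ignores bandwidth and scheduling, so real throughput would be lower.
fn max_parallel_guesses(device_memory_gib: u64, cost_kib: u64) -> u64 {
    (device_memory_gib * 1024 * 1024) / cost_kib
}

fn main() {
    let per_guess = memory_cost_mib(ARGON2_MEMORY_COST_KIB);
    // 24 GiB is an assumed device size, chosen only as an example.
    let in_flight = max_parallel_guesses(24, ARGON2_MEMORY_COST_KIB);
    println!("{} MiB per guess, at most {} guesses in flight", per_guess, in_flight);
}
```

&lt;p&gt;Under these assumptions each guess occupies 256 MiB, so the example device can hold at most 96 attempts at a time. That is an upper bound from capacity alone, not a throughput estimate.&lt;/p&gt;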

&lt;p&gt;The resulting 32-byte key feeds directly into libsodium's &lt;code&gt;crypto_secretbox_easy&lt;/code&gt; for XSalsa20-Poly1305 authenticated encryption. The cipher here is XSalsa20-Poly1305, not AES.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vault Unlock Flow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsff542sb1gweuqawum25.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsff542sb1gweuqawum25.png" alt="Vault Unlock Flow" width="800" height="1146"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The version prefix on the cached Keychain entry (&lt;code&gt;derived_hex:v1:&lt;/code&gt;) provides a migration path. Future derivation formats can use &lt;code&gt;v2:&lt;/code&gt;, &lt;code&gt;v3:&lt;/code&gt;, and so on without breaking existing entries.&lt;/p&gt;
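
&lt;p&gt;A minimal sketch of how a reader of that entry might dispatch on the version prefix. Only the &lt;code&gt;derived_hex:v1:&lt;/code&gt; prefix comes from Fishnet; the enum, the function name, and the assumption that the payload is the 32-byte key written as 64 hex characters are illustrative:&lt;/p&gt;

```rust
/// Outcome of inspecting a cached Keychain entry (hypothetical type).
#[derive(Debug, PartialEq)]
enum CachedKeyEntry {
    /// Recognised v1 entry carrying the hex-encoded derived key.
    V1(String),
    /// Versioned prefix, but a version this build does not understand.
    UnsupportedVersion,
    /// Anything else, including a v1 payload of the wrong shape.
    Malformed,
}

/// Classifies an entry by its version prefix.
/// Takes the entry by value, consuming the caller's copy.
fn classify_entry(entry: String) -> CachedKeyEntry {
    if let Some(hex) = entry.strip_prefix("derived_hex:v1:") {
        // Assumption: a 32-byte key encoded as exactly 64 hex characters.
        if hex.len() != 64 {
            return CachedKeyEntry::Malformed;
        }
        if !hex.chars().all(|c| c.is_ascii_hexdigit()) {
            return CachedKeyEntry::Malformed;
        }
        return CachedKeyEntry::V1(hex.to_string());
    }
    if entry.starts_with("derived_hex:v") {
        // A future v2:, v3:, ... entry: refuse it rather than guess its format.
        return CachedKeyEntry::UnsupportedVersion;
    }
    CachedKeyEntry::Malformed
}

fn main() {
    let entry = format!("derived_hex:v1:{}", "ab".repeat(32));
    println!("{:?}", classify_entry(entry));
    println!("{:?}", classify_entry("derived_hex:v2:anything".to_string()));
}
```

&lt;p&gt;Treating an unknown version as a distinct outcome, rather than as a parse error, lets an older build fall back to password-based unlock instead of misreading a newer cache entry. A real implementation would also zeroize the strings involved.&lt;/p&gt;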

&lt;p&gt;The in-memory key is pinned with &lt;code&gt;mlock()&lt;/code&gt; where the OS allows it, to keep it out of swap, and zeroed on drop via the &lt;code&gt;zeroize&lt;/code&gt; crate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="nb"&gt;Drop&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;LockedSecretboxKey&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.locked&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nn"&gt;libc&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;munlock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.key&lt;/span&gt;&lt;span class="nf"&gt;.as_ptr&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.cast&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.key&lt;/span&gt;&lt;span class="nf"&gt;.len&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.key&lt;/span&gt;&lt;span class="nf"&gt;.zeroize&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On normal teardown, the key bytes are overwritten before the allocator can reuse that memory. That reduces post-use exposure in freed memory, but it does not protect against live-memory capture or a crash that happens before &lt;code&gt;Drop&lt;/code&gt; runs.&lt;/p&gt;

&lt;p&gt;Caching the derived key in Keychain is a conscious tradeoff: it improves operator ergonomics, but once that cache exists, the strength of that path depends more on Keychain access controls than on Argon2 parameters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 2: Onchain Approval Key (P-256 + Secure Enclave)
&lt;/h2&gt;

&lt;p&gt;When &lt;code&gt;onchain.approval.enabled&lt;/code&gt; is set, Fishnet requires a second, P-256 signature before it emits the secp256k1 permit signature. This acts as a hardware-backed approval proof layered in front of normal onchain permit signing.&lt;/p&gt;

&lt;p&gt;That P-256 approval is enforced by Fishnet's control plane, not by the EVM itself. Its purpose is to gate whether the secp256k1 permit signature is ever emitted.&lt;/p&gt;

&lt;p&gt;The type names still use &lt;code&gt;BridgeSigner&lt;/code&gt; and &lt;code&gt;BridgeApprovalSigner&lt;/code&gt; because the feature originated around bridge-style risk controls, but the current runtime wiring applies the approval layer to generic onchain permit issuance.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;BridgeApprovalSigner&lt;/code&gt; trait makes the approval layer pluggable. In the current macOS runtime, the signer is Secure Enclave-backed when the platform allows it. A software P-256 signer type also exists in the codebase for tests and explicit construction paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="n"&gt;BridgeApprovalSigner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Send&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Sync&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;public_key_hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;sign_prehash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prehash&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;P256Signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SignerError&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the persistent Secure Enclave path is active, the key is created with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;kSecAccessControlPrivateKeyUsage&lt;/code&gt; — usable for private-key operations like signing&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kSecAccessControlUserPresence&lt;/code&gt; — user presence required&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kSecAttrAccessibleWhenUnlockedThisDeviceOnly&lt;/code&gt; — inaccessible while the device is locked and bound to that device&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The non-exportability comes from Secure Enclave key generation itself; the &lt;code&gt;ThisDeviceOnly&lt;/code&gt; accessibility class keeps the keychain item from migrating to another device.&lt;/p&gt;

&lt;p&gt;The graceful degradation story is important. If persistent Secure Enclave storage is denied or unavailable, which is common in unsigned CLI or non-interactive contexts, Fishnet falls back to a session-only Secure Enclave key and surfaces the mode string to the caller:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode string&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;p256-secure-enclave-bridge&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hardware-backed, persists across restarts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;p256-secure-enclave-bridge-session&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hardware-backed, rotates on restart&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;p256-local-bridge&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Software signer type present in tests/dev code, not the automatic runtime fallback on this branch&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Fishnet never silently downgrades from persistent to session-only Secure Enclave storage: the change is always reported in the mode string. On the current branch, runtime approval on non-macOS platforms fails closed rather than falling back to a software signer automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 3: Ethereum Signing Key (secp256k1 + file)
&lt;/h2&gt;

&lt;p&gt;EIP-712 permit signing happens on every agent transaction. The secp256k1 key lives in a hex file with &lt;code&gt;0600&lt;/code&gt; permissions. This trades hardware isolation for portability: Linux agents have no macOS Keychain, so the key relies on filesystem permissions, and the on-chain nonce provides the final replay backstop.&lt;/p&gt;

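&lt;p&gt;The address derivation follows the standard Ethereum scheme: hash the 64-byte uncompressed public key (without its &lt;code&gt;0x04&lt;/code&gt; prefix) with Keccak-256 and keep the last 20 bytes of the digest:&lt;br&gt;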
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;try_from_bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secret_bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;Self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SignerError&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;signing_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;SigningKey&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_bytes&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;secret_bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.into&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;verifying_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;signing_key&lt;/span&gt;&lt;span class="nf"&gt;.verifying_key&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;public_key_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;verifying_key&lt;/span&gt;&lt;span class="nf"&gt;.to_encoded_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// uncompressed (65 bytes)&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Keccak256&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;public_key_bytes&lt;/span&gt;&lt;span class="nf"&gt;.as_bytes&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt; &lt;span class="c1"&gt;// drop 0x04 prefix&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="nf"&gt;.copy_from_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt; &lt;span class="c1"&gt;// last 20 bytes&lt;/span&gt;
    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;Self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;signing_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The uint48 footgun
&lt;/h3&gt;

&lt;p&gt;The permit schema uses a &lt;code&gt;uint48 expiry&lt;/code&gt; field, while Rust stores it as &lt;code&gt;u64&lt;/code&gt;. If the Rust side accepts values above &lt;code&gt;2^48 - 1&lt;/code&gt;, the request is now outside the Solidity type's valid domain. Depending on the encoder or verifier, that can show up as rejected inputs, invalid typed-data payloads, or signatures that no longer match what the contract expects to hash.&lt;/p&gt;

&lt;p&gt;Fishnet validates this at the boundary, before any signing happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;UINT48_MAX&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1u64&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.expiry&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;UINT48_MAX&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SignerError&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;InvalidPermit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;format!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"expiry {} exceeds uint48 max ({}), invalid for Solidity uint48"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.expiry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UINT48_MAX&lt;/span&gt;
    &lt;span class="p"&gt;)));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hard rejection at input. Not a warning. Not a clamp. A rejection that keeps off-chain inputs inside the exact range the Solidity side accepts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Composing the Layers: &lt;code&gt;BridgeSigner&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The three layers compose cleanly. &lt;code&gt;BridgeSigner&lt;/code&gt; wraps any &lt;code&gt;SignerTrait&lt;/code&gt; (the secp256k1 signer) with any &lt;code&gt;BridgeApprovalSigner&lt;/code&gt; (P-256, in software or in the Secure Enclave). Despite the name, this wrapper currently sits in the generic onchain permit path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;BridgeSigner&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Arc&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;SignerTrait&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                     &lt;span class="c1"&gt;// secp256k1 layer&lt;/span&gt;
    &lt;span class="n"&gt;approval_signer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Arc&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;BridgeApprovalSigner&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// P-256 layer&lt;/span&gt;
    &lt;span class="n"&gt;approval_ttl_seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;replay_cache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Mutex&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;// keyed by derived replay hash over stable permit fields&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Approval Signing Flow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgd00cl55a8isgdi9nzvt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgd00cl55a8isgdi9nzvt.png" alt="Approval Signing Flow" width="800" height="1190"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step 7 (sign, then verify) catches key corruption immediately rather than producing an invalid proof that propagates deeper into the system. Step 8's rollback ensures a failed secp256k1 signing does not leave a consumed replay cache entry behind that would block a retry.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Hierarchy Summary
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────┬──────────────────────┬───────────────────────┐
│   Vault Layer    │   Approval Layer     │   Signing Layer       │
├──────────────────┼──────────────────────┼───────────────────────┤
│ Argon2id         │ P-256 (secp256r1)    │ secp256k1             │
│   ↓              │                      │ (k256 crate)          │
│ XSalsa20-Poly    │                      │                       │
├──────────────────┼──────────────────────┼───────────────────────┤
│ macOS Keychain   │ Secure Enclave       │ .hex file             │
│ (cached key)     │ (user presence)      │ (0600 permissions)    │
├──────────────────┼──────────────────────┼───────────────────────┤
│ + mlock()        │ In enclave mode,     │ Validated at input    │
│ + zeroize on drop│ key stays on-chip    │ (uint48, U256, addr)  │
├──────────────────┼──────────────────────┼───────────────────────┤
│ Protects:        │ Protects:            │ Produces:             │
│ API credentials  │ Permit approvals     │ EIP-712 permit sigs   │
│ at rest          │ from replay + abuse  │ for on-chain actions  │
└──────────────────┴──────────────────────┴───────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What This Architecture Gets Right
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Blast radius containment.&lt;/strong&gt; Each key has exactly one job. Compromising the secp256k1 key lets an attacker sign Ethereum transactions, but not decrypt vault credentials. Compromising the vault key exposes API keys, but doesn't enable on-chain actions. The approval key adds a second factor that must be compromised independently — and, in Secure Enclave mode, it never leaves hardware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hardware backing where it matters.&lt;/strong&gt; The approval key is a likely target for a "sign this transaction" attack. When Secure Enclave mode is active, the private key is non-exportable and isolated from normal process memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graceful degradation without silent failure.&lt;/strong&gt; When persistent Secure Enclave storage is unavailable, Fishnet surfaces the mode string to callers. No silent downgrade to session-only mode, and no automatic runtime software fallback on unsupported platforms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Versioned storage formats.&lt;/strong&gt; The Keychain prefix (&lt;code&gt;derived_hex:v1:&lt;/code&gt;), replay cache key (&lt;code&gt;fishnet-bridge-replay-v1|&lt;/code&gt;), and intent hash prefix (&lt;code&gt;fishnet-bridge-approval-v1|&lt;/code&gt;) all include version identifiers. The approval-related prefixes still carry bridge-flavored names for historical reasons, but the versioning itself is what matters. Future migrations can introduce new formats without ambiguous parsing or ad hoc compatibility logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boundary validation.&lt;/strong&gt; Rust uses &lt;code&gt;u64&lt;/code&gt;, Solidity expects &lt;code&gt;uint48&lt;/code&gt;, and Fishnet rejects out-of-range values before signing.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;The secp256k1 key in a hex file is the weakest link. For production, this should move to an HSM, KMS, or another OS-managed key store appropriate to the deployment target. The hex file was chosen for portability, but that is still architectural debt worth acknowledging explicitly.&lt;/p&gt;

&lt;p&gt;The replay cache is in-memory only. A process restart clears it, meaning a permit that was already approved could be replayed across a restart boundary. For Fishnet's current use case, the on-chain nonce provides the final replay protection, but a persistent replay store would be more robust.&lt;/p&gt;




&lt;p&gt;The goal is always to minimize what any single compromise can reach. When you can't give your control plane zero secrets, the next best thing is ensuring each secret only unlocks one blast radius.&lt;/p&gt;

&lt;p&gt;How do you handle key management in systems where secrets are unavoidable?&lt;/p&gt;

</description>
      <category>rust</category>
      <category>security</category>
      <category>ethereum</category>
      <category>macos</category>
    </item>
    <item>
      <title>How I Saved 20,000 Gas Per Transaction by Reordering One Line in Solidity</title>
      <dc:creator>Dhruv Sharma</dc:creator>
      <pubDate>Sun, 01 Mar 2026 18:58:02 +0000</pubDate>
      <link>https://dev.to/illegalcall/how-i-saved-20000-gas-per-transaction-by-reordering-one-line-in-solidity-2dgl</link>
      <guid>https://dev.to/illegalcall/how-i-saved-20000-gas-per-transaction-by-reordering-one-line-in-solidity-2dgl</guid>
      <description>&lt;p&gt;While building a smart wallet contract for &lt;a href="https://github.com/iamyxsh/fishnet" rel="noopener noreferrer"&gt;Fishnet&lt;/a&gt; — an AI agent transaction security proxy — I ran a self-imposed code review and found a subtle optimization that every Solidity developer should know about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One variable reorder. 20,000 gas saved per transaction.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's the full breakdown.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Silent Storage Slot Waste
&lt;/h2&gt;

&lt;p&gt;My state variables looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;address public owner;          // 20 bytes → Slot 0
address public fishnetSigner;  // 20 bytes → Slot 1
mapping(uint256 =&amp;gt; bool) public usedNonces; // Slot 2
bool public paused;            // 1 byte  → Slot 3  ← wasting 31 bytes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;bool paused&lt;/code&gt; at the bottom? It's only &lt;strong&gt;1 byte&lt;/strong&gt;, but it was consuming an entire &lt;strong&gt;32-byte storage slot&lt;/strong&gt;. That's 31 bytes of wasted space and, more importantly, a separate slot that has to be accessed cold on every pause check.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why the EVM cares
&lt;/h3&gt;

&lt;p&gt;The EVM operates on 32-byte words. Every storage slot is exactly 32 bytes. When the Solidity compiler lays out your state variables, it goes &lt;strong&gt;top to bottom in declaration order&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Slot 0: [owner ─────────────── 20 bytes][── 12 bytes empty ──]
Slot 1: [fishnetSigner ─────── 20 bytes][── 12 bytes empty ──]
Slot 2: [usedNonces mapping hash ───────────────── 32 bytes ─]
Slot 3: [paused ─ 1 byte][─────── 31 bytes empty ───────────]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler &lt;strong&gt;does not&lt;/strong&gt; reorder your variables for you. If a variable can't fit in the remaining space of the current slot, it starts a new one. An &lt;code&gt;address&lt;/code&gt; is 20 bytes. A &lt;code&gt;bool&lt;/code&gt; is 1 byte. They fit together with 11 bytes to spare — but only if they're adjacent in your declaration.&lt;/p&gt;
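&lt;p&gt;The packing rule is simple enough to model in a few lines. The sketch below is Rust, not compiler code: &lt;code&gt;Placement&lt;/code&gt; and &lt;code&gt;layout&lt;/code&gt; are hypothetical names, the function is fixed at four variables to match this contract, and the mapping is modeled as a 32-byte entry because it reserves a whole slot.&lt;/p&gt;

```rust
// Hypothetical model of Solidity's declaration-order packing (not compiler code).
const SLOT_BYTES: u32 = 32;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Placement {
    slot: u32,   // storage slot index
    offset: u32, // byte offset inside the slot
}

// Places four variables, given their sizes in bytes, in declaration order.
fn layout(sizes: [u32; 4]) -> [Placement; 4] {
    let mut placements = [Placement { slot: 0, offset: 0 }; 4];
    let mut slot = 0;
    let mut used = 0; // bytes already occupied in the current slot
    for (i, size) in sizes.iter().copied().enumerate() {
        // A variable that does not fit in the remaining space starts a new slot.
        if used + size > SLOT_BYTES {
            slot += 1;
            used = 0;
        }
        placements[i] = Placement { slot, offset: used };
        used += size;
    }
    placements
}
```

&lt;p&gt;Calling &lt;code&gt;layout([20, 20, 32, 1])&lt;/code&gt; reproduces the four-slot layout above, with the 1-byte &lt;code&gt;bool&lt;/code&gt; alone in slot 3. Passing the same sizes with the &lt;code&gt;bool&lt;/code&gt; second, &lt;code&gt;layout([20, 1, 20, 32])&lt;/code&gt;, puts it in slot 0 at offset 20.&lt;/p&gt;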




&lt;h2&gt;
  
  
  The Fix: Storage Slot Packing
&lt;/h2&gt;

&lt;p&gt;Move &lt;code&gt;paused&lt;/code&gt; right after &lt;code&gt;owner&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;address public owner;          // 20 bytes ─┐
bool public paused;            // 1 byte  ──┘ Slot 0 (21/32 bytes)
address public fishnetSigner;  // 20 bytes → Slot 1
mapping(uint256 =&amp;gt; bool) public usedNonces; // Slot 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New layout:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Slot 0: [owner ─────────────── 20 bytes][paused 1B][─ 11 bytes empty ─]
Slot 1: [fishnetSigner ─────── 20 bytes][── 12 bytes empty ──────────]
Slot 2: [usedNonces mapping hash ───────────────── 32 bytes ─────────]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4 slots → 3 slots.&lt;/strong&gt; One fewer storage slot touched at runtime.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4vcr7fjxemny2wv8z4qk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4vcr7fjxemny2wv8z4qk.png" alt="EVM Storage Slot Packing - Before and After" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Gas Math
&lt;/h2&gt;

&lt;p&gt;Here's what this saves in practice:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Before (separate slots)&lt;/th&gt;
&lt;th&gt;After (packed)&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cold &lt;code&gt;SLOAD&lt;/code&gt; (first read in tx)&lt;/td&gt;
&lt;td&gt;2,100 gas × 2 slots&lt;/td&gt;
&lt;td&gt;2,100 gas × 1 slot&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2,100 gas&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First &lt;code&gt;SSTORE&lt;/code&gt; to &lt;code&gt;paused&lt;/code&gt; (pausing)&lt;/td&gt;
&lt;td&gt;~22,100 gas (cold slot, zero → nonzero)&lt;/td&gt;
&lt;td&gt;~2,900 gas (warm slot, already nonzero)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~19,200 gas&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;whenNotPaused&lt;/code&gt; modifier per call&lt;/td&gt;
&lt;td&gt;Reads its own slot&lt;/td&gt;
&lt;td&gt;Reads &lt;code&gt;owner&lt;/code&gt;'s slot (often already warm)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Up to 2,000 gas&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The big win is the &lt;strong&gt;&lt;code&gt;SSTORE&lt;/code&gt; when pausing&lt;/strong&gt;. When &lt;code&gt;paused&lt;/code&gt; has its own slot, that slot starts at zero and has not been touched in the transaction, so setting the flag is a zero-to-nonzero write (~20,000 gas) plus the 2,100 gas cold-access surcharge. Once &lt;code&gt;paused&lt;/code&gt; shares a slot with &lt;code&gt;owner&lt;/code&gt;, the slot is already nonzero, and if &lt;code&gt;owner&lt;/code&gt; has already been read (which it almost always has in the same transaction context) the slot is also &lt;strong&gt;warm&lt;/strong&gt;, so the write costs only ~2,900 gas. Calls that only check the flag save the cold read instead.&lt;/p&gt;
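&lt;p&gt;The arithmetic behind the &lt;code&gt;SSTORE&lt;/code&gt; row can be written out as a small model. This is a sketch in Rust, not contract code: the constant and function names are made up for illustration, the prices are the values in effect after EIP-2929 and EIP-3529, refunds are ignored, and it assumes the pausing function checks &lt;code&gt;owner&lt;/code&gt; before it writes &lt;code&gt;paused&lt;/code&gt;.&lt;/p&gt;

```rust
// Simplified gas model after EIP-2929 and EIP-3529. Refunds and the
// compiler's read-modify-write SLOAD for packed fields are ignored.
const COLD_ACCESS: u64 = 2_100;  // surcharge for the first touch of a slot in a tx
const SSTORE_SET: u64 = 20_000;  // zero -> nonzero write
const SSTORE_RESET: u64 = 2_900; // nonzero -> different value

// Cost of one SSTORE that changes a slot not yet modified in this transaction.
fn sstore_cost(slot_was_zero: bool, slot_is_warm: bool) -> u64 {
    let base = if slot_was_zero { SSTORE_SET } else { SSTORE_RESET };
    let surcharge = if slot_is_warm { 0 } else { COLD_ACCESS };
    base + surcharge
}

// Gas saved on the pausing write by packing `paused` next to `owner`
// (hypothetical helper; assumes an owner check runs first).
fn pause_savings() -> u64 {
    // Before: `paused` sits alone in a zero-valued slot nobody has touched yet.
    let before = sstore_cost(true, false);
    // After: the slot already holds `owner` and was warmed by the owner check.
    let after = sstore_cost(false, true);
    before - after
}
```

&lt;p&gt;Under these assumptions &lt;code&gt;pause_savings()&lt;/code&gt; comes out at 19,200 gas, which is where the roughly 20,000 figure comes from. Most of it is the difference between initializing a zero slot and updating a slot that already holds a value.&lt;/p&gt;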




&lt;h2&gt;
  
  
  How to Check Your Own Contracts
&lt;/h2&gt;

&lt;p&gt;Foundry makes this trivial:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;forge inspect YourContract storage-layout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This outputs every state variable with its slot number, offset, and byte size. Look for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Variables that could pack together&lt;/strong&gt; (combined size ≤ 32 bytes) but are in separate slots&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;bool&lt;/code&gt;, &lt;code&gt;uint8&lt;/code&gt;, &lt;code&gt;uint16&lt;/code&gt;, &lt;code&gt;address&lt;/code&gt;&lt;/strong&gt; separated by mappings or larger types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Related variables read together&lt;/strong&gt; that are in different slots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;| Name          | Type                        | Slot | Offset | Bytes |
|---------------|-----------------------------|------|--------|-------|
| owner         | address                     | 0    | 0      | 20    |
| paused        | bool                        | 0    | 20     | 1     |
| fishnetSigner | address                     | 1    | 0      | 20    |
| usedNonces    | mapping(uint256 =&amp;gt; bool)    | 2    | 0      | 32    |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When &lt;code&gt;Offset&lt;/code&gt; &amp;gt; 0, you've got packing happening. When small types have &lt;code&gt;Offset&lt;/code&gt; = 0 and their own slot — that's a packing opportunity.&lt;/p&gt;
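&lt;p&gt;One rough way to spot a packing opportunity is to compare the slots used in declaration order against a greedy repacking of the same sizes. The Rust sketch below does that for four variables. Both function names are hypothetical, the repacking is a first-fit-decreasing heuristic rather than anything &lt;code&gt;forge&lt;/code&gt; or &lt;code&gt;solc&lt;/code&gt; computes, and mappings are again treated as 32-byte entries.&lt;/p&gt;

```rust
// Hypothetical helper: declaration-order slot count versus a greedy repacking.
const SLOT_BYTES: u32 = 32;

// Slots used when variables are placed in declaration order.
fn slots_in_order(sizes: [u32; 4]) -> u32 {
    let mut slots = 1;
    let mut used = 0; // bytes occupied in the current slot
    for size in sizes.iter().copied() {
        if used + size > SLOT_BYTES {
            slots += 1;
            used = 0;
        }
        used += size;
    }
    slots
}

// Slots used after sorting largest-first and putting each variable
// into the first slot that still has room (first-fit decreasing).
fn slots_first_fit(mut sizes: [u32; 4]) -> u32 {
    sizes.sort_unstable_by(|a, b| b.cmp(a));
    let mut bins = [0u32; 4]; // bytes used per candidate slot
    let mut count = 0;        // number of slots opened so far
    for size in sizes.iter().copied() {
        let mut placed = false;
        for bin in bins.iter_mut().take(count) {
            if SLOT_BYTES >= *bin + size {
                *bin += size;
                placed = true;
                break;
            }
        }
        if !placed {
            bins[count] = size;
            count += 1;
        }
    }
    count as u32
}
```

&lt;p&gt;For the original declaration order, &lt;code&gt;slots_in_order([20, 20, 32, 1])&lt;/code&gt; is 4 while &lt;code&gt;slots_first_fit&lt;/code&gt; on the same sizes is 3, so one slot can be reclaimed. Reordering changes the storage layout, so it is only safe before deployment or on contracts that are not upgraded in place.&lt;/p&gt;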




&lt;h2&gt;
  
  
  5 Other Things I Found in the Same Review
&lt;/h2&gt;

&lt;p&gt;Storage packing was the optimization win, but the same code review caught much more:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Critical permit.value vulnerability
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;execute()&lt;/code&gt; function accepted a permit signature but &lt;strong&gt;never validated that &lt;code&gt;permit.value&lt;/code&gt; matched &lt;code&gt;msg.value&lt;/code&gt;&lt;/strong&gt;. An attacker could get a permit signed for 0.01 ETH but submit the transaction with 100 ETH, draining the wallet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Before: no validation
function execute(Permit calldata permit, ...) external payable {
    // permit.value could be anything vs msg.value
}

// After: explicit check
require(permit.value == msg.value, InsufficientValue());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Chain ID validation for fork protection
&lt;/h3&gt;

&lt;p&gt;The contract cached &lt;code&gt;DOMAIN_SEPARATOR&lt;/code&gt; at deployment but never recomputed it. On a chain fork (like ETH/ETH Classic), signatures from one chain would be valid on the other.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function _domainSeparator() internal view returns (bytes32) {
    if (block.chainid == _CACHED_CHAIN_ID) {
        return _CACHED_DOMAIN_SEPARATOR;
    }
    return _computeDomainSeparator(); // recompute on fork
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Fail-fast signature validation
&lt;/h3&gt;

&lt;p&gt;The original code ran an expensive &lt;code&gt;keccak256&lt;/code&gt; hash before checking if the signature was even the right length. Flipping the order saves gas on every invalid input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Before: hash first, then check length
bytes32 hash = keccak256(abi.encodePacked(...));
require(signature.length == 65, InvalidSignature());

// After: check length first, hash only if valid
require(signature.length == 65, InvalidSignature());
bytes32 hash = keccak256(abi.encodePacked(...));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Custom errors over string reverts
&lt;/h3&gt;

&lt;p&gt;Replaced all &lt;code&gt;require(condition, "String message")&lt;/code&gt; with custom errors. Each string revert stores the message in bytecode and costs ~50 extra gas per revert.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Before
require(msg.sender == owner, "Not authorized");

// After
error Unauthorized();
if (msg.sender != owner) revert Unauthorized();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Dead test code cleanup
&lt;/h3&gt;

&lt;p&gt;Found leftover &lt;code&gt;console.log&lt;/code&gt; imports and unused test helper functions that had accumulated during rapid iteration. They don't affect runtime gas, but they bloat deployment bytecode.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaway
&lt;/h2&gt;

&lt;p&gt;Code review isn't just about finding bugs. It's about &lt;strong&gt;understanding the machine your code runs on&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The EVM has a 32-byte word size, and every storage slot costs real money. Knowing how the compiler lays out storage is the difference between a contract that costs users $2 per transaction and one that costs $5.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;forge inspect YourContract storage-layout&lt;/code&gt;. Look at your slot assignments. You might be surprised what you find.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This came out of building &lt;a href="https://github.com/iamyxsh/fishnet" rel="noopener noreferrer"&gt;Fishnet&lt;/a&gt; — an open-source security proxy for AI agent transactions on Ethereum. If you're working on AI × Web3 infra, check it out.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>solidity</category>
      <category>ethereum</category>
      <category>web3</category>
      <category>smartcontracts</category>
    </item>
  </channel>
</rss>
