<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergio Solis</title>
    <description>The latest articles on DEV Community by Sergio Solis (@sergio_solis).</description>
    <link>https://dev.to/sergio_solis</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3852386%2F60af3385-a6a1-494a-892d-803ffb0843a0.jpg</url>
      <title>DEV Community: Sergio Solis</title>
      <link>https://dev.to/sergio_solis</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sergio_solis"/>
    <language>en</language>
    <item>
      <title>The Frozen Context Pattern: Adding State to Deep Equilibrium Models</title>
      <dc:creator>Sergio Solis</dc:creator>
      <pubDate>Mon, 30 Mar 2026 22:38:30 +0000</pubDate>
      <link>https://dev.to/sergio_solis/the-frozen-context-pattern-adding-state-to-deep-equilibrium-models-3b2e</link>
      <guid>https://dev.to/sergio_solis/the-frozen-context-pattern-adding-state-to-deep-equilibrium-models-3b2e</guid>
      <description>&lt;p&gt;DEQ models converge only if their update function is a contraction. We found a design pattern that lets you inject arbitrary external state — Mamba memory, attention, anything — without touching the Lipschitz bound.&lt;br&gt;
tags: rust, machinelearning, ai, webgpu&lt;/p&gt;



&lt;p&gt;We're building &lt;a href="https://github.com/SergioAriel/aideen" rel="noopener noreferrer"&gt;AIDEEN&lt;/a&gt;, an open-source AI engine in Rust that runs on consumer GPUs via WebGPU. The core is a &lt;strong&gt;Deep Equilibrium Model&lt;/strong&gt; (DEQ) — a single parameter block iterated until it converges to a fixed point.&lt;/p&gt;

&lt;p&gt;DEQs have a hard constraint: the update function must be a &lt;strong&gt;contraction&lt;/strong&gt;. Every component you add risks widening the Lipschitz bound and breaking convergence. We ran into this wall trying to add Mamba-style temporal memory, then attention. Both broke convergence the same way. Both were fixed the same way.&lt;/p&gt;

&lt;p&gt;Here's the pattern.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Constraint
&lt;/h2&gt;

&lt;p&gt;A DEQ finds h* by iterating:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;h^(k+1) = f(h^(k); x)    until    |h^(k+1) - h^(k)| &amp;lt; epsilon
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Convergence requires L(f) &amp;lt; 1. We enforce this via spectral normalization on every weight matrix, applied every 4 gradient steps. This works cleanly as long as the Jacobian &lt;code&gt;df/dh&lt;/code&gt; contains only f's direct dependence on h.&lt;/p&gt;

&lt;p&gt;The moment you add something that couples back to h inside the loop, the Jacobian grows cross-terms that blow up the bound.&lt;/p&gt;
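
&lt;p&gt;For concreteness, a minimal sketch of this kind of spectral normalization in framework-free Python (the power-iteration count and the 0.95 target here are illustrative assumptions, not AIDEEN's exact values):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

def spectral_norm_project(W, target=0.95, n_power_iters=5):
    """Rescale W so its largest singular value is at most `target`.

    Power iteration estimates sigma_max(W); if it exceeds the
    target, W is scaled down uniformly. Applied to every weight
    matrix, this keeps the h-dependent path of f a contraction.
    """
    u = np.random.randn(W.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_power_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v              # estimated largest singular value
    if sigma &amp;gt; target:
        W = W * (target / sigma)
    return W
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;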

&lt;h2&gt;
  
  
  What Happens When You Ignore It
&lt;/h2&gt;

&lt;p&gt;We tried adding a Mamba-style SSM inside the loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;h^(k+1) = f(h^(k), M^(k); x)
M^(k+1) = g(h^(k), M^(k))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now h depends on M, which depends on h. The combined Jacobian has cross-terms. In practice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Spectral norm of the combined system exceeded 1.0&lt;/li&gt;
&lt;li&gt;Picard iteration hit the cap without converging&lt;/li&gt;
&lt;li&gt;The model oscillated indefinitely&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No amount of damping fixed it. The feedback loop is structural — you can't regularize your way out of it.&lt;/p&gt;

&lt;p&gt;Same thing happened when we naively put slot attention inside the iteration: the attention weights depend on h, V depends on h, the output feeds back into h. Same instability.&lt;/p&gt;
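
&lt;p&gt;A toy illustration of why this is structural (two coupled scalars, not AIDEEN's actual operators): each update below is contractive in its own variable, yet the joint system is expansive, and damping provably cannot save it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

# h_next = 0.5*h + 0.9*M     (contractive in h alone: |0.5| &amp;lt; 1)
# M_next = 0.9*h + 0.5*M     (contractive in M alone: |0.5| &amp;lt; 1)
J = np.array([[0.5, 0.9],
              [0.9, 0.5]])
print(np.linalg.norm(J, 2))    # 1.4: the joint spectral norm exceeds 1

z = np.array([1.0, 0.5])       # the pair (h, M)
for _ in range(50):
    z = J @ z                  # Picard iteration on the coupled system
print(np.linalg.norm(z))       # ~2e7: diverges

# Damping doesn't help: the damped map (1-beta)*I + beta*J has
# eigenvalue 1 + 0.4*beta &amp;gt; 1 for every beta in (0, 1].
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;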

&lt;h2&gt;
  
  
  The Pattern: Frozen Context
&lt;/h2&gt;

&lt;p&gt;The fix is the same in both cases. Any external component — Mamba state, attention, anything — can safely enter the DEQ if it follows this structure:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Compute it once&lt;/strong&gt;, before the Picard loop starts, from the previous converged state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Freeze it&lt;/strong&gt; — treat it as a constant during iteration (stop-gradient)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inject it additively&lt;/strong&gt; into the loop body&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update it&lt;/strong&gt; after convergence, using h*
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ── Prelude (before the loop) ───────────────────────────
&lt;/span&gt;&lt;span class="n"&gt;ctx_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;component_A&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev_state_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# frozen — computed once
&lt;/span&gt;&lt;span class="n"&gt;ctx_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;component_B&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev_state_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# frozen — computed once
&lt;/span&gt;
&lt;span class="c1"&gt;# ── Picard loop ─────────────────────────────────────────
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_iters&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;h_next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h_curr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;ctx_A&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;ctx_B&lt;/span&gt;   &lt;span class="c1"&gt;# ctx never changes
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;converged&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h_next&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h_curr&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;
    &lt;span class="n"&gt;h_curr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;h_next&lt;/span&gt;

&lt;span class="c1"&gt;# ── Post-convergence updates ────────────────────────────
&lt;/span&gt;&lt;span class="n"&gt;state_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;update_A&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h_star&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prev_state_A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;state_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;update_B&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h_star&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prev_state_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this works: &lt;code&gt;ctx_A&lt;/code&gt; and &lt;code&gt;ctx_B&lt;/code&gt; are constants with respect to h. The Jacobian &lt;code&gt;df/dh&lt;/code&gt; contains no cross-terms from them. Spectral normalization of the h-dependent path alone is sufficient.&lt;/p&gt;

&lt;p&gt;The frozen terms still shift h* — they participate in the final fixed point. They just don't affect the convergence guarantee.&lt;/p&gt;
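
&lt;p&gt;The same fact in one dimension (a throwaway example, not the real f): a frozen additive term changes where the iteration lands, not how fast it gets there.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# f(h) = 0.8*h is a contraction with L = 0.8. Adding a frozen ctx
# keeps d(step)/dh = 0.8 but moves the fixed point to
# h* = ctx / (1 - 0.8) = 5 * ctx.
def step(h, ctx):
    return 0.8 * h + ctx    # derivative w.r.t. h is 0.8, ctx-free

h = 0.0
for _ in range(100):
    h = step(h, ctx=1.0)
print(h)                    # -&amp;gt; 5.0, at the same geometric rate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;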

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1x1s7co0w280v34vl3v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1x1s7co0w280v34vl3v.png" alt="AIDEEN architecture: Mamba outside the DEQ loop"&gt;&lt;/a&gt;---&lt;/p&gt;

&lt;h2&gt;
  
  
  Applied to Mamba
&lt;/h2&gt;

&lt;p&gt;For temporal memory across tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Prelude: M_{t-1} → hist_ctx (frozen)
&lt;/span&gt;&lt;span class="n"&gt;hist_ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;gate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;W_hist&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;M_prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# stop-gradient
&lt;/span&gt;
&lt;span class="c1"&gt;# Loop: hist_ctx is read-only
&lt;/span&gt;&lt;span class="n"&gt;h_next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RMSNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attn_signal&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;slot_bias&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hist_ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Post-convergence: update M
&lt;/span&gt;&lt;span class="n"&gt;M_t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;M_prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;x_proj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h_star&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Mamba state carries temporal information token-to-token. The DEQ sees it as a fixed bias: it shifts the fixed point but doesn't affect contractivity.&lt;/p&gt;
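
&lt;p&gt;Stitched into one per-token step, the three phases look roughly like this (a sketch reusing the names above; &lt;code&gt;deq_core&lt;/code&gt;, &lt;code&gt;rmsnorm&lt;/code&gt;, and &lt;code&gt;h_init&lt;/code&gt; are stand-ins for our actual operators, and in an autograd framework &lt;code&gt;M_prev&lt;/code&gt; would be detached before the prelude):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def token_step(x_t, h_init, M_prev, max_iters=20):
    # Prelude: history context from M_{t-1}, frozen for the whole loop.
    hist_ctx = gate(W_hist @ M_prev)          # stop-gradient here

    # Picard loop: hist_ctx is a constant bias, so the Jacobian
    # w.r.t. h comes from deq_core alone.
    h = h_init
    for _ in range(max_iters):
        h_next = rmsnorm(deq_core(h, x_t) + slot_bias + hist_ctx)
        if converged(h_next, h):
            break
        h = h_next

    # Post-convergence: M advances exactly once, using h*.
    M_t = a * M_prev + (1 - a) * x_proj(h_next)
    return h_next, M_t
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;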

&lt;h2&gt;
  
  
  Applied to Slot Attention
&lt;/h2&gt;

&lt;p&gt;Same pattern: Q, K, and V are projected from the previous converged state, and the attention output is computed once and frozen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Prelude: compute attention from H_prev
&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;project&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;H_prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;attn_ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;   &lt;span class="c1"&gt;# frozen
&lt;/span&gt;
&lt;span class="c1"&gt;# Loop: attn_ctx is read-only
&lt;/span&gt;&lt;span class="n"&gt;h_next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RMSNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signal&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;attn_ctx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hist_ctx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;slot_bias&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cross-slot attention runs once per token at full cost, but zero times per Picard iteration. At the 5-6 iterations per token we average (see the table below), that is roughly a 5-6x saving on the attention term versus recomputing it in the loop. The DEQ refines h against a fixed attention context instead of competing with a moving one.&lt;/p&gt;

&lt;h2&gt;
  
  
  In the GPU Shader
&lt;/h2&gt;

&lt;p&gt;The boundary is explicit in our WGSL shaders. Inside the Picard loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// hist_ctx was computed in the prelude from M_{t-1}.
// ∂hist_ctx/∂h = 0 — no contribution to the Lipschitz bound.
let hist_ctx    = Scratch[hist_ctx_base + slot * d_model + d];  // READ ONLY
let attn_ctx    = Scratch[attn_base    + slot * d_model + d];  // READ ONLY
let attn_signal = Scratch[signal_base  + slot * d_model + d];  // h-dependent

let final_h = attn_signal + attn_ctx + hist_ctx + slot_bias;
H_next[...] = final_h;   // h updated — external state untouched
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After convergence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// M_t written exactly once, here.
let m_new = alpha * M_prev + (1.0 - alpha) * x_proj(h_star);
H_curr[carry_base + slot * d_model + d] = m_new;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;∂/∂h = 0&lt;/code&gt; annotation is load-bearing. It's what lets spectral normalization work on the h-dependent path alone.&lt;/p&gt;
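
&lt;p&gt;One way to check that claim empirically (a debugging sketch, not part of AIDEEN): estimate the Lipschitz constant of the loop body over sampled state pairs. The frozen terms cancel in the difference, so the estimate reflects the h-dependent path alone.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

def lipschitz_estimate(step_fn, ctx, dim, n_samples=256):
    """Empirical lower bound on L of h -&amp;gt; step_fn(h, ctx).

    ctx is held fixed, so step_fn(h1, ctx) - step_fn(h2, ctx)
    contains no contribution from the frozen context: exactly
    the quantity the contraction guarantee is about.
    """
    worst = 0.0
    for _ in range(n_samples):
        h1, h2 = np.random.randn(dim), np.random.randn(dim)
        num = np.linalg.norm(step_fn(h1, ctx) - step_fn(h2, ctx))
        den = np.linalg.norm(h1 - h2)
        worst = max(worst, num / den)
    return worst   # want this comfortably below 1; we see &amp;lt; 0.85
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;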

&lt;h2&gt;
  
  
  Does It Hold?
&lt;/h2&gt;

&lt;p&gt;After adopting this pattern for both Mamba and attention:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Picard convergence rate&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100%&lt;/strong&gt; (0 unconverged tokens)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average iterations&lt;/td&gt;
&lt;td&gt;5-6 per token (cap: 20)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contractivity&lt;/td&gt;
&lt;td&gt;&amp;lt; 0.85 throughout training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Training stability&lt;/td&gt;
&lt;td&gt;12+ hours continuous on AMD Radeon 780M (2GB VRAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The history signal contributes a stable context (hist-to-injection magnitude ratio ~ 0.25) — temporal information flows into the DEQ without destabilizing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The General Rule
&lt;/h2&gt;

&lt;p&gt;Whether a component can enter a DEQ through this pattern comes down to one test:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Eligible&lt;/strong&gt;: anything whose output can be computed from the &lt;em&gt;previous&lt;/em&gt; converged state — recurrent memory, attention over past tokens, retrieved embeddings, external signals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not eligible&lt;/strong&gt;: anything that must see the &lt;em&gt;current&lt;/em&gt; h to compute its output and feeds back into f — that creates the circular dependency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The frozen context pattern lets DEQ models grow in expressivity without paying in convergence stability. The Lipschitz constraint stays local to the h-dependent core.&lt;/p&gt;
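
&lt;p&gt;The rule can be enforced by structure. A sketch of the interface a frozen-context component might satisfy (hypothetical names, not AIDEEN's API): the component sees h only as last token's converged h* in the prelude and this token's h* in the update, so it cannot create cross-terms.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;class FrozenContextComponent:
    """Eligibility by construction: no method ever reads the in-loop h."""

    def prelude(self, prev_state):
        """Runs once before the Picard loop; the result is frozen."""
        raise NotImplementedError

    def update(self, h_star, prev_state):
        """Runs once after convergence; advances the external state."""
        raise NotImplementedError

class MambaMemory(FrozenContextComponent):
    def prelude(self, M_prev):
        return gate(W_hist @ M_prev)                    # hist_ctx

    def update(self, h_star, M_prev):
        return a * M_prev + (1 - a) * x_proj(h_star)    # M_t
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;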

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;AIDEEN is open source (MIT license), written entirely in Rust with WGSL GPU compute shaders. No Python, no CUDA — runs on any GPU with Vulkan/Metal/DX12/WebGPU support.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/SergioAriel/aideen.git
&lt;span class="nb"&gt;cd &lt;/span&gt;aideen
cargo build &lt;span class="nt"&gt;--release&lt;/span&gt; &lt;span class="nt"&gt;--workspace&lt;/span&gt;
cargo &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="nt"&gt;--workspace&lt;/span&gt; &lt;span class="nt"&gt;--exclude&lt;/span&gt; aideen-block &lt;span class="nt"&gt;--exclude&lt;/span&gt; aideen-engine &lt;span class="nt"&gt;--exclude&lt;/span&gt; aideen-node
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're currently training our first full model and will publish DEQ vs. transformer benchmarks soon.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;References:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bai, Kolter &amp;amp; Koltun (2019). &lt;a href="https://arxiv.org/abs/1909.01377" rel="noopener noreferrer"&gt;Deep Equilibrium Models&lt;/a&gt;. NeurIPS.&lt;/li&gt;
&lt;li&gt;Bai et al. (2021). &lt;a href="https://arxiv.org/abs/2106.14342" rel="noopener noreferrer"&gt;Stabilizing Equilibrium Models by Jacobian Regularization&lt;/a&gt;. ICML.&lt;/li&gt;
&lt;li&gt;Gu &amp;amp; Dao (2023). &lt;a href="https://arxiv.org/abs/2312.00752" rel="noopener noreferrer"&gt;Mamba: Linear-Time Sequence Modeling with Selective State Spaces&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Winston &amp;amp; Kolter (2020). &lt;a href="https://arxiv.org/abs/2006.08591" rel="noopener noreferrer"&gt;Monotone Operator Equilibrium Networks&lt;/a&gt;. NeurIPS.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;We're two developers building this from scratch. If you're interested in DEQ architectures, Rust ML, or WebGPU compute — &lt;a href="https://github.com/SergioAriel/aideen/blob/main/CONTRIBUTING.md" rel="noopener noreferrer"&gt;contributions welcome&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>deeplearning</category>
      <category>machinelearning</category>
      <category>rust</category>
    </item>
  </channel>
</rss>
