<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: alfredopalconit</title>
    <description>The latest articles on DEV Community by alfredopalconit (@alfredopalconit).</description>
    <link>https://dev.to/alfredopalconit</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3879019%2F79b31933-ac8f-4f37-85a3-4ebff7734f50.png</url>
      <title>DEV Community: alfredopalconit</title>
      <link>https://dev.to/alfredopalconit</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alfredopalconit"/>
    <language>en</language>
    <item>
      <title>How to Differentiate Through a Simulation with Mutable State</title>
      <dc:creator>alfredopalconit</dc:creator>
      <pubDate>Tue, 14 Apr 2026 16:56:18 +0000</pubDate>
      <link>https://dev.to/alfredopalconit/how-to-differentiate-through-a-simulation-with-mutable-state-1lje</link>
      <guid>https://dev.to/alfredopalconit/how-to-differentiate-through-a-simulation-with-mutable-state-1lje</guid>
      <description>&lt;p&gt;If you've tried to differentiate through a physics simulation in JAX, you've hit this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jnp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;10.0&lt;/span&gt;  &lt;span class="c1"&gt;# ERROR: JAX arrays are immutable
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;JAX requires pure functions. No mutation. No in-place updates.&lt;/p&gt;

&lt;p&gt;But simulations &lt;strong&gt;are&lt;/strong&gt; mutation. Particles have positions that change. Temperatures evolve. Pressures update. That's how physics works.&lt;/p&gt;

&lt;p&gt;I kept hitting this wall. So I built a language that solves it.&lt;/p&gt;




&lt;h2&gt;The Wall&lt;/h2&gt;

&lt;p&gt;I work in computational science. I write simulations — time-stepping loops where state updates every iteration. When I need to optimize parameters, I need gradients through those simulations.&lt;/p&gt;

&lt;p&gt;Here's what that looks like with today's tools:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JAX&lt;/strong&gt; — Rewrite your entire simulation in a functional style. Every &lt;code&gt;x += delta&lt;/code&gt; becomes &lt;code&gt;x = x.at[i].set(...)&lt;/code&gt;. Every loop becomes &lt;code&gt;lax.scan&lt;/code&gt;. Months of refactoring. The code becomes unrecognizable.&lt;/p&gt;
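&lt;p&gt;For a feel of what that rewrite costs, here is a typical two-line time-stepping update written both ways in plain Python, with a toy stand-in for &lt;code&gt;lax.scan&lt;/code&gt; (a sketch only; real JAX additionally demands &lt;code&gt;jnp&lt;/code&gt; arrays and traceable loop bodies):&lt;/p&gt;

```python
import math

def step(state, _):
    # one timestep as a pure function: no mutation, state in, state out
    x, v = state
    a = -1.0 * x
    v = v + a * 0.01
    x = x + v * 0.01
    return (x, v)

def scan(f, init, n):
    # minimal stand-in for lax.scan: threads state through n steps
    state = init
    for i in range(n):
        state = f(state, i)
    return state

def simulate_imperative():
    # the version a computational scientist writes naturally
    x, v = 1.0, 0.0
    for _ in range(200):
        a = -1.0 * x
        v += a * 0.01      # mutation
        x += v * 0.01      # mutation
    return x

x_scan, _ = scan(step, (1.0, 0.0), 200)
assert math.isclose(x_scan, simulate_imperative())
```

The two agree on the result; the point is that the functional version forces every state variable into an explicit tuple that must be threaded through every function boundary.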

&lt;p&gt;&lt;strong&gt;Zygote (Julia)&lt;/strong&gt; — Handles some mutation, but produces &lt;a href="https://github.com/FluxML/Zygote.jl/issues" rel="noopener noreferrer"&gt;silently incorrect gradients&lt;/a&gt; on other patterns. You don't get an error. You get a wrong answer. Worse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hand-coded adjoint&lt;/strong&gt; — Write 10,000 lines of backward-pass code maintained in parallel with your forward solver. Every bug exists twice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LAMMPS / OpenFOAM / GROMACS&lt;/strong&gt; — Industry-standard simulators. Black boxes. No gradients come out.&lt;/p&gt;

&lt;p&gt;None of these options are acceptable. So I built a fifth one.&lt;/p&gt;




&lt;h2&gt;Meridian&lt;/h2&gt;

&lt;p&gt;A programming language where &lt;code&gt;grad&lt;/code&gt; works through mutation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="n"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;simulate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;Float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Float64&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="nv"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="nv"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
        &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;          &lt;span class="c1"&gt;// mutation&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;          &lt;span class="c1"&gt;// mutation&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Float64&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;grad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;simulate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;wrt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This compiles. The gradient matches finite differences to 10⁻¹⁰.&lt;/p&gt;

&lt;p&gt;No functional rewrite. No tape management. No special syntax for the mutable parts. You write your simulation naturally. You call &lt;code&gt;grad&lt;/code&gt;. It works.&lt;/p&gt;
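&lt;p&gt;The shape of that claim can be spot-checked without Meridian: here is the same loop reproduced in plain Python with a small forward-mode dual-number class, compared against central finite differences (a sketch of the verification idea, not Meridian's reverse-mode machinery):&lt;/p&gt;

```python
import math

class Dual:
    # forward-mode AD: carries a value and its derivative together
    def __init__(self, v, d=0.0):
        self.v, self.d = v, d
    def __add__(self, o):
        o = lift(o)
        return Dual(self.v + o.v, self.d + o.d)
    __radd__ = __add__
    def __mul__(self, o):
        o = lift(o)
        return Dual(self.v * o.v, self.d * o.v + self.v * o.d)
    __rmul__ = __mul__
    def __neg__(self):
        return Dual(-self.v, -self.d)

def lift(x):
    return x if isinstance(x, Dual) else Dual(x)

def simulate(k):
    # the same harmonic-oscillator loop as the Meridian example
    x, v = lift(1.0), lift(0.0)
    for _ in range(200):
        a = -k * x
        v = v + a * 0.01
        x = x + v * 0.01
    return x

g_ad = simulate(Dual(1.0, 1.0)).d   # derivative of final x w.r.t. k
h = 1e-6
g_fd = (simulate(lift(1.0 + h)).v - simulate(lift(1.0 - h)).v) / (2.0 * h)
assert math.isclose(g_ad, g_fd, rel_tol=1e-5, abs_tol=1e-8)
```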




&lt;h2&gt;How It Actually Works&lt;/h2&gt;

&lt;p&gt;The trick is &lt;strong&gt;SSA transformation&lt;/strong&gt; — a standard compiler technique, repurposed for automatic differentiation.&lt;/p&gt;

&lt;p&gt;When you write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;delta&lt;/span&gt;
&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="mf"&gt;3.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compiler transforms it to Static Single Assignment form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight llvm"&gt;&lt;code&gt;&lt;span class="nv"&gt;%x.0&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="err"&gt;const&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;
&lt;span class="nv"&gt;%x.1&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;add&lt;/span&gt; &lt;span class="nv"&gt;%x.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;%delta&lt;/span&gt;
&lt;span class="nv"&gt;%x.2&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;mul&lt;/span&gt; &lt;span class="nv"&gt;%x.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;const&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In SSA, there is no mutation. Every "update" creates a new variable. The compiler then applies reverse-mode AD on this SSA code, producing &lt;strong&gt;new code&lt;/strong&gt; that computes gradients.&lt;/p&gt;
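&lt;p&gt;For this three-line program the generated gradient code is small enough to write out by hand. A plain-Python sketch of what "emitting gradient code" looks like (illustrative; not Meridian's actual emitted IR):&lt;/p&gt;

```python
def forward(delta):
    x0 = 1.0               # %x.0 = const 1.0
    x1 = x0 + delta        # %x.1 = add %x.0, %delta
    x2 = x1 * 3.0          # %x.2 = mul %x.1, const(3.0)
    return x2

def forward_and_grad(delta):
    # forward pass: identical SSA values
    x0 = 1.0
    x1 = x0 + delta
    x2 = x1 * 3.0
    # reverse pass: one adjoint per SSA value, emitted as ordinary
    # straight-line code rather than recorded on a runtime tape
    d_x2 = 1.0             # seed: d(x2)/d(x2)
    d_x1 = d_x2 * 3.0      # mul rule
    d_delta = d_x1 * 1.0   # add rule
    return x2, d_delta

value, grad_delta = forward_and_grad(2.0)
assert value == 9.0 and grad_delta == 3.0
```

Because the reverse pass is itself ordinary straight-line code, it can be fed back through the same transformation.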

&lt;p&gt;The key insight: because the AD output is &lt;strong&gt;code&lt;/strong&gt; (not a runtime tape), you can differentiate the differentiated code. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Second derivatives work&lt;/strong&gt;: &lt;code&gt;d²/dx²(sin(x))&lt;/code&gt; correctly returns &lt;code&gt;-sin(x)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-variable nested grad works&lt;/strong&gt;: differentiate forces with respect to potential parameters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Arbitrary nesting&lt;/strong&gt;: differentiate as many times as you want&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are exactly the things that break in naive tape-based AD systems, like the prototype I built first (PyTorch gets them right, but only with &lt;code&gt;create_graph=True&lt;/code&gt; and careful bookkeeping). The compiler approach fixes them by design.&lt;/p&gt;
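&lt;p&gt;The nesting idea can be seen in miniature with forward-mode dual numbers in plain Python: feed in a dual number whose components are themselves duals, and the second derivative falls out. This only illustrates differentiating a derivative; Meridian itself uses reverse mode over SSA:&lt;/p&gt;

```python
import math

class Dual:
    # forward-mode dual number; components may themselves be Duals,
    # which is what makes nesting (second derivatives) fall out
    def __init__(self, v, d=0.0):
        self.v, self.d = v, d
    def __mul__(self, o):
        o = lift(o)
        return Dual(self.v * o.v, self.d * o.v + self.v * o.d)
    __rmul__ = __mul__
    def __neg__(self):
        return Dual(-self.v, -self.d)

def lift(x):
    return x if isinstance(x, Dual) else Dual(x)

def sin(x):
    if isinstance(x, Dual):
        return Dual(sin(x.v), cos(x.v) * x.d)
    return math.sin(x)

def cos(x):
    if isinstance(x, Dual):
        return Dual(cos(x.v), -sin(x.v) * x.d)
    return math.cos(x)

# differentiate the derivative: nest a Dual inside a Dual
y = sin(Dual(Dual(1.0, 1.0), 1.0))
first = y.v.d    # d/dx   sin(x) at x=1:  cos(1), about  0.540
second = y.d.d   # d2/dx2 sin(x) at x=1: -sin(1), about -0.841
assert math.isclose(first, math.cos(1.0))
assert math.isclose(second, -math.sin(1.0))
```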




&lt;h2&gt;What Broke Along the Way&lt;/h2&gt;

&lt;p&gt;I built a prototype first — a Python package with tape-based AD. It passed 75 out of 80 tests. But it failed on the most important pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# The NNP force-matching pattern:
# Inner grad: forces = -d(energy)/d(positions)
# Outer grad: d(loss)/d(model_parameters)
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;forces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;grad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;energy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wrt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;positions&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# inner
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;mse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forces&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;grad&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wrt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# outer — BROKE
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tape couldn't see through the inner &lt;code&gt;grad&lt;/code&gt;. The inner backward pass returned a plain number, disconnected from the outer tape. The gradient was wrong.&lt;/p&gt;
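&lt;p&gt;The failure mode is easy to reproduce with a toy tape. In this sketch (a hypothetical tape, structurally similar to the prototype's), the inner &lt;code&gt;grad&lt;/code&gt; exits by returning a plain float, so the outer tape never records anything involving the weight and silently reports zero:&lt;/p&gt;

```python
import math

class Var:
    # minimal reverse-mode tape node (fine for the tree-shaped
    # expressions below; leaves have a no-op backward)
    def __init__(self, v):
        self.v, self.g, self._back = v, 0.0, lambda: None
    def __mul__(self, o):
        o = o if isinstance(o, Var) else Var(o)
        out = Var(self.v * o.v)
        def back():
            self.g += o.v * out.g
            o.g += self.v * out.g
            self._back()
            o._back()
        out._back = back
        return out
    __rmul__ = __mul__

def grad_wrt_first(f):
    # inner grad: builds its own tape, then exits by returning a float.
    # anything the caller was taping through is detached here.
    def g(x0, *rest):
        x = Var(x0)
        consts = [r.v if isinstance(r, Var) else r for r in rest]
        out = f(x, *consts)
        out.g = 1.0
        out._back()
        return x.g                     # plain float: the disconnect
    return g

def energy(pos, w):
    return w * pos * pos               # E = w * pos^2

def loss(w):
    force = -1.0 * grad_wrt_first(energy)(1.0, w)   # force = -2*w*pos
    return force * force                            # mse vs. target 0

def outer_grad(f, w0):
    w = Var(w0)
    out = f(w)
    if not isinstance(out, Var):
        return 0.0                     # tape never saw w: silently wrong
    out.g = 1.0
    out._back()
    return w.g

tape_answer = outer_grad(loss, 3.0)
h = 1e-6
fd_answer = (loss(Var(3.0 + h)) - loss(Var(3.0 - h))) / (2.0 * h)
assert tape_answer == 0.0                           # the silent zero
assert math.isclose(fd_answer, 24.0, rel_tol=1e-4)  # the true gradient
```

The finite-difference check exposes the bug; nothing in the tape itself does.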

&lt;p&gt;The compiler fixes this. The inner &lt;code&gt;grad&lt;/code&gt; produces SSA code. The outer &lt;code&gt;grad&lt;/code&gt; transforms that code. No information is lost. The result:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test&lt;/th&gt;
&lt;th&gt;Prototype&lt;/th&gt;
&lt;th&gt;Compiler&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;d²/dx²(sin(x)) at x=1&lt;/td&gt;
&lt;td&gt;0.540 (wrong)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;-0.841&lt;/strong&gt; (correct)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d(force)/d(weight)&lt;/td&gt;
&lt;td&gt;wrong&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;correct&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MD loss through 10 steps&lt;/td&gt;
&lt;td&gt;not testable&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;correct&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;93 Tests, 8 Domains&lt;/h2&gt;

&lt;p&gt;I validated the language specification with 8 simulated domain experts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Neural network potentials&lt;/strong&gt; — optimize force field parameters through MD&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Topology optimization&lt;/strong&gt; — structural design with density gradients&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Climate science&lt;/strong&gt; — calibrate cloud physics against satellite data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Robotics&lt;/strong&gt; — differentiate through contact-rich simulation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protein design&lt;/strong&gt; — end-to-end differentiable sequence optimization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inverse rendering&lt;/strong&gt; — material estimation from photographs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CFD&lt;/strong&gt; — shape optimization through Navier-Stokes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Teaching&lt;/strong&gt; — a professor who wants to assign &lt;code&gt;grad&lt;/code&gt; as homework&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each expert reviewed the spec twice. 16 reviews total. The compiler passes domain-specific tests for all 8.&lt;/p&gt;




&lt;h2&gt;Try It Right Now&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/alfredopalconit/meridian.git
&lt;span class="nb"&gt;cd &lt;/span&gt;meridian

&lt;span class="c"&gt;# Hello, Derivative: f(x) = x² + 3x + 1, f'(2) = 7&lt;/span&gt;
python3 compiler/main.py programs/01_hello_derivative.mer

&lt;span class="c"&gt;# Grad through 100 timesteps of a pendulum simulation&lt;/span&gt;
python3 compiler/main.py programs/02_pendulum.mer

&lt;span class="c"&gt;# Second derivative (broke the prototype, works in compiler)&lt;/span&gt;
python3 compiler/main.py programs/03_second_derivative.mer

&lt;span class="c"&gt;# Cross-variable nested grad (the NNP pattern)&lt;/span&gt;
python3 compiler/main.py programs/04_cross_variable.mer

&lt;span class="c"&gt;# Run all tests&lt;/span&gt;
python3 tests/test_mer_files.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;What's Next&lt;/h2&gt;

&lt;p&gt;This is Phase 0. Scalars only. Interpreted. No GPU. The architecture is proven — now it needs to become useful.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Milestone&lt;/th&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tensor operations&lt;/td&gt;
&lt;td&gt;Month 3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Native binary via Cranelift&lt;/td&gt;
&lt;td&gt;Month 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU via MLIR&lt;/td&gt;
&lt;td&gt;Month 6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First external user runs a real simulation&lt;/td&gt;
&lt;td&gt;Month 12&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;What I'm Looking For&lt;/h2&gt;

&lt;p&gt;I need computational scientists who have a simulation they can't differentiate. Not people who want a faster NumPy. People who have tried JAX and hit the mutation wall. People who maintain hand-coded adjoint solvers. People who gave up on getting gradients through their time-stepping loop.&lt;/p&gt;

&lt;p&gt;If that's you — or you know someone — I'd love to hear what you'd try to differentiate first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/alfredopalconit/meridian" rel="noopener noreferrer"&gt;github.com/alfredopalconit/meridian&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Meridian is open source (MIT). Built from a blank page to a working compiler in one very long conversation with an AI. The spec, the prototype, the compiler, and all 93 tests are in the repo.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
