DEV Community

Cover image for 🍒 hatsu-yakitori 0.4.8 & 0.4.9: Lean-Verified FMM, HDF5 Input, and a Smarter Frontier
Yoshihiro Hasegawa
Yoshihiro Hasegawa

Posted on

🍒 hatsu-yakitori 0.4.8 & 0.4.9: Lean-Verified FMM, HDF5 Input, and a Smarter Frontier

Hey everyone! πŸ‘‹ Two releases in quick succession means one blog post β€” let's dig into what landed in hatsu-yakitori 0.4.8 and 0.4.9, a Chicken Scheme framework that mashes together the Fast Multipole Method (FMM), Goppa codes on algebraic curves, and Lean 4 formal proofs. Yes, really.

If you haven't heard of the project before: the core idea is treating FMM hierarchies not as ad-hoc data structures but as a genuine algebraic object β€” points on a curve, potentials as rational functions, multipole expansions as Laurent series. It sounds wild, and it is, but it also turns out to be extremely useful for bounded, reproducible numerical computation.

Let's walk through what's new. ✨


πŸ”¬ FMM.lean: Machine-Checked Invariants Now Gate Every Build

The headline feature across both releases is that FMM.lean is now hooked into CI as a hard gate. The lean-fmm-gate.yml GitHub Actions workflow builds HatsuYakitori.FMM and HatsuYakitori.HopfStructure on every push, and the build fails if any proof breaks.

What does that mean in practice? It means the following properties are machine-checked β€” not just unit-tested, but proven β€” before a release ships:

The Four Well-Formedness Invariants

The FmmEvalState.WellFormed predicate enforces four conditions that every state transition must preserve:

  1. Frontier discipline is immutable β€” frontierMode is frozen at initialization from FmmConfig and never changes during evaluation.
  2. Golay class is immutable β€” golayWeight is similarly frozen; the entropy class can't drift mid-run.
  3. All pending level indices are in-bounds β€” every β„“βˆˆpendingLevels\ell \in \texttt{pendingLevels} satisfies β„“<∣hierarchy∣\ell < |\texttt{hierarchy}| .
  4. The current level, if present, is in-bounds β€” same bound check for currentLevel?. These aren't just defensive assertions. The Lean proofs establish them as loop invariants: any well-formed state that produces a valid step yields a well-formed next state. Formally:
WF(s)β€…β€Šβˆ§β€…β€Šstep?(s)=someΒ stepβ€…β€ŠβŸΉβ€…β€ŠWF(sβ€²) \mathrm{WF}(s) \;\wedge\; \texttt{step?}(s) = \texttt{some step} \;\Longrightarrow\; \mathrm{WF}(s')

The push and pop operators both come with their own preservation lemmas (pushLevel_wellFormed, popLevel_wellFormed, etc.), so there's no escape hatch.

The CI Gate Theorem: smallBranchCoverage_ci_gate

The most visible proof in the gate is smallBranchCoverage_ci_gate, which asserts that both the far-field and near-field execution branches behave correctly end-to-end on a pair of minimal oracle inputs. The proof body itself is a single line:

theorem smallBranchCoverage_ci_gate : (...) ∧ (...) := by
  exact ⟨smallOracle_ci_gate, smallNearOracle_ci_gate⟩
Enter fullscreen mode Exit fullscreen mode

All the work is in the two per-branch lemmas; the gate just conjoins them. This means adding a third branch in the future is literally one more pair β€” the architecture scales cleanly. Let's look at each oracle:

Far-field oracle (smallOracle): Grid = 1,2βŠ‚C{1, 2} \subset \mathbb{C} , charge q0=1q_0 = 1 , target = 22 . The cell center is z0=1z_0 = 1 , distance d=1β‰₯r⋆=0.5d = 1 \geq r_\star = 0.5 , so this is classified far-field.

  • P2M gives M0=q0β‹…(z0βˆ’z0)0=1M_0 = q_0 \cdot (z_0 - z_0)^0 = 1
  • M2L gives L0=1/(zTβˆ’z0)β‹…M0=1L_0 = 1 / (z_T - z_0) \cdot M_0 = 1
  • Result: Ο•=1\phi = 1 βœ…
    Near-field oracle (smallNearOracle): Same grid, but r⋆=2r_\star = 2 , so d=1<r⋆d = 1 < r_\star β€” near-field now.

  • Direct sum: Ο•near=q0/(zTβˆ’z0)=1/(2βˆ’1)=1\phi_{\text{near}} = q_0 / (z_T - z_0) = 1 / (2 - 1) = 1

  • Result: Ο•=1\phi = 1 βœ…
    Both oracles arrive at the same potential through completely different code paths. The theorem pins down the P2M β†’ M2L pipeline, the direct-sum kernel, the frontier pop mechanics, and the potential accumulation arithmetic β€” all in one shot. If any of that regresses, the build breaks. 🚦


🌊 HDF5 Input Support: Real Datasets into the Proof Flow

Release 0.4.9 ships first-class HDF5 support, which is one of the more practically useful additions if you're running actual simulations.

The entry point is the new --hdf5 FILE CLI option:

hatsu-fmm --hdf5 examples/fmm/plasma_landau_mock.h5 sbv-so-fmm
Enter fullscreen mode Exit fullscreen mode

This forwards an HDF5 dataset directly to the SBV/SMT proof harness. The Shake rule hdf5-scan additionally produces JSON header dumps from any HDF5 files under the repository:

cabal run shake -- hdf5-scan
# β†’ build/hdf5/<name>.json from h5dump -H
Enter fullscreen mode Exit fullscreen mode

The CI runner installs libhdf5-dev and hdf5-tools automatically on Ubuntu, so HDF5-enabled runs work out of the box on the release pipeline. A sample dataset (examples/fmm/plasma_landau_mock.h5) is included so you can reproduce the canonical run locally without needing your own data.

One thing to note: h5dump needs to be in PATH. If it's missing, the Shake rule fails loudly and tells you exactly what to install β€” no silent wrong-answer behavior.


πŸŽ›οΈ CLI Options Reference

The hatsu-fmm binary (available as a .deb for Linux or via the GitHub Pages APT repo) now exposes a fairly complete option surface. Here's the full breakdown:

Operating Modes

Mode What it does
--help Prints CLI usage and examples
--check-env Verifies kernel import and reports machine constants
--check Lightweight self-check: constants, grid/hierarchy generation, one synthetic eval
--list-caps Prints the packaged capability surface for this release
--dry-run Parses options and reports the planned evaluation without executing
--benchmark Repeats evaluation for timing measurements
--explain TOPIC Prints explanation for a known runtime topic or limitation

Main Options

Option Meaning Default
-p, --precision INT Accuracy target β†’ effective multipole order 8
--order INT Explicit multipole order override 8
-t, --threads INT Requested worker count (best-effort) 1
--theta FLOAT Admissibility hint for near/far separation 0.5
--input PATH Read a Scheme problem form from file synthetic
--grid-size INT Particle count for generated synthetic input 64
--target-index INT Target particle index in generated grid 0
--hdf5 FILE HDF5 input for proof/SBV targets and hdf5-scan none
--frontier-bits INT Golay-controlled frontier bits for traversal 0
--iterations INT Repetitions in benchmark mode 3

Quick examples:

# Sanity check after install
hatsu-fmm --check-env

# Dry run at scale before committing to a big job
hatsu-fmm --dry-run --grid-size 1000000 -p 12

# Benchmark with a real input file, 5 repetitions
hatsu-fmm --input examples/fmm/sample_problem.scm --benchmark --iterations 5

# Run SBV proof with a real HDF5 dataset
hatsu-fmm --hdf5 examples/fmm/plasma_landau_mock.h5 sbv-so-fmm
Enter fullscreen mode Exit fullscreen mode

🧠 How the Golay Frontier Invariants Actually Work

This is the part people find most surprising about the project, so let's slow down on it.

The core idea: instead of hardcoding "use DFS" or "use BFS", the traversal strategy is derived from the Hamming weight of a Golay[24,12] codeword. The Golay code has a beautiful combinatorial structure β€” its valid codeword weights form the set 0,8,12,16,24{0, 8, 12, 16, 24} β€” and that structure is now enforced by the Lean type system.

The Phase Boundary at Ο„=12\tau = 12

FrontierMode.ofTau implements the key decision:

FrontierMode(w)={stackΟ„<12(lowΒ entropyβ†’DFS,Β exploitation)Β queueΟ„β‰₯12(highΒ entropyβ†’BFS,Β exploration) \texttt{FrontierMode}(w) = \begin{cases} \texttt{stack} & \tau < 12 \quad (\text{low entropy} \to \text{DFS, exploitation}) \ \texttt{queue} & \tau \geq 12 \quad (\text{high entropy} \to \text{BFS, exploration}) \end{cases}

The midpoint Ο„=12\tau = 12 is the dodecad class of the Golay code β€” the most symmetric codewords. It acts as a phase boundary between two extremal traversal regimes. You can think of it as an information-theoretic thermostat for your N-body solver.

Push and Pop Are Provably Discipline-Preserving

The frontier mode is frozen at initialization and can never change during evaluation β€” this is Invariant #1 from the well-formedness predicate above. Both push and pop are proven to leave frontierMode and golayWeight unchanged:

push(s,β„“)={β„“::s.pendingLevelsifΒ frontierMode=stack(LIFO)Β s.pendingLevels++[β„“]ifΒ frontierMode=queue(FIFO) \texttt{push}(s, \ell) = \begin{cases} \ell :: s.\texttt{pendingLevels} & \text{if frontierMode} = \texttt{stack} \quad (\text{LIFO}) \ s.\texttt{pendingLevels} \mathbin{++} [\ell] & \text{if frontierMode} = \texttt{queue} \quad (\text{FIFO}) \end{cases}

So the same state machine produces DFS or BFS from its initial configuration, with no runtime switching. The Lean lemmas pushLevel_frontierMode and popLevel_frontierMode make this definitionally true, not just documentationally true.

From Scheme to Lean: The REPL Side

In the actual Chicken Scheme egg, make-adaptive-frontier reads the Golay bits and enforces the same weight classes 0,8,12,16,24{0, 8, 12, 16, 24} along with the Non-Happus antitone profile (20 10 0):

(import golay_frontier)

(define cfg (make-adaptive-frontier #x123))

(list (adaptive-frontier-mode cfg)      ; β†’ stack or queue
      (adaptive-frontier-tau cfg)        ; β†’ effective Ο„
      (frontier-respects-witt-symmetry? cfg)  ; β†’ #t or #f
      (golay-valid-weight? (adaptive-frontier-tau cfg)))
Enter fullscreen mode Exit fullscreen mode

frontier-respects-witt-symmetry? checks that the frontier configuration satisfies the Lean-derived invariants at the Scheme level. If you pass in a codeword that doesn't map to a valid weight class, the frontier won't be constructed.


πŸ” The Far-Field Pipeline in Detail

The Geometry Comes First: FmmTransition

Before any arithmetic runs, FmmTransition classifies the step geometrically. What's interesting is that the distance bound is literally baked into the constructor signature β€” it's not a runtime check, it's part of the proof:

inductive FmmTransition (input : FmmInput) (state : FmmEvalState) : FmmStep β†’ Prop where
  | directSum : ... β†’ step.distance < input.config.admissibilityRadius β†’ ...
  | multipole : ... β†’ input.config.admissibilityRadius ≀ step.distance β†’ ...
Enter fullscreen mode Exit fullscreen mode

The near/far cutoff r⋆r_\star isn't just a float comparison buried in an if β€” it's an inequality proposition that the type checker demands evidence for. step?_transition then proves every step produced by a well-formed evaluation satisfies one of these two constructors.

The 4-Stage Multipole Pipeline

Once a cell is classified far-field, four inductive relations chain together as a proof-relevant certificate stack. Each stage requires evidence that the previous stage ran:

Stage Lean type What it certifies
1. Branch selection FmmMultipoleTransition step? state = some step and kernelBranch = .multipole
2. P2M expansion FmmP2MExpansion payload.multipoleCoeffs = p2mCoeffs payload
3. M2L translation FmmM2LTranslation payload.localCoeffs = m2lCoeffs payload
4. State update FmmMultipoleUpdate updatedState = nextState.applyMultipole payload

You physically cannot construct a FmmMultipoleUpdate without a FmmM2LTranslation, which requires a FmmP2MExpansion, which requires a FmmMultipoleTransition. The ordering isn't enforced by a comment or a test β€” it's enforced by dependent types.

P2M (Particle to Multipole): For cell center z0z_0 and source charges qiq_i :

Mk=βˆ‘jqijβ‹…(zijβˆ’z0)k,k=0,…,pβˆ’1 M_k = \sum_j q_{i_j} \cdot (z_{i_j} - z_0)^k, \qquad k = 0, \ldots, p-1

M2L (Multipole to Local): Translate to target zTz_T via the binomial formula:

Lj=βˆ‘k=0pβˆ’1Mkβ‹…(βˆ’1)k(j+kk)(zTβˆ’z0)j+k+1,j=0,…,pβˆ’1 L_j = \sum_{k=0}^{p-1} M_k \cdot \frac{(-1)^k \binom{j+k}{k}}{(z_T - z_0)^{j+k+1}}, \qquad j = 0, \ldots, p-1

The leading coefficient L0L_0 is the far-field potential contribution. The key lemma multipolePayload_leadingLocalCoeff_eq_m2lHead pins this down:

(input.multipolePayload step).leadingLocalCoeff =
  (input.m2lCoeffs (input.multipolePayload step)).getD 0 0
Enter fullscreen mode Exit fullscreen mode

The Observability Layer: FmmExecutionObserves

One more design worth highlighting: FmmExecution (which unifies both branches) is paired with a separate FmmExecutionObserves relation that explicitly names the potential increment Ξ΄\delta :

inductive FmmExecutionObserves ... : FmmStep β†’ FmmEvalState β†’ Potential β†’ Prop where
  | directSum : FmmDirectSumUpdate ... β†’ FmmExecutionObserves ... payload.totalContribution
  | multipole : FmmMultipoleUpdate ... β†’ FmmExecutionObserves ... ((input.m2lCoeffs payload).getD 0 0)
Enter fullscreen mode Exit fullscreen mode

The theorem executionObservation_totalPotential then gives the additive update formula:

sβ€².totalPotential=step.nextState.totalPotential+Ξ΄ s'.\texttt{totalPotential} = \texttt{step.nextState}.\texttt{totalPotential} + \delta

This separation means the CI gate can assert both that the execution happened and that the potential increment is what we expect β€” two distinct claims, cleanly separated.


πŸ—οΈ CI / CD What Changed

Here's the complete CI picture after 0.4.9:

Workflow Trigger What it checks
lean-fmm-gate.yml Every push/PR Builds HatsuYakitori.FMM + HatsuYakitori.HopfStructure in Lean 4
lean-fmm-gate.yml (Scheme step) Every push/PR Runs csi -s tests/fmm_tests.scm on Ubuntu with CHICKEN 5
fmm-deb-release.yml On fmm-v* tags Builds .deb, installs hdf5-tools, publishes to GitHub Releases + APT repo

The Lean gate and the Scheme smoke test run together β€” they're not separate jobs. This means you can't have a state where the proofs pass but the actual implementation is broken, or vice versa.

The APT repository is backed by GitHub Pages and signed with the key at https://yoshyhyrro.github.io/hatsu-yakitori/public.asc. The .deb release URL tracks releases/latest/download so downstream install scripts don't need manual version bumps.


πŸ“¦ Installing

# APT repo (recommended)
curl -fsSL https://yoshyhyrro.github.io/hatsu-yakitori/public.asc \
  | gpg --dearmor \
  | sudo tee /usr/share/keyrings/hatsu-yakitori-archive-keyring.gpg >/dev/null

echo "deb [arch=amd64 signed-by=/usr/share/keyrings/hatsu-yakitori-archive-keyring.gpg] \
  https://yoshyhyrro.github.io/hatsu-yakitori stable main" \
  | sudo tee /etc/apt/sources.list.d/hatsu-yakitori.list >/dev/null

sudo apt update && sudo apt install hatsu-fmm

# Verify
hatsu-fmm --check-env
hatsu-fmm --list-caps
Enter fullscreen mode Exit fullscreen mode

For HDF5 support, you'll also need:

sudo apt install libhdf5-dev hdf5-tools
Enter fullscreen mode Exit fullscreen mode

For the Chicken Scheme egg itself:

chicken-install -n  # validate without installing
chicken-install     # install to your repo
Enter fullscreen mode Exit fullscreen mode

πŸ”­ What's Next

A few directions the project is heading:

  • Elliptic curve grids (genus g=1): Extending make-goppa-grid from the unit circle to Weierstrass β„˜-functions, which would give a unified algebraic alternative to Ewald summation for periodic boundary conditions. This is the big one mathematically.
  • JSON stdin for the classroom RPC demo: Making handle-request callable from any language for educational use.

- Full-library proof CI: A few unrelated Lean modules (WittFoundation, HidaArikiKoikeNotes) need repair before the whole library can gate on CI. Working on it.

πŸ’¬ Wrapping Up

The headline for these two releases is: the math in the comments is now the math in the proofs, and the proofs now block the build. Distance bounds are constructor arguments. Pipeline ordering is enforced by dependent types. The CI gate theorem is one line because all the real work is already proven. HDF5 input connects the abstract verification pipeline to real simulation data. And the CLI is usable enough to actually run experiments without digging into Scheme source.

If this kind of thing β€” formal methods meeting numerical physics, Goppa codes meeting N-body algorithms β€” sounds interesting to you, the repo is at github.com/Yoshyhyrro/hatsu-yakitori. Issues, questions, and PRs welcome. 🎯


Cross-posted from the project wiki. Lean 4 is hard. Golay codes are beautiful. Multipole expansions are Laurent series. That's the whole pitch.

Top comments (0)