Series: The Learn Arc — 50 posts teaching Active Inference through a live BEAM-native workbench. ← Part 4: The High Road. This is Part 5.
The hero line
The Workbench renders Chapter 4 as:
Every belief, every action, every thought — inside one generative model.
This is the chapter that turns Chapters 1–3's abstract P(observations, states, policies) into four matrices you can write down — and then shows you how to update them using one equation per update type.
If Chapter 2 gave you the functional and Chapter 3 gave you the plan, Chapter 4 gives you the blueprint.
The four matrices
A discrete-time Active Inference agent is, at root, four matrices and two vectors:
- A — P(observation | state). The sensory likelihood. "If the true state is X, what observation would I see?"
- B — P(next_state | state, action). The transition model. "If I'm in state X and do action a, where do I end up?"
- C — P(observation). The preference prior. "Which observations do I prefer to see?" (Chapter 3's P(o).)
- D — P(initial_state). The starting prior. "Where do I think I am at time 0?"
- π — policy. A sequence of actions.
- q — Q(states, policies), the factorized variational posterior the agent actually holds.
That is the whole generative model. You don't need anything else. Everything in Chapters 5–10 is a specialisation of these four matrices (continuous, hierarchical, hybrid, learned, etc.).
Eq. 4.13 — the belief update
Give the agent an observation. The question "what state am I in?" becomes one equation:
q(s_τ) ∝ exp( E_Q[ log A[o_τ, s_τ] ]
+ E_Q[ log B_fwd[s_τ, s_{τ−1}, π] ]
+ E_Q[ log B_back[s_τ, s_{τ+1}, π] ] )
Three terms. The observation likelihood. Forward messages from the past. Backward messages from the future. The agent reads its sensors, then passes messages forward and backward along its policy's state sequence until the posterior stops changing.
The Workbench implements this as a pure function: ActiveInferenceCore.DiscreteTime.state_belief_update/.... Every signal it emits carries equation_id: "eq_4_13_state_belief_update" so Glass can trace it.
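The Workbench's implementation is Elixir, but the math is small enough to sketch anywhere. Here is a minimal Python rendering of one fixed-point sweep of Eq. 4.13 for a single time slice; all names are my own, and `B` is assumed already indexed by the policy's action at this step:

```python
import math

def state_belief_update(A, B, obs, q_prev, q_next):
    """One sweep of Eq. 4.13 for a single time slice.

    A[o][s]        : P(o | s), the sensory likelihood
    B[s2][s]       : P(s2 | s) under the current policy's action
    obs            : index of the observation at this time step
    q_prev, q_next : beliefs over the previous / next hidden state
                     (pass None at the sequence boundaries)
    """
    n = len(A[0])
    logs = []
    for s in range(n):
        term = math.log(A[obs][s] + 1e-16)               # E_Q[log A[o, s]]
        if q_prev is not None:                           # forward message from the past
            term += sum(q_prev[sp] * math.log(B[s][sp] + 1e-16) for sp in range(n))
        if q_next is not None:                           # backward message from the future
            term += sum(q_next[sn] * math.log(B[sn][s] + 1e-16) for sn in range(n))
        logs.append(term)
    # q(s_tau) ∝ exp(...): normalize in log space for stability
    m = max(logs)
    exps = [math.exp(x - m) for x in logs]
    z = sum(exps)
    return [e / z for e in exps]
```

In the real loop you would sweep this over every time slice of the policy's state sequence until the posterior stops changing, which is exactly the "pass messages forward and backward" description above.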
Eq. 4.14 — the policy posterior
For each candidate policy, compute its expected free energy G (Chapter 3). Then the posterior over policies is:
Q(π) = softmax( −G(π) − F(π) )
The softmax gives you a distribution over plans. The expected free energy G has the two columns from Chapter 3. The variational free energy F is the Chapter-2 quantity, scored against the observations already seen. Two terms of cost. One softmax. One plan distribution.
Then the agent acts: argmax over the first-step action marginal, or sample.
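Both steps fit in a few lines. A Python sketch of Eq. 4.14 plus the action-selection marginal, with illustrative names (this is not the Workbench API; `G` and `F` are precomputed per candidate policy):

```python
import math

def policy_posterior(G, F):
    """Eq. 4.14: Q(pi) = softmax(-G(pi) - F(pi)) over candidate policies."""
    logs = [-g - f for g, f in zip(G, F)]
    m = max(logs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logs]
    z = sum(exps)
    return [e / z for e in exps]

def select_action(policies, q_pi):
    """Marginalize Q(pi) onto each policy's first action, then argmax.

    policies : list of action sequences, e.g. [("left", "left"), ("right", "left")]
    q_pi     : the Eq. 4.14 posterior over those policies
    """
    marginal = {}
    for pi, w in zip(policies, q_pi):
        marginal[pi[0]] = marginal.get(pi[0], 0.0) + w
    return max(marginal, key=marginal.get)   # swap for sampling if preferred
```

Note the sign: lower G and lower F mean higher posterior mass, so the cheapest plan (in expected plus incurred free energy) dominates the softmax.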
The Builder canvas
Now that you know the moving parts, /builder/new becomes a lot more interesting:
Drag blocks onto the canvas. Each block is a typed Jido module:
- Perceive — runs Eq. 4.13 on the current observation.
- Plan — computes G for each candidate policy and softmaxes.
- Act — emits the first-step action marginal.
- DirichletAUpdater — learns A online (Chapter 7, Eq. 7.10).
- DirichletBUpdater — learns B online.
- SophisticatedPlanner — deep-horizon belief-propagated policy search.
- Plus skill blocks: EpistemicExplorer, Softmax, KLDivergence, GeneralizedFilter, ...
Each block has typed input/output ports. Drag a wire from one block's obs output to another's obs input. The Inspector on the right validates every param through server-side Zoi. Save the graph as a spec; instantiate it as a real Jido.AgentServer; attach it to a world in Studio. That's it.
This is the moment the book stops being a book. You can now put new generative models on the canvas that aren't in any chapter yet, and run them.
The POMDP recipe
The canonical example is pomdp-tiny-corridor:
Four hidden corridor states {L, M1, M2, R}. Four observations (the agent sees a blurred version of its position). Two actions {left, right}. Preference concentrated on state R. That's the whole problem.
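Before pressing Run, it helps to see what "the whole problem" looks like as numbers. A Python sketch of plausible A/B/C/D matrices for a four-state corridor — the recipe's actual values may differ, and the numbers here are made up for illustration:

```python
STATES = ["L", "M1", "M2", "R"]
ACTIONS = ["left", "right"]

# A[o][s] = P(o | s): a mostly-correct position sensor, blurred to neighbours.
# Each column (one per hidden state) sums to 1.
A = [
    [0.8, 0.1, 0.0, 0.0],
    [0.2, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.2],
    [0.0, 0.0, 0.1, 0.8],
]

def shift(direction):
    """B[s2][s] for a deterministic move in a corridor with walls."""
    n = len(STATES)
    B = [[0.0] * n for _ in range(n)]
    for s in range(n):
        s2 = min(n - 1, s + 1) if direction == "right" else max(0, s - 1)
        B[s2][s] = 1.0
    return B

B = {a: shift(a) for a in ACTIONS}   # one transition matrix per action

C = [0.05, 0.05, 0.05, 0.85]   # preference prior: the agent wants to observe R
D = [1.0, 0.0, 0.0, 0.0]       # starting prior: it believes it begins at L
```

Two matrices, two vectors, one transition function. That really is the entire model specification for this world.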
Press Run in Studio and you'll watch:
- The agent's belief heatmap (pumping out of Eq. 4.13).
- Its policy posterior (top-3 plans, F + G per plan).
- Its predicted trajectory (what positions it expects to visit under the top plan).
- The actual step count and terminal flag.
Everything in the UI is wired to one of the four matrices. The heatmap is A × q(s). The policy bars are the Eq. 4.14 softmax. The trajectory is B applied to q(s) for each planned step.
Glass: every signal, every equation
This is where Glass Engine pays off:
Glass subscribes to the agent's telemetry bus and renders every signal with its provenance: agent_id, spec_id, world_run_id, equation_id. Click any signal in the river and you land on its equation page.
So when the agent does something that looks weird — say it picks left when you expected right — you can scroll back, find the signal, click the Eq. 4.14 label, read the softmax that produced that choice, and see the actual F + G values at that tick. No guessing, no "what was the model thinking" — the model is a small, traceable program and every step is a committed event in Mnesia.
That is the payoff of building this on BEAM. Active Inference is a glass-box theory of agency. The implementation should be glass-box too.
Eq. 4.19 — the continuous-time bonus
Chapter 4 ends with a beautiful move: the same generative model in continuous time, with states that have velocities, accelerations, jerks — "generalised coordinates." Eq. 4.19 rewrites free energy as a quadratic form in those generalised coords. Predictive coding falls out one hierarchy level at a time.
The Workbench scaffolds this in WorldPlane.ContinuousWorlds and the generalized_filter skill. We'll come back to Eq. 4.19 in Part 9 (Chapter 8, Continuous Time). For now: note that the discrete A/B/C/D story and the continuous-coords story are the same functional, just different coordinate systems on the same manifold. That's not a coincidence; that's the whole bet of the Free Energy Principle.
Run it yourself
- /cookbook/pomdp-tiny-corridor — the canonical four-state example.
- /builder/new?recipe=pomdp-tiny-corridor — open the same spec on the canvas and tweak it.
- /cookbook/dirichlet-learn-a-matrix — watch the sensory likelihood A get learned online.
- /cookbook/sophisticated-plan-tree-search — deeper policy horizon via belief-propagated search.
- /equations/eq_4_13_state_belief_update — the belief-update equation page (verified against source + appendix).
- /glass — stream every signal every agent emits, labeled with the equation that produced it.
Open the POMDP recipe, press Run in Studio, then open /glass/agent/<id> in another tab. Keep both tabs visible. Now every tick of the agent shows you the matrix math in the first tab and the signal provenance in the second. You are debugging the cortex.
The mental move
We now have the whole shape of Active Inference:
- A generative model — four matrices A, B, C, D plus a variational posterior Q over states and policies.
- A Perceive step — Eq. 4.13 updates q(s).
- A Plan step — Eq. 4.14 softmaxes over −G − F for each candidate policy.
- An Act step — sample or argmax from the first-step action marginal.
- A Learn step (optional) — Dirichlet updates to A, B (Chapter 7).
Five primitives. Every subsequent chapter specialises them. Chapter 5 makes the Perceive step biologically plausible. Chapter 6 gives you a step-by-step design recipe. Chapter 7 puts it all together on proper POMDPs. Chapter 8 lifts it to continuous time. Chapter 9 fits it to data. Chapter 10 asks what else it might mean.
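The five primitives compose into one loop. A skeletal Python sketch — the method names here are illustrative, not the Workbench's Jido API:

```python
def run_episode(world, agent, max_steps=20):
    """Perceive -> Plan -> Act -> Learn, once per tick, until the world says done."""
    obs = world.reset()
    for _ in range(max_steps):
        agent.perceive(obs)        # Eq. 4.13: update q(s) from the observation
        q_pi = agent.plan()        # Eq. 4.14: softmax over -G - F per policy
        action = agent.act(q_pi)   # argmax/sample the first-step action marginal
        obs, done = world.step(action)
        agent.learn(obs, action)   # optional: Dirichlet updates to A, B
        if done:
            break
```

Every subsequent chapter swaps out the body of one of these calls while the loop itself stays fixed.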
This is the book's hinge. Everything before here was motivation; everything after is variation.
What the first 5 posts gave you
- Part 1 — the map: why this exists, what the Workbench is, how to follow.
- Part 2 — the loop: Perceive → Plan → Act, and the Markov blanket that separates agent from world.
- Part 3 — the low road: Bayes → variational free energy, in one gradient.
- Part 4 — the high road: Expected Free Energy as a two-column bill, risk + ambiguity.
- Part 5 — the concrete model: A, B, C, D, Eq. 4.13, Eq. 4.14, Builder, Glass.
If you've been reading along, you now have enough to:
- Open any cookbook recipe and know what the Math block is saying.
- Read any signal in Glass and understand what equation emitted it.
- Drop blocks on the Builder canvas and know which matrix each one touches.
- Ship your first Active Inference agent. (Chapter 6 — Part 7 of this series — is the step-by-step recipe.)
Next
Part 6: Chapter 5 — Message Passing and Neurobiology. The cortex as a factor graph. Why ACh, noradrenaline, dopamine, and serotonin show up as precision knobs in exactly the places Eq. 4.13 puts them. The clearest map from free-energy math to actual neural circuits the book offers.
⭐ Repo: github.com/TMDLRG/TheORCHESTRATEActiveInferenceWorkbench · MIT license
📖 Active Inference, Parr, Pezzulo, Friston — MIT Press 2022, CC BY-NC-ND: mitpress.mit.edu/9780262045353/active-inference
← Part 4: The High Road · Part 5: Generative Models (this post) · Part 6: Message Passing → coming soon