
ORCHESTRATE


Active Inference, The Learn Arc — Part 29: Session §5.2 — Predictive Coding, Where the Gradient Lives

Session 5.2 — Predictive coding

Series: The Learn Arc — 50 posts teaching Active Inference through a live BEAM-native workbench. ← Part 28: Session 5.1. This is Part 29.

The session

Chapter 5, §2. Session title: Predictive coding. Route: /learn/session/5/s2_predictive_coding.

Session 5.1 gave you the factor-graph picture. Session 5.2 is where that picture starts doing biologically plausible computation. Predictive coding is the name for the gradient-descent step the cortex takes when free-energy minimization runs in continuous time — residuals ascend, predictions descend, the mode of the belief slides toward the data.

The loop

At every level of a hierarchical generative model:

  1. The upper level sends down a prediction of what it thinks the lower level should see.
  2. The lower level compares the prediction to its observation — the difference is the prediction error ε.
  3. The lower level sends the prediction error up — this is the bottom-up signal.
  4. The upper level updates its belief to reduce the prediction error on the next tick.

Four steps. Each level of cortex runs this loop in parallel. In continuous time, the loop is a gradient descent of F in generalised coordinates.
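The four steps above can be sketched as a single tick in plain Python. This is a minimal illustration, not the Workbench API — the names (`predict`, `tick`, the identity generative mapping, the learning rate) are all assumptions made for the sketch:

```python
# One predictive-coding tick between two levels (illustrative, not Workbench code).

def predict(mu_upper):
    # Step 1: the upper level's top-down prediction of the lower level's input.
    # The generative mapping is identity here for simplicity (an assumption).
    return mu_upper

def tick(mu_upper, observation, lr=0.1):
    prediction = predict(mu_upper)       # 1. top-down prediction
    epsilon = observation - prediction   # 2. lower level computes the residual
    # 3. epsilon is the ONLY bottom-up signal the upper level receives
    mu_upper = mu_upper + lr * epsilon   # 4. belief update shrinks the next error
    return mu_upper, epsilon

mu = 0.0
for _ in range(50):
    mu, eps = tick(mu, observation=1.0)

# After enough ticks the upper belief has explained away the residual:
# mu is close to 1.0 and eps is close to 0.
```

Note that the raw observation never reaches the upper level: only `epsilon` crosses the boundary.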

The equation

Eq. 4.19 in its predictive-coding form:

dμ/dt  =  Dμ  −  ∂F/∂μ
  • μ is the posterior mean over hidden states (in generalised coordinates: position, velocity, acceleration...).
  • D is the shift operator — rolls the generalised coords forward (position → velocity, velocity → acceleration, ...).
  • ∂F/∂μ is the gradient of free energy at the current mean. Its structure: prediction error weighted by precision.

Minimise F → the mean slides along the gradient → the prediction error shrinks. That's predictive coding, end to end.
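A toy Euler integration of that equation makes the roles of D and ∂F/∂μ concrete. The model here is an assumption chosen for brevity (a static observation y under a Gaussian likelihood, with a zero-mean shrinkage prior on the velocity order), not the session's own model:

```python
import numpy as np

# Euler integration of d(mu)/dt = D mu - dF/dmu in two generalised coordinates.
# Toy model (an assumption): dF/dmu_0 = -pi_o * (y - mu_0) pulls the position
# toward the data; dF/dmu_1 = pi_s * mu_1 shrinks the velocity order toward zero.

y = 2.0                    # observation
pi_o, pi_s = 1.0, 1.0      # precisions on observation / dynamics errors
mu = np.array([0.0, 0.0])  # generalised mean: (position, velocity)
dt = 0.05

for _ in range(2000):
    D_mu = np.array([mu[1], 0.0])        # D: roll the generalised coords forward
    dF = np.array([-pi_o * (y - mu[0]),  # gradient wrt position
                   pi_s * mu[1]])        # gradient wrt velocity
    mu = mu + dt * (D_mu - dF)

# The mean slides along the gradient: position settles near y, velocity near 0.
```

The update is literally the equation: motion from the shift operator, minus the precision-weighted error gradient.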

Where this matters for design

The practical upshot for the Workbench: when you configure a hierarchical agent with two Perceive blocks (a "lower" level reading observations and an "upper" level inferring a slow context), the two levels communicate only via prediction errors. The upper level does not see the raw observation; it sees the lower level's residual.

This is why /cookbook/predictive-coding-two-level-pass works the way it does — the second-level agent's input is, literally, the first-level agent's bottom-up prediction error signal.

Why precision is the star

Predictive coding weights each prediction error by a precision (inverse variance). High precision = "trust this error, learn from it." Low precision = "this error is noise, ignore it."

Precisions live on every edge of the factor graph:

  • Precision on observation errors → ACh (Chapter 5 Session 5.3).
  • Precision on state-transition errors → NA.
  • Precision on policy-selection errors → DA.

The precision knobs ARE the neuromodulators. Session 5.3 will walk the table explicitly; Session 5.2 plants the concept.

The recipes that exercise this

Each has the Workbench-standard "Run in Studio" button, the Math block, and the path-specific narration.

The concepts this session surfaces

  • Prediction error (ε) — observation minus predicted observation.
  • Precision (Π) — inverse variance; weights on errors.
  • Generalised coordinates — (position, velocity, acceleration, ...) — how continuous states flow.
  • Gradient of F — the precision-weighted sum of errors.
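The last bullet can be written out. This is the standard predictive-coding form of the gradient (a hedged sketch consistent with the bullets above, not quoted from the session), with errors ε_o = y − g(μ) on observations and ε_s = Dμ − f(μ) on dynamics:

```latex
F = \tfrac{1}{2}\,\varepsilon_o^{\top}\Pi_o\,\varepsilon_o
  + \tfrac{1}{2}\,\varepsilon_s^{\top}\Pi_s\,\varepsilon_s + \text{const},
\qquad
\frac{\partial F}{\partial \mu}
  = -\,\partial_\mu g^{\top}\,\Pi_o\,\varepsilon_o
  + \left(D - \partial_\mu f\right)^{\top}\Pi_s\,\varepsilon_s .
```

Every term is an error weighted by its precision Π — the gradient really is nothing but the precision-weighted sum of errors.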

The quiz

Q: In a two-level hierarchical agent, the upper level's input is:

  • ☐ The raw observation from the world.
  • ☐ The lower level's posterior mean (Q(s)).
  • ☐ The lower level's prediction error. ✓
  • ☐ A down-sampled version of the observation.

Why: In predictive coding, each level up the hierarchy sees only the residual — the prediction error — from the level below. The upper level updates its own belief to explain away that residual, and sends its new prediction back down. Raw observations stay at the sensory level; posteriors are communicated as residuals.

Run it yourself

The mental move

Predictive coding is where the theory meets the physiology. Every empirical claim Chapter 5 makes about the cortex descends from this one idea: errors ascend, predictions descend, and the hierarchy's job is to make the two match. You don't have to accept it as the mechanism; you should know it as the theory's commitment.

Next

Part 30: Session §5.3 — Neuromodulation. Where ACh, NA, DA, and 5-HT enter the picture. Each one as a precision knob on a specific part of the factor graph. The most clinically-specific testable claim in the whole book.


⭐ Repo: github.com/TMDLRG/TheORCHESTRATEActiveInferenceWorkbench · MIT license

📖 Active Inference, Parr, Pezzulo, Friston — MIT Press 2022, CC BY-NC-ND: mitpress.mit.edu/9780262045353/active-inference

Part 28: Session 5.1 · Part 29: Session 5.2 (this post) · Part 30: Session 5.3 → coming soon
