ORCHESTRATE
Active Inference, The Learn Arc — Part 33: Session §6.2 — A, B, C, D — Filling the Four Matrices

Session 6.2 — A, B, C, D

Series: The Learn Arc — 50 posts teaching Active Inference through a live BEAM-native workbench. ← Part 32: Session 6.1. This is Part 33.

The session

Chapter 6, §2. Session title: A, B, C, D. Route: /learn/session/6/s2_ab_c_d.

You have your three lists from Session 6.1. Session 6.2 walks through filling each of the four matrices, in order, with a careful eye on the shapes, the normalization constraints, and the common mistakes.

A: P(o | s) — the sensor

Session 4.2 covered A in theory. Session 6.2 covers A in practice:

  1. Pick a template. Most A matrices start near-identity with a small amount of noise off-diagonal (typically 0.01–0.1 of total column mass).
  2. Consider confusability. States that emit similar observations should share off-diagonal mass — that's the structure of real sensor confusion.
  3. Normalize. Every column sums to 1. The Workbench's Zoi validator refuses malformed A.
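The three steps above can be sketched in a few lines. This is a minimal illustration in plain Python lists (no dependencies), not the Workbench's own code: the helper names `near_identity_A` and `normalize_columns`, the four-state size, and the choice of which states are confusable are all assumptions made for the example.

```python
def near_identity_A(n, noise=0.05):
    """Return an n x n likelihood matrix A[o][s] = P(o | s).

    Each column s puts (1 - noise) on the matching observation and
    spreads `noise` evenly over the other n - 1 observations.
    """
    if n < 2:
        raise ValueError("need at least two states")
    off = noise / (n - 1)
    return [[1.0 - noise if o == s else off for s in range(n)]
            for o in range(n)]


def normalize_columns(M):
    """Rescale every column of M (a list of rows) so it sums to 1."""
    n_rows, n_cols = len(M), len(M[0])
    sums = [sum(M[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [[M[r][c] / sums[c] for c in range(n_cols)]
            for r in range(n_rows)]


# Step 1: template with 5% of each column's mass off the diagonal.
A = near_identity_A(4, noise=0.05)

# Step 2: assume states 0 and 1 look alike, so add extra mass to the
# two cells that confuse them.
A[0][1] += 0.10   # P(o=0 | s=1)
A[1][0] += 0.10   # P(o=1 | s=0)

# Step 3: renormalize so every column sums to 1 again.
A = normalize_columns(A)

column_sums = [sum(A[o][s] for o in range(4)) for s in range(4)]
```

Adding the confusion mass before normalizing keeps the order of the steps honest: the diagonal of the two confusable columns shrinks slightly, which is the intended effect.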

B: P(s' | s, a) — the transitions

Three ways to fill B:

  1. Deterministic. For each action, a 0/1 matrix with exactly one 1 per column (a permutation matrix when no two states share a successor). {up: move_up, down: move_down, ...}. Fast, simple, unrealistic for anything continuous.
  2. Stochastic with structure. Most mass on the expected transition, small mass on "slip" outcomes. Grid worlds often use this.
  3. Learned Dirichlet. Start with a flat prior and let the agent update it as it observes (s, a, s') triples. Chapter 7 Session 7.3 covers this.
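A rough sketch of all three options, assuming a hypothetical four-state corridor with `left` and `right` actions. The indexing `B[s_next][s][a]`, the "slip means stay put" rule, and the helper names are choices made for this example rather than anything the Workbench prescribes. The learned case uses the standard Dirichlet posterior mean (counts divided by the column total).

```python
N_STATES = 4
ACTIONS = ["left", "right"]   # hypothetical action set for a 1-D corridor


def intended_next(s, action):
    """Deterministic target state; walls clamp the move."""
    step = -1 if action == "left" else 1
    return min(max(s + step, 0), N_STATES - 1)


def zeros():
    """Fresh array indexed as [s_next][s][a]."""
    return [[[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
            for _ in range(N_STATES)]


def build_B(slip=0.0):
    """Return B[s_next][s][a] = P(s_next | s, a).

    With probability `slip` the agent stays put instead of moving.
    slip=0.0 gives the deterministic case.
    """
    B = zeros()
    for a, action in enumerate(ACTIONS):
        for s in range(N_STATES):
            B[intended_next(s, action)][s][a] += 1.0 - slip
            B[s][s][a] += slip
    return B


def expected_B(counts):
    """Posterior-mean transition probabilities from Dirichlet counts."""
    B = zeros()
    for a in range(len(ACTIONS)):
        for s in range(N_STATES):
            total = sum(counts[sn][s][a] for sn in range(N_STATES))
            for sn in range(N_STATES):
                B[sn][s][a] = counts[sn][s][a] / total
    return B


B_det = build_B(slip=0.0)    # option 1
B_slip = build_B(slip=0.1)   # option 2

# Option 3: flat prior of one pseudo-count per cell, then add one
# count for each observed (s, a, s') triple.
counts = [[[1.0 for _ in ACTIONS] for _ in range(N_STATES)]
          for _ in range(N_STATES)]
for s, a, s_next in [(0, 1, 1), (0, 1, 1), (0, 1, 0)]:
    counts[s_next][s][a] += 1.0
B_learned = expected_B(counts)
```

Because the walls clamp movement, two states can share a successor under the same action, so even the deterministic `B_det` is a 0/1 matrix per action rather than a strict permutation.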

C: P(o) — the preferences

C is the simplest conceptually (just a vector over observations) and the trickiest in practice (you're encoding what the agent wants to see):

  • High mass on observations that correspond to the goal state.
  • Near-zero mass on observations you want to avoid (equivalently, a large negative log-preference).
  • Uniform if the agent should only explore.

Session 6.2's advice: start with a one-hot C pointed at the goal, run the agent, watch it beeline. Then soften C (add uniform mass across observations) to see the exploration-exploitation crossover from Session 3.2.
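The one-hot-then-soften advice can be sketched as a simple mixture with the uniform distribution. The `soften` and `log_preferences` helpers, the goal index, and the `mix` value are illustrative assumptions; the small floor inside the log is a common workaround for zero entries, not something the session specifies.

```python
import math


def soften(C, mix):
    """Blend C with the uniform distribution.

    mix=0.0 returns C unchanged; mix=1.0 returns the uniform vector.
    """
    n = len(C)
    return [(1.0 - mix) * c + mix / n for c in C]


def log_preferences(C, eps=1e-16):
    """Log of C, with a small floor so zero entries do not give -inf."""
    return [math.log(max(c, eps)) for c in C]


n_obs = 4
goal_obs = 3   # hypothetical index of the goal observation

# Start sharp: all preference mass on the goal observation.
C_one_hot = [1.0 if o == goal_obs else 0.0 for o in range(n_obs)]

# Then soften: move 40% of the mass to a uniform background.
C_soft = soften(C_one_hot, mix=0.4)

# Fully uniform preferences: the "only explore" case.
C_uniform = soften(C_one_hot, mix=1.0)
```

Sweeping `mix` from 0 to 1 gives a one-parameter family of preference vectors, which is a convenient way to look for the crossover while changing nothing else in the spec.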

D: P(s_0) — the initial prior

D is usually uniform unless the agent has reason to believe it starts in a particular state. Two common cases:

  • Unknown start — D uniform, agent localizes itself from observations.
  • Known start — D one-hot at the known starting state.
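To see what the two cases imply, here is a single Bayes step applied to each prior. The three-state model, the 0.9/0.05 likelihood values, and the `posterior` helper are assumptions chosen for the illustration.

```python
def posterior(A, prior, o):
    """One Bayes step: P(s | o) is proportional to A[o][s] * prior[s]."""
    unnorm = [A[o][s] * prior[s] for s in range(len(prior))]
    z = sum(unnorm)
    return [u / z for u in unnorm]


n_states = 3

# Unknown start: every state equally likely.
D_unknown = [1.0 / n_states] * n_states

# Known start: all mass on state 0.
D_known = [1.0 if s == 0 else 0.0 for s in range(n_states)]

# A reasonably sharp sensor, A[o][s] = P(o | s).
A = [[0.90, 0.05, 0.05],
     [0.05, 0.90, 0.05],
     [0.05, 0.05, 0.90]]

# The agent sees observation 2.
belief_unknown = posterior(A, D_unknown, o=2)   # shifts toward state 2
belief_known = posterior(A, D_known, o=2)       # stays on state 0
```

The second result is worth noticing: a one-hot D assigns zero probability to every other state, so no single observation can move the belief off the stated start. Use it only when the start really is known.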

The Zoi-enforced constraints

The Workbench's SpecCompiler + Zoi validator enforces the mathematical constraints at save time:

  • A: shape (|o|, |s|), columns sum to 1, all entries ≥ 0.
  • B: shape (|s|, |s|, |a|) with s' on the first axis; for every (s, a) pair the entries over s' sum to 1, all entries ≥ 0.
  • C: shape (|o|,), non-negative, normalized.
  • D: shape (|s|,), sums to 1, all entries ≥ 0.

Try to save a spec that violates these and Studio refuses. The error message points at the exact offending field. Contract enforced.
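For readers who want to pre-check a spec outside Studio, the four constraints can be approximated with a small standalone checker. This is a hypothetical sketch in plain Python and does not reproduce the SpecCompiler or Zoi API; the function names, the error-string format, and the tolerance are all assumptions.

```python
def check_distribution(values, name, tol=1e-9):
    """Return error strings if `values` is not a probability vector."""
    errors = []
    if any(v < 0 for v in values):
        errors.append(f"{name}: negative entry")
    if abs(sum(values) - 1.0) > tol:
        errors.append(f"{name}: sums to {sum(values):.6g}, expected 1")
    return errors


def validate_spec(A, B, C, D):
    """Check A[o][s], B[s_next][s][a], C[o], D[s].

    Returns a list of error strings; an empty list means all pass.
    """
    n_obs, n_states = len(C), len(D)
    errors = []

    if len(A) != n_obs or any(len(row) != n_states for row in A):
        errors.append(f"A: expected shape ({n_obs}, {n_states})")
    else:
        for s in range(n_states):
            col = [A[o][s] for o in range(n_obs)]
            errors += check_distribution(col, f"A[:, {s}]")

    n_actions = len(B[0][0]) if B and B[0] else 0
    bad_shape = (len(B) != n_states
                 or any(len(row) != n_states for row in B)
                 or any(len(cell) != n_actions for row in B for cell in row))
    if bad_shape:
        errors.append(f"B: expected shape ({n_states}, {n_states}, |a|)")
    else:
        for a in range(n_actions):
            for s in range(n_states):
                col = [B[sn][s][a] for sn in range(n_states)]
                errors += check_distribution(col, f"B[:, {s}, {a}]")

    errors += check_distribution(C, "C")
    errors += check_distribution(D, "D")
    return errors


# Two states, two observations, one action (identity transition).
B = [[[1.0], [0.0]],
     [[0.0], [1.0]]]
C = [0.5, 0.5]
D = [0.5, 0.5]

A_good = [[0.9, 0.2],
          [0.1, 0.8]]
A_bad = [[0.9, 0.2],
         [0.2, 0.8]]   # column 0 sums to 1.1

ok_errors = validate_spec(A_good, B, C, D)
bad_errors = validate_spec(A_bad, B, C, D)
```

Returning a list of named errors, rather than raising on the first failure, mirrors the idea of pointing at the exact offending field.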

The concepts this session surfaces

  • A normalization — column-stochasticity.
  • B normalization — column-stochastic per action.
  • C shape — non-negative vector over observations.
  • D shape — categorical prior over states.

The quiz

Q: You fill A with off-diagonal mass of 0.5 and diagonals of 0.5 (evenly split). What's the likely behavior?

  • ☐ The agent will perform normally.
  • ☐ The agent's belief heatmap won't sharpen with observations. ✓
  • ☐ The agent will hallucinate.
  • ☐ The softmax temperature will diverge.

Why: In a two-state, two-observation model, diagonals of 0.5 and off-diagonals of 0.5 make every entry of A equal, so every observation is equally likely under every state and observations carry no information about state. The likelihood message is flat, so belief doesn't sharpen. With more states, the same happens whenever all columns of A are identical. Concentrate mass on the diagonal if you want informative sensing.
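The effect is easy to reproduce by hand. The sketch below assumes a two-state model, a made-up sequence of three identical observations, and a 0.9/0.1 matrix as the informative comparison; `update` and `entropy` are illustrative helpers.

```python
import math


def update(A, belief, o):
    """Bayes step: new belief is proportional to A[o][s] * belief[s]."""
    unnorm = [A[o][s] * belief[s] for s in range(len(belief))]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def entropy(p):
    """Shannon entropy in nats; lower means a sharper belief."""
    return -sum(x * math.log(x) for x in p if x > 0)


A_flat = [[0.5, 0.5],
          [0.5, 0.5]]    # the quiz matrix
A_sharp = [[0.9, 0.1],
           [0.1, 0.9]]   # an informative sensor for comparison

observations = [0, 0, 0]   # hypothetical observation sequence

belief_flat = [0.5, 0.5]
belief_sharp = [0.5, 0.5]
for o in observations:
    belief_flat = update(A_flat, belief_flat, o)
    belief_sharp = update(A_sharp, belief_sharp, o)

# belief_flat is still [0.5, 0.5]; belief_sharp is close to [1, 0].
flat_entropy = entropy(belief_flat)
sharp_entropy = entropy(belief_sharp)
```

With the flat matrix the likelihood term is the same constant for every state, so it cancels in the normalization and the belief never moves, no matter how many observations arrive.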

Run it yourself

The mental move

A, B, C, and D are four matrices (strictly, C and D are vectors). Each has a distinct job. Most bugs trace to exactly one of them being mis-specified. Session 6.2 gives you the discipline to fill them carefully the first time.

Next

Part 34: Session §6.3 — Run and inspect. The final session of Chapter 6. Boot the agent you just built, watch it run in Studio, inspect every signal in Glass. The chapter as a deliverable workflow.


⭐ Repo: github.com/TMDLRG/TheORCHESTRATEActiveInferenceWorkbench · MIT license

📖 Active Inference, Parr, Pezzulo, Friston — MIT Press 2022, CC BY-NC-ND: mitpress.mit.edu/9780262045353/active-inference

Part 32: Session 6.1 · Part 33: Session 6.2 (this post) · Part 34: Session 6.3 → coming soon