Larkos 0.3: GAT neuron reasoning, temporal encoder, refactored fusion head
Core architecture changes:
Add _NeuronGraphReasoner: two-layer GAT over the live neuron graph,
producing per-neuron token embeddings (MAX_NEURONS x FUSE_GRAPH_DMODEL).
Node features include state, output, layer one-hot, connection degree,
state velocity, output magnitude, and mean edge weight (D_NODE=8).
Learned per-neuron embedding ensures distinct tokens for symmetric nodes.
Add _GATLayer: hand-rolled multi-head GAT with edge-weight-modulated
scores (tanh-squashed gain), dense [N,N] adjacency mask, and self-loops.
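As a rough illustration of the _GATLayer description above, the sketch below computes one row of single-head attention weights in dependency-free Python. The gain form `1 + tanh(w)`, the function name, and the argument layout are assumptions made for this example; the commit only states that scores are modulated by a tanh-squashed edge-weight gain, masked by a dense adjacency matrix, and include self-loops.

```python
import math

def gat_attention_row(scores, edge_weights, adj_row, self_index):
    """Attention weights for one target node over all N source nodes.

    scores:       raw attention logits for this node (length N)
    edge_weights: edge weight per source node (length N)
    adj_row:      dense adjacency mask row, truthy where an edge exists
    self_index:   index of the node itself; the self-loop is always kept
    """
    logits = []
    for j, (e, w, a) in enumerate(zip(scores, edge_weights, adj_row)):
        if a or j == self_index:
            # Assumed form of the tanh-squashed gain, bounded in (0, 2).
            gain = 1.0 + math.tanh(w)
            logits.append(e * gain)
        else:
            # Masked pairs get -inf so the softmax gives them zero weight.
            logits.append(-math.inf)
    # Numerically stable softmax; the self-loop guarantees a finite maximum.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [x / total for x in exps]

row = gat_attention_row([1.0, 2.0, 3.0], [0.0, 0.5, 0.0], [0, 1, 0], 0)
isolated = gat_attention_row([0.3, 0.1], [0.0, 0.0], [0, 0], 1)
```

In this sketch the self-loop keeps the softmax well defined even for a node with no neighbours: `isolated` puts all of its weight on the node itself.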
Add _TemporalAttentionEncoder: two-layer transformer over the
[TEMPORAL_WINDOW, FOURIER_OUT_DIM] input history with learned positional
embedding, replacing flat concatenation of window frames.
Refactor _FusionTransformerHead: attends over a sequence of
MAX_NEURONS+3 tokens (GAT tokens + band_q + band_m + driver). Token-type
embeddings (4 types) distinguish token kinds. Replace mean-pool with a
learned-query attention pool (single query, softmax over the sequence).
Increase head capacity to 3 layers, d_model=64, dim_ff=128.
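The learned-query attention pool mentioned above can be sketched as follows, again in plain Python rather than the tensor code the project presumably uses. The `1/sqrt(d)` scaling and the function name are assumptions; the commit only specifies a single query and a softmax over the sequence.

```python
import math

def attention_pool(tokens, query):
    """Pool a token sequence into one vector using a single query.

    tokens: list of S token vectors, each of length d
    query:  query vector of length d (a trained parameter in practice)
    """
    d = len(query)
    # Scaled dot-product score of the query against every token.
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(d)
              for tok in tokens]
    # Numerically stable softmax over the sequence dimension.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the tokens, one output component at a time.
    pooled = [sum(w * tok[k] for w, tok in zip(weights, tokens))
              for k in range(d)]
    return pooled, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mean_like, uniform = attention_pool(tokens, [0.0, 0.0])
focused, skewed = attention_pool(tokens, [5.0, 0.0])
```

With an all-zero query the weights are uniform and the result equals the old mean-pool, so the learned query can only add selectivity on top of that baseline. A non-zero query, as in `focused`, shifts weight toward tokens aligned with it.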
C-side fusion (fusion_mechanism.c):
Remove BAND_N and the neuron_flat projection pipeline; neuron reasoning
is now handled end-to-end by the Python-side GAT.
BAND_Q=32, BAND_M=32, FUSION_DIM=BAND_Q+BAND_M=64.
MEM_TOP_K: 8→32, MAX_MEM_ENTRIES: 300→1200.
Training loop:
Freeze cache extended to cover driver embedding (_cached_driver) and
GAT inputs (_cached_graph_inputs). All three caches invalidated together
on target refresh. GAT runs forward_from_inputs in-graph every step
(pinned inputs, live gradient).
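One way to picture the joint invalidation described above is a small holder object that stores and drops all three cached values as a unit. The class name `FreezeCache`, its method names, and the assumption that the third cache is the fused-cog cache are hypothetical; only `_cached_driver` and `_cached_graph_inputs` are named in this entry.

```python
class FreezeCache:
    """Holds the three frozen values and invalidates them together."""

    def __init__(self):
        self.cached_fused_cog = None
        self.cached_driver = None
        self.cached_graph_inputs = None

    def store(self, fused_cog, driver, graph_inputs):
        # All three are written in one call so they always describe
        # the same target snapshot.
        self.cached_fused_cog = fused_cog
        self.cached_driver = driver
        self.cached_graph_inputs = graph_inputs

    def is_valid(self):
        # Identity checks avoid elementwise comparison on tensor-like values.
        values = (self.cached_fused_cog, self.cached_driver,
                  self.cached_graph_inputs)
        return all(v is not None for v in values)

    def invalidate(self):
        # Called on target refresh: no cache may outlive the others.
        self.cached_fused_cog = None
        self.cached_driver = None
        self.cached_graph_inputs = None

cache = FreezeCache()
cache.store([0.1, 0.2], [0.3], {"nodes": [], "edges": []})
valid_before = cache.is_valid()
cache.invalidate()
valid_after = cache.is_valid()
```

Keeping the three values behind a single `invalidate()` makes it hard to refresh the target while leaving a stale driver embedding or stale graph inputs in place.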
x_temporal detached before MAML inner loop to prevent double-backward
through the temporal encoder graph.
graph_reasoner and temporal_encoder added to optimizer and checkpoint.
Verifier re-runs temporal_encoder on cached raw sequence to avoid
reusing a consumed autograd graph.
LR sensitivity check uses relative threshold (15% of current loss)
instead of fixed absolute delta.
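The relative threshold in the LR sensitivity check could look roughly like the sketch below. The function name and the choice to compare a probe-step loss against the current loss are assumptions; the commit only specifies the 15% figure and that it replaces a fixed absolute delta.

```python
def lr_is_sensitive(current_loss, probe_loss, rel_threshold=0.15):
    """True if the loss moved by more than rel_threshold * |current_loss|."""
    return abs(probe_loss - current_loss) > rel_threshold * abs(current_loss)

early = lr_is_sensitive(2.0, 2.2)   # change 0.2 vs. threshold 0.3
late = lr_is_sensitive(0.1, 0.12)   # change 0.02 vs. threshold 0.015
```

The two calls show why the scale matters: a fixed delta of, say, 0.1 would flag the early-training change and miss the late-training one, whereas the relative check does the opposite and follows the loss as it shrinks.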
Runner:
Add _advance_backend(): runs C-side decision/context/neuron/attractor/
affective updates before each step() so multi-step inference sees
evolving state.
alpha and mem_weight_ratio derived from live backend context by default,
matching the training loop's per-epoch derivation.
Temporal encoder and graph reasoner included in forward path.
Checkpoint:
Saves/loads graph_reasoner and temporal_encoder (strict=False for
backward compatibility with pre-0.3 checkpoints).
Detect FUSION_DIM and fused_cog_norm dimension mismatches on load and
fall back safely to fresh init.
cached_driver persisted alongside cached_fused_cog.
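The dimension-mismatch fallback described above might be structured like this minimal sketch, which uses a plain dict and lists in place of a real checkpoint and tensors. The function name, the zero-vector fresh init, and the 96-wide "old" entry are all assumptions chosen for illustration.

```python
FUSION_DIM = 64  # BAND_Q + BAND_M, per the C-side constants in this release

def restore_fused_cog(checkpoint, fusion_dim=FUSION_DIM):
    """Return (vector, restored) from a checkpoint dict, or a fresh init."""
    saved = checkpoint.get("cached_fused_cog")
    if saved is not None and len(saved) == fusion_dim:
        return list(saved), True
    # Missing or wrong-sized entry: start fresh instead of failing the load.
    return [0.0] * fusion_dim, False

# Hypothetical checkpoints: one matching the current width, one that does not.
current_ckpt = {"cached_fused_cog": [0.5] * 64}
stale_ckpt = {"cached_fused_cog": [0.5] * 96}

vec_ok, restored_ok = restore_fused_cog(current_ckpt)
vec_new, restored_new = restore_fused_cog(stale_ckpt)
vec_missing, restored_missing = restore_fused_cog({})
```

Returning a flag alongside the vector lets the caller log that a fallback happened instead of silently training from a reset state.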