DEV Community

Okerew

Released larkos 0.3

Larkos 0.3: GAT neuron reasoning, temporal encoder, refactored fusion head

Core architecture changes:

- Add _NeuronGraphReasoner: two-layer GAT over the live neuron graph,
  producing per-neuron token embeddings (MAX_NEURONS x FUSE_GRAPH_DMODEL).
  Node features include state, output, layer one-hot, connection degree,
  state velocity, output magnitude, and mean edge weight (D_NODE=8).
  Learned per-neuron embedding ensures distinct tokens for symmetric nodes.
- Add _GATLayer: hand-rolled multi-head GAT with edge-weight-modulated
  scores (tanh-squashed gain), dense [N,N] adjacency mask, and self-loops.
- Add _TemporalAttentionEncoder: two-layer transformer over the
  [TEMPORAL_WINDOW, FOURIER_OUT_DIM] input history with learned positional
  embedding, replacing flat concatenation of window frames.
- Refactor _FusionTransformerHead: attends over MAX_NEURONS+3 token
  sequence (GAT tokens + band_q + band_m + driver). Token-type embeddings
  (4 types) distinguish token kinds. Replaces mean-pool with learned-query
  attention pool (single query, softmax over sequence). Head capacity
  increased: 3 layers, d_model=64, dim_ff=128.
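
The edge-weight-modulated attention described for _GATLayer can be sketched roughly as below. This is a framework-free, single-head illustration in plain Python, not the project's code. The score formula is an assumption (a standard GAT LeakyReLU score plus a tanh-squashed edge term), and the names gat_attention, aggregate, gain, and slope are hypothetical.

```python
import math


def gat_attention(h, adj, edge_w, a_src, a_dst, gain=1.0, slope=0.2):
    """Single-head GAT attention weights over a dense [N, N] adjacency.

    h            : list of N feature vectors (assumed already projected)
    adj          : N x N 0/1 mask, adj[i][j] = 1 if node j feeds node i
    edge_w       : N x N raw edge weights (ignored where adj is 0)
    a_src, a_dst : attention vectors, same length as each h[i]
    Returns an N x N matrix whose rows each sum to 1.
    """
    n = len(h)

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    alpha = []
    for i in range(n):
        scores = []
        for j in range(n):
            if not (adj[i][j] or i == j):        # self-loops are always kept
                scores.append(-math.inf)         # masked out of the softmax
                continue
            e = dot(a_dst, h[i]) + dot(a_src, h[j])
            e = e if e > 0 else slope * e        # LeakyReLU
            w = edge_w[i][j] if adj[i][j] else 0.0
            # Assumed modulation: tanh bounds the edge contribution to (-1, 1).
            scores.append(e + math.tanh(gain * w))
        m = max(scores)                          # finite thanks to the self-loop
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        alpha.append([x / z for x in exps])
    return alpha


def aggregate(h, alpha):
    """Attention-weighted sum of neighbour features for every node."""
    n, d = len(h), len(h[0])
    return [[sum(alpha[i][j] * h[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]
```

Squashing the edge weight with tanh keeps a single very strong connection from dominating the softmax, and the forced self-loop guarantees that an isolated neuron still produces a well-defined token.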

C-side fusion (fusion_mechanism.c):

- Remove BAND_N and the neuron_flat projection pipeline; neuron reasoning
  is now handled end-to-end by the Python-side GAT.
- BAND_Q=32, BAND_M=32, FUSION_DIM=BAND_Q+BAND_M=64.
- MEM_TOP_K: 8→32, MAX_MEM_ENTRIES: 300→1200.
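
To make the new dimensions concrete, here is a small Python sketch of the arithmetic behind them. The constants come from the notes above. The band ordering (band_q first), the idea that memories are ranked by a scalar score, and the function names fuse_bands and top_k_memories are assumptions for illustration only; the real logic lives in fusion_mechanism.c.

```python
import heapq

BAND_Q = 32
BAND_M = 32
FUSION_DIM = BAND_Q + BAND_M      # 64 after BAND_N was removed
MEM_TOP_K = 32                    # was 8
MAX_MEM_ENTRIES = 1200            # was 300


def fuse_bands(band_q, band_m):
    """Concatenate the two bands into one FUSION_DIM-long vector."""
    if len(band_q) != BAND_Q or len(band_m) != BAND_M:
        raise ValueError("band length does not match BAND_Q / BAND_M")
    return list(band_q) + list(band_m)


def top_k_memories(scores, k=MEM_TOP_K):
    """Indices of the k highest-scoring memory entries, best first."""
    if len(scores) > MAX_MEM_ENTRIES:
        raise ValueError("more entries than MAX_MEM_ENTRIES")
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
```

With both limits raised fourfold, the fraction of stored memories that reaches the fusion stage stays the same (8/300 and 32/1200 are both about 2.7%).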

Training loop:

- Freeze cache extended to cover driver embedding (_cached_driver) and
  GAT inputs (_cached_graph_inputs). All three caches invalidated together
  on target refresh. GAT runs forward_from_inputs in-graph every step
  (pinned inputs, live gradient).
- x_temporal detached before MAML inner loop to prevent double-backward
  through the temporal encoder graph.
- graph_reasoner and temporal_encoder added to optimizer and checkpoint.
- Verifier re-runs temporal_encoder on cached raw sequence to avoid
  reusing a consumed autograd graph.
- LR sensitivity check uses relative threshold (15% of current loss)
  instead of fixed absolute delta.
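
The relative threshold in the LR sensitivity check can be illustrated with a few lines of Python. Only the 15% figure comes from the notes above. The function name lr_is_sensitive, the eps guard, and the notion of a "probed" loss measured after a trial learning-rate change are assumptions.

```python
REL_THRESHOLD = 0.15  # 15% of the current loss, per the release notes


def lr_is_sensitive(current_loss, probed_loss, rel=REL_THRESHOLD, eps=1e-12):
    """True when the loss moved by more than rel * |current_loss|.

    eps guards against a zero threshold when the loss is exactly 0.
    """
    threshold = rel * max(abs(current_loss), eps)
    return abs(probed_loss - current_loss) > threshold
```

A relative test scales with training progress: a change of 0.2 is noise at a loss of 2.0 (threshold 0.3), while a change of 0.01 is significant at a loss of 0.02 (threshold 0.003). A fixed absolute delta would misjudge at least one of those two cases.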

Runner:

- Add _advance_backend(): runs C-side decision/context/neuron/attractor/
  affective updates before each step() so multi-step inference sees
  evolving state.
- alpha and mem_weight_ratio derived from live backend context by default,
  matching the training loop's per-epoch derivation.
- Temporal encoder and graph reasoner included in forward path.
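
The runner behaviour described above can be pictured with the toy Python sketch below. Everything except the name _advance_backend and the two parameter names is hypothetical: StubBackend stands in for the C-side state, and its context() method returns made-up values purely so the evolving state is visible.

```python
class StubBackend:
    """Stand-in for the C-side state. Every name here is hypothetical."""

    UPDATES = ("decision", "context", "neuron", "attractor", "affective")

    def __init__(self):
        self.tick = 0
        self.log = []

    def update(self, name):
        self.log.append((self.tick, name))

    def finish_tick(self):
        self.tick += 1

    def context(self):
        # Toy context that changes every tick so the effect is observable.
        return {"alpha": 1.0 / (1 + self.tick), "mem_weight_ratio": 0.5}


class Runner:
    def __init__(self, backend):
        self.backend = backend

    def _advance_backend(self):
        # Run all five update families once, in a fixed order.
        for name in self.backend.UPDATES:
            self.backend.update(name)
        self.backend.finish_tick()

    def step(self, alpha=None, mem_weight_ratio=None):
        # Explicit arguments win; otherwise read the live backend context.
        ctx = self.backend.context()
        if alpha is None:
            alpha = ctx["alpha"]
        if mem_weight_ratio is None:
            mem_weight_ratio = ctx["mem_weight_ratio"]
        return alpha, mem_weight_ratio  # a real step would run the model here

    def run(self, n_steps):
        results = []
        for _ in range(n_steps):
            self._advance_backend()
            results.append(self.step())
        return results
```

Without the advance call, every step in a multi-step run would read the same frozen context and produce the same alpha.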

Checkpoint:

- Saves/loads graph_reasoner and temporal_encoder (strict=False for
  backward compatibility with pre-0.3 checkpoints).
- Detects FUSION_DIM and fused_cog_norm dimension mismatches and falls
  back safely to a fresh init.
- cached_driver persisted alongside cached_fused_cog.
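
The mismatch detection and fallback can be sketched in plain Python as follows. This is an assumption-heavy illustration: the checkpoint is modelled as a dict, fused_cog_norm is assumed to be a vector of length FUSION_DIM, and fresh_state and load_state are hypothetical names. The actual checkpoint layout is not described in these notes.

```python
FUSION_DIM = 64


def fresh_state(dim=FUSION_DIM):
    """State used when nothing compatible can be restored."""
    return {
        "fusion_dim": dim,
        "fused_cog_norm": [0.0] * dim,
        "cached_fused_cog": None,
        "cached_driver": None,
    }


def load_state(ckpt, dim=FUSION_DIM):
    """Return (state, restored). Falls back to a fresh init on any mismatch."""
    norm = ckpt.get("fused_cog_norm")
    compatible = (
        ckpt.get("fusion_dim") == dim
        and norm is not None
        and len(norm) == dim
    )
    if not compatible:
        return fresh_state(dim), False

    state = fresh_state(dim)
    for key in state:
        # Keys missing from older checkpoints keep their fresh defaults.
        if key in ckpt:
            state[key] = ckpt[key]
    return state, True
```

Checking dimensions before copying anything means an incompatible checkpoint never leaves the state half-restored.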

https://github.com/Okerew/larkos_models

Top comments (1)

Mehmet Can Farsak

Fascinating architecture work with the GAT neuron reasoning and temporal encoder. When building agent systems I've noticed agents don't have a "thinking mode" vs "action mode" — they just execute. Built Brainstorm-Mode (mehmetcanfarsak on GitHub) that adds three modes (divergent, actionable, academic) via hooks so the agent stays in brainstorming instead of jumping straight to code.