Okerew

Updated larkos_0.1 to larkos 0.2

Larkos 0.2: Major optimizations and architectural improvements

  • MAML inner-loop refactor: Replaced the per-epoch copy.deepcopy(model) with a persistent fast_model clone that is refreshed by per-parameter copy, and added a custom EmbeddingProjector.__deepcopy__ that shares the frozen 22M-param MiniLM across clones instead of duplicating it. Per-epoch copy cost drops from the full 22M params to only the small trainable subset.
  • ST encode caching: Cached the last SentenceTransformer output, so the encoder runs only when the embed_ctx string actually changes instead of on every one of the ~15 calls per epoch.
  • Adaptive MC dropout: Cut the number of MC samples by 3× on non-exploring epochs (those with no Gaussian exploration noise injected), reducing MC cost by about 3× on those epochs.
  • FUSION_DIM (64) text prefix: Widened the decode prefix source from MAX_NEURONS (8) to FUSION_DIM (64), giving GPT-2 an 8× wider source signal for the same cost. The prefix is now rescaled to the mean/std of wte (GPT-2's token embedding matrix) so GPT-2 receives in-distribution embeddings rather than random-projection noise.
  • distilgpt2: Replaced gpt2 with distilgpt2 (same hidden dim 768, half the params, ~2× faster generation); text is a debug readout so the quality drop is irrelevant.
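  • Adam → AdamW: Corrected weight decay regularization. Adam folds the weight decay term into the gradient before adaptive scaling, so the decay is divided by the same per-parameter denominator as the gradient and under-regularizes parameters with large gradients; AdamW applies the decay directly to the weights.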
  • Anchor text in decode: Fed the actual driver sentence to GPT-2 as real text context rather than steering blind from a bare BOS token.
  • Safe reflection: Capped _step_counter at _history_count to prevent C-side out-of-bounds access in performSelfReflection.
  • Graceful cross-architecture checkpoint transfer: Added start_training_from() with _expand_copy() partial weight transfer for upscaling to larger model dimensions; load_checkpoint now uses strict=False and filters EMA shadow keys to match the live model, dropping obsolete keys gracefully.
  • Robust scenario selection: Added nout == 0 guards and best_idx < 0 fallback to prevent crashes when no scenarios have outcomes.
  • Removed stale input tracking: Dropped dead add_memory_step / update_meta calls from receive_predictions.
  • Relaxed test thresholds: Adjusted adaptation, grounding, and physics-test RSA thresholds to account for constant-mass-channel bias and realistic effect sizes; made recovery tolerance configurable.
  • README updated to Larkos 0.2 with note about previous model versions in source.
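
The ST encode caching item can be illustrated with a minimal sketch. This is not the Larkos code: LastEncodeCache and toy_encoder are hypothetical names, and toy_encoder is only a stand-in so the example runs without loading a SentenceTransformer model. The idea is simply to keep the last (text, embedding) pair and call the real encoder only when the text differs.

```python
from typing import Callable, List, Optional


class LastEncodeCache:
    """Remember the most recent (text, embedding) pair and reuse it on repeats."""

    def __init__(self, encode_fn: Callable[[str], List[float]]) -> None:
        self._encode_fn = encode_fn  # stand-in for a real sentence encoder
        self._last_text: Optional[str] = None
        self._last_vec: Optional[List[float]] = None
        self.misses = 0  # counts how often the underlying encoder actually ran

    def encode(self, text: str) -> List[float]:
        # Re-encode only if the string changed since the previous call.
        if self._last_vec is None or text != self._last_text:
            self._last_vec = self._encode_fn(text)
            self._last_text = text
            self.misses += 1
        return self._last_vec


def toy_encoder(text: str) -> List[float]:
    # Hypothetical placeholder encoder (assumption): not a real embedding model.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


cache = LastEncodeCache(toy_encoder)
for _ in range(15):  # mimics ~15 encodes of the same context in one epoch
    cache.encode("robot approaches the red block")
cache.encode("robot grasps the red block")  # context changed: one more miss
print("encoder calls:", cache.misses)
```

A single-entry cache like this fits the case where the same string repeats back to back; if several different strings alternated within an epoch, a small dictionary or functools.lru_cache would be the more natural choice.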

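The Adam → AdamW item can be checked with a scalar toy example. The sketch below is an illustration under simplifying assumptions (a single parameter, the first optimizer step, zero-initialised moments); adam_step is a hypothetical helper, not part of Larkos or PyTorch. With a large gradient, L2-style decay folded into the gradient is almost entirely normalised away, while decoupled decay still shrinks the weight by lr * weight_decay * w.

```python
import math


def adam_step(w, grad, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8,
              weight_decay=0.0, decoupled=False):
    """One Adam update on a scalar, starting from zero moments (step t = 1)."""
    if decoupled:
        # AdamW style: shrink the weight directly, independent of gradient stats.
        w = w * (1.0 - lr * weight_decay)
    else:
        # Adam with L2: the decay term joins the gradient and is rescaled below.
        grad = grad + weight_decay * w
    m = (1.0 - beta1) * grad
    v = (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1)  # bias correction for t = 1
    v_hat = v / (1.0 - beta2)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


w0, g = 1.0, 10.0
plain = adam_step(w0, g)                                      # no weight decay
adam_l2 = adam_step(w0, g, weight_decay=0.1)                  # decay inside gradient
adamw = adam_step(w0, g, weight_decay=0.1, decoupled=True)    # decoupled decay
print(plain, adam_l2, adamw)
```

In this toy setting the L2 variant lands almost exactly where the no-decay update does, whereas the decoupled variant ends up lower by roughly lr * weight_decay * w0 = 0.01.
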
Source: https://github.com/Okerew/larkos_models
