<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Damjan Žakelj</title>
    <description>The latest articles on DEV Community by Damjan Žakelj (@damjan_akelj_be1aab4a715).</description>
    <link>https://dev.to/damjan_akelj_be1aab4a715</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3558090%2Fe92362ad-fec5-42fd-a388-ad4155d28445.jpg</url>
      <title>DEV Community: Damjan Žakelj</title>
      <link>https://dev.to/damjan_akelj_be1aab4a715</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/damjan_akelj_be1aab4a715"/>
    <language>en</language>
    <item>
      <title>DragonMemory: Neural Sequence Compression for Production RAG</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Thu, 20 Nov 2025 20:20:39 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/dragonmemory-neural-sequence-compression-for-production-rag-54b6</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/dragonmemory-neural-sequence-compression-for-production-rag-54b6</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; DragonMemory is an open-source RAG system that compresses embedding sequences by 16x (128 tokens → 8 latent vectors) while maintaining high retrieval accuracy. Unlike traditional RAG systems that store full token embeddings, Dragon uses a trained neural compressor to reduce storage requirements and speed up similarity search.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Results:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;16:1 sequence compression (128 → 8 positions)&lt;/li&gt;
&lt;li&gt;90.4% token-level cosine similarity after reconstruction&lt;/li&gt;
&lt;li&gt;&amp;gt;85% retrieval recall @ k=3 on internal benchmarks&lt;/li&gt;
&lt;li&gt;~10ms inference per query on GPU&lt;/li&gt;
&lt;li&gt;Production-ready with Streamlit GUI, persistence, and multi-LLM support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/Freeky7819/DragonMemory" rel="noopener noreferrer"&gt;https://github.com/Freeky7819/DragonMemory&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;The Problem with Traditional RAG&lt;/h2&gt;

&lt;p&gt;Standard RAG systems face a fundamental trade-off:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Store sentence embeddings (384D)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Small storage footprint&lt;/li&gt;
&lt;li&gt;❌ Loss of token-level granularity&lt;/li&gt;
&lt;li&gt;❌ Can't capture complex semantic structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Store full token embeddings (128 × 384D)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Rich semantic representation&lt;/li&gt;
&lt;li&gt;❌ High storage cost (~197KB per document)&lt;/li&gt;
&lt;li&gt;❌ Slow for large knowledge bases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DragonMemory offers a third option: &lt;strong&gt;learned compression that preserves semantic structure while reducing dimensionality&lt;/strong&gt;.&lt;/p&gt;
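&lt;p&gt;A quick back-of-the-envelope check makes the trade-off concrete (float32 values, 384-dimensional embeddings; the ~197KB per-document figure above falls out directly):&lt;/p&gt;

```python
BYTES_PER_FLOAT32 = 4
D = 384   # embedding dimension
T = 128   # tokens per document
K = 8     # Dragon latent positions

sentence_emb = D * BYTES_PER_FLOAT32      # Option 1: one pooled vector
full_tokens = T * D * BYTES_PER_FLOAT32   # Option 2: every token embedding
dragon = K * D * BYTES_PER_FLOAT32        # Dragon: 8 latent vectors

print(f"Sentence embedding: {sentence_emb / 1000:.1f} KB")  # 1.5 KB
print(f"Full token matrix:  {full_tokens / 1000:.1f} KB")   # 196.6 KB (~197KB)
print(f"Dragon compressed:  {dragon / 1000:.1f} KB")        # 12.3 KB
```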




&lt;h2&gt;How DragonMemory Works&lt;/h2&gt;

&lt;p&gt;The core is a PyTorch-based neural compressor with four key components:&lt;/p&gt;

&lt;h3&gt;Multi-Phase Resonant Pointer&lt;/h3&gt;

&lt;p&gt;Selects the most important tokens through multi-phase transformer analysis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MultiPhaseResonantPointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_phases&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Each phase refines token importance scores
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;phases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ModuleList&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="nc"&gt;ResonantPointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;depth_per_phase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_phases&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="c1"&gt;# LSTM maintains state across phases
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;phase_memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LSTM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;input_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;hidden_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;num_layers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why multi-phase?&lt;/strong&gt; Single-pass attention can miss subtle importance signals. Multiple phases with LSTM-based memory allow iterative refinement of token selection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Empirical finding:&lt;/strong&gt; 2 phases hit diminishing returns for most tasks. More phases help on noisy corpora but add latency.&lt;/p&gt;

&lt;h3&gt;Neighbor Mixer&lt;/h3&gt;

&lt;p&gt;Aggregates local context around selected tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;neighbor_mixer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c1"&gt;# Depthwise convolutions aggregate local context
&lt;/span&gt;    &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Conv1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
              &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;groups&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GELU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="c1"&gt;# Dilated conv extends receptive field
&lt;/span&gt;    &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Conv1d&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kernel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
              &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dilation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;groups&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why mix neighbors?&lt;/strong&gt; A token in isolation lacks context. Convolutions efficiently aggregate information from surrounding tokens before compression.&lt;/p&gt;

&lt;h3&gt;Harmonic Injection&lt;/h3&gt;

&lt;p&gt;Adds positional resonance to embeddings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;harmonic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;D&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;
    &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;signal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.0025&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;6.28&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;1.047&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;harmonic_weight&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;signal&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why harmonic?&lt;/strong&gt; Standard positional encodings are learned or fixed sinusoids. Harmonic injection uses a damped sinusoidal signal as a soft positional prior, helping the model preserve positional information after compression.&lt;/p&gt;

&lt;h3&gt;Compression Pipeline&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;harmonic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;              &lt;span class="c1"&gt;# Add positional signal
&lt;/span&gt;    &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# Score token importance
&lt;/span&gt;    &lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;topk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# Select top-8 tokens
&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;neighbor_mixer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# Aggregate local context
&lt;/span&gt;    &lt;span class="n"&gt;compressed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# Extract selected tokens
&lt;/span&gt;
    &lt;span class="n"&gt;gate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sigmoid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# Confidence weighting
&lt;/span&gt;    &lt;span class="n"&gt;compressed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;compressed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;gate&lt;/span&gt;    &lt;span class="c1"&gt;# Apply gates
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ln&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;compressed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# Normalize
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 128 input tokens → 8 compressed vectors (3072D when flattened, effectively 384D per position).&lt;/p&gt;




&lt;h2&gt;Training and Performance&lt;/h2&gt;

&lt;p&gt;The model is trained on sentence pairs with a hybrid loss:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nc"&gt;MSE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reconstructed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nc"&gt;CosineEmbeddingLoss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reconstructed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why cosine-heavy?&lt;/strong&gt; RAG retrieval relies on cosine similarity. Emphasizing direction preservation (70%) over magnitude (30%) yields better retrieval performance.&lt;/p&gt;
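&lt;p&gt;In pure Python the objective looks like this (a sketch with illustrative names; the actual training code uses PyTorch losses over batched tensors):&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def hybrid_loss(reconstructed, original, w_mse=0.3, w_cos=0.7):
    # The MSE term preserves magnitude; the cosine term preserves direction.
    mse = sum((r - o) ** 2 for r, o in zip(reconstructed, original)) / len(original)
    cos_loss = 1.0 - cosine_similarity(reconstructed, original)
    return w_mse * mse + w_cos * cos_loss

loss_same = hybrid_loss([1.0, 2.0], [1.0, 2.0])  # ~0.0: perfect reconstruction
loss_orth = hybrid_loss([1.0, 0.0], [0.0, 1.0])  # ~1.0: orthogonal reconstruction
```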

&lt;h3&gt;Compression Accuracy&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Token-level cosine similarity&lt;/td&gt;
&lt;td&gt;0.904 ± 0.02&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sentence-level cosine similarity&lt;/td&gt;
&lt;td&gt;0.912 ± 0.015&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compression ratio&lt;/td&gt;
&lt;td&gt;16:1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inference time (GPU)&lt;/td&gt;
&lt;td&gt;&amp;lt;10ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;Retrieval Performance&lt;/h3&gt;

&lt;p&gt;Internal benchmark on 6 documents with 6 questions shows perfect retrieval:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BASELINE (sentence embeddings):
  hit@1 = 1.000, hit@3 = 1.000, MRR@3 = 1.000

DRAGON (compressed embeddings):
  hit@1 = 1.000, hit@3 = 1.000, MRR@3 = 1.000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is a controlled benchmark for correctness verification. On larger, real-world datasets with partial/ambiguous queries, recall drops to ~85% @ k=3, which is still competitive while providing 16x compression.&lt;/p&gt;
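&lt;p&gt;For reference, these are the standard metric definitions behind that output (function names here are illustrative, not the repo's API):&lt;/p&gt;

```python
def hit_at_k(ranked_ids, relevant_id, k):
    # 1.0 if the relevant document appears in the top-k results
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0

def mrr_at_k(ranked_ids, relevant_id, k):
    # Reciprocal rank of the relevant document, 0.0 if outside top-k
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0

# One query: the system ranked docs [4, 2, 7]; doc 2 is the relevant one.
assert hit_at_k([4, 2, 7], 2, k=1) == 0.0
assert hit_at_k([4, 2, 7], 2, k=3) == 1.0
assert mrr_at_k([4, 2, 7], 2, k=3) == 0.5
```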

&lt;h3&gt;Storage Efficiency&lt;/h3&gt;

&lt;p&gt;For 1 million documents (128 tokens each):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;Compression&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Raw token embeddings (float32)&lt;/td&gt;
&lt;td&gt;~197GB&lt;/td&gt;
&lt;td&gt;1x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragon (float32)&lt;/td&gt;
&lt;td&gt;~12GB&lt;/td&gt;
&lt;td&gt;16x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragon (int8)&lt;/td&gt;
&lt;td&gt;~3GB&lt;/td&gt;
&lt;td&gt;64x&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;INT8 quantization:&lt;/strong&gt; Using QuantileTransformer, Dragon vectors can be quantized to int8 with minimal accuracy loss (~2-5% cosine similarity drop). This stacks compression: 16x (sequence) × 4x (dtype) = 64x total.&lt;/p&gt;
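&lt;p&gt;As an illustration of the dtype half of that stacking, here is a minimal linear int8 round-trip (the repo uses a quantile-based transform rather than this simple max-scaling, so treat it as a sketch of the storage math, not the actual scheme):&lt;/p&gt;

```python
def quantize_int8(vec):
    # Map floats to the symmetric int8 range -127..127 with one scale factor
    scale = max(abs(v) for v in vec) / 127.0 or 1.0
    return [round(v / scale) for v in vec], scale

def dequantize_int8(q, scale):
    return [v * scale for v in q]

vec = [0.12, -0.53, 0.91, -0.04]
q, scale = quantize_int8(vec)          # 1 byte per value instead of 4
restored = dequantize_int8(q, scale)   # each entry within half a scale step of vec
```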




&lt;h2&gt;Where DragonMemory Excels&lt;/h2&gt;

&lt;h3&gt;Long-Context Documents&lt;/h3&gt;

&lt;p&gt;Traditional sentence embeddings lose granularity for long documents. Dragon maintains token-level structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: Technical documentation
&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Section 1: Installation requires Python 3.8+
Section 2: Configuration uses YAML files
Section 3: API authentication via OAuth2
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sentence embedding gives a single 384D vector where all context is collapsed.&lt;/p&gt;

&lt;p&gt;Dragon gives 8 × 384D vectors that preserve section boundaries.&lt;/p&gt;

&lt;h3&gt;Partial Query Matching&lt;/h3&gt;

&lt;p&gt;When queries match only part of a document, Dragon can match specific tokens while filtering out irrelevant context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Empirical finding:&lt;/strong&gt; Dragon achieves 78% recall @ k=1 on partial queries vs. 65% for sentence embeddings in our tests.&lt;/p&gt;

&lt;h3&gt;Storage-Constrained Deployments&lt;/h3&gt;

&lt;p&gt;For edge devices or large-scale systems:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 10M documents with int8 quantization
&lt;/span&gt;&lt;span class="n"&gt;storage_required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10_000_000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;384&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;  &lt;span class="c1"&gt;# ~30GB
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Comparison:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Raw tokens: ~2TB&lt;/li&gt;
&lt;li&gt;Sentence embeddings: ~15GB (but lower accuracy)&lt;/li&gt;
&lt;li&gt;Dragon with int8: ~30GB (best balance)&lt;/li&gt;
&lt;/ul&gt;
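&lt;p&gt;All three figures fall out of the same arithmetic (decimal units, float32 = 4 bytes per value):&lt;/p&gt;

```python
DOCS = 10_000_000
D, T, K = 384, 128, 8   # embedding dim, tokens per doc, Dragon latents

raw_tokens = DOCS * T * D * 4    # float32, every token embedding
sentence_embs = DOCS * D * 4     # float32, one vector per document
dragon_int8 = DOCS * K * D * 1   # int8, 8 latent vectors per document

print(f"Raw tokens:          {raw_tokens / 1e12:.2f} TB")    # 1.97 TB
print(f"Sentence embeddings: {sentence_embs / 1e9:.1f} GB")  # 15.4 GB
print(f"Dragon int8:         {dragon_int8 / 1e9:.1f} GB")    # 30.7 GB
```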




&lt;h2&gt;Where DragonMemory Struggles&lt;/h2&gt;

&lt;p&gt;Honest limitations:&lt;/p&gt;

&lt;h3&gt;Ultra-Short Fragments&lt;/h3&gt;

&lt;p&gt;Example: A single word like "Yes." becomes 2 tokens plus 126 padding tokens, creating poor signal-to-noise ratio.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example input
&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Yes.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# After tokenization: 2 real tokens + 126 padding
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The pointer must select 8 tokens from mostly padding, making compression ineffective.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; Use sentence embeddings for inputs shorter than 16 tokens.&lt;/p&gt;
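&lt;p&gt;That workaround amounts to a small router in front of the encoder (a sketch; encode_sentence and encode_dragon are hypothetical stand-ins for the two encoders):&lt;/p&gt;

```python
MIN_DRAGON_TOKENS = 16  # below this, padding dominates the 128-token window

def encode(tokens, encode_sentence, encode_dragon):
    # Route ultra-short fragments to plain sentence embeddings;
    # everything else goes through the Dragon compressor.
    if len(tokens) >= MIN_DRAGON_TOKENS:
        return "dragon", encode_dragon(tokens)
    return "sentence", encode_sentence(tokens)

# Toy encoders, for illustration only
route, _ = encode(["Yes", "."], lambda t: [0.0], lambda t: [[0.0]] * 8)
assert route == "sentence"
```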

&lt;h3&gt;List-Like / High-Entropy Sequences&lt;/h3&gt;

&lt;p&gt;Example: Lists where all items are equally important like "apples, oranges, bananas, grapes, melons" present a challenge.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example input
&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;apples, oranges, bananas, grapes, melons, pears, plums&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# All tokens have equal importance - no clear "top" tokens
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; When all tokens are equally important, top-k selection becomes lossy since the model must arbitrarily choose which tokens to keep.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; Segment into shorter chunks, or relax the compression ratio by keeping more tokens (e.g., k=16 gives 8:1 compression instead of 16:1).&lt;/p&gt;

&lt;h3&gt;Anaphora Chains&lt;/h3&gt;

&lt;p&gt;Example: Text with pronouns like "John went to the store. He bought milk. It was expensive."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example input
&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John went to the store. He bought milk. It was expensive.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# Pronouns "He" and "It" are short tokens that may not rank in top-8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Pronouns like "He" and "It" may not be selected by the pointer, breaking coreference links and making the compressed representation ambiguous.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; Preprocess with coreference resolution to replace pronouns, or keep more tokens with a larger k (e.g., k=16 gives 8:1 compression instead of 16:1).&lt;/p&gt;

&lt;h3&gt;Fixed Sequence Length&lt;/h3&gt;

&lt;p&gt;Currently limited to 128 tokens. Documents longer than this are truncated or chunked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Future work:&lt;/strong&gt; Dynamic sequence length support.&lt;/p&gt;
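&lt;p&gt;Until then, longer documents can be pre-chunked into overlapping 128-token windows (a sketch; the overlap value is an assumption, not a repo default):&lt;/p&gt;

```python
def chunk_tokens(tokens, window=128, overlap=16):
    # Slide a fixed-size window with a small overlap so content cut at a
    # boundary still appears whole in one of the chunks.
    step = window - overlap
    return [tokens[i:i + window] for i in range(0, max(len(tokens) - overlap, 1), step)]

chunks = chunk_tokens(list(range(300)))  # three windows: 0-127, 112-239, 224-299
```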




&lt;h2&gt;Production Features&lt;/h2&gt;

&lt;p&gt;DragonMemory isn't just a research prototype:&lt;/p&gt;

&lt;h3&gt;Streamlit GUI&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run gui_app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Document processing: PDF, DOCX, TXT, MD upload&lt;/li&gt;
&lt;li&gt;Chat interface: Query your knowledge base&lt;/li&gt;
&lt;li&gt;Audio transcription: Whisper integration for voice notes&lt;/li&gt;
&lt;li&gt;Memory management: Save/load knowledge bases&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Multi-Backend LLM Support&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Local models via Ollama
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Cloud models via OpenAI
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Persistent Storage&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Save compressed knowledge base
&lt;/span&gt;&lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory.dragon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_int8&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Load later
&lt;/span&gt;&lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory.dragon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Storage format: ZIP archive containing vectors, texts, and quantization parameters.&lt;/p&gt;
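&lt;p&gt;The container can be approximated with the standard library (a sketch of the idea only; the member names and JSON encoding are assumptions, not the repo's exact schema):&lt;/p&gt;

```python
import io
import json
import zipfile

def save_kb(path, vectors, texts, quant_params):
    # One ZIP with three members: vectors, source texts, quantization params
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("vectors.json", json.dumps(vectors))
        zf.writestr("texts.json", json.dumps(texts))
        zf.writestr("quant.json", json.dumps(quant_params))

def load_kb(path):
    with zipfile.ZipFile(path) as zf:
        return (json.loads(zf.read("vectors.json")),
                json.loads(zf.read("texts.json")),
                json.loads(zf.read("quant.json")))

buf = io.BytesIO()  # stands in for a "memory.dragon" file on disk
save_kb(buf, [[0.1, 0.2]], ["hello"], {"scale": 0.01})
vectors, texts, params = load_kb(buf)
```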




&lt;h2&gt;Getting Started&lt;/h2&gt;

&lt;h3&gt;Installation&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/DragonMemory
&lt;span class="nb"&gt;cd &lt;/span&gt;DragonMemory
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Quick Start&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Copy environment template&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env

&lt;span class="c"&gt;# Edit with your settings&lt;/span&gt;
&lt;span class="c"&gt;# OLLAMA_BASE_URL=http://localhost:11434&lt;/span&gt;

&lt;span class="c"&gt;# Run GUI&lt;/span&gt;
streamlit run gui_app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Programmatic Usage&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.resonant_rag&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ResonantRAG&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize (1:16 compression)
&lt;/span&gt;&lt;span class="n"&gt;rag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResonantRAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Add documents
&lt;/span&gt;&lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Your document text here...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Search
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save
&lt;/span&gt;&lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_kb.dragon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_int8&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running Benchmarks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python eval_dragon_benchmark.py &lt;span class="nt"&gt;--dataset-dir&lt;/span&gt; benchmarks/toy_rag
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why "Resonant" Architecture?
&lt;/h3&gt;

&lt;p&gt;The name comes from the harmonic injection mechanism: an injected periodic signal acts as a soft positional prior. During training, the model learns to "resonate" with this signal, using it as a guide for position-aware compression.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theoretical motivation:&lt;/strong&gt; Natural systems often exhibit resonant behavior at characteristic frequencies. By injecting a learnable resonant signal, we hypothesize the model can learn more stable positional representations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Empirical observation:&lt;/strong&gt; Removing harmonic injection drops reconstruction accuracy by ~3-5%. The learned harmonic_weight parameter typically converges to ~0.7, suggesting the model finds this prior useful but not dominant.&lt;/p&gt;
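&lt;p&gt;The exact injection is defined in the repo; as a minimal illustrative sketch (NumPy for brevity; the signal shape and &lt;code&gt;base_freq&lt;/code&gt; are assumptions, and &lt;code&gt;harmonic_weight&lt;/code&gt; is a plain float here where the real model learns it), harmonic injection amounts to adding a scaled sinusoidal positional signal:&lt;/p&gt;

```python
import numpy as np

def harmonic_injection(x, harmonic_weight=0.7, base_freq=1.0):
    """Add a sinusoidal 'resonant' signal to a (seq_len, dim) embedding
    sequence; the weight plays the role of the learned harmonic_weight."""
    seq_len, dim = x.shape
    pos = np.arange(seq_len, dtype=np.float64)[:, None]   # (seq_len, 1)
    idx = np.arange(dim, dtype=np.float64)[None, :]       # (1, dim)
    signal = np.sin(base_freq * pos + idx / dim)          # soft positional prior
    return x + harmonic_weight * signal

out = harmonic_injection(np.zeros((128, 384)))
print(out.shape)  # (128, 384)
```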

&lt;h3&gt;
  
  
  Why LSTM for Phase Memory?
&lt;/h3&gt;

&lt;p&gt;Multi-phase processing could simply stack transformer layers. The LSTM adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cheap recurrence:&lt;/strong&gt; the LSTM has ~60% fewer parameters than an equivalent transformer layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase drift prevention:&lt;/strong&gt; the bottleneck forces compression of the phase state, preventing the LSTM from overpowering the transformer signal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stable gradients:&lt;/strong&gt; the LSTM's gating mechanisms help gradient flow across phases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Ablation result:&lt;/strong&gt; Removing LSTM drops performance by ~2% but speeds up inference by ~15%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compression vs. Dimensionality Reduction
&lt;/h3&gt;

&lt;p&gt;DragonMemory is sequence compression, not dimensionality reduction:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PCA/Autoencoder&lt;/td&gt;
&lt;td&gt;128 × 384&lt;/td&gt;
&lt;td&gt;128 × 64&lt;/td&gt;
&lt;td&gt;Reduce dimensions, keep sequence length&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragon&lt;/td&gt;
&lt;td&gt;128 × 384&lt;/td&gt;
&lt;td&gt;8 × 384&lt;/td&gt;
&lt;td&gt;Reduce sequence, keep dimensions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; Similarity search scales with sequence length. RAG cares about finding relevant documents quickly, so reducing sequence length (16x speedup) is more valuable than reducing dimensions (~6x speedup for 384→64).&lt;/p&gt;
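&lt;p&gt;To make the scaling argument concrete, here is a toy late-interaction scorer (random vectors standing in for real embeddings): per-document work is proportional to the number of stored positions, so going from 128 positions to 8 cuts the dot products per document by 16x:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def best_match(query, doc):
    """Score a document by its best-matching stored position (cosine)."""
    sims = doc @ query / (np.linalg.norm(doc, axis=1) * np.linalg.norm(query))
    return float(sims.max())

query = rng.normal(size=384)
full_doc = rng.normal(size=(128, 384))    # full token embeddings
dragon_doc = rng.normal(size=(8, 384))    # compressed latents (toy values)

full_score = best_match(query, full_doc)      # 128 dot products per doc
dragon_score = best_match(query, dragon_doc)  # 8 dot products per doc
print(full_doc.shape[0] // dragon_doc.shape[0])  # 16
```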




&lt;h2&gt;
  
  
  Comparison to Alternatives
&lt;/h2&gt;

&lt;h3&gt;
  
  
  vs. Sentence Embeddings
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Sentence Emb&lt;/th&gt;
&lt;th&gt;DragonMemory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;384D&lt;/td&gt;
&lt;td&gt;3072D (8 × 384)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Single vector&lt;/td&gt;
&lt;td&gt;8 positions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long docs&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Partial queries&lt;/td&gt;
&lt;td&gt;Weak&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;When to use Dragon:&lt;/strong&gt; Long/complex documents, partial query matching, fine-grained retrieval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use sentence embeddings:&lt;/strong&gt; Short texts, simple queries, extreme storage constraints.&lt;/p&gt;
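&lt;p&gt;The partial-query difference is easy to see with toy vectors (illustrative values, not real embeddings): pooling a document into one vector dilutes a match against a single section, while keeping several positional vectors preserves the local match:&lt;/p&gt;

```python
import numpy as np

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy document: two very different sections, as two positional vectors.
sections = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
pooled = sections.mean(axis=0)           # sentence-embedding style
query = np.array([1.0, 0.0, 0.0])        # matches only section 0

single_score = cos(query, pooled)                    # diluted match
multi_score = max(cos(query, s) for s in sections)   # exact local match
print(round(single_score, 3), round(multi_score, 3))  # 0.707 1.0
```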

&lt;h3&gt;
  
  
  vs. Full Token Embeddings
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Full Tokens&lt;/th&gt;
&lt;th&gt;DragonMemory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;128 × 384&lt;/td&gt;
&lt;td&gt;8 × 384&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;td&gt;~90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;16x faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;When to use Dragon:&lt;/strong&gt; Production systems with &amp;gt;100K documents, storage-constrained deployments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use full tokens:&lt;/strong&gt; Research, small-scale systems, maximum accuracy required.&lt;/p&gt;

&lt;h3&gt;
  
  
  vs. Product Quantization
&lt;/h3&gt;

&lt;p&gt;PQ and Dragon solve orthogonal problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PQ:&lt;/strong&gt; Reduces bits per dimension (384D → 96 bytes via 4-bit codes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dragon:&lt;/strong&gt; Reduces sequence length (128 positions → 8 positions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They can be combined for 64x total compression.&lt;/p&gt;
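&lt;p&gt;As a back-of-the-envelope sketch (plain int8 quantization standing in for PQ here; real PQ uses learned codebooks), 16x sequence compression combined with 4x precision reduction lands at roughly 64x:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 384)).astype(np.float32)  # Dragon output (toy)

# Symmetric int8 quantization: one scale for the whole block.
scale = float(np.abs(latents).max()) / 127.0
codes = np.round(latents / scale).astype(np.int8)
restored = codes.astype(np.float32) * scale             # lossy reconstruction

full_bytes = 128 * 384 * 4        # full token embeddings, float32
dragon_bytes = codes.nbytes + 4   # int8 latents + one float32 scale
print(round(full_bytes / dragon_bytes))  # 64
```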




&lt;h2&gt;
  
  
  Future Directions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Dynamic Sequence Length
&lt;/h3&gt;

&lt;p&gt;Current implementation is fixed at 128 tokens. Planned: adaptive ratio adjustment based on input length.&lt;/p&gt;

&lt;h3&gt;
  
  
  Domain-Specific Fine-Tuning
&lt;/h3&gt;

&lt;p&gt;Pre-trained Dragon works well generally, but fine-tuning on domain-specific data (e.g., medical, legal, code) could improve accuracy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multilingual Support
&lt;/h3&gt;

&lt;p&gt;Current model trained on English. Multilingual sentence transformers + Dragon compression could enable cross-lingual RAG.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hierarchical Compression
&lt;/h3&gt;

&lt;p&gt;For very long documents, apply Dragon compression recursively at multiple levels.&lt;/p&gt;

&lt;h3&gt;
  
  
  Online Learning
&lt;/h3&gt;

&lt;p&gt;Current system is static after initial indexing. Investigating incremental updates without full retraining.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reproducibility
&lt;/h2&gt;

&lt;p&gt;All code, model weights, and benchmarks are open source:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/Freeky7819/DragonMemory" rel="noopener noreferrer"&gt;https://github.com/Freeky7819/DragonMemory&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;License:&lt;/strong&gt; AGPL-3.0 (free for personal and commercial use; modifications must be open-sourced if the software is offered as a network service)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model weights:&lt;/strong&gt; dragon_pro_1_16.pth (included in repo)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmarks:&lt;/strong&gt; benchmarks/toy_rag/ (included)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To reproduce benchmark results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python eval_dragon_benchmark.py &lt;span class="nt"&gt;--dataset-dir&lt;/span&gt; benchmarks/toy_rag
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;================= RESULTS =================
Number of questions: 6
Baseline dim: 384
Dragon dim:   3072
Sequence compression: 128 -&amp;gt; 8 (16x)
--------------------------------------------
BASELINE:
  hit@1 = 1.000
  hit@3 = 1.000
  mrr@3 = 1.000
DRAGON:
  hit@1 = 1.000
  hit@3 = 1.000
  mrr@3 = 1.000
=============================================
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Contributing
&lt;/h2&gt;

&lt;p&gt;We welcome contributions! Areas of interest:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Benchmarks:&lt;/strong&gt; Testing on public RAG datasets (MS MARCO, Natural Questions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimization:&lt;/strong&gt; Faster inference, quantization improvements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Features:&lt;/strong&gt; Multilingual support, dynamic sequence length&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation:&lt;/strong&gt; Tutorials, use cases, API docs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See CONTRIBUTING.md for guidelines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;DragonMemory demonstrates that learned neural compression can achieve practical trade-offs for production RAG systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;16x sequence reduction without catastrophic information loss&lt;/li&gt;
&lt;li&gt;90%+ semantic fidelity maintained after compression&lt;/li&gt;
&lt;li&gt;Production-ready with GUI, persistence, and multi-LLM support&lt;/li&gt;
&lt;li&gt;Honest about limitations: not a silver bullet, but a useful tool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're building RAG systems and struggling with storage/speed constraints, DragonMemory is worth evaluating. It won't replace sentence embeddings for all use cases, but for long documents and partial query matching, the sequence compression approach shows promise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it out:&lt;/strong&gt; &lt;a href="https://github.com/Freeky7819/DragonMemory" rel="noopener noreferrer"&gt;https://github.com/Freeky7819/DragonMemory&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Acknowledgments
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sentence Transformers:&lt;/strong&gt; Foundation for teacher embeddings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama:&lt;/strong&gt; Enabling local LLM inference&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit:&lt;/strong&gt; Rapid GUI prototyping&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyTorch:&lt;/strong&gt; Neural network framework&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Built with 🐉 by Damjan Žakelj&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Questions? Open an issue on &lt;a href="https://github.com/Freeky7819/DragonMemory" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rag</category>
      <category>llm</category>
      <category>opensource</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Resonant Convergence Analysis (RCA): Intelligent Early Stopping That Cuts Training Time by 35–45%</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Fri, 31 Oct 2025 07:49:57 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/resonant-convergence-analysis-rca-intelligent-early-stopping-that-cuts-training-time-by-35-45--2p83</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/resonant-convergence-analysis-rca-intelligent-early-stopping-that-cuts-training-time-by-35-45--2p83</guid>
      <description>&lt;p&gt;Training deep-learning models often continues long after true&lt;br&gt;
convergence, wasting GPU hours.\&lt;br&gt;
&lt;strong&gt;Resonant Convergence Analysis (RCA)&lt;/strong&gt; is a new open-source callback&lt;br&gt;
that detects &lt;em&gt;real convergence&lt;/em&gt; by analyzing oscillation patterns in&lt;br&gt;
validation loss instead of relying on naive patience counters.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RCA?
&lt;/h2&gt;

&lt;p&gt;RCA introduces two parameters:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Symbol&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Typical Range&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;β&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Resonance amplitude (training stability)&lt;/td&gt;
&lt;td&gt;0–1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ω&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Resonance frequency (oscillation phase)&lt;/td&gt;
&lt;td&gt;≈6 ± 0.5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Training stops when &lt;strong&gt;β ≥ 0.75&lt;/strong&gt; and oscillations flatten below a small Δloss threshold.&lt;/p&gt;
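&lt;p&gt;RCA's actual detector lives in the package; the stopping rule above can be sketched in a few lines (the β formula here is a toy mapping from window flatness to [0, 1], not the library's definition, and all names are illustrative):&lt;/p&gt;

```python
def should_stop(val_losses, beta_threshold=0.75, min_delta=0.003, window=4):
    """Toy version of the rule: the stability score beta rises toward 1 as
    the recent validation-loss window flattens; stop once beta >= 0.75 and
    the remaining oscillation amplitude drops below min_delta."""
    if len(val_losses) < window:
        return False
    recent = val_losses[-window:]
    delta = max(recent) - min(recent)       # oscillation amplitude
    beta = 1.0 / (1.0 + delta / min_delta)  # delta -> 0  =>  beta -> 1
    return beta >= beta_threshold and delta < min_delta

print(should_stop([0.90, 0.50, 0.30, 0.2005, 0.200, 0.200, 0.200]))  # True
```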

&lt;h2&gt;
  
  
  Quick Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;resonant_learner&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ResonantCallback&lt;/span&gt;

&lt;span class="n"&gt;rca&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResonantCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;checkpoint_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./checkpoints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;patience_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;min_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.003&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ema_alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;lr_reduction_factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;min_lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;train_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_epoch&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
    &lt;span class="n"&gt;val_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
    &lt;span class="nf"&gt;rca&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;epoch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rca&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;should_stop&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RCA triggered early stopping.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Results (Production Validation)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Baseline Epochs&lt;/th&gt;
&lt;th&gt;RCA Epochs&lt;/th&gt;
&lt;th&gt;Compute Saved&lt;/th&gt;
&lt;th&gt;ΔAccuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MNIST&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;td&gt;+0.12%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fashion-MNIST&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;47%&lt;/td&gt;
&lt;td&gt;−0.67%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CIFAR-10 (ResNet-18)&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;45&lt;/td&gt;
&lt;td&gt;25%&lt;/td&gt;
&lt;td&gt;+1.35%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BERT SST-2&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;30%&lt;/td&gt;
&lt;td&gt;−0.11%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Average compute reduction: &lt;strong&gt;≈36%&lt;/strong&gt;, accuracy preserved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/resonant-learner
&lt;span class="nb"&gt;cd &lt;/span&gt;resonant-learner
pip &lt;span class="nb"&gt;install &lt;/span&gt;torch torchvision

pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-U&lt;/span&gt; pip setuptools wheel
pip &lt;span class="nb"&gt;install &lt;/span&gt;torch torchvision torchaudio &lt;span class="nt"&gt;--index-url&lt;/span&gt; https://download.pytorch.org/whl/cu124
pip &lt;span class="nb"&gt;install &lt;/span&gt;tqdm numpy pandas matplotlib timm transformers datasets

pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;

pytest &lt;span class="nt"&gt;-q&lt;/span&gt;
python verify_installation.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Reproduction Commands
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CIFAR-10&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python examples/cifar10_rca.py --epochs 60 --batch-size 128 --seed 42
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;BERT SST-2&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python examples/hf_bert_glue.py --task sst2 --epochs 10 --batch-size 32 --seed 42
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2&gt;
  
  
  Learn More
&lt;/h2&gt;

&lt;p&gt;📄 &lt;a href="https://doi.org/10.5281/zenodo.17393082" rel="noopener noreferrer"&gt;Scientific Validation Report on Zenodo&lt;/a&gt;&lt;br&gt;
🔗 &lt;a href="https://github.com/Freeky7819/resonant-learner" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;br&gt;
🧠 Author: &lt;em&gt;Damjan Žakelj&lt;/em&gt; (Harmonic Logos)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Stop training when your model converges, not epochs later."&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Harmonic RSI — Measuring Logical Resonance and Stability in AI Reasoning</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Thu, 23 Oct 2025 00:13:49 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/harmonic-rsi-measuring-logical-resonance-and-stability-in-ai-reasoning-20l7</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/harmonic-rsi-measuring-logical-resonance-and-stability-in-ai-reasoning-20l7</guid>
      <description>&lt;p&gt;TL;DR:&lt;br&gt;
An open-source toolkit to measure how consistently an AI agent thinks — not just whether it gives the right answer.&lt;br&gt;
👉 github.com/Freeky7819/harmonic-rsi&lt;/p&gt;

&lt;h2&gt;
  
  
  💡 Why this project exists
&lt;/h2&gt;

&lt;p&gt;When evaluating large language models, we usually focus on compliance and accuracy.&lt;br&gt;
But there's another dimension that often gets ignored — stability of reasoning.&lt;/p&gt;

&lt;p&gt;How steady is the model’s internal logic from step to step?&lt;br&gt;
Does it “drift” or “oscillate” between modes of thought?&lt;br&gt;
Can we quantify that resonance instead of guessing?&lt;/p&gt;

&lt;p&gt;That’s what the Harmonic RSI project explores.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 What is Harmonic RSI?
&lt;/h2&gt;

&lt;p&gt;Harmonic RSI (Resonance Stability Index) is a lightweight Python package that analyzes reasoning traces from AI agents — sequences of thoughts, plans, or explanations — and quantifies how coherent they remain over time.&lt;/p&gt;

&lt;p&gt;It can be used standalone, or as a plug-in evaluator in frameworks like Rogue, LangChain, or EvalGen.&lt;/p&gt;

&lt;p&gt;Main features:&lt;/p&gt;

&lt;p&gt;🌀 Resonance Stability Index (RSI):&lt;br&gt;
Measures logical drift via cosine distance between consecutive embedding vectors.&lt;/p&gt;

&lt;p&gt;🔭 Resonant-filter mode (experimental):&lt;br&gt;
Applies a log-periodic modulation on the embedding sequence to detect oscillatory instability.&lt;/p&gt;

&lt;p&gt;🧩 ISM Φ-Layer:&lt;br&gt;
Extracts phase-like signals from model embeddings and tracks ∂Φ/∂t (logical phase velocity).&lt;/p&gt;

&lt;p&gt;🧠 Gradio UI:&lt;br&gt;
Real-time reasoning dashboard:&lt;br&gt;
Prompt → GPT → Embeddings → ISM → RSI&lt;/p&gt;

&lt;p&gt;⚙️ CLI and API:&lt;br&gt;
Works as a standalone evaluator or integrated pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚙️ Quick Example
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from harmonic_rsi import ResonanceEvaluator

trace = [
    "Plan: gather data",
    "Next: filter by category",
    "Then: summarize results"
]

rsi = ResonanceEvaluator()
print(rsi.evaluate(trace, mode="embedding"))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{'resonance_score': 0.87, 'phase_drift': 0.12, 'semantic_coherence': 0.91}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
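&lt;p&gt;Under the hood, the drift component boils down to cosine similarity between consecutive step embeddings; a minimal standalone sketch (toy 2-D vectors in place of real embeddings; the averaging is illustrative, not the package's exact formula):&lt;/p&gt;

```python
import numpy as np

def resonance_score(embeddings):
    """Mean cosine similarity between consecutive reasoning-step embeddings:
    1.0 = perfectly steady logic, lower values = more drift."""
    sims = []
    for a, b in zip(embeddings[:-1], embeddings[1:]):
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.mean(sims))

trace = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]])  # toy step embeddings
print(round(resonance_score(trace), 3))  # 0.992
```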

&lt;h2&gt;
  
  
  📊 Why it matters
&lt;/h2&gt;

&lt;p&gt;Instead of treating reasoning instability as random noise,&lt;br&gt;
RSI models it as a resonance pattern —&lt;br&gt;
something that can be measured, compared, and potentially optimized.&lt;/p&gt;

&lt;p&gt;Think of it as signal analysis for cognition — applied to LLMs.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚖️ License &amp;amp; Ethos
&lt;/h2&gt;

&lt;p&gt;License: CC BY-NC 4.0 — open for research, not for commercial use.&lt;/p&gt;

&lt;p&gt;Goal: transparent exploration of internal model stability.&lt;/p&gt;

&lt;p&gt;Not another leaderboard metric:&lt;br&gt;
RSI complements standard evals; it doesn’t compete with them.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧰 Try it out
&lt;/h2&gt;

&lt;p&gt;Clone and run locally:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/harmonic-rsi
cd harmonic-rsi/harmonic-rsi_final
pip install -e ".[st,dev]"
pytest -q
python -m harmonic_rsi.app_gradio
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The Gradio dashboard will open at localhost:7860.&lt;/p&gt;

&lt;h2&gt;
  
  
  🙋‍♂️ Contributing
&lt;/h2&gt;

&lt;p&gt;Feedback, testing, or critical discussion are very welcome.&lt;br&gt;
If you’ve worked with evaluation frameworks (Rogue, HELM, EvalGen, etc.) — I’d love your thoughts on integrating RSI as a complementary layer.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/Freeky7819/harmonic-rsi" rel="noopener noreferrer"&gt;https://github.com/Freeky7819/harmonic-rsi&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tooling</category>
      <category>llm</category>
      <category>python</category>
    </item>
    <item>
      <title>Harmonic Logos: Building Meaning Through Resonant AI</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Sun, 19 Oct 2025 09:54:28 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/introducing-the-harmonic-logos-demo-an-open-source-resonance-engine-3e8j</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/introducing-the-harmonic-logos-demo-an-open-source-resonance-engine-3e8j</guid>
      <description>&lt;p&gt;&lt;em&gt;by Damjan Žakelj&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live development log:&lt;/strong&gt; &lt;a href="https://chat.openai.com/share/68f4a3a4-a818-8006-9f52-ae7d2b9450ab" rel="noopener noreferrer"&gt;ChatGPT Share&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;GitHub repo:&lt;/strong&gt; &lt;a href="https://github.com/Freeky7819/harmonic-logos-demo" rel="noopener noreferrer"&gt;Harmonic-Logos-Demo&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Community:&lt;/strong&gt; &lt;a href="https://www.reddit.com/r/HarmonicLogos" rel="noopener noreferrer"&gt;r/HarmonicLogos&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🌍 What is Harmonic Logos?
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Harmonic Logos Demo&lt;/strong&gt; is an open-source experiment showing how structure and meaning can emerge through &lt;em&gt;resonance&lt;/em&gt; — where physics, mathematics, and information interact coherently.&lt;/p&gt;

&lt;p&gt;It’s not a “self-aware AI.”&lt;br&gt;&lt;br&gt;
It’s a &lt;strong&gt;transparent, verifiable framework&lt;/strong&gt; that demonstrates how logical, ethical, and mathematical domains can interlink to create interpretable insight.&lt;/p&gt;


&lt;h2&gt;
  
  
  🧠 Core idea
&lt;/h2&gt;

&lt;p&gt;Instead of neural black boxes, &lt;em&gt;resonant systems&lt;/em&gt; use explicit symbolic domains that "echo" each other:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Module&lt;/th&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scout&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Detects relevant ideas across domains (physics, math, ethics, art…).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hypothesis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Combines the hits into a reasoned explanation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-Link&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Finds bridges between domains (e.g. &lt;em&gt;symmetry ↔ compression&lt;/em&gt;).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Integrity Guard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Verifies every file’s hash in &lt;code&gt;manifest.json&lt;/code&gt; for tamper-evidence.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The demo runs fully offline — no network calls, no hidden APIs.&lt;/p&gt;


&lt;h2&gt;
  
  
  ⚙️ How to run it
&lt;/h2&gt;

&lt;p&gt;Clone the repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/harmonic-logos-demo.git
cd harmonic-logos-demo/demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create and activate a virtual environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m venv .venv
# Linux/macOS
source .venv/bin/activate

# Windows PowerShell
.\.venv\Scripts\Activate.ps1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check integrity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python verify_manifest.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the demo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python crosslink_demo.py
python run_example.py "How do biology and information stability connect to learning?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SCOUT hits:
  physics: [...]
  math: [...]
  ethics: [...]
HYPOTHESIS: ...
CROSS-LINKS: ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎯 Why it matters
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Transparency&lt;/strong&gt; – Everything is open, checksummed, and human-readable.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Education&lt;/strong&gt; – Demonstrates interpretable reasoning tools without opaque models.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Research value&lt;/strong&gt; – Encodes “resonance logic” in reproducible form.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Safety&lt;/strong&gt; – No self-modification, no data calls, no black-box behavior.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤝 Call for collaborators
&lt;/h2&gt;

&lt;p&gt;We’re looking for people who resonate with this idea:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Physicists, linguists, philosophers — to expand the registry of domains.
&lt;/li&gt;
&lt;li&gt;Developers — to add embeddings, feedback, or adaptive memory.
&lt;/li&gt;
&lt;li&gt;Thinkers — to explore resonance as a bridge between AI, cognition, and meaning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Join us at&lt;br&gt;&lt;br&gt;
👉 &lt;a href="https://www.reddit.com/r/HarmonicLogos" rel="noopener noreferrer"&gt;r/HarmonicLogos&lt;/a&gt;&lt;br&gt;&lt;br&gt;
or share ideas via GitHub Issues or Pull Requests.&lt;/p&gt;




&lt;h2&gt;
  
  
  📜 License
&lt;/h2&gt;

&lt;p&gt;Licensed under &lt;strong&gt;CC BY-NC 4.0&lt;/strong&gt; — see &lt;code&gt;LICENSE&lt;/code&gt; in the repository.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✨ Closing note
&lt;/h3&gt;

&lt;p&gt;Harmonic Logos is not just code — it’s a framework for understanding &lt;em&gt;why reasoning itself can resonate.&lt;/em&gt;&lt;br&gt;&lt;br&gt;
If this vision speaks to you, come build with us.&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>science</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
    <item>
      <title>HAL Meta-Scheduler: An Adaptive Layer That Learns How to Balance Your Cluster</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Tue, 14 Oct 2025 11:54:44 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/hal-meta-scheduler-an-adaptive-layer-that-learns-how-to-balance-your-cluster-mhj</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/hal-meta-scheduler-an-adaptive-layer-that-learns-how-to-balance-your-cluster-mhj</guid>
      <description>&lt;h3&gt;
  
  
  🚀 Overview
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;HAL Meta-Scheduler&lt;/strong&gt; is an adaptive orchestration layer that learns how to balance workloads in real time.&lt;/p&gt;

&lt;p&gt;It doesn't replace your scheduler — it &lt;em&gt;teaches it to breathe&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This open-source demo shows how simple feedback metrics can keep a distributed system stable under changing load and still save energy.&lt;br&gt;&lt;br&gt;
No proprietary math or hidden weights — everything you see here is functional and reproducible.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 What It Does
&lt;/h2&gt;

&lt;p&gt;HAL observes your cluster through four lightweight signals:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Symbol&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;σ&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Coherence&lt;/em&gt; — how evenly the load is spread&lt;/td&gt;
&lt;td&gt;stability indicator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Entropy&lt;/em&gt; — diversity of jobs per node&lt;/td&gt;
&lt;td&gt;utilization diversity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;δ&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Queue drift&lt;/em&gt; — rate of pending growth&lt;/td&gt;
&lt;td&gt;stress level&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Φ&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Informational potential&lt;/em&gt; — combined system tension&lt;/td&gt;
&lt;td&gt;energy/stability metric&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These are computed continuously and used to adjust the balance between &lt;strong&gt;packing&lt;/strong&gt; (energy-efficient) and &lt;strong&gt;spreading&lt;/strong&gt; (latency-resilient).&lt;/p&gt;

&lt;p&gt;The result: fewer spikes, smoother utilization curves, and lower total energy per job.&lt;/p&gt;
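&lt;p&gt;The formulas below are illustrative stand-ins, not HAL's exact definitions, but they show how all four signals (&lt;code&gt;sigma&lt;/code&gt; for σ, &lt;code&gt;H&lt;/code&gt;, &lt;code&gt;delta&lt;/code&gt; for δ, &lt;code&gt;phi&lt;/code&gt; for Φ) can be derived from nothing more than per-node job counts and queue length:&lt;/p&gt;

```python
import math

def signals(jobs_per_node, queue_len, prev_queue_len):
    """Toy versions of HAL's four feedback signals."""
    total = sum(jobs_per_node)
    mean = total / len(jobs_per_node)
    var = sum((j - mean) ** 2 for j in jobs_per_node) / len(jobs_per_node)
    sigma = 1.0 / (1.0 + var / (mean ** 2 + 1e-9))  # even spread -> near 1
    probs = [j / total for j in jobs_per_node if j > 0]
    H = -sum(p * math.log(p) for p in probs)        # job-distribution entropy
    delta = queue_len - prev_queue_len              # queue drift
    phi = delta + (1.0 - sigma)                     # combined system tension
    return sigma, H, delta, phi

# Perfectly even cluster, stable queue: sigma = 1, drift and tension = 0.
print(signals([5, 5, 5, 5], queue_len=10, prev_queue_len=10))
```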




&lt;h2&gt;
  
  
  ⚙️ How It Works
&lt;/h2&gt;

&lt;p&gt;HAL is implemented as a simple control layer:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Simulator&lt;/strong&gt; – synthetic cluster with N nodes and a Poisson workload generator
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Controllers&lt;/strong&gt; – heuristic, PID, and Bayesian variants that adapt parameter &lt;code&gt;p ∈ [0,1]&lt;/code&gt; (pack ↔ spread)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics server&lt;/strong&gt; – FastAPI + Prometheus &lt;code&gt;/metrics&lt;/code&gt; endpoint for dashboards
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helm chart&lt;/strong&gt; – deployable metrics demo for Kubernetes
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grafana dashboard&lt;/strong&gt; – real-time visualization of σ, H, δ, Φ, and p
&lt;/li&gt;
&lt;/ol&gt;
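&lt;p&gt;As a rough sketch of how the PID variant might adapt &lt;code&gt;p&lt;/code&gt; (class name, gains, and the drift-to-p mapping are illustrative assumptions, not the repo's actual interface):&lt;/p&gt;

```python
class PackSpreadPID:
    """PID loop nudging p in [0, 1] toward a queue-drift target of zero.

    Illustrative sketch only; the real controllers in the repo may differ.
    """

    def __init__(self, kp=0.05, ki=0.01, kd=0.02):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0
        self.p = 0.5  # start halfway between pack (0) and spread (1)

    def step(self, queue_drift, dt=1.0):
        # Positive drift (queue growing) pushes p toward spreading;
        # negative drift lets p fall back toward energy-efficient packing.
        error = queue_drift
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        self.p += self.kp * error + self.ki * self.integral + self.kd * derivative
        self.p = max(0.0, min(1.0, self.p))  # clamp to [0, 1]
        return self.p
```

&lt;p&gt;Feeding the δ signal into &lt;code&gt;step()&lt;/code&gt; each tick is enough to reproduce the basic pack-vs-spread feedback behaviour described above.&lt;/p&gt;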

&lt;p&gt;Everything runs locally with no external dependencies.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/halms-demo
cd halms-demo
python -m venv .venv
.venv/Scripts/pip install -r requirements.txt
python simulate.py --steps 1500
python plot_metrics.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;(On Linux/macOS, the venv scripts live under &lt;code&gt;.venv/bin/&lt;/code&gt; rather than &lt;code&gt;.venv/Scripts/&lt;/code&gt;.)&lt;/p&gt;

&lt;p&gt;You’ll see two traces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;baseline (static scheduler)
&lt;/li&gt;
&lt;li&gt;adaptive HAL (dynamic control)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📊 Example Output
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Queue spikes reduced by 40–70%
&lt;/li&gt;
&lt;li&gt;Coherence σ stabilized near 0.9
&lt;/li&gt;
&lt;li&gt;Adaptive parameter p converging to steady state
&lt;/li&gt;
&lt;li&gt;Smooth Φ (stress metric) vs time
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even this demo, using only PID/Bayesian logic, shows how feedback control can outperform static heuristics for scheduling.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Why It Matters
&lt;/h2&gt;

&lt;p&gt;Modern clusters waste cycles and energy because schedulers are blind to system feedback.&lt;br&gt;&lt;br&gt;
They rely on fixed heuristics like “bin pack until 80 % CPU” or “spread by labels”.&lt;br&gt;&lt;br&gt;
HAL introduces &lt;strong&gt;self-tuning&lt;/strong&gt; — it reads the system’s own signals and re-balances automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benefits
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ Reduced queue oscillations
&lt;/li&gt;
&lt;li&gt;⚡ Energy efficiency via adaptive packing
&lt;/li&gt;
&lt;li&gt;📈 Predictable latency under load
&lt;/li&gt;
&lt;li&gt;🔍 Native observability (Prometheus + Grafana)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Use cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes (as a policy advisor / extender)
&lt;/li&gt;
&lt;li&gt;HPC or SLURM queues
&lt;/li&gt;
&lt;li&gt;AI/ML job orchestrators
&lt;/li&gt;
&lt;li&gt;Edge or hybrid clusters&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧰 Tech Stack
&lt;/h2&gt;

&lt;p&gt;Python 3.11 · FastAPI · Prometheus client · Helm v3 · Grafana · GitHub Actions CI (lint + SBOM)&lt;br&gt;&lt;br&gt;
License: &lt;strong&gt;Apache 2.0&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧭 Open vs Enterprise
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Public Demo&lt;/th&gt;
&lt;th&gt;Enterprise&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Core control&lt;/td&gt;
&lt;td&gt;heuristic, PID, Bayesian&lt;/td&gt;
&lt;td&gt;proprietary resonant kernel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;metrics demo (Helm)&lt;/td&gt;
&lt;td&gt;full operator + extender&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-cluster control&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Historical analytics&lt;/td&gt;
&lt;td&gt;basic&lt;/td&gt;
&lt;td&gt;advanced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SLA &amp;amp; support&lt;/td&gt;
&lt;td&gt;community&lt;/td&gt;
&lt;td&gt;commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The open demo is &lt;strong&gt;fully working&lt;/strong&gt; — no placeholders — and safe for public use.&lt;br&gt;&lt;br&gt;
The enterprise version builds on this foundation for production-grade orchestration.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧪 Try It
&lt;/h2&gt;

&lt;p&gt;Live repo → &lt;a href="https://github.com/Freeky7819/halms-demo" rel="noopener noreferrer"&gt;github.com/Freeky7819/halms-demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run the metrics server:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m uvicorn server:app --host 127.0.0.1 --port 8015
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Then open:&lt;br&gt;
&lt;a href="http://127.0.0.1:8015/metrics" rel="noopener noreferrer"&gt;http://127.0.0.1:8015/metrics&lt;/a&gt;&lt;br&gt;
or &lt;a href="http://127.0.0.1:8015/live" rel="noopener noreferrer"&gt;http://127.0.0.1:8015/live&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🤝 Contribute
&lt;/h2&gt;

&lt;p&gt;Feedback, issues, and forks are welcome.&lt;br&gt;&lt;br&gt;
We’re particularly interested in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;new stability metrics
&lt;/li&gt;
&lt;li&gt;dataset-driven tuning
&lt;/li&gt;
&lt;li&gt;multi-cluster experimentation
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Open discussions or PRs — everything helps us improve the adaptive model.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;HAL is open, safe, and ready to explore.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
If you’ve ever wondered what a scheduler with a feedback loop would look like — this is your playground.  &lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://github.com/Freeky7819/halms-demo" rel="noopener noreferrer"&gt;GitHub → Freeky7819/halms-demo&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>kubernetes</category>
      <category>devops</category>
      <category>aiops</category>
    </item>
    <item>
      <title>Visualizing Trust in Multi-Agent Systems — The Swarm-ISM-X Public Demo (v2)</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Mon, 13 Oct 2025 17:46:57 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/visualizing-trust-in-multi-agent-systems-the-swarm-ism-x-public-demo-v2-io8</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/visualizing-trust-in-multi-agent-systems-the-swarm-ism-x-public-demo-v2-io8</guid>
      <description>&lt;p&gt;For the past months I’ve been experimenting with ways to visualize trust and stability in distributed AI systems — the kind of architectures where dozens of agents must cooperate without a central brain.&lt;/p&gt;

&lt;p&gt;The result is something I call Swarm-ISM-X.&lt;/p&gt;

&lt;p&gt;The Public Demo (v2) is now open-sourced — a clean, safe version that shows how the swarm behaves, not why it behaves that way.&lt;/p&gt;

&lt;p&gt;🌀 What you’ll see&lt;/p&gt;

&lt;p&gt;A Tkinter-based GUI that displays 10 agents along a horizontal line.&lt;/p&gt;

&lt;p&gt;Each agent moves, stabilizes, and maintains formation under light “wind” disturbances.&lt;/p&gt;

&lt;p&gt;Each agent has a “passport” indicator (green = valid, red = invalid).&lt;/p&gt;

&lt;p&gt;An “Auto Demo” mode runs scripted sequences for presentations.&lt;/p&gt;

&lt;p&gt;The simulation updates in real time — you can watch the system find balance, lose it, and regain it.&lt;/p&gt;

&lt;p&gt;🔍 What’s really happening&lt;/p&gt;

&lt;p&gt;Under the hood, each agent is governed by a simplified consensus-like controller:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Controller&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;$$&lt;br&gt;
u_i \;=\; -\,k_i \nabla_i S&lt;br&gt;
$$&lt;/p&gt;

&lt;p&gt;where $S$ is a constraint vector maintaining equal spacing and total span.&lt;/p&gt;
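&lt;p&gt;A minimal sketch of that controller, treating S as a scalar potential penalizing unequal neighbor spacing along the line (the function signature, step size, and exact form of S are assumptions for illustration, not the demo's actual source):&lt;/p&gt;

```python
import numpy as np

def control_step(x, k, spacing=1.0, dt=0.05):
    """One Euler step of the simplified controller u_i = -k_i * grad_i(S).

    Here S = 0.5 * sum_j ((x[j+1] - x[j]) - spacing)^2, so the gradient
    pulls each pair of neighbors toward the target spacing.
    """
    n = len(x)
    grad = np.zeros(n)
    for j in range(n - 1):
        e = (x[j + 1] - x[j]) - spacing  # spacing error between neighbors j, j+1
        grad[j] -= e       # dS/dx[j]   picks up -e_j
        grad[j + 1] += e   # dS/dx[j+1] picks up +e_j
    u = -k * grad          # k may be a scalar or a per-agent gain array
    return x + dt * u
```

&lt;p&gt;Iterating this step drives the agents into an equally spaced formation, which is the first-order behaviour the GUI visualizes.&lt;/p&gt;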

&lt;p&gt;The real ISM-X framework extends this idea with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adaptive gain tuning using resonant feedback (not in public demo)&lt;/li&gt;
&lt;li&gt;Cryptographic attestation (Ed25519 + HMAC commitments)&lt;/li&gt;
&lt;li&gt;Passport issuance and verification between agents&lt;/li&gt;
&lt;li&gt;Log-periodic modulation for stability over communication delays&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The public demo keeps only the first-order visible dynamics — enough to show formation control and disturbance recovery — while replacing sensitive parts with lightweight placeholders.&lt;/p&gt;

&lt;p&gt;🔒 What’s included vs. hidden&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Included&lt;/th&gt;
&lt;th&gt;Hidden&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GUI visualization&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;–&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Swarm dynamics (simple consensus)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;–&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Passport system (stubbed SHA-1)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Real attestation (Ed25519/HMAC)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Adaptive control &amp;amp; resonance&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;proprietary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Informational geometry layer&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;research&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;⚙️ Run it yourself&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/swarm-ismx-gui-demo.git
cd swarm-ismx-gui-demo
pip install numpy
python main_gui_public.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Works out of the box on Python 3.10+.&lt;br&gt;
The GUI shows live values of ‖S‖, J, and the per-agent gains k_i.&lt;/p&gt;

&lt;p&gt;🧩 Why it matters&lt;/p&gt;

&lt;p&gt;Visual demos like this help bridge AI orchestration and trust architectures.&lt;br&gt;
You can see — literally — what happens when an agent’s integrity fails, when noise enters, or when collective damping stabilizes the system.&lt;/p&gt;

&lt;p&gt;This isn’t a neural network or RL — it’s a physically grounded, interpretable control system.&lt;br&gt;
Think of it as a way to watch trust itself breathe.&lt;/p&gt;

&lt;p&gt;GitHub: Swarm-ISM-X GUI Demo v2&lt;/p&gt;

&lt;p&gt;Author: Damjan&lt;br&gt;
Reason in resonance.&lt;/p&gt;

&lt;p&gt;Feedback is always welcome — especially if you work on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-agent coordination&lt;/li&gt;
&lt;li&gt;real-time visualization&lt;/li&gt;
&lt;li&gt;control theory + cryptographic verification bridges&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s make AI agents not only smarter — but also more honest.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Building a Runtime Stability Framework for Autonomous AI — from Research to Working Prototype</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Sun, 12 Oct 2025 07:40:24 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/title-how-i-built-a-lightweight-runtime-stability-layer-for-ai-agents-49i8</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/title-how-i-built-a-lightweight-runtime-stability-layer-for-ai-agents-49i8</guid>
      <description>&lt;p&gt;⚙️ Abstract&lt;/p&gt;

&lt;p&gt;Over the past months we’ve been developing a modular framework for real-time stability monitoring and self-regulation in AI systems.&lt;br&gt;
The concept — internally codenamed ISM-X / RSC Stack — defines how autonomous agents can continuously measure their internal coherence, detect phase drift, and adaptively control their reasoning intensity or decision gating.&lt;/p&gt;

&lt;p&gt;This article presents the current public architecture and project stage.&lt;br&gt;
All critical core algorithms remain confidential and protected under trade-secret status.&lt;br&gt;
However, the surrounding system — architecture, runtime, and observability stack — is open for review and potential collaboration.&lt;/p&gt;

&lt;p&gt;🧩 The Core Idea (High-Level)&lt;/p&gt;

&lt;p&gt;Every complex agent generates signals that describe its own state: semantic coherence, prediction stability, drift, loop gain, etc.&lt;br&gt;
We treat these as vital signs — runtime telemetry for cognition.&lt;/p&gt;

&lt;p&gt;Our framework defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how to collect and normalize those signals,&lt;/li&gt;
&lt;li&gt;how to compute an abstract stability index (Γ) and a phase offset (Δφ),&lt;/li&gt;
&lt;li&gt;how to classify each state into lock, mini-lock, or out-of-lock regimes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The internal mathematical transformation that governs this process remains proprietary.&lt;br&gt;
What’s published here is the operational shell — a safe, auditable, and high-performance runtime environment.&lt;/p&gt;
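&lt;p&gt;To make the regime classification concrete without touching the confidential core, here is a minimal sketch of the classification shell; every threshold below is an illustrative placeholder, not the proprietary Γ–Δφ mapping:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class LockState:
    gamma: float      # abstract stability index, higher = more stable
    delta_phi: float  # phase offset in radians

def classify(state, gamma_lock=0.8, gamma_mini=0.5, phi_max=0.3):
    """Map (gamma, delta_phi) to a lock regime.

    All thresholds are illustrative placeholders; the real transformation
    behind gamma and delta_phi is not reproduced here.
    """
    if state.gamma >= gamma_lock and phi_max >= abs(state.delta_phi):
        return "lock"
    if state.gamma >= gamma_mini:
        return "mini-lock"
    return "out-of-lock"
```

&lt;p&gt;A shell like this is what the collector and dashboard modules consume: they only ever see the resulting regime labels and the two scalar signals.&lt;/p&gt;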

&lt;p&gt;🧱 System Architecture (Public Layer)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Agent Loop] → metrics → [RSC Core] → {lock / mini-lock / out-of-lock}
                                  |
                                  v
                          Secure Collector (JSONL)
                                  |
      +---------------------------+---------------------------+
      |                           |                           |
  Prometheus Exporter        Web UI (FastAPI)            Alert Daemon
  (KPIs for Ops)            (Live monitoring)           (Webhook / SLA)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Modules included in the public stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;runtime_adapter&lt;/code&gt; – standardizes input signals&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rsc_collector_v12&lt;/code&gt; – high-speed JSONL collector with rolling checksum and optional AES-GCM encryption&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rsc_prom_exporter&lt;/code&gt; – exposes KPIs to Prometheus / Grafana&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rsc_webui&lt;/code&gt; – lightweight FastAPI dashboard for Δφ / Γ / lock-status visualization&lt;/li&gt;
&lt;li&gt;&lt;code&gt;rsc_alert_daemon&lt;/code&gt; – webhook alerting with threshold logic&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ismxlang.yaml&lt;/code&gt; – declarative configuration and policy definitions&lt;/li&gt;
&lt;li&gt;&lt;code&gt;run_ismx.py&lt;/code&gt; – demo runner for local or simulated environments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This version forms the public “shell” — safe to integrate, inspect and extend.&lt;br&gt;
The confidential core is injected as a black-box module during internal builds.&lt;/p&gt;

&lt;p&gt;🧪 Current Development Stage (October 2025)&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Architecture&lt;/td&gt;
&lt;td&gt;✅ Stable&lt;/td&gt;
&lt;td&gt;Modular, tested in local and simulated environments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime Logging&lt;/td&gt;
&lt;td&gt;✅ Complete&lt;/td&gt;
&lt;td&gt;JSONL + checksum + AES-GCM optional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prometheus / WebUI&lt;/td&gt;
&lt;td&gt;✅ Functional&lt;/td&gt;
&lt;td&gt;Live metrics, Δφ / Γ visualization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Core Model (Γ–Δφ)&lt;/td&gt;
&lt;td&gt;🔒 Confidential&lt;/td&gt;
&lt;td&gt;Validated prototype, not publicly released&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Industrial Testing&lt;/td&gt;
&lt;td&gt;🔄 In progress&lt;/td&gt;
&lt;td&gt;Preparing MVP deployment for AI-Ops systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security &amp;amp; Audit&lt;/td&gt;
&lt;td&gt;✅ Implemented&lt;/td&gt;
&lt;td&gt;No PII, hash-salted IDs, audit-ready rotation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Collaboration&lt;/td&gt;
&lt;td&gt;🟢 Open&lt;/td&gt;
&lt;td&gt;Seeking research &amp;amp; engineering partners&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;🚀 Why It Matters&lt;/p&gt;

&lt;p&gt;Modern AI agents can lose internal coherence without realizing it.&lt;br&gt;
Our framework adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;self-monitoring capability – detect drift before failure&lt;/li&gt;
&lt;li&gt;adaptive gating – pause, reflect, or reduce output when unstable&lt;/li&gt;
&lt;li&gt;observability layer – operators see agent “health” in real time&lt;/li&gt;
&lt;li&gt;secure audit logs – verifiable, integrity-checked data trail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s like a runtime nervous system for AI — lightweight, explainable, and safe.&lt;/p&gt;

&lt;p&gt;🤝 Collaboration Invitation&lt;/p&gt;

&lt;p&gt;We’re currently looking for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI/ML engineers with interest in runtime observability or agent orchestration,&lt;/li&gt;
&lt;li&gt;research groups exploring autonomous stability and reflective control,&lt;/li&gt;
&lt;li&gt;industry partners who want to integrate stability monitoring into AI-Ops or agentic frameworks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Demo files:&lt;/p&gt;

&lt;p&gt;Demo Light: &lt;a href="https://drive.google.com/drive/folders/12PE-02hwDkm9nccfiUG9bZSZhERE6oxJ?usp=sharing" rel="noopener noreferrer"&gt;https://drive.google.com/drive/folders/12PE-02hwDkm9nccfiUG9bZSZhERE6oxJ?usp=sharing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Demo with Docker: &lt;a href="https://drive.google.com/drive/folders/1WaFSJwG-Yhha5bzpgujIrkl8B2CVMJWE?usp=sharing" rel="noopener noreferrer"&gt;https://drive.google.com/drive/folders/1WaFSJwG-Yhha5bzpgujIrkl8B2CVMJWE?usp=sharing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Commercial licence: &lt;a href="https://github.com/Freeky7819/rsc-open-demo" rel="noopener noreferrer"&gt;https://github.com/Freeky7819/rsc-open-demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can reach out to discuss collaboration, private demonstrations, or closed technical audits.&lt;/p&gt;

&lt;p&gt;📧 Contact: &lt;a href="mailto:zakelj.damjan@gmail.com"&gt;zakelj.damjan@gmail.com&lt;/a&gt;&lt;br&gt;
(Please include “RSC Collaboration” in subject line.)&lt;/p&gt;

&lt;p&gt;🔒 Legal and IP Notice&lt;/p&gt;

&lt;p&gt;The concepts, architecture, and partial implementations described here are protected by copyright © 2025 Damjan Žakelj.&lt;br&gt;
Core algorithms, numerical transforms, and stability mappings (Γ, Δφ) are proprietary trade secrets.&lt;br&gt;
Publication of this article constitutes defensive prior art against external patenting of identical methods.&lt;/p&gt;

&lt;p&gt;Public components are released under the Creative Commons BY-NC-SA 4.0 license.&lt;br&gt;
Commercial use requires written permission.&lt;/p&gt;

&lt;p&gt;📜 Summary&lt;/p&gt;

&lt;p&gt;ISM-X / RSC Stack represents a new category of runtime layer for AI agents:&lt;br&gt;
a minimal, auditable, security-aware system that quantifies coherence and drift in real time.&lt;/p&gt;

&lt;p&gt;The architecture is public.&lt;br&gt;
The mathematics is protected.&lt;br&gt;
The door for collaboration is open.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Building Trust for AI Agents — ISM-X: A Privacy-Preserving Identity Layer (with demo)</title>
      <dc:creator>Damjan Žakelj</dc:creator>
      <pubDate>Fri, 10 Oct 2025 18:34:05 +0000</pubDate>
      <link>https://dev.to/damjan_akelj_be1aab4a715/building-trust-for-ai-agents-ism-x-a-privacy-preserving-identity-layer-with-demo-4ifj</link>
      <guid>https://dev.to/damjan_akelj_be1aab4a715/building-trust-for-ai-agents-ism-x-a-privacy-preserving-identity-layer-with-demo-4ifj</guid>
      <description>&lt;p&gt;In distributed AI systems, continuity and trust are hard problems.&lt;br&gt;
An agent that restarts, migrates, or forks can lose its identity.&lt;br&gt;
ISM-X is our answer — a small, privacy-preserving layer that combines cryptographic identity (DID) and attestation (HMAC over commitment).&lt;/p&gt;

&lt;p&gt;1. What we share&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reference code (Apache-2.0, ~250 lines)&lt;/li&gt;
&lt;li&gt;Ed25519-signed passports&lt;/li&gt;
&lt;li&gt;HMAC tag over pre-hashed commitments (no raw metrics)&lt;/li&gt;
&lt;li&gt;Time/TTL, revocation, constant-time verification&lt;/li&gt;
&lt;/ul&gt;
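&lt;p&gt;A minimal sketch of the attestation flow using only the Python standard library (field names, payload layout, and the demo key constant are assumptions for illustration; the repo's real passport format also carries an Ed25519 signature, omitted here). It mirrors the listed properties: pre-hashed commitments, TTL, and constant-time verification:&lt;/p&gt;

```python
import hashlib
import hmac
import json
import time

DEMO_KEY = b"DEMO_KEY_DO_NOT_USE"  # sandbox key, echoing the demo's placeholder

def commit(metrics):
    # Pre-hash the raw metrics so the passport never carries them directly.
    blob = json.dumps(metrics, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def issue_tag(commitment, ttl_s=300, now=None):
    now = int(now if now is not None else time.time())
    payload = f"{commitment}|{now}|{now + ttl_s}".encode()
    tag = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    return {"commitment": commitment, "iat": now, "exp": now + ttl_s, "tag": tag}

def verify_tag(passport, now=None):
    now = int(now if now is not None else time.time())
    if now >= passport["exp"]:
        return False  # TTL expired
    payload = f"{passport['commitment']}|{passport['iat']}|{passport['exp']}".encode()
    expected = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison, as described above.
    return hmac.compare_digest(expected, passport["tag"])
```

&lt;p&gt;The verifier only ever sees the commitment hash, so proprietary metrics never leave the agent.&lt;/p&gt;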

&lt;p&gt;2. What we don’t share&lt;/p&gt;

&lt;p&gt;Any private resonance metrics or production keys.&lt;br&gt;
The demo uses &lt;code&gt;DEMO_KEY_DO_NOT_USE&lt;/code&gt;, safe for sandboxing.&lt;/p&gt;

&lt;p&gt;3. Run the demo&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/Freeky7819/ismx-authy
cd ismx-authy
python ismx_open_demo.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You’ll see the passport issuance, signature verification, and audit log in action.&lt;/p&gt;

&lt;p&gt;4. Why this matters&lt;/p&gt;

&lt;p&gt;ISM-X bridges two domains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identity: persistent cryptographic DIDs.&lt;/li&gt;
&lt;li&gt;Integrity: attestations that don’t leak proprietary state.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a foundational step for local-first, privacy-preserving AI systems.&lt;/p&gt;

&lt;p&gt;5. What’s next&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;3-of-5 policy quorum&lt;/li&gt;
&lt;li&gt;FROST/BLS threshold signatures&lt;/li&gt;
&lt;li&gt;optional ZK-commit proofs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 GitHub – ISM-X Demo Public Pack v1&lt;/p&gt;

&lt;p&gt;License: Apache-2.0&lt;br&gt;
Author: Freedom (Damjan)&lt;/p&gt;

</description>
      <category>ai</category>
      <category>blockchain</category>
      <category>security</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
