<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Katharina Rückbrodt</title>
    <description>The latest articles on DEV Community by Katharina Rückbrodt (@ladylogic).</description>
    <link>https://dev.to/ladylogic</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3862321%2F8a390f8b-49fd-439f-9da0-498ef84c0731.jpeg</url>
      <title>DEV Community: Katharina Rückbrodt</title>
      <link>https://dev.to/ladylogic</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ladylogic"/>
    <language>en</language>
    <item>
      <title>When swarm intelligence meets electricity data — and what goes wrong</title>
      <dc:creator>Katharina Rückbrodt</dc:creator>
      <pubDate>Tue, 07 Apr 2026 16:59:30 +0000</pubDate>
      <link>https://dev.to/ladylogic/when-swarm-intelligence-meets-electricity-data-and-what-goes-wrong-4hjj</link>
      <guid>https://dev.to/ladylogic/when-swarm-intelligence-meets-electricity-data-and-what-goes-wrong-4hjj</guid>
      <description>&lt;p&gt;&lt;em&gt;A report on a hobby project that turned out to be more interesting than planned.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I actually just wanted to see what you could do with real German electricity data.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.smard.de" rel="noopener noreferrer"&gt;Federal Network Agency’s SMARD platform&lt;/a&gt; provides hourly time-series data free of charge: wind power, photovoltaics, natural gas, pumped storage, biomass, load. Everything is publicly available. A dataset just begging to be experimented with.&lt;/p&gt;

&lt;p&gt;What followed was a classic journey of discovery: every answer led to a new question. This article sets out what I learnt along the way — for myself, and for anyone who thinks along similar lines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: The naive idea — simply setting thresholds
&lt;/h2&gt;

&lt;p&gt;My first approach was the obvious one: &lt;em&gt;If natural gas exceeds X MW, then trigger an alarm.&lt;/em&gt; Classic threshold-based anomaly detection.&lt;/p&gt;

&lt;p&gt;The problem with this: you only detect what you already know. Whoever sets the threshold also determines what counts as ‘normal’. Unknown anomalies — precisely the ones of interest — fall through the cracks.&lt;/p&gt;

&lt;p&gt;Furthermore, the approach is &lt;em&gt;blind to context&lt;/em&gt;. Natural gas at 8,000 MW can be completely normal. Or critical — depending on what wind, solar and the grid load are doing at the same time.&lt;/p&gt;

&lt;p&gt;That felt wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Graph theory — time series as a network
&lt;/h2&gt;

&lt;p&gt;The real question is: How are these time series &lt;em&gt;connected&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;graph theory&lt;/strong&gt; comes into play. A graph consists of nodes and edges. In SwarmGrid, as I called the project, each time series is a node (wind power, natural gas, photovoltaics, pumped storage, load). The edges between them arise from &lt;strong&gt;correlations&lt;/strong&gt;: if two time series behave similarly, they form a strong connection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Wind power ──── Load
    │             │
Photovoltaics  Pumped storage
         └──── Natural gas
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result is a &lt;strong&gt;network of dependencies&lt;/strong&gt;. A topology of the energy system that emerges from the data itself — not predefined, but learned.&lt;/p&gt;

&lt;p&gt;For this, I used &lt;a href="https://networkx.org/" rel="noopener noreferrer"&gt;NetworkX&lt;/a&gt;, a Python library for building and analysing graphs.&lt;/p&gt;
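
&lt;p&gt;A minimal sketch of how such a correlation graph could be built, assuming the hourly series sit in a pandas DataFrame with one column per source. The synthetic data, the column names, the helper &lt;code&gt;build_correlation_graph&lt;/code&gt; and the 0.5 cut-off are illustrative assumptions, not the values used in SwarmGrid:&lt;/p&gt;

```python
import networkx as nx
import numpy as np
import pandas as pd


def build_correlation_graph(df, min_abs_corr=0.5):
    """Connect two series when their absolute Pearson correlation is high enough."""
    corr = df.corr()
    graph = nx.Graph()
    graph.add_nodes_from(df.columns)
    columns = list(df.columns)
    for i, a in enumerate(columns):
        for b in columns[i + 1:]:
            weight = abs(corr.loc[a, b])
            if weight >= min_abs_corr:
                graph.add_edge(a, b, weight=float(weight))
    return graph


# Synthetic stand-in for the SMARD series (hypothetical values).
rng = np.random.default_rng(0)
load = rng.normal(50_000, 5_000, 500)
demo = pd.DataFrame({
    "load": load,
    "natural_gas": 0.2 * load + rng.normal(0, 300, 500),
    "wind": rng.normal(12_000, 4_000, 500),
})
graph = build_correlation_graph(demo)
print(sorted(graph.edges()))
```

&lt;p&gt;In this toy data only natural gas is constructed to follow the load, so that is the only pair that ends up connected. The cut-off decides how dense the learned topology becomes and is worth tuning on real data.&lt;/p&gt;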




&lt;h2&gt;
  
  
  Step 3: The Framework — GDN
&lt;/h2&gt;

&lt;p&gt;The problem: I didn’t know how to meaningfully derive anomalies from a correlation matrix.&lt;/p&gt;

&lt;p&gt;Then I came across &lt;strong&gt;Graph Deviation Networks (GDN)&lt;/strong&gt; — a research-based approach that does exactly that: each node learns to predict its own behaviour based on the behaviour of its neighbours. Does a node deviate significantly from the prediction? Anomaly.&lt;/p&gt;

&lt;p&gt;I implemented a simplified variant, &lt;strong&gt;LightweightGDN&lt;/strong&gt;, which uses one ridge regression per node instead of a full neural network. There is certainly still room for improvement.&lt;/p&gt;
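
&lt;p&gt;A rough sketch of the per-node idea, assuming each node is predicted from the current values of its graph neighbours with scikit-learn’s &lt;code&gt;Ridge&lt;/code&gt;, and that the score is the absolute residual divided by the standard deviation of the training residuals. The function names, the scoring rule and the synthetic data are assumptions for illustration; the actual LightweightGDN may differ.&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import Ridge


def fit_node_model(X_neighbours, y_node, alpha=1.0):
    """Fit one ridge model that predicts a node from its neighbours."""
    model = Ridge(alpha=alpha).fit(X_neighbours, y_node)
    residual_std = float(np.std(y_node - model.predict(X_neighbours)))
    return model, residual_std


def deviation_score(model, residual_std, x_neighbours, y_observed):
    """Absolute prediction error in units of the typical training error."""
    predicted = model.predict(np.atleast_2d(x_neighbours))[0]
    return abs(y_observed - predicted) / residual_std


# Synthetic example: the node normally follows its two neighbours.
rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(400, 2))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0.0, 0.1, 400)
model, residual_std = fit_node_model(X, y)

normal_score = deviation_score(model, residual_std, [1.0, 1.0], 1.0)
odd_score = deviation_score(model, residual_std, [1.0, 1.0], 3.0)
print(round(normal_score, 2), round(odd_score, 2))
```

&lt;p&gt;The second observation sits far from what the neighbours suggest, so it receives a much higher score than the first, even though neither value is extreme on its own.&lt;/p&gt;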




&lt;h2&gt;
  
  
  Step 4: Swarm behaviour — why the name
&lt;/h2&gt;

&lt;p&gt;Now comes the part that really fascinated me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Swarm intelligence&lt;/strong&gt; describes the phenomenon whereby many simple units together generate complex, intelligent behaviour — without any central control. Starlings, fish, ants.&lt;/p&gt;

&lt;p&gt;The principle can be applied to time series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each time series &lt;em&gt;observes its learned neighbours in the network&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;No node has global knowledge&lt;/li&gt;
&lt;li&gt;Anomalies arise not from absolute values, but from &lt;em&gt;deviations from collective behaviour&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Traditional:   Time series X &amp;gt; threshold → Alarm
             → Only detects what is already known

SwarmGrid:   Time series X deviates from neighbours Y, Z → Score
             → Also detects unknown anomalies
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Natural gas is running at 8,000 MW, which is unremarkable in absolute terms. But its neighbours (load, pumped storage, wind) show a pattern that would fit 15,000 MW. Score = 2.6 → CRITICAL.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5: The big picture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Footxni4z2317io04210f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Footxni4z2317io04210f.png" alt="SwarmGrid architecture diagram: SMARD data flows into a Graph Deviation Network, which learns and trains the topology and calculates anomaly scores. The scores are explained using SHAP and visualised as swarm recommendations in a Streamlit dashboard." width="800" height="539"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 6: The explainability problem — and XAI
&lt;/h2&gt;

&lt;p&gt;At some point, the question arose: &lt;em&gt;Why is this node anomalous?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A score alone is not enough. “Anomaly score 2.6” is not an explanation. This is where &lt;strong&gt;Explainable AI (XAI)&lt;/strong&gt; comes into play.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SHAP&lt;/strong&gt; (SHapley Additive exPlanations) is a method based on Shapley values from cooperative game theory that quantifies how much each feature contributed to a prediction. Here: which neighbouring time series drove the deviation?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;shap&lt;/span&gt;

&lt;span class="n"&gt;explainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LinearExplainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;shap_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;explainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# → Natural gas anomaly explained 73% by pumped storage behaviour
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The XAI methods I looked at:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Strengths&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SHAP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shapley values&lt;/td&gt;
&lt;td&gt;Locally and globally explainable, widely used&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LIME&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Local linear approximation&lt;/td&gt;
&lt;td&gt;Fast, good for black-box models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Integrated Gradients&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gradient-based&lt;/td&gt;
&lt;td&gt;Good for neural networks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

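&lt;p&gt;For ridge regression, SHAP with the &lt;code&gt;LinearExplainer&lt;/code&gt; is the natural choice: it is fast, and for a linear model the attributions can be computed directly from the coefficients instead of being approximated by sampling.&lt;/p&gt;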
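
&lt;p&gt;For intuition: when the features of a linear model are treated as independent, the SHAP value of a feature is its coefficient times the feature’s distance from the background mean. A small sketch with NumPy only; the coefficients, values and the helper &lt;code&gt;linear_shap_values&lt;/code&gt; are made up for illustration and are not SwarmGrid outputs:&lt;/p&gt;

```python
import numpy as np


def linear_shap_values(coefficients, x, background_mean):
    """SHAP values of a linear model under the feature-independence assumption."""
    return np.asarray(coefficients) * (np.asarray(x) - np.asarray(background_mean))


# Hypothetical neighbours of one node and hypothetical ridge coefficients.
neighbours = ["load", "pumped_storage", "wind"]
coefficients = [0.5, 2.0, -0.25]
background_mean = [10.0, 1.0, 4.0]
x = [12.0, 3.0, 8.0]

phi = linear_shap_values(coefficients, x, background_mean)
share = np.abs(phi) / np.abs(phi).sum()
for name, value, part in zip(neighbours, phi, share):
    print(f"{name}: contribution {value:+.2f}, share {part:.0%}")
```

&lt;p&gt;The contributions add up to the difference between the prediction for this sample and the prediction at the background mean. A percentage per neighbour, as in the comment above, can be derived from the absolute contributions.&lt;/p&gt;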




&lt;h2&gt;
  
  
  Step 7: Streamlit — surprisingly fast
&lt;/h2&gt;

&lt;p&gt;Naturally, I wanted to see what the whole thing looked like.&lt;/p&gt;

&lt;p&gt;With &lt;a href="https://streamlit.io" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;, you can create an interactive dashboard in just a few lines of Python — time series plots, anomaly scores, SHAP bars, all in real time (without any front-end knowledge and without JavaScript).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1sj5923zge3o4xt94btc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1sj5923zge3o4xt94btc.png" alt="Streamlit picture" width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To publish it, I used the free Streamlit Community Cloud. The deployed app is iframe-compatible, so it can be embedded in other dashboards.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 8: No ML, no LLM — so what is it then?
&lt;/h2&gt;

&lt;p&gt;A question I asked myself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Machine Learning (ML)&lt;/strong&gt; in the traditional, supervised sense optimises a model on a labelled dataset: it learns to predict a specific output. SwarmGrid has no anomaly labels; nobody tells it which hours are anomalous. The per-node regressions take their targets from the data itself, so it learns patterns, not classes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Large Language Models (LLMs)&lt;/strong&gt; such as GPT or Claude process language and generate text. They are not designed for time series anomaly detection.&lt;/p&gt;

&lt;p&gt;SwarmGrid is more akin to &lt;strong&gt;unsupervised learning&lt;/strong&gt; (itself a branch of ML, just without labels) combined with &lt;strong&gt;graph theory&lt;/strong&gt;: the system learns what is ‘normal’ and flags anything that deviates from it. No human has defined what constitutes an anomaly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The honest part: What I realised too late
&lt;/h2&gt;

&lt;p&gt;My anomaly scores for photovoltaics looked abysmal, and at first I couldn’t see why. Then I realised that I wasn’t distinguishing between day and night, which is obviously a problem for photovoltaics!&lt;/p&gt;

&lt;p&gt;Photovoltaics produce nothing at night, so solar generation follows a strong daily cycle. My model ‘learned’ from the night hours that PV sits at zero and then produced high scores during the day, when all other nodes are also high.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgf4psweslv0xvixalyqz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgf4psweslv0xvixalyqz.jpg" alt="Crying cat" width="300" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do better:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add time-of-day features as input (&lt;code&gt;hour_of_day&lt;/code&gt;, &lt;code&gt;is_night&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Train separate models for day and night&lt;/li&gt;
&lt;li&gt;Explicitly model seasonality (e.g. using STL decomposition)&lt;/li&gt;
&lt;/ul&gt;
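
&lt;p&gt;The first item could look roughly like this with pandas, assuming the data has an hourly &lt;code&gt;DatetimeIndex&lt;/code&gt;. Treating 06:00 to 19:59 as day is a crude, fixed assumption for illustration (real sunrise and sunset times shift with the season), and the helper &lt;code&gt;add_time_features&lt;/code&gt; is hypothetical:&lt;/p&gt;

```python
import pandas as pd


def add_time_features(df, day_start=6, day_end=20):
    """Return a copy with hour_of_day and is_night columns derived from the index."""
    out = df.copy()
    out["hour_of_day"] = out.index.hour
    is_day = out["hour_of_day"].isin(range(day_start, day_end))
    out["is_night"] = (~is_day).astype(int)
    return out


# Two days of hypothetical hourly PV values.
index = pd.date_range("2026-01-01", periods=48, freq="h")
pv = pd.DataFrame({"photovoltaics": 0.0}, index=index)
features = add_time_features(pv)
print(features.iloc[[0, 12]])
```

&lt;p&gt;The same &lt;code&gt;is_night&lt;/code&gt; column could also be used to split the data for the second item, with one model trained per subset.&lt;/p&gt;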




&lt;h2&gt;
  
  
  Outlook — where we could go deeper
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Incorporate real network topology:&lt;/strong&gt; Use the physical topology of a distribution network as a graph instead of learned correlations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph Neural Networks (GNN):&lt;/strong&gt; Full GNN implementation using PyTorch Geometric instead of Ridge Regression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-window awareness:&lt;/strong&gt; Make the model aware of time of day, season and public holidays&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bridge node analysis:&lt;/strong&gt; Which time series is the ‘key node’ in the network — if it fails, does the swarm collapse?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time streaming:&lt;/strong&gt; Continuous anomaly detection using Apache Kafka or similar, rather than batch processing&lt;/li&gt;
&lt;/ul&gt;
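
&lt;p&gt;The bridge-node question maps onto standard graph measures. A sketch with NetworkX on a hypothetical toy topology (the edges below are invented for illustration, not learned from SMARD data): articulation points are nodes whose removal disconnects the graph, and betweenness centrality ranks how often a node lies on shortest paths between other nodes.&lt;/p&gt;

```python
import networkx as nx

# Hypothetical toy topology, not the learned SwarmGrid graph.
graph = nx.Graph()
graph.add_edges_from([
    ("wind", "load"),
    ("photovoltaics", "load"),
    ("load", "pumped_storage"),
    ("pumped_storage", "natural_gas"),
])

cut_nodes = set(nx.articulation_points(graph))
centrality = nx.betweenness_centrality(graph)
key_node = max(centrality, key=centrality.get)
print(sorted(cut_nodes), key_node)
```

&lt;p&gt;On a learned correlation graph the result depends heavily on the edge cut-off, so it would make sense to check how stable the key node is across several thresholds.&lt;/p&gt;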




&lt;h2&gt;
  
  
  Stack overview
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data&lt;/td&gt;
&lt;td&gt;SMARD API (Federal Network Agency)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Graph&lt;/td&gt;
&lt;td&gt;NetworkX&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML&lt;/td&gt;
&lt;td&gt;scikit-learn Ridge Regression&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XAI&lt;/td&gt;
&lt;td&gt;SHAP LinearExplainer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UI&lt;/td&gt;
&lt;td&gt;Streamlit + Plotly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy&lt;/td&gt;
&lt;td&gt;Streamlit Community Cloud&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;→ &lt;a href="https://github.com/lady-logic/swarmgrid" rel="noopener noreferrer"&gt;GitHub: lady-logic/swarmgrid&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you’re conducting similar experiments or have ideas for the future — issues and stars are welcome. The project thrives on curiosity.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>python</category>
    </item>
  </channel>
</rss>
