
Katharina Rückbrodt
When swarm intelligence meets electricity data — and what goes wrong

A report on a hobby project that turned out to be more interesting than planned.


I actually just wanted to see what you could do with real German electricity data.

The Federal Network Agency’s SMARD platform provides hourly time-series data free of charge: wind power, photovoltaics, natural gas, pumped storage, biomass, load. All of it publicly available — a dataset just begging to be experimented with.

What followed was a classic journey of discovery: every answer led to a new question. This article sets out what I learnt along the way — for myself, and for anyone who thinks along similar lines.


Step 1: The naive idea — simply setting thresholds

My first approach was the obvious one: If natural gas exceeds X MW, then trigger an alarm. Classic threshold-based anomaly detection.
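In code, the naive rule really is a one-liner; the threshold and the values below are made up.

```python
import numpy as np

THRESHOLD_MW = 9000  # hypothetical limit, chosen by hand

gas_mw = np.array([7200, 8100, 9500, 8800, 9700])  # made-up hourly values

# classic threshold-based anomaly detection: value above limit -> alarm
alarms = gas_mw > THRESHOLD_MW
print(alarms.tolist())  # [False, False, True, False, True]
```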

The problem with this: you only detect what you already know. Whoever sets the threshold also determines what counts as ‘normal’. Unknown anomalies — precisely the ones of interest — fall through the cracks.

Furthermore, the approach is blind to context. Natural gas at 8,000 MW can be completely normal. Or critical — depending on what wind, solar and the grid load are doing at the same time.

That felt wrong.


Step 2: Graph theory — time series as a network

The real question is: How are these time series connected?

This is where graph theory comes into play. A graph consists of nodes and edges. In SwarmGrid, each time series is a node — wind power, natural gas, photovoltaics, pumped storage, load. The edges between them arise from correlations: if two time series behave similarly, they form a strong connection.

Wind power ──── Load
    │             │
Photovoltaics  Pumped storage
         └──── Natural gas

The result is a network of dependencies. A topology of the energy system that emerges from the data itself — not predefined, but learned.

For this, I used NetworkX — a Python library designed precisely for this purpose.


Step 3: The Framework — GDN

The problem: I didn’t know how to meaningfully derive anomalies from a correlation matrix.

Then I came across Graph Deviation Networks (GDN) — a research-based approach that does exactly that: each node learns to predict its own behaviour based on the behaviour of its neighbours. Does a node deviate significantly from the prediction? Anomaly.

I implemented a simplified variant — LightweightGDN — using ridge regression per node instead of a full neural network. There is certainly still room for improvement...
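A toy version of that idea (not the project's actual implementation): one ridge model per node, predicting it from all the other nodes, with the anomaly score as the residual in units of the training residual spread. For simplicity, every other node counts as a neighbour here.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))                           # 4 time series, columns = nodes
X[:, 3] = X[:, 0] + X[:, 1] + rng.normal(0, 0.1, 500)   # node 3 follows nodes 0 and 1

# one ridge model per node, trained on the other nodes ("neighbours")
models, scales = [], []
for node in range(X.shape[1]):
    neighbours = np.delete(X, node, axis=1)
    m = Ridge(alpha=1.0).fit(neighbours, X[:, node])
    models.append(m)
    scales.append((X[:, node] - m.predict(neighbours)).std())

def anomaly_scores(row):
    """Deviation of each node from its neighbour-based prediction, in residual units."""
    return np.array([
        abs(row[node] - m.predict(np.delete(row, node).reshape(1, -1))[0]) / s
        for node, (m, s) in enumerate(zip(models, scales))
    ])

normal = anomaly_scores(np.array([0.2, -0.1, 0.0, 0.1]))  # node 3 fits its neighbours
weird = anomaly_scores(np.array([0.2, -0.1, 0.0, 5.0]))   # same context, node 3 way off
print(weird[3] > normal[3])  # True
```

Note that node 3's raw value of 5.0 is not extreme in itself — it is only anomalous relative to what its neighbours predict, which is exactly the contextual behaviour a fixed threshold misses.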


Step 4: Swarm behaviour — why the name

Now comes the part that really fascinated me.

Swarm intelligence describes the phenomenon whereby many simple units together generate complex, intelligent behaviour — without any central control. Starlings, fish, ants.

The principle can be applied to time series:

  • Each time series observes its learned neighbours in the network
  • No node has global knowledge
  • Anomalies arise not from absolute values, but from deviations from collective behaviour
Traditional:   Time series X > threshold → Alarm
             → Only detects what is already known

SwarmGrid:   Time series X deviates from neighbours Y, Z → Score
             → Also detects unknown anomalies

Example: Natural gas is running at 8,000 MW — absolutely unremarkable. But all neighbours (load, pumped storage, wind) show a pattern that would fit 15,000 MW. Score = 2.6 → CRITICAL.


Step 5: The big picture

SwarmGrid architecture diagram: SMARD data flows into a Graph Deviation Network, which learns and trains the topology and calculates anomaly scores. The scores are explained using SHAP and visualised as swarm recommendations in a Streamlit dashboard.


Step 6: The explainability problem — and XAI

At some point, the question arose: Why is this node anomalous?

A score alone is not enough. “Anomaly score 2.6” is not an explanation. This is where Explainable AI (XAI) comes into play.

SHAP (SHapley Additive exPlanations) is a method from game theory that explains how much each feature contributed to the prediction. Specifically: Which neighbouring time series caused the deviation?

import shap

explainer = shap.LinearExplainer(model, X_train)
shap_values = explainer.shap_values(X_node)
# → Natural gas anomaly explained 73% by pumped storage behaviour

Alternatives to SHAP that I have looked at:

Method                 Approach                     Strengths
SHAP                   Shapley values               Locally and globally explainable, widely used
LIME                   Local linear approximation   Fast, good for black-box models
Integrated Gradients   Gradient-based               Good for neural networks

For Ridge Regression, SHAP with the LinearExplainer is the natural choice — fast and accurate.


Step 7: Streamlit — surprisingly fast

Naturally, I wanted to see what the whole thing looked like. With Streamlit, that was no problem at all.

You can create an interactive dashboard in just a few lines of Python — time series plots, anomaly scores, SHAP bars, all in real time, without any front-end knowledge and without JavaScript.

Screenshot of the Streamlit dashboard.

For the publication, I used the Streamlit Community Cloud (free) and the result is iframe-compatible — meaning it can be embedded into other dashboards.


Step 8: No ML, no LLM — so what is it then?

A question I asked myself.

Machine Learning (ML) in the traditional sense optimises a model on a labelled dataset — it learns to predict a specific output. SwarmGrid has no labels, no target variable. It learns patterns, not classes.

Large Language Models (LLMs) such as GPT or Claude process language and generate text. They are not designed for time series anomaly detection.

SwarmGrid is more akin to unsupervised learning combined with graph theory: the system learns what is ‘normal’ — and flags anything that deviates from it. No human has defined what constitutes an anomaly.


The honest part: What I realised too late

My anomaly scores for photovoltaics looked abysmal. Why? It took me a while to realise that I wasn’t distinguishing between day and night — and that’s obviously a problem with photovoltaics!

Photovoltaics produce nothing at night, and renewable energies follow a strong daily cycle. My model ‘learned’ at night that PV sits at zero — and then produced high scores during the day, when all the other nodes are high as well.


What to do better:

  • Add hourly features as input (hour_of_day, is_night)
  • Train separate models for day and night operations
  • Explicitly model seasonality (e.g. using STL decomposition)
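The first of those fixes is cheap. A sketch with a synthetic PV curve; the column names are mine.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=48, freq="h")
pv = np.clip(np.sin((idx.hour - 6) / 12 * np.pi), 0, None) * 1000  # synthetic PV curve
df = pd.DataFrame({"pv_mw": pv}, index=idx)

# time-of-day features the model could have used as extra inputs
df["hour_of_day"] = df.index.hour
df["is_night"] = ((df.index.hour < 6) | (df.index.hour >= 20)).astype(int)

# sanity check: the synthetic PV output at night is zero
print(df.loc[df["is_night"] == 1, "pv_mw"].max())  # 0.0
```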

Outlook — where we could go deeper

  • Incorporate real network topology: Use the physical topology of a distribution network as a graph instead of learned correlations
  • Graph Neural Networks (GNN): Full GNN implementation using PyTorch Geometric instead of Ridge Regression
  • Time-window awareness: Model knows time of day, season, public holidays
  • Bridge node analysis: Which time series is the ‘key node’ in the network — if it fails, does the swarm collapse?
  • Real-time streaming: Continuous anomaly detection using Apache Kafka or similar, rather than batch processing

Stack overview

Component   Technology
Data        SMARD API (Federal Network Agency)
Graph       NetworkX
ML          scikit-learn Ridge Regression
XAI         SHAP LinearExplainer
UI          Streamlit + Plotly
Deploy      Streamlit Community Cloud

GitHub: lady-logic/swarmgrid


If you’re conducting similar experiments or have ideas for the future — issues and stars are welcome. The project thrives on curiosity.
