A report on a hobby project that turned out to be more interesting than planned.
I actually just wanted to see what you could do with real German electricity data.
The Federal Network Agency’s SMARD platform provides hourly time-series data free of charge: wind power, photovoltaics, natural gas, pumped storage, biomass, load. All of it publicly available, a dataset just begging to be experimented with.
What followed was a classic journey of discovery: every answer led to a new question. This article sets out what I learnt along the way — for myself, and for anyone who thinks along similar lines.
Step 1: The naive idea — simply setting thresholds
My first approach was the obvious one: If natural gas exceeds X MW, then trigger an alarm. Classic threshold-based anomaly detection.
The problem with this: you only detect what you already know. Whoever sets the threshold also determines what counts as ‘normal’. Unknown anomalies — precisely the ones of interest — fall through the cracks.
Furthermore, the approach is blind to context. Natural gas at 8,000 MW can be completely normal. Or critical — depending on what wind, solar and the grid load are doing at the same time.
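As a sketch, the naive rule really is just a comparison (the threshold value and the sample numbers here are made up for illustration):

```python
import numpy as np

def threshold_alarm(series_mw, threshold_mw=8000):
    """Flag every hour in which generation exceeds a fixed threshold."""
    return np.asarray(series_mw) > threshold_mw

# Hypothetical hourly natural-gas output in MW:
gas = [6500, 7200, 8100, 9000, 7800]
print(threshold_alarm(gas).tolist())  # [False, False, True, True, False]
```

The rule knows nothing about wind, solar or load at the same hour, which is exactly the context problem described above.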
That felt wrong.
Step 2: Graph theory — time series as a network
The real question is: How are these time series connected?
This is where graph theory comes into play. A graph consists of nodes and edges. In SwarmGrid, each time series is a node — wind power, natural gas, photovoltaics, pumped storage, load. The edges between them arise from correlations: if two time series behave similarly, they form a strong connection.
```
Wind power ──── Load
    │             │
Photovoltaics   Pumped storage
    └──── Natural gas
```
The result is a network of dependencies. A topology of the energy system that emerges from the data itself — not predefined, but learned.
For this, I used NetworkX — a Python library designed precisely for this purpose.
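A minimal sketch of building such a correlation graph with NetworkX (the synthetic series and the 0.3 correlation cutoff are my assumptions for illustration, not values from the project):

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# Hypothetical hourly series; in SwarmGrid these come from SMARD.
load = rng.normal(60000, 5000, 500)
wind = 0.5 * load + rng.normal(0, 3000, 500)  # correlated with load
gas = rng.normal(8000, 1000, 500)             # independent here

series = {"load": load, "wind": wind, "gas": gas}
names = list(series)

G = nx.Graph()
G.add_nodes_from(names)

# Add an edge wherever |Pearson correlation| exceeds the cutoff.
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = np.corrcoef(series[a], series[b])[0, 1]
        if abs(r) > 0.3:
            G.add_edge(a, b, weight=abs(r))

print(sorted(G.edges))  # only the correlated pair forms an edge
```

The topology is learned in exactly this sense: no edge is predefined; each one exists only because the data supports it.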
Step 3: The Framework — GDN
The problem: I didn’t know how to meaningfully derive anomalies from a correlation matrix.
Then I came across Graph Deviation Networks (GDN) — a research-based approach that does exactly that: each node learns to predict its own behaviour based on the behaviour of its neighbours. Does a node deviate significantly from the prediction? Anomaly.
I implemented a simplified variant, LightweightGDN, using ridge regression per node instead of a full neural network. There is certainly still room for improvement...
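The per-node ridge idea can be sketched like this (the function names and the toy data are hypothetical, not the project’s actual code):

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_node_models(X, neighbours):
    """One ridge model per node: predict the node from its graph neighbours.

    X: (n_samples, n_nodes) array; neighbours: node index -> neighbour indices.
    """
    models = {}
    for node, nbrs in neighbours.items():
        models[node] = Ridge(alpha=1.0).fit(X[:, nbrs], X[:, node])
    return models

def deviation_scores(models, X, neighbours):
    """Z-score of each node's prediction residual."""
    scores = {}
    for node, m in models.items():
        resid = X[:, node] - m.predict(X[:, neighbours[node]])
        scores[node] = (resid - resid.mean()) / (resid.std() + 1e-9)
    return scores

# Toy demo: node 2 is (roughly) the sum of nodes 0 and 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=200)
models = fit_node_models(X, {2: [0, 1]})
X[50, 2] += 5.0  # inject a deviation at "hour" 50
scores = deviation_scores(models, X, {2: [0, 1]})
print(scores[2][50] > 3.0)  # the injected hour stands out
```

Each node only ever sees its neighbours, which is the GDN principle: deviate significantly from what the neighbourhood predicts, and you get a high score.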
Step 4: Swarm behaviour — why the name
Now comes the part that really fascinated me.
Swarm intelligence describes the phenomenon whereby many simple units together generate complex, intelligent behaviour — without any central control. Starlings, fish, ants.
The principle can be applied to time series:
- Each time series observes its learned neighbours in the network
- No node has global knowledge
- Anomalies arise not from absolute values, but from deviations from collective behaviour
```
Traditional:  Time series X > threshold                  → Alarm
              → only detects what is already known

SwarmGrid:    Time series X deviates from neighbours Y, Z → Score
              → also detects unknown anomalies
```
Example: Natural gas is running at 8,000 MW — absolutely unremarkable. But all neighbours (load, pumped storage, wind) show a pattern that would fit 15,000 MW. Score = 2.6 → CRITICAL.
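As a back-of-the-envelope version of this example: the score is the deviation from the neighbours’ implied value, measured in units of the typical residual. The residual standard deviation of 2,700 MW below is a made-up value chosen purely so the numbers work out; it is not from the project:

```python
def swarm_score(observed_mw, predicted_mw, resid_std_mw):
    """Deviation of a node from what its neighbours imply, in std units."""
    return abs(observed_mw - predicted_mw) / resid_std_mw

# Natural gas at 8,000 MW while the neighbours would fit 15,000 MW:
print(round(swarm_score(8000, 15000, 2700), 1))  # 2.6
```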
Step 5: The big picture
Step 6: The explainability problem — and XAI
At some point, the question arose: Why is this node anomalous?
A score alone is not enough. “Anomaly score 2.6” is not an explanation. This is where Explainable AI (XAI) comes into play.
SHAP (SHapley Additive exPlanations) is a method from game theory that explains how much each feature contributed to the prediction. Specifically: Which neighbouring time series caused the deviation?
```python
import shap

explainer = shap.LinearExplainer(model, X_train)
shap_values = explainer.shap_values(X_node)
# → Natural gas anomaly explained 73% by pumped storage behaviour
```
Alternatives to SHAP that I have looked at:
| Method | Approach | Strengths |
|---|---|---|
| SHAP | Shapley values | Locally and globally explainable, widely used |
| LIME | Local linear approximation | Fast, good for black-box models |
| Integrated Gradients | Gradient-based | Good for neural networks |
For Ridge Regression, SHAP with the LinearExplainer is the natural choice — fast and accurate.
Step 7: Streamlit — surprisingly fast
Naturally, I wanted to see what the whole thing looked like. With Streamlit, that was no problem at all.
With Streamlit, you can create an interactive dashboard in just a few lines of Python — time series plots, anomaly scores, SHAP bars, all in real time (without any front-end knowledge and without JavaScript).
For the publication, I used the Streamlit Community Cloud (free) and the result is iframe-compatible — meaning it can be embedded into other dashboards.
Step 8: No ML, no LLM — so what is it then?
A question I asked myself.
Machine Learning (ML) in the traditional sense optimises a model on a labelled dataset — it learns to predict a specific output. SwarmGrid has no labels, no target variable. It learns patterns, not classes.
Large Language Models (LLMs) such as GPT or Claude process language and generate text. They are not designed for time series anomaly detection.
SwarmGrid is more akin to unsupervised learning combined with graph theory: the system learns what is ‘normal’ — and flags anything that deviates from it. No human has defined what constitutes an anomaly.
The honest part: What I realised too late
My anomaly scores for photovoltaics looked abysmal. Why on earth? It took me a while to realise that I wasn’t distinguishing between day and night, which is obviously a problem with photovoltaics!
Photovoltaics produce nothing at night. Renewable energies follow a strong daily cycle. My model ‘learned’ at night that PV is at zero — and produces high scores during the day when all other nodes are also high.
What to do better:
- Add hourly features as input (`hour_of_day`, `is_night`)
- Train separate models for day and night operation
- Explicitly model seasonality (e.g. using STL decomposition)
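The first point is quick to sketch with pandas (the 21:00–06:00 night window is an arbitrary choice for illustration, and the index is hypothetical; the real series come from SMARD):

```python
import numpy as np
import pandas as pd

# Hypothetical hourly index over two days:
idx = pd.date_range("2024-06-01", periods=48, freq="h")
df = pd.DataFrame({"pv_mw": np.zeros(48)}, index=idx)

# Time-of-day features the model could take as additional input:
df["hour_of_day"] = df.index.hour
df["is_night"] = (df["hour_of_day"] < 6) | (df["hour_of_day"] >= 21)

print(df["is_night"].iloc[0], df["is_night"].iloc[12])  # True False
```

With these columns in the feature matrix, the model can at least distinguish a PV zero at 03:00 from a PV zero at noon.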
Outlook — where we could go deeper
- Incorporate real network topology: Use the physical topology of a distribution network as a graph instead of learned correlations
- Graph Neural Networks (GNN): Full GNN implementation using PyTorch Geometric instead of Ridge Regression
- Time-window awareness: Model knows time of day, season, public holidays
- Bridge node analysis: Which time series is the ‘key node’ in the network — if it fails, does the swarm collapse?
- Real-time streaming: Continuous anomaly detection using Apache Kafka or similar, rather than batch processing
Stack overview
| Component | Technology |
|---|---|
| Data | SMARD API (Federal Network Agency) |
| Graph | NetworkX |
| ML | scikit-learn Ridge Regression |
| XAI | SHAP LinearExplainer |
| UI | Streamlit + Plotly |
| Deploy | Streamlit Community Cloud |
→ GitHub: lady-logic/swarmgrid
If you’re conducting similar experiments or have ideas for the future — issues and stars are welcome. The project thrives on curiosity.


