When you upload a spreadsheet to ThresholdIQ, nine separate machine learning methods run simultaneously across every column in your data. Each one is looking for a different type of problem. Some catch sudden spikes. Others find slow drift. One looks for sensors that have frozen. Another watches for two metrics that normally move together suddenly moving apart.
Most people don't need to know how any of this works — they just want the anomaly flagged. But if you've ever wondered "why did **ThresholdIQ** flag that?" or "what would it miss?", this guide is for you. Each method gets a plain-English explanation, a concrete worked example, and an honest summary of what it catches and what it doesn't.
Contents
- Multi-Window Z-Score — the primary severity driver
- EWMA Spike Detection — sudden event catcher
- SARIMA Seasonal Residuals — seasonality-aware detection
- Isolation Forest — multivariate outlier detection
- Correlation Deviation — correlated failure detection
- DBSCAN Cluster Noise — behavioural outlier detection
- Seasonal Baseline — time-of-day / day-of-week context
- Trend Detection — gradual drift early warning
- Stuck & Zero Detection — sensor failure & line halt

How the 9 methods work together

Each method runs independently and produces a score. ThresholdIQ fuses these scores using a weighted formula — multi-window Z-score drives the primary severity, and the other 8 methods can only boost severity, never reduce it. This means a false positive from one method can't override a clean result from the others — but genuine anomalies that multiple methods agree on escalate quickly to Critical or Emergency.
Method 1 of 9
Multi-Window Z-Score
The primary severity driver — compares every data point to multiple time horizons simultaneously
Plain-English explanation
A Z-score answers one question: "How far is this value from normal, measured in standard deviations?" A Z-score of 0 means perfectly average. A Z-score of 3 means three standard deviations above average — rare for most distributions. Most single-metric alert systems use one Z-score window. **ThresholdIQ** runs four windows in parallel: 50 points (short-term), 100 points (mid-term), 200 points (long-term), and 500 points (very long-term). Each window has its own rolling mean and standard deviation.
Analogy: Imagine four weather forecasters, each looking at a different time period — the last week, the last month, the last quarter, and the last year. If all four agree that today's temperature is unusually high, you can be very confident it really is abnormal. If only the "last week" forecaster flags it, it might just be a warm spell. Multi-window Z-score works the same way: agreement across windows = high confidence = higher severity.
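The multi-window comparison can be sketched in a few lines of Python. The window sizes (50/100/200/500) come from this article; the 3-sigma breach cutoff and the exact rolling statistics are illustrative assumptions, not ThresholdIQ's internal implementation:

```python
from statistics import mean, stdev

def multi_window_zscores(series, windows=(50, 100, 200, 500), threshold=3.0):
    """For the latest point, compute a z-score against each trailing window.

    Returns {window_size: z_score} plus the number of breached windows
    (|z| > threshold). The 3-sigma threshold is an illustrative assumption.
    """
    latest = series[-1]
    zscores, breaches = {}, 0
    for w in windows:
        if len(series) < w + 1:
            continue  # not enough history for this horizon yet
        window = series[-(w + 1):-1]  # trailing window, excluding the latest point
        mu, sigma = mean(window), stdev(window)
        z = 0.0 if sigma == 0 else (latest - mu) / sigma
        zscores[w] = z
        if abs(z) > threshold:
            breaches += 1
    return zscores, breaches
```

The breach count then maps to severity exactly as described below: one window breached is a Warning, two is Critical, three or more is Emergency.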
How severity is determined
The number of windows simultaneously breached maps directly to severity level:
- W50 breach only → Warning
- W50 + W100 both breached → Critical
- W50 + W100 + W200 all breached → Emergency

Worked example — daily cash flow monitoring

Example data: Finance team daily cash inflow (A$000s)

| Day | Cash inflow | W50 Z-score | W100 Z-score | W200 Z-score | Result |
|-----|-------------|-------------|--------------|--------------|--------|
| Mon | $142k | 0.3 | 0.2 | 0.1 | Normal |
| Tue | $138k | -0.1 | -0.2 | -0.1 | Normal |
| Wed | $41k | 3.4 | 1.8 | 1.2 | ⚠️ Warning (W50 only) |
| Thu | $38k | 3.8 | 3.1 | 1.9 | 🟠 Critical (W50+W100) |
| Fri | $29k | 4.2 | 3.7 | 3.2 | 🔴 Emergency (all 3) |
Wednesday's drop fires a Warning — unusual in the short term but not yet confirmed. By Friday, all three windows agree the cash inflow is far below normal. The anomaly has persisted and escalated to Emergency. This is a structural problem, not a one-day blip.
Catches
Sudden large spikes or drops
Sustained deviations that persist over time
Values that are unusual at any time horizon
Escalating anomalies (gets worse each period)
Misses on its own
Seasonal patterns (Sunday lows look anomalous)
Gradual drift that's slow enough to shift the mean
Multi-metric relationships between columns
Method 2 of 9
EWMA Spike Detection
Exponentially Weighted Moving Average — catches sudden instantaneous events that resolve quickly
Plain-English explanation
EWMA (Exponentially Weighted Moving Average) is a special kind of average that gives more weight to recent data and less weight to older data. It creates a smoothed trend line through your data. Anything that deviates sharply from that smooth line — a sudden spike or crash — gets flagged.
Think of it as the difference between the trend and the actual value. EWMA subtracts the smoothed trend from each actual reading. What's left (the residual) tells you what's unexpected. When the residual exceeds 3 standard deviations, EWMA fires.
Analogy: You drive to work every day and the journey takes about 35 minutes. EWMA is like tracking your rolling average journey time, weighted towards recent trips. One day you hit an accident and it takes 90 minutes. The EWMA trend says "expected: 36 mins" — the actual is 90 mins — the residual is massive. Flag. Next day it's back to 34 minutes. The spike resolved quickly, but EWMA caught it the moment it happened.
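Here is a minimal sketch of the EWMA residual idea in Python. The smoothing factor (alpha), the 3-sigma cutoff, the warm-up length, and the choice to skip updating the trend with flagged outliers (so one spike doesn't trigger echo alerts) are all illustrative assumptions:

```python
def ewma_spike_detect(series, alpha=0.3, threshold=3.0, warmup=5):
    """Flag indices whose residual from the EWMA trend is more than
    `threshold` standard deviations away from past residuals."""
    flags = []
    ewma = series[0]
    residuals = []
    for i, x in enumerate(series[1:], start=1):
        residual = x - ewma
        if len(residuals) >= warmup:
            mu = sum(residuals) / len(residuals)
            sigma = (sum((r - mu) ** 2 for r in residuals) / len(residuals)) ** 0.5
            if sigma > 0 and abs(residual - mu) > threshold * sigma:
                flags.append(i)
                continue  # robust update: don't let the outlier drag the trend
        residuals.append(residual)
        ewma = alpha * x + (1 - alpha) * ewma  # recent data weighted most
    return flags
```

Because the trend line is not updated with the flagged value, the day after a one-off spike compares against the old, healthy trend and reads as normal — matching the "resolved, no ongoing false alerts" behaviour described below.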
Worked example — e-commerce daily revenue
Example data: Online store daily revenue (A$)
| Date | Actual revenue | EWMA trend | Residual | Result |
|------|----------------|------------|----------|--------|
| Mar 18 | $14,200 | $13,980 | +$220 | Normal |
| Mar 19 | $13,900 | $13,960 | -$60 | Normal |
| Mar 20 | $2,100 | $13,850 | -$11,750 | 🔴 Emergency spike |
| Mar 21 | $13,600 | $13,740 | -$140 | Normal (resolved) |
EWMA catches the Mar 20 revenue collapse on the same day it happens — before it shows up in any weekly report. It also correctly shows the next day as normal, so you don't get ongoing false alerts once the issue resolves. This was likely a payment gateway outage or checkout failure.
Catches
Sudden single-period spikes or crashes
Events that resolve quickly (transient anomalies)
Instantaneous sensor readings far from trend
Fast-reacting — fires within the same reporting period
Misses on its own
Gradual drift (the trend line adapts to drift)
Sustained long-term deviations
Patterns that emerge across multiple metrics
Method 3 of 9
SARIMA Seasonal Residuals
Seasonal AutoRegressive Integrated Moving Average — separates predictable patterns from real anomalies
Plain-English explanation
SARIMA is a statistical forecasting model that understands seasonality — predictable patterns that repeat at regular intervals. It learns from your historical data: "Electricity usage is always high on Monday mornings. Revenue always dips on Sundays. Production throughput is always lower on night shifts." It builds a model of what your data should look like at any given time based on those patterns.
Once SARIMA has learned the seasonal model, it computes what it expected at each point, and compares that to what actually happened. The difference (the residual) is what gets analysed for anomalies. A Sunday revenue figure that looks low by absolute standards might be perfectly normal for a Sunday — SARIMA knows this and won't flag it.
Analogy: A supermarket expects long checkout queues on Saturday afternoons — that's just how Saturdays work. SARIMA is the manager who knows all the normal busy and quiet patterns. If queues are long on a Wednesday at 11am (when it's normally quiet), SARIMA flags it. But it never flags Saturday afternoon queues as anomalous, because those are expected.
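A full SARIMA model needs a fitting library (statsmodels' SARIMAX is a common choice), but the core idea — subtract the seasonal expectation, then look for extreme residuals — can be sketched with naive seasonal differencing, where "expected" is simply the value one full season earlier. The weekly season length and 3-sigma cutoff are assumptions for illustration:

```python
def seasonal_residual_anomalies(series, season=7, threshold=3.0):
    """Flag indices whose value deviates extremely from the value one
    season earlier, relative to the spread of all such residuals.

    A real SARIMA model adds trend and autoregressive terms on top of
    this seasonal-differencing idea.
    """
    residuals = [series[i] - series[i - season] for i in range(season, len(series))]
    mu = sum(residuals) / len(residuals)
    sigma = (sum((r - mu) ** 2 for r in residuals) / len(residuals)) ** 0.5
    return [i + season for i, r in enumerate(residuals)
            if sigma > 0 and abs(r - mu) > threshold * sigma]
```

Note the residual comparison is to "same slot last season", so a quiet Sunday never looks anomalous next to busy weekdays — only a value that is wrong for its own slot gets flagged.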
Worked example — utility gas consumption
Example data: Residential gas consumption (GJ/day)
| Date | Day | Actual GJ | SARIMA expected | Residual | Result |
|------|-----|-----------|-----------------|----------|--------|
| Jun 1 | Mon | 42.1 | 41.8 | +0.3 | Normal |
| Jun 7 | Sun | 31.4 | 32.1 | -0.7 | Normal (Sundays are always lower) |
| Jun 8 | Mon | 44.2 | 42.0 | +2.2 | Normal (slight winter increase) |
| Jun 9 | Tue | 29.3 | 41.9 | -12.6 | ⚠️ Warning (unexpected low) |
Without SARIMA, Sunday's 31.4 GJ might look anomalously low compared to weekday averages of ~43 GJ. SARIMA knows Sundays are always lower and ignores it. But Tuesday's 29.3 GJ is far below the Tuesday expectation of 41.9 GJ — that's a real anomaly (possible meter fault or pipeline pressure issue).
Catches
Anomalies that break seasonal patterns
Events that look normal in absolute terms but are wrong for the time period
Day-of-week and hour-of-day deviations
Misses on its own
Anomalies that follow the seasonal pattern (a "seasonal" anomaly)
Very short data histories (needs at least 2 full seasonal cycles)
Multi-metric relationships
Method 4 of 9
Isolation Forest
Unsupervised ML model — detects globally unusual combinations across all metrics simultaneously
Plain-English explanation
Isolation Forest is an unsupervised machine learning algorithm that works by trying to isolate each data point from the rest of the dataset using random cuts. Normal points — those that cluster with similar points — require many cuts to isolate because there are lots of similar points nearby. Anomalous points — those that are unusual — can be isolated in very few cuts because they're far from everything else.
Crucially, Isolation Forest looks at all your columns simultaneously. A reading might be within normal range on any single metric, but if the combination of values is unusual across three or four metrics together, Isolation Forest finds it. This is the key method for catching multi-metric anomalies that single-column monitoring would never detect.
Analogy: Imagine a crowd of 1,000 people, and you're trying to find the one person who doesn't belong. You start drawing random lines through the crowd — "everyone on the left of this line, everyone on the right." A normal person surrounded by similar people takes many cuts before they're alone. An oddly-dressed person standing away from the crowd gets isolated in just two or three cuts. Isolation Forest does this mathematically with your data columns.
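The random-cut idea can be demonstrated with a toy implementation. Real libraries (scikit-learn's IsolationForest, for example) build explicit trees over subsamples; this sketch just repeatedly cuts the data on a random feature and counts how many cuts it takes to isolate each point, then converts the average path length to the standard anomaly score:

```python
import math
import random

def _path_length(point, data, depth, rng, max_depth):
    # Split on a random feature at a random cut, keep the side containing
    # `point`, and recurse until the point is alone (or the depth cap hits).
    if depth >= max_depth or len(data) <= 1:
        return depth
    f = rng.randrange(len(point))
    lo = min(row[f] for row in data)
    hi = max(row[f] for row in data)
    if lo == hi:
        return depth
    cut = rng.uniform(lo, hi)
    side = [row for row in data if (row[f] < cut) == (point[f] < cut)]
    return _path_length(point, side, depth + 1, rng, max_depth)

def isolation_scores(data, n_trees=100, seed=1):
    """Anomaly score in (0, 1) per row: near 1 = isolated in few cuts = anomalous."""
    rng = random.Random(seed)
    n = len(data)
    # c(n): expected path length of an unsuccessful BST search (standard normaliser)
    c = 2 * (math.log(n - 1) + 0.5772156649) - 2 * (n - 1) / n
    max_depth = int(math.ceil(math.log2(n))) + 2
    return [2 ** (-(sum(_path_length(p, data, 0, rng, max_depth)
                        for _ in range(n_trees)) / n_trees) / c)
            for p in data]
```

A point far from the crowd gets cut off in one or two splits almost every time, so its average path is short and its score is high; points inside a dense cluster need many splits and score low.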
Worked example — manufacturing OEE with multiple metrics
Example data: Production line metrics (each column looks normal individually)
| Shift | OEE % | Temp °C | Cycle time (s) | Reject % | Isolation score | Result |
|-------|-------|---------|----------------|----------|-----------------|--------|
| Day 1 | 87 | 68 | 42 | 1.2 | 0.42 | Normal |
| Day 2 | 86 | 69 | 43 | 1.4 | 0.40 | Normal |
| Day 3 | 83 | 74 | 47 | 2.1 | 0.71 | ⚠️ Warning (unusual combo) |
| Day 4 | 81 | 79 | 52 | 3.8 | 0.89 | 🔴 Emergency (globally isolated) |
Day 3's OEE of 83 looks only slightly below normal. Temperature of 74°C is within the accepted range. Cycle time of 47s is a little high. Reject rate of 2.1% might pass inspection individually. But the combination of all four metrics shifting together in the same direction is globally unusual — Isolation Forest catches this as a Warning on Day 3, before any single metric triggers an individual alert. By Day 4 it's an Emergency.
Catches
Multi-metric combinations that are globally unusual
Anomalies that don't breach any individual threshold
Patterns invisible to single-column analysis
Misses on its own
Anomalies in a single column where other columns are normal
Seasonal patterns (doesn't account for time)
Requires enough data to establish "normal" clusters
Method 5 of 9
Correlation Deviation
Monitors whether metrics that normally move together have started moving apart — or diverge when they shouldn't
Plain-English explanation
Some metrics in your data are naturally correlated — they tend to move up and down together. Revenue and units sold. Power consumption and production output. Delivery volume and fuel cost. Correlation Deviation monitors these relationships over time. When two metrics that have historically moved together suddenly diverge — or when two metrics that normally move independently start moving in lockstep — that's flagged as an anomaly.
This method is particularly powerful for catching process failures that don't show up in any single column. If your OEE stays flat but your reject rate climbs, something has changed in the relationship between those metrics — even if neither column individually looks alarming.
Analogy: You track both your heart rate and your step count while exercising. Normally, they move together — more steps = higher heart rate. One day your step count is normal but your heart rate is unusually high. The relationship has broken. Something might be wrong — you might be getting ill, or there's something else stressing your body. Correlation Deviation detects exactly this type of "the relationship broke" signal.
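A simple way to detect "the relationship broke" is to compare the historical Pearson correlation of two metrics against the correlation over the most recent few points. The recent-window size and the shift threshold are illustrative assumptions:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def correlation_deviation(a, b, recent=4, min_shift=1.0):
    """Flag when the recent correlation has moved far from the historical one
    (e.g. +0.9 historically vs -0.9 recently is a shift of 1.8)."""
    hist = pearson(a[:-recent], b[:-recent])
    now = pearson(a[-recent:], b[-recent:])
    return abs(hist - now) >= min_shift, hist, now
```

In the supplier example below, order volume and on-time delivery were strongly positively correlated for weeks, then inverted — exactly the kind of shift this check surfaces.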
Worked example — operations supplier monitoring
Example data: Supplier B over 4 weeks (metrics normally correlated)
| Week | Order volume | On-time % | Historical corr. | Deviation | Result |
|------|--------------|-----------|------------------|-----------|--------|
| Week 1 | 240 units | 96% | Strong positive | None | Normal |
| Week 2 | 255 units | 94% | Strong positive | Slight | Normal |
| Week 3 | 270 units | 87% | Breaking down | Moderate | ⚠️ Warning — volume up, OTD dropping |
| Week 4 | 290 units | 74% | Inverted | Large | 🟠 Critical — relationship fully inverted |
Historically, higher order volume correlates with better supplier performance (volume customer = priority service). Now volume is rising but on-time delivery is falling — the relationship has inverted. This is an early signal that the supplier is over-committed and struggling. Neither metric individually would have triggered an alert. The relationship breaking is the signal.
Catches
Correlated metrics that diverge unexpectedly
Process changes that affect metric relationships
Multi-metric failures invisible to single-column rules
Misses on its own
Single-metric anomalies where all correlations hold
Very weak or noisy correlations in the data
Newly added columns with no correlation history
Method 6 of 9
DBSCAN Cluster Noise
Density-Based Spatial Clustering — identifies points that don't belong to any normal behavioural cluster
Plain-English explanation
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) groups your data into clusters based on density — regions where data points are close together. Points that belong to a dense cluster are normal. Points that sit far from any cluster, in low-density regions of the data space, are labelled "noise" — and those noise points are anomalies.
Unlike methods that flag values outside a threshold, DBSCAN doesn't need to know in advance what "normal" looks like. It discovers the natural clusters in your data and then identifies what doesn't fit. This makes it excellent at catching systematic patterns that are unusual — like a specific product SKU that consistently returns at a different rate from all similar products, or a meter that always reads in a pattern no other meter produces.
Analogy: Imagine dropping 1,000 coloured dots on a map representing where people park their cars. Most dots cluster around shopping centres, stations, and offices. A few dots appear in random fields with no nearby cluster — those are the anomalies. DBSCAN finds those isolated dots without anyone telling it where the normal clusters should be.
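The noise-labelling rule can be shown with a minimal DBSCAN-style check: a point is noise if it has too few neighbours within radius eps and is not within eps of any core point. The eps and min_pts values here are assumptions that a real system would tune per dataset:

```python
def dbscan_noise(points, eps=1.0, min_pts=3):
    """Return indices of DBSCAN 'noise' points: not dense enough to be
    core, and not within eps of any core point."""
    def neighbours(i):
        return [j for j in range(len(points)) if j != i and
                sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5 <= eps]

    # Core points have at least min_pts points (including themselves) within eps
    core = {i for i in range(len(points)) if len(neighbours(i)) + 1 >= min_pts}
    noise = []
    for i in range(len(points)):
        if i in core:
            continue
        if not any(j in core for j in neighbours(i)):
            noise.append(i)
    return noise
```

Note there's no "normal range" configured anywhere: density alone decides which points belong to a cluster and which are isolated.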
Worked example — e-commerce returns by product
Example data: Product return rates and average review score across 8 SKUs
| SKU | Return rate % | Avg review | Cluster | Result |
|-----|---------------|------------|---------|--------|
| SKU-001 | 3.2 | 4.3 | Normal cluster A | Normal |
| SKU-002 | 4.1 | 4.1 | Normal cluster A | Normal |
| SKU-003 | 3.8 | 4.4 | Normal cluster A | Normal |
| SKU-004 | 2.9 | 4.6 | Normal cluster B | Normal |
| SKU-005 | 3.1 | 4.5 | Normal cluster B | Normal |
| SKU-006 | 18.7 | 2.1 | No cluster (noise) | 🔴 Emergency — isolated outlier |
SKU-006 sits completely outside both normal clusters — it has a return rate and review score combination that no other product in the catalogue produces. DBSCAN labels it as noise (an anomaly). This flags a quality issue: the product is defective, mislabelled, or fundamentally not meeting customer expectations. A threshold rule watching return rate alone might catch this eventually, but DBSCAN catches the combined pattern immediately.
Catches
Entities (SKUs, meters, suppliers) unlike any normal group
Systematic defect patterns in quality data
Reverse-wiring and meter tampering patterns
No prior knowledge of "normal" required
Misses on its own
Time-series anomalies (DBSCAN ignores time order)
Anomalies that cluster with other anomalies
Sparse datasets with too few points per cluster
Method 7 of 9
Seasonal Baseline
Maintains separate normal ranges per hour-of-day and day-of-week — context-aware thresholds without manual configuration
Plain-English explanation
The Seasonal Baseline method builds a separate statistical profile for each time bucket in your data. For hourly data, it calculates a normal mean and standard deviation for each hour of the day and each day of the week independently — so "normal for 3am on a Sunday" and "normal for 3pm on a Tuesday" are tracked separately.
This is simpler than SARIMA (it doesn't build a full forecasting model) but it's very effective at eliminating false positives caused by predictable time-based patterns. Night-shift throughput, weekend call volumes, Monday morning order surges — all of these are learned as patterns specific to their time bucket and excluded from anomaly detection.
Analogy: A restaurant manager knows that 20 customers at 1pm on a Tuesday is completely normal, but 20 customers at 1pm on a Saturday is concerningly quiet. They don't need a formula — they just know from experience what each day and time looks like. The Seasonal Baseline does this automatically for every column in your data, learning the expected level for every hour and day combination.
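The per-bucket profiling is straightforward to sketch: group history by (day, hour), keep a mean and standard deviation per bucket, and flag values far from their own bucket's mean. The ticket counts and the 3-sigma cutoff here are hypothetical, chosen only to mirror the worked example:

```python
from collections import defaultdict

def seasonal_baseline(history, threshold=3.0):
    """history: list of ((day, hour), value) pairs.
    Returns a checker that flags a value if it is more than `threshold`
    standard deviations from its own (day, hour) bucket's mean."""
    buckets = defaultdict(list)
    for slot, value in history:
        buckets[slot].append(value)
    stats = {}
    for slot, vals in buckets.items():
        mu = sum(vals) / len(vals)
        sigma = (sum((v - mu) ** 2 for v in vals) / len(vals)) ** 0.5
        stats[slot] = (mu, sigma)

    def is_anomalous(slot, value):
        mu, sigma = stats[slot]
        return sigma > 0 and abs(value - mu) > threshold * sigma
    return is_anomalous
```

Because each slot has its own baseline, a quiet Sunday morning is judged against other Sunday mornings, never against the busy weekday average.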
Worked example — call centre ticket volume
Example data: Customer support tickets per hour (operations team)
| Time | Day | Tickets | Baseline for this slot | Deviation | Result |
|------|-----|---------|------------------------|-----------|--------|
| 09:00 | Monday | 47 | Mon 9am avg: 44 (±8) | +3 | Normal |
| 09:00 | Sunday | 12 | Sun 9am avg: 14 (±4) | -2 | Normal (Sundays are always quiet) |
| 14:00 | Wednesday | 89 | Wed 2pm avg: 38 (±9) | +51 (>5σ) | ⚠️ Warning — far above Wed 2pm normal |
Without Seasonal Baseline, Sunday's 12 tickets might look anomalously low versus the overall weekly average of ~42 tickets/hour. The Seasonal Baseline knows Sunday at 9am averages 14 and treats 12 as normal. Wednesday at 2pm averaging 38 tickets suddenly receiving 89 is genuinely anomalous — something caused an unusual spike on a normally quiet Wednesday afternoon.
Catches
Values that are wrong for the time period even if typical overall
Anomalies on predictably quiet periods (weekends, nights)
Shift-specific deviations in manufacturing data
Misses on its own
Multi-week seasonal patterns (uses fixed day/hour buckets)
Long-term trends (the baseline adapts slowly to drift)
Multi-metric patterns
Method 8 of 9
Trend Detection
Identifies monotonic drift across consecutive rolling windows — the early warning system for gradual degradation
Plain-English explanation
Trend Detection compares the mean value across consecutive rolling windows of 50 data points each. If the mean is consistently moving in one direction — each window's average is higher (or lower) than the previous window's average, across three or more consecutive windows — a trend is flagged. This is monotonic drift: not a spike, not a step change, but a steady, persistent movement in one direction.
This is the method that gives you weeks of advance warning on bearing wear, budget overruns that build slowly, and supplier performance that's eroding imperceptibly. No single period looks alarming. The direction across periods is the signal.
Analogy: Your bathroom scale shows your weight each morning. No single day looks dramatically different. But Trend Detection is the method that notices "you were 78.2kg three weeks ago, then 78.8kg, then 79.3kg, then 79.9kg this week." Each individual reading looks fine. The four-week upward trajectory is the alert. This is how gradual health problems — and gradual data problems — are caught before they become crises.
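The monotonic-drift rule can be sketched directly: compute consecutive window means, then look for a run of same-direction moves. The window size of 50 comes from this article; treating three consecutive moves as the trigger mirrors the "3rd consecutive rise" Warning rule in the example below:

```python
def detect_trend(series, window=50, min_runs=3):
    """Return 'up' or 'down' when `min_runs` consecutive window means all
    move in the same direction, else None."""
    means = [sum(series[i:i + window]) / window
             for i in range(0, len(series) - window + 1, window)]
    run, direction = 0, 0
    for prev, cur in zip(means, means[1:]):
        step = (cur > prev) - (cur < prev)  # +1 rising, -1 falling, 0 flat
        run = run + 1 if step == direction and step != 0 else (1 if step else 0)
        direction = step
        if run >= min_runs:
            return "up" if direction > 0 else "down"
    return None
```

No single window mean needs to breach any threshold; only the consistent direction of change matters, which is why this fires weeks before a fixed limit would.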
Worked example — equipment spindle temperature drift
Example data: CNC machine spindle temperature across 4 weekly windows (50 readings each)
| Window | Period | Window mean temp | Change from prev. | Trend status |
|--------|--------|------------------|-------------------|--------------|
| W1 | Week 1 | 67.2°C | — | Baseline |
| W2 | Week 2 | 68.4°C | +1.2°C | Monitoring |
| W3 | Week 3 | 69.9°C | +1.5°C | ⚠️ Warning — 3rd consecutive rise |
| W4 | Week 4 | 71.8°C | +1.9°C | 🟠 Critical — accelerating upward trend |
No individual reading breached the "normal operating temperature" threshold of 75°C. A single-threshold rule would have seen nothing. Trend Detection flagged the Warning in Week 3 because three consecutive windows all showed higher averages than the previous — monotonic upward drift. The maintenance team scheduled a bearing inspection, found wear, and replaced the bearing before it caused an unplanned line stoppage. This is what preventive detection looks like.
Catches
Gradual drift that never triggers a single-point threshold
Slowly accumulating budget or cost overruns
Equipment wear and sensor calibration drift
Performance erosion in supplier or sales data
Misses on its own
Sudden spikes (EWMA handles those)
Oscillating or reversing trends
Multi-metric anomalies
Method 9 of 9
Stuck & Zero Detection
Identifies sensor freeze (repeated identical values) and sudden drop to zero — fires Emergency immediately
Plain-English explanation
This is the most straightforward of the nine methods, but it catches some of the most expensive failures. It monitors for two specific patterns:
Stuck values: The same number appearing repeatedly across a rolling window. A live sensor that outputs 47.3 for 20 consecutive readings isn't measuring anything — it's frozen. This indicates sensor failure, PLC communication loss, or a data pipeline that's stuck replaying stale data.
Zero values: A metric that has been producing non-zero readings suddenly drops to exactly zero. This indicates a complete equipment stoppage, a service disconnection, a tracking pixel going offline, or a meter that has stopped registering.
Both patterns immediately escalate to Emergency severity — not Warning, not Critical. They indicate that your monitoring data is no longer trustworthy, which is worse than a bad reading. You can act on an anomalous reading. You can't act on data that secretly stopped updating.
Analogy: Imagine a speedometer in a car that suddenly locks at 60 km/h even while the car decelerates to a stop. A stuck speedometer is worse than an inaccurate one — it's actively lying. You'd rather know you have no speed data than trust a frozen reading. Stuck detection is your warning that the instrument has locked up. Zero detection is your warning that the engine has stopped entirely.
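Both checks are a few lines each. The 20-reading window mirrors the "20 consecutive readings" example in this section; treating any exact-zero after non-zero history as a failure is the article's rule (legitimate planned zeros are a known blind spot, listed under "Misses" below):

```python
def stuck_or_zero(series, stuck_len=20):
    """Return 'stuck' if the last `stuck_len` readings are identical and
    non-zero, 'zero' if a previously non-zero metric is now exactly zero,
    else None. Both conditions fire Emergency immediately."""
    if (len(series) >= stuck_len
            and len(set(series[-stuck_len:])) == 1
            and series[-1] != 0):
        return "stuck"  # a live sensor should show at least some noise
    if series and series[-1] == 0 and any(x != 0 for x in series[:-1]):
        return "zero"   # metric stopped registering entirely
    return None
```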
Worked example — production line sensor monitoring
Example 1: Stuck sensor — hydraulic pressure readings (bar)
| Reading # | Pressure (bar) | Status |
|-----------|----------------|--------|
| R-001 | 142.3 | Normal |
| R-002 | 141.8 | Normal |
| R-003 | 143.1 | Normal |
| R-004 to R-024 | 143.1 (repeated 21×) | 🔴 Emergency — sensor frozen |
Example 2: Zero drop — e-commerce checkout completions per hour
| Hour | Completions | Status |
|------|-------------|--------|
| 14:00 | 47 | Normal |
| 15:00 | 52 | Normal |
| 16:00 | 0 | 🔴 Emergency — checkout dead |
| 17:00 | 0 | 🔴 Emergency — ongoing |
Both patterns fire Emergency immediately. There's no Warning-then-Critical escalation because there's nothing to investigate — the data source has either stopped or frozen. In the checkout example, the business lost A$2,200/hour (52 completions × ~A$42 AOV) for two hours before the IT team was notified via the email alert. Without Stuck & Zero detection, this would have only been caught in the next morning's daily sales review.
Catches
Sensor freeze and PLC communication failure
Complete equipment or line stoppages
Data pipeline failures serving stale data
Tracking pixel and analytics disconnections
Meter communication failures in utility data
Misses on its own
Partial failures (low values, not zero)
Sensors that output random noise instead of zero
Legitimate zero readings (planned shutdowns)
How all 9 methods combine into a single severity grade
Running nine separate detection methods is only useful if their results are combined intelligently. **ThresholdIQ** uses a score fusion formula that treats Multi-Window Z-Score as the primary driver, with the other eight methods acting as boosters:
```
/* Score fusion formula */
final_score = multiWindow_score + min(0.25, ml_composite × 0.25)

/* ML composite weights */
ml_composite =
    EWMA(0.12) + SARIMA(0.22) + IForest(0.20) +
    Correlation(0.12) + DBSCAN(0.06) +
    Seasonal(0.12) + Trend(0.10) + Stuck(0.06)
```
The key design principle: ML methods can only boost severity, never reduce it. If Multi-Window Z-Score says Warning, the combined ML composite can escalate it to Critical or Emergency, but it cannot declare it Normal. This prevents false negatives — genuine anomalies can't be overridden by other methods — while also preventing false positives from any single ML method firing alone.
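As a sketch in Python, the boost-only rule falls out of the formula directly. The weights are taken from the fusion formula above; assuming each method's score is normalised to [0, 1], the ML composite can add at most 0.25 and can never subtract:

```python
def fuse_scores(multiwindow, ml_scores):
    """Fuse the primary multi-window score with the weighted ML composite.
    ml_scores maps method name -> score in [0, 1]; missing methods count as 0.
    The composite boost is capped at 0.25 and can never reduce severity."""
    weights = {"ewma": 0.12, "sarima": 0.22, "iforest": 0.20,
               "correlation": 0.12, "dbscan": 0.06,
               "seasonal": 0.12, "trend": 0.10, "stuck": 0.06}
    composite = sum(weights[name] * ml_scores.get(name, 0.0) for name in weights)
    return multiwindow + min(0.25, composite * 0.25)
```

With no ML signal the final score equals the multi-window score unchanged; with every method firing at full strength the boost hits exactly the 0.25 cap.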
Example of fusion in action: A meter reading fires a Warning from Multi-Window Z-Score (W50 breach). SARIMA flags it as unexpected for the time of day (+0.22 boost). Isolation Forest confirms it's a globally unusual reading (+0.20 boost). The combined ml_composite pushes the final score above the Critical threshold. The Warning automatically escalates to Critical — with the signals tab showing exactly which methods fired and why.
Why nine methods and not just one?
Every single one of these methods has failure modes when used alone. Z-Score fires false positives on seasonal data. SARIMA can't catch sudden spikes on new datasets. Isolation Forest doesn't understand time. EWMA can't detect gradual drift. No single method finds everything — but nine methods running in parallel, with intelligent fusion, catches the anomalies that cost real money.
The table below shows which method catches which type of anomaly:
| Anomaly type | Best method(s) |
|--------------|----------------|
| Sudden single-period spike or crash | EWMA, Multi-Window Z-Score |
| Sustained deviation over many periods | Multi-Window Z-Score, Trend Detection |
| Seasonally unexpected value | SARIMA, Seasonal Baseline |
| Multi-metric combination anomaly | Isolation Forest, Correlation Deviation |
| Gradual drift over weeks | Trend Detection |
| Behavioural cluster outlier | DBSCAN |
| Sensor freeze / line halt | Stuck & Zero Detection |
| Any of the above, with time context | Seasonal Baseline + all others |
Try all 9 methods free — upload your spreadsheet → thresholdiq.app