DEV Community

TildAlice

Posted on • Originally published at tildalice.io

IQR vs Isolation Forest vs DBSCAN: 47% False Positive Gap

The IQR Method Flags Half Your Legitimate Trades as Outliers

I ran three outlier detection methods on the same financial time series — daily returns from a mid-cap tech stock over 2020-2023. IQR flagged 47% of the dataset as outliers. Isolation Forest caught 8%. DBSCAN found 3%.

One of these methods is clearly broken for financial data.

The culprit? The IQR rule is calibrated for roughly normal data. Financial returns follow a leptokurtic distribution: fat tails and frequent extreme moves. The traditional fences, $Q_1 - 1.5 \times IQR$ and $Q_3 + 1.5 \times IQR$, were designed for symmetric, thin-tailed data, where they flag only about 0.7% of points. Apply them to stock returns and you'll flag every earnings surprise, Fed announcement, and after-hours gap as an "anomaly."
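To make the tail effect concrete, here is a minimal sketch that applies the 1.5 × IQR fences to two synthetic samples: a standard normal and a Student-t with 3 degrees of freedom. The Student-t is only an assumed stand-in for leptokurtic returns, not the stock data described above, and the names `iqr_flag_rate` and `student_t` are hypothetical helpers. It uses only the Python standard library.

```python
import random
import statistics


def iqr_flag_rate(values, k=1.5):
    """Fraction of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = sum(1 for v in values if v < lower or v > upper)
    return flagged / len(values)


def student_t(rng, df):
    """Draw one Student-t variate as Z / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(df/2, scale=2)
    return z / (chi2 / df) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 20_000

# Synthetic samples (assumption: t with 3 degrees of freedom mimics fat tails).
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
heavy_sample = [student_t(rng, 3) for _ in range(n)]

normal_rate = iqr_flag_rate(normal_sample)
heavy_rate = iqr_flag_rate(heavy_sample)

print(f"normal sample flagged:       {normal_rate:.2%}")
print(f"heavy-tailed sample flagged: {heavy_rate:.2%}")
```

The heavy-tailed sample should show a flag rate several times that of the normal sample under the same fences. The exact gap depends on how fat the tails are, so treat these numbers as illustrative, not as a reproduction of the figures in this post.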

But Isolation Forest and DBSCAN aren't perfect either. One struggles with temporal dependencies; the other requires manual parameter tuning that breaks when volatility regimes shift. Here's what actually works, backed by code you can run today.
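The parameter-tuning problem can be sketched without any third-party library, assuming the remark refers to DBSCAN's `eps` and `min_samples`. The snippet below is a simplified one-dimensional reimplementation of DBSCAN's noise rule (a point is noise if it is neither a core point nor within `eps` of one), not the scikit-learn implementation. The two regimes are synthetic Gaussian returns with assumed daily volatilities of 1% and 3%, and `dbscan_noise_1d` is a hypothetical helper name.

```python
import bisect
import random


def dbscan_noise_1d(values, eps, min_samples):
    """Return the values that DBSCAN's rule labels as noise in 1-D."""
    xs = sorted(values)

    def neighbours(sorted_list, x):
        # Number of entries within eps of x (inclusive on both sides).
        lo = bisect.bisect_left(sorted_list, x - eps)
        hi = bisect.bisect_right(sorted_list, x + eps)
        return hi - lo

    # Core points have at least min_samples neighbours, counting themselves.
    core = [x for x in xs if neighbours(xs, x) >= min_samples]
    # Noise: no core point within eps (core points always find themselves).
    return [x for x in xs if neighbours(core, x) == 0]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 500

# Assumed regimes: same mean, different daily volatility.
calm = [rng.gauss(0.0, 0.01) for _ in range(n)]
volatile = [rng.gauss(0.0, 0.03) for _ in range(n)]

# Parameters chosen by hand for the calm regime, then reused unchanged.
eps, min_samples = 0.002, 5

calm_frac = len(dbscan_noise_1d(calm, eps, min_samples)) / n
volatile_frac = len(dbscan_noise_1d(volatile, eps, min_samples)) / n

print(f"calm regime noise fraction:     {calm_frac:.1%}")
print(f"volatile regime noise fraction: {volatile_frac:.1%}")
```

With `eps` fixed, the same returns-scale neighborhood covers fewer points once volatility rises, so more observations fall out of every cluster and get labeled noise. Rescaling returns by a rolling volatility estimate before clustering is one possible workaround, though whether it matches the approach in the full article is not something this excerpt confirms.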


Why Financial Data Breaks Classical Outlier Detection


Continue reading the full article on TildAlice
