What I Did
All models were tested under the same rules:
- Default settings from their libraries
- No hyperparameter tuning
- Same preprocessing
- A consistent encoding for categorical features
- No dataset-specific tricks
- 3-fold cross-validation (CV means reported)
- CPU only
- Single-inference P95 latency measured
Features were scaled for Logistic Regression and KNN, for fairness.
That’s it. No magic sauce.
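To make the protocol concrete, here is a minimal sketch of the setup described above: default models, 3-fold CV, and a scaling pipeline for the scale-sensitive ones. The dataset and model list are illustrative assumptions, not the exact harness used for the benchmarks.

```python
# Sketch of the shared protocol: defaults only, 3-fold CV, CV means.
# Dataset and model choices here are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    # Defaults only -- no tuning.
    "RandomForest": RandomForestClassifier(random_state=0),
    # Scale-sensitive models get a StandardScaler, as described above.
    "LogReg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

results = {}
for name, model in models.items():
    # 3-fold CV, report the mean -- same rule for every model.
    results[name] = cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    print(f"{name}: CV mean accuracy = {results[name]:.4f}")
```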
What I Measured
For classification:
- Accuracy (CV Mean)
- Macro F1 (CV Mean)
- Single Inference P95 (ms)
For regression:
- CV RMSE
- Test RMSE
- Single Inference P95 (ms)
Because accuracy without latency is like buying a sports car without checking fuel cost.
Classification Results... What Surprised Me
Tree Models Still Dominate Accuracy
Across datasets like:
- Adult
- Credit Default
- Santander
- Fraud Detection
CatBoost, LightGBM, and XGBoost were very strong.
Example:
On Adult:
- LightGBM → 0.8734 accuracy
- CatBoost → 0.8726
- XGBoost → 0.8594
Solid.
But here’s the twist.
Random Forest Is Slow. Like… Really Slow.
On almost every dataset:
RandomForest P95 latency ≈ 24–38 ms
If you serve millions of predictions per hour, that gap is not “small.”
That’s server bills.
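The back-of-the-envelope arithmetic is worth doing. Using 1M predictions per hour and the P95 figures reported in this post (RandomForest ~25 ms, SmartKNN ~0.31 ms on the fraud dataset):

```python
# CPU time consumed serving 1M single-row predictions per hour,
# assuming each prediction costs roughly its P95 latency.
predictions_per_hour = 1_000_000

def cpu_hours(p95_ms):
    # total milliseconds -> seconds -> hours of CPU time
    return predictions_per_hour * p95_ms / 1000 / 3600

for name, p95 in [("RandomForest", 25.0), ("SmartKNN", 0.31)]:
    print(f"{name}: {cpu_hours(p95):.2f} CPU-hours per wall-clock hour")
```

At 25 ms that is roughly 7 CPU-hours of compute for every wall-clock hour of traffic; at 0.31 ms it is under 0.1. That difference is exactly what shows up on the server bill.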
Accuracy Differences Are Small. Latency Differences Are Massive.
Example: Credit Card Fraud
Accuracy:
- CatBoost → 0.9996
- RandomForest → 0.9995
- SmartKNN → 0.9995
- XGBoost → 0.9995
All basically identical.
Latency:
- RandomForest → 25 ms
- SmartKNN → 0.31 ms
- XGBoost → 0.63 ms
Same accuracy.
80x latency difference.
That hit me.
KNN Is Fast… Until It Isn’t
Regular KNN sometimes exploded in latency.
Example:
Porto Seguro dataset:
- KNN → 34.67 ms
- SmartKNN → 0.35 ms
Same idea. Different implementation.
Distance methods are tricky.
In high dimensions, they behave nicely… until they don’t.
Curse of dimensionality is not theory. It’s pain.
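The effect is easy to demonstrate. As dimensionality grows, pairwise distances concentrate: their spread shrinks relative to their mean, so "nearest" neighbors become barely nearer than anything else. A tiny illustrative demo:

```python
# Distance concentration demo: relative contrast (std / mean of
# pairwise distances) collapses as dimensionality grows.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
contrast = {}
for d in (2, 1000):
    X = rng.random((200, d))          # 200 uniform random points
    dists = pdist(X)                  # all pairwise Euclidean distances
    contrast[d] = dists.std() / dists.mean()
    print(f"d={d}: relative contrast = {contrast[d]:.3f}")
```

In 2 dimensions the contrast is large; in 1000 dimensions it is close to zero, which is why KNN-style methods degrade there.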
Sometimes Simple Models Win
On Bank Marketing:
- SmartKNN → 0.9982 accuracy
- KNN → 0.9982
- CatBoost → 0.9973
- LightGBM → 0.9918
Tiny dataset-specific patterns matter.
No model wins everywhere.
Regression Results... Same Story
Tree models are strong.
But again… latency changes everything.
Example: Diamonds dataset
Best CV RMSE:
- SmartKNN → 892
- KNN → 933
- RandomForest → 1153
But RandomForest P95 latency: 34 ms
SmartKNN: 0.19 ms
That gap is wild.
On California Housing:
Tree models dominate accuracy.
But distance models:
- SmartKNN → 0.18 ms
- KNN → 0.65 ms
Speed monsters.
Lower accuracy, yes.
But ultra-cheap inference.
Engineering is about tradeoffs.
Big Things I Learned
- No Model Wins Everywhere
- Accuracy Differences Are Often Tiny
- Default Models Are Already Very Strong
- P95 Latency Matters More Than You Think
- Tree Models Are Systems
So What Actually Matters?
If you’re doing Kaggle:
Maximize metric.
If you’re deploying:
Balance:
- Accuracy
- Latency
- Memory
- Predictability
- Stability
Engineering is constraint optimization.
Not leaderboard chasing.
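That constraint-optimization mindset can be made concrete: pick the best metric *subject to a latency budget*, not the best metric alone. A hedged sketch using this post's Credit Card Fraud numbers (the helper function is hypothetical, not from any library):

```python
# Pick the most accurate model that fits a latency budget.
# Numbers are the Credit Card Fraud figures from this post.
candidates = {
    # name: (accuracy, p95_latency_ms)
    "RandomForest": (0.9995, 25.0),
    "SmartKNN": (0.9995, 0.31),
    "XGBoost": (0.9995, 0.63),
}

def pick(candidates, latency_budget_ms):
    """Best accuracy among models under the latency budget, else None."""
    eligible = {
        name: acc
        for name, (acc, lat) in candidates.items()
        if lat <= latency_budget_ms
    }
    return max(eligible, key=eligible.get) if eligible else None

print(pick(candidates, latency_budget_ms=0.5))   # only SmartKNN fits
print(pick(candidates, latency_budget_ms=0.1))   # nothing fits -> None
```

With a 0.5 ms budget, RandomForest is not even in the running, despite identical accuracy. That is the leaderboard-versus-deployment gap in three lines.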