
TildAlice

Posted on • Originally published at tildalice.io

XGBoost vs LightGBM vs CatBoost: Kaggle Tabular Benchmark

The 3-Way Race Nobody Expected

I ran the same Kaggle-style tabular dataset through XGBoost, LightGBM, and CatBoost with default settings. LightGBM trained in 2.3 seconds. XGBoost took 18.7 seconds. CatBoost? 47.2 seconds.

But here's the twist: CatBoost won on validation AUC by 0.008 points.

This mirrors what I see in Kaggle competitions — speed doesn't always correlate with leaderboard position. The library that takes longest to train often squeezes out that last 0.5% accuracy that separates gold from silver medals. But is it worth the wait?

I'll benchmark all three on a real dataset (Home Credit Default Risk), measure training time, memory usage, and predictive performance, then show you which hyperparameters actually matter. The results challenge the conventional wisdom that "LightGBM is always faster and good enough."
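The headline numbers above came from running each library with its defaults and timing the fit. A minimal sketch of that harness is below — it uses synthetic data from scikit-learn as a stand-in for the Home Credit tables (so it runs anywhere), and the `benchmark` helper, the synthetic-data shape, and the import guards are my assumptions, not the article's exact setup:

```python
# Sketch of a defaults-only benchmark harness (assumed structure, not the
# article's original script). Synthetic data stands in for Home Credit.
import time

from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def benchmark(name, model, X_train, y_train, X_valid, y_valid):
    """Fit a model with library defaults; return (train_seconds, valid_AUC)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    auc = roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1])
    return elapsed, auc


# Imbalanced binary target, roughly mimicking a credit-default problem.
X, y = make_classification(n_samples=5000, n_features=30,
                           weights=[0.92], random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, stratify=y, random_state=0)

# Each library is optional here; skip any that isn't installed.
models = {}
try:
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier()
except ImportError:
    pass
try:
    from lightgbm import LGBMClassifier
    models["LightGBM"] = LGBMClassifier()
except ImportError:
    pass
try:
    from catboost import CatBoostClassifier
    models["CatBoost"] = CatBoostClassifier(verbose=0)  # silence per-iteration logs
except ImportError:
    pass

for name, model in models.items():
    seconds, auc = benchmark(name, model, X_tr, y_tr, X_va, y_va)
    print(f"{name:>8}: {seconds:6.2f}s  AUC={auc:.4f}")
```

On the real competition data you would swap the synthetic split for the Home Credit train/validation tables; the timing and AUC comparison works the same way.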

[Cover photo: close-up of graffiti "Das Boot Ist Voll" on a post in Bubenreuth, Germany — by Markus Spiske on Pexels]

The Test Setup: Home Credit Default Risk Data


Continue reading the full article on TildAlice
