LightGBM vs XGBoost vs CatBoost
Quick summary
Hello 👋 In this article, I will compare LightGBM, XGBoost and CatBoost in the following areas:
- Boosting algorithm
- Node splitting
- Missing data handling
- Feature handling
- Data sampling
- LightGBM-specific features
- XGBoost-specific features
- CatBoost-specific features
- Tips for choosing between LightGBM, XGBoost and CatBoost
- Resources
🚀 Subscribe to us @ newsletter.datascienceletter.com
Boosting algorithm
Conventional boosting (LightGBM, XGBoost) vs ordered boosting (CatBoost)
One of the major differences in tree building between LightGBM/XGBoost and CatBoost is CatBoost's use of ‘ordered boosting’.
In conventional boosting algorithms (used by LightGBM and XGBoost), the tree at each boosting iteration is built on the same data points whose gradients were computed with models that have already seen those points. It is argued that this repeated use of a single set of data points can increase the chance of overfitting.
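To make this concrete, here is a minimal sketch of one conventional boosting iteration with squared-error loss. It is not any library's actual code: the arrays X, y and F, the learning rate, and the use of scikit-learn's DecisionTreeRegressor are all illustrative choices. The point to notice is that the gradients are computed and the tree is fitted on the very same training points.

```python
# Minimal sketch of conventional gradient boosting (squared-error loss).
# X, y, F and learning_rate are illustrative names, not any library's API.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + rng.normal(scale=0.1, size=500)

F = np.zeros(len(y))          # current ensemble prediction
learning_rate = 0.1

for _ in range(10):           # boosting iterations
    # Negative gradients of squared error w.r.t. F, computed on ALL training points
    gradients = y - F
    # The SAME points (and their gradients) are then used to fit the next tree,
    # which is the repeated reuse that ordered boosting tries to avoid.
    tree = DecisionTreeRegressor(max_depth=3).fit(X, gradients)
    F += learning_rate * tree.predict(X)
```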
To mitigate this effect, CatBoost supports a different boosting algorithm known as ordered boosting. The whole idea of this algorithm is to avoid repeatedly using the same data points for both tree building and gradient or Hessian computation. The method is briefly explained as follows (a toy sketch appears after the list):
- First, the original training dataset of size N is shuffled S times to produce S random permutations.
- At each boosting iteration, for each shuffled dataset, a separate tree is built for each data position i (where i = 1, 2, …, N), using only the data points before i (j < i).
- The gradients and Hessians for a particular data point k are then computed using the trees built only on data points before k, so k never influences the model used to estimate its own gradient.
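Below is a toy, brute-force sketch of that idea for a single boosting step with squared-error loss. The variable names, the tiny synthetic data and the use of scikit-learn trees are all illustrative assumptions; CatBoost's real implementation is far more efficient than this loop.

```python
# Toy sketch of the naive ordered-boosting idea described in the list above
# (one boosting step, squared-error loss). Illustrative only; the real
# CatBoost implementation avoids this O(S * N^2) brute-force loop.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + rng.normal(scale=0.1, size=200)
N, S = len(y), 2

grad_estimates = np.zeros(N)
for _ in range(S):                       # one pass per shuffled dataset
    perm = rng.permutation(N)
    Xp, yp = X[perm], y[perm]
    for i in range(1, N):                # one model per data position i
        # Fit only on points that come BEFORE position i in this permutation,
        model = DecisionTreeRegressor(max_depth=3).fit(Xp[:i], yp[:i])
        # so the gradient (here: the residual) for point i is estimated by a
        # model that has never seen point i.
        grad_estimates[perm[i]] += yp[i] - model.predict(Xp[i:i + 1])[0]
grad_estimates /= S                      # average over the S permutations
```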
In reality, it is not practical to train a tree for each data position of each shuffled dataset, as the computational complexity would scale as SN². The actual algorithm builds trees for a fixed number of…