Loss Function Matters More Than Framework: A Comparative Study of Gradient Boosting Robustness to Outliers
Abstract
We present a systematic empirical study comparing the robustness of four major tree-based ensemble algorithms — XGBoost, LightGBM, CatBoost, and Random Forest — to controlled training-data contamination. Unlike prior work that compares frameworks as monolithic units, we test multiple loss functions (MSE, Huber, MAE) within each boosting framework, yielding 13 regression and 5 classification configurations. Experiments on the California Housing, Kaggle House Prices, and Adult Census Income datasets at contamination levels of 0–40% reveal that the choice of loss function affects robustness far more than the choice of framework: the within-framework spread of our retention index averages 0.63, exceeding the between-framework spread of 0.51. LightGBM with MAE loss retains 96.6% of its R² at 40% label noise, while the same framework with MSE loss retains only 26.6%. Random Forest ranks only 8th out of 12 configurations. We provide theoretical justification through influence-function analysis, report an anomalous collapse of Huber loss when miscalibrated, and propose the retention index as a standardized measure of robustness. For classification under symmetric label noise, CatBoost achieves the highest MCC retention (71.1%), significantly outperforming Random Forest (60.7%).
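The retention index proposed in the abstract can be illustrated with a minimal sketch. The definition below is an assumption inferred from the phrasing "retains 96.6% of R²": the ratio of a higher-is-better metric measured after training on contaminated data to the same metric measured after training on clean data. The function name and the example values are hypothetical, not taken from the paper.

```python
def retention_index(metric_clean: float, metric_contaminated: float) -> float:
    """Assumed definition: ratio of contaminated-training performance to
    clean-training performance for a higher-is-better metric (e.g. R², MCC).

    A value near 1.0 means the model is essentially unaffected by the
    contamination; a value near 0.0 indicates collapse.
    """
    if metric_clean <= 0:
        raise ValueError("clean-data metric must be positive for the ratio to be meaningful")
    return metric_contaminated / metric_clean

# Hypothetical illustration: a model scoring R² = 0.80 on clean training data
# and R² = 0.60 after 40% label contamination retains 75% of its performance.
print(retention_index(0.80, 0.60))  # → 0.75
```

Under this reading, the abstract's headline numbers (96.6% for LightGBM+MAE vs 26.6% for LightGBM+MSE at 40% label noise) are retention-index values reported as percentages.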