Comprehensive Analysis of Random Forest and XGBoost Performance with SMOTE, ADASYN, and GNUS Upsampling under Varying Imbalance Levels
Abstract
This study examines the efficacy of Random Forest and XGBoost classifiers in conjunction with three upsampling techniques, SMOTE, ADASYN, and Gaussian Noise Up-Sampling (GNUS), across datasets with class imbalance ranging from moderate to extreme (churn rates from 15% down to 1%). Using F1-Score, ROC AUC, PR AUC, Matthews Correlation Coefficient (MCC), and Cohen's Kappa, the research provides a comprehensive evaluation of classifier performance under different imbalance scenarios, with a focus on applications in the telecommunications domain. The findings show that XGBoost paired with SMOTE consistently achieved the highest F1-Score and remained robust across all imbalance levels, making SMOTE the most effective upsampling method, whereas Random Forest performed poorly under severe imbalance. ADASYN was moderately effective with XGBoost but underperformed with Random Forest, and GNUS produced inconsistent results. The study also underscores the impact of data imbalance itself: MCC, Kappa, and F1-Score fluctuated substantially across imbalance levels, whereas ROC AUC and PR AUC remained relatively stable. Rigorous statistical analysis using the Friedman test and Nemenyi post-hoc comparisons confirmed that the observed improvements in F1-Score, PR AUC, Kappa, and MCC were statistically significant (p < 0.05), with Tuned_XGB_SMOTE significantly outperforming Tuned_RF_GNUS. Although differences in ROC AUC were not significant, the consistency of these results across multiple performance metrics underscores the reliability of the framework, offering a statistically validated and practical basis for model selection in imbalanced classification scenarios.
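As a rough illustration of the kind of evaluation pipeline described above (not the authors' code), the sketch below pairs SMOTE with XGBoost and scores the held-out predictions with the five metrics used in the study. It assumes scikit-learn, imbalanced-learn, and xgboost, and runs on a synthetic dataset with a 5% positive rate (between the 15% and 1% extremes studied). The gnus_upsample helper is a hypothetical implementation of Gaussian Noise Up-Sampling; the paper's exact GNUS procedure may differ.

```python
# Illustrative sketch of one upsampler + classifier evaluation run.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import (f1_score, roc_auc_score, average_precision_score,
                             matthews_corrcoef, cohen_kappa_score)
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

def gnus_upsample(X, y, minority_class=1, noise_scale=0.05, random_state=0):
    """Hypothetical Gaussian Noise Up-Sampling: replicate minority samples
    with small Gaussian perturbations until the classes are balanced."""
    rng = np.random.default_rng(random_state)
    X_min = X[y == minority_class]
    n_needed = (y != minority_class).sum() - len(X_min)
    idx = rng.integers(0, len(X_min), size=n_needed)
    noise = rng.normal(0.0, noise_scale * X_min.std(axis=0),
                       size=(n_needed, X.shape[1]))
    X_new = X_min[idx] + noise
    y_new = np.full(n_needed, minority_class)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])

# Synthetic dataset with ~5% minority ("churn") class.
X, y = make_classification(n_samples=10_000, n_features=20,
                           weights=[0.95, 0.05], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y,
                                          test_size=0.3, random_state=42)

# Resample the training split only, so the test set keeps its natural
# imbalance. Swap in ADASYN() or gnus_upsample(X_tr, y_tr) to compare.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_tr, y_tr)

clf = XGBClassifier(n_estimators=300, eval_metric="logloss", random_state=42)
clf.fit(X_res, y_res)
proba = clf.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

print(f"F1:      {f1_score(y_te, pred):.3f}")
print(f"ROC AUC: {roc_auc_score(y_te, proba):.3f}")
print(f"PR AUC:  {average_precision_score(y_te, proba):.3f}")
print(f"MCC:     {matthews_corrcoef(y_te, pred):.3f}")
print(f"Kappa:   {cohen_kappa_score(y_te, pred):.3f}")
```

Per-fold metric scores collected from repeated runs of such pipelines could then be compared with scipy.stats.friedmanchisquare and, for the post-hoc step, scikit_posthocs.posthoc_nemenyi_friedman, mirroring the statistical validation reported in the abstract.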