Evaluating the Performance of Random Forest and XGBoost with Gaussian Noise Upsampling Technique for Customer Churn Prediction


Abstract

Customer churn is a critical challenge for subscription-based businesses, especially in telecommunications, where retaining customers is essential to maintaining profitability. This study investigates the efficacy of two machine learning models, XGBoost and Random Forest, for predicting customer churn using a publicly available telecommunications dataset. The dataset's imbalanced classes present a key challenge, which is addressed by incorporating the Gaussian Noise Upsampling (GNUS) technique. The study evaluates and compares the two models using essential performance indicators, including precision, recall, accuracy, F1-score, and ROC-AUC, both with and without GNUS sampling. The results indicate that while XGBoost initially outperforms Random Forest across most metrics, both models show improved recall after applying GNUS, particularly in identifying churn cases. However, this improvement in recall comes with a trade-off in precision and overall accuracy. The findings highlight the importance of using appropriate sampling techniques to tackle class imbalance in churn prediction and provide valuable insights for developing proactive customer retention strategies.
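The abstract does not detail the GNUS procedure, but the general idea of Gaussian noise upsampling is to balance the classes by duplicating minority-class (churn) samples and perturbing each copy with small Gaussian noise. The sketch below illustrates this under assumed parameters; the function name `gaussian_noise_upsample`, the `noise_scale` parameter, and the per-feature noise scaling are illustrative choices, not the authors' exact implementation.

```python
import numpy as np

def gaussian_noise_upsample(X, y, minority_label=1, noise_scale=0.05, random_state=0):
    """Balance a binary dataset by duplicating minority samples with Gaussian noise.

    Assumed behavior: copies of randomly chosen minority rows are perturbed with
    zero-mean Gaussian noise whose standard deviation is noise_scale times each
    feature's standard deviation within the minority class.
    """
    rng = np.random.default_rng(random_state)
    X_min = X[y == minority_label]
    # Number of synthetic samples needed to match the majority-class count.
    n_needed = int((y != minority_label).sum() - len(X_min))
    if n_needed <= 0:
        return X, y
    idx = rng.integers(0, len(X_min), size=n_needed)
    noise = rng.normal(0.0, noise_scale * X_min.std(axis=0),
                       size=(n_needed, X.shape[1]))
    X_bal = np.vstack([X, X_min[idx] + noise])
    y_bal = np.concatenate([y, np.full(n_needed, minority_label)])
    return X_bal, y_bal
```

In a churn pipeline such upsampling would typically be applied to the training split only, so that evaluation metrics (precision, recall, F1, ROC-AUC) reflect the original class distribution.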
