Introducing DART: A Novel Deep Adaptive Upsampling Technique for Handling Class Imbalance

Abstract

Class imbalance remains a persistent challenge in predictive modeling, often leading to biased machine learning outcomes that disproportionately favor the majority class. This study investigates the effectiveness of advanced resampling techniques, both undersampling and oversampling, on two large and highly imbalanced datasets involving credit and loan default prediction. In addition to evaluating established oversampling techniques, the study introduces and validates a novel resampling approach, DART (Deep Adaptive Resampling Technique). Each technique is assessed using a consistent suite of classifiers, including logistic regression, gradient descent, Naïve Bayes, random forest, CatBoost, and artificial neural networks. The results show that, among the established methods, K-MeansSMOTE performs best for oversampling and NearMiss performs best for undersampling, each achieving a balanced trade-off across precision, recall, F1-score, AUC, and Matthews Correlation Coefficient (MCC). Notably, DART achieves nearly perfect classification scores on every metric for both datasets, which the authors interpret as evidence of strong generalizability and robustness. The study further analyzes the strengths and limitations of each resampling technique and emphasizes the importance of metric selection when evaluating models on imbalanced datasets. By integrating empirical evaluation with theoretical insights, this research contributes to the growing body of literature on imbalanced learning and offers practical guidance for selecting appropriate resampling strategies. These findings have broader implications for domains such as finance, healthcare, and fraud detection, where class imbalance is common. Overall, the study affirms the value of hybrid and adaptive resampling methods in building more accurate and generalizable predictive models.
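
To illustrate why metric selection matters on imbalanced data, the following minimal sketch compares accuracy with the Matthews Correlation Coefficient for a trivial classifier that always predicts the majority class. The 95:5 label split and the helper names (`confusion_counts`, `accuracy`, `mcc`) are illustrative assumptions and are not taken from the study; the convention of returning 0 when the MCC denominator vanishes is a common choice, not something the article specifies.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the minority class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical data: 95 majority-class (0) and 5 minority-class (1) examples.
y_true = [0] * 95 + [1] * 5

# A degenerate classifier that always predicts the majority class.
y_majority = [0] * 100

counts = confusion_counts(y_true, y_majority)
acc = accuracy(*counts)   # high, despite missing every minority example
score = mcc(*counts)      # no better than chance

print(f"accuracy={acc:.2f}, mcc={score:.2f}")
```

Under these assumptions, accuracy looks strong even though no minority-class example is detected, while MCC does not reward the degenerate classifier. This is one reason imbalanced-learning studies tend to report recall, F1-score, AUC, and MCC alongside accuracy.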
