Introducing DART: A Novel Deep Adaptive Upsampling Technique for Handling Class Imbalance

Mark Lokanan

Abstract

Class imbalance remains a persistent challenge in predictive modeling, often leading to biased machine learning outcomes that disproportionately favor the majority class. This study investigates the effectiveness of advanced resampling techniques—both undersampling and oversampling—across two large and highly imbalanced datasets involving credit and loan default prediction. In addition to evaluating established oversampling techniques, the study introduces and validates a novel resampling approach, DART (Deep Adaptive Resampling Technique). Each technique is assessed using a consistent suite of classifiers, including logistic regression, gradient descent, Naïve Bayes, random forest, CatBoost, and artificial neural networks. The results reveal that K-MeansSMOTE and NearMiss outperform other resampling strategies in oversampling and undersampling, respectively, by achieving balanced trade-offs in precision, recall, F1-score, AUC, and Matthews Correlation Coefficient. Notably, DART demonstrates exceptional performance across both datasets, achieving nearly perfect classification scores across all metrics, suggesting strong generalizability and robustness. The study further analyzes the strengths and limitations of each resampling technique and emphasizes the importance of metric selection when evaluating imbalanced datasets. By integrating empirical evaluation with theoretical insights, this research contributes to the growing body of literature on imbalanced learning and offers practical guidance for selecting appropriate resampling strategies. These findings have broader implications for domains such as finance, healthcare, and fraud detection, where class imbalance is common. Overall, the study affirms the value of hybrid and adaptive resampling methods in building more accurate and generalizable predictive models.
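
To illustrate the abstract's point about metric selection, the following minimal sketch computes accuracy, precision, recall, F1-score, and the Matthews Correlation Coefficient for two hypothetical predictors on a toy 95:5 dataset. The data, the helper names (`confusion_counts`, `imbalance_metrics`), and both predictors are assumptions made for illustration only; nothing here comes from the study's datasets or reproduces DART or any other resampling method it evaluates.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels; 1 is the minority (positive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero (a common convention)."""
    return num / den if den else 0.0


def imbalance_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1, and MCC from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical toy labels: 95 non-defaults (0) followed by 5 defaults (1).
y_true = [0] * 95 + [1] * 5

# Predictor A (hypothetical): always predicts the majority class.
pred_majority = [0] * 100

# Predictor B (hypothetical): raises 10 false alarms but catches 4 of 5 defaults.
pred_detector = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

baseline = imbalance_metrics(y_true, pred_majority)
detector = imbalance_metrics(y_true, pred_detector)

for name, scores in (("majority", baseline), ("detector", detector)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this toy setup the majority-class predictor has the higher accuracy even though it never identifies a default, while recall, F1-score, and MCC all favor the second predictor. That is the kind of disagreement between metrics that makes accuracy alone a poor guide on imbalanced data.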

Version published to 10.21203/rs.3.rs-6895500/v1 on Research Square
Jun 18, 2025

Related articles

Quadratic Surface Twin Support Vector Machine for Imbalanced Data

This article has 5 authors:
1. Hossein Moosaei
2. Milan Hladik
3. Ahmad Mousavi
4. Zheming Gao
5. Haojie Fu
This article has no evaluations
Latest version Jun 13, 2025

Beyond Logistic Regression: Calibration With Dropouts In Tiny Neural Networks

This article has 1 author:
1. Aaditya Kachhadiya
This article has no evaluations
Latest version Jul 3, 2025

Systematic Evaluation of Label Noise Effects on Accuracy and Calibration in Deep Neural Networks

This article has 1 author:
1. Christopher Boseak
This article has no evaluations
Latest version Jul 24, 2025