Adaptive Synthetic Minority Oversampling Technique with Density-Guided Noise Injection and Local Density Adaptation
Abstract
Class imbalance remains a persistent challenge in supervised learning, often leading to biased classifiers and poor detection of minority instances. This paper introduces Adaptive Synthetic Minority Oversampling Technique with Guided Density (AdaptiveSMOTEGD), a novel method that integrates local density-based sparsity detection, tunable Gaussian noise injection, and domain-specific constraint preservation. Unlike conventional methods such as Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling Approach (ADASYN), Borderline-SMOTE, Synthetic Minority Over-sampling Technique for Nominal and Continuous features (SMOTENC), Support Vector Machine SMOTE (SVMSMOTE), and KMeans-SMOTE, the proposed approach selectively targets sparse minority regions while avoiding degradation in dense areas. It also supports datasets with purely numerical features as well as those containing both numerical and categorical attributes. Experimental evaluation on eight numerical-only and six mixed-type benchmark datasets using Light Gradient Boosting Machine (LightGBM) demonstrates that AdaptiveSMOTEGD consistently achieves competitive or superior performance in F1-score, recall, Matthews Correlation Coefficient (MCC), and area under the precision-recall curve (AUC-PR), particularly under highly imbalanced and noisy conditions. Statistical analysis confirms significant improvements in recall for both numerical-only and mixed datasets, establishing AdaptiveSMOTEGD as a robust, scalable, and versatile solution for real-world imbalanced classification problems.
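The core idea described above — oversampling that favors sparse minority regions and injects noise scaled by local density — can be illustrated with a minimal sketch. This is not the paper's exact algorithm; the function name, the mean-kNN-distance sparsity measure, and the `noise_scale` parameter are illustrative assumptions, and constraint preservation and categorical handling are omitted.

```python
import numpy as np

def density_guided_oversample(X_min, n_new, k=5, noise_scale=0.1, rng=None):
    """Illustrative sketch: SMOTE-style interpolation whose seed selection is
    weighted toward sparse minority regions, with Gaussian noise scaled by
    each seed's local sparsity (mean distance to its k nearest minority
    neighbours). Not the authors' exact AdaptiveSMOTEGD implementation."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # Pairwise distances among minority samples; ignore self-distances.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn_idx = np.argsort(d, axis=1)[:, :k]                      # k nearest minority neighbours
    sparsity = np.take_along_axis(d, nn_idx, 1).mean(axis=1)   # local sparsity score
    w = sparsity / sparsity.sum()                              # sparser points drawn more often
    seeds = rng.choice(n, size=n_new, p=w)
    mates = nn_idx[seeds, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))
    # SMOTE-style interpolation between each seed and a random neighbour.
    X_syn = X_min[seeds] + lam * (X_min[mates] - X_min[seeds])
    # Gaussian noise injection, scaled by the seed's local sparsity.
    X_syn += rng.normal(0.0, noise_scale, X_syn.shape) * sparsity[seeds, None]
    return X_syn
```

Weighting seed selection by local sparsity is what distinguishes this scheme from vanilla SMOTE, which samples seeds uniformly and so can over-densify already dense minority clusters.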