A SMOTEENN-Powered Stacked Ensemble with Transformer-Based Meta-Learner for Balanced Diabetic Retinopathy Grading
Abstract
Diabetic retinopathy (DR) is a major cause of vision loss, but manual screening is time-consuming and specialist-dependent. While deep learning models offer a scalable alternative, their reliability is often compromised by the severe class imbalance of clinical datasets, in which healthy images far outnumber the critical severe-stage images. To address this core challenge, we are the first to apply SMOTEENN (Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbors), a hybrid resampling method, to the APTOS 2019 Blindness Detection dataset. The technique generates synthetic samples for the minority classes while simultaneously removing noisy ones, yielding a more balanced and reliable training set. On this balanced data, we propose a stacked ensemble framework that combines ResNet50 and DenseNet121 feature extractors with a Transformer-based model and LightGBM as meta-learners. Our model achieved a G-Mean of 0.892 and a weighted F1-score of 0.940, indicating consistently strong, balanced performance across all five DR stages. These results demonstrate that tackling data-level imbalance with SMOTEENN is a critical first step, enabling the ensemble to capture retinal features effectively for real-world DR screening.
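As a concrete illustration of the resampling step, the sketch below applies imbalanced-learn's SMOTEENN to a synthetic imbalanced feature matrix. One assumption to flag: resampling here operates on fixed-length feature vectors rather than raw images, since SMOTE interpolates in feature space; the array shapes, class proportions, and random seed are illustrative, not the paper's configuration.

```python
# Minimal sketch of hybrid resampling with SMOTEENN (imbalanced-learn).
# Assumption: inputs are fixed-length feature vectors; shapes and class
# proportions below are placeholders, not the APTOS 2019 statistics.
import numpy as np
from imblearn.combine import SMOTEENN

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2048))                             # e.g. CNN feature vectors
y = rng.choice(5, size=1000, p=[0.5, 0.2, 0.15, 0.1, 0.05])   # imbalanced DR grades 0-4

resampler = SMOTEENN(random_state=42)   # SMOTE oversampling + ENN noise cleaning
X_res, y_res = resampler.fit_resample(X, y)

print("before:", np.bincount(y))
print("after: ", np.bincount(y_res))
```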
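The stacking stage can be sketched as follows, under stated assumptions: per-image embeddings from ResNet50 and DenseNet121 are concatenated and fed to a LightGBM meta-learner. The Transformer-based meta-learner branch is omitted for brevity, and the input size, weight initialisation, and LightGBM hyperparameters are assumptions rather than the authors' exact setup.

```python
# Hedged sketch of the feature-extraction + LightGBM branch of the ensemble.
# The Transformer meta-learner is not shown; all hyperparameters are illustrative.
import numpy as np
import torch
import torchvision.models as models
from lightgbm import LGBMClassifier

# Backbones as feature extractors; in practice these would carry
# pretrained / fine-tuned weights rather than random initialisation.
resnet = models.resnet50(weights=None)
resnet.fc = torch.nn.Identity()            # expose the 2048-d pooled features
densenet = models.densenet121(weights=None)
densenet.classifier = torch.nn.Identity()  # expose the 1024-d pooled features
resnet.eval()
densenet.eval()

@torch.no_grad()
def extract_features(images: torch.Tensor) -> np.ndarray:
    """Concatenate ResNet50 and DenseNet121 embeddings into (N, 3072) vectors."""
    return torch.cat([resnet(images), densenet(images)], dim=1).numpy()

images = torch.randn(16, 3, 224, 224)      # placeholder for preprocessed fundus images
feats = extract_features(images)
labels = np.tile(np.arange(5), 4)[:16]     # illustrative DR grades 0-4

meta = LGBMClassifier(n_estimators=100, min_child_samples=1)
meta.fit(feats, labels)
print(meta.predict(feats[:3]))
```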
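Finally, the two reported metrics can be computed as below: the multi-class G-Mean (the geometric mean of per-class recalls) via imbalanced-learn, and the weighted F1-score via scikit-learn. The label arrays are placeholders, not the paper's predictions.

```python
# Sketch of the evaluation metrics reported in the abstract.
from imblearn.metrics import geometric_mean_score
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 2, 3, 4, 4, 2, 1, 0]    # placeholder ground-truth grades
y_pred = [0, 0, 1, 2, 3, 4, 3, 2, 1, 0]    # placeholder model predictions

print("G-Mean:     ", geometric_mean_score(y_true, y_pred, average="multiclass"))
print("Weighted F1:", f1_score(y_true, y_pred, average="weighted"))
```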