Enhancing Diabetic Retinopathy Prediction Using Transformer-based Attention in Hybrid CNN Models

Abstract

Diabetic retinopathy (DR) is one of the leading causes of blindness worldwide, so early and accurate detection is essential to prevent severe vision loss. In this study, we introduce a hybrid learning method that combines deep convolutional models with transformer-based attention mechanisms to improve diabetic retinopathy prediction. Our approach builds on an ensemble of pre-trained models, including InceptionV3, DenseNet121, VGG16, MobileNetV2, and ResNet50, each known for robust feature extraction. By incorporating self-attention and multi-head attention mechanisms into these hybrid models, we enhance feature representation and obtain higher classification accuracy. Our experimental findings indicate that such hybrid architectures capture intricate retinal patterns and outperform the individual architectures. Notably, combining ResNet50 and DenseNet121 with a transformer-based attention mechanism yielded the most stable accuracy and the most robust results. This paper demonstrates the potential of hybrid deep learning models augmented with attention mechanisms as a viable approach to improved diabetic retinopathy diagnosis. Our findings support the advancement of automated medical image analysis and strengthen clinical decision support systems for retinal disease detection.
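The sketch below illustrates, in broad strokes, the kind of hybrid architecture the abstract describes: two pre-trained CNN backbones (ResNet50 and DenseNet121) whose feature maps are flattened into token sequences, fused, and refined with multi-head self-attention before classification. This is not the authors' released code; the number of DR classes (5), the embedding size, the head count, and the fusion-by-concatenation choice are illustrative assumptions.

```python
# Minimal sketch of a hybrid CNN + transformer-attention classifier.
# Assumptions (not from the paper): 5 DR grades, 512-d tokens, 8 attention heads.
import torch
import torch.nn as nn
from torchvision import models


class HybridCNNAttention(nn.Module):
    def __init__(self, num_classes: int = 5, embed_dim: int = 512, num_heads: int = 8):
        super().__init__()
        # Pre-trained backbones used as feature extractors (optionally fine-tuned).
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        densenet = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
        self.resnet_features = nn.Sequential(*list(resnet.children())[:-2])  # (B, 2048, H, W)
        self.densenet_features = densenet.features                           # (B, 1024, H, W)
        # Project both feature maps to a common token dimension.
        self.proj_resnet = nn.Conv2d(2048, embed_dim, kernel_size=1)
        self.proj_densenet = nn.Conv2d(1024, embed_dim, kernel_size=1)
        # Transformer-style multi-head self-attention over the fused token sequence.
        self.attention = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    @staticmethod
    def _to_tokens(fmap: torch.Tensor) -> torch.Tensor:
        # Flatten spatial positions into a token sequence: (B, C, H, W) -> (B, H*W, C).
        return fmap.flatten(2).transpose(1, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens_r = self._to_tokens(self.proj_resnet(self.resnet_features(x)))
        tokens_d = self._to_tokens(self.proj_densenet(self.densenet_features(x)))
        tokens = torch.cat([tokens_r, tokens_d], dim=1)        # fuse the two backbones' tokens
        attended, _ = self.attention(tokens, tokens, tokens)   # self-attention over all tokens
        fused = self.norm(tokens + attended)                   # residual connection + layer norm
        pooled = fused.mean(dim=1)                              # global average over tokens
        return self.classifier(pooled)                          # logits over DR grades


# Usage example with a dummy batch of fundus-image-sized inputs:
# logits = HybridCNNAttention()(torch.randn(2, 3, 224, 224))  # shape (2, 5)
```

Other backbones named in the abstract (InceptionV3, VGG16, MobileNetV2) could be swapped in the same way by adjusting the projection layers' input channels.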
