Transformer-Aided Skin Cancer Classification Using VGG19-Based Feature Encoding

Abstract

Skin cancer is one of the most common and deadliest forms of cancer worldwide, and early detection is a primary means of improving treatment outcomes. Deep learning has demonstrated potential for automating skin lesion classification, but traditional Convolutional Neural Networks (CNNs) struggle with data scarcity and orientation dependency and fail to capture global context. In this paper, we introduce a hybrid model, VGG19-RSPDA-ViT, which combines the fine-grained feature extraction capability of VGG19 with the global attention mechanism of Vision Transformers (ViT). We develop a novel Rotated and Shifted Patch Data Augmentation (RSPDA) method that increases data diversity and promotes rotation invariance. Performance was validated on two benchmark datasets: the Melanoma Skin Cancer Dataset of 10000 Images (MSK10000) for binary classification and Human Against Machine with 10000 training images (HAM10000) for multi-class classification. Our model achieved accuracies of 97.91% and 97.10% on MSK10000 and HAM10000, respectively, with consistently high macro precision, recall, specificity, and F1 score on both datasets. VGG19-RSPDA-ViT outperformed existing state-of-the-art methods and showed better generalization capability. These results demonstrate that the proposed model is effective for skin lesion classification and has strong potential for clinical use as an automated diagnostic tool in dermatology.
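The abstract does not specify the hybrid's internals, but the general pattern it describes (a CNN backbone feeding a transformer encoder) can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the embedding size, encoder depth, head count, and token layout are assumptions chosen to match VGG19's 512-channel feature maps.

```python
# Hedged sketch of a VGG19 + Vision Transformer hybrid classifier.
# Hyperparameters (embed_dim, depth, num_heads) are illustrative
# assumptions, not the configuration reported in the paper.
import torch
import torch.nn as nn
from torchvision.models import vgg19

class VGG19ViT(nn.Module):
    def __init__(self, num_classes: int, embed_dim: int = 512,
                 depth: int = 4, num_heads: int = 8):
        super().__init__()
        # VGG19 convolutional stack as the local feature extractor;
        # for a 224x224 input it yields a (B, 512, 7, 7) feature map.
        self.backbone = vgg19(weights=None).features
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # Learned positional embeddings for 49 patch tokens + 1 class token
        # (assumes a fixed 224x224 input).
        self.pos_embed = nn.Parameter(torch.zeros(1, 50, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)                   # (B, 512, 7, 7)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, 49, 512) token sequence
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)             # global self-attention
        return self.head(encoded[:, 0])            # classify from class token

model = VGG19ViT(num_classes=7)  # e.g. the 7 HAM10000 lesion classes
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 7])
```

Treating each spatial position of the CNN feature map as a token is one standard way to marry convolutional locality with transformer-style global attention; the paper's exact tokenization may differ.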
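Similarly, the RSPDA procedure is not detailed in the abstract. The sketch below shows one plausible reading of "rotated and shifted patch" augmentation: randomly rotate the image and circularly shift it by a few pixels so that patch boundaries land at new locations each epoch. The function name `rspda` and the `max_angle`/`max_shift` parameters are hypothetical.

```python
# Hedged sketch of rotated-and-shifted patch data augmentation.
# One plausible interpretation of RSPDA, not the authors' exact method.
import random
import torch
import torchvision.transforms.functional as TF

def rspda(img: torch.Tensor, max_angle: float = 30.0,
          max_shift: int = 8) -> torch.Tensor:
    """img: a (C, H, W) tensor in [0, 1]."""
    angle = random.uniform(-max_angle, max_angle)
    img = TF.rotate(img, angle)                  # random rotation for invariance
    dy = random.randint(-max_shift, max_shift)   # vertical shift in pixels
    dx = random.randint(-max_shift, max_shift)   # horizontal shift in pixels
    # Circular shift moves content relative to the fixed patch grid,
    # so the transformer sees the lesion under varied patch alignments.
    return torch.roll(img, shifts=(dy, dx), dims=(1, 2))

augmented = rspda(torch.rand(3, 224, 224))
```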
