Clinical Application of Vision Transformers for Melanoma Classification: A Multi-Dataset Evaluation Study
Abstract
Background: Melanoma is among the most lethal skin cancers, and survival depends heavily on early detection, yet diagnosis remains challenging because melanomas often resemble benign nevi. Despite their success in automated dermoscopy, convolutional neural networks (CNNs) are limited by their focus on local features and their dependence on fixed input sizes, which can constrain generalization. Vision Transformers (ViTs), which model global image context through self-attention, offer a promising alternative.

Methods: A ViT-L/16 model was fine-tuned on the ISIC 2019 dataset of over 25,000 dermoscopic images. To expand the dataset and balance class representation, synthetic nevus and melanoma images were generated with StyleGAN2-ADA, and only high-confidence outputs were retained. Performance was assessed on an external biopsy-confirmed dataset (MN187) and compared against CNN baselines (ResNet-152, DenseNet-201, EfficientNet-B7, ConvNeXt-XL), a smaller ViT-B/16 model, and the commercial MoleAnalyzer Pro system using ROC-AUC and DeLong's test.

Results: The ViT-L/16 model achieved the highest baseline ROC-AUC on MN187 (0.902), exceeding the CNN models and MoleAnalyzer Pro, although this difference was not statistically significant (p = 0.07). Adding 46,000 confidence-filtered GAN-generated images raised the ROC-AUC to 0.915, a statistically significant improvement (p = 0.032).

Conclusions: Vision Transformers show strong potential for melanoma classification, especially when combined with GAN-based augmentation; their global feature representation and capacity for data expansion support the development of reliable AI-driven clinical decision-support systems.
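The fine-tuning step lends itself to a short illustration. The sketch below loads an ImageNet-pretrained ViT-L/16 through the `timm` library and swaps in a two-class head for nevus-versus-melanoma classification; the library choice, optimizer, and learning rate are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal fine-tuning sketch, assuming the `timm` library and a binary
# nevus (0) vs. melanoma (1) label setup; hyperparameters are placeholders.
import timm
import torch

# ImageNet-pretrained ViT-L/16 with a fresh 2-class classification head.
model = timm.create_model("vit_large_patch16_224", pretrained=True, num_classes=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)
criterion = torch.nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One gradient step on a batch of 224x224 dermoscopic images."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```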
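The confidence-filtering step can be sketched similarly: an auxiliary classifier scores each StyleGAN2-ADA output, and a synthetic image is kept only if the predicted probability for its intended class passes a threshold. The classifier, the 0.9 threshold, and the function names here are assumptions for illustration; the abstract does not specify them.

```python
# Sketch of confidence filtering for GAN-generated images, assuming an
# auxiliary classifier and a placeholder threshold of 0.9.
import torch

@torch.no_grad()
def filter_synthetic(images, intended_labels, classifier, threshold=0.9):
    """Return indices of synthetic images whose classifier confidence for
    the intended class (0 = nevus, 1 = melanoma) meets the threshold."""
    classifier.eval()
    probs = torch.softmax(classifier(images), dim=1)
    conf = probs[torch.arange(len(images)), intended_labels]
    return (conf >= threshold).nonzero(as_tuple=True)[0]
```

Only the retained images would then be merged with the real ISIC 2019 training data before fine-tuning.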
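Finally, a self-contained sketch of DeLong's test for comparing two correlated ROC-AUCs on the same test set, as used here to compare ViT-L/16 against each baseline. It follows the standard placement-value formulation of DeLong et al.; the variable names and the 0/1 label convention are assumptions.

```python
# DeLong's test for two models scored on the same cases (1 = melanoma).
import numpy as np
from scipy import stats

def delong_test(y_true, scores_a, scores_b):
    """Two-sided DeLong test for the difference between two correlated
    ROC-AUCs; returns (auc_a, auc_b, p_value)."""
    y_true = np.asarray(y_true)
    scores_a, scores_b = np.asarray(scores_a), np.asarray(scores_b)
    pos_a, neg_a = scores_a[y_true == 1], scores_a[y_true == 0]
    pos_b, neg_b = scores_b[y_true == 1], scores_b[y_true == 0]
    m, n = len(pos_a), len(neg_a)

    def placements(pos, neg):
        # psi(x, y) = 1 if x > y, 0.5 if x == y, 0 otherwise
        psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
        return psi.mean(axis=1), psi.mean(axis=0)  # per-positive V10, per-negative V01

    v10_a, v01_a = placements(pos_a, neg_a)
    v10_b, v01_b = placements(pos_b, neg_b)
    auc_a, auc_b = v10_a.mean(), v10_b.mean()
    s10, s01 = np.cov([v10_a, v10_b]), np.cov([v01_a, v01_b])
    var = (s10[0, 0] + s10[1, 1] - 2 * s10[0, 1]) / m \
        + (s01[0, 0] + s01[1, 1] - 2 * s01[0, 1]) / n
    z = (auc_a - auc_b) / np.sqrt(var)
    return auc_a, auc_b, 2 * stats.norm.sf(abs(z))
```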