Can Artificial Intelligence Match Dermoscopy in Melanoma Detection? Evidence from a Systematic Review and Meta-analysis of Pigmented Skin Lesions
Abstract
Background
Accurate risk stratification of pigmented skin lesions is essential for early melanoma detection and reducing unnecessary excisions. Although artificial intelligence (AI) is increasingly applied to dermoscopic image analysis, its diagnostic performance relative to dermoscopy remains uncertain.
Objective
To compare the diagnostic performance of AI, dermoscopy, and AI-assisted clinicians in the malignancy risk stratification of pigmented skin lesions.
Methods
PubMed, Embase, Web of Science, and Cochrane Library were systematically searched for studies evaluating AI, dermoscopy, or AI-assisted clinicians in diagnosing pigmented or melanoma-suspected skin lesions. Diagnostic performance metrics were calculated from extracted or reconstructed data, and study quality was assessed using QUADAS-2 and QUADAS-C.
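As a minimal sketch of what calculating metrics "from extracted or reconstructed data" can look like, the Python below derives sensitivity and specificity from a 2x2 table and rebuilds approximate counts when a study reports only the two rates and the group sizes. The names `TwoByTwo` and `reconstruct_counts` and all numbers are hypothetical illustrations, not the review's actual procedure or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of an index test against the reference standard (hypothetical helper)."""
    tp: int  # malignant lesions flagged as malignant
    fp: int  # benign lesions flagged as malignant
    fn: int  # malignant lesions missed
    tn: int  # benign lesions correctly cleared

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def reconstruct_counts(sensitivity: float, specificity: float,
                       n_malignant: int, n_benign: int) -> TwoByTwo:
    """Rebuild approximate counts from reported rates and group sizes.

    Assumption: rounding to the nearest integer is acceptable; a real
    review would verify the result against the published numbers.
    """
    tp = round(sensitivity * n_malignant)
    tn = round(specificity * n_benign)
    return TwoByTwo(tp=tp, fp=n_benign - tn, fn=n_malignant - tp, tn=tn)


# Illustrative numbers only, not taken from any included study.
table = reconstruct_counts(0.9, 0.8, n_malignant=50, n_benign=100)
print(table, table.sensitivity, table.specificity)
```

Each diagnostic arm (dermoscopy, AI alone, or AI-assisted clinician) would contribute one such table, which is then the unit of analysis for comparison.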
Results
A total of 2571 records were identified, and 10 studies were included in the main quantitative analysis, contributing 17 diagnostic arms: 10 dermoscopy arms, 6 AI-alone arms, and 1 AI-assisted clinician arm. In the dermoscopy group, sensitivity ranged from 0.418 to 0.966 and specificity from 0.293 to 0.975. In the AI group, sensitivity ranged from 0.164 to 0.968 and specificity from 0.374 to 0.983. In the single available study, AI-assisted clinicians showed a sensitivity of 1.000 and a specificity of 0.837. Overall, AI and dermoscopy showed overlapping diagnostic performance, although substantial variability was observed across AI algorithms and clinical settings. Deeks' funnel plots did not indicate significant publication bias in either the AI group or the dermoscopy group.
Conclusions
Autonomous AI showed diagnostic performance broadly comparable to dermoscopy. Although AI achieved slightly higher pooled specificity, its sensitivity was lower, indicating no clear clinical advantage. AI is therefore best used as an adjunct to dermatological evaluation rather than a standalone tool.