How is Artificial Intelligence Transforming the Skin Cancer Screening Pathway? An Umbrella Review


Abstract

Background: AI algorithms for skin cancer detection have shown performance comparable to clinicians in controlled settings, yet their real-world reliability, performance across diverse populations, and readiness for clinical deployment remain uncertain. This umbrella review synthesizes evidence across the screening pathway to characterize AI performance, identify equity gaps, and assess implementation readiness.

Methods: We searched PubMed, Web of Science, and CINAHL (November 6, 2024) for systematic reviews and meta-analyses evaluating AI for skin cancer detection, excluding narrative reviews, scoping reviews, and reviews not reporting diagnostic accuracy. Two investigators (LS, AB) independently screened studies and assessed quality using ROBIS; one (LS) extracted data with verification by a second (AB). Findings were synthesized narratively by screening phase. This study is registered with PROSPERO (CRD42024605934).

Results: Of 411 records identified, 37 (2008–2024) met inclusion criteria; 10 (27.0%) were judged low risk of bias, 22 (59.5%) high, and five (13.5%) unclear. Self-screening applications demonstrated marked performance variability (sensitivity 0–98%), with reduced sensitivity for melanoma detection reported across reviews. Primary care AI achieved moderate accuracy (sensitivity 60–84%, specificity 88–93%). Specialist dermoscopy-based AI achieved sensitivities comparable to dermatologists (82–91%), and histopathology AI achieved 90% sensitivity. AI augmentation increased clinician sensitivity by 6–8 percentage points, with greater benefit for generalists (+28%) than specialists (+2%). Engagement with skin tone and ethnicity increased but remained largely superficial, and >70% of datasets were from light-skinned populations. Evidence disproportionately targeted melanoma (>40% of reviews) despite it comprising <2% of skin cancers; no reviews employed implementation science frameworks.
Conclusions: Current evidence does not support unsupervised clinical deployment of AI-based skin cancer detection. Self-screening tools demonstrate inconsistent performance, equity gaps persist, and common non-melanoma skin cancers remain understudied. These findings support the need for stage-specific validation standards and performance reporting.
