Predicting Distant Melanoma Metastasis at Diagnosis Using Machine Learning
Abstract
Distant melanoma metastasis at the time of diagnosis is uncommon but has major implications for patient prognosis and treatment selection. However, few tools can reliably predict the risk of distant metastasis at initial presentation. Here, we developed and evaluated machine learning models to predict distant melanoma metastasis using routinely captured clinicopathologic and demographic variables across all histologic subtypes. Using data from the National Cancer Institute Surveillance, Epidemiology, and End Results (SEER) program from 2010 to 2022, we identified adults aged 20 to 90 years with melanoma as the first and only primary malignancy (n=51,285). The Explainable Boosting Machine achieved a strong balance of discrimination and precision (AUROC = 0.947, AUPRC = 0.610, Precision = 0.793, Brier score = 0.015). At 90% sensitivity, specificity was 0.843, with consistent performance across cross-validation folds. Clinicopathologic variables, including T stage, Breslow thickness, ulceration, and mitotic activity, contributed the largest share of predictive signal across descriptive, regression-based, and SHAP analyses, with smaller contributions from demographic factors. Decision curve analysis supported clinical utility, showing a net reduction of 88.3 per 100 patients and a standardized net benefit of 0.541. This model could be used to identify patients at sufficiently elevated risk to justify staging PET/CT despite otherwise localized clinical presentation. Cost-consequence analysis further showed that imaging true- and false-positive patients at sensitivity thresholds of 85% to 95% nearly doubled downstream imaging cost. We deployed the final model as an online calculator to support exploration of individualized risk estimates (https://melanoma-calculator.streamlit.app/).
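
For readers unfamiliar with the evaluation quantities named above, the following is a minimal sketch of how specificity at a fixed sensitivity and a standardized net benefit are conventionally computed. The abstract does not give its formulas, so this assumes the standard decision curve definitions (net benefit = TP/n − FP/n × pt/(1 − pt), standardized by dividing by prevalence). The function names and the toy labels and probabilities are hypothetical illustrations, not the authors' code or SEER data.

```python
import math

def confusion_at(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when cases with probability >= threshold are flagged."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        flagged = p >= threshold
        if y == 1 and flagged:
            tp += 1
        elif y == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def threshold_for_sensitivity(y_true, y_prob, target_sens):
    """Highest cutoff that still flags at least target_sens of the positives."""
    pos = sorted((p for y, p in zip(y_true, y_prob) if y == 1), reverse=True)
    # Small epsilon guards against floating-point overshoot before rounding up.
    k = max(1, math.ceil(target_sens * len(pos) - 1e-9))
    return pos[k - 1]

def specificity_at_sensitivity(y_true, y_prob, target_sens):
    """Specificity obtained at the cutoff chosen by threshold_for_sensitivity."""
    cutoff = threshold_for_sensitivity(y_true, y_prob, target_sens)
    tp, fp, tn, fn = confusion_at(y_true, y_prob, cutoff)
    return tn / (tn + fp)

def net_benefit(y_true, y_prob, pt):
    """Decision-curve net benefit at threshold probability pt (assumed formula)."""
    tp, fp, tn, fn = confusion_at(y_true, y_prob, pt)
    n = len(y_true)
    return tp / n - (fp / n) * (pt / (1 - pt))

def standardized_net_benefit(y_true, y_prob, pt):
    """Net benefit divided by prevalence, so a perfect model scores 1."""
    prevalence = sum(y_true) / len(y_true)
    return net_benefit(y_true, y_prob, pt) / prevalence

# Toy data for illustration only (1 = distant metastasis at diagnosis).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1, 0.1, 0.05]

if __name__ == "__main__":
    print(specificity_at_sensitivity(toy_labels, toy_probs, 0.75))
    print(standardized_net_benefit(toy_labels, toy_probs, 0.5))
```

Note that the cutoff chosen to reach a target sensitivity and the threshold probability used in decision curve analysis are distinct quantities: the first is read off the model's score distribution, while the second expresses how many false positives a clinician would accept per true positive.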