Machine Learning for Early Prediction of Secondary Cancer Post-Radiotherapy
Abstract
Secondary cancer (SC) following radiotherapy (RT) represents a significant long-term risk for cancer survivors, necessitating accurate predictive models for early intervention. This study developed a machine learning (ML) model integrating clinical, pathological, and genomic data to predict SC incidence. The model leverages a dataset of 1,240 patients from population-based registries and clinical cohorts, incorporating features such as radiation dose, age at exposure, histology, and mutations (e.g., TP53, BRCA1/2). A Random Forest (RF) regression achieved near-perfect performance metrics (MSE = 0.001, R-squared = 0.99), with radiation dose (Gini importance = 0.42) and age at exposure (Gini importance = 0.38) identified as the most critical predictors. Predicted incidence rates for new patients, such as 15.2 per 10,000 for breast-to-lung SCs, are consistent with epidemiological trends. The model’s performance suggests potential utility in clinical settings for early detection of SC and risk prediction for new patients. This study highlights the potential of ML in personalized oncology while emphasizing caution in interpreting overly optimistic metrics.
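For readers who want to sanity-check metrics of the kind reported above, the following is a minimal sketch of how MSE and R-squared are conventionally computed. It is not the authors' code: the function names `mse` and `r_squared` and the incidence values are hypothetical and chosen only for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical incidence rates per 10,000 (illustrative, not study data).
observed = [12.0, 15.0, 9.0, 20.0]
predicted = [12.5, 14.5, 9.5, 19.5]

print("MSE:", mse(observed, predicted))
print("R-squared:", r_squared(observed, predicted))
```

MSE is expressed in squared units of the target, so a value such as 0.001 is hard to interpret without knowing the scale of the outcome. Both metrics can also look very favorable when computed on the same data used for training, which is one reason to treat near-perfect values with caution.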