Machine Learning–Based Prognostic Evaluation of Delayed Diagnosis and Prognostic Outcomes in Testicular Cancer
Abstract
Purpose: To develop and validate a machine learning–based prognostic model using real-world data from a tertiary care center in Mexico. The aims were to assess the impact of diagnostic delay; to identify clinical, histopathological, and biochemical predictors of disease progression and mortality in testicular cancer; and to determine whether a Random Forest model based on clinical stage and metastatic status can accurately predict these adverse outcomes in a low- and middle-income setting.

Methods: We retrospectively analyzed 223 testicular cancer cases, defining diagnostic delay as a symptom duration > 6 months. Predictors of progression and mortality were identified by multivariate analysis. A Random Forest model using clinical stage and metastatic status as inputs was trained in Python with an 80:20 train-test split to predict adverse outcomes and to assess feature importance.

Results: The Random Forest model achieved 83.3% accuracy for predicting progression and mortality, with feature importances of 79.1% for clinical stage and 20.9% for metastatic status. Among 223 patients (mean age 27.8 years), non-seminomatous tumors predominated (53.4%), and 24.7% were classified as poor risk by IGCCCG criteria. Diagnostic delay > 6 months occurred in 25.6% of patients and was strongly associated with mortality (OR 12.98, p < 0.001). High AFP, elevated LDH, and non-seminomatous histology were additional predictors of adverse outcomes.

Conclusions: A diagnostic delay > 6 months significantly increased mortality risk. A machine learning model using only clinical stage and metastatic status achieved 83.3% accuracy, highlighting the potential of simple, low-cost tools for early risk stratification and improved clinical decision-making in resource-limited settings.
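The modeling step described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract names only Python, so the use of scikit-learn, the hyperparameters, the random seeds, the integer coding of clinical stage, and the synthetic data are all assumptions made here. The data are randomly generated stand-ins for the 223-patient cohort, so the printed accuracy and importances will not match the reported 83.3%, 79.1%, and 20.9%.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study cohort): 223 rows, two features.
rng = np.random.default_rng(0)
n_patients = 223

# Hypothetical coding: clinical stage I-III as integers 1-3.
clinical_stage = rng.integers(1, 4, size=n_patients)
# Hypothetical rule: metastasis becomes more likely at higher stage.
metastasis = (rng.random(n_patients) < 0.15 * clinical_stage).astype(int)
# Hypothetical outcome (progression or death): risk rises with both features.
risk = 0.05 + 0.20 * (clinical_stage - 1) + 0.25 * metastasis
adverse_outcome = (rng.random(n_patients) < risk).astype(int)

X = np.column_stack([clinical_stage, metastasis])
y = adverse_outcome

# 80:20 train-test split, stratified so both sets keep a similar outcome rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Random Forest on the two features; n_estimators is an assumed setting.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Held-out accuracy and impurity-based feature importances.
accuracy = accuracy_score(y_test, model.predict(X_test))
importances = model.feature_importances_

print(f"Test accuracy: {accuracy:.3f}")
for name, value in zip(["clinical_stage", "metastasis"], importances):
    print(f"{name}: {value:.3f}")
```

With 223 cases, an 80:20 split leaves about 45 patients for testing, so a single accuracy figure carries wide uncertainty; repeated or cross-validated splits would give a more stable estimate. Impurity-based importances are normalized to sum to 1, which is consistent with the two reported percentages adding up to 100%.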