Development and Internal Validation of an Explainable Machine-Learning Model to Predict 3-Year Overall Survival After Radical Cystectomy

Abstract

Background: This study aimed to develop and internally validate an explainable machine-learning model that uses routinely available clinicopathologic and laboratory variables to predict 3-year overall survival (OS) after radical cystectomy. Methods: We retrospectively included 300 patients who underwent radical cystectomy between January 2018 and December 2022. Predictors were selected in the training set using LASSO logistic regression followed by random-forest recursive feature elimination, which retained ten variables. Seven algorithms (logistic regression, k-nearest neighbors [KNN], support vector machine with a radial basis function kernel [SVM-RBF], random forest, XGBoost, LightGBM, and CatBoost) were trained on a 70% training set and evaluated on a 30% internal validation set. Discrimination, calibration, and clinical utility were assessed, and the final model was interpreted using Shapley additive explanations (SHAP). Results: In internal validation, AUCs ranged from 0.834 to 0.950. CatBoost achieved the best overall classification performance (AUC = 0.931, accuracy = 0.862, sensitivity = 0.647, specificity = 0.951, PPV = 0.846, and NPV = 0.867). SHAP analyses identified the tumor stage variables (T, N, and M stage) as the dominant drivers of predicted risk, with additional contributions from age, BMI, albumin, globulin, lymphocyte count, platelet count, and preoperative creatinine. Conclusions: We developed an internally validated, SHAP-interpretable CatBoost model for predicting 3-year OS after radical cystectomy. External validation and recalibration in independent cohorts are required before clinical use.
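The classification metrics quoted in the Results (accuracy, sensitivity, specificity, PPV, and NPV) all derive from the four cells of a binary confusion matrix. The sketch below shows those standard formulas in plain Python. The function name `classification_metrics` and the counts passed to it are hypothetical: the counts were chosen only so that the formulas yield values matching the rounded figures in the abstract. The abstract does not report the model's confusion matrix or state which outcome was treated as the positive class, so the actual counts may differ.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives at a fixed decision threshold.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall for the positive class
        "specificity": tn / (tn + fp),   # recall for the negative class
        "ppv": tp / (tp + fp),           # positive predictive value (precision)
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for illustration only; NOT taken from the article.
metrics = classification_metrics(tp=11, fp=2, tn=39, fn=6)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike AUC, all five metrics depend on the decision threshold applied to the predicted probabilities, so a model can rank below another on AUC and still perform better on these threshold-based measures.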
