Interpretable Machine Learning Models for Bladder Cancer Overall Survival Prediction: Development and External Validation via SEER Database and Chinese Cohort Analysis

Saimaitikari Abudoubari
Abudouresuli Tuersun
Sailidan Mutailipu
Wenbin Chen
Qiange Li
Mayidili Nijiati
Xiaoguang Zou

Abstract

Objective: We developed interpretable machine learning (ML) models to predict overall survival (OS) in bladder cancer patients, with the aim of making the modeling results more interpretable and transparent.

Methods: We collected clinical and pathological information on bladder cancer patients from the SEER database and split it into training and validation sets in a 7:3 ratio. We also obtained an external validation cohort from Kashgar First People's Hospital in Xinjiang, China. We performed LASSO regression and Cox regression analyses to identify relevant risk factors, then used the selected factors to develop a CoxPH model and six ML models: Random Survival Forest (RSF), gradient boosting with component-wise linear models (GLMboost), decision tree (dt), boosted tree (bt), DeepSurv, and neural multi-task logistic regression (NMTLR). We evaluated the predictive performance of these models using the concordance index (C-index), the area under the cumulative/dynamic curve (AUC), the integrated Brier score, and the Kolmogorov-Smirnov (KS) statistic. For interpretability assessment, we employed three complementary methods: (1) time-dependent variable importance to quantify feature contribution across follow-up periods; (2) partial dependence survival plots to visualize individual variable effects; and (3) aggregated survival SHapley Additive exPlanations (SurvSHAP) plots with mean absolute deviation metrics to validate the stability of feature impact at both the individual and population levels.

Results: The final ML model comprises 14 factors, including age, AJCCStage, chemotherapy, Mstage, marital status, Tstage, bone metastasis (BoneMets), stage, radiation, histology, liver metastasis (liverMets), Nstage, and sex. The predictive models demonstrate significant discriminative ability, with the boosted tree model performing best. The AUCs for 1-year, 3-year, and 5-year OS were above 0.770 in the training, validation, and external validation sets, and the overall Brier score was consistently below 0.180. The interpretability analysis of the boosted tree model further indicated that AJCCStage, age, chemotherapy, stage, Mstage, and marital status were the most influential predictors, as quantified by SurvSHAP values and time-dependent importance weights, with their effects visually confirmed by partial dependence survival curves.

Conclusions: The boosted tree prognostic model performed best and can be used to predict OS in bladder cancer patients, helping physicians assess patients' overall survival more accurately and providing a useful reference for diagnosis, treatment, and prognosis evaluation.
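To make the C-index used above concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the authors' implementation: the function name `harrell_c_index`, the toy data, and the choice to skip pairs with tied follow-up times are all assumptions made for this example.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; higher means higher predicted risk
    (Hypothetical helper for illustration; tied times are skipped.)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter time ends in an observed event (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death got the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four patients, the second one is censored.
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.1, 0.7]
print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.75 on this toy data
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk. Production analyses would normally rely on an established survival library rather than this quadratic-time loop.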

Version published to 10.21203/rs.3.rs-8591156/v1 on Research Square
Feb 10, 2026

