Research on an Interpretable Grey Wolf Optimization-Based Ensemble Machine Learning Model for Identifying Heterogeneity of Bladder Cancer Based on Immunological Microenvironment
Abstract
Bladder urothelial carcinoma (BLCA) exhibits marked heterogeneity, leading to variable treatment responses and prognoses across subtypes. Current molecular classification systems place little emphasis on immune-related genes, which limits their utility for guiding immunotherapy. Using TCGA transcriptome data, we identified 490 immune-related differentially expressed genes. The top 20% most representative genes were selected for subtype delineation via non-negative matrix factorization (NMF), which yielded two subtypes as the optimal partition. We then constructed an Exploration-Enhanced Grey Wolf Optimization-based Soft Voting (EGWO-SV) model that integrates Logistic Regression, XGBoost, and Random Forest as base learners. This model outperformed nine classical machine learning methods (AUC 97.11%, accuracy 90.00%, F1 score 88.24%). SHAP visualization highlighted CLEC2B and SULT1A1 as key genes for BLCA prognosis. Subtype analysis revealed significant survival differences, with the high-risk group associated with advanced stages. EGWO-SV enables efficient BLCA subtyping, supporting precise diagnosis, personalized immunotherapy, and improved understanding of tumor heterogeneity.
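The core idea of a grey-wolf-optimized soft-voting ensemble can be illustrated with a minimal sketch. It assumes the optimizer searches for the voting weights of the three base learners. The abstract does not describe the exploration-enhanced variant, so the code uses the standard Grey Wolf Optimizer update instead. The log-loss objective, the function names (`normalize`, `soft_vote`, `log_loss`, `gwo_weights`), and the toy probabilities are all hypothetical and are not taken from the study.

```python
import math
import random


def normalize(weights):
    """Scale non-negative weights to sum to 1 (equal weights if all are zero)."""
    total = sum(weights)
    if total <= 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def soft_vote(probas, weights):
    """Weighted mean of P(class 1); probas[k][i] is learner k on sample i."""
    w = normalize(weights)
    n_samples = len(probas[0])
    return [sum(wk * p[i] for wk, p in zip(w, probas)) for i in range(n_samples)]


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; lower is better."""
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss / len(y_true)


def gwo_weights(probas, y_true, n_wolves=12, n_iter=60, seed=0):
    """Standard GWO search over voting weights in [0, 1]^k (assumed objective: log-loss)."""
    rng = random.Random(seed)
    dim = len(probas)

    def fitness(w):
        return log_loss(y_true, soft_vote(probas, w))

    # The first wolf starts at equal weights, so the result is never worse
    # than plain unweighted soft voting; the others start at random positions.
    wolves = [[1.0] * dim]
    wolves += [[rng.random() for _ in range(dim)] for _ in range(n_wolves - 1)]
    leaders = []  # best-so-far (fitness, position) triples: alpha, beta, delta

    for t in range(n_iter):
        scored = leaders + [(fitness(w), list(w)) for w in wolves]
        leaders = sorted(scored, key=lambda s: s[0])[:3]
        a = 2.0 - 2.0 * t / n_iter  # decreases linearly from 2 towards 0
        for w in wolves:
            for d in range(dim):
                pulled = 0.0
                for _, lead in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    pulled += lead[d] - A * abs(C * lead[d] - w[d])
                # Average the three leader-guided moves, then clip to [0, 1].
                w[d] = min(max(pulled / len(leaders), 0.0), 1.0)

    scored = leaders + [(fitness(w), list(w)) for w in wolves]
    best_fit, best = min(scored, key=lambda s: s[0])
    return normalize(best), best_fit


# Made-up validation labels and base-learner probabilities (illustration only).
y_val = [1, 1, 1, 0, 0, 0, 1, 0]
p_lr = [0.9, 0.6, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]   # stands in for Logistic Regression
p_xgb = [0.8, 0.7, 0.7, 0.4, 0.1, 0.3, 0.9, 0.2]  # stands in for XGBoost
p_rf = [0.6, 0.5, 0.6, 0.5, 0.4, 0.4, 0.6, 0.5]   # stands in for Random Forest
probas = [p_lr, p_xgb, p_rf]

weights, loss = gwo_weights(probas, y_val)
baseline = log_loss(y_val, soft_vote(probas, [1.0, 1.0, 1.0]))
```

In a real pipeline the probability lists would come from base learners fitted on training data and evaluated on a held-out split. Comparing `loss` against `baseline` shows whether the optimized weights improve on equal-weight voting for that split.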