Gradient Boosting Machine based prediction of chemotherapy response and role of p53 mutational and smoking status for progression free survival in metastatic colorectal cancer

Abstract

Background: Identifying predictors of response or progression after first-line chemotherapy for stage 4 colorectal cancer remains a challenge. This study evaluates how tumor p53 mutational status and patient smoking status relate to patient outcomes, using several machine learning methods.

Material and methods: We consecutively recruited all patients diagnosed with metastatic colorectal cancer at an academic center within a specified time period. Response to first-line chemotherapy and its associated factors were assessed with several machine learning models, and the most accurate model was then optimized further. Common clinical features, MMR, p53, and RAS status were also tested for correlation with the outcome. Feature importance and calibration plots were generated, and univariate and multivariate Cox models were used to identify factors associated with progression-free survival (PFS).

Results: A total of 101 patients with newly diagnosed metastatic colorectal cancer who were starting first-line chemotherapy were included. The median age was 62 years, and 69% of patients were male. We evaluated 15 machine learning models for predicting the binary outcome of best response to chemotherapy; among them, LightGBM had the highest baseline accuracy (0.71). Tuning the LightGBM model improved accuracy to 0.79, with a macro-average AUC of 0.82. Age at diagnosis, maximum metastatic dimension of the cancer, and metastatic status at diagnosis were the three most important features. Genetic variables did not show significant feature importance in the response analysis. Survival analysis revealed an association between PFS and p53 mutation status (Exp(B) = 0.52, Wald = 6.98, P = 0.008) and between PFS and smoking pack-years (Exp(B) = 0.99, Wald = 4.28, P = 0.039).

Discussion: Using LightGBM, we developed a predictive model with good accuracy for assessing response to first-line treatment. If confirmed and further improved, such a model could help identify responders to first-line chemotherapy among metastatic colorectal cancer patients and suggest alternative chemotherapy options for non-responders. Our findings also highlight the prognostic importance of genetic features, particularly p53 mutation status, and of smoking pack-years for PFS duration in this setting.
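
The Cox estimates in the Results are reported as Exp(B) (the hazard ratio) with a Wald statistic and a P value. As a reading aid, the sketch below shows how these three numbers relate to one another. It assumes the Wald statistic is a chi-square statistic with one degree of freedom, which the abstract does not state, and it works only from the rounded values printed above. The function names are hypothetical, and this is a back-of-the-envelope check, not the authors' analysis code.

```python
import math


def wald_p_value(wald):
    """Two-sided P value for a Wald chi-square statistic with 1 df (assumed)."""
    # For 1 df, P(chi2 > w) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(wald / 2.0))


def approx_hazard_ratio_ci(exp_b, wald, z=1.96):
    """Approximate 95% CI for a hazard ratio, back-calculated from Exp(B) and Wald.

    Uses Wald = (B / SE)^2, so SE = |B| / sqrt(Wald). Because the inputs are
    rounded, the result is only a rough reconstruction.
    """
    b = math.log(exp_b)
    se = abs(b) / math.sqrt(wald)
    return math.exp(b - z * se), math.exp(b + z * se)


def hazard_ratio_for_exposure(exp_b_per_unit, units):
    """Hazard ratio implied by a per-unit Exp(B) over a given number of units."""
    return exp_b_per_unit ** units


# Values copied from the abstract (rounded as reported there).
p53_p = wald_p_value(6.98)
smoking_p = wald_p_value(4.28)
p53_ci = approx_hazard_ratio_ci(0.52, 6.98)
hr_20_pack_years = hazard_ratio_for_exposure(0.99, 20)

print(f"p53 P value:      {p53_p:.3f}")
print(f"smoking P value:  {smoking_p:.3f}")
print(f"p53 HR 95% CI:    {p53_ci[0]:.2f} to {p53_ci[1]:.2f}")
print(f"HR over 20 pack-years: {hr_20_pack_years:.2f}")
```

Under the one-degree-of-freedom assumption, the recomputed P values agree with the reported 0.008 and 0.039 after rounding. The back-calculated interval for p53 (roughly 0.32 to 0.84) and the 20 pack-year figure are illustrations derived from rounded inputs; they are not results reported by the study, and the per-pack-year Exp(B) of 0.99 is too coarsely rounded to support a precise reconstruction.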