Comparative Study of XGBoost and Logistic Regression for Predicting Sarcopenia in Postsurgical Gastric Cancer Patients

Abstract

Background: Machine learning (ML) techniques, particularly XGBoost and logistic regression, have drawn growing attention for predicting sarcopenia in postsurgical gastric cancer patients. Sarcopenia, characterized by the progressive loss of skeletal muscle mass and strength, is a serious concern in these patients because it is associated with poor postoperative outcomes, including increased morbidity and mortality. This study used machine learning to build a risk prediction model for sarcopenia in gastric cancer patients undergoing gastrectomy, with the aim of facilitating early intervention and reducing postoperative complications. Methods: Gastric cancer patients who underwent surgery at a tertiary comprehensive hospital in Nanjing, China, between January 2022 and December 2023 were retrospectively included, and their clinical and follow-up data were collected. An XGBoost model and a multivariate logistic regression model were used to screen factors related to postoperative outcomes, and the results of the two models were compared. The area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity were calculated to evaluate the predictive value of the XGBoost model. The SHAP (SHapley Additive exPlanations) method was used to explain the XGBoost model and quantify the impact of each feature on its predictions. Results: A total of 231 postoperative gastric cancer patients were included, of whom 128 (55.4%) developed sarcopenia. The results of the univariate analysis and LASSO (Least Absolute Shrinkage and Selection Operator) regression were cross-validated, and 5 key study variables were ultimately selected: serum albumin, comorbid diabetes, operation style, nutritional score, and ECOG (Eastern Cooperative Oncology Group) performance status score.
The XGBoost model achieved a higher AUC (0.987, 95% CI: 0.976-0.998) than the logistic regression model (0.918, 95% CI: 0.873-0.963) in the training set. SHAP analysis showed that, in the XGBoost model, diabetes, nutritional score, and serum albumin had the greatest impact on predicted sarcopenia risk after gastric cancer surgery, with diabetes and nutritional score the most influential; ECOG performance status score ranked next, and operation style had the least impact. Conclusions: The machine learning-based sarcopenia prediction model constructed in this study provides a valuable decision support tool for clinical screening and intervention of sarcopenia.
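
The three evaluation metrics named in the Methods (AUC, sensitivity, specificity) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the 0.5 decision threshold, and the eight toy labels and scores are assumptions made purely for illustration; they are not study data. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, which is equivalent to the area under the ROC curve.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold  # assumed cut-off, not from the study
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical values): 1 = sarcopenia, 0 = no sarcopenia.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In practice the scores would be predicted probabilities from each fitted model, and comparing models on a held-out set rather than the training set gives a less optimistic picture of discrimination.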
