Machine Learning Prediction of Surgical Site Infections Following Major Gastrointestinal Surgery: A Comprehensive Model Development and Validation Study in Yemeni Patients

Abstract

Background: Surgical site infections (SSIs) impose a substantial burden on healthcare systems, particularly in resource-limited settings, where they prolong hospitalization, raise costs, and increase patient morbidity. Accurate prediction of SSI risk is essential for targeting prevention strategies and allocating resources, especially in constrained environments.

Methods: We conducted a retrospective cohort study using data from 525 patients who underwent major gastrointestinal surgery at Ibb University-affiliated hospitals in Yemen between 2018 and 2023. Four machine learning models (Logistic Regression, Random Forest, XGBoost, and Neural Network) were developed using 38 preoperative and intraoperative variables. Validation was temporal: data from 2018–2022 were used for model training (n = 420), and 2023 data (n = 105) were reserved for testing. Model performance was evaluated by area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), calibration metrics, and decision curve analysis. Subgroup analyses assessed model fairness across demographic and clinical strata.

Results: The observed SSI rate was 16.2%, consistent across the training and test sets. XGBoost achieved the highest predictive performance (AUROC: 0.934; 95% CI: 0.891–0.967; AUPRC: 0.809), outperforming the logistic regression (AUROC: 0.868, p = 0.012) and neural network (AUROC: 0.890, p = 0.038) models. Random Forest was also competitive (AUROC: 0.924; AUPRC: 0.787). Performance remained robust across critical subgroups, with XGBoost yielding an AUROC of 0.967 among elderly patients and Random Forest achieving an AUROC of 0.979 among diabetic patients. All models systematically overestimated SSI risk (calibration slopes > 2.0), though XGBoost exhibited the best calibration (Brier score: 0.080). Decision curve analysis confirmed clinical utility at probability thresholds of 15–35%.

Conclusion: Machine learning models, specifically XGBoost and Random Forest, can accurately predict SSI risk following major gastrointestinal surgery in the Yemeni healthcare context. Despite calibration limitations, these models demonstrate strong discriminative ability and clinical utility, supporting their use for risk stratification in resource-limited settings. The development of a simplified risk score offers a pragmatic alternative for implementation in environments with limited technological infrastructure.
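
For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch of how a Brier score and the net benefit used in decision curve analysis are commonly computed. It is not the authors' code: the function names and the toy data are hypothetical, and the net benefit shown is the standard textbook definition, TP/n − (FP/n) × pt/(1 − pt), which the abstract does not spell out.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data (NOT from the study): 1 = SSI occurred, 0 = no SSI.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.80, 0.10, 0.30, 0.60, 0.05, 0.20, 0.40, 0.25, 0.15, 0.10]

print(f"Brier score: {brier_score(y_true, y_prob):.4f}")
# Thresholds chosen inside the 15-35% range mentioned in the abstract.
for threshold in (0.15, 0.25, 0.35):
    model_nb = net_benefit(y_true, y_prob, threshold)
    all_nb = net_benefit_treat_all(y_true, threshold)
    print(f"threshold={threshold:.2f}  model={model_nb:.4f}  treat-all={all_nb:.4f}")
```

Under this definition, a model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.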
