A machine learning model for the early prediction of Gram-negative bloodstream infection in ICU patients
Abstract
Background Gram-negative bloodstream infection (GN-BSI) can induce fatal septic shock, and the growing problem of antimicrobial resistance contributes to high clinical mortality, particularly among intensive care unit (ICU) patients. Early identification of pathogens and timely antibiotic therapy are critical for patient outcomes. However, conventional diagnostic methods such as blood culture are time-consuming and can delay treatment. Furthermore, the implementation of molecular detection techniques in routine laboratories is often hindered by high costs and technical complexity. Machine learning (ML) offers a promising alternative for early prediction of GN-BSI. This study aims to develop an early prediction model for GN-BSI by integrating clinical and laboratory parameters from ICU patients using ML algorithms, thereby assisting in early diagnosis and treatment.
Methods This retrospective study used data from ICU patients admitted to the West District of the First Affiliated Hospital of Anhui Medical University between January and July 2025. Following data preprocessing and multiple imputation of missing values, the dataset was randomly divided into training and validation sets in a 7:3 ratio. Feature selection was performed using Lasso regression and multivariate logistic regression. Seven ML models were developed and evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, F1-score, positive predictive value (PPV), and negative predictive value (NPV). Model interpretability was further assessed using SHapley Additive exPlanations (SHAP) analysis.
Results This study ultimately included 405 ICU patients. Feature selection identified four variables: deep vein catheterization, continuous renal replacement therapy (CRRT), procalcitonin, and C-reactive protein (CRP).
Early prediction models for GN-BSI in ICU patients were constructed using seven machine learning algorithms. Among them, the XGBoost model demonstrated the best performance, with an AUC of 0.898, accuracy of 88.43%, F1-score of 0.783, PPV of 85.00%, and NPV of 89.10%. SHAP bar and beeswarm plots illustrated the contribution of the four variables to the outcome. SHAP dependence plots and force plots provided model interpretation at the feature level and the individual level, respectively.
Conclusions We developed, evaluated, and interpreted a machine learning model for predicting GN-BSI in ICU patients, which may facilitate timely intervention and treatment. The XGBoost model holds potential for clinical reference following further validation and refinement.
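For readers less familiar with the evaluation metrics named in this abstract, the following is a minimal Python sketch of how they are conventionally defined for a binary classifier. It is not the authors' code. The helper names (`ConfusionCounts`, `confusion_counts`, `classification_metrics`, `auc_by_pairs`) and the toy labels and scores are assumptions introduced purely for illustration, and none of the numbers are study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary outcome (1 = GN-BSI, 0 = no GN-BSI). Hypothetical helper."""
    tp: int
    fp: int
    tn: int
    fn: int


def confusion_counts(y_true, y_pred):
    """Tally true/false positives and negatives from 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return NaN instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c):
    """Standard threshold-based metrics computed from confusion counts."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


def auc_by_pairs(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
toy_truth = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.7]

toy_counts = confusion_counts(toy_truth, toy_pred)
toy_metrics = classification_metrics(toy_counts)
toy_auc = auc_by_pairs(toy_truth, toy_scores)
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would be preferable for larger datasets. PPV and NPV depend on outcome prevalence, so values obtained on one cohort may not transfer to a population with a different GN-BSI rate.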