A Hybrid AutoML Ensemble Integrating Conventional Learners and Gradient-Boosting Models for Multi-Outcome Prediction in ICU Patients with Pseudomonas aeruginosa

LV Xiao-chun
Ren Qi
Zhu Lihong
CHEN Kun
WANG Jian-bing
CHEN Fang
JIN Kai-ling
LIN Kai

Abstract

Background

Carbapenem resistance in Pseudomonas aeruginosa is increasing in intensive care units (ICUs). To enhance antimicrobial stewardship and infection control, we aimed to develop and validate a real-time interpretable hybrid Automated Machine Learning (AutoML) ensemble for multi-outcome prediction.

Methods

We retrospectively analyzed 847 adult ICU admissions with P. aeruginosa isolates at a tertiary hospital in Hangzhou, China (January 2018 to December 2024). After a three-stage VTF-MI-L1 feature selection pipeline, XGBoost, LightGBM, CatBoost, random forests, and linear/logistic regression were used as base learners and combined via Bagging, Voting, Stacking, and Gradient Boosting. Nested five-fold cross-validation was used to assess model performance (AUC for classification; MSE, RMSE, MAE, and R² for regression). Interpretability was provided by SHAP values, and the inference latency was recorded.
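The evaluation design described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, "VTF-MI-L1" is assumed here to mean a variance-threshold filter, a mutual-information filter, and L1-based selection (the abstract does not expand the acronym), scikit-learn's `GradientBoostingRegressor` stands in for XGBoost/LightGBM/CatBoost, and every hyperparameter value is hypothetical.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)
from sklearn.feature_selection import (SelectFromModel, SelectKBest,
                                       VarianceThreshold, mutual_info_regression)
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in data; these are NOT the study's features or outcomes.
X, y = make_regression(n_samples=300, n_features=30, n_informative=8,
                       noise=10.0, random_state=0)

# Voting ensemble over a conventional learner and two tree-based learners.
# GradientBoostingRegressor is a stand-in for the boosting libraries named above.
voter = VotingRegressor([
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gbr", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ("lin", LinearRegression()),
])

# Assumed three-stage selection: variance filter -> mutual information -> L1.
# Keeping selection inside the pipeline refits it on each training fold,
# which avoids leaking held-out data into the selected feature set.
pipe = Pipeline([
    ("variance", VarianceThreshold(threshold=0.0)),
    ("mi", SelectKBest(partial(mutual_info_regression, random_state=0), k=20)),
    ("l1", SelectFromModel(Lasso(alpha=1.0))),
    ("model", voter),
])

inner = KFold(n_splits=5, shuffle=True, random_state=1)  # tunes hyperparameters
outer = KFold(n_splits=5, shuffle=True, random_state=2)  # estimates performance

search = GridSearchCV(pipe, {"mi__k": [10, 20]}, cv=inner,
                      scoring="neg_root_mean_squared_error")

# Each outer fold reruns the whole inner search on its own training split.
outer_scores = cross_val_score(search, X, y, cv=outer,
                               scoring="neg_root_mean_squared_error")
rmse = -outer_scores
print("Outer-fold RMSE:", np.round(rmse, 2))
```

The point of nesting is that the outer folds are never seen during tuning, so the reported error is not biased by the hyperparameter search.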

Results

For carbapenem resistance rate (CRR) prediction, the CatBoost regressor (cRMSD = 0.1663; r = 0.8849; R² ≈ 0.78) and the Voting Regressor (cRMSD = 0.1675; r = 0.8838) outperformed all other models (p < 0.05). For the CRR of P. aeruginosa over the last two tests (CRR-PA-Last2), the XGBoost regressor (XGB-R) achieved the best accuracy and computational efficiency (p < 0.05). In predicting ICU length of stay, XGB-R led with r = 0.9724, cRMSD = 55.7 d, and σ-ratio = 0.88, significantly surpassing the Bagging and CatBoost regressors (p < 0.05). XGB-R also yielded the lowest composite error for the ICU-to-death interval (cRMSD = 205.1 d; r = 0.7741; σ-ratio = 0.71), again outperforming Bagging and CatBoost (p < 0.05). Across all four regression outcomes, XGB-R obtained the lowest average rank (1.95), whereas the CatBoost and Voting regressors showed particular strengths in predicting resistance. SHAP analysis identified age, carbapenem exposure intensity, and duration of mechanical ventilation and catheterization as the key positive contributors. All top-ranked models required < 50 ms per inference, meeting bedside real-time requirements.
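The abstract does not define cRMSD, r, or σ-ratio. A minimal sketch follows, assuming the usual Taylor-diagram conventions: r is the Pearson correlation, σ-ratio is the standard deviation of the predictions divided by that of the observations, and cRMSD is the centered root-mean-square difference. Under those assumptions the three are linked by cRMSD² = σ_p² + σ_o² − 2·σ_p·σ_o·r. The function name `taylor_stats` and the example values are hypothetical.

```python
import numpy as np


def taylor_stats(y_true, y_pred):
    """Return Taylor-diagram summary statistics (assumed definitions)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    sigma_o = y_true.std()  # population SD of observations
    sigma_p = y_pred.std()  # population SD of predictions
    r = np.corrcoef(y_true, y_pred)[0, 1]

    # Centered RMSD: remove each series' mean before differencing,
    # so a constant bias in the predictions does not contribute.
    diff = (y_pred - y_pred.mean()) - (y_true - y_true.mean())
    crmsd = np.sqrt(np.mean(diff ** 2))

    return {"r": r, "cRMSD": crmsd, "sigma_ratio": sigma_p / sigma_o,
            "sigma_o": sigma_o, "sigma_p": sigma_p}


# Illustrative values only; not data from the study.
stats = taylor_stats([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 1.8, 3.2, 3.9, 5.1])
print(stats)
```

Under these definitions, a σ-ratio below 1 means the predictions vary less than the observations, which is how values such as 0.88 and 0.71 would be read.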

Conclusions

The proposed hybrid AutoML ensemble delivered highly accurate, interpretable, and millisecond-level predictions of diverse resistance-related outcomes, underscoring its potential for ICU antimicrobial stewardship and infection control. Multicenter prospective studies are warranted to confirm the generalizability of these findings.

Version published to 10.1101/2025.06.19.25329970 on medRxiv
Jun 20, 2025

