A Hybrid AutoML Ensemble Integrating Conventional Learners and Gradient-Boosting Models for Multi-Outcome Prediction in ICU Patients with Pseudomonas aeruginosa
Abstract
Background
Carbapenem resistance in Pseudomonas aeruginosa is increasing in intensive care units (ICUs). To enhance antimicrobial stewardship and infection control, we aimed to develop and validate a real-time interpretable hybrid Automated Machine Learning (AutoML) ensemble for multi-outcome prediction.
Methods
We retrospectively analyzed 847 adult ICU admissions with P. aeruginosa isolates at a tertiary hospital in Hangzhou, China (January 2018 to December 2024). After a three-stage VTF-MI-L1 feature selection pipeline, XGBoost, LightGBM, CatBoost, random forests, and linear/logistic regression were used as base learners and combined via Bagging, Voting, Stacking, and Gradient Boosting. Nested five-fold cross-validation was used to assess model performance (AUC for classification; MSE, RMSE, MAE, and R² for regression). Interpretability was provided by SHAP values, and the inference latency was recorded.
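The Methods workflow (staged feature selection, an ensemble of base learners, nested five-fold cross-validation) can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. Several pieces are assumptions made for illustration: "VTF-MI-L1" is read here as a variance-threshold filter, then mutual-information ranking, then L1-penalized (Lasso) selection; scikit-learn's GradientBoostingRegressor stands in for XGBoost, LightGBM, and CatBoost so the example has no extra dependencies; the data are synthetic; and every hyperparameter value is arbitrary.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)
from sklearn.feature_selection import (SelectFromModel, SelectKBest,
                                       VarianceThreshold, mutual_info_regression)
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in data; the study's ICU cohort is not reproduced here.
X, y = make_regression(n_samples=150, n_features=30, n_informative=8,
                       noise=10.0, random_state=0)

# Voting ensemble over stand-in base learners (assumed, not the paper's exact set).
voting = VotingRegressor([
    ("gbr", GradientBoostingRegressor(n_estimators=30, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=30, random_state=0)),
    ("lin", LinearRegression()),
])

# Three-stage selection (assumed reading of VTF-MI-L1) followed by the ensemble.
# Keeping selection inside the pipeline means it is refit on each training fold,
# so no information from held-out folds leaks into feature selection.
pipeline = Pipeline([
    ("variance", VarianceThreshold(threshold=0.0)),
    ("mi", SelectKBest(partial(mutual_info_regression, random_state=0), k=20)),
    ("l1", SelectFromModel(Lasso(alpha=1.0, max_iter=10000))),
    ("ensemble", voting),
])

# Inner loop tunes the L1 strength; outer loop estimates generalization error.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=1)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=2)
search = GridSearchCV(pipeline,
                      param_grid={"l1__estimator__alpha": [0.1, 1.0]},
                      cv=inner_cv, scoring="neg_root_mean_squared_error")
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="r2")

print("Outer-fold R^2:", np.round(outer_scores, 3))
```

Wrapping the tuned pipeline in an outer cross-validation loop is what makes the procedure nested: the hyperparameters are chosen using only each outer training fold, so the outer scores are not biased by the tuning step.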
Results
For carbapenem resistance rate (CRR) prediction, the CatBoost regressor (cRMSD = 0.1663; r = 0.8849; R² ≈ 0.78) and the Voting Regressor (cRMSD = 0.1675; r = 0.8838) outperformed all other models (p < 0.05). The XGBoost regressor (XGB-R) achieved the best accuracy and computational efficiency for the last two tests of the CRR of P. aeruginosa (CRR-PA-Last2) (p < 0.05). In predicting ICU length of stay, XGB-R led with r = 0.9724, cRMSD = 55.7 d, and σ-ratio = 0.88, significantly surpassing the Bagging and CatBoost regressors (p < 0.05). XGB-R also yielded the lowest composite error for the ICU-to-death interval (cRMSD = 205.1 d; r = 0.7741; σ-ratio = 0.71), again outperforming the Bagging and CatBoost regressors (p < 0.05). Across all four regression outcomes, XGB-R obtained the lowest average rank (1.95), whereas the CatBoost and Voting regressors showed particular strengths in predicting resistance. SHAP analysis identified age, carbapenem exposure intensity, and duration of mechanical ventilation and catheterization as the key positive contributors. All top-ranked models required < 50 ms per inference, meeting bedside real-time requirements.
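The abstract does not define cRMSD, r, or σ-ratio. A common reading, assumed here, is that they are the Taylor-diagram statistics: the centered root-mean-square difference, the Pearson correlation, and the ratio of predicted to observed standard deviation. The sketch below computes them under that assumption; the function name `taylor_statistics` and the example values are hypothetical.

```python
import numpy as np

def taylor_statistics(y_true, y_pred):
    """Return r, centered RMS difference, and sigma ratio (pred / obs).

    Assumes the Taylor-diagram definitions with population standard
    deviations (ddof=0); the source abstract does not state its formulas.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sigma_obs = y_true.std()
    sigma_pred = y_pred.std()
    r = np.corrcoef(y_true, y_pred)[0, 1]
    # Remove each series' mean before differencing, so a constant bias
    # does not contribute to the error.
    centered_diff = (y_pred - y_pred.mean()) - (y_true - y_true.mean())
    crmsd = np.sqrt(np.mean(centered_diff ** 2))
    return {"r": r, "cRMSD": crmsd, "sigma_ratio": sigma_pred / sigma_obs}

# Hypothetical observed and predicted values, for illustration only.
observed = np.array([3.0, 7.0, 12.0, 5.0, 20.0, 9.0])
predicted = np.array([4.0, 6.5, 10.0, 6.0, 17.0, 9.5])
stats = taylor_statistics(observed, predicted)
print(stats)
```

Under these definitions the three quantities are linked by cRMSD² = σ_pred² + σ_obs² − 2·σ_pred·σ_obs·r, which is why they can be reported together on a single Taylor diagram. A σ-ratio below 1 indicates that the predictions vary less than the observations.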
Conclusions
The proposed hybrid AutoML ensemble delivered highly accurate, interpretable, and millisecond-level predictions of diverse resistance-related outcomes, underscoring its potential for ICU antimicrobial stewardship and infection control. Multicenter prospective studies are warranted to confirm the generalizability of these findings.