Machine Learning Prediction Models for COVID-19 ICU Mortality: Model Development and Validation
Abstract
Introduction: Predicting the severity and outcome of COVID-19 is a challenging task. This study investigated the potential of predicting mortality using the SOFA score, CT scan findings based on the CO-RADS system, and biomarkers (including IL-6 and LDH) in intensive care unit (ICU)-admitted patients with COVID-19. Additionally, we developed multivariable models to enhance prognostic accuracy. Materials and methods: This retrospective cohort study was conducted on 426 COVID-19 patients admitted to the ICU of a tertiary hospital in Zanjan, Iran, from March to November 2020. The data were collected from patients' medical records. The correlation between variables and mortality was analyzed, and the predictability of mortality was assessed using the receiver operating characteristic (ROC) curve. Cut-off points, sensitivity, and specificity were determined. Conventional logistic regression methods and four machine learning (ML) algorithms were employed to develop mortality prediction models for ICU patients with COVID-19 using Python. The performance of these ML models was measured by the area under the receiver operating characteristic curve (AUC). Internal validation of the ML-based models was performed using an integrated 10-fold stratified cross-validation with bootstrap. Results: The mortality rate was 47.1% (n = 200). The mean SOFA (5.23 vs. 3.58, p < 0.001) and CO-RADS (5.54 vs. 5.03, p < 0.001) scores were significantly higher in the deceased group. Biomarkers were investigated as mortality predictors; IL-6 (AUC: 0.761, cut-off: >18.95 pg/mL, 65.5% sensitivity, 93.4% specificity) and LDH (AUC: 0.737, cut-off: >437.5 U/L, 63.0% sensitivity, 68.6% specificity) yielded the highest predictability for mortality, followed by SOFA (AUC: 0.701, cut-off: >3, 80.0% sensitivity, 46.5% specificity). Among comorbidities, hypertension and diabetes exhibited a significant correlation with mortality risk (p = 0.001 and p = 0.038, respectively).
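To make the single-marker ROC analysis concrete, the following is a minimal Python sketch of how an AUC, a cut-off, and the sensitivity and specificity at that cut-off can be computed for one marker. The abstract does not state how the cut-off points were selected; the Youden index (J = sensitivity + specificity - 1) used here is an assumption, and the function names and toy values are hypothetical illustrations, not study data.

```python
from typing import List, Tuple


def roc_auc(scores: List[float], died: List[int]) -> float:
    """AUC as the probability that a random non-survivor has a higher
    score than a random survivor (ties count as one half)."""
    pos = [s for s, y in zip(scores, died) if y == 1]
    neg = [s for s, y in zip(scores, died) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores: List[float], died: List[int],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'score > cutoff predicts death'."""
    tp = sum(1 for s, y in zip(scores, died) if y == 1 and s > cutoff)
    fn = sum(1 for s, y in zip(scores, died) if y == 1 and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, died) if y == 0 and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, died) if y == 0 and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores: List[float], died: List[int]) -> float:
    """Observed value maximizing J = sensitivity + specificity - 1.
    Assumption: the study may have used a different selection rule."""
    def j(c: float) -> float:
        se, sp = sens_spec(scores, died, c)
        return se + sp - 1.0
    return max(sorted(set(scores)), key=j)


# Hypothetical toy marker values (not patient data); 1 = died, 0 = survived.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_died = [0, 0, 1, 0, 1, 1, 1, 1]

auc = roc_auc(toy_scores, toy_died)
cutoff = youden_cutoff(toy_scores, toy_died)
sensitivity, specificity = sens_spec(toy_scores, toy_died, cutoff)
print(f"AUC={auc:.3f}, cut-off > {cutoff}, "
      f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

The "score > cut-off" rule mirrors the way the thresholds are reported (for example, >18.95 pg/mL for IL-6). A marker with high specificity but modest sensitivity at its cut-off, as reported for IL-6, rarely flags survivors as high risk but misses a share of the deaths.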
Using logistic regression methods and four machine learning (ML) algorithms, a six-factor model was developed involving age, IL-6, LDH, SOFA, CO-RADS, and ESR. This model achieved its highest AUC with the adaptive boosting (AdaBoost) algorithm (AUC: 0.960, sensitivity: 0.915, specificity: 0.885). According to the internal validation and calibration plots (Fig. 3), the closest agreement between predicted and observed probabilities was obtained with the categorical boosting (CatBoost) algorithm (Brier score: 0.086, AUC: 0.952). Conclusions: IL-6 and LDH showed the highest predictive value for mortality among biomarkers, and the SOFA score showed better predictability than the CT-based CO-RADS system. The six-factor AdaBoost model showed the best performance in predicting the mortality risk.
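Two of the validation ingredients named above, stratified 10-fold splitting and the Brier score, can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names are hypothetical, the bootstrap step and the model fitting are omitted, and the only figures taken from the abstract are the cohort size (426) and the number of deaths (200).

```python
import random
from typing import Dict, List


def stratified_folds(labels: List[int], k: int = 10,
                     seed: int = 0) -> List[List[int]]:
    """Assign sample indices to k folds so that each fold keeps roughly
    the same proportion of each outcome class."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds: List[List[int]] = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal indices round-robin within a class
    return folds


def brier_score(probs: List[float], outcomes: List[int]) -> float:
    """Mean squared difference between predicted probability and the
    observed 0/1 outcome; lower values indicate better calibration."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Outcome vector with the counts reported in the abstract: 200 deaths of 426.
labels = [1] * 200 + [0] * 226
folds = stratified_folds(labels, k=10)
deaths_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print("fold sizes:", [len(f) for f in folds])
print("deaths per fold:", deaths_per_fold)

# Reference point: always predicting the overall death rate.
prevalence = sum(labels) / len(labels)
baseline = brier_score([prevalence] * len(labels), labels)
print(f"Brier score of a constant prediction: {baseline:.3f}")
```

A constant prediction equal to the overall death rate p has a Brier score of p(1 - p), about 0.249 for these counts, which gives a reference point for reading the reported value of 0.086.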