Predictive Modeling of Morbidity and Mortality in Patients Hospitalized With COVID-19 and its Clinical Implications: Algorithm Development and Interpretation

Joshua M Wang
Wenke Liu
Xiaoshan Chen
Michael P McRae
John T McDevitt
David Fenyö

Listed in

Evaluated articles (ScreenIT)

Abstract

The COVID-19 pandemic began in early 2020 and placed significant strain on health care systems worldwide. There remains a compelling need to analyze factors that are predictive for patients at elevated risk of morbidity and mortality.

Objective

The goal of this retrospective study of patients who tested positive for COVID-19 and were treated at NYU (New York University) Langone Health was to identify clinical markers predictive of disease severity in order to assist in clinical triage decisions and to provide additional biological insights into disease progression.

Methods

The clinical activity of 3740 patients at NYU Langone Hospital was obtained between January and August 2020; patient data were deidentified. Models were trained on clinical data from different parts of each patient's hospital stay to predict three clinical outcomes: deceased, ventilated, or admitted to the intensive care unit (ICU).
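As an illustration of training on "different parts" of a hospital stay, the sketch below selects the readings that fall within the first or final 24 hours of a stay and summarizes each feature. It is a minimal sketch under stated assumptions: the record layout, the function name `window_features`, the feature names, and the use of a mean to summarize readings are all hypothetical, since the abstract does not state how readings were aggregated.

```python
from statistics import mean


def window_features(measurements, stay_hours, window="first", hours=24.0):
    """Average each feature over the first or final `hours` of a stay.

    measurements: list of (hours_since_admission, feature_name, value) tuples.
    stay_hours: total length of the hospital stay in hours.
    Summarizing by mean is an assumption made for illustration only.
    """
    if window == "first":
        lo, hi = 0.0, hours
    elif window == "final":
        lo, hi = max(0.0, stay_hours - hours), stay_hours
    else:
        raise ValueError("window must be 'first' or 'final'")

    # Group the readings that fall inside the chosen time window by feature.
    grouped = {}
    for t, name, value in measurements:
        if lo <= t <= hi:
            grouped.setdefault(name, []).append(value)
    return {name: mean(values) for name, values in grouped.items()}


# Hypothetical 72-hour stay with made-up readings (not study data).
stay = [
    (2.0, "respiration_rate", 18.0),
    (10.0, "respiration_rate", 20.0),
    (5.0, "spo2", 96.0),
    (60.0, "respiration_rate", 28.0),
    (70.0, "spo2", 88.0),
]
first_day = window_features(stay, stay_hours=72.0, window="first")
final_day = window_features(stay, stay_hours=72.0, window="final")
```

Each resulting dictionary would form one row of a feature matrix for the corresponding time window, paired with the patient's outcome label.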

Results

The XGBoost (eXtreme Gradient Boosting) model that was trained on clinical data from the final 24 hours of the hospital stay excelled at predicting mortality (area under the curve [AUC]=0.92; specificity=86%; and sensitivity=85%). Respiration rate was the most important feature, followed by SpO2 (peripheral oxygen saturation) and age of 75 years and over. Performance of this model in predicting the deceased outcome extended to 5 days prior, with AUC=0.81, specificity=70%, and sensitivity=75%. When only clinical data from the first 24 hours were used, AUCs of 0.79, 0.80, and 0.77 were obtained for the deceased, ventilated, and ICU-admitted outcomes, respectively. Although respiration rate and SpO2 levels offered the highest feature importance, other canonical markers, including diabetic history, age, and temperature, offered minimal gain. When lab values were incorporated, prediction of mortality benefited the most from blood urea nitrogen and lactate dehydrogenase (LDH). Features that were predictive of morbidity included LDH, calcium, glucose, and C-reactive protein.
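For readers unfamiliar with the metrics reported above, the following sketch shows how AUC, sensitivity, and specificity can be computed from predicted risk scores and true outcome labels. It is a plain-Python illustration, not the authors' evaluation code: the function names, the example labels and scores, and the 0.5 decision threshold are all assumptions chosen for the example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity (true positive rate) and specificity (true negative
    rate) when scores at or above `threshold` are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels (1 = deceased) and model scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
```

AUC summarizes ranking quality across all thresholds, whereas sensitivity and specificity depend on the specific threshold chosen, which is why the abstract reports all three.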

Conclusions

Together, this work summarizes efforts to systematically examine the importance of a wide range of features across different endpoint outcomes and at different hospitalization time points.

Version published to 10.2196/29514
Jul 9, 2021
Version published to 10.2196/preprints.29514
Apr 9, 2021
Version published to 10.1101/2020.12.02.20235879v5 on medRxiv
Mar 29, 2021
ScreenIT
Dec 5, 2020
SciScore for 10.1101/2020.12.02.20235879
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentences	Resources
All models were implemented in Python with built-in units in TensorFlow 2 and Keras32.	Python suggested: (IPython, RRID:SCR_001658)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
Version published to 10.1101/2020.12.02.20235879v3 on medRxiv
Dec 5, 2020
Version published to 10.1101/2020.12.02.20235879v4 on medRxiv
Dec 5, 2020
Version published to 10.1101/2020.12.02.20235879v1 on medRxiv
Dec 4, 2020
Version published to 10.1101/2020.12.02.20235879v2 on medRxiv
Dec 4, 2020

Related articles

Clinical characteristics associated with ARDS and mortality in patients with COVID-19 who received corticosteroid therapy

This article has 4 authors:
1. Madeleine Anthonisen
2. Elliot Fortin
3. Simon Rousseau
4. Karine Tremblay
This article has no evaluations. Latest version: May 19, 2025

Trends and Predictors of In-Hospital Mortality, Length of Stay and Hospitalization Costs among Oral and Oropharyngeal Cancer Patients

This article has 4 authors:
1. Shefali Viegas
2. Masoud MiriMoghaddam
3. Babak Bohlouli
4. Maryam Amin
This article has no evaluations. Latest version: Jun 24, 2025

The Models for End-stage Liver Disease as prognostic assessment and risk stratification tools in sepsis: a study based on MIMIC-Ⅳ database

This article has 8 authors:
1. Tuo Shen
2. Xingping Lv
3. Yezhou Shen
4. Wei Zhou
5. Xiaobin Liu
6. Qimin Ma
7. Shuyue Sheng
8. Feng Zhu
This article has no evaluations. Latest version: May 13, 2025
