Early prediction of mortality risk among patients with severe COVID-19, using machine learning

Chuanyu Hu
Zhenqiu Liu
Yanfeng Jiang
Oumin Shi
Xin Zhang
Kelin Xu
Chen Suo
Qin Wang
Yujing Song
Kangkang Yu
Xianhua Mao
Xuefu Wu
Mingshan Wu
Tingting Shi
Wei Jiang
Lina Mu
Damien C Tully
Lei Xu
Li Jin
Shusheng Li
Xuejin Tao
Tiejun Zhang
Xingdong Chen

Listed in

Evaluated articles (ScreenIT)

Abstract

Background

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 infection, has been spreading globally. We aimed to develop a clinical model for early prediction of the outcome of patients with severe COVID-19 infection.

Methods

Demographic, clinical and first laboratory findings after admission of 183 patients with severe COVID-19 infection (115 survivors and 68 non-survivors from the Sino-French New City Branch of Tongji Hospital, Wuhan) were used to develop the predictive models. Machine learning approaches were used to select the features and predict the patients’ outcomes. The area under the receiver operating characteristic curve (AUROC) was applied to compare the models’ performance. A total of 64 patients with severe COVID-19 infection from the Optical Valley Branch of Tongji Hospital, Wuhan, were used to externally validate the final predictive model.
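
As an illustration of how an AUROC can be used to compare candidate models, the following is a minimal sketch and not the authors' code. It computes the AUROC as the fraction of (non-survivor, survivor) pairs in which the non-survivor receives the higher predicted risk, counting ties as half. The function name `auroc` and the toy labels and scores are assumptions made up for this example; they are not study data.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = non-survivor, 0 = survivor.
labels = [1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]  # predicted risks, model A
model_b = [0.6, 0.3, 0.7, 0.5, 0.4, 0.2, 0.8]  # predicted risks, model B

print("model A AUROC:", round(auroc(labels, model_a), 3))
print("model B AUROC:", round(auroc(labels, model_b), 3))
```

A model whose AUROC is closer to 1 ranks non-survivors above survivors more consistently; a value of 0.5 corresponds to ranking at chance level.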

Results

The baseline characteristics and laboratory tests were significantly different between the survivors and non-survivors. Four variables (age, high-sensitivity C-reactive protein level, lymphocyte count and d-dimer level) were selected by all five models. Given the similar performance among the models, the logistic regression model was selected as the final predictive model because of its simplicity and interpretability. The AUROC of the external validation set was 0.881. The sensitivity and specificity were 0.839 and 0.794 for the validation set, when using a probability of death of 50% as the cutoff. A risk score based on the selected variables can be used to assess the mortality risk. The predictive model is available at [https://phenomics.fudan.edu.cn/risk_scores/].
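
To make the role of the cutoff concrete, here is a minimal sketch of how sensitivity and specificity follow from predicted probabilities of death once a cutoff is fixed. It is an illustration under stated assumptions, not the authors' implementation: the function name `sensitivity_specificity`, the choice to treat a probability equal to the cutoff as a predicted death, and the toy data are all hypothetical.

```python
def sensitivity_specificity(labels, probs, cutoff=0.5):
    """Return (sensitivity, specificity) when patients with a predicted
    probability of death >= cutoff are classified as predicted deaths.
    The >= convention is an assumption; the source does not specify it."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_death = p >= cutoff
        if y == 1:  # actual non-survivor
            if predicted_death:
                tp += 1
            else:
                fn += 1
        else:  # actual survivor
            if predicted_death:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case in each outcome class")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT from the study): 1 = non-survivor, 0 = survivor.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.92, 0.71, 0.55, 0.38, 0.64, 0.41, 0.22, 0.10, 0.05]

sens, spec = sensitivity_specificity(labels, probs, cutoff=0.5)
print("sensitivity:", sens, "specificity:", spec)
```

Sensitivity is the share of non-survivors flagged as high risk, and specificity is the share of survivors not flagged, so both depend on the chosen cutoff as well as on the model.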

Conclusions

Age, high-sensitivity C-reactive protein level, lymphocyte count and d-dimer level of COVID-19 patients at admission are informative for the patients’ outcomes.

SciScore for 10.1101/2020.04.13.20064329:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	Consent: Written informed consent was waived by the Ethics Commission of the designated hospital for emerging infectious diseases [8]. IRB: Written informed consent was waived by the Ethics Commission of the designated hospital for emerging infectious diseases [8].
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

No key resources detected.

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Our study also has limitations. First, the predictive models were constructed based on a relatively small sample size; the interpretation of our findings might be limited. Second, due to the retrospective study design, not all the laboratory tests were performed in all the patients. Some of them might be deleted in the data preprocessing procedure and their roles might be underestimated in predicting patients’ outcomes. Third, patients were sometimes transferred from other hospitals to the two branches of Tongji hospitals, although we excluded patients who did not meet the inclusion criteria. The values of the laboratory tests might be biased by prior antiviral treatment in these patients. Finally, the patients in the derivation set and the validation set were from Tongji Hospital, which is one of the hospitals with a high level of medical care in China. Some critically ill patients recovered here might die in other hospitals with suboptimal or typical levels of medical care. The cutoff for predicting death should be <50% (e.g., defining patients who have a >30% probability of death as high-risk patients) in these settings. In summary, using available clinical data, we developed a robust machine learning model to predict the outcome of COVID-19 patients early. Our model and the accompanying web application are of importance for clinicians to identify patients at high risk of death and are therefore critical for the prevention and control of COVID-19.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Version published to 10.1093/ije/dyaa171
Sep 23, 2020

SciScore for 10.1101/2020.04.13.20064329:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	Written informed consent was waived by the Ethics Commission of the designated hospital for emerging infectious diseases.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Cost curves of the predictive model. Cost curve is a graphical technique for visualizing the performance (expected cost) of 2-class classifiers over a range of possible class distributions and cutoffs.	Cost suggested: (COST, SCR_014098)
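
For readers unfamiliar with cost curves, the sketch below shows the standard construction in which one cutoff of a 2-class classifier becomes a straight line: normalized expected cost = FNR x PC + FPR x (1 - PC), where PC (the "probability cost") combines the prevalence of the positive class with the two misclassification costs. This is a hedged illustration, not the authors' code. The function names are hypothetical, and the single operating point is derived from the validation sensitivity (0.839) and specificity (0.794) reported in the abstract.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """x-axis of a cost curve: share of the total expected cost that is
    attributable to the positive class (here, non-survivors)."""
    weighted_pos = p_pos * cost_fn
    weighted_neg = (1.0 - p_pos) * cost_fp
    return weighted_pos / (weighted_pos + weighted_neg)


def normalized_expected_cost(fnr, fpr, pc):
    """y-axis of a cost curve for one cutoff: a straight line from
    (0, FPR) to (1, FNR) as the probability cost pc goes from 0 to 1."""
    return fnr * pc + fpr * (1.0 - pc)


# Operating point at the 50% cutoff, from the abstract's validation results.
fnr = 1.0 - 0.839  # false-negative rate = 1 - sensitivity
fpr = 1.0 - 0.794  # false-positive rate = 1 - specificity

for pc in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(pc, round(normalized_expected_cost(fnr, fpr, pc), 4))
```

Each cutoff yields its own line; the lower envelope of those lines over all cutoffs gives the cost curve of the classifier, which is how the technique covers a range of class distributions and cutoffs at once.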

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

About SciScore

SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources.

Version published to 10.1101/2020.04.13.20064329 on medRxiv
Apr 19, 2020

Related articles

A Preliminary Prognostic Model for Predicting Poor Prognosis in COVID-19 Integrating Lung Epithelial Injury (KL-6) with Routine Care Markers

This article has 7 authors:
1. Yunlai Liang
2. Kun Wang
3. Lu Long
4. Qizhuo Hou
5. Wenze Yu
6. Kangkang Huang
7. Bin Yi
This article has no evaluations. Latest version: Feb 3, 2026

Development and Deployment of a Machine Learning–Based Predictive Model for COVID-19 Infection Using Patient Demographic and Symptom Data in Nigeria

This article has 10 authors:
1. Olanrewaju Eniade
2. Ezekiel Ukwenga
3. Uchenna Akuka
4. Opeyemi Adeniyi
5. Elonna Obak
6. Omolola Adeagbo
7. Peter Babatunde Olaitan
8. Rita Ayanbolade Olowe
9. Tolulope Opakunle
10. Olugbenga Adekunle Olowe
This article has no evaluations. Latest version: Jan 25, 2026

Prognostic Assessment of Sepsis-Induced Acute Respiratory Distress Syndrome Using an Early Warning Model

This article has 4 authors:
1. Zechao Huang
2. Qikai Kong
3. Xin Lian
4. Fengyong Yang
This article has no evaluations. Latest version: Feb 27, 2026
