Systematic evaluation and external validation of 22 prognostic models among hospitalised adults with COVID-19: an observational cohort study

Rishi K. Gupta
Michael Marks
Thomas H.A. Samuels
Akish Luintel
Tommy Rampling
Humayra Chowdhury
Matteo Quartagno
Arjun Nair
Marc Lipman
Ibrahim Abubakar
Maarten van Smeden
Wai Keong Wong
Bryan Williams
Mahdad Noursadeghi

Listed in: Evaluated articles (ScreenIT)

Abstract

The number of proposed prognostic models for coronavirus disease 2019 (COVID-19) is growing rapidly, but it is unknown whether any are suitable for widespread clinical implementation.

We independently externally validated the performance of candidate prognostic models, identified through a living systematic review, among consecutive adults admitted to hospital with a final diagnosis of COVID-19. We reconstructed candidate models as per original descriptions and evaluated performance for their original intended outcomes using predictors measured at the time of admission. We assessed discrimination, calibration and net benefit, compared to the default strategies of treating all and no patients, and against the most discriminating predictors in univariable analyses.
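As a rough illustration of the net benefit comparison described above, the following Python sketch computes net benefit at a single threshold probability for a model and for the "treat all" strategy ("treat none" has a net benefit of zero by definition). The function names and the toy data are hypothetical and are not taken from the study. The formula is the standard decision curve analysis definition, which the abstract itself does not spell out.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    Standard decision-curve definition (an assumption here, not quoted
    from the study): TP/n - FP/n * threshold / (1 - threshold).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)
    return tp / n - fp / n * odds


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy of treating every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data: 1 = met the endpoint, 0 = did not.
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
risks = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

for pt in (0.2, 0.5):
    print(pt, net_benefit(outcomes, risks, pt), net_benefit_treat_all(outcomes, pt))
```

Evaluating these functions over a range of thresholds gives the decision curves against which a model, or a single predictor converted to a risk, can be compared.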

We tested 22 candidate prognostic models among 411 participants with COVID-19, of whom 180 (43.8%) and 115 (28.0%) met the endpoints of clinical deterioration and mortality, respectively. Highest areas under receiver operating characteristic (AUROC) curves were achieved by the NEWS2 score for prediction of deterioration over 24 h (0.78, 95% CI 0.73–0.83), and a novel model for prediction of deterioration <14 days from admission (0.78, 95% CI 0.74–0.82). The most discriminating univariable predictors were admission oxygen saturation on room air for in-hospital deterioration (AUROC 0.76, 95% CI 0.71–0.81), and age for in-hospital mortality (AUROC 0.76, 95% CI 0.71–0.81). No prognostic model demonstrated consistently higher net benefit than these univariable predictors, across a range of threshold probabilities.
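For readers less familiar with AUROC, the sketch below shows one way to compute it for a single predictor: the proportion of (event, non-event) pairs in which the event case has the higher score, with ties counted as half. The function name and the toy values are hypothetical and do not come from the study data. For oxygen saturation, lower values indicate higher risk, so the sign is flipped before scoring. Confidence intervals, as reported in the abstract, are not computed here.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a random event outranks a random non-event."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # event scored higher: concordant pair
            elif p == q:
                wins += 0.5   # tie: counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
ages = [82, 75, 60, 45, 58, 55]
died = [1, 1, 0, 0, 1, 0]
spo2 = [88, 92, 97, 95, 90, 98]
deteriorated = [1, 1, 0, 0, 1, 0]

print(auroc(died, ages))
# Lower saturation means higher risk, so negate before scoring.
print(auroc(deteriorated, [-s for s in spo2]))
```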

Admission oxygen saturation on room air and patient age are strong predictors of deterioration and mortality among hospitalised adults with COVID-19, respectively. None of the prognostic models evaluated here offered incremental value for patient stratification to these univariable predictors.

SciScore for 10.1101/2020.07.24.20149815

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Identification of candidate prognostic models: We used a published living systematic review to identify candidate prognostic models for COVID-19 indexed in PubMed, Embase, arXiv, medRxiv, or bioRxiv until 5th May 2020 [8].	PubMed suggested: (PubMed, RRID:SCR_004846) Embase suggested: (EMBASE, RRID:SCR_001650) bioRxiv suggested: (bioRxiv, RRID:SCR_003933)

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Second, due to the limitations of routinely collected data, predictor variables were available for varying numbers of participants for each model. We therefore performed multiple imputation, in keeping with recommendations for development and validation of multivariable prediction models, in our primary analyses [30]. Findings were similar in the complete case sensitivity analysis, thus supporting the robustness of our results. Thirdly, a number of models could not be reconstructed in our data. For some models, this was due to the absence of predictors in our dataset, such as those requiring computed tomography imaging, since this is not currently routinely recommended for patients with suspected or confirmed COVID-19 [16]. We were also not able to include models for which the parameters were not publicly available. This underscores the need for strict adherence to reporting standards in multivariable prediction models [13]. Finally, we used admission data only as predictors in this study, since most prognostic scores are intended to predict outcomes at the point of hospital admission. We note, however, that some scores (such as NEWS2) are designed for dynamic in-patient monitoring. Future studies may integrate serial data to examine model performance when using such dynamic measurements. Despite the vast global interest in the pursuit of prognostic models for COVID-19, our findings show that no COVID-19-specific models can currently be recommended for routine clinical use. All novel pro...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Version published to 10.1183/13993003.03498-2020
Sep 25, 2020

SciScore for 10.1101/2020.07.24.20149815

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement

Ethical approval: This study was approved by East Midlands - Nottingham 2 Research Ethics Committee (REF: 20/EM/0114).

Randomization

not detected.

Blinding

not detected.

Power Analysis

not detected.

Sex as a biological variable

Median age of the cohort was 66 years (interquartile range (IQR) 53-79), and the majority were male (252/411; 61.3%).

Table 2: Resources

Software and Algorithms

Sentences Resources

Methods: Identification of candidate prognostic models: We used a published living systematic review to identify candidate prognostic models for COVID-19 indexed in PubMed, Embase, arXiv, medRxiv, or bioRxiv until 5th May 2020 [8].

PubMed

suggested: (PubMed, SCR_004846)

Embase

suggested: (EMBASE, SCR_001650)

bioRxiv

suggested: (bioRxiv, SCR_003933)

Data from additional tools added to each annotation on a weekly basis.

About SciScore

SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.

Version published to 10.1101/2020.07.24.20149815 on medRxiv
Jul 26, 2020

Related articles

Age as a Significant Factor in the Diagnosis and Prognosis of ARDS Patients Not Meeting the Berlin Criteria

This article has 12 authors:
1. Pan Pan
2. Jin Chen
3. Tiantian Zhang
4. Haibo Cheng
5. Wanyi Zhang
6. Xiaobo Han
7. Xinjie Han
8. Xingshuo Hu
9. Qingyun Yang
10. Hongjun Gu
11. Yuhong Liu
12. Lixin Xie
This article has no evaluations. Latest version: Jul 23, 2025

Development and Validation of a Nomogram for Predicting Overall Survival in Patients with Epithelioid Sarcoma

This article has 7 authors:
1. Panhong Zhang
2. Bangmin Wang
3. Xinhui Du
4. Peng Zhang
5. Jiaqiang Wang
6. Zhehuang Li
7. Weitao Yao
This article has no evaluations. Latest version: Jul 14, 2025

Development and internal validation of risk scores to predict survival in the pediatric population following out-of-hospital cardiac arrest

This article has 5 authors:
1. Minaz Mawani
2. Bryan McNally
3. Jessica Knight
4. Ye Shen
5. Mark Ebell
This article has no evaluations. Latest version: Aug 6, 2025
