Systematic evaluation and external validation of 22 prognostic models among hospitalised adults with COVID-19: an observational cohort study

Abstract

The number of proposed prognostic models for coronavirus disease 2019 (COVID-19) is growing rapidly, but it is unknown whether any are suitable for widespread clinical implementation.

We independently externally validated the performance of candidate prognostic models, identified through a living systematic review, among consecutive adults admitted to hospital with a final diagnosis of COVID-19. We reconstructed candidate models as per original descriptions and evaluated performance for their original intended outcomes using predictors measured at the time of admission. We assessed discrimination, calibration and net benefit, compared to the default strategies of treating all and no patients, and against the most discriminating predictors in univariable analyses.
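As an illustration of the net benefit comparison described above, the following is a minimal sketch using the standard decision-curve definition (true positives minus false positives weighted by the odds of the threshold probability, per patient). It is not the authors' code, which was not shared. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk is >= threshold.

    Standard decision-curve definition: TP/n - (FP/n) * pt / (1 - pt).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the default strategy of treating all patients."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def net_benefit_treat_none() -> float:
    """Net benefit of treating no patients is zero by definition."""
    return 0.0


# Hypothetical toy data (NOT study data): outcomes and model-predicted risks.
toy_outcomes = [1, 0, 1, 0]
toy_risks = [0.9, 0.8, 0.2, 0.1]

for pt in (0.2, 0.5):
    print(
        pt,
        net_benefit(toy_outcomes, toy_risks, pt),
        net_benefit_treat_all(toy_outcomes, pt),
        net_benefit_treat_none(),
    )
```

A model is only useful at a given threshold probability if its net benefit exceeds both default strategies; plotting these quantities over a range of thresholds gives a decision curve.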

We tested 22 candidate prognostic models among 411 participants with COVID-19, of whom 180 (43.8%) and 115 (28.0%) met the endpoints of clinical deterioration and mortality, respectively. Highest areas under receiver operating characteristic (AUROC) curves were achieved by the NEWS2 score for prediction of deterioration over 24 h (0.78, 95% CI 0.73–0.83), and a novel model for prediction of deterioration <14 days from admission (0.78, 95% CI 0.74–0.82). The most discriminating univariable predictors were admission oxygen saturation on room air for in-hospital deterioration (AUROC 0.76, 95% CI 0.71–0.81), and age for in-hospital mortality (AUROC 0.76, 95% CI 0.71–0.81). No prognostic model demonstrated consistently higher net benefit than these univariable predictors, across a range of threshold probabilities.
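For readers unfamiliar with the AUROC values quoted above, the sketch below shows one common way to compute it: as the proportion of (event, non-event) patient pairs in which the patient with the event has the higher score, with ties counted as half. This is a generic illustration, not the authors' implementation; the function name and toy values are hypothetical. For a predictor such as oxygen saturation, where lower values are assumed to indicate higher risk, the score is negated first.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the concordance probability over all (event, non-event) pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0   # event patient ranked higher: concordant pair
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): admission oxygen saturation (%) and
# whether the patient deteriorated. Lower saturation is assumed to mean higher
# risk, so the values are negated before ranking.
toy_spo2 = [88, 92, 97, 99]
toy_deteriorated = [1, 1, 0, 0]
toy_auc = auroc(toy_deteriorated, [-s for s in toy_spo2])
```

The pairwise loop is quadratic in the number of patients, which is adequate for a cohort of a few hundred; a rank-based formula would be preferable for much larger datasets.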

Admission oxygen saturation on room air and patient age are strong predictors of deterioration and mortality, respectively, among hospitalised adults with COVID-19. None of the prognostic models evaluated here offered incremental value for patient stratification over these univariable predictors.

Article activity feed

  1. SciScore for 10.1101/2020.07.24.20149815

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentence:
    Identification of candidate prognostic models: We used a published living systematic review to identify candidate prognostic models for COVID-19 indexed in PubMed, Embase, Arxiv, medRxiv, or bioRxiv until 5th May 2020 [8].
    Resources:
    PubMed
    suggested: (PubMed, RRID:SCR_004846)
    Embase
    suggested: (EMBASE, RRID:SCR_001650)
    bioRxiv
    suggested: (bioRxiv, RRID:SCR_003933)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Second, due to the limitations of routinely collected data, predictor variables were available for varying numbers of participants for each model. We therefore performed multiple imputation, in keeping with recommendations for development and validation of multivariable prediction models, in our primary analyses [30]. Findings were similar in the complete case sensitivity analysis, thus supporting the robustness of our results. Thirdly, a number of models could not be reconstructed in our data. For some models, this was due to the absence of predictors in our dataset, such as those requiring computed tomography imaging, since this is not currently routinely recommended for patients with suspected or confirmed COVID-19 [16]. We were also not able to include models for which the parameters were not publicly available. This underscores the need for strict adherence to reporting standards in multivariable prediction models [13]. Finally, we used admission data only as predictors in this study, since most prognostic scores are intended to predict outcomes at the point of hospital admission. We note, however, that some scores (such as NEWS2) are designed for dynamic in-patient monitoring. Future studies may integrate serial data to examine model performance when using such dynamic measurements. Despite the vast global interest in the pursuit of prognostic models for COVID-19, our findings show that no COVID-19-specific models can currently be recommended for routine clinical use. All novel pro...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

  2. SciScore for 10.1101/2020.07.24.20149815

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: Ethical approval: This study was approved by East Midlands - Nottingham 2 Research Ethics Committee (REF: 20/EM/0114).
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: Median age of the cohort was 66 years (interquartile range (IQR) 53-79), and the majority were male (252/411; 61.3%).

    Table 2: Resources

    Software and Algorithms
    Sentence:
    Methods: Identification of candidate prognostic models: We used a published living systematic review to identify candidate prognostic models for COVID-19 indexed in PubMed, Embase, Arxiv, medRxiv, or bioRxiv until 5th May 2020 [8].
    Resources:
    PubMed
    suggested: (PubMed, SCR_004846)
    Embase
    suggested: (EMBASE, SCR_001650)
    bioRxiv
    suggested: (bioRxiv, SCR_003933)

    Data from additional tools added to each annotation on a weekly basis.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.