Predicting severe COVID-19 outcomes for triage and resource allocation

Abstract

Background

While numerous studies have identified factors associated with severe COVID-19 outcomes, they have yet to quantify the risk attached to these factors. Our study therefore aims to stratify these risk factors and use them to predict outcomes.

Study Design

This is a retrospective review of the CDC COVID-19 Surveillance Data. Logistic regression models calculated risk estimates for independent variables, and random forest models predicted the chance of severe outcomes.
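
As an illustration of how a logistic regression coefficient is usually turned into a risk estimate, the sketch below converts a coefficient and its standard error into an odds ratio with a Wald 95% confidence interval. The function name and the numbers are hypothetical and chosen only for illustration; they are not taken from the study, and the abstract does not say which interval method the authors used.

```python
import math


def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Convert a logistic regression coefficient into an odds ratio.

    Returns (odds_ratio, lower, upper), where the bounds form a Wald
    confidence interval; z=1.96 corresponds to roughly 95% coverage.
    """
    point = math.exp(beta)
    lower = math.exp(beta - z * std_err)
    upper = math.exp(beta + z * std_err)
    return point, lower, upper


# Hypothetical coefficient and standard error, for illustration only.
or_point, or_lower, or_upper = odds_ratio_with_ci(beta=0.693, std_err=0.05)
print(f"OR = {or_point:.2f} (95% CI {or_lower:.2f} to {or_upper:.2f})")
```

An odds ratio above 1 indicates higher odds of the outcome relative to the reference group, and an interval that excludes 1 is conventionally read as statistically significant.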

Results

Our sample of 3,798,261 patients with COVID-19 consisted mainly of females (51.9%), 10- to 69-year-olds, and White/Non-Hispanics (34.9%). Most were not healthcare workers (90.6%), and 47.1% had no preexisting medical conditions. The risk of severe outcomes increased with every decade of life. White patients had a lower occurrence of severe outcomes than Non-Whites, except for Pacific Islanders, whose mortality was comparable. The variable selection algorithm found that the models for three outcomes were more accurate without healthcare worker classification: mechanical ventilation/intubation, pneumonia, and acute respiratory distress syndrome (ARDS). However, healthcare providers had a decreased risk of severe outcomes overall. Patients with preexisting conditions had an increased risk for all outcomes. The predictive models performed better than the logistic regressions (AUC > 0.8). The death model had the best metrics, followed by hospitalization and ventilation. We combined these predictive models into the Severe COVID-19 Calculator, a web application that estimates the probability of severe outcomes.
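
For readers unfamiliar with the AUC metric cited above, here is a minimal sketch of how it can be computed from predicted probabilities. It uses the pairwise (Mann-Whitney) interpretation: the AUC is the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it. The function name and the toy data are assumptions for illustration and do not come from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = severe outcome occurred).
    scores: predicted probabilities or risk scores, same length as labels.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, not from the study: two patients with the outcome, two without.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value above 0.8 means the model ranks a patient with the outcome above a patient without it in more than 80% of such pairs. The pairwise loop is quadratic in sample size; on a dataset of millions of patients a rank-based implementation would be needed.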

Conclusions

Several patient social and medical demographics recorded by the CDC significantly affect severe COVID-19 outcomes, suggesting a multifactorial influence. To account for these variables, the Severe COVID-19 Calculator we generated can accurately predict the chance of severe outcomes in people who may contract or already have COVID-19.

Article activity feed

  1. SciScore for 10.1101/2021.04.12.21255201:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    This limitation also questions the possibility of the number of pre-existing conditions correlating with disease severity. If so, there may be a benefit in making this distinction in future predictive tools. Like all prediction modeling projects, the exclusion of potentially essential variables could reduce our models’ performance. Also, within the available data, a few predictive variables, such as race/ethnicity, healthcare worker status, pre-existing medical conditions, contained unknown variables, which could have affected our calculator’s accuracy. However, the high AUC value and performance metrics for all of our outcomes suggest that our predictive models are significantly better than random chance. As vaccine distribution continues to roll out in phases and trends of severe COVID-19 outcomes change, we will need to continuously recalculate our results to determine the patients at the highest risk for severe COVID-19 outcomes. With the continuously updated CDC data and our publicly available web app, we can recreate our Severe COVID-19 Calculator to reflect the most up-to-date data. For example, in a new data release, we may find a race/ethnicity, sex, or age group with a disproportionate chance of severe COVID-19 outcomes in the vaccination era, and we would like to account for that in future calculator iterations. Additionally, using some of the variables already available in the CDC report, we plan to aggregate data sources to account for variables of interest, such...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.