Predicting Intensive Care Transfers and Other Unforeseen Events: Analytic Model Validation Study and Comparison to Existing Methods

This article has been reviewed by the following groups


Abstract

COVID-19 has led to an unprecedented strain on health care facilities across the United States. Accurately identifying patients at increased risk of deterioration may help hospitals manage their resources while improving the quality of patient care. Here, we present the results of an analytic model, Predicting Intensive Care Transfers and Other Unforeseen Events (PICTURE), that identifies patients at high risk for imminent intensive care unit transfer, respiratory failure, or death, with the aim of improving prediction of deterioration due to COVID-19.

Objective

This study aims to validate the PICTURE model’s ability to predict unexpected deterioration in general ward and COVID-19 patients, and to compare its performance with the Epic Deterioration Index (EDI), an existing model that has recently been assessed for use in patients with COVID-19.

Methods

The PICTURE model was trained and validated on a cohort of hospitalized non–COVID-19 patients using electronic health record data from 2014 to 2018. It was then applied to two holdout test sets: non–COVID-19 patients from 2019 and patients testing positive for COVID-19 in 2020. PICTURE results were aligned to Epic Deterioration Index (EDI) and National Early Warning Score (NEWS) scores for head-to-head comparison via area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC). We compared the models’ ability to predict an adverse event (defined as intensive care unit transfer, mechanical ventilation use, or death). Shapley values were used to provide explanations for PICTURE predictions.
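For intuition, the observation-level AUROC used in this comparison can be computed directly from its Mann-Whitney interpretation: the probability that a randomly chosen positive observation is scored higher than a randomly chosen negative one. The sketch below uses entirely made-up scores and labels (not data from the study) to illustrate how two models' scores over the same observations would be compared.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs where the positive observation
    receives the higher score (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores from two models on the same observations
labels  = [0, 0, 0, 0, 1, 1, 0, 1]
model_a = [0.10, 0.20, 0.15, 0.30, 0.80, 0.70, 0.40, 0.90]
model_b = [0.10, 0.60, 0.15, 0.30, 0.50, 0.70, 0.40, 0.90]

print(round(auroc(model_a, labels), 3))  # model_a ranks every positive above every negative
print(round(auroc(model_b, labels), 3))
```

The same paired-score alignment underlies the head-to-head comparison described above; the study additionally reports confidence intervals and significance tests for the AUROC difference.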

Results

In non–COVID-19 general ward patients, PICTURE achieved an AUROC of 0.819 (95% CI 0.805-0.834) per observation, compared to the EDI’s AUROC of 0.763 (95% CI 0.746-0.781; n=21,740; P<.001). In patients testing positive for COVID-19, PICTURE again outperformed the EDI with an AUROC of 0.849 (95% CI 0.820-0.878) compared to the EDI’s AUROC of 0.803 (95% CI 0.772-0.838; n=607; P<.001). The most important variables influencing PICTURE predictions in the COVID-19 cohort were a rapid respiratory rate, a high level of oxygen support, low oxygen saturation, and impaired mental status (Glasgow Coma Scale).
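The variable importances above come from Shapley values, which attribute a single prediction to its input features. The sketch below computes exact Shapley values by brute-force subset enumeration (tractable only for a handful of features; in practice a library such as shap with tree-specific approximations would be used). The feature names, coefficients, and patient values are illustrative stand-ins, not the PICTURE model.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction: each feature's average
    marginal contribution over all feature subsets, with features
    outside the subset held at a reference (baseline) value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                s = set(subset)
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in s or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in s else baseline[j] for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Toy stand-in for a risk model over [respiratory rate, SpO2, GCS];
# coefficients and values are illustrative only.
risk = lambda v: 0.05 * v[0] - 0.02 * v[1] - 0.1 * v[2]
patient  = [32.0, 88.0, 12.0]   # hypothetical deteriorating patient
baseline = [16.0, 97.0, 15.0]   # hypothetical "typical" reference values

phi = shapley_values(risk, patient, baseline)
# Shapley values sum exactly to the prediction's difference from baseline
print([round(p, 3) for p in phi])
```

The additivity property shown here (attributions summing to the deviation from a baseline prediction) is what makes Shapley values attractive for per-patient explanations like those reported for the COVID-19 cohort.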

Conclusions

The PICTURE model predicted adverse patient outcomes more accurately than the EDI for both general ward and COVID-19–positive patients in our cohorts. The ability to consistently anticipate these events may be especially valuable during potential future waves of COVID-19 infections. Establishing the generalizability of the model will require testing in other health care systems.

Article activity feed

  1. SciScore for 10.1101/2020.07.08.20145078:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: IRB — 2.1 Setting and study population: The study protocol was approved by the University of Michigan’s Institutional Review Board (HUM00092309).
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms

    • Sentence: “2.3 Outcomes: The primary outcomes in the training, validation, and test cohorts (data collected from 2014 through 2019) were death, cardiac arrest (as defined by the American Heart Association’s Get With The Guidelines®), transfer to an ICU from a general ward or similar unit, or need for mechanical ventilation.”
      Resource: American Heart Association’s — suggested: None
    • Sentence: “All analysis was performed using Python 3.8.2.”
      Resource: Python — suggested: (IPython, RRID:SCR_001658)

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Due to limitations in available EDI scores, the number of encounters was restricted to 21,215. These time-matched scores were again evaluated using AUROC and AUPRC on the observation- and encounter levels (Table 3). Figure 2 displays the associated ROC and PR curves for the observation-level performance. The difference of AUROC and AUPR between PICTURE and the EDI reached statistical significance (α = 5%) on both the observation level (AUROC: 0.057 [95% CI: 0.044 - 0.069], AUPRC: 0.032 [95% CI: 0.019 - 0.043]) and the encounter level (AUROC: 0.059 [95% CI: 0.048 - 0.069], AUPRC: 0.095 [95% CI: 0.067 - 0.118]). In addition to classification performance, lead time represents another critical component of a predictive analytics’ utility as it determines the amount of time clinicians have to act on the model’s recommendations. We assessed the model’s relative performance at different lead times in a threshold-independent manner by censoring data occurring 0.5, 1, 2, 6, 12, and 24 hours before an adverse event (Table 4). In our cohort, PICTURE performs markedly better than the EDI model even when considering predictions made 24 hours or more before the actual event. 3.3 Comparison of PICTURE to EDI in COVID-19 patients: When applied to patients testing positive for COVID-19, PICTURE performs similarly well. PICTURE scores were again aligned to EDI scores using the process outlined in Section 2.6.2. This resulted in the inclusion of 402 encounters. Table 5 presents AUROC and AUPRC ...
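The lead-time analysis quoted above works by censoring observations that fall within a given window before the adverse event and recomputing the metrics on what remains, so that performance reflects only what the model knew that far in advance. A minimal sketch of that censoring step, using invented timestamps and scores rather than study data:

```python
from datetime import datetime, timedelta

def censor_before_event(observations, event_time, lead_hours):
    """Keep only observations made at least `lead_hours` before the
    adverse event; metrics recomputed on this censored set give a
    threshold-independent view of performance at that lead time."""
    cutoff = event_time - timedelta(hours=lead_hours)
    return [o for o in observations if o["time"] <= cutoff]

# Hypothetical hourly risk scores leading up to an ICU transfer
event = datetime(2020, 7, 8, 12, 0)
obs = [{"time": event - timedelta(hours=h), "score": s}
       for h, s in [(30, 0.12), (24, 0.18), (12, 0.35), (6, 0.55), (1, 0.81)]]

for lead in (0.5, 6, 24):
    kept = censor_before_event(obs, event, lead)
    print(lead, [o["score"] for o in kept])
```

Repeating this at 0.5, 1, 2, 6, 12, and 24 hours, as in Table 4 of the paper, shows how AUROC and AUPRC degrade (or hold up) as the prediction horizon lengthens.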

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.