Development and validation of a machine learning model predicting illness trajectory and hospital utilization of COVID-19 patients: A nationwide study

Abstract

Objective

The spread of coronavirus disease 2019 (COVID-19) has severely strained hospital capacity in many countries. We aim to develop a model that helps planners assess expected COVID-19 hospital resource utilization based on individual patient characteristics.

Materials and Methods

We model each patient's clinical course using a multistate survival model. The model predicts the patient's disease course in terms of clinical states: critical, severe, or moderate. It also predicts utilization at the level of entire hospitals or healthcare systems. We cross-validated the model using a nationwide registry that follows the day-by-day clinical status of all hospitalized COVID-19 patients in Israel from March 1 to May 2, 2020 (n = 2703).
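To illustrate how per-patient state predictions can roll up into hospital-level utilization, here is a minimal sketch. It is not the authors' multistate survival model: it swaps in a much simpler discrete-time Markov chain, and every transition probability, function name, and state label beyond critical, severe, and moderate is a hypothetical chosen for illustration, not a value fitted to the registry.

```python
import random

# Hypothetical per-day transition probabilities (illustrative only, NOT fitted
# to any data). Each row sums to 1. "discharged" and "deceased" are absorbing.
TRANSITIONS = {
    "moderate": {"moderate": 0.80, "severe": 0.07, "discharged": 0.13},
    "severe": {"severe": 0.75, "moderate": 0.12, "critical": 0.10, "deceased": 0.03},
    "critical": {"critical": 0.85, "severe": 0.08, "deceased": 0.07},
}
IN_HOSPITAL = set(TRANSITIONS)  # states that occupy a hospital bed


def simulate_patient(start_state, horizon, rng):
    """Return the patient's state on each of days 0..horizon-1."""
    path = []
    state = start_state
    for _ in range(horizon):
        path.append(state)
        if state in IN_HOSPITAL:  # absorbing states never change
            options = TRANSITIONS[state]
            state = rng.choices(list(options), weights=list(options.values()))[0]
    return path


def expected_occupancy(start_states, horizon, n_runs=200, seed=0):
    """Monte Carlo estimate of mean total and critical beds in use per day."""
    rng = random.Random(seed)
    total = [0.0] * horizon
    critical = [0.0] * horizon
    for _ in range(n_runs):
        for start in start_states:
            for day, state in enumerate(simulate_patient(start, horizon, rng)):
                if state in IN_HOSPITAL:
                    total[day] += 1
                if state == "critical":
                    critical[day] += 1
    return [t / n_runs for t in total], [c / n_runs for c in critical]


# Example: 10 currently hospitalized patients, no new arrivals, 14-day horizon.
cohort = ["moderate"] * 6 + ["severe"] * 3 + ["critical"]
total_beds, critical_beds = expected_occupancy(cohort, horizon=14)
```

A patient influx scenario could be approximated in the same sketch by adding new starting states on later days; the model described here additionally conditions transitions on covariates such as age and sex, which this sketch omits.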

Results

Per-day mean absolute errors for predicted total and critical care hospital bed utilization were 4.72 ± 1.07 and 1.68 ± 0.40, respectively, over cohorts of 330 hospitalized patients; areas under the curve for prediction of critical illness and in-hospital mortality were 0.88 ± 0.04 and 0.96 ± 0.04, respectively. We further present the impact of patient influx scenarios on day-by-day healthcare system utilization. We provide an accompanying R software package.
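As a reading aid for the utilization metric, the following sketch shows one way a per-day mean absolute error between predicted and observed bed counts can be computed. The function name and the numbers are hypothetical, and how the study aggregates errors across cross-validation cohorts is not specified here, so this covers a single cohort only.

```python
def per_day_mae(predicted, observed):
    """Mean absolute difference between predicted and observed daily bed counts."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical daily bed counts for one cohort (illustrative numbers only).
predicted_beds = [120.0, 118.5, 121.0, 119.0]
observed_beds = [118, 121, 120, 124]
mae = per_day_mae(predicted_beds, observed_beds)  # (2 + 2.5 + 1 + 5) / 4 = 2.625
```

Read this way, a value of 4.72 for total beds means the prediction was off by fewer than five beds per day on average for a cohort of 330 patients.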

Discussion

The proposed model accurately predicts total and critical care hospital utilization. The model enables evaluating impacts of patient influx scenarios on utilization, accounting for the state of currently hospitalized patients and characteristics of incoming patients. We show that accurate hospital load predictions were possible using only a patient’s age, sex, and day-by-day clinical state (critical, severe, or moderate).

Conclusions

The multistate model we develop is a powerful tool for predicting individual-level patient outcomes and hospital-level utilization.

SciScore for 10.1101/2020.09.04.20185645:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

No key resources detected.

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Our model has several limitations. First, it is based on data from the first wave of patients in Israel. As treatment strategies and hospitalization policies differ over time and between health systems and hospitals, we cannot guarantee that LOS statistics will be the same across all locales and times. Thus, when possible we encourage planners to use the attached software package and fit it to their own hospitalization data. We will update the software package and app as more recent data become available from the Israeli registry. A second limitation is that our model relies on estimation of the frequency and characteristics of future incoming patients. If arriving patient populations – both patient type and patient numbers – differ significantly from the scenarios taken into account, the model’s predictions will be wrong. We thus recommend that planners evaluate multiple hypotheticals for incoming patients, testing for scenarios such as the ones we presented in the Results section above. A third limitation is that the model does not take into account patients’ comorbidities [21–23]. On the one hand, our model achieves good results while analyzing only a limited number of covariates as input; on the other hand, it is possible that using comorbidities could enhance the model’s performance. We also wish to point out that researchers with access to patient-level comorbidity data can easily incorporate it into a multistate model using the software we provide. A fourth limi...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a protocol registration statement.
