Development and evaluation of a machine learning-based in-hospital COVID-19 disease outcome predictor (CODOP): A multicontinental retrospective study
Curation statements for this article:
Curated by eLife
Evaluation Summary:
This article addresses the unmet need for a machine-learning approach to estimate risk early and accurately among COVID-19 admissions. The presented data inspire confidence in the validity of the model, since it was developed in a very large number of patients and validated in cohorts from different geographical regions.
(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their names with the authors.)
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (eLife)
- Evaluated articles (ScreenIT)
Abstract
New SARS-CoV-2 variants, breakthrough infections, waning immunity, and sub-optimal vaccination rates account for surges of hospitalizations and deaths. There is an urgent need for clinically valuable and generalizable triage tools assisting the allocation of hospital resources, particularly in resource-limited countries. We developed and validated CODOP, a machine learning-based tool for predicting the clinical outcome of hospitalized COVID-19 patients. CODOP was trained, tested and validated with six cohorts encompassing 29223 COVID-19 patients from more than 150 hospitals in Spain, the USA and Latin America during 2020–22. CODOP uses 12 clinical parameters commonly measured at hospital admission to reach high discriminative ability up to 9 days before clinical resolution (AUROC: 0·90–0·96); it is well calibrated, and it enables effective dynamic risk stratification during hospitalization. Furthermore, CODOP maintains its predictive ability independently of the virus variant and the vaccination status. To reckon with the fluctuating pressure levels in hospitals during the pandemic, we offer two online CODOP calculators, suited for undertriage or overtriage scenarios, validated with a cohort of patients from 42 hospitals in three Latin American countries (78–100% sensitivity and 89–97% specificity). The performance of CODOP in heterogeneous and geographically dispersed patient cohorts and its ease of use strongly suggest its clinical utility, particularly in resource-limited countries.
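The metrics quoted in the abstract (AUROC, sensitivity, specificity) can be illustrated with a minimal sketch. It is not the CODOP model or its calculators: the labels, scores, thresholds, and function names below are hypothetical toy values chosen only to show how a lower decision threshold trades specificity for sensitivity, which is the kind of trade-off that might motivate offering two calculators.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels: Sequence[int],
                            scores: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Classify score >= threshold as high risk, then return
    (sensitivity, specificity) against the true labels."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

print("AUROC:", auroc(labels, scores))
# A low threshold flags more patients (higher sensitivity, lower specificity).
print("threshold 0.35:", sensitivity_specificity(labels, scores, 0.35))
# A high threshold flags fewer patients (lower sensitivity, higher specificity).
print("threshold 0.65:", sensitivity_specificity(labels, scores, 0.65))
```

On real data one would normally use an established package (the SciScore report notes that the authors computed their metrics in R with pROC and caret) rather than hand-rolled functions like these.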
Article activity feed
Reviewer #1 (Public Review):
This submission addresses the unmet need for a machine learning approach to estimate risk early and accurately among COVID-19 admissions. The presented data inspire confidence in the validity of the model, since it was developed in a very large number of patients and validated in cohorts from different geographical regions.
Reviewer #2 (Public Review):
The authors describe the development, by machine learning, of a score named CODOP that predicts in-hospital mortality of patients with COVID-19 pneumonia in an easy and inexpensive way. The score is developed and validated in large and diverse (multinational) cohorts, suggesting robust results. The authors provide two versions, for over- and under-triage scenarios.
The manuscript is well written and the statistics are adequate. All related data are provided and no ethical issues arise.
Reviewer #3 (Public Review):
This is a robust, solid work developing an artificial intelligence-derived model (CODOP) which accurately predicts mortality risk in COVID-19 patients requiring hospitalization. Major strengths include the derivation and validation approach using thousands of patients across different continents, either at a single time point (hospital admission) or across a time period (the first nine days following admission). The low number of missing values for the considered variables also contributes to the validity of the results. The eleven parameters considered are commonly used in hospitals all over the world, facilitating its application. The authors compare the performance of CODOP against three reference models. They have also developed an online calculator to make the clinical application of this model easier.
SciScore for 10.1101/2021.09.20.21263794:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Ethics: IRB: 9 The use of the anonymized clinical data of patients from the SEMI-COVID-Registry was approved by the Provincial Research Ethics Committee of Málaga (Spain).
Sex as a biological variable: not detected.
Randomization: not detected.
Blinding: All predictions were done blinded to the final clinical outcome.
Power Analysis: not detected.
Table 2: Resources
Recombinant DNA
Sentences: The metrics were calculated using R packages pROC21 (version 1.17.0.1) and caret15 (R package version 6.0-86).
Resources: pROC21 (suggested: None)
Software and Algorithms
Sentences: COPE model is a linear regression model, which uses variables age, respiratory rate, C-reactive protein, lactic dehydrogenase, albumin, and urea.
Resources: COPE (suggested: COPE, RRID:SCR_009153)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: The overall performance of CODOP has inherent limitations, some of them generalizable to any MLH. On the one side, our approach to using training and test datasets with a high degree of perturbations (see above) adds several sources of variability32: pre-analytical due to differences in blood sampling, analytical due to different laboratory protocols, intra- and inter-individual, and inter-hospital and geographical differences in clinical practices. As an additional factor, the high diversity of COVID-19 encompassing more than 60 disease subtypes7 sets a limitation in terms of the discriminability ability and the overall clinical utility of any MHL. In contrast to other predictors and to facilitate its use, CODOP does not take into account the level of care received by each patient (e.g., ICU versus basic care), which influences the outcome of the patient and perturbs the discrimination ability of CODOP (as predictions are made with the data from blood analyses at hospital admission). A clear example is a slightly lower performance of CODOP-Ovt (sensitivity of 73%) in the case of the “Hospital Vélez Sarsfield” from Buenos Aires (named as Argentina (b) in Figure 4B), where all patients analysed by CODOP were finally treated in the ICU. On the other hand, CODOP-Unt would have correctly suggested triaging 84% of these patients already on the day of admission, therefore offering a significant clinical utility. Finally, the clinical utility of MHL has to take into account the chan...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a protocol registration statement.
Results from scite Reference Check: We found no unreliable references.