Development and evaluation of a machine learning-based in-hospital COVID-19 disease outcome predictor (CODOP): A multicontinental retrospective study

Riku Klén
Disha Purohit
Ricardo Gómez-Huelgas
José Manuel Casas-Rojo
Juan Miguel Antón-Santos
Jesús Millán Núñez-Cortés
Carlos Lumbreras
José Manuel Ramos-Rincón
Noelia García Barrio
Miguel Pedrera-Jiménez
Antonio Lalueza Blanco
María Dolores Martin-Escalante
Francisco Rivas-Ruiz
Maria Ángeles Onieva-García
Pablo Young
Juan Ignacio Ramirez
Estela Edith Titto Omonte
Rosmery Gross Artega
Magdy Teresa Canales Beltrán
Pascual Ruben Valdez
Florencia Pugliese
Rosa Castagna
Ivan A Huespe
Bruno Boietti
Javier A Pollan
Nico Funke
Benjamin Leiding
David Gómez-Varela

Curated by eLife

Evaluation Summary:

This article addresses the unmet need for a machine-learning approach to the early and accurate estimation of risk among patients admitted with COVID-19. The presented data inspire confidence in the model's validity, since it was developed in a vast number of patients and validated in cohorts from different geographical regions.

(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their name with the authors.)

Abstract

New SARS-CoV-2 variants, breakthrough infections, waning immunity, and sub-optimal vaccination rates account for surges of hospitalizations and deaths. There is an urgent need for clinically valuable and generalizable triage tools to assist the allocation of hospital resources, particularly in resource-limited countries. We developed and validated CODOP, a machine learning-based tool for predicting the clinical outcome of hospitalized COVID-19 patients. CODOP was trained, tested and validated with six cohorts encompassing 29223 COVID-19 patients from more than 150 hospitals in Spain, the USA and Latin America during 2020–22. CODOP uses 12 clinical parameters commonly measured at hospital admission to reach high discriminative ability up to 9 days before clinical resolution (AUROC: 0·90–0·96); it is well calibrated and enables effective dynamic risk stratification during hospitalization. Furthermore, CODOP maintains its predictive ability independently of the virus variant and the vaccination status. To account for the fluctuating pressure levels in hospitals during the pandemic, we offer two online CODOP calculators, suited for undertriage or overtriage scenarios, validated with a cohort of patients from 42 hospitals in three Latin American countries (78–100% sensitivity and 89–97% specificity). The performance of CODOP in heterogeneous and geographically dispersed patient cohorts and its ease of use strongly suggest its clinical utility, particularly in resource-limited countries.
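
The trade-off behind offering two calculators can be illustrated with a minimal Python sketch. It only shows how moving a decision cutoff on a predicted risk shifts the balance between sensitivity and specificity. The risks, outcomes, cutoff values and the function name `sensitivity_specificity` are all made up for illustration; they are not CODOP's model, thresholds or data.

```python
def sensitivity_specificity(risks, died, threshold):
    """Return (sensitivity, specificity) when risk >= threshold is flagged high-risk."""
    tp = sum(1 for r, d in zip(risks, died) if d and r >= threshold)
    fn = sum(1 for r, d in zip(risks, died) if d and r < threshold)
    tn = sum(1 for r, d in zip(risks, died) if not d and r < threshold)
    fp = sum(1 for r, d in zip(risks, died) if not d and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up predicted risks and outcomes (1 = died, 0 = survived); not study data.
risks = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
died = [0, 0, 0, 1, 0, 1, 1, 1]

LOW_CUTOFF = 0.30   # hypothetical: flags more patients, favours sensitivity
HIGH_CUTOFF = 0.50  # hypothetical: flags fewer patients, favours specificity

sens_low, spec_low = sensitivity_specificity(risks, died, LOW_CUTOFF)
sens_high, spec_high = sensitivity_specificity(risks, died, HIGH_CUTOFF)

print(f"low cutoff:  sensitivity={sens_low:.2f}, specificity={spec_low:.2f}")
print(f"high cutoff: sensitivity={sens_high:.2f}, specificity={spec_high:.2f}")
```

On this toy data the lower cutoff catches every death at the cost of one false alarm, while the higher cutoff raises no false alarms but misses one death. A hospital under heavy pressure and one with spare capacity may reasonably prefer different ends of that trade-off.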

Version published to 10.7554/elife.75985 on eLife
May 17, 2022
eLife
Mar 4, 2022

Reviewer #1 (Public Review):

This submission addresses the unmet need for a machine learning approach to the early and accurate estimation of risk among COVID-19 admissions. The presented data inspire confidence in the model's validity, since it was developed in a vast number of patients and validated in cohorts from different geographical regions.

Reviewer #2 (Public Review):

The authors describe the development, by machine learning, of a score, CODOP, that predicts in-hospital mortality of patients with COVID-19 pneumonia easily and cheaply. The score is developed and validated in large and diverse (multinational) cohorts, suggesting robust results. They provide two versions, for over- and under-triage scenarios.
The manuscript is well written and the statistics are adequate. All related data are provided and no ethical issues arise.

Reviewer #3 (Public Review):

This is a robust, solid work developing an artificial intelligence-derived model (CODOP) that accurately predicts mortality risk in COVID-19 patients requiring hospitalization. Major strengths include the derivation and validation approach using thousands of patients across different continents, either at a single time point (hospital admission) or across a time period (first nine days following admission). The low number of missing values for the considered variables also contributes to the validity of the results. The eleven parameters considered are commonly used in hospitals all over the world, facilitating its application. The authors compare the performance of CODOP against three reference models. They have also developed an online calculator to ease the clinical application of this model.

SciScore for 10.1101/2021.09.20.21263794:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	IRB: 9 The use of the anonymized clinical data of patients from the SEMI-COVID-Registry was approved by the Provincial Research Ethics Committee of Málaga (Spain).
Sex as a biological variable	not detected.
Randomization	not detected.
Blinding	All predictions were done blinded to the final clinical outcome.
Power Analysis	not detected.

Table 2: Resources

Recombinant DNA
Sentences	Resources
The metrics were calculated using R packages pROC21 (version 1.17.0.1) and caret15 (R package version 6.0-86).	pROC21 suggested: None
Software and Algorithms
Sentences	Resources
COPE model is a linear regression model, which uses variables age, respiratory rate, C-reactive protein, lactic dehydrogenase, albumin, and urea.	COPE suggested: (COPE, RRID:SCR_009153)
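
The first resource row above names the R packages pROC and caret as the source of the reported metrics. As a language-neutral illustration of what an AUROC measures, the following Python sketch computes it from its pairwise definition: the fraction of (deceased, survivor) pairs in which the deceased patient received the higher predicted risk, with ties counted as one half. The function name `auroc` and the data are assumptions made for this example; this is not the authors' code and does not reproduce pROC's implementation.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up predicted risks and outcomes (1 = died, 0 = survived); not study data.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

example_auroc = auroc(scores, labels)
print(f"AUROC on the toy data: {example_auroc:.4f}")
```

The quadratic pairwise loop is chosen for readability; rank-based formulations give the same value more efficiently on large cohorts.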

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

The overall performance of CODOP has inherent limitations, some of them generalizable to any MLH. On the one hand, our approach of using training and test datasets with a high degree of perturbations (see above) adds several sources of variability [32]: pre-analytical, due to differences in blood sampling; analytical, due to different laboratory protocols; intra- and inter-individual; and inter-hospital and geographical differences in clinical practices. As an additional factor, the high diversity of COVID-19, encompassing more than 60 disease subtypes [7], sets a limitation in terms of the discriminative ability and the overall clinical utility of any MLH. In contrast to other predictors, and to facilitate its use, CODOP does not take into account the level of care received by each patient (e.g., ICU versus basic care), which influences the outcome of the patient and perturbs the discrimination ability of CODOP (as predictions are made with the data from blood analyses at hospital admission). A clear example is the slightly lower performance of CODOP-Ovt (sensitivity of 73%) in the case of the “Hospital Vélez Sarsfield” from Buenos Aires (named Argentina (b) in Figure 4B), where all patients analysed by CODOP were finally treated in the ICU. On the other hand, CODOP-Unt would have correctly suggested triaging 84% of these patients already on the day of admission, therefore offering significant clinical utility. Finally, the clinical utility of MLH has to take into account the chan...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a protocol registration statement.

Results from scite Reference Check: We found no unreliable references.

Version published to 10.1101/2021.09.20.21263794 on medRxiv
Sep 22, 2021

Related articles

Development and Deployment of a Machine Learning–Based Predictive Model for COVID-19 Infection Using Patient Demographic and Symptom Data in Nigeria

This article has 10 authors:
1. Olanrewaju Eniade
2. Ezekiel Ukwenga
3. Uchenna Akuka
4. Opeyemi Adeniyi
5. Elonna Obak
6. Omolola Adeagbo
7. Peter Babatunde Olaitan
8. Rita Ayanbolade Olowe
9. Tolulope Opakunle
10. Olugbenga Adekunle Olowe
This article has no evaluations. Latest version: Jan 25, 2026

Machine learning prediction and interpretive analysis of multidrug-resistant microbial infection risk in septicemia patients: A study from the MIMIC-IV database

This article has 5 authors:
1. Qianqian Zhang
2. Nianzhi Zhang
3. Ying Zheng
4. Jing Zhou
5. Ling Liu
This article has no evaluations. Latest version: Dec 30, 2025

A Preliminary Prognostic Model for Predicting Poor Prognosis in COVID-19 Integrating Lung Epithelial Injury (KL-6) with Routine Care Markers

This article has 7 authors:
1. Yunlai Liang
2. Kun Wang
3. Lu Long
4. Qizhuo Hou
5. Wenze Yu
6. Kangkang Huang
7. Bin Yi
This article has no evaluations. Latest version: Feb 3, 2026
