COVID-19 diagnosis prediction in emergency care patients: a machine learning approach

Abstract

The coronavirus disease (COVID-19) pandemic has increased the need for immediate clinical decisions and effective use of healthcare resources. Currently, the most validated diagnostic test for COVID-19 (RT-PCR) is in short supply in most developing countries, which may increase infection rates and delay important preventive measures. The objective of this study was to predict the risk of a positive COVID-19 diagnosis with machine learning, using as predictors only results from emergency care admission exams. We collected data from 235 adult patients from the Hospital Israelita Albert Einstein in São Paulo, Brazil, from March 17 to 30, 2020, of whom 102 (43%) received a positive diagnosis of COVID-19 from RT-PCR tests. Five machine learning algorithms (neural networks, random forests, gradient boosting trees, logistic regression and support vector machines) were trained on a random sample of 70% of the patients, and performance was tested on new unseen data (30%). The best predictive performance was obtained by the support vector machines algorithm (AUC: 0.85; Sensitivity: 0.68; Specificity: 0.85; Brier Score: 0.16). The three most important variables for the predictive performance of the algorithm were the number of lymphocytes, leukocytes and eosinophils, respectively. In conclusion, we found that using machine learning algorithms to target COVID-19 testing decisions with only routinely collected data is a promising new area.
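The four performance measures quoted in the abstract can be computed from true labels and predicted probabilities. The following is a minimal sketch using scikit-learn, not the authors' code: the helper name `summarize_performance`, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for illustration, since the abstract does not state the cutoff used.

```python
import numpy as np
from sklearn.metrics import brier_score_loss, confusion_matrix, roc_auc_score


def summarize_performance(y_true, y_prob, threshold=0.5):
    """Return AUC, sensitivity, specificity and Brier score for a binary outcome."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    # The threshold is an assumption; the abstract does not report the cutoff.
    y_pred = (y_prob >= threshold).astype(int)
    # With labels=[0, 1], ravel() yields counts in the order tn, fp, fn, tp.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "auc": roc_auc_score(y_true, y_prob),        # ranking quality of the probabilities
        "sensitivity": tp / (tp + fn),               # share of true positives detected
        "specificity": tn / (tn + fp),               # share of true negatives detected
        "brier": brier_score_loss(y_true, y_prob),   # mean squared error of the probabilities
    }


# Toy labels and probabilities, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.8]
metrics = summarize_performance(y_true, y_prob)
print(metrics)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC and the Brier score are computed from the probabilities directly; a lower Brier score indicates better-calibrated and more accurate probabilities.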

Article activity feed

SciScore for 10.1101/2020.04.04.20052092:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	not detected.
Randomization	The sample was randomly divided using a 70-30 split, where 70% of the patients were used to train the machine learning algorithms, and the other 30% were used to test the performance of the models on new unseen data.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
All analyses were performed in Python using the scikit-learn library.	Python suggested: (IPython, RRID:SCR_001658) scikit-learn suggested: (scikit-learn, RRID:SCR_002577)
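As an illustration of the workflow the quoted sentences describe (a random 70-30 split and five algorithm families fitted with scikit-learn), a minimal sketch might look like the following. It is not the authors' code: the data are synthetic (generated with `make_classification` to mimic 235 patients with roughly 43% positives), the number of features is arbitrary, and all hyperparameters, the stratified split, and the feature scaling step are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the admission exam data (assumption: 15 numeric features).
X, y = make_classification(
    n_samples=235, n_features=15, n_informative=5,
    weights=[0.57, 0.43], random_state=0,
)

# 70-30 split; stratification is an assumption, not stated in the source.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# One estimator per algorithm family named in the abstract.
# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "neural network": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=2000, random_state=0)
    ),
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "support vector machine": make_pipeline(
        StandardScaler(), SVC(probability=True, random_state=0)
    ),
}

# Fit on the training portion and score AUC on the held-out portion.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, prob)

for name, auc in results.items():
    print(f"{name}: AUC = {auc:.2f}")
```

Because the data here are synthetic, the printed AUC values say nothing about the study's results; the sketch only shows the shape of the comparison.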

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
No funding statement was detected.
No protocol registration statement was detected.

Related articles

Calibrated and Interpretable Machine Learning for ICU Mortality Prediction Using First 24-Hour Clinical Data

Pre-pandemic blood profiles predict COVID-19 hospitalization and death a decade later

Derivation and validation of clinical prediction models for viral etiologies of acute diarrhea in North American children presenting for emergency care