A Machine Learning Model Incorporating Laboratory Blood Tests Discriminates Between SARS-CoV-2 and Influenza Infections at Emergency Department Visit

Abstract

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza virus are contagious respiratory pathogens that cause similar symptoms but require different treatment and management strategies. This study investigated whether laboratory blood tests can discriminate between SARS-CoV-2 and influenza infections at emergency department (ED) presentation.

Methods

We retrospectively analyzed 723 influenza A/B positive (2018/1/1 to 2020/3/15) and 1,281 SARS-CoV-2 positive (2020/3/11 to 2020/6/30) ED patients. Laboratory test results completed within 48 hours before the virus RT-PCR result was reported, together with patient demographics, were used to train and validate a random forest (RF) model. The dataset was randomly divided into training (2/3) and testing (1/3) sets, each with the same SARS-CoV-2/influenza ratio as the full dataset. The Shapley Additive Explanations (SHAP) technique was used to visualize the impact of each laboratory test on the differentiation.
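The stratified 2/3 to 1/3 split described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code: the function name `stratified_split`, the label strings, and the seed are hypothetical, and only the cohort sizes (723 and 1,281) come from the abstract. In scikit-learn, `train_test_split` with the `stratify` argument serves the same purpose.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=1 / 3, seed=0):
    """Return (train_idx, test_idx), splitting every class in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Cohort sizes from the abstract; the label values are placeholders.
labels = ["influenza"] * 723 + ["sars_cov_2"] * 1281
train_idx, test_idx = stratified_split(labels)


def covid_share(indices):
    """Fraction of SARS-CoV-2 cases among the given row indices."""
    return sum(labels[i] == "sars_cov_2" for i in indices) / len(indices)


print(len(train_idx), len(test_idx))
print(round(covid_share(train_idx), 3), round(covid_share(test_idx), 3))
```

Because each class is split separately, both subsets keep the SARS-CoV-2 share of the full dataset (1,281 of 2,004 cases), which a plain random split would only match approximately.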

Results

The RF model incorporating results from 15 laboratory tests and demographic characteristics discriminated between SARS-CoV-2 and influenza infections, with an area under the ROC curve (AUC) of 0.90 in the independent testing set. The overall agreement with the RT-PCR results was 83% (95% CI: 80-86%). The test with the greatest impact on the differentiation was serum total calcium level. Further, the model achieved an AUC of 0.82 in a new dataset comprising 519 SARS-CoV-2 ED patients (2020/12/1 to 2021/2/28) and the previous 723 influenza positive patients. Serum calcium level remained the most impactful feature for the differentiation.
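For readers unfamiliar with the two metrics reported above, the following sketch shows how an AUC and an overall agreement with a 95% confidence interval can be computed. It is illustrative only. The scores are toy values, the helper names are hypothetical, and the count of 554 agreeing cases out of 668 is an assumption (roughly 83% of one third of the 2,004 patients), not a figure from the study. The abstract does not state which interval method the authors used; a normal-approximation (Wald) interval is shown here.

```python
import math


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def agreement_ci(n_agree, n_total, z=1.96):
    """Overall agreement with a normal-approximation (Wald) interval."""
    p = n_agree / n_total
    half_width = z * math.sqrt(p * (1 - p) / n_total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Toy predictions: 1 = SARS-CoV-2, 0 = influenza (hypothetical scores).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, scores)  # 8 of 9 positive/negative pairs ranked correctly

# Assumed counts, chosen only to be consistent with the reported 83%.
p, low, high = agreement_ci(554, 668)
print(f"AUC = {auc:.2f}")
print(f"agreement = {p:.0%} (95% CI: {low:.0%}-{high:.0%})")
```

Under these assumed counts the interval rounds to 80-86%, matching the reported range. That match shows only that the reported interval is plausible for a test set of this size.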

Conclusion

We identified characteristic laboratory test profiles differentiating SARS-CoV-2 and influenza infections, which may be useful in preparing for overlapping COVID-19 resurgence and future seasonal influenza.

Article activity feed

SciScore for 10.1101/2021.08.06.21261713: (What is this?)

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	IRB: This study was approved by the Institutional Review Board (IRB) of Weill Cornell Medicine and deemed IRB exempt by the University of Buffalo.
Sex as a biological variable	not detected.
Randomization	The whole data set was randomly split into a training set (2/3 of cases) and a testing set (1/3 cases) with the same ratio of SARS-CoV-2/influenza cases as the ratio for the overall cases.
Blinding	not detected.
Power Analysis	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Subsequently, a random forest classifier model was developed incorporating the results of 15 selected laboratory tests and patient age, gender, and race, using the Python scikit-learn package 0.23.2.	Python suggested: (IPython, RRID:SCR_001658) scikit-learn suggested: (scikit-learn, RRID:SCR_002577)

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

A study limitation is that our model’s performance has not been validated in a dataset including concurrent SARS-CoV-2 and influenza positive patients, as influenza RT-PCR testing was suspended from March to September 2020 to prioritize resources for SARS-CoV-2 testing. We attempted to collect new data from November 2020 to February 2021; however, there was only one influenza positive case during this time in our hospital ED. This observation was consistent with the extremely low level of seasonal influenza in North America [12]. Despite a lack of direct comparison, the characteristic profile of SARS-CoV-2 in comparison to influenza infection is still valid and has the potential to impact patient care. The performance of our model could be further improved when it is trained with more concurrent influenza and SARS-CoV-2 patient data.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.

Reviewed by ScreenIT

Related articles

Trends of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) antibody prevalence in selected regions across Ghana

A Preliminary Prognostic Model for Predicting Poor Prognosis in COVID-19 Integrating Lung Epithelial Injury (KL-6) with Routine Care Markers

Clinical course of hospitalizations with Influenza, SARS-CoV-2 and respiratory syncytial virus (RSV) infections in the season 2024/2025 in a large German primary care centre and comparison with the previous two years