A Machine Learning Model Incorporating Laboratory Blood Tests Discriminates Between SARS-CoV-2 and Influenza Infections at Emergency Department Visit
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza virus are contagious respiratory pathogens that cause similar symptoms but require different treatment and management strategies. This study investigated whether laboratory blood tests can discriminate between SARS-CoV-2 and influenza infections at emergency department (ED) presentation.
Methods
We retrospectively analyzed 723 influenza A/B-positive ED patients (2018/1/1 to 2020/3/15) and 1,281 SARS-CoV-2-positive ED patients (2020/3/11 to 2020/6/30). Laboratory test results completed within 48 hours before the virus RT-PCR result was reported, together with patient demographics, were used to train and validate a random forest (RF) model. The dataset was randomly divided into training (2/3) and testing (1/3) sets, each with the same SARS-CoV-2/influenza ratio. The Shapley Additive Explanations (SHAP) technique was employed to visualize the impact of each laboratory test on the differentiation.
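The split described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' code: the function name `stratified_split`, the seed, and the label strings are hypothetical, and only the class counts (723 and 1,281) and the 2/3 to 1/3 proportions come from the abstract. In scikit-learn, `train_test_split` with its `stratify` argument serves the same purpose.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=1 / 3, seed=0):
    """Split indices into train/test sets while preserving the class ratio.

    Hypothetical helper for illustration; each class is shuffled and
    divided separately so both sets keep the overall class proportions.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class counts taken from the abstract; the label strings are placeholders.
labels = ["influenza"] * 723 + ["sars_cov_2"] * 1281
train_idx, test_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

Because both class counts are divisible by three, the SARS-CoV-2/influenza ratio in this sketch is identical in the training set, the testing set, and the full dataset.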
Results
The RF model, which incorporated results from 15 laboratory tests and demographic characteristics, discriminated between SARS-CoV-2 and influenza infections with an area under the ROC curve (AUC) of 0.90 in the independent testing set. The overall agreement with the RT-PCR results was 83% (95% CI: 80-86%). The test with the greatest impact on the differentiation was serum total calcium level. Further, the model achieved an AUC of 0.82 in a new dataset including 519 SARS-CoV-2 ED patients (2020/12/1 to 2021/2/28) and the previous 723 influenza-positive patients. Serum calcium level remained the most impactful feature for the differentiation.
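To make the two reported metrics concrete, the following minimal sketch computes an AUC from scores and a confidence interval for an agreement proportion. It rests on assumptions: the toy labels and scores are invented for illustration, the testing-set size of 668 is inferred as one third of the 2,004 patients rather than stated in the abstract, and the normal-approximation interval is only one common choice, since the abstract does not say which interval method was used. The function names are hypothetical.

```python
import math


def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) 95% interval for a proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Invented toy data: 1 = SARS-CoV-2, 0 = influenza.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc_from_scores(y_true, scores)
print(f"toy AUC = {toy_auc:.3f}")

# Reported agreement of 83%; n = 668 is an assumed testing-set size.
low, high = proportion_ci(0.83, 668)
print(f"95% CI = {low:.3f} to {high:.3f}")
```

Under these assumptions the interval rounds to 80-86%, which matches the reported range, although that agreement does not confirm the authors' method.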
Conclusion
We identified characteristic laboratory test profiles differentiating SARS-CoV-2 and influenza infections, which may be useful in preparing for overlapping COVID-19 resurgence and future seasonal influenza.
Article activity feed
SciScore for 10.1101/2021.08.06.21261713: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Ethics: IRB: This study was approved by the Institutional Review Board (IRB) of Weill Cornell Medicine and deemed IRB exempt by the University of Buffalo.
Sex as a biological variable: not detected.
Randomization: The whole data set was randomly split into a training set (2/3 of cases) and a testing set (1/3 cases) with the same ratio of SARS-CoV-2/influenza cases as the ratio for the overall cases.
Blinding: not detected.
Power Analysis: not detected.
Table 2: Resources
Software and Algorithms
Sentence: Subsequently, a random forest classifier model was developed incorporating the results of 15 selected laboratory tests and patient age, gender, and race, using the Python scikit-learn package 0.23.2.
Resources:
- Python, suggested: (IPython, RRID:SCR_001658)
- scikit-learn, suggested: (scikit-learn, RRID:SCR_002577)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
A study limitation is that our model’s performance has not been validated in a dataset including concurrent SARS-CoV-2 and influenza positive patients, as influenza RT-PCR testing was suspended from March to September 2020 to prioritize resources for SARS-CoV-2 testing. We attempted to collect new data from November 2020 to February 2021; however, there was only one influenza positive case during this time in our hospital ED. This observation was consistent with the extremely low level of seasonal influenza in North America [12]. Despite a lack of direct comparison, the characteristic profile of SARS-CoV-2 in comparison to influenza infection is still valid and has the potential to impact patient care. The performance of our model could be further improved when it is trained with more concurrent influenza and SARS-CoV-2 patient data.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.