Routine Laboratory Blood Tests Predict SARS-CoV-2 Infection Using Machine Learning
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Background
Accurate diagnostic strategies to rapidly identify SARS-CoV-2 positive individuals are urgently needed for the management of patient care and the protection of health care personnel. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available, with a turn-around time (TAT) usually within 1-2 hours.
Method
We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual’s SARS-CoV-2 infection status. Laboratory testing results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model on data from 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital.
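As a minimal sketch of the 2-day look-back window described above, the following Python snippet keeps, for each laboratory test, the most recent value reported within 2 days before the RT-PCR result release. The record layout, the placeholder test names (`test_a`, `test_b`, `test_c`), the helper `build_feature_row`, and the choice of the most recent value per test are illustrative assumptions; the abstract does not specify them.

```python
from datetime import datetime, timedelta


def build_feature_row(lab_results, pcr_release_time, window_days=2):
    """Collect one value per lab test from the look-back window.

    lab_results: iterable of (test_name, result_time, value) tuples
    (hypothetical layout, not taken from the paper).
    Keeps the most recent value of each test whose result time falls
    within `window_days` before `pcr_release_time` (assumed rule).
    """
    window_start = pcr_release_time - timedelta(days=window_days)
    latest = {}
    for test_name, result_time, value in lab_results:
        if not (window_start <= result_time <= pcr_release_time):
            continue  # outside the look-back window
        if test_name not in latest or result_time > latest[test_name][0]:
            latest[test_name] = (result_time, value)
    return {name: value for name, (_, value) in latest.items()}


# Toy data with placeholder test names and made-up values.
pcr_release = datetime(2020, 4, 10, 12, 0)
labs = [
    ("test_a", datetime(2020, 4, 9, 8, 0), 12.5),
    ("test_a", datetime(2020, 4, 10, 9, 0), 15.0),    # more recent, kept
    ("test_b", datetime(2020, 4, 7, 8, 0), 300.0),    # older than 2 days
    ("test_c", datetime(2020, 4, 10, 13, 0), 800.0),  # after the release
]
row = build_feature_row(labs, pcr_release)
print(row)
```

A row built this way, joined with the demographic features, would form one input vector for a classifier; tests missing from the window are simply absent from the dictionary and would need whatever missing-value handling the chosen model supports.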
Results
The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), supporting the generalizability of the model. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days.
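For readers unfamiliar with how an AUC and its confidence interval can be obtained, here is a dependency-free Python sketch. It computes the AUC as the fraction of positive/negative pairs ranked correctly and estimates a 95% interval with a percentile bootstrap. The bootstrap is an assumption made for illustration; the abstract does not state how the reported interval was derived, and the data below are toy values.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = min(n_boot - 1, int(n_boot * (1 - alpha / 2)))
    return stats[lo_idx], stats[hi_idx]


# Toy labels (1 = RT-PCR positive) and model scores, not study data.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point = auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_ci(toy_labels, toy_scores)
print(point, ci_low, ci_high)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would scale better on a cohort of several thousand tests.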
Conclusion
This model employing routine laboratory test results offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
Article activity feed
SciScore for 10.1101/2020.06.17.20133892:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Institutional Review Board Statement: IRB: This study was approved by the Institutional Review Board (#20-03021671) of Weill Cornell Medicine.
Randomization: The first setting was a 5-fold cross validation with the NYPH/WCM data, where all RT-PCR tests were randomly partitioned into 5 equal buckets with the same positive/negative ratio in each bucket as the ratio over all tests.
Blinding: not detected.
Power Analysis: not detected.
Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
Sentence: At NYPH/LMH, Routine chemistry testing including procalcitonin was performed on Abbott ARCHITECT® c SYSTEM ci 4100 and ci 8200 analyzers.
Resource: Abbott; suggested: (Abbott, RRID:SCR_010477)
Sentence: The implementation was based on scikit-learn package 0.23.1 (22) with the sklearn.model_selection.
Resource: scikit-learn; suggested: (scikit-learn, RRID:SCR_002577)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
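The randomization described in Table 1 (5 equal buckets, each with the same positive/negative ratio as the full set of tests) can be illustrated with a short sketch. The quoted sentence cites sklearn.model_selection, but the exact call is not given, so the version below is a dependency-free illustration of the same partitioning idea rather than the authors' code. The function name `stratified_folds` and the toy label counts are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into n_folds buckets that preserve the
    class ratio: shuffle each class separately, then deal its indices
    round-robin across the buckets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % n_folds].append(i)
    return folds


# Toy labels: 40 positive and 60 negative tests (not the study counts).
toy_labels = [1] * 40 + [0] * 60
folds = stratified_folds(toy_labels)

# Each fold serves once as the held-out set; the rest form the training set.
for k, test_idx in enumerate(folds):
    train_idx = [i for j, f in enumerate(folds) if j != k for i in f]
    n_pos = sum(toy_labels[i] for i in test_idx)
    print(k, len(train_idx), len(test_idx), n_pos)
```

With class counts that divide evenly by the number of folds, every bucket has exactly the overall ratio; otherwise bucket sizes differ by at most one sample per class.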
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
There are three potential limitations to the use of this model. First, the model was trained on a dataset generated from a patient cohort who were in the hospital for moderate to life-threatening presentations of COVID-19. Thus, this model may not be applicable to mild COVID-19 cases. Second, the model was developed with a “control group” of ill patients in a metropolitan hospital for other causes. Thus, the model may need further refinement with different populations such as patients seen in a primary care office. Third, clinical application of the proposed model may require modification of laboratory testing practice to include tests that are not currently part of the institutional COVID-like illness (CLI) laboratory test panel. Generally speaking, an ideal training set for a learning-based approach should cover the variability of samples across different demographic and geographic distributions, as well as comorbidities, facilities (e.g. ED, inpatients, out-patient clinics) and to follow their changes over time. In practice, any training set collected within a fixed time period cannot satisfy all these wishes. The deployment of software in medical scenarios cannot be achieved by one stop. It is a continuous learning process that involves model monitoring, updating and customization. The US Food and Drug Administration (FDA) published a white paper (33) last year particularly discussing how to properly regulate the adaptations/modifications of AI/machine learning models as ...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.