A novel specific artificial intelligence-based method to identify COVID-19 cases using simple blood exams

Abstract

The SARS-CoV-2 virus responsible for COVID-19 poses a significant challenge to healthcare systems worldwide. Despite governmental initiatives aimed at containing the spread of the disease, several countries are experiencing unmanageable increases in the demand for ICU beds, medical equipment, and testing capacity. Efficient COVID-19 diagnosis enables healthcare systems to provide better care for patients while protecting caregivers from the disease. However, many countries are constrained by the limited number of test kits available and by a lack of equipment and trained professionals. For patients visiting emergency rooms (ERs) with suspected COVID-19, prompt diagnosis may improve the outcome and even provide information for efficient hospital management. In such a context, a quick, inexpensive, and readily available test to perform an initial triage in ERs could help to smooth patient flow, provide better patient care, and reduce the backlog of exams.

Methods

In this case-control quantitative study, we developed a strategy backed by artificial intelligence to perform an initial screening of suspected COVID-19 patients. We developed a machine learning classifier that takes widely available simple blood exams as input and classifies samples as likely to be positive (having SARS-CoV-2) or negative (not having SARS-CoV-2). Based on this initial classification, positive cases can be referred for further highly sensitive testing (e.g., CT scan or specific antibody tests). We used publicly available data from the Albert Einstein Hospital in Brazil from 5,644 patients. Focusing on simple blood exam figures as the main predictors, we selected a sample of 599 subjects that had the fewest missing values for 16 common exams. Of these 599 patients, 81 tested positive for SARS-CoV-2 (determined by RT-PCR). Based on the reduced dataset, we built an artificial intelligence classification framework, ER-CoV, aimed at predicting whether suspected patients arriving in the ER are likely to be negative for SARS-CoV-2. The primary goal of this investigation is to develop a classifier with high specificity and a high negative predictive value, with reasonable sensitivity.
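The selection of subjects with the fewest missing exam values can be illustrated with a minimal sketch. The exam names, the record layout, and the `max_missing` threshold below are assumptions made for illustration only; the abstract does not state which 16 exams were used or how the cut-off was chosen.

```python
from typing import Dict, List, Optional

# One patient record: exam name -> measured value, or None when not taken.
Record = Dict[str, Optional[float]]


def count_missing(record: Record, exams: List[str]) -> int:
    """Number of the chosen exams that have no value for this patient."""
    return sum(1 for exam in exams if record.get(exam) is None)


def select_most_complete(
    records: List[Record], exams: List[str], max_missing: int = 0
) -> List[Record]:
    """Keep patients missing at most `max_missing` of the chosen exams."""
    return [r for r in records if count_missing(r, exams) <= max_missing]


# Hypothetical exam names and values, not taken from the dataset.
EXAMS = ["leukocytes", "platelets", "lymphocytes", "c_reactive_protein"]

patients: List[Record] = [
    {"leukocytes": 6.1, "platelets": 250.0, "lymphocytes": 1.9, "c_reactive_protein": 3.2},
    {"leukocytes": 4.8, "platelets": None, "lymphocytes": 1.1, "c_reactive_protein": None},
    {"leukocytes": 7.4, "platelets": 310.0, "lymphocytes": 2.4},  # CRP not taken
]

complete = select_most_complete(patients, EXAMS)
nearly_complete = select_most_complete(patients, EXAMS, max_missing=1)
```

A stricter threshold keeps fewer patients but avoids imputing values; a looser one keeps more patients at the cost of handling gaps later.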

Findings

Our AI framework achieved an average specificity of 85.98% [95% CI: 84.94% – 86.84%] and a negative predictive value (NPV) of 94.92% [95% CI: 94.37% – 95.37%]. These values are aligned with our goal of providing an effective low-cost system to triage suspected patients in ERs. As for sensitivity, our model achieved an average of 70.25% [95% CI: 66.57% – 73.12%] and a positive predictive value (PPV) of 44.96% [95% CI: 43.15% – 46.87%]. The area under the curve (AUC) of the receiver operating characteristic (ROC) was 86.78% [95% CI: 85.65% – 87.90%]. An error analysis (inspection of which patients were misclassified) found that, on average, 28% of the false-negative patients would have been hospitalized anyway; thus some of the model's mistakes involve severe cases that would not be overlooked, which partially mitigates the fact that the test is not highly sensitive. All code for our AI model, called ER-CoV, is publicly available at https://github.com/soares-f/ER-CoV .
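Sensitivity, specificity, PPV, and NPV all derive from the confusion matrix of model predictions against the RT-PCR result. The following is a minimal sketch of those definitions, not the ER-CoV code; the confusion counts are hypothetical and chosen only to illustrate the formulas. The second function applies Bayes' rule to the reported average sensitivity and specificity, using the sample prevalence of 81/599.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of predictions against the RT-PCR reference result."""
    tp: int  # predicted positive, RT-PCR positive
    fp: int  # predicted positive, RT-PCR negative
    tn: int  # predicted negative, RT-PCR negative
    fn: int  # predicted negative, RT-PCR positive


def screening_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard screening rates computed from confusion counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """PPV and NPV implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical counts for one evaluation split (not from the study).
example = ConfusionCounts(tp=14, fp=15, tn=85, fn=6)
metrics = screening_metrics(example)

# Reported average sensitivity and specificity, with prevalence 81/599.
ppv, npv = predictive_values(0.7025, 0.8598, 81 / 599)
```

With those inputs the implied PPV and NPV come out at roughly 44% and 95%, close to the reported values. The reported figures are averages, so exact agreement is not expected.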

Interpretation

Based on the capacity of our model to accurately predict which cases are negative among suspected patients arriving in emergency rooms, we envision that this framework may play an important role in patient triage. Probably the most important outcome is related to testing availability, which at this point is extremely low in many countries. Considering the achieved specificity, we could reduce by at least 90% the number of SARS-CoV-2 tests performed in emergency rooms, with an approximately 5% chance of getting a false negative. The second important outcome is related to patient management in hospitals. Patients predicted as positive by our framework could be immediately separated from other patients while waiting for the results of confirmatory tests. This could reduce the spread rate within hospitals, since in many of them all suspected cases are kept in the same ward. In Brazil, where the data were collected, the infection is starting to spread quickly and the lead time of a SARS-CoV-2 test may be up to 2 weeks.
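The roughly 5% false-negative figure appears consistent with the NPV reported in Findings: among patients predicted negative, the share who are actually positive is 1 − NPV. The sketch below works through that arithmetic for the point estimate and the confidence interval bounds; the function name is illustrative, and reading the 5% figure as 1 − NPV is an assumption.

```python
def false_omission_rate(npv: float) -> float:
    """Share of predicted-negative patients who are actually positive."""
    if not 0.0 <= npv <= 1.0:
        raise ValueError("npv must be a proportion between 0 and 1")
    return 1.0 - npv


# NPV point estimate and 95% CI bounds as reported in Findings.
reported_npv = {"estimate": 0.9492, "ci_low": 0.9437, "ci_high": 0.9537}

for label, npv in reported_npv.items():
    print(f"{label}: {false_omission_rate(npv):.2%}")
```

The point estimate gives about 5.1%, and the interval bounds give roughly 4.6% to 5.6%.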

Article activity feed

  1. SciScore for 10.1101/2020.04.10.20061036:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Antibodies
    Sentence: In case of a positive result, considering the model’s relatively low sensitivity, priority should be given to this patient for further investigation including confirmatory PCR or antibody-based test for SARS-CoV-2 and CT scan.
    Resource: antibody-based test for SARS-CoV-2
    Suggested: None

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    A limitation of our study is that we were not able to identify with high certainty which blood exams contribute the most to the classification, due to the nature of our AI model framework. However, previous studies have already identified that C-reactive protein [21], leukocytes [22,23], platelets [22], and lymphocytes [24] are altered at different levels in COVID-19 patients. Thus, we envision for future research a detailed study of which blood exams are more informative for differential diagnosis, as well as of how the SARS-CoV-2 virus alters blood components. Another possible limitation is that data were collected only at the emergency room, from patients already displaying symptoms compatible with COVID-19. At this point, due to the lack of data from asymptomatic patients, we cannot generalize how our model would perform for a group of individuals who do not display the characteristic symptoms of COVID-19. In this paper, we report a novel method for the classification of COVID-19 patients in emergency rooms. The ER-CoV method is low-cost, relies only on simple blood exams that are fast and highly available, and resorts to artificial intelligence methods to model such patients. We achieved significant results and foresee many applications of this framework. We thus support additional initiatives such as this one by the Albert Einstein Hospital, since the availability of data on COVID-19 tests allows the proposition of AI-based methods that could support medical...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.