Validation of a Deep Learning Model to aid in COVID-19 Detection from Digital Chest Radiographs


Abstract

Introduction

Using artificial intelligence in imaging practice enables study-list reprioritization, ensures prompt attention to urgent studies, and reduces reporting turnaround time.

Purpose

To test a deep learning-based artificial intelligence model that detects COVID-19 pneumonia patterns on digital chest radiographs.

Material and Methods

The deep learning model, named DxCOVID, was built on an enhanced U-Net architecture with a spatial attention gate and an Xception encoder. It was tested on an external clinical dataset of 2247 chest radiographs: 1046 from COVID-19-positive patients (positive on RT-PCR) and 1201 from COVID-19-negative patients.
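A minimal sketch of what such a network might look like, assuming a Keras/TensorFlow implementation: the exact DxCOVID architecture is not published here, so the spatial attention gate and classification head below are illustrative stand-ins attached to a standard ImageNet-pretrained Xception encoder.

```python
# Hypothetical sketch of a DxCOVID-like classifier; the gate and head are
# illustrative, not the authors' code.
import tensorflow as tf
from tensorflow.keras import layers, Model

def spatial_attention_gate(features):
    # A 1x1 conv + sigmoid produces a spatial map that reweights the features.
    attn = layers.Conv2D(1, kernel_size=1, activation="sigmoid")(features)
    return layers.Multiply()([features, attn])

def build_model(input_shape=(299, 299, 3)):
    # ImageNet-pretrained Xception backbone as the encoder.
    encoder = tf.keras.applications.Xception(
        include_top=False, weights="imagenet", input_shape=input_shape
    )
    x = spatial_attention_gate(encoder.output)
    x = layers.GlobalAveragePooling2D()(x)
    prob = layers.Dense(1, activation="sigmoid", name="covid19_probability")(x)
    return Model(encoder.input, prob)

model = build_model()
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
```

In a full U-Net, gates of this kind would typically sit on the skip connections between encoder and decoder stages; the single-gate classifier above only illustrates the reweighting idea.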

Results

We compared the performance of the model with that of three radiologists by adjusting the model’s operating threshold to match each radiologist’s sensitivity. The model’s area under the receiver operating characteristic (ROC) curve (AUC) was 0.87 [95% CI: 0.85, 0.89].
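The two evaluation steps described above can be sketched as follows (illustrative only, not the authors' analysis code): a bootstrap estimate of the 95% CI for the AUC, and selection of an operating threshold that matches a given radiologist's sensitivity. The synthetic scores stand in for the real test-set outputs.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, seed=0):
    # Percentile bootstrap over resampled cases.
    rng = np.random.default_rng(seed)
    aucs = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)           # resample with replacement
        if len(np.unique(y_true[idx])) < 2:   # AUC needs both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [2.5, 97.5])

def threshold_matching_sensitivity(y_true, y_score, target_sensitivity):
    # First ROC operating point whose sensitivity reaches the target.
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    i = int(np.argmax(tpr >= target_sensitivity))
    return thresholds[i], tpr[i], 1.0 - fpr[i]

# Synthetic stand-in for the 2247-image test set (1046 positive, 1201 negative).
rng = np.random.default_rng(1)
y_true = np.concatenate([np.ones(1046), np.zeros(1201)]).astype(int)
y_score = np.concatenate([rng.beta(5, 2, 1046), rng.beta(2, 5, 1201)])

print("AUC:", round(roc_auc_score(y_true, y_score), 3))
print("95% CI:", bootstrap_auc_ci(y_true, y_score))
thr, sens, spec = threshold_matching_sensitivity(y_true, y_score, 0.80)
print(f"threshold={thr:.3f}  sensitivity={sens:.3f}  specificity={spec:.3f}")
```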

Conclusion

When compared with three expert readers, DxCOVID matched the output of two of the three. Disease-specific deep learning models built with current technology are mature enough to match radiologists’ performance and are suitable tools for incorporation into imaging workflows.

Article activity feed

  1. SciScore for 10.1101/2022.06.02.22275895:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our study has some limitations. First, the external test set was sourced from a single institution. It is critical to test the AI algorithm across multiple datasets distributed across geographical regions to ensure that the model’s results generalize across cohorts and geographies. Second, we used RT-PCR as the gold standard for the diagnosis of COVID-19 infection. However, RT-PCR has a limited sensitivity of approximately 71%, so there may be cases in which a patient is COVID-19 positive on chest radiographs but negative on RT-PCR. Third, our study does not incorporate clinical parameters and does not attempt to categorise patients by COVID-19 severity score. We avoided assigning COVID-19 scores based on chest radiographs because, unlike with chest CT, there is often interobserver disagreement on the extent of lung involvement on radiographs, owing to differences in acquisition protocols, image quality, and radiologist opinion. An AI system built on the consensus scoring of a few radiologists from a single geographical location may not represent the consensus of radiologists globally. Nonetheless, some researchers have attempted to build such systems, such as Borghesi & Maroldi (22) on a small dataset of 100 patients; Monaco et al. (23), with a dataset of 295 patients, and Orsi et al. (24), with a dataset of 155 patients, also produced scoring systems for chest radiographs and linked them...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a protocol registration statement.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinnings of the rigor criteria and the tools shown here, including the references cited, please follow this link.