A Hybrid Pipeline for Covid-19 Screening Incorporating Lungs Segmentation and Wavelet Based Preprocessing of Chest X-Rays

Abstract

We have developed a two-module pipeline for the detection of SARS-CoV-2 from chest X-rays (CXRs). Module 1 is a traditional convnet that generates masks of the lungs overlapping the heart and large vasa. Module 2 is a hybrid convnet that preprocesses CXRs and corresponding lung masks by means of the Wavelet Scattering Transform, and passes the resulting feature maps through an Attention block and a cascade of Separable Atrous Multiscale Convolutional Residual blocks to produce a class assignment as Covid or non-Covid. Module 1 was trained on a public dataset of 6395 CXRs with radiologist-annotated lung contours. Module 2 was trained on a dataset of 2362 non-Covid and 1435 Covid CXRs acquired at the Henry Ford Health System Hospital in Detroit. Six distinct cross-validation models were combined into an ensemble model that was used to classify the CXR images of the test set. An intuitive graphical interface allows for rapid Covid vs. non-Covid classification of CXRs, and generates high-resolution heat maps that identify the affected lung regions.
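
The abstract does not state how the six cross-validation models are combined. As a minimal sketch only, assuming soft voting (averaging each model's predicted Covid probability and applying a threshold), the ensemble step could look like the following. The function names, the 0.5 threshold, and the example probabilities are hypothetical and are not taken from the preprint or its source code.

```python
from statistics import mean


def ensemble_probability(fold_probs):
    """Average the Covid probabilities predicted by each fold model.

    Assumption: soft voting. The preprint may combine models differently.
    """
    if not fold_probs:
        raise ValueError("need at least one model output")
    return mean(fold_probs)


def classify(fold_probs, threshold=0.5):
    """Label one CXR from the averaged probability (threshold is hypothetical)."""
    if ensemble_probability(fold_probs) >= threshold:
        return "Covid"
    return "non-Covid"


# Hypothetical outputs of six cross-validation models for a single CXR.
example = [0.81, 0.74, 0.66, 0.90, 0.58, 0.77]
label = classify(example)
```

Averaging probabilities rather than counting hard votes keeps each model's confidence in the final score, which is one common reason to prefer soft voting when the fold models share an architecture.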

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/10480343.

    This review is the result of a Live Review organized and hosted by PREreview and JMIR Publications on September 2, 2022. The call was joined by 15 people, including reviewers, preprint authors, and facilitators.

    Summary 

    The authors of this study present a novel strategy and methodology to classify patients' chest X-ray (CXR) images as SARS-CoV-2 positive (covid+) or SARS-CoV-2 negative (covid-). The proposed method is presented as an alternative to existing CXR evaluation methods, which require trained expertise and sophisticated equipment. The methodology is based on a hybrid artificial intelligence (AI) pipeline that relies predominantly on wavelet scattering processing to encode the principal features of the CXRs. The reviewers considered this study innovative and potentially very useful for identifying new positive cases associated with variants that escape RT-PCR-based diagnosis. The hybrid approach is a promising solution, particularly as it is presented as a quicker and more cost-effective option for a pandemic that is here to stay. However, the reviewers raised some concerns and questions that they believe would be important for the authors to address. Those are outlined below.

    Evidence and Examples: Concerns and Constructive feedback

    • The reviewers expressed some concern about the timing of CXR acquisition across patients, the severity of the cases, and comorbidities, all of which could have affected the results. If those data are available, it would be helpful to compare CXRs at matched time points, as different acuity phases of the infection may lead to different results.

    • The reviewers wonder whether having the contours of the region to analyze drawn by different radiologists may have undermined the assessment of the model's performance. It would be useful to discuss the limitations of this approach in more depth.

    • Module 2 was designed to test replacement of learned coefficients with fixed coefficients; however, the alternative proposition of training a new network on the COVID-19 positive and negative CXRs was not explored. Are there implications that would need to be discussed related to this point?

    • Can the model identify CXRs showing lung infection, and perhaps those of ARDS patients? That would be important for triaging COVID patients by mortality risk.

    • Can this same approach be used for the diagnosis of other pneumonia-like diseases? Is the model specific enough to differentiate true covid+ cases from other similar diseases that mimic SARS-CoV-2-induced lesions? Perhaps the authors can discuss a bit more the specificity of the model and its replicability.

    • Have the authors checked AUCs using, for example, ResNet50 or any other deep learning (DL) architecture? In essence, does a VAE encode the same (or similar) features as those encoded by the wavelet scattering approach?
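
    The AUC comparison raised in the last point can be illustrated with a small sketch. It computes AUC as the probability that a randomly chosen covid+ image scores higher than a randomly chosen covid- image, with ties counted as one half. The labels and scores are made-up illustrative values, not results from the preprint, and the names `scattering_scores` and `baseline_scores` are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for covid+, 0 for covid-. Ties contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for six images from two hypothetical models.
labels = [1, 1, 1, 0, 0, 0]
scattering_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
baseline_scores = [0.7, 0.4, 0.6, 0.5, 0.3, 0.65]

auc_scattering = roc_auc(labels, scattering_scores)
auc_baseline = roc_auc(labels, baseline_scores)
```

    Evaluating both models on the same held-out images in this way would give a like-for-like comparison. The pairwise loop is quadratic in the number of images, so a rank-based implementation from an established library would be preferable for a real test set.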

    Below are other general suggestions and recommendations to improve on the clarity of the work presented.

    • Reviewers suggest rewriting the manuscript in the more classic "IMRD" format, with a discussion section that points out the limitations and overall conclusions of the study.

    • Regarding figures, reviewers suggest the addition of headers to the figures to help the reader get oriented and better match the figure to the results section.

    • Reviewers found the figures of the "mask" used to identify the contours of the tissue analyzed very helpful and suggest that the authors refer to those figures the first time the mask is mentioned in the text.

    • Did the training dataset come from a diverse group of individuals? If so, it would be good to state this. If not, the discussion should reflect on the limitations that may arise from data drawn from a homogeneous group of people.

    • In the Methods section please mention possible IRB exemption or approval and any other ethical considerations.

    Other points and final remarks

    Overall, the reviewers really appreciated that the authors took time to compare their new methodology with current state-of-the-art approaches, and that the source code for the analysis was deposited on GitHub, which will hopefully help other groups test the model on other data. The fact that the data are not publicly available, however, makes it hard to reproduce the results presented in the current manuscript.

    Several authors of the preprint were present in the call and contributed to the discussion. The organizing team is grateful to all the participants of the PREreview + JMIR Publications Live Review, in particular to the subject matter expert, Dr. Hadi Kharazzi, for his contribution to the discussion, and to the preprint authors who engaged in this type of collaborative review.

    Competing interests

    The author of this review is a member of the PREreview team and facilitated the Live Review. They synthesized the notes from the Live Review discussion into this review and also contributed with a few suggestions for the preprint authors.

  2. SciScore for 10.1101/2022.03.13.22272311:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics: not detected.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.