Deep learning models for COVID-19 chest x-ray classification: Preventing shortcut learning using feature disentanglement
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
In response to the COVID-19 global pandemic, recent research has proposed creating deep learning-based models that use chest radiographs (CXRs) in a variety of clinical tasks to help manage the crisis. However, existing datasets of CXRs from COVID-19+ patients are relatively small, and researchers often pool CXR data from multiple sources, for example, images acquired with different x-ray machines in various patient populations under different clinical scenarios. Deep learning models trained on such datasets have been shown to overfit to erroneous features instead of learning pulmonary characteristics, a phenomenon known as shortcut learning. We propose adding feature disentanglement to the training process. This technique forces the models to identify pulmonary features from the images and penalizes them for learning features that can discriminate between the original datasets that the images come from. We find that models trained in this way indeed generalize better to unseen data; in the best case, feature disentanglement improved AUC by 0.13 on held-out data. We further find that this outperforms masking out non-lung parts of the CXRs and performing histogram equalization, both of which are recently proposed methods for removing biases in CXR datasets.
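The penalty described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so the adversarial form below (disease loss minus a weighted dataset-source loss) is an assumption, and the names `encoder_loss`, `cross_entropy`, and `lam` are hypothetical, chosen only for illustration.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under a probability vector."""
    return -math.log(max(probs[label], 1e-12))


def encoder_loss(disease_probs, disease_label, source_probs, source_label, lam=1.0):
    """Hypothetical encoder objective (an assumption, not the paper's exact loss).

    It is low when the disease head is right and the dataset-source head is wrong.
    """
    task_loss = cross_entropy(disease_probs, disease_label)
    source_loss = cross_entropy(source_probs, source_label)
    # Subtracting the source loss penalizes features that reveal the source dataset.
    return task_loss - lam * source_loss


# Same disease prediction, two hypothetical encoders.
disease_probs, disease_label = [0.2, 0.8], 1
# Encoder whose features make the source dataset easy to guess.
leaky = encoder_loss(disease_probs, disease_label, [0.9, 0.1], 0)
# Encoder whose features leave the source classifier at chance.
invariant = encoder_loss(disease_probs, disease_label, [0.5, 0.5], 0)
print(leaky > invariant)
```

Under this sketch, an encoder whose features leave the source classifier at chance receives a lower loss than one whose features leak the source, even when both predict disease equally well. In practice the weight `lam` would trade off task accuracy against source invariance.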
Article activity feed
SciScore for 10.1101/2021.02.11.20196766:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: Clinicians should be aware of potential limitations and biases when incorporating model predictions into their clinical assessment. Finally, our approach has potential clinical applications beyond automated diagnosis. CXR diagnostic models that rely on relevant pulmonary findings may be also useful for the development of prognostic models, by combining the CXR information with other clinical and demographic data to predict which patients are at risk for severe disease.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.