Classification of Pneumonia from CXR scans using an Ensemble of Weighted CNNs, Attention block, and a Visual Transformer — EWCAVIT
Abstract
Pneumonia is a serious respiratory disease that causes high mortality worldwide, especially in children under the age of 5 and the elderly. Early detection of pneumonia is crucial for preventing severe complications and improving patient outcomes. Chest X-Ray (CXR) imaging is a common method for diagnosing pneumonia, but interpreting the scans requires a high level of expertise. Automatic detection methods based on artificial intelligence can therefore help improve diagnostic accuracy. This paper proposes a novel model, EWCAVIT, for the classification of respiratory diseases from CXR scans. The EWCAVIT model is based on an ensemble of Convolutional Neural Networks (CNNs), a Spatial and Channel Attention block, and a Visual Transformer (ViT) that operate in sequential order. The CNN branch extracts the initial local features from the input image, the Spatial and Channel Attention blocks focus on the relevant features, and the Visual Transformer further captures the global features of the image. Moreover, each encoder in the CNN branch operates on a different image scale, with scales ranging from 24 × 24 to 112 × 112. The model can therefore capture both local and global features of the input image and obtain a richer representation of it. The model is evaluated on the PneumoniaMNIST dataset to classify pneumonia from CXR images. It achieves a highest accuracy of 93.6% on the test set, which is comparable to or better than state-of-the-art methods.
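
The sequential data flow described in the abstract (weighted CNN branches, then channel and spatial attention, then tokens for the transformer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. All function names here are hypothetical, the attention gates are parameter-free simplifications of what would normally be learned layers, the channel-then-spatial ordering and the branch weights are assumptions, and the toy branches already share one shape, whereas real multi-scale branches would first need to be resized to a common resolution.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dims(fmap):
    """Return (channels, height, width) of a nested-list C x H x W map."""
    return len(fmap), len(fmap[0]), len(fmap[0][0])


def weighted_fusion(branch_maps, weights):
    """Blend same-shaped feature maps with weights normalised to sum to 1."""
    total = sum(weights)
    norm = [w / total for w in weights]
    n_ch, height, width = dims(branch_maps[0])
    return [
        [
            [sum(n * m[c][i][j] for n, m in zip(norm, branch_maps))
             for j in range(width)]
            for i in range(height)
        ]
        for c in range(n_ch)
    ]


def channel_attention(fmap):
    """Scale each channel by the sigmoid of its global average (assumed gate)."""
    out = []
    for channel in fmap:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = sigmoid(mean)
        out.append([[gate * v for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each location by the sigmoid of its mean over channels (assumed gate)."""
    n_ch, height, width = dims(fmap)
    gates = [
        [sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[gates[i][j] * fmap[c][i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_ch)
    ]


def to_tokens(fmap):
    """Flatten a C x H x W map into H*W tokens of length C."""
    n_ch, height, width = dims(fmap)
    return [
        [fmap[c][i][j] for c in range(n_ch)]
        for i in range(height)
        for j in range(width)
    ]


# Two toy 2-channel, 2 x 2 feature maps standing in for CNN branch outputs.
branch_a = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
branch_b = [[[3.0, 2.0], [1.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]

# Illustrative weights only; the abstract does not state how branches are weighted.
fused = weighted_fusion([branch_a, branch_b], weights=[3.0, 1.0])
attended = spatial_attention(channel_attention(fused))
tokens = to_tokens(attended)
print(len(tokens), len(tokens[0]))  # prints "4 2"
```

In a full model, the resulting token sequence would be passed to a transformer encoder, whose self-attention relates every spatial location to every other one and so supplies the global context that the convolutional branches lack.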