Efficacy of lightweight Vision Transformers in diagnosis of pneumonia

Muhammad Tayyeb Bukhari


Abstract

Pneumonia is one of the leading causes of death in children under five, particularly in resource-limited settings. Timely and accurate detection, usually from chest X-rays, remains difficult because trained professionals are scarce and traditional diagnostic methods have limitations. In recent years, Artificial Intelligence (AI) models, especially Convolutional Neural Networks (CNNs), have increasingly been applied to automate pneumonia detection. However, CNNs are often computationally expensive and struggle to capture long-range dependencies in images, which limits their efficacy in some medical applications. Lightweight hybrid Vision Transformers (ViTs), which combine the strengths of CNNs and transformers, offer a promising way to address these limitations. This study compares the efficacy of two lightweight CNNs (EfficientNet Lite0 and MobileNetV3 Large) with two hybrid ViTs (MobileViT Small and EfficientFormerV2 S0) for pneumonia detection. The models were evaluated on a publicly available chest X-ray dataset using accuracy, F1 score, precision, and recall. Results show that the hybrid models, particularly MobileViT Small, outperformed their CNN counterparts in both accuracy (97.50%) and F1 score (0.9664), demonstrating the potential of ViT-based models for medical imaging tasks. The findings also suggest that hybrid models achieve higher recall and therefore fewer false negatives, which is crucial in medical diagnostics. Further research should focus on optimizing these hybrid models to improve computational efficiency while maintaining high diagnostic performance.
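
The four evaluation metrics named in the abstract can all be derived from the counts in a binary confusion matrix. The following is a minimal sketch in plain Python, not the study's own evaluation code: the function name `binary_metrics`, the label convention (1 = pneumonia, 0 = normal), and the example labels are assumptions made for illustration only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the
    pneumonia class (assumed to be encoded as 1).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of all predicted pneumonia cases, how many were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all true pneumonia cases, how many were found.
    # A missed case (false negative) lowers recall.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_negatives": fn,
    }


# Made-up labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy example one pneumonia case is missed, so recall drops to 0.75 even though most predictions are correct. This is the link between recall and false negatives that the abstract highlights.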

Version published to 10.1101/2024.10.24.24316057 on medRxiv
Oct 24, 2024

Related articles

An Ensemble Learning Approach using Self-Supervised and Meta-Learning for Few-Shot Pneumonia Detection in Chest X-Ray Images

This article has 3 authors:
1. Atukunda Doreen
2. Waweru Mwangi
3. Petronilla Muriithi
This article has no evaluations
Latest version Aug 26, 2025

Evaluating Lightweight Vision Transformers for Chest Disease Detection in Low-Resource Clinical Settings

This article has 1 author:
1. Mridul Banik
This article has no evaluations
Latest version Oct 22, 2025

Clinical Application of Vision Transformers for Melanoma Classification: A Multi-Dataset Evaluation Study

This article has 5 authors:
1. Antony Garcia
2. Jixing Zhou
3. Gabriela Pinero-Crespo
4. Thomas Beachkofsky
5. Xinming Huang
This article has no evaluations
Latest version Oct 6, 2025
