Assessing Pneumonia in Chest X-ray Images Using a Modified VGG16 Model: A Comparative Study of Data Sampling Techniques

Sanjivani Joshi
G.S. Thakur

Abstract

Background: Imbalanced datasets pose significant challenges in deep learning-based medical image classification, often leading to biased predictions. This study evaluates the impact of various data sampling techniques and Bayesian hyperparameter optimization on the performance of the VGG16 model to diagnose pneumonia from the Chest X-ray Pneumonia dataset.

Methods: Six different approaches were assessed:
1. VGG16 with Random Under Sampling (RUS)
2. VGG16 with Equal Sampling
3. VGG16 with Random Over Sampling (ROS)
4. VGG16 with Synthetic Minority Over-sampling Technique (SMOTE)
5. VGG16 with Adaptive Synthetic Sampling (ADASYN)
6. VGG16 with Bayesian Hyperparameter Optimization
Each technique was implemented to balance the dataset, and model performance was evaluated based on training accuracy.

Results: The training accuracy varied across different sampling techniques, with Random Over Sampling (ROS) achieving the highest at 99.85%, followed by Random Under Sampling (RUS) (99.34%) and ADASYN (99.19%). SMOTE and Bayesian Hyperparameter Optimization resulted in 98.31% and 98.53%, respectively, while Equal Sampling had the lowest training accuracy at 93.86%. These results indicate that oversampling methods, particularly ROS and ADASYN, significantly enhance model learning, while Bayesian optimization offers a stable alternative without surpassing the best-performing oversampling techniques.

Conclusion: This comparative analysis highlights the impact of different data-balancing strategies on pneumonia detection using deep learning. Oversampling methods, especially ROS and ADASYN, proved highly effective in improving training accuracy, whereas Bayesian hyperparameter optimization provided stability but did not outperform the top oversampling techniques. Selecting an appropriate sampling strategy is crucial for enhancing the performance of deep learning models in medical image classification.
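
The abstract does not describe how the sampling steps were implemented, so the following is only a minimal sketch of what random over sampling (ROS) and random under sampling (RUS) do to class counts. The function names `random_over_sample` and `random_under_sample`, the toy arrays, and the fixed seed are illustrative assumptions, not the authors' code; small feature vectors stand in for X-ray images. SMOTE and ADASYN differ in that they synthesize new minority samples by interpolation rather than duplicating existing ones, which is not shown here.

```python
import numpy as np


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest class.

    Hypothetical helper for illustration; not taken from the preprint.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing samples with replacement from the same class.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        parts.append(np.concatenate([cls_idx, extra]))
    idx = np.concatenate(parts)
    return X[idx], y[idx]


def random_under_sample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class.

    Hypothetical helper for illustration; not taken from the preprint.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.min()
    idx = np.concatenate([
        # Sample without replacement so no retained sample is duplicated.
        rng.choice(np.flatnonzero(y == cls), size=target, replace=False)
        for cls in classes
    ])
    return X[idx], y[idx]


# Toy imbalanced data: 2 samples of class 0 and 8 samples of class 1.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 2 + [1] * 8)

X_ros, y_ros = random_over_sample(X, y)
X_rus, y_rus = random_under_sample(X, y)
print(np.bincount(y_ros), np.bincount(y_rus))
```

With this toy input, ROS grows the minority class to the size of the majority class, while RUS shrinks the majority class to the size of the minority class. In practice, resampling is usually applied to the training split only, so that duplicated samples do not leak into validation or test data.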

Version published to 10.21203/rs.3.rs-8070893/v1 on Research Square
Nov 25, 2025

Related articles

ThoraxSense: Enhanced Thoracic Multi-Disease Detection on Chest X-Rays Using DenseNet121 and Class-Imbalance Optimization

This article has 6 authors:
1. Ansh Srivastava
2. Rishi Raj
3. Rishit Sharma
4. Rayan Haque
5. Rashmi K.B
6. Shobana T.S
This article has no evaluations
Latest version Feb 1, 2026

Failure-Aware Robustness Evaluation of Deep Learning Models for Tuberculosis Detection Under Real-World Chest X-Ray Degradation

This article has 3 authors:
1. Nitin Wankhade Nitu
2. Sagar Joshi sagar
3. Nitin Dhawas Nitin
This article has no evaluations
Latest version Jan 6, 2026

Application of Deep Learning Strategies in the Standardization and Diagnostic Efficiency Enhancement of Chest X-ray Imaging

This article has 5 authors:
1. Wen Chang Tseng
2. Yung-Cheng Wang
3. Wei-Chi Chen
4. Sen-Ping Lin
5. Kang-Ping Lin
This article has no evaluations
Latest version Dec 18, 2025
