Systematic Evaluation of Label Noise Effects on Accuracy and Calibration in Deep Neural Networks
Abstract
Label noise is a pervasive issue in real-world datasets and can degrade both the accuracy and calibration of deep neural networks. In this study, we systematically examine how symmetric (random) and asymmetric (class-dependent) label noise affect model accuracy and confidence calibration in image classification, using the CIFAR-10 dataset and a ResNet-18 architecture. We apply five levels of label noise (0%, 10%, 20%, 40%, 60%) and evaluate their effects with test accuracy, Expected Calibration Error (ECE), and predictive entropy. Our findings show that increasing noise levels significantly degrade classification accuracy and impair model calibration. In particular, asymmetric noise at a 60% corruption level causes test accuracy to drop to approximately 38.7% while ECE surges above 35%, indicating extreme overconfidence in incorrect predictions. By contrast, symmetric noise at the same level yields higher predictive entropy (uncertainty) and comparatively modest miscalibration (ECE ∼9%). These results highlight the importance of distinguishing between noise types when assessing model robustness and reliability. All experiments are reproducible, with code and data publicly available to facilitate further investigation.
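For concreteness, the sketch below illustrates the two noise models and the evaluation metrics named in the abstract: uniform (symmetric) label flipping, class-dependent (asymmetric) flipping, binned Expected Calibration Error, and mean predictive entropy. It is a minimal illustration under stated assumptions, not the authors' released code; in particular, the asymmetric class-flip mapping shown (e.g. truck→automobile, cat↔dog) is a hypothetical example, since the abstract does not specify the exact mapping used.

```python
import numpy as np

NUM_CLASSES = 10  # CIFAR-10


def inject_symmetric_noise(labels, noise_rate, rng):
    """Flip a fraction `noise_rate` of labels uniformly to any other class."""
    labels = labels.copy()
    n = len(labels)
    flip_idx = rng.choice(n, size=int(noise_rate * n), replace=False)
    for i in flip_idx:
        other_classes = [c for c in range(NUM_CLASSES) if c != labels[i]]
        labels[i] = rng.choice(other_classes)
    return labels


def inject_asymmetric_noise(labels, noise_rate, rng, class_map=None):
    """Flip a fraction `noise_rate` of labels to a fixed, confusable class.
    The default `class_map` is an assumed example (truck->automobile, bird->airplane,
    cat<->dog, deer->horse); the paper's exact mapping is not given in the abstract."""
    if class_map is None:
        class_map = {9: 1, 2: 0, 3: 5, 5: 3, 4: 7}
    labels = labels.copy()
    n = len(labels)
    flip_idx = rng.choice(n, size=int(noise_rate * n), replace=False)
    for i in flip_idx:
        labels[i] = class_map.get(int(labels[i]), labels[i])
    return labels


def expected_calibration_error(confidences, predictions, targets, n_bins=15):
    """Standard binned ECE: weighted mean |accuracy - confidence| over confidence bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = (predictions[mask] == targets[mask]).mean()
            conf = confidences[mask].mean()
            ece += (mask.sum() / n) * abs(acc - conf)
    return ece


def predictive_entropy(probs, eps=1e-12):
    """Mean Shannon entropy of the predicted class distributions (higher = more uncertain)."""
    return float(-np.sum(probs * np.log(probs + eps), axis=1).mean())


# Usage example with placeholder labels standing in for the CIFAR-10 training set.
rng = np.random.default_rng(0)
clean_labels = rng.integers(0, NUM_CLASSES, size=50_000)
noisy_sym = inject_symmetric_noise(clean_labels, noise_rate=0.6, rng=rng)
noisy_asym = inject_asymmetric_noise(clean_labels, noise_rate=0.6, rng=rng)
```

At evaluation time, `confidences` and `predictions` would come from the softmax outputs of the trained ResNet-18 on the clean test set, so ECE and predictive entropy measure how well confidence tracks accuracy after training on the corrupted labels.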