Towards Explainable Breast Cancer Classification Using SimCLR-Based Self-Supervised Representation Learning
Abstract
Accurate and interpretable classification of breast cancer from histopathology images is essential for early diagnosis and efficient treatment planning. Conventional deep learning models tend to rely on large annotated datasets and are not interpretable, which undermines clinical trust and deployment. In this work, a self-supervised learning (SSL) method based on SimCLR contrastive pretraining is used to exploit unlabeled breast histopathology images from the BreaKHis 400x dataset; the pretrained model is then fine-tuned for binary benign–malignant classification. The SSL paradigm allows the model to learn invariant, generalizable feature representations that are robust to morphological variation across tissue samples. To improve explainability and transparency, Grad-CAM, LIME, and bounding-box visualization are applied together to visualize the model's reasoning and highlight discriminative regions without explicit lesion localization. Experimental results show a validation accuracy of 98.74% and a test accuracy of 98.05%, significantly outperforming baseline supervised approaches by more than 4%. These results demonstrate the applicability of self-supervised learning to the design of data-efficient, accurate, and explainable breast cancer diagnosis models for real-world clinical use.
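The abstract does not spell out the pretraining objective. SimCLR is normally trained with the normalized temperature-scaled cross-entropy (NT-Xent) loss, so the following is a minimal sketch of that loss, not the authors' implementation. It is written in plain Python for readability (a real pipeline would use a tensor library), and the function names, the toy embeddings, and the temperature of 0.5 are all illustrative assumptions. Embeddings are assumed to be non-zero vectors.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings.

    z_a[i] and z_b[i] are the embeddings of two augmented views of image i.
    The temperature default is an illustrative assumption, not a value from the paper.
    """
    n = len(z_a)
    z = [l2_normalize(v) for v in z_a + z_b]  # 2n unit vectors
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        sims = [
            sum(p * q for p, q in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Numerically stable log-sum-exp over every sample except i itself.
        others = [sims[k] for k in range(2 * n) if k != i]
        m = max(others)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denominator - sims[pos]
    return total / (2 * n)


# Toy check: matching views should score a lower loss than mismatched views.
views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent_loss(views_a, [[1.0, 0.0], [0.0, 1.0]])
misaligned = nt_xent_loss(views_a, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < misaligned)
```

Minimizing this loss pulls the two views of the same image together and pushes all other images in the batch apart, which is the mechanism behind the invariant representations the abstract describes.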