Advancing Breast Cancer Detection: A Comprehensive Evaluation of Machine Learning Models on Mammogram Imaging

Reshad Al Muttaki
Sadia Afrin
Alvi Ibn Amzad Anil
Mehedi Hasan Shawon

Abstract

Breast cancer is among the leading causes of cancer-related death in women worldwide, which makes rapid and effective diagnostic tools, particularly for early detection, essential to improving survival. Although machine learning (ML) is increasingly applied to medical imaging, limited dataset diversity and generalizability, together with model interpretability and efficiency, remain obstacles to clinical use. This paper evaluates eight widely used ML models on the Mammogram Mastery dataset from Sulaymaniyah, Iraq, which consists of 745 original and 9,685 augmented mammogram images: Convolutional Neural Network (CNN), Kolmogorov-Arnold Network (KAN), k-Nearest Neighbors, Support Vector Machine, XGBoost, Random Forest, Naive Bayes, and a Hybrid model. The Hybrid model achieves the best accuracy (0.9667) and F1 score (0.9444), while the KAN model achieves the best ROC AUC (0.9760) and log loss (0.1421), indicating the strongest discriminative power and calibration. Random Forest, which has the fewest false negatives (3) when compared with Fast Multinomial and Fast Text, emerges as the safest choice for clinical screening because it balances sensitivity with computational efficiency. Two practical challenges remain: the slow inference time of the KAN model (0.323 seconds) and the high training cost of the Hybrid model (1009.10 seconds). These findings suggest that the Hybrid and KAN models are promising means of improving diagnostic accuracy, and that Random Forest can serve as a practical tool for reducing missed diagnoses. Future research should address validation on datasets from multiple institutions, faster inference, multi-class classification, and improved interpretability for clinical integration. By addressing these gaps, ML-based diagnostics could raise breast cancer detection rates, reduce diagnostic errors, and improve patient outcomes across clinical contexts, which in turn could help scale screening services worldwide.
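
The metrics quoted in the abstract (accuracy, F1 score, ROC AUC, log loss, and the false-negative count) can be illustrated with a minimal sketch in plain Python. It assumes binary labels (1 = malignant, 0 = benign) and a 0.5 decision threshold. The labels and probabilities are made-up toy values, not data or results from the paper, and the function names are hypothetical rather than the authors' code.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def f1_score(y_true, y_pred):
    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def log_loss(y_true, y_prob, eps=1e-15):
    # Mean negative log-likelihood; probabilities are clipped to avoid log(0).
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def roc_auc(y_true, y_prob):
    # Probability that a random positive is scored above a random negative
    # (ties count as half). Assumes both classes are present.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.3, 0.2, 0.1, 0.6, 0.4, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

false_negatives = confusion_counts(y_true, y_pred)[3]
print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, y_prob))
print("log loss:", log_loss(y_true, y_prob))
print("false negatives:", false_negatives)
```

Accuracy, F1, and the false-negative count depend on the chosen threshold, whereas ROC AUC and log loss are computed from the predicted probabilities directly. This is why one model can lead on the first group of metrics while another leads on the second.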

Version published to 10.1101/2025.10.08.25337620 on medRxiv
Oct 10, 2025

Related articles

Advancing Breast Cancer-AI Diagnostics: An Explainable Deep Learning Model Using 2D Grayscale Ultrasound Imaging

This article has 3 authors:
1. Ghulam Husain Abbas
2. Shafee Ur Rehman
3. Ghazal Bargshady
This article has no evaluations. Latest version: Nov 2, 2025

A Versatile Foundation Model for AI-enabled Mammogram Interpretation

This article has 23 authors:
1. Hao Chen
2. Fuxiang Huang
3. Jiayi Zhu
4. Yunfang Yu
5. Yu Xie
6. Yuan Guo
7. Qingcong Kong
8. MingXiang Wu
9. Xinrui Jiang
10. Shu Yang
11. Jiabo MA
12. Ziyi LIU
13. Zhe Xu
14. Zhixuan Chen
15. Yujie Tan
16. Zifan He
17. Luhui Mao
18. Xi Wang
19. Junlin Hou
20. Lei Zhang
21. Qiong Luo
22. Zhenhui Li
23. Herui Yao
This article has no evaluations. Latest version: Oct 9, 2025

Efficient Convolutional Neural Networks for Acute Lymphoblastic Leukaemia Prediction in Computer Vision

This article has 5 authors:
1. S B MOHAN
2. Sathya S
3. Rajalaksmi S
4. G Gurumoorthy
5. Rajkumar Sivanraju
This article has no evaluations. Latest version: Oct 31, 2025
