Mammo-Bench: A Large-scale Benchmark Dataset of Mammography Images

Gaurav Bhole
S Suba
Nita Parekh

Abstract

Breast cancer remains a significant global health concern, and machine learning algorithms and computer-aided detection systems have shown great promise in improving the accuracy and efficiency of mammography image analysis. However, there is a critical need for large benchmark datasets for training deep learning models for breast cancer detection. In this work we developed Mammo-Bench, a large-scale benchmark dataset of mammography images, by collating data from seven well-curated resources, viz., DDSM, INbreast, KAU-BCMD, CMMD, CDD-CESM, DMID, and the RSNA Screening Dataset. To ensure consistency across images from diverse sources while preserving clinically relevant features, we propose a preprocessing pipeline that includes breast segmentation, pectoral muscle removal, and intelligent cropping. The dataset consists of 74,436 high-quality mammographic images from 26,500 patients across 7 countries and is, to the best of our knowledge, one of the largest open-source mammography databases. To show the efficacy of training on the large dataset, the performance of the ResNet101 architecture was evaluated on Mammo-Bench and compared with results obtained by training independently on a few member datasets and on an external dataset, VinDr-Mammo. Accuracies of 78.8% (with data augmentation of the minority classes) and 77.8% (without data augmentation) were achieved on the proposed benchmark dataset, whereas accuracy on the other datasets ranged from 25% to 69%. Notably, prediction of the minority classes improved with the Mammo-Bench dataset. These results establish baseline performance and demonstrate Mammo-Bench's utility as a comprehensive resource for developing and evaluating mammography analysis systems.
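
To give a rough sense of what a cropping step like the one named in the abstract might involve, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not describe how segmentation or cropping is done, so the function name `crop_to_breast`, the intensity-threshold approach, and the default `threshold` value are all assumptions made for illustration. The sketch only trims near-black background around the brightest region of a grayscale image and does not attempt pectoral muscle removal.

```python
import numpy as np


def crop_to_breast(image: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """Crop a 2-D grayscale image to the bounding box of its non-background pixels.

    Hypothetical helper: `threshold` is a fraction of the image's maximum
    intensity, and pixels at or below it are treated as background.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")

    peak = float(image.max())
    if peak <= 0:
        # Blank image: there is no foreground to crop to.
        return image

    # Foreground mask: anything brighter than the chosen fraction of the peak.
    mask = image > threshold * peak

    # Indices of rows and columns that contain at least one foreground pixel.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))

    # Slice out the tight bounding box around the foreground.
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

On real mammograms a fixed threshold like this would likely also keep burned-in labels and scanner artifacts, which is presumably why a dedicated segmentation step precedes cropping in the described pipeline.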

Version published to 10.1101/2025.01.31.25321510v1 on medRxiv
Feb 2, 2025

Related articles

Comparative Evaluation of Machine Learning-Based Radiomics and Deep Learning for Breast Lesion Classification in Mammography

This article has 10 authors:
1. Alessandro Stefano
2. Fabiano Bini
3. Eleonora Giovagnoli
4. Mariangela Dimarco
5. Nicolò Lauciello
6. Daniela Narbonese
7. Giovanni Pasini
8. Franco Marinozzi
9. Giorgio Russo
10. Ildebrando D'Angelo
This article has no evaluations. Latest version: Mar 4, 2025

Predicting Prostate Cancer Without a Prostate: A Potential Problem with AI

This article has 5 authors:
1. Destie Provenzano
2. Murray Loew
3. Yuan James Rao
4. Vivek Batheja
5. Shawn Haji-Momenian
This article has no evaluations. Latest version: Mar 4, 2025

Classification and Interpretation of Histopathology Images: Leveraging Ensemble of EfficientNetV1 and EfficientNetV2 Models

This article has 5 authors:
1. Mahdi Azmoodeh Kalati
2. Hasti Shabani
3. Mohammad Sadegh Maghareh
4. Zeynab Barzegar
5. Reza Lashgari
This article has no evaluations. Latest version: Feb 19, 2025
