Synthetic MRI, dynamic contrast-enhanced MRI combined with diffusion-weighted imaging for identifying molecular subtypes of breast cancer using machine learning models
Abstract
Objective: To determine whether quantitative parameters from synthetic magnetic resonance imaging (SyMRI), dynamic contrast-enhanced MRI (DCE-MRI), and diffusion-weighted imaging (DWI) can effectively differentiate between molecular subtypes of breast cancer using various machine learning models.

Materials and Methods: This retrospective study included 401 patients with suspicious breast lesions who underwent breast MRI examinations, including SyMRI, DCE-MRI, and DWI, from September 2020 to September 2024. Quantitative parameters obtained from SyMRI included the T1 (T1-Pre), T2 (T2-Pre), and proton density (PD-Pre) values of breast lesions before contrast injection, as well as the T1-Gd, T2-Gd, and PD-Gd values after contrast injection. Additionally, difference values (Delta-T1, Delta-T2, Delta-PD) and enhancement ratios (T1-Ratio, T2-Ratio, PD-Ratio) were calculated. Two radiologists retrospectively evaluated the morphological and kinetic characteristics on DCE-MRI and assessed the tumors on DWI using the apparent diffusion coefficient (ADC) values of the lesions. Logistic regression and ANOVA were applied to identify parameters that differed significantly among the four breast cancer subtypes. Based on the parameters selected by logistic regression, five machine learning models were developed: Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest (RF), and Decision Tree (DT). We plotted receiver operating characteristic (ROC) curves and calculated the area under the curve (AUC) as the primary metric to assess the performance of the best model. We used the SHAP library in Python to generate feature importance values for the model's predictions.

Results: A total of 292 patients (median age, 53 years; age range, 27–80 years) met the inclusion criteria.
Among these, 204 patients (median age, 52 years; age range, 27–78 years) were assigned to the training cohort, while 88 patients (median age, 53 years; age range, 27–80 years) were included in the testing cohort. Eleven parameters differed significantly across the four breast cancer subtypes (p < 0.05). These parameters included two clinicopathological factors: age and menopausal status (p < 0.001); five SyMRI parameters: T1-Gd, T2-Gd, PD-Gd, T1-Ratio, and PD-Ratio (p < 0.05); three DCE-MRI parameters: burr sign, time–intensity curve (TIC), and Breast Imaging Reporting and Data System (BI-RADS) grading (p < 0.001); and one DWI parameter: ADC-Tumor (p < 0.001). Based on a comprehensive evaluation of multiple metrics in the training set, the SVM model showed the best overall performance, with an AUC, accuracy, specificity, and sensitivity of 0.972, 82.5%, 94.76%, and 82.14%, respectively. This SVM model achieved AUC values of 0.979 for luminal A, 0.925 for luminal B, 0.971 for HER2-enriched, and 0.982 for triple-negative (TN) subtypes in the training set, and AUC values of 0.973 for luminal A, 0.873 for luminal B, 0.956 for HER2-enriched, and 0.955 for TN subtypes in the testing set. The Shapley Additive Explanations (SHAP) tool effectively identified the importance of the features contributing to the model, with T2-Gd, PD-Ratio, and burr sign showing the highest contributions (mean absolute SHAP values of 0.418, 0.340, and 0.264, respectively).

Conclusion: Quantitative parameters derived from SyMRI mappings, DCE-MRI, and DWI may provide a non-invasive approach for differentiating between the molecular subtypes of breast cancer using various machine learning models.
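
The abstract reports one AUC per molecular subtype but does not state how those values were derived from a four-class model. A common approach, assumed here for illustration only, is one-vs-rest scoring: each subtype in turn is treated as the positive class, and the AUC is the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case. The sketch below uses only the Python standard library. The function names, the `SUBTYPES` list, and the toy labels and probabilities are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence

# Subtype names as used in the abstract; the ordering is arbitrary.
SUBTYPES: List[str] = ["luminal A", "luminal B", "HER2-enriched", "TN"]


def binary_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def one_vs_rest_auc(
    labels: Sequence[str], probs: Sequence[Dict[str, float]]
) -> Dict[str, float]:
    """Compute one AUC per subtype, treating that subtype as the positive class."""
    result: Dict[str, float] = {}
    for subtype in SUBTYPES:
        pos = [p[subtype] for y, p in zip(labels, probs) if y == subtype]
        neg = [p[subtype] for y, p in zip(labels, probs) if y != subtype]
        result[subtype] = binary_auc(pos, neg)
    return result


# Toy, made-up data purely for illustration (NOT study data):
# true subtype per case and hypothetical predicted class probabilities.
toy_labels = ["luminal A", "luminal B", "HER2-enriched", "TN", "luminal B", "luminal A"]
toy_probs = [
    {"luminal A": 0.70, "luminal B": 0.20, "HER2-enriched": 0.05, "TN": 0.05},
    {"luminal A": 0.30, "luminal B": 0.50, "HER2-enriched": 0.10, "TN": 0.10},
    {"luminal A": 0.10, "luminal B": 0.10, "HER2-enriched": 0.60, "TN": 0.20},
    {"luminal A": 0.05, "luminal B": 0.15, "HER2-enriched": 0.20, "TN": 0.60},
    {"luminal A": 0.40, "luminal B": 0.30, "HER2-enriched": 0.20, "TN": 0.10},
    {"luminal A": 0.35, "luminal B": 0.45, "HER2-enriched": 0.10, "TN": 0.10},
]

per_subtype_auc = one_vs_rest_auc(toy_labels, toy_probs)
for name, auc in per_subtype_auc.items():
    print(f"{name}: AUC = {auc:.3f}")
```

In practice the predicted probabilities would come from the fitted classifier (for example, an SVM with probability outputs), and a library routine would replace the quadratic pairwise loop; the explicit loop is kept here only to make the definition visible.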