An agentic multimodal AI framework for end-to-end breast cancer staging and biomarker profiling
Abstract
Background
Breast cancer care spans screening, diagnosis, and molecular stratification, yet most AI systems remain siloed into single tasks and single modalities. A unified, agentic system that can invoke specialized models across modalities could streamline end-to-end decision support.

Methods
We curated a cohort of 923 patients with paired radiology and pathology data and trained five CNN backbones (ResNet, DenseNet, EfficientNet, RegNet, and MobileNetV3) for nine clinically relevant tasks, including T/N/M and clinical staging, histological grade, ER/PR/HER2 status, and Ki-67 expression. We further implemented a late-fusion transformer and an orchestration layer that selects the appropriate single-modality or fused model per query, and evaluated performance using AUROC, accuracy, and F1-score.

Results
Across tasks, the best single-modality models achieved AUROCs from 0.606 to 0.990, with strong performance for histological grade (AUROC 0.950; accuracy 0.909) and HER2 status (AUROC 0.810; accuracy 0.762). Multimodal fusion consistently improved discrimination over the best single modality (mean ΔAUROC 0.016), reaching AUROC 0.964 and accuracy 0.924 for grade, AUROC 0.831 for HER2, and AUROC 0.806 for N staging.

Conclusion
Together, these results show that agent-guided multimodal modelling can deliver robust, task-adaptive predictions spanning staging and biomarker profiling within a single framework. By enabling modular deployment, from screening-time risk triage to pathology-informed stratification, this approach provides a practical foundation for scalable, clinically integrated breast cancer decision support.
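The orchestration layer described in the Methods can be sketched as a simple routing rule: dispatch each query to the fused model when both modalities are available, otherwise fall back to the matching single-modality model. This is a minimal illustration of that selection logic only; all names here (`route`, the registry keys, the model labels) are hypothetical and not taken from the paper's implementation.

```python
# Hypothetical sketch of per-query model selection in an orchestration layer.
# A registry maps (task, modality) pairs to trained models; here we use
# string labels as stand-ins for real model objects.

def route(task, modalities, registry):
    """Select the fused model when both radiology and pathology inputs
    are present for a query; otherwise use the single-modality model."""
    if {"radiology", "pathology"} <= modalities:
        return registry[(task, "fused")]
    if "radiology" in modalities:
        return registry[(task, "radiology")]
    return registry[(task, "pathology")]

# Toy registry for one task (HER2 status prediction).
registry = {
    ("HER2", "fused"): "fused-transformer",
    ("HER2", "radiology"): "cnn-radiology",
    ("HER2", "pathology"): "cnn-pathology",
}

print(route("HER2", {"radiology", "pathology"}, registry))  # fused-transformer
print(route("HER2", {"pathology"}, registry))               # cnn-pathology
```

In practice the registry would hold the trained CNN backbones and the late-fusion transformer per task, and the selected model's output would feed the downstream staging or biomarker prediction.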