Cross-Modal Mamba Alignment for Multi-Sequence Brain Tumor Segmentation
Abstract
Reliable fusion of multi-sequence MRI remains challenging due to heterogeneous contrast, inconsistent noise patterns, and missing modalities. This work presents X-MambaSeg, a cross-modal alignment framework that integrates long-range contextual modeling with modality-consistent feature learning. The architecture employs a dual-branch Mamba encoder to extract separate representations from FLAIR, T1, and T2 sequences, while a contrastive alignment mechanism encourages structural consistency across modalities. A multi-scale fusion module further enhances boundary sensitivity, and a distribution-calibrated decoder mitigates intensity drift during reconstruction. Experiments on BraTS2023 (1,525 subjects; 1,200 for training and 325 for testing) demonstrate a Dice score of 0.922, outperforming Swin-UNet (0.894; +3.1% relative) and TransBTS (0.903; +2.1% relative). HD95 is reduced from 16.8 mm to 11.3 mm (−32.7%), and Boundary-F1 improves from 0.819 to 0.871 (+6.4%). Cross-dataset evaluation on BraTS2021 yields a 7.8% relative Dice gain, and removing the alignment mechanism leads to a 9.8% Dice drop and a 14.6% increase in modality inconsistency.
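The abstract does not specify the exact form of the contrastive alignment objective. As a minimal sketch only, one common choice for encouraging cross-modal structural consistency is an InfoNCE-style loss in which embeddings from the same spatial position in two modality branches are treated as positives and all other positions as negatives; the function name and the NumPy formulation below are assumptions, not the paper's implementation:

```python
import numpy as np

def cross_modal_alignment_loss(feat_a, feat_b, temperature=0.1):
    """InfoNCE-style cross-modal alignment (illustrative sketch).

    feat_a, feat_b: (N, D) arrays of per-position embeddings from two
    modality branches (e.g. FLAIR and T2). Matching rows are positives;
    all other pairings serve as negatives.
    """
    # L2-normalize so the dot product is cosine similarity
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives lie on the diagonal (same spatial position in both modalities)
    return -np.mean(np.diag(log_probs))
```

Under this formulation the loss approaches zero when the two branches produce matching embeddings and grows as corresponding positions diverge, which matches the abstract's claim that removing alignment increases modality inconsistency.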
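For reference, the Dice score used throughout the reported results is the standard overlap measure 2|A∩B| / (|A| + |B|); a minimal binary-mask version can be written as:

```python
import numpy as np

def dice_score(pred, target):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    # Convention: two empty masks are a perfect match
    return 2.0 * intersection / denom if denom else 1.0
```

For example, masks `[1,1,0,0]` and `[1,0,1,0]` share one voxel out of four foreground voxels total, giving a Dice score of 0.5.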