A Dual-Phase Segmentation Framework Utilizing Gumbel-Softmax and a Cascaded Swin Transformer for Multi-Class Brain Tumor Segmentation

Abstract

Segmenting brain tumors from multi-modal 3D MRI scans is a crucial and demanding task because of significant anatomical heterogeneity, class imbalance, and the intricate spatial configuration of tumor subregions. This paper introduces a cascaded segmentation architecture that combines several components to attain precise, interpretable, and class-aware segmentation on the BraTS 2020 dataset. The architecture comprises two sequential stages, a coarse stage followed by a refinement stage, built from hierarchical 3D Swin Transformer blocks that partition 3D volumes into non-overlapping windows and apply multi-head self-attention within local contexts to capture both global and fine-grained dependencies. A Class-wise Attention Decoder in the refinement stage allocates a dedicated pathway to each tumor category, namely Tumor Core (TC), Edema (ED), and Enhancing Tumor (ET), producing distinct attention maps that are then concatenated to generate a comprehensive attention-guided output. This enables class-specific focus, even in overlapping or ambiguous regions. A Gumbel-Softmax module is incorporated to improve discrete class prediction and control the softmax temperature, enabling differentiable sampling and promoting robust classification boundaries during training. We additionally include Grad-CAM visualizations from several levels of the network, offering interpretability and insight into the regions that influence the model's predictions. Comprehensive experiments indicate that our approach surpasses current methods in Dice, recall, and precision across all tumor subregions. Both qualitative and quantitative findings validate the architecture's capacity to localize and delineate tumors with enhanced structural detail and consistency.
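
To make the Gumbel-Softmax step concrete, the following is a minimal sketch of temperature-controlled relaxed sampling for a single voxel, written in plain Python. It is not the authors' implementation: the function name `gumbel_softmax`, the parameters `logits`, `temperature`, and `rng`, and the example logits for the three classes (TC, ED, ET) are all assumptions chosen for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=None):
    """Return a relaxed one-hot sample over classes for one voxel.

    Hypothetical helper for illustration; `logits` holds one
    unnormalized score per class (here assumed: TC, ED, ET).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        g = -math.log(-math.log(u))
        # Lower temperature pushes the output toward one-hot.
        noisy.append((logit + g) / temperature)

    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    example_logits = [2.0, 0.0, 0.0]  # assumed scores for TC, ED, ET
    for tau in (5.0, 1.0, 0.1):
        sample = gumbel_softmax(example_logits, temperature=tau)
        print(tau, [round(p, 3) for p in sample])
```

Under these assumptions, a high temperature yields samples close to uniform, while a low temperature yields samples close to one-hot. The index of the largest entry follows the ordinary softmax distribution of the logits regardless of temperature, which is what allows the relaxation to stand in for discrete class sampling while remaining differentiable in a deep learning framework.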
