PolicySegNet: a policy-based reinforcement learning framework with pretrained embeddings and transformer decoder for joint brain tumors segmentation and classification in MRI
Abstract
PolicySegNet is a novel hybrid deep learning architecture for joint brain tumor segmentation and classification in MRI scans. It combines a pretrained SegFormer-B4 encoder (MiT backbone, originally trained on the ADE20K dataset), used as a fixed feature extractor, with a UNet-inspired decoder for segmentation and a lightweight classification head for tumor type identification. Unlike typical fine-tuning approaches, the SegFormer encoder remains frozen, which enables efficient training on limited domain-specific data. PolicySegNet integrates a policy-based reinforcement learning algorithm, specifically proximal policy optimization (PPO), to jointly optimize the decoder and classifier using a reward signal that balances segmentation accuracy with classification performance. The segmentation task involves four distinct binary masks, each representing a tumor class. Experimental results on a multi-class brain tumor MRI dataset demonstrate strong performance. On the training set, the model achieves a segmentation accuracy of 0.9961 and a classification accuracy of 0.9133; on the validation set, a segmentation accuracy of 0.9936 and a classification accuracy of 0.9175; and on the test set, a segmentation accuracy of 0.9924 and a classification accuracy of 0.8803. During training, the model attains a reward of 0.7295. These results showcase the potential of combining transformer-based vision features with reinforcement learning strategies for improved medical image analysis, while requiring fewer computational resources because of the frozen encoder and lightweight architectural design.
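The abstract does not give the exact form of the reward or the PPO hyperparameters, so the following is only a minimal sketch of how a reward balancing segmentation and classification could feed a PPO-style update. The function names (`pixel_accuracy`, `joint_reward`, `ppo_clipped_objective`), the equal weights `w_seg = w_cls = 0.5`, and the clip range `eps = 0.2` are illustrative assumptions, not values reported by the authors. The clipped surrogate itself is the standard PPO form, min(r·A, clip(r, 1 − ε, 1 + ε)·A).

```python
from typing import Sequence


def pixel_accuracy(pred_masks: Sequence[Sequence[int]],
                   true_masks: Sequence[Sequence[int]]) -> float:
    """Fraction of matching pixels over all binary masks (each mask is a flat 0/1 list)."""
    correct = 0
    total = 0
    for pred, true in zip(pred_masks, true_masks):
        if len(pred) != len(true):
            raise ValueError("predicted and reference masks must have the same size")
        correct += sum(int(p == t) for p, t in zip(pred, true))
        total += len(true)
    return correct / total


def joint_reward(seg_acc: float, cls_correct: bool,
                 w_seg: float = 0.5, w_cls: float = 0.5) -> float:
    """Weighted sum of segmentation accuracy and a 0/1 classification term.

    The weights are assumptions; the abstract only says the reward balances both tasks.
    """
    return w_seg * seg_acc + w_cls * (1.0 if cls_correct else 0.0)


def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for a single sample (to be maximized)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# Four binary masks (one per tumor class), each flattened to 4 pixels for illustration.
pred = [[1, 0, 0, 1], [0, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
true = [[1, 0, 0, 1], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

seg_acc = pixel_accuracy(pred, true)             # 15 of 16 pixels agree
reward = joint_reward(seg_acc, cls_correct=True)
print(seg_acc, reward)
```

In a full training loop, only the decoder and classifier parameters would receive gradients from this objective, since the encoder is frozen; the advantage would typically be the reward minus a baseline estimate, which is also an assumption here.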