A Hybrid of the VGG-16 and FTVT-b16 Models to Enhance Brain Tumors Classification Using MRI Images

Abstract

Brain tumors pose significant diagnostic challenges due to their complex morphology and heterogeneous presentation in magnetic resonance imaging (MRI). While deep learning has advanced medical image analysis, existing approaches often struggle to balance local feature extraction with global contextual understanding, limiting classification accuracy and interpretability. This study introduces a novel hybrid deep learning framework that synergistically integrates the hierarchical feature extraction capabilities of VGG-16 with the global self-attention mechanisms of a fine-tuned Vision Transformer (FTVT-b16) to enhance brain tumor classification. Leveraging a publicly available Kaggle dataset of 7,023 MRI scans spanning four tumor types (glioma, meningioma, pituitary, and no-tumor), our model achieves a state-of-the-art classification accuracy of 99.46%, significantly outperforming standalone VGG-16 (97.08%) and FTVT-b16 (98.84%) implementations. Rigorous evaluation demonstrates superior performance across key metrics, including precision (99.43%), recall (99.46%), and specificity (99.82%), while attention maps from FTVT-b16 enhance model transparency by localizing discriminative tumor regions. The proposed framework addresses critical limitations of conventional CNNs (local receptive fields) and pure ViTs (data inefficiency), offering a robust, interpretable solution aligned with clinical workflows. These findings underscore the transformative potential of hybrid architectures in neuro-oncology, paving the way for AI-assisted precision diagnostics. Future work will focus on multi-institutional validation and computational optimization to ensure scalability in diverse clinical settings.
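The hybrid idea described above can be sketched minimally: a convolutional branch captures local texture (standing in for VGG-16's hierarchical features) while a transformer-encoder branch attends over image patches for global context (standing in for FTVT-b16), with the two feature vectors fused before a four-class head. This is an illustrative assumption of one plausible fusion design, not the paper's actual implementation; all layer sizes and the concatenation-based fusion are hypothetical.

```python
# Hypothetical sketch of a CNN + ViT hybrid classifier for 4 tumor
# classes (glioma, meningioma, pituitary, no-tumor). Layer sizes and
# the concatenation fusion are illustrative assumptions, not the
# architecture reported in the paper.
import torch
import torch.nn as nn

class HybridClassifier(nn.Module):
    def __init__(self, num_classes=4, embed_dim=64, patch=16):
        super().__init__()
        # CNN branch: local feature extraction (VGG-style conv stack)
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (B, 64)
        )
        # ViT-style branch: patch embedding + self-attention encoder
        self.patch_embed = nn.Conv2d(3, embed_dim, patch, stride=patch)
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Fusion: concatenate local and global features, then classify
        self.head = nn.Linear(64 + embed_dim, num_classes)

    def forward(self, x):
        local_feat = self.cnn(x)                                  # (B, 64)
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)   # (B, N, D)
        global_feat = self.encoder(tokens).mean(dim=1)            # (B, D)
        return self.head(torch.cat([local_feat, global_feat], dim=1))

model = HybridClassifier()
logits = model(torch.randn(2, 3, 224, 224))   # batch of 2 MRI-sized inputs
print(logits.shape)                           # torch.Size([2, 4])
```

In practice the two branches would be initialized from pretrained VGG-16 and ViT-b16 weights and fine-tuned jointly; the toy branches here keep the sketch self-contained.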
