Multimodal deep learning framework for Alzheimer’s stage classification using GAN-augmented MRI and clinical data

Abstract

Alzheimer’s disease (AD) diagnosis remains challenging because annotated medical imaging data are scarce, class imbalance is pronounced, and early-stage structural brain changes are subtle. Traditional deep learning approaches often overfit to majority classes and lack interpretability, which limits their clinical adoption. We propose a multimodal deep learning framework that fuses MRI-derived spatial features with standardized clinical parameters to improve Alzheimer’s stage classification. To mitigate class imbalance, the pipeline incorporates a GAN-based augmentation module that synthetically expands minority class samples; the synthetic images are validated using MS-SSIM and expert review. MRI features are extracted using a fine-tuned ResNet50, in which the lower convolutional layers preserve general spatial feature detectors while the higher layers adapt to AD-specific morphological patterns. Clinical features, normalized via z-score scaling, are concatenated with the MRI-derived embeddings to form a unified multimodal representation. This joint vector is processed through fully connected layers for four-stage classification: Non-Demented, Very Mild Demented, Mild Demented, and Moderate Demented. Model interpretability is supported by Grad-CAM for spatial saliency mapping and by clinical feature attribution, offering transparent decision support for healthcare practitioners. Experimentally, the framework achieved 97.56% validation accuracy, 0.98 macro precision, 0.97 macro recall, and 0.98 macro F1-score, outperforming a baseline CNN (85.00% accuracy) and a GAN-augmented CNN (95.00% accuracy). Recognition of minority classes such as Moderate Demented, which was nearly unrepresented before augmentation, reached 99% precision and recall with balanced augmentation. This approach offers a scalable, interpretable, and high-accuracy diagnostic solution with the potential to transform automated neurodegenerative disease detection in clinical and telemedicine settings.
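The fusion step described in the abstract (z-score scaling of clinical features, concatenation with the MRI embedding, then a classification layer over four stages) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration: the three-dimensional toy vectors stand in for ResNet50 embeddings, the two clinical columns (age and a cognitive score) are hypothetical, and the single linear layer with hand-set weights replaces the trained fully connected layers. It is not the authors' implementation.

```python
import math

STAGES = ["Non-Demented", "Very Mild Demented", "Mild Demented", "Moderate Demented"]

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def fuse(mri_embeddings, clinical_rows):
    """Concatenate each MRI embedding with its z-scored clinical vector."""
    scaled = zscore_columns(clinical_rows)
    return [list(e) + c for e, c in zip(mri_embeddings, scaled)]

def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(fused_vector, weights, biases):
    """One linear layer plus softmax; returns (stage label, probabilities)."""
    logits = [sum(w * x for w, x in zip(row, fused_vector)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    return STAGES[probs.index(max(probs))], probs

# Toy stand-ins for ResNet50 embeddings (hypothetical, 3 values per scan).
mri_embeddings = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.1, 0.1]]
# Hypothetical clinical columns: age in years and a cognitive test score.
clinical_rows = [[72.0, 28.0], [80.0, 22.0], [65.0, 30.0]]

fused = fuse(mri_embeddings, clinical_rows)

# Hand-set placeholder parameters: 4 stages x 5 fused features.
weights = [[0.0] * 5 for _ in STAGES]
biases = [0.0, 0.0, 0.0, 1.0]

label, probs = classify(fused[0], weights, biases)
print(label, [round(p, 3) for p in probs])
```

In a real pipeline, the scaling statistics would be computed on the training split only and reused at inference time, so that validation data does not leak into the normalization.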
