X-QSViT: Explainable Quantum-Self-Supervised Vision Transformer for Lung Classification


Abstract

Context: Histopathological image analysis remains critical for the early and accurate diagnosis of lung and colon cancers. However, class imbalance, scarcity of labeled data, computational inefficiency, and a lack of interpretability hinder the deployment of AI systems in clinical settings.

Objective: This study proposes a hybrid quantum-classical framework, H-QSVT-X, to improve classification accuracy, computational efficiency, and clinical explainability in lung and colon cancer diagnosis from histopathological images.

Methodology: The framework is built around a quantum-inspired, self-supervised Vision Transformer. It combines a Quantum GAN (QGAN), simulated on classical hardware, for class-imbalance mitigation; a Masked Autoencoder (MAE) and SimCLR for self-supervised feature extraction; and quantum-inspired self-attention for efficient long-range dependency modeling. Additional edge and texture analysis using depth-aware Canny and LBP features augments fine-grained tissue characterization, and Grad-CAM provides visual explainability.

Results: The model achieved 98.4% classification accuracy, 98.1% precision, 97.8% recall, and a 98.0% F1-score. QGAN improved the imbalance ratio from 0.6 to 1.0 (perfect balance), MAE attained a reconstruction loss of 0.024, and SimCLR yielded a contrastive loss of 0.012 with a latent similarity ratio of 7.58. The quantum attention mechanism improved precision by 4.2% and reduced computation time by 33%. Grad-CAM achieved 97.6% salient-region coverage, with a 15.3% increase in classification confidence.

Future Scope: Future work includes extending the model to multi-modal cancer analysis, integrating federated learning for privacy preservation, and validation on diverse clinical datasets to improve generalizability.
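The imbalance figures above use the minority-to-majority class-count ratio, where 1.0 means a perfectly balanced dataset. A minimal sketch of that metric (the function name is ours, not the paper's):

```python
from collections import Counter

def imbalance_ratio(labels):
    """Minority-to-majority class-count ratio; 1.0 means perfectly balanced."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())

# A 60-vs-100 split gives the pre-augmentation ratio of 0.6;
# equal counts give the post-QGAN ratio of 1.0.
before = imbalance_ratio([0] * 60 + [1] * 100)
after = imbalance_ratio([0] * 100 + [1] * 100)
```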
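SimCLR's contrastive objective is the NT-Xent loss: two augmented views of the same image are pulled together in embedding space while all other images in the batch act as negatives. A minimal NumPy sketch, assuming cosine similarity over L2-normalized embeddings and a temperature of 0.5 (the hyperparameters and function name are illustrative, not taken from the paper):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss.

    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize
    sim = z @ z.T / temperature                       # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    # The positive partner of sample i is i+N (and vice versa).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 8))
loss_aligned = nt_xent_loss(z1, z1 + 0.01)            # near-identical views: low loss
loss_random = nt_xent_loss(z1, rng.normal(size=(4, 8)))  # unrelated views: higher loss
```

Minimizing this loss drives the latent similarity of positive pairs up relative to negatives, which is what the latent similarity ratio reported above summarizes.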
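Grad-CAM produces its saliency map by weighting each convolutional feature map with the globally averaged gradient of the target class score, then applying a ReLU so only positive evidence remains. A minimal sketch over raw NumPy arrays (the shapes and function name are assumptions; the paper's choice of layer is not specified here):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from one conv layer.

    feature_maps: (C, H, W) activations of the chosen layer
    gradients:    (C, H, W) d(class score)/d(activation)
    Returns an (H, W) map normalized to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))             # global-average-pool the grads
    cam = np.tensordot(weights, feature_maps, axes=1) # channel-weighted sum -> (H, W)
    cam = np.maximum(cam, 0.0)                        # ReLU: keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize for overlay
    return cam

fmaps = np.random.default_rng(0).random((3, 4, 4))
grads = np.ones_like(fmaps)
cam = grad_cam(fmaps, grads)
```

Thresholding a map like this against pathologist-annotated regions is one common way to quantify the salient-region coverage reported above.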
