A Transformer-Driven Hybrid Feature Fusion Framework for Multi-Modal Medical Image Analysis
Abstract
Early disease diagnosis depends heavily on strong medical image classification models. This paper proposes a hybrid method that combines handcrafted descriptors (HOG, BoVW) with deep features (VGG19) to form an integrative fused feature representation. The combined features are then fed into an optimized Vision Transformer (FFXViT), which enables stronger global context modelling while preserving key local information. Experiments were conducted on two reference modalities: histopathology images with three classes (adenocarcinoma, squamous cell carcinoma, benign) and chest X-ray images with four classes (COVID-19, lung opacity, normal, viral pneumonia). The proposed FFXViT attained 99.50% accuracy on histopathology and 97.41% on chest X-rays, a marked improvement over state-of-the-art CNN, transformer, and hybrid baselines. The experiments showcase the scalability, robustness, and interpretability of the framework and empirically verify FFXViT as a viable solution for robust cross-modality medical image analysis and clinical decision support.
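To make the fusion pipeline concrete, the sketch below pairs a HOG descriptor with pooled VGG19 features and routes the concatenated vector through a small Transformer encoder. This is a minimal illustration assuming a PyTorch/scikit-image stack; the `FusionViT` class, layer sizes, and projection head are illustrative stand-ins rather than the authors' exact FFXViT architecture, and the BoVW branch is omitted for brevity.

```python
# Minimal sketch of handcrafted + deep feature fusion feeding a Transformer
# classifier. Assumption: the exact FFXViT design is not reproduced here.
import numpy as np
import torch
import torch.nn as nn
from skimage.feature import hog
from torchvision.models import vgg19


def handcrafted_features(gray_img: np.ndarray) -> torch.Tensor:
    """HOG descriptor for one grayscale image (BoVW branch omitted)."""
    h = hog(gray_img, orientations=9, pixels_per_cell=(16, 16),
            cells_per_block=(2, 2), feature_vector=True)
    return torch.from_numpy(h).float()


class FusionViT(nn.Module):
    """Concatenate handcrafted and VGG19 features, then classify with a
    small Transformer encoder standing in for the optimized ViT."""

    def __init__(self, hog_dim: int, num_classes: int, d_model: int = 256):
        super().__init__()
        # weights=None keeps the sketch offline; pass VGG19_Weights.DEFAULT
        # for the pretrained backbone used in practice.
        backbone = vgg19(weights=None)
        self.cnn = nn.Sequential(backbone.features, nn.AdaptiveAvgPool2d(1),
                                 nn.Flatten())          # -> 512-d deep feature
        self.proj = nn.Linear(512 + hog_dim, d_model)   # fuse by concatenation
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, rgb: torch.Tensor, hog_feat: torch.Tensor) -> torch.Tensor:
        deep = self.cnn(rgb)                            # (B, 512)
        fused = torch.cat([deep, hog_feat], dim=1)      # (B, 512 + hog_dim)
        tokens = self.proj(fused).unsqueeze(1)          # (B, 1, d_model)
        return self.head(self.encoder(tokens)[:, 0])    # class logits


# Toy usage: one random 224x224 image through the pipeline.
img = np.random.rand(224, 224).astype(np.float32)
hog_feat = handcrafted_features(img).unsqueeze(0)
model = FusionViT(hog_dim=hog_feat.shape[1], num_classes=4)
logits = model(torch.rand(1, 3, 224, 224), hog_feat)
print(logits.shape)  # torch.Size([1, 4])
```

In this sketch, fusion is a simple concatenation of the 512-dimensional pooled VGG19 vector with the HOG descriptor before projection into the Transformer's embedding space; the paper's fusion and ViT optimization details may differ.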