Enhancing Medical Images Quality Using Vision Transformer Framework

Muhammad Hamza Farooq
Thamer Alshammari
Muhammad Usman Ghani Khan

Abstract

High-resolution images are crucial for precise diagnosis and efficient treatment planning in medical imaging. However, the prevalence of low-resolution images remains a significant challenge, often limiting the detail and clarity needed for reliable clinical evaluation. To address this issue, we applied the Vision Transformer Auto Encoder (ViTAE), a specialized convolutional neural network (CNN) model designed for image enhancement. The study's dataset, which covers a range of medical imaging scenarios, was gathered locally from a computed tomography (CT) scan lab. Over a series of training epochs, ViTAE showed consistent improvements in peak signal-to-noise ratio (PSNR), ultimately achieving a PSNR of 43.06 decibels (dB) and a Structural Similarity Index Measure (SSIM) of 0.983. On the Information eXtraction from Images (IXI) dataset, ViTAE also outperforms other methods, with a PSNR of 43.72 dB and an SSIM of 0.984. By optimizing its convolutional layers to extract and refine features from the input images, the model progressively improved its ability to reconstruct and clarify images. These results underscore the potential of ViTAE to significantly improve the quality of medical images, offering a promising way to overcome the limitations of low-resolution medical imaging.
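For readers unfamiliar with the PSNR figures quoted in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The abstract does not say how the authors computed the metric, so the function name `psnr`, the `data_range` parameter, and the synthetic 4x4 image are assumptions made purely for illustration.

```python
import math


def psnr(reference, estimate, data_range=1.0):
    """PSNR in dB between two equally sized images given as rows of pixel values.

    Assumes the standard definition 10 * log10(data_range**2 / MSE);
    the preprint's exact evaluation protocol is not stated in the abstract.
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(data_range ** 2 / mse)


# Hypothetical data: a flat 4x4 image and a copy that is off by 0.01 everywhere.
reference = [[0.5] * 4 for _ in range(4)]
estimate = [[0.51] * 4 for _ in range(4)]
print(f"PSNR: {psnr(reference, estimate):.2f} dB")
```

Under this definition, a PSNR of 43.06 dB corresponds to a root-mean-square error of roughly 0.7% of the intensity range, since 10^(-43.06/20) is approximately 0.007.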

Version published to 10.21203/rs.3.rs-7173744/v1 on Research Square
Aug 4, 2025

Related articles

Enhanced Medical Image Segmentation via Wavelet-Deformable Attention Networks

This article has 5 authors:
1. Xuan Zhang
2. Rui Liu
3. Jing Dong
4. Pengfei Yi
5. Xiaopeng Wei
This article has no evaluations
Latest version Jan 20, 2026

Evaluation of a Multimodal Convolutional Neural Network-Based Approach for DICOM Files Classification

This article has 3 authors:
1. Vicent Mabirizi
2. Simon Kawuma
3. William Wasswa
This article has no evaluations
Latest version Dec 18, 2025

Medical Image Generation using Denoising Diffusion Probabilistic Model

This article has 6 authors:
1. Saritha A N
2. Sarvesh Rastogi
3. Shreya Bharamanna Patil
4. Basavaraj Talawar
5. Shreya Soni
6. Setu Mishra
This article has no evaluations
Latest version Jan 12, 2026
