Vision Transformer and FFT-ReLU Fusion for Advanced Image Deblurring

Syed Mumtahin Mahmud
Mahdi Mohd Hossain Noki
Prothito Shovon Majumder
Abdul Mohaimen Al Radi
Md. Haider Ali
Md. Mosaddek Khan

Abstract

Image deblurring is a crucial task in computer vision: it aims to recover sharp images from inputs degraded by camera shake, object motion, or other factors. Traditional methods often struggle with complex or severe blur, particularly in high-resolution images. Recent deep learning approaches, particularly Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have shown promise but remain limited in capturing long-range dependencies and in computational efficiency. In this paper, we propose a novel image deblurring approach that integrates the strengths of Vision Transformers and the Fast Fourier Transform (FFT) with ReLU (Rectified Linear Unit) sparsity. Our method first passes the blurry image through a Vision Transformer architecture designed for image restoration, which efficiently captures both local and global features to reduce blur. The output is then post-processed using FFT with ReLU sparsity, which targets and removes blur-related frequencies while preserving image sharpness and clarity. Extensive experiments on benchmark datasets demonstrate that our method produces sharper, more visually appealing images than state-of-the-art models. Furthermore, subjective human evaluations, alongside traditional metrics such as PSNR and SSIM, provide comprehensive evidence of the practical effectiveness of our deblurring technique. Our results indicate that the proposed method not only excels in quantitative measures but also significantly enhances perceptual image quality, making it highly suitable for real-world applications. The source code and results are available at https://github.com/Dip-to/Vision-Transformer-and-FFT-ReLU-Fusion-for-Advanced-Image-Deblurring.git
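
The abstract does not spell out how the ReLU is applied in the frequency domain, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a grayscale image stored as a 2D NumPy array and applies ReLU separately to the real and imaginary parts of the spectrum, with no learned weights and no Vision Transformer stage. The function names `relu_spectrum` and `fft_relu` are hypothetical.

```python
import numpy as np


def relu_spectrum(image: np.ndarray) -> np.ndarray:
    """Return the 2D spectrum of `image` with ReLU applied to each part.

    Assumption: ReLU acts on the real and imaginary components separately,
    which zeroes every negative component and so sparsifies the spectrum.
    """
    spectrum = np.fft.fft2(image)
    real = np.maximum(spectrum.real, 0.0)
    imag = np.maximum(spectrum.imag, 0.0)
    return real + 1j * imag


def fft_relu(image: np.ndarray) -> np.ndarray:
    """Map the ReLU-filtered spectrum back to the spatial domain.

    After the ReLU the spectrum is in general no longer conjugate-symmetric,
    so the inverse transform is complex; only its real part is kept here.
    """
    return np.fft.ifft2(relu_spectrum(image)).real


# Stand-in input: random values in place of a real blurry image.
rng = np.random.default_rng(0)
blurry = rng.random((8, 8))
filtered = fft_relu(blurry)
```

In this sketch the result has the same shape as the input. Whether such a filter sharpens an image in practice depends on details the abstract leaves out, such as any learned layers around the transform.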

Version published to 10.21203/rs.3.rs-5306286/v1 on Research Square
Oct 24, 2024

Related articles

Fourier-Enhanced TecoGAN: Advancing Video Super-Resolution with Spectral and Gradient Losses

This article has 2 authors:
1. Md. Asif Hasan
2. Radee Jamil Khan
This article has no evaluations. Latest version: Jan 9, 2026

Enhanced Medical Image Segmentation via Wavelet-Deformable Attention Networks

This article has 5 authors:
1. Xuan Zhang
2. Rui Liu
3. Jing Dong
4. Pengfei Yi
5. Xiaopeng Wei
This article has no evaluations. Latest version: Jan 20, 2026

Efficient Real-Time 3D Scene Reconstruction of Brain Tumors Using Convolutional Neural Networks and Image Processing Pipelines

This article has 2 authors:
1. DevendraBabu Pidatala
2. Preeti Jha
This article has no evaluations. Latest version: Dec 15, 2025
