SVTSR: Image Super-Resolution Using Scattering Vision Transformer
Abstract
Vision transformers have attracted substantial attention and achieved impressive performance in image super-resolution. However, these networks struggle with the complexity of attention and with capturing intricate, fine-grained image details, which impedes the efficient and scalable deployment of transformer models for super-resolution in real-world applications. In this paper, we present a novel vision transformer, the Scattering Vision Transformer for Super-Resolution (SVTSR), to tackle these challenges. SVTSR integrates a spectral scattering network to capture intricate image details efficiently. It addresses the invertibility problem commonly encountered in down-sampling operations by separating low-frequency and high-frequency components. Additionally, SVTSR introduces a novel spectral gating network that uses Einstein multiplication for token and channel mixing, effectively reducing complexity. Extensive experiments demonstrate the effectiveness of the proposed vision transformer for image super-resolution. Our method not only outperforms state-of-the-art methods in terms of PSNR and SSIM but, more significantly, reduces the number of model parameters by more than a factor of ten compared to the baseline model. As shown in Fig. 1, this substantial reduction in parameter count is highly advantageous for the deployment and practical application of super-resolution models. Code is available at https://github.com/LiangJiabaoY/SVTSR.git.
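To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch and not the SVTSR implementation. It assumes a one-level Haar transform as a stand-in for the scattering step (the abstract does not specify the transform), and the names `haar_split`, `haar_merge`, and `einsum_mix`, together with the weight shapes, are hypothetical. It illustrates that a down-sampling step which keeps both the low-frequency and the high-frequency components can be inverted exactly, and that token and channel mixing can each be written as a single Einstein summation.

```python
import numpy as np

def haar_split(x):
    """One-level 2D Haar split of an (H, W, C) array with even H and W.

    Assumption: Haar is used here only for illustration.
    Returns a low-frequency band (H/2, W/2, C) and three
    high-frequency bands stacked as (3, H/2, W/2, C).
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    low = (a + b + c + d) / 2.0
    high = np.stack([(a - b + c - d) / 2.0,
                     (a + b - c - d) / 2.0,
                     (a - b - c + d) / 2.0], axis=0)
    return low, high

def haar_merge(low, high):
    """Exact inverse of haar_split: no information is lost by down-sampling."""
    h1, h2, h3 = high
    h, w, ch = low.shape
    out = np.empty((2 * h, 2 * w, ch), dtype=low.dtype)
    out[0::2, 0::2] = (low + h1 + h2 + h3) / 2.0
    out[0::2, 1::2] = (low - h1 + h2 - h3) / 2.0
    out[1::2, 0::2] = (low + h1 - h2 - h3) / 2.0
    out[1::2, 1::2] = (low - h1 - h2 + h3) / 2.0
    return out

def einsum_mix(high, w_channel, w_token):
    """Hypothetical mixing of the high-frequency bands via einsum.

    w_channel: (C, C) weight contracted over the channel axis.
    w_token:   (W, W) weight contracted over the width (token) axis.
    """
    mixed = np.einsum("khwc,cd->khwd", high, w_channel)   # channel mixing
    return np.einsum("khwd,wv->khvd", mixed, w_token)     # token mixing

# Usage: split, mix the high-frequency bands, and merge back.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8, 3))
low, high = haar_split(image)
gated = einsum_mix(high, np.eye(3), np.eye(4))  # identity weights: no change
restored = haar_merge(low, gated)
```

In this sketch each mixing step needs only a C x C or W x W weight matrix rather than pairwise attention over all tokens, which is one plausible reading of how Einstein multiplication keeps complexity and parameter count low.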