A Lightweight Perceptual-Guided VQVAE for High-Fidelity Image Compression
Abstract
This study addresses the quality-efficiency trade-off of generative models for low-bit-rate image compression by proposing HiRes-VQ, a lightweight perceptually guided VQ-VAE framework. The framework inherits the efficient hierarchical quantization architecture of VQ-VAE-2 and introduces a super-resolution codec. By employing parallel pathways for low-frequency structure reconstruction and high-frequency detail restoration, it decouples frequency-domain features. In addition, a multi-scale perceptual alignment loss guides the model toward feature representations aligned with human visual perception. Experiments on the FFHQ-256 and ImageNet-256 datasets demonstrate that our model significantly outperforms lightweight baselines across all metrics and surpasses high-complexity models in quality-efficiency balance.
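To illustrate the general idea behind a multi-scale perceptual alignment loss, the sketch below compares two images in a feature space at several spatial scales. This is a minimal, hypothetical illustration only: the function names, the box-filter downsampling, the fixed linear map `feat_w` standing in for a learned perceptual feature extractor, and the choice of scales are all assumptions, not the paper's actual design.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool an (H, W, C) image by an integer factor (box filter)."""
    h, w, c = img.shape
    h2, w2 = h // factor, w // factor
    return img[:h2 * factor, :w2 * factor].reshape(
        h2, factor, w2, factor, c).mean(axis=(1, 3))

def multiscale_perceptual_loss(recon, target, feat_w, scales=(1, 2, 4)):
    """Mean L1 distance between per-pixel features at multiple scales.

    feat_w is a (C, K) matrix acting as a stand-in "perceptual" feature
    extractor (hypothetical; in practice this role is played by a learned
    network). Averaging over scales makes the loss sensitive to both
    coarse structure and fine detail.
    """
    total = 0.0
    for s in scales:
        r = downsample(recon, s) @ feat_w   # features of the reconstruction
        t = downsample(target, s) @ feat_w  # features of the reference
        total += np.abs(r - t).mean()       # L1 in feature space
    return total / len(scales)
```

Identical inputs yield zero loss, and the multi-scale average penalizes mismatches at every resolution, which is the intuition behind aligning reconstructions with human perception across frequency bands.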