A Lightweight Perceptual-Guided VQVAE for High-Fidelity Image Compression


Abstract

This study addresses the quality-efficiency trade-off in generative models for low-bit-rate image compression by proposing HiRes-VQ, a lightweight perceptually guided VQ-VAE framework. The framework inherits the efficient hierarchical quantization architecture of VQ-VAE-2 and introduces a super-resolution codec. By employing parallel pathways for low-frequency structure reconstruction and high-frequency detail restoration, it decouples frequency-domain features. In addition, a multi-scale perceptual alignment loss guides the model toward feature representations aligned with human visual perception. Experiments on the FFHQ-256 and ImageNet-256 datasets demonstrate that the model significantly outperforms lightweight baselines across all metrics and surpasses high-complexity models in quality-efficiency balance.
