Normalized Style Space and Latent Alignment Metrics for Improving Fidelity, Perception, and Editability in GAN Inversion
Abstract
In image synthesis, Generative Adversarial Networks (GANs) have demonstrated impressive generative capability, yet inverting real images into the latent space remains challenging. Existing inversion pipelines typically consist of an embedding stage followed by a refinement stage. Refinement improves reconstruction fidelity, but perceptual quality and editability remain largely constrained by the initial latent codes. This exposes a fundamental problem: obtaining latent representations that simultaneously support high-fidelity reconstruction, perceptual plausibility, and strong editability. In this work, we show that these properties are highly correlated with the alignment between inverted latent codes and the native synthetic latent distribution. Building on this insight, we introduce the Latent Space Alignment Inversion Paradigm (LSAP), a unified framework that incorporates both a quantitative metric and an inversion solution. We propose the Normalized Style Space (S^N space) and the Normalized Style Space Cosine Distance (NSCD) to measure and minimize latent-space misalignment across encoder-based and optimization-based methods. LSAP provides a consistent alignment mechanism that enhances the quality of the latent codes produced in both inversion stages. Extensive experiments across diverse datasets demonstrate that NSCD reliably captures perceptual and editable characteristics, and that LSAP achieves state-of-the-art performance, significantly improving fidelity, perceptual realism, and editability in GAN inversion. Code is available at: https://github.com/zxk-priv/LSAP GANInverter.
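
The abstract names NSCD but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: style codes are standardized per channel using statistics gathered from sampled synthetic style codes, and alignment is then measured as a cosine distance in that normalized space. The function names (`channel_stats`, `normalize`, `nscd_sketch`), the choice of a single reference code, and the per-channel standardization are all assumptions made here, not details taken from the paper or its released code.

```python
import math


def channel_stats(codes):
    """Per-channel mean and (population) standard deviation of style vectors.

    `codes` stands in for style codes sampled from the generator; in this
    sketch it is just a list of equal-length lists of floats.
    """
    n = len(codes)
    dim = len(codes[0])
    means = [sum(c[i] for c in codes) / n for i in range(dim)]
    stds = [
        math.sqrt(sum((c[i] - means[i]) ** 2 for c in codes) / n)
        for i in range(dim)
    ]
    return means, stds


def normalize(code, means, stds, eps=1e-8):
    """Standardize one style vector channel by channel (assumed normalization)."""
    return [(x - m) / (s + eps) for x, m, s in zip(code, means, stds)]


def cosine_distance(a, b, eps=1e-8):
    """Return 1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + eps)


def nscd_sketch(inverted_code, reference_code, means, stds):
    """Hypothetical NSCD: cosine distance between two normalized style codes.

    Which reference the actual metric compares against is not stated in the
    abstract; passing an explicit reference code is an assumption.
    """
    return cosine_distance(
        normalize(inverted_code, means, stds),
        normalize(reference_code, means, stds),
    )


# Toy statistics from two made-up "synthetic" style codes.
means, stds = channel_stats([[0.0, 10.0], [2.0, 30.0]])
same = nscd_sketch([2.0, 30.0], [2.0, 30.0], means, stds)      # close to 0
opposite = nscd_sketch([2.0, 30.0], [0.0, 10.0], means, stds)  # close to 2
```

Under this reading, the normalization puts channels with very different scales on an equal footing before the angle is measured, so no single high-variance channel dominates the distance. Whether the actual metric works this way should be checked against the paper and repository.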