StycoGAN for Feature-Level Temporal Regularization in Perceptually Stable Sequential Image Synthesis

Abstract

Style-based generative adversarial networks achieve high spatial fidelity in image synthesis, yet their extension to sequential generation remains challenging due to temporal instability and style inconsistency across frames. Most existing approaches emphasize motion modeling or pixel-level temporal constraints, which often fail to preserve stylistic coherence. This paper proposes StycoGAN, a style-consistent spatial–temporal generative framework that enforces temporal regularization directly in the feature space of a style-based generator. The model integrates a ConvLSTM-based temporal consistency module into an intermediate layer of the StyleGAN2-ADA backbone, allowing it to capture temporal dependencies while retaining high-quality style modulation. In addition, a Styco-Consistency loss is introduced to suppress undesired stylistic drift across consecutive frames without imposing explicit motion constraints. Experiments on curated sequential image data demonstrate that StycoGAN improves temporal stability while maintaining competitive spatial realism. Quantitative evaluations show enhanced perceptual quality and temporal coherence compared with frame-independent and temporal baseline models, and qualitative results reveal reduced style flickering across frames. These findings indicate that feature-level temporal regularization offers an effective and flexible solution for perceptually stable sequential image synthesis.
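
To make the two core ideas concrete, the following is a minimal PyTorch sketch of (1) a ConvLSTM cell of the kind that could be inserted at an intermediate feature level of a style-based generator, and (2) a feature-level style-drift penalty that matches channel-wise feature statistics between consecutive frames. All names, shapes, and the exact statistics-matching form of the loss are illustrative assumptions for this sketch; the paper's actual formulation and released code may differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvLSTMCell(nn.Module):
    """Standard ConvLSTM cell: all four gates computed by a single convolution
    over the concatenated input and hidden state."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 4 * channels,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

def styco_consistency_loss(feats):
    """Hypothetical style-drift penalty: match per-channel feature means and
    standard deviations (statistics commonly tied to style) between consecutive
    frames, without constraining spatial content or motion directly."""
    loss = feats[0].new_zeros(())
    for prev, cur in zip(feats[:-1], feats[1:]):
        loss = loss + F.mse_loss(cur.mean(dim=(2, 3)), prev.mean(dim=(2, 3)))
        loss = loss + F.mse_loss(cur.std(dim=(2, 3)), prev.std(dim=(2, 3)))
    return loss / (len(feats) - 1)

# Toy usage: run a short feature sequence through the temporal cell, then
# penalize style drift on the resulting hidden states. The random tensors
# stand in for a generator's mid-layer feature maps.
T, B, C, H, W = 4, 2, 64, 16, 16
cell = ConvLSTMCell(C)
state = (torch.zeros(B, C, H, W), torch.zeros(B, C, H, W))
feats = []
for t in range(T):
    h, state = cell(torch.randn(B, C, H, W), state)
    feats.append(h)
print(styco_consistency_loss(feats))

Because the penalty acts on feature statistics rather than pixels, it leaves the generator free to change spatial content across frames while keeping its style modulation stable, which is the behavior the abstract attributes to feature-level temporal regularization.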
