A Survey on Video Generation Technologies, Applications, and Ethical Considerations

Abstract

Video generation has rapidly advanced from early GAN-based systems to modern diffusion- and transformer-based models that deliver unprecedented photorealism and controllability. This survey synthesizes progress across foundational models (GAN, autoregressive, diffusion, masked modeling, and hybrids), information representations (spatiotemporal convolution, patch tokens, latent spaces), and generation schemes (decoupled, hierarchical, multi-staged, latent). We map applications in gaming, embodied AI, autonomous driving, education, filmmaking, and biomedicine, and analyze technical challenges in real-time generation, long-horizon consistency, physics fidelity, generalization, and multimodal reasoning. We also discuss governance and ethics, including misinformation, intellectual property, fairness, privacy, accountability, and environmental impact. Finally, we summarize evaluation methodologies (spatial, temporal, and human-centered metrics) and highlight future directions for efficient, controllable, and trustworthy video generation.
