Generative AI Shanshui Animation Enhancement using Perlin Noise and Diffusion Models
Abstract
Deep learning models have achieved remarkable advancements in image generation but continue to face persistent challenges in synthesizing traditional Shanshui (mountain-water) landscape paintings, owing to limited domain-specific training data and the complexity of the underlying aesthetic principles. This study integrates Perlin Noise, Stable Diffusion, ControlNet, and AnimateDiff to enhance Shanshui landscape generation and animation. Perlin Noise constructs naturalistic skeletal structures, which ControlNet then refines for precise structural control. Advanced prompt engineering with GPT-4 and Textual Inversion improves prompt descriptiveness and mitigates low-quality outputs, while LoRA fine-tuning improves the model's adaptability to the Shanshui landscape domain. Integrating I2V Encoders with AnimateDiff enables the seamless transformation of static landscape images into dynamic animations, preserving artistic authenticity while introducing motion consistency. Experimental results demonstrate significant improvements in realism, stylistic fidelity, and diversity, addressing key limitations of existing generative approaches. This framework not only advances generative AI in digital art but also opens new opportunities for multimedia content creation and cultural preservation through the synthesis of computational Shanshui animation.
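To make the first stage of the pipeline concrete, the following is a minimal illustrative sketch (not the authors' implementation) of how Perlin Noise can supply a skeletal ridgeline structure that conditions Stable Diffusion through a ControlNet. The specific checkpoints ("lllyasviel/sd-controlnet-scribble", "runwayml/stable-diffusion-v1-5"), the level-set extraction, the resolution, and the prompt are all assumptions chosen for demonstration; the study's actual models, thresholds, and prompts may differ.

```python
# Illustrative sketch: Perlin-noise ridgelines as a scribble-style ControlNet
# conditioning image for Shanshui generation. Checkpoints and parameters are
# assumptions for demonstration, not the paper's reported configuration.
import numpy as np
import torch
from noise import pnoise2  # pip install noise
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

H, W = 512, 512

# 1) Perlin-noise height field: smooth, naturalistic variation across the canvas.
field = np.array(
    [[pnoise2(x / 128.0, y / 128.0, octaves=4) for x in range(W)] for y in range(H)]
)

# 2) Skeletal structure: a few level sets of the field serve as ridgeline strokes
#    (white lines on a black background, the convention the scribble ControlNet expects).
skeleton = np.zeros((H, W), dtype=np.uint8)
for level in (-0.15, 0.0, 0.15):
    skeleton[np.abs(field - level) < 0.004] = 255
control_image = Image.fromarray(skeleton).convert("RGB")

# 3) Structural control: the ridgeline sketch guides the diffusion model so the
#    generated composition follows the Perlin-derived layout.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

result = pipe(
    prompt="traditional Chinese Shanshui ink painting, misty mountains, rivers",
    image=control_image,
    num_inference_steps=30,
).images[0]
result.save("shanshui_perlin_controlnet.png")
```

The resulting static image could then be passed to an image-to-video stage (e.g., AnimateDiff with an I2V encoder, as described in the abstract) to produce the animated output.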